active-set.tex 8.3 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234
  1. \documentclass{diary}
  2. \title{Active set solver for quadratic programming}
  3. \author{Alec Jacobson}
  4. \date{18 September 2013}
  5. \renewcommand{\A}{\mat{A}}
  6. \renewcommand{\Q}{\mat{Q}}
  7. \newcommand{\RR}{\mat{R}}
  8. \newcommand{\Aeq}{\mat{A}_\textrm{eq}}
  9. \newcommand{\Aieq}{\mat{A}_\textrm{ieq}}
  10. \newcommand{\Beq}{\vc{B}_\textrm{eq}}
  11. \newcommand{\Bieq}{\vc{B}_\textrm{ieq}}
  12. \newcommand{\lx}{\Bell}
  13. \newcommand{\ux}{\vc{u}}
  14. \newcommand{\lameq} {\lambda_\textrm{eq}}
  15. \newcommand{\lamieq}{\lambda_\textrm{ieq}}
  16. \newcommand{\lamlu} {\lambda_\textrm{lu}}
  17. \begin{document}
  18. Quadratic programming problems (QPs) can be written in general as:
  19. \begin{align}
  20. \argmin \limits_\z &
  21. \z^\transpose \A \z + \z^\transpose \B + \text{ constant}\\
  22. \text{subject to } & \Aieq \z ≤ \Bieq,
  23. \end{align}
  24. where $\z \in \R^n$ is a vector of unknowns, $\A \in \R^{n \times n}$ is a (in
  25. our case sparse) matrix of quadratic coefficients, $\B \in \R^n$ is a vector of
  26. linear coefficients, $\Aieq \in \R^{m_\text{ieq} \times n}$ is a matrix (also
  27. sparse) linear inequality coefficients and $\Bieq \in \R^{m_\text{ieq}}$ is a
  28. vector of corresponding right-hand sides. Each row in $\Aieq \z ≤ \Bieq$
  29. corresponds to a single linear inequality constraint.
  30. Though representable by the linear inequality constraints above---linear
  31. \emph{equality} constraints, constant bounds, and constant fixed values appear
  32. so often that we can write a more practical form
  33. \begin{align}
  34. \argmin \limits_\z &
  35. \z^\transpose \A \z + \z^\transpose \B + \text{ constant}\\
  36. \text{subject to } & \z_\text{known} = \Y,\\
  37. & \Aeq \z = \Beq,\\
  38. & \Aieq \z ≤ \Bieq,\\
  39. & \z ≥ \lx,\\
  40. & \z ≤ \ux,
  41. \end{align}
  42. where $\z_\text{known} \in \R^{n_\text{known}}$ is a subvector of our unknowns $\z$ which
  43. are known or fixed to obtain corresponding values $\Y \in \R^{n_\text{known}}$,
  44. $\Aeq \in \R^{m_\text{eq} \times n}$ and $\Beq \in \R^{m_\text{eq} \times n}$
  45. are linear \emph{equality} coefficients and right-hand sides respectively, and
  46. $\lx, \ux \in \R^n$ are vectors of constant lower and upper bound constraints.
  47. \todo{This notation is unfortunate. Too many bold A's and too many capitals.}
  48. This description exactly matches the prototype used by the
  49. \texttt{igl::active\_set()} function.
  50. The active set method works by iteratively treating a subset (some rows) of the
  51. inequality constraints as equality constraints. These are called the ``active
  52. set'' of constraints. So at any given iterations $i$ we might have a new
  53. problem:
  54. \begin{align}
  55. \argmin \limits_\z &
  56. \z^\transpose \A \z + \z^\transpose \B + \text{ constant}\\
  57. \text{subject to } & \z_\text{known}^i = \Y^i,\\
  58. & \Aeq^i \z = \Beq^i,
  59. \end{align}
  60. where the active rows from
  61. $\lx ≤ \z ≤ \ux$ and $\Aieq \z ≤ \Bieq$ have been appended into
  62. $\z_\text{known}^i = \Y^i$ and $\Aeq^i \z = \Beq^i$ respectively.
  63. This may be optimized by solving a sparse linear system, resulting in the
  64. current solution $\z^i$. For equality constraint we can also find a
  65. corresponding Lagrange multiplier value. The active set method works by adding
  66. to the active set all linear inequality constraints which are violated by the
  67. previous solution $\z^{i-1}$ before this solve and then after the solve
  68. removing from the active set any constraints with negative Lagrange multiplier
  69. values. Let's declare that $\lameq \in \R^{m_\text{eq}}$, $\lamieq \in
  70. \R^{m_\text{ieq}}$, and $\lamlu \in \R^{n}$ are the Lagrange multipliers
  71. corresponding to the linear equality, linear inequality and constant bound
  72. constraints respectively. Then the abridged active set method proceeds as
  73. follows:
  74. \begin{lstlisting}[keywordstyle=,mathescape]
  75. while not converged
  76. add to active set all rows where $\Aieq \z > \Bieq$, $\z < \lx$ or $\z > \ux$
  77. $\Aeq^i,\Beq^i \leftarrow \Aeq,\Beq + \text{active rows of} \Aieq,\Bieq$
  78. $\z_\text{known}^i,\Y^i \leftarrow \z_\text{known},\Y + \text{active indices and values of} \lx,\ux$
  79. solve problem treating active constraints as equality constraints $\rightarrow \z,\lamieq,\lamlu$
  80. remove from active set all rows with $\lamieq < 0$ or $\lamlu < 0$
  81. end
  82. \end{lstlisting}
  83. The fixed values constraints of $\z_\text{known}^i = \Y^i$ may be enforced by
  84. substituting $\z_\text{known}^i$ for $\Y^i$ in the energy directly during the
  85. solve. Corresponding Lagrange multiplier values $\lambda_\text{known}^i$ can
  86. be recovered after we've found the rest of $\z$.
  87. The linear equality constraints $\Aeq^i \z = \Beq^i$ are a little trickier. If
  88. the rows of
  89. $\Aeq^i \in \R^{(<m_\text{eq}+ m_\text{ieq}) \times n}$ are linearly
  90. independent then it is straightforward how to build a Lagrangian which enforces
  91. each constraint. This results in solving a system roughly of the form:
  92. \begin{equation}
  93. \left(
  94. \begin{array}{cc}
  95. \A & {\Aeq^i}^\transpose \\
  96. \Aeq^i & \mat{0}
  97. \end{array}
  98. \right)
  99. \left(
  100. \begin{array}{l}
  101. \z\\\lambda^i_\text{eq}
  102. \end{array}
  103. \right)
  104. =
  105. \left(
  106. \begin{array}{l}
  107. -2\B\\
  108. -\Beq^i
  109. \end{array}
  110. \right)
  111. \end{equation}
  112. \todo{Double check. Could be missing a sign or factor of 2 here.}
  113. If the rows of $\Aeq^i$ are linearly dependent then the system matrix above
  114. will be singular. Because we may always assume that the constraints are not
  115. contradictory, this system can still be solved. But it will take some care.
  116. Some linear solvers, e.g.\ \textsc{MATLAB}'s, seem to deal with these OK.
  117. \textsc{Eigen}'s does not.
  118. Without loss of generality, let us assume that there are no inequality
  119. constraints and it's the rows of $\Aeq$ which might be linearly dependent.
  120. Let's also assume we have no fixed or known values. Then the linear system
  121. above corresponds to the following optimization problem:
  122. \begin{align}
  123. \argmin \limits_\z &
  124. \z^\transpose \A \z + \z^\transpose \B + \text{ constant}\\
  125. \text{subject to } & \Aeq \z = \Beq.
  126. \end{align}
  127. For the sake of cleaner notation, let $m = m_\text{eq}$ so that $\Aeq \in \R^{m
  128. \times n}$ and $\Beq \in \R^m$.
  129. We can construct the null space of the constraints by computing a QR
  130. decomposition of $\Aeq^\transpose$:
  131. \begin{equation}
  132. \Aeq^\transpose \P = \Q \RR =
  133. \left(\begin{array}{cc}
  134. \Q_1 & \Q_2
  135. \end{array}\right)
  136. \left(\begin{array}{c}
  137. \RR\\
  138. \mat{0}
  139. \end{array}\right)=
  140. \Q_1 \RR
  141. \end {equation}
  142. where $\P \in \R^{m \times m}$ is a sparse permutation matrix, $\Q \in \R^{n
  143. \times n}$ is orthonormal, $\RR \in \R^{n \times m}$ is upper triangular. Let
  144. $r$ be the row rank of $\Aeq$---the number of linearly independent rows---then
  145. we split $\Q$ and $\RR$ into $\Q_1 \in \R^{n \times r}$, $\Q_2 \in \R^{n \times
  146. n-r}$, and $\R_1 \in \RR^{r \times m}$.
  147. Notice that
  148. \begin{align}
  149. \Aeq \z &= \Beq \\
  150. \P \P^\transpose \Aeq \z &= \\
  151. \P \left(\Aeq^\transpose \P\right)^\transpose \z &= \\
  152. \P\left(\Q\RR\right)^\transpose \z &= \\
  153. \P \RR^\transpose \Q^\transpose \z &= \\
  154. \P \RR^\transpose \Q \z &= \\
  155. \P \left(\RR_1^\transpose \mat{0}\right)
  156. \left(\begin{array}{c}
  157. \Q_1^\transpose\\
  158. \Q_2^\transpose
  159. \end{array}\right) \z &=\\
  160. \P \RR_1^\transpose \Q_1^\transpose \z + \P \0 \Q_2^\transpose \z &= .
  161. \end{align}
  162. Here we see that $\Q_1^\transpose \z$ affects this equality but
  163. $\Q_2^\transpose \z$ does not. Thus
  164. we say that $\Q_2$ forms a basis that spans the null space of $\Aeq$. That is, $\Aeq \Q_2 \w =
  165. \vc{0},
  166. \forall \w \in \R^{n-r}$. Let's write
  167. \begin{equation}
  168. \z = \Q_1 \w_1 + \Q_2 \w_2.
  169. \end{equation}
  170. We're only interested in solutions $\z$ which satisfy our equality constraints
  171. $\Aeq \z = \Beq$. If we plug in the above then we get
  172. % P RT Q1T z = Beq
  173. % RT Q1T z = PT Beq
  174. % Q1T z = RT \ PT Beq
  175. % w1 = RT \ PT * Beq
  176. % z = Q2 w2 + Q1 * w1
  177. % z = Q2 w2 + Q1 * RT \ P * Beq
  178. % z = Q2 w2 + lambda0
  179. % Aeq z = Beq
  180. % Aeq (Q2 w2 + Q1 w1) = Beq
  181. % Aeq Q1 w1 = Beq
  182. The following table describes the translation between the entities described in
  183. this document and those used in \texttt{igl/active\_set} and
  184. \texttt{igl/min\_quad\_with\_fixed}.
  185. \begin{lstlisting}[keywordstyle=,mathescape]
  186. $\A$: A
  187. $\B$: B
  188. $\z$: Z
  189. $\Aeq$: Aeq
  190. $\Aieq$: Aieq
  191. $\Beq$: Beq
  192. $\Bieq$: Bieq
  193. $\lx$: lx
  194. $\ux$: ux
  195. $\lamieq$: ~Lambda_Aieq_i
  196. $\lamlu$: ~Lambda_known_i
  197. $\P$: AeqTE
  198. $\Q$: AeqTQ
  199. $\RR$: AeqTR
  200. $\RR^\transpose$: AeqTRT
  201. $\Q_1$: AeqTQ1
  202. $\Q_1^\transpose$: AeqTQ1T
  203. $\Q_2$: AeqTQ2
  204. $\Q_2^\transpose$: AeqTQ2T
  205. $w_2$: lambda
  206. \end{lstlisting}
  207. % Resources
  208. % http://www.cs.cornell.edu/courses/cs322/2007sp/notes/qr.pdf
  209. % http://www.math.uh.edu/~rohop/fall_06/Chapter2.pdf
  210. % http://www.math.uh.edu/~rohop/fall_06/Chapter3.pdf
  211. \end{document}