Optimization
Optimization finds the “best” possible solutions from a set of feasible points.
An optimization problem takes the form of minimizing (or maximizing) an objective function subject to constraints:
$$ \begin{array}{ll}\mathrm{Minimize}&f(x) \\ \mathrm{subject~to}&x\in\Omega.\end{array} $$
- $x\in\mathbb{R}^n$ is called the vector of decision variables.
- $f$ is called the objective function.
- $\Omega\subseteq\mathbb{R}^n$ is called the constraint set / feasible set / feasible region.
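To make the abstract form concrete, here is a minimal sketch in Python (assuming `numpy` and `scipy` are available; the objective and constraint are made up for illustration): minimize $f(x)=(x_1-1)^2+(x_2+2)^2$ over $\Omega=\{x : x_1+x_2=0\}$.

```python
# Hypothetical instance of the abstract problem above:
# minimize f(x) = (x1 - 1)^2 + (x2 + 2)^2  subject to  x in Omega = {x : x1 + x2 = 0}
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2   # objective function
omega = {"type": "eq", "fun": lambda x: x[0] + x[1]}  # feasible set as an equality constraint

res = minimize(f, x0=np.zeros(2), constraints=[omega])
print(res.x)  # decision variables at the minimizer, approximately [1.5, -1.5]
```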
The symmetry of $G$ in a quadratic function can be assumed without loss of generality:
for any $H\in\mathbb{R}^{n\times n}$ (not necessarily symmetric) and $x\in\mathbb{R}^n$, we have:
$$\lbrack x^THx\rbrack_{1\times 1}=\lbrack x^THx\rbrack^T_{1\times 1}=x^TH^Tx$$
so:
$$x^THx=x^TH^Tx=\frac12x^T(\underbrace{H^T+H}_{\text{symmetric matrix $G$}})x.$$
For $f(x)=\frac12x^TGx+c^Tx+\beta$ with $G\in\mathbb{R}^{n\times n}$ symmetric, the gradient is:
$$\nabla f(x)=Gx+c.$$
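A quick numerical sanity check of both identities (the matrix $H$, vector $c$, and constant $\beta=1$ are arbitrary test data):

```python
# Numerical check of the symmetrization identity and the gradient formula above.
import numpy as np

rng = np.random.default_rng(0)
n = 4
H = rng.standard_normal((n, n))        # not symmetric in general
G = 0.5 * (H + H.T)                    # symmetric part of H
x = rng.standard_normal(n)
c = rng.standard_normal(n)

# x^T H x == x^T G x : the quadratic form only sees the symmetric part
print(np.isclose(x @ H @ x, x @ G @ x))            # True

# gradient of f(x) = 0.5 x^T G x + c^T x + beta is G x + c (finite-difference check)
f = lambda x: 0.5 * x @ G @ x + c @ x + 1.0
eps = 1e-6
fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps) for e in np.eye(n)])
print(np.allclose(fd, G @ x + c, atol=1e-5))       # True
```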
Infimum
We write $l=\inf S$; by convention, $\inf\emptyset=+\infty$.
Let $S\subseteq\mathbb{R}$ be a nonempty set. We say that $l\in[-\infty, \infty)$ is the infimum of $S$, if:
- for every $s\in S$, $s\geq l$;
- for every $\zeta>l$ (with $\zeta\in\mathbb{R}$), one can find $s\in S$ such that $s<\zeta$.
The infimum is the greatest lower bound of $S$: the largest number that is less than or equal to everything in $S$.
In optimization, the infimum of the objective values over the feasible set is referred to as the optimal value.
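A standard example (added here for illustration): for $S=\{1/n : n=1,2,3,\dots\}$, every element satisfies $1/n>0$, and for any $\zeta>0$ one can pick $n>1/\zeta$ so that $1/n<\zeta$; hence $\inf S=0$, even though $0\notin S$. The infimum need not be attained.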
Norm
Common vector norms
- $l_1$ norm: $\vert\vert x\vert\vert_1:=\sum_{i=1}^{n}|x_i|\quad$ (sum of abs value)
- $l_2$ norm: $\vert\vert x\vert\vert_2:=\sqrt{\sum_{i=1}^{n}|x_i|^2}\quad$ (for $n=2$: $\vert\vert x\vert\vert_2^2=\begin{bmatrix}x_1&x_2\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix}=x^Tx$)
- $l_\infty$ norm: $\vert\vert x\vert\vert_\infty:=\max_{1\leq i\leq n}|x_i|\quad$ (max abs value)
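These can be cross-checked numerically; a small sketch (the test vector is an arbitrary choice):

```python
# Computing the three vector norms directly and comparing against np.linalg.norm.
import numpy as np

x = np.array([3.0, -4.0, 1.0])

l1 = np.sum(np.abs(x))            # sum of absolute values
l2 = np.sqrt(np.sum(x ** 2))      # sqrt of sum of squares, equals sqrt(x @ x)
linf = np.max(np.abs(x))          # largest absolute entry

print(l1, np.linalg.norm(x, 1))         # 8.0  8.0
print(l2, np.linalg.norm(x, 2))         # 5.099...  5.099...
print(linf, np.linalg.norm(x, np.inf))  # 4.0  4.0
```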
Common matrix norms
- $\vert\vert A\vert\vert_1=\underbrace{\max_j\sum_{i=1}^{n}|a_{ij}|}_{\text{maximum of the $l_1$ norms of the columns}}$, the induced matrix norm $\underset{||x||_1=1}{\max}||Ax||_1$
- $\vert\vert A\vert\vert_2=\underbrace{\sqrt{\lambda_{\max}(A^TA)}}_{\text{square root of the maximum eigenvalue of $A^TA$}}$, the induced matrix norm $\underset{||x||_2=1}{\max}||Ax||_2$
- $\vert\vert A\vert\vert_\infty=\underbrace{\max_i\sum_{j=1}^{n}|a_{ij}|}_{\text{maximum of the $l_1$ norms of the rows}}$, the induced matrix norm $\underset{||x||_\infty=1}{\max}||Ax||_\infty$
- $\vert\vert A\vert\vert_F=\underbrace{\sqrt{\sum_{i=1}^{n}\sum_{j=1}^{n}|a_{ij}|^2}}_{\text{known as the Frobenius norm}}$, not an induced norm
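A sketch computing each norm directly from its formula and comparing against `np.linalg.norm` (the matrix `A` is an arbitrary test choice):

```python
# The four matrix norms above, computed from their definitions and cross-checked.
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])

norm1 = np.max(np.sum(np.abs(A), axis=0))              # max column l1 norm
norm2 = np.sqrt(np.max(np.linalg.eigvalsh(A.T @ A)))   # sqrt of lambda_max(A^T A)
norminf = np.max(np.sum(np.abs(A), axis=1))            # max row l1 norm
normF = np.sqrt(np.sum(A ** 2))                        # Frobenius norm

print(np.isclose(norm1, np.linalg.norm(A, 1)))         # True
print(np.isclose(norm2, np.linalg.norm(A, 2)))         # True
print(np.isclose(norminf, np.linalg.norm(A, np.inf)))  # True
print(np.isclose(normF, np.linalg.norm(A, 'fro')))     # True
```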
For symmetric $A$
$$\underset{||x||_2=1}{\min}x^TAx=\lambda_{\min}(A)$$
$$\underset{||x||_2=1}{\max}x^TAx=\lambda_{\max}(A),\text{ which equals }||A||_2\text{ if }A\succ0.$$
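A rough numerical illustration, sampling random unit vectors and comparing against the eigenvalues from `np.linalg.eigvalsh` (the symmetric matrix is random test data, and sampling only approximates the true extremes):

```python
# For symmetric A, x^T A x over the unit sphere is bounded by the extreme eigenvalues.
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((3, 3))
A = 0.5 * (B + B.T)                              # symmetrize the test matrix

X = rng.standard_normal((50000, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)    # random points on the unit sphere
vals = np.einsum('ij,jk,ik->i', X, A, X)         # x^T A x for each sample

lam = np.linalg.eigvalsh(A)                      # eigenvalues in ascending order
print(vals.min(), lam[0])                        # close to lambda_min(A)
print(vals.max(), lam[-1])                       # close to lambda_max(A)
```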
Positive (semi)definite matrices
Positive semidefinite matrices
Let $A\in\mathbb{R}^{n\times n}$ be symmetric. We say that $A$ is positive semidefinite, written $A\succeq0$, if $x^TAx\geq0$ for all $x\in\mathbb{R}^{n}$.
Example:
$$A=\left[\begin{matrix}3&1\\1&2\end{matrix}\right]$$
$$x^TAx=\left[\begin{matrix}x_1&x_2\end{matrix}\right]\left[\begin{matrix}3&1\\1&2\end{matrix}\right]\left[\begin{matrix}x_1\\x_2\end{matrix}\right]$$
$$=3x_1^2+2x_1x_2+2x_2^2$$
$$=2x_1^2+(x_1+x_2)^2+x_2^2\geq0$$
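The same example can be checked numerically; a small sketch using the eigenvalue test (a symmetric matrix is positive semidefinite iff all its eigenvalues are nonnegative, positive definite iff all are strictly positive):

```python
# Checking definiteness of the example matrix via eigenvalues and random samples.
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

print(np.linalg.eigvalsh(A))   # approx [1.382, 3.618]: both positive, so A > 0

rng = np.random.default_rng(2)
for _ in range(5):
    x = rng.standard_normal(2)
    print(x @ A @ x >= 0)      # True for every sample
```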
Positive definite matrices
Let $A\in\mathbb{R}^{n\times n}$ be symmetric. We say that $A$ is positive definite, written $A\succ0$, if $x^TAx>0$ for all $x\in\mathbb{R}^{n}\setminus\{0\}$.
If $D\succ0$ (positive definite), then $d=-D\nabla f(x)$ is a descent direction whenever $\nabla f(x)\neq0$, since $\nabla f(x)^Td=-\nabla f(x)^TD\nabla f(x)<0$.
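A minimal sketch of this claim, using a hypothetical quadratic $f$, a diagonal $D\succ0$, and a test point $x$ (all assumptions for illustration):

```python
# d = -D grad f(x) gives grad f(x)^T d < 0, so a small step along d decreases f.
import numpy as np

G = np.array([[2.0, 0.5], [0.5, 1.0]])   # symmetric positive definite
c = np.array([1.0, -1.0])
f = lambda x: 0.5 * x @ G @ x + c @ x
grad = lambda x: G @ x + c

D = np.diag([1.0, 3.0])                  # any D > 0, e.g. a positive diagonal
x = np.array([2.0, 2.0])
d = -D @ grad(x)

print(grad(x) @ d)                       # -48.0, negative: d is a descent direction
print(f(x + 1e-2 * d) < f(x))            # True for a small enough step
```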
Taylor’s theorem
with remainder term
Suppose $f$ is $(n+1)$ times differentiable on an open interval containing $[a,b]$. Then: $$ f(b) = f(a) + f'(a)(b-a) + \frac{f''(a)}{2!}(b-a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(b-a)^n + \frac{f^{(n+1)}(\xi)}{(n+1)!}(b-a)^{n+1} $$ for some $\xi \in (a,b)$.
with remainder term (high-dimensional)
Let $f\in C^1(\mathbb{R}^n)$ and $x,y\in\mathbb{R}^n$. Then there exists $\xi\in\{(1-s)x+sy : s\in(0,1)\}$ such that $$f(y)=f(x)+[\nabla f(\xi)]^T(y-x).$$
Let $f\in C^2(\mathbb{R}^n)$ and $x,y\in\mathbb{R}^n$. Then there exists $\xi\in\{(1-s)x+sy : s\in(0,1)\}$ such that $$f(y) = f(x) + [\nabla f(x)]^T (y - x) + \frac{1}{2}(y - x)^T \nabla^2 f(\xi) (y - x). $$
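A numerical illustration of the second-order expansion (the function $f$ and the points are test choices; the intermediate point $\xi$ is not computed, only the quality of the quadratic model, whose error for a $C^3$ function shrinks like $\vert\vert y-x\vert\vert^3$):

```python
# Quadratic Taylor model of a smooth test function, evaluated as y -> x.
import numpy as np

f = lambda x: np.exp(x[0]) + np.sin(x[1])
grad = lambda x: np.array([np.exp(x[0]), np.cos(x[1])])
hess = lambda x: np.diag([np.exp(x[0]), -np.sin(x[1])])

x = np.array([0.3, 0.7])
for t in [1.0, 0.1, 0.01]:
    y = x + t * np.array([1.0, -2.0])
    model = f(x) + grad(x) @ (y - x) + 0.5 * (y - x) @ hess(x) @ (y - x)
    print(abs(f(y) - model))   # error shrinks roughly like t^3
```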
Last modified on 2023-09-22