# Descent Direction Iteration

Descent direction iteration is a class of optimization methods that rely on local information about the objective function, such as its gradient or Hessian at the current point. Suppose we are minimizing some function $$f : \mathbb{R}^d \rightarrow \mathbb{R}$$. The general framework is as follows. At iteration $$k$$:

• Check whether $$\mathbf{x}^{(k)}$$ satisfies the termination conditions.
• Determine a descent direction $$\mathbf{d}^{(k)}$$ using gradient and/or Hessian information. Some methods normalize it to a unit vector, but this is not required.
• Determine a step size $$\alpha^{(k)}$$ (also called a learning rate).
• Compute the (hopefully better) input point $$\mathbf{x}^{(k+1)} \leftarrow \mathbf{x}^{(k)} + \alpha^{(k)}\mathbf{d}^{(k)}$$.
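
The four steps above can be sketched in a few lines of Python. This is only one possible instance of the framework, not a definitive implementation: it assumes the descent direction is the negative gradient (steepest descent), the step size is a constant, and termination is checked with the gradient magnitude plus an iteration cap. The names `descent_direction_iteration`, `grad_f`, and the quadratic test function are hypothetical and chosen for illustration.

```python
import math

def descent_direction_iteration(grad, x0, alpha=0.1, k_max=1000, eps=1e-8):
    """Fixed-step steepest descent, one simple instance of the framework.

    grad  : callable returning the gradient of f at x as a list of floats
    x0    : starting point
    alpha : constant step size (an assumption; many methods adapt it)
    """
    x = [float(xi) for xi in x0]
    for k in range(k_max):
        g = grad(x)
        # Step 1: termination check, here on the gradient magnitude.
        if math.sqrt(sum(gi * gi for gi in g)) < eps:
            break
        # Step 2: descent direction, here the negative gradient.
        d = [-gi for gi in g]
        # Step 3: step size, here simply the constant alpha.
        # Step 4: update x^(k+1) = x^(k) + alpha * d^(k).
        x = [xi + alpha * di for xi, di in zip(x, d)]
    return x

def grad_f(x):
    # Gradient of the illustrative objective f(x) = (x_0 - 1)^2 + (x_1 + 2)^2.
    return [2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)]

x_min = descent_direction_iteration(grad_f, [0.0, 0.0])
```

For this quadratic the iterates approach the minimizer $$(1, -2)$$. Swapping in a different rule for `d` or `alpha` gives other methods within the same framework.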

Some common termination conditions, each stated as the condition under which the iteration stops, include:

• Maximum number of iterations - $$k : k \geq k_{\max}$$
• Absolute improvement - $$k : |f(\mathbf{x}^{(k)}) - f(\mathbf{x}^{(k+1)})|<\epsilon$$
• Relative improvement - $$k : |f(\mathbf{x}^{(k)}) - f(\mathbf{x}^{(k+1)})| < \epsilon|f(\mathbf{x}^{(k)})|$$
• Gradient magnitude - $$k : ||\nabla f(\mathbf{x}^{(k)})|| <\epsilon$$
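
As a rough sketch, these checks could be combined into a single helper that reports which condition fired first. The function name `termination_reason`, the shared tolerance `eps`, and the order of the checks are assumptions made for illustration. The iteration cap is treated here as "stop once $$k$$ reaches $$k_{\max}$$".

```python
import math

def termination_reason(k, f_curr, f_next, grad_curr, k_max=1000, eps=1e-6):
    """Return the name of the first stopping condition met, or None.

    f_curr    : f(x^(k))
    f_next    : f(x^(k+1))
    grad_curr : gradient of f at x^(k), as a list of floats
    """
    # Maximum iterations: stop once the iteration budget is used up.
    if k >= k_max:
        return "max_iterations"
    improvement = abs(f_curr - f_next)
    # Absolute improvement: the objective barely changed.
    if improvement < eps:
        return "absolute_improvement"
    # Relative improvement: the change is small compared to |f(x^(k))|.
    if improvement < eps * abs(f_curr):
        return "relative_improvement"
    # Gradient magnitude: the Euclidean norm of the gradient is near zero.
    if math.sqrt(sum(g * g for g in grad_curr)) < eps:
        return "gradient_magnitude"
    return None
```

In practice a tolerance is usually chosen separately for each condition, since the objective value and the gradient can live on very different scales.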

