[[Gradient]] Descent is an iterative [[Optimization]] method for finding a local minimum of a multivariable [[Derivative|Differentiable]] [[Function]].
$\huge
\mathbf{\vec x}_{n+1} =
\mathbf{\vec x}_{n} - \gamma \nabla f(\mathbf{\vec x}_{n})
$
$\huge \gamma \in \R $
>[!note] $\gamma$ is often called the 'learning rate' in the context of [[Machine Learning]].
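As a sketch, the update rule above can be implemented directly. The quadratic test function, starting point, and fixed iteration count below are illustrative assumptions, not part of the definition:

```python
import numpy as np

def gradient_descent(grad_f, x0, gamma=0.1, steps=100):
    """Iterate x_{n+1} = x_n - gamma * grad_f(x_n)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x - gamma * grad_f(x)
    return x

# Assumed example: f(x) = ||x||^2, whose gradient is 2x; minimum at the origin.
x_min = gradient_descent(lambda x: 2 * x, x0=[3.0, -4.0])
```

With a small enough fixed $\gamma$ the iterates contract toward the minimizer; here each step scales $\mathbf{\vec x}$ by $1 - 2\gamma = 0.8$.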
>[!example] Example: 2 Variables
>$\huge
>\begin{align}
>z = f(x,y)
>\end{align} $
>
>Find $p_x,p_y$ such that $f(p_{x},p_{y})$ is [[Maxima|minimized]].
>
>We take the [[Gradient]] of $f$ and step against it at each iteration.
>
>$\huge \begin{align}
>\mat{x_{i+1} \\ y_{i+1}} =
>\mat{x_{i} \\ y_{i}} -
>\gamma
>\nabla f\pa{
>\mat{x_{i} \\ y_{i}}
>}
>\end{align} $
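The two-variable iteration above can be traced numerically. The test function $f(x,y) = x^2 + y^2$, learning rate, and step count here are assumptions for illustration:

```python
import numpy as np

def grad_f(v):
    """Gradient of the assumed test function f(x, y) = x^2 + y^2."""
    x, y = v
    return np.array([2 * x, 2 * y])

gamma = 0.1
v = np.array([2.0, 1.0])       # starting point (x_0, y_0)
for _ in range(200):
    # [x_{i+1}, y_{i+1}] = [x_i, y_i] - gamma * grad f([x_i, y_i])
    v = v - gamma * grad_f(v)
```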
>[!info] [[Alexander Young|Young's]] Favorite Method
>A possible test for checking whether your learning rate is too high is to cut $\gamma$ in half whenever $\nabla f(x_{i}) \cdot \nabla f(x_{i+1}) < -\iota$, where $\iota\in \R^{+}$ is the cutting rate (e.g. $\frac{1}{2}$).
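A minimal sketch of this halving test, under the same assumptions as before (NumPy, a quadratic test function, and $\iota = \frac{1}{2}$ are all illustrative choices). A negative dot product between successive gradients signals overshoot, so the learning rate is halved:

```python
import numpy as np

def descend_with_halving(grad_f, x0, gamma=1.0, iota=0.5, steps=50):
    """Gradient descent that halves gamma whenever successive gradients
    oppose each other, i.e. grad_i . grad_{i+1} < -iota."""
    x = np.asarray(x0, dtype=float)
    g = grad_f(x)
    for _ in range(steps):
        x = x - gamma * g
        g_next = grad_f(x)
        if np.dot(g, g_next) < -iota:   # overshoot detected: cut the learning rate
            gamma /= 2
        g = g_next
    return x, gamma

# Assumed example: f(x) = x^2, so grad f = 2x. gamma = 1.0 overshoots at
# first (the iterate flips sign), which triggers exactly one halving.
x_min, final_gamma = descend_with_halving(lambda x: 2 * x, x0=[5.0])
```

Starting the rate deliberately high and letting the test cut it is one way to avoid hand-tuning $\gamma$; the oscillation disappears as soon as $\gamma$ drops below the stable threshold.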