Daniel R. Kim, MD

Gradient, Directional Derivatives, Steepest Ascent or Descent

24 Nov 2016

The gradient $\nabla f(x)$ is the vector of partial derivatives of $f$ at the point $x$, one for each coordinate. One reason the gradient is useful is that we can use it to figure out derivatives of $f$ along any direction $v$, not just along one of the coordinates $x_i$.

Suppose we wish to approximate the change in $f$ when we add a small vector $v = (a_i)_{i=1}^n$ to $x$, namely $\delta f = f(x+v) - f(x)$. We simply use the first-order (linear) approximation component-wise: we sum the contributions $a_i \frac{\partial f}{\partial x_i}$ of each component $a_i$ to the change in $f$, which amounts to taking the dot product

\[\delta f = f(x+v) - f(x) \approx \nabla f(x) \cdot v .\]
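
As a quick numerical check of this approximation, here is a minimal sketch. The function `f(x, y) = x**2 + 3*x*y`, the point, and the displacement `v` are hypothetical choices made for illustration, not anything from the derivation above.

```python
def f(x, y):
    # Hypothetical example function (an assumption for illustration).
    return x**2 + 3.0 * x * y

def grad_f(x, y):
    # Analytic partial derivatives of f: (df/dx, df/dy).
    return (2.0 * x + 3.0 * y, 3.0 * x)

def linear_change(point, v):
    # First-order estimate of f(x + v) - f(x): the dot product grad f(x) . v.
    g = grad_f(*point)
    return sum(g_i * a_i for g_i, a_i in zip(g, v))

def actual_change(point, v):
    # Exact change f(x + v) - f(x), for comparison.
    moved = tuple(p + a for p, a in zip(point, v))
    return f(*moved) - f(*point)

point = (1.0, 2.0)
v = (0.01, -0.02)          # a small displacement
estimate = linear_change(point, v)
exact = actual_change(point, v)
print("linear estimate:", estimate)
print("exact change:   ", exact)
```

The two numbers should agree up to terms that are second order in the size of $v$, so shrinking `v` by a factor of 10 should shrink the gap by roughly a factor of 100.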

The value of $\frac{\partial f}{\partial x_i}$ is really the rate of change of $f$ per unit distance moved along $x_i$. Likewise, if we divide the approximate $\delta f$ by the distance $\|v\|$ moved along $v$, we get the directional derivative along $v$. Writing $\theta$ for the angle between $\nabla f$ and $v$, this is

\[D_v f = \frac{\nabla f \cdot v}{\|v\|} = \frac{\|\nabla f\| \|v\| \cos \theta}{\|v\|} = \|\nabla f\| \cos \theta.\]

(We use $D_v f$ rather than $\frac{\partial f}{\partial v}$ because $v$ is a vector, not one of the coordinates.)
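
The formula for $D_v f$ can be compared against a direct finite-difference estimate. The sketch below again assumes the hypothetical function `f(x, y) = x**2 + 3*x*y`; the helper names and the step size `h` are illustrative choices.

```python
import math

def f(x, y):
    # Hypothetical example function (an assumption for illustration).
    return x**2 + 3.0 * x * y

def grad_f(x, y):
    # Analytic partial derivatives of f: (df/dx, df/dy).
    return (2.0 * x + 3.0 * y, 3.0 * x)

def directional_derivative(point, v):
    # D_v f = (grad f . v) / ||v||
    g = grad_f(*point)
    norm_v = math.hypot(*v)
    return sum(g_i * a_i for g_i, a_i in zip(g, v)) / norm_v

def finite_difference(point, v, h=1e-6):
    # Move a distance h along the unit vector v / ||v||,
    # then divide the resulting change in f by h.
    norm_v = math.hypot(*v)
    moved = tuple(p + h * a / norm_v for p, a in zip(point, v))
    return (f(*moved) - f(*point)) / h

point = (1.0, 2.0)
v = (3.0, 4.0)
formula = directional_derivative(point, v)
numeric = finite_difference(point, v)
rescaled = directional_derivative(point, (6.0, 8.0))  # same direction, twice as long
print(formula, numeric, rescaled)
```

Because we divide by $\|v\|$, rescaling $v$ leaves $D_v f$ unchanged: only the direction of $v$ matters.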

Clearly $D_v f$ is maximal when $\theta = 0$ (i.e., $v$ is in the direction of $\nabla f$) and minimal when $\theta = \pi$ (the opposite direction). In general, the derivative along $v$ is the negative of the derivative along $-v$.
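
One way to see this concretely is to scan unit vectors around the circle and look at where $D_v f$ peaks. This is a rough sketch in two dimensions, using the gradient of the hypothetical function `f(x, y) = x**2 + 3*x*y` at the point `(1, 2)`; the 1-degree grid is an arbitrary choice.

```python
import math

def grad_f(x, y):
    # Gradient of the hypothetical f(x, y) = x**2 + 3*x*y.
    return (2.0 * x + 3.0 * y, 3.0 * x)

def directional_derivative(g, v):
    # D_v f = (grad f . v) / ||v||
    return (g[0] * v[0] + g[1] * v[1]) / math.hypot(v[0], v[1])

g = grad_f(1.0, 2.0)
grad_norm = math.hypot(*g)
grad_angle_deg = math.degrees(math.atan2(g[1], g[0]))

# Directional derivative along unit vectors at 1-degree steps.
angles = [math.radians(k) for k in range(360)]
values = [directional_derivative(g, (math.cos(a), math.sin(a))) for a in angles]

best = max(range(360), key=lambda k: values[k])    # direction of steepest ascent
worst = min(range(360), key=lambda k: values[k])   # direction of steepest descent

print("gradient direction (degrees):", grad_angle_deg)
print("best grid direction:", best, "with D_v f =", values[best])
print("worst grid direction:", worst, "with D_v f =", values[worst])
print("norm of gradient:", grad_norm)
```

The best grid direction should land within a degree of the gradient's own direction, the worst should sit 180 degrees away, and no sampled value should exceed $\|\nabla f\|$.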

Along $v=\nabla f$, the directional derivative is $D_v f = \|\nabla f\|$, which is always nonnegative and is the maximum value over all directions. This makes sense because the sign of each partial derivative in $\nabla f$ picks out the direction along that coordinate that gives a positive change in $f$; i.e., $\frac{\partial f}{\partial x_i} < 0$ denotes a negative change in $f$ in the positive $x_i$ direction, or equivalently, a positive change in $f$ in the negative $x_i$ direction. The relative magnitudes of the components of $v=\nabla f$ automatically give the steeper coordinate directions more weight.
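
To illustrate the sign argument, here is a small sketch with a function whose partial derivative in $y$ is negative. The function `f(x, y) = x**2 - 3*y`, the starting point, and the step size are all hypothetical choices for illustration.

```python
import math

def f(x, y):
    # Hypothetical example with a negative partial derivative in y.
    return x**2 - 3.0 * y

def grad_f(x, y):
    # Analytic partial derivatives of f: (df/dx, df/dy).
    return (2.0 * x, -3.0)

point = (1.0, 2.0)
g = grad_f(*point)      # (2.0, -3.0): df/dy < 0, so f grows as y shrinks
step = 0.01             # small step size (arbitrary)

# Step along the gradient (ascent) and against it (descent).
uphill = tuple(p + step * g_i for p, g_i in zip(point, g))
downhill = tuple(p - step * g_i for p, g_i in zip(point, g))

rise = f(*uphill) - f(*point)
drop = f(*downhill) - f(*point)

# Directional derivative along v = grad f, which should equal ||grad f||.
slope_along_gradient = sum(g_i * g_i for g_i in g) / math.hypot(*g)

print("uphill point:", uphill, "change in f:", rise)
print("downhill point:", downhill, "change in f:", drop)
print("D_v f along the gradient:", slope_along_gradient)
```

Stepping along $\nabla f$ moves $y$ in the negative direction, exactly because $\frac{\partial f}{\partial y} < 0$, and $f$ still increases. The $y$ component of the step is also larger in magnitude than the $x$ component, reflecting the steeper slope along $y$ at this point.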