295. Newton's Method
295.1. Newton’s Method
Newton’s method is an iterative numerical technique to find roots of a function (i.e., solutions to ).
Step 1. At each step, approximate by its tangent line at the current guess :
Step 2. Set this approximation equal to 0:
Step 3. Solve for :
Step 4. This gives the update rule:
295.1.1. Newton’s Method for Nonlinear Optimization
Used to find the roots of the first derivative , which correspond to the local minima/maxima of the original function .
Finding stationary points of a function , i.e., points where the gradient (derivative) is zero:
- 1D:
- nD:
It can be derived from either linear approximation of the gradient or quadratic approximation of the function
295.1.1.1. 1D Linear Approximation
- Start with the first-order (linear) Taylor expansion of the derivative around :
- Stationary point condition: Set the derivative to zero to find a candidate minimum or maximum
- Solve for the next iterate :
295.1.1.2. nD Linear Approximation
- Start with the first-order (linear) Taylor expansion of the gradient around :
where is the Hessian matrix
- Stationary point condition: Set the gradient to zero to find a candidate minimum or maximum:
- Solve for the next iterate :
295.1.1.3. 1D Quadratic Approximation
- Start with the second-order (quadratic) Taylor expansion of the function around :
- Minimize the quadratic approximation: Set the derivative of to zero:
- Solve for the next iterate :
295.1.1.4. nD Quadratic Approximation
- Start with the second-order (quadratic) Taylor expansion of the function around :
- Minimize the quadratic approximation: Set the gradient of with respect to to zero:
- Solve for the step and next iterate :
295.1.1.5. Quadratic Approximation
Example
Hello
295.1.2. Multi-Dimensional
second-order iterative algorithm
uses both the
- first derivative (gradient)
- second derivative (Hessian)
Step 1. Start with initial guess
Step 2. At each iteration :
- Compute gradient
- Compute Hessian
- Update:
Instead of computing the inverse explicitly, we solve the linear system:
Then update:
Example
Step 1. Compute Gradient
Step 2. Compute Hessian
Step 0. Setup
We want to minimize a function
At some iteration , we compute:
- Gradient
- Hessian
We need the Newton step vector from:
so that the update is:
Step 1. We need the Newton step from:
Which is two equations:
Step 2. Solve without inversion
This is a simple linear system.
For example, solve the first equation for :
Plug into the second:
Then substitute back for :
Step 3. Newton update