Newton's Method

295. Newton's Method

295.1. Newton’s Method

Newton’s method is an iterative numerical technique to find roots of a function (i.e., solutions to ).

Step 1. At each step, approximate by its tangent line at the current guess :

Step 2. Set this approximation equal to 0:

Step 3. Solve for :

Step 4. This gives the update rule:

295.1.1. Newton’s Method for Nonlinear Optimization

Used to find the roots of the first derivative , which correspond to the local minima/maxima of the original function .

Finding stationary points of a function , i.e., points where the gradient (derivative) is zero:

It can be derived from either linear approximation of the gradient or quadratic approximation of the function

295.1.1.1. 1D Linear Approximation

Start with the first-order (linear) Taylor expansion of the derivative around :

Stationary point condition: Set the derivative to zero to find a candidate minimum or maximum

Solve for the next iterate :

295.1.1.2. nD Linear Approximation

Start with the first-order (linear) Taylor expansion of the gradient around :

where is the Hessian matrix

Stationary point condition: Set the gradient to zero to find a candidate minimum or maximum:

Solve for the next iterate :

295.1.1.3. 1D Quadratic Approximation

Start with the second-order (quadratic) Taylor expansion of the function around :

Minimize the quadratic approximation: Set the derivative of to zero:

Solve for the next iterate :

295.1.1.4. nD Quadratic Approximation

Start with the second-order (quadratic) Taylor expansion of the function around :

Minimize the quadratic approximation: Set the gradient of with respect to to zero:

Solve for the step and next iterate :

295.1.1.5. Quadratic Approximation

Example

Hello

295.1.2. Multi-Dimensional

second-order iterative algorithm

uses both the

first derivative (gradient)
second derivative (Hessian)

Step 1. Start with initial guess

Step 2. At each iteration :

Compute gradient
Compute Hessian
Update:

Instead of computing the inverse explicitly, we solve the linear system:

Then update:

Example

Step 1. Compute Gradient

Step 2. Compute Hessian

Step 0. Setup

We want to minimize a function

At some iteration , we compute:

Gradient

Hessian

We need the Newton step vector from:

so that the update is:

Step 1. We need the Newton step from:

Which is two equations:

Step 2. Solve without inversion

This is a simple linear system.

For example, solve the first equation for :

Plug into the second:

Then substitute back for :

Step 3. Newton update