Optimization Methods

Steepest Descent

Function of One Variable

Search for the minimum of a function $f(x)$ by taking steps proportional to the derivative $f'(x)$, in the direction opposite to $f'(x)$:

$x_{n+1} = x_n - \alpha\, f'(x_n)$

[The original Mathematica code and plots for this example (index_gr_4 – index_gr_21) were images lost in conversion.]
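Since the original code is unrecoverable, here is a minimal Wolfram Language sketch of the iteration described above (the name steepestDescent and the sample function are illustrative assumptions, not the author's originals):

(* Steepest descent in one variable: iterate x -> x - alpha f'(x)
   and return the whole trajectory of iterates. *)
steepestDescent[f_, x0_, alpha_, n_] :=
  NestList[# - alpha f'[#] &, x0, n]

(* Example: f(x) = x^2 - 2x has its minimum at x = 1. *)
f[x_] := x^2 - 2 x
steepestDescent[f, 3.0, 0.1, 10]

With a small step size alpha the iterates creep toward the minimum; too large a step makes them overshoot and oscillate or diverge.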

Function of Several Variables

Here we just use the gradient $\nabla f(\mathbf{x})$ instead of the derivative:

$\mathbf{x}_{n+1} = \mathbf{x}_n - \alpha\, \nabla f(\mathbf{x}_n)$

[The original Mathematica code and plots for this example (index_gr_23 – index_gr_33) were images lost in conversion.]
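A corresponding sketch for several variables, under the same assumptions (the gradient is formed symbolically with D and evaluated at each iterate; again, all names are illustrative):

(* Multivariate steepest descent: x -> x - alpha grad f(x).
   f is an expression in the symbols vars; x0 is a numeric vector. *)
steepestDescentND[f_, vars_, x0_, alpha_, n_] :=
  Module[{g = D[f, {vars}]},  (* D[f, {vars}] is the gradient vector *)
    NestList[# - alpha (g /. Thread[vars -> #]) &, x0, n]]

(* Example: minimize (x - 1)^2 + 2 (y + 1)^2, minimum at (1, -1). *)
steepestDescentND[(x - 1)^2 + 2 (y + 1)^2, {x, y}, {0., 0.}, 0.1, 25]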

Newton's Method

Function of One Variable

In order to find the minimum of $f(x)$, solve the equation $f'(x) = 0$ using the approximation:
$f'(x) \approx f'(x_n) + f''(x_n)\,(x - x_n)$
Therefore,
$f'(x_n) + f''(x_n)\,(x_{n+1} - x_n) = 0$  ==>  $x_{n+1} - x_n = -\dfrac{f'(x_n)}{f''(x_n)}$
                            ==>  $x_{n+1} = x_n - \dfrac{f'(x_n)}{f''(x_n)}$

[The original Mathematica code and plot for this example (index_gr_41 – index_gr_46) were images lost in conversion.]
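Again a hedged sketch with illustrative names, directly implementing the update just derived:

(* Newton's method in one variable: x -> x - f'(x)/f''(x). *)
newtonStep[f_, x0_, n_] :=
  NestList[# - f'[#]/f''[#] &, x0, n]

(* Example: for the quadratic f(x) = x^2 - 2x the method reaches
   the minimum x = 1 in a single step, since the linearization of
   f' is exact. *)
f[x_] := x^2 - 2 x
newtonStep[f, 3.0, 3]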

Function of Several Variables

The case is similar to the above, only we use the gradient $\nabla f(\mathbf{x})$ instead of the derivative, and the matrix of second derivatives, the Hessian $H(\mathbf{x})$ (a kind of "gradient of a gradient"), instead of the second derivative:

$\mathbf{x}_{n+1} = \mathbf{x}_n - H(\mathbf{x}_n)^{-1}\, \nabla f(\mathbf{x}_n)$

[The original Mathematica code and plot for this example (index_gr_48 – index_gr_53) were images lost in conversion.]
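A sketch under the same assumptions; rather than inverting the Hessian explicitly, the linear system $H\,s = \nabla f$ is solved for the step $s$:

(* Multivariate Newton: x -> x - Inverse[H(x)] . grad f(x). *)
newtonND[f_, vars_, x0_, n_] :=
  Module[{g = D[f, {vars}], h = D[f, {vars, 2}]},  (* gradient, Hessian *)
    NestList[
      # - LinearSolve[h /. Thread[vars -> #], g /. Thread[vars -> #]] &,
      x0, n]]

(* Example: for the quadratic (x - 1)^2 + 2 (y + 1)^2 one step
   lands exactly on the minimum (1, -1). *)
newtonND[(x - 1)^2 + 2 (y + 1)^2, {x, y}, {0., 0.}, 3]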

Gauss-Newton's Method

Suppose there is a vector-valued function $f(x) = (f_1(x), \ldots, f_m(x))$ and we need to minimize the value of $\|f(x)\|^2$.
Linearizing $f$ we get that for each $i$
$f_i(x) \approx f_i(x_n) + \nabla f_i(x_n)^T (x - x_n)$,
which in vector form is:
$f(x) \approx f(x_n) + J\,(x - x_n)$,    where $J$ is the Jacobian matrix of $f$ at $x_n$ and $J_{ij} = \partial f_i / \partial x_j$.

To minimize $\|f(x)\|^2$ we solve $\nabla \|f(x)\|^2 = 0$ using the obtained approximation:
$\nabla \|f(x_n) + J\,(x - x_n)\|^2 = 0$
==>  $J^T \left( f(x_n) + J\,(x - x_n) \right) = 0$

Expressing $x_{n+1}$ we get the step of the algorithm:
$x_{n+1} = x_n - (J^T J)^{-1} J^T f(x_n)$

[The original Mathematica implementation and example (index_gr_71 – index_gr_84) were images lost in conversion.]
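Since the original implementation is also lost, the following minimal sketch follows the step $x_{n+1} = x_n - (J^T J)^{-1} J^T f(x_n)$ derived above (the function name and the example residuals are illustrative assumptions):

(* Gauss-Newton: f is a list of residual expressions in vars. *)
gaussNewton[f_, vars_, x0_, n_] :=
  Module[{jac = D[f, {vars}]},  (* symbolic Jacobian matrix *)
    NestList[
      Function[x,
        Module[{j = jac /. Thread[vars -> x], r = f /. Thread[vars -> x]},
          (* solve (J^T J) s = J^T f(x) and step to x - s *)
          x - LinearSolve[Transpose[j].j, Transpose[j].r]]],
      x0, n]]

(* Example: the residuals x^2 + y - 2 and x + y^2 - 2 both vanish
   at (1, 1), so the sum of squares is minimized there. *)
gaussNewton[{x^2 + y - 2, x + y^2 - 2}, {x, y}, {0.5, 0.5}, 8]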

© Konstantin Tretyakov

