## The MOE Nonlinear Optimization Library

P. Labute
Chemical Computing Group Inc.
1010 Sherbrooke Street W, Suite 910; Montreal, Quebec; Canada H3A 2R7
An important SVL program library provided with MOE is the unconstrained nonlinear optimization library. Among the methods contained in the library are Steepest Descent, Conjugate Gradient, and Truncated Newton. Each of the methods is suitable for large-scale problems; that is, each requires memory linear in the number of free variables. In this article we present the methods as well as the results of an experiment to compare them. Each of the optimization methods has the same basic structure.
Let *x*[0] be an initial estimate of the minimizer and set *k* = 0. Each iteration then proceeds as follows:

1. **Convergence Test.** If the convergence tests pass, then terminate with *x*[*k*] as the solution. For pre-set convergence parameters *a* and *b*, the algorithm terminates if all three of the following conditions are satisfied (for *k* > 1):

   - | *f*(*x*[*k*]) - *f*(*x*[*k*] + *t*[*k*]*p*[*k*]) | < *a* { 1 + | *f*(*x*[*k*]) | }
   - | *t*[*k*]*p*[*k*] | < sqrt(*a*) { 1 + | *f*(*x*[*k*] + *t*[*k*]*p*[*k*]) | }
   - | grad *f*(*x*[*k*] + *t*[*k*]*p*[*k*]) | < cbrt(*a*) | *f*(*x*[*k*]) |

   or if (for *k* > 0)

   - | grad *f*(*x*[*k*] + *t*[*k*]*p*[*k*]) | < *b* { 1 + | *f*(*x*[*k*] + *t*[*k*]*p*[*k*]) | }

   where *t*[*k*] and *p*[*k*] are determined in the steps below.

2. **Search Direction.** Determine a descent direction *p*[*k*]; i.e., a direction such that *g*(*t*) = *f*(*x*[*k*] + *t* *p*[*k*]) is locally decreasing. (The determination of the search direction is the only step in which the various methods differ.)

3. **Line Search.** Determine a step size *t*[*k*] that minimizes the function *g*(*t*) = *f*(*x*[*k*] + *t* *p*[*k*]). A cubic spline interpolation search procedure is used, safeguarded with a step size limit and interval bisection.

4. **Update.** Set *x*[*k*+1] = *x*[*k*] + *t*[*k*]*p*[*k*], increment *k*, and go to Step 1.
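The four-step structure above can be sketched as a generic descent loop. The Python below is an illustrative analogue, not the library's SVL implementation: a simple Armijo backtracking search stands in for the safeguarded cubic spline interpolation line search, a plain gradient test stands in for the full convergence battery, and all function and parameter names are hypothetical.

```python
import numpy as np

def minimize(f, grad, x0, search_direction, b=1e-6, max_iter=1000):
    """Generic descent loop: test convergence, choose a direction,
    line-search along it, update, repeat (illustrative sketch)."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad(x)
        # Simplified gradient-only convergence test; the article's
        # full test also checks progress in x and in f.
        if np.linalg.norm(g) < b * (1.0 + abs(f(x))):
            break
        p = search_direction(x, g)                 # Step 2: descent direction
        t = backtracking_line_search(f, x, p, g)   # Step 3: step size
        x = x + t * p                              # Step 4: update
    return x

def backtracking_line_search(f, x, p, g, t=1.0, shrink=0.5, c=1e-4):
    """Crude Armijo backtracking stand-in for the safeguarded
    cubic spline interpolation search described in the text."""
    fx = f(x)
    while f(x + t * p) > fx + c * t * g.dot(p):
        t *= shrink
    return t
```

With this skeleton, Steepest Descent is simply `search_direction = lambda x, g: -g`; the other methods differ only in that one callback, mirroring the text's observation that only Step 2 varies.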
The convergence tests are designed to detect three situations:

- acceptability of the solution;
- unreasonably slow progress;
- unreasonable consumption of computing resources.
The first two conditions,

- | *f*(*x*[*k*]) - *f*(*x*[*k*] + *t*[*k*]*p*[*k*]) | < *a* { 1 + | *f*(*x*[*k*]) | } and
- | *t*[*k*]*p*[*k*] | < sqrt(*a*) { 1 + | *f*(*x*[*k*] + *t*[*k*]*p*[*k*]) | },

are designed to test whether the sequence of minimizer estimates is converging. The third condition,

- | grad *f*(*x*[*k*] + *t*[*k*]*p*[*k*]) | < cbrt(*a*) | *f*(*x*[*k*]) |,

is designed to test the necessary optimality condition of a vanishing gradient.
These conditions matter mostly for ill-conditioned problems; in well-conditioned problems the usual gradient test will apply.
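The full termination logic can be collected into a single predicate. The sketch below is an illustrative Python rendering of the tests as stated in the text; the default values of *a* and *b* are hypothetical, not the library's presets.

```python
import numpy as np

def converged(f_old, f_new, step, g_new, a=1e-8, b=1e-6, k=2):
    """Termination test from the text (illustrative; a, b are
    hypothetical presets).  Stops when f-progress, x-progress and
    the gradient are all small (for k > 1), or when the scaled
    gradient alone is small (for k > 0)."""
    small_f = abs(f_old - f_new) < a * (1.0 + abs(f_old))
    small_x = np.linalg.norm(step) < np.sqrt(a) * (1.0 + abs(f_new))
    small_g = np.linalg.norm(g_new) < np.cbrt(a) * abs(f_old)
    gradient_only = np.linalg.norm(g_new) < b * (1.0 + abs(f_new))
    return (k > 1 and small_f and small_x and small_g) or gradient_only
```

Note how each condition is scaled by 1 + |*f*|, so the same presets behave sensibly whether the objective values are tiny or huge.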
The Truncated Newton method uses an approximate solution to the Newton equations to determine a search direction. If, on iteration *k*, we expand *f* in a second-order Taylor series about *x*[*k*] along a direction *p*,

*f*(*x*[*k*] + *t* *p*) = *f*(*x*[*k*]) + *t* *g*[*k*]^{T} *p* + (1/2) *t*^{2} *p*^{T} *G*[*k*] *p*

where *g*[*k*] is the gradient and *G*[*k*] is the Hessian matrix of *f* at *x*[*k*], we obtain a local quadratic model. The critical point of the model with respect to *t* occurs at *t* = - *g*[*k*]^{T} *p* / ( *p*^{T} *G*[*k*] *p* ). Minimizing the model over all directions *p* leads to the linear system

*G*[*k*] *p* = - *g*[*k*]    (2)

whose solution is a minimizer of the model provided that *G*[*k*] is positive definite.
The equations in (2) are called the Newton Equations.
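To see why the Newton direction is attractive, consider a convex quadratic objective: there the model is exact, so solving equation (2) and taking a full step lands on the exact minimizer in one iteration. A small NumPy check (illustrative, not taken from the library; the matrix and vectors are made up):

```python
import numpy as np

# A convex quadratic f(x) = (1/2) x^T G x - c^T x with known minimizer.
G = np.array([[4.0, 1.0],
              [1.0, 3.0]])          # positive definite "Hessian"
c = np.array([1.0, 2.0])
x = np.array([5.0, -3.0])           # current estimate x[k]
g = G @ x - c                       # gradient at x[k]

p = np.linalg.solve(G, -g)          # Newton direction: G p = -g
x_new = x + p                       # full Newton step (t = 1)

# For a quadratic, one Newton step reaches the exact minimizer G^{-1} c.
print(np.allclose(x_new, np.linalg.solve(G, c)))  # True
```

For a general nonlinear *f* the model is only local, which is exactly why the line search and the safeguards discussed below are still needed.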
Newton's Method attempts to directly solve the equations for the
search direction (e.g., by matrix inversion); however, there are a
number of problems with this approach:
- Far away from the local minimum, *G*[*k*] may not be positive definite. This case must be detected, and special techniques must be used to determine the descent direction (the simplest being the use of the Steepest Descent direction).
- The memory requirements are large, since *G*[*k*] must be stored. For a problem of *n* variables, *G*[*k*] requires *n*^{2} words of memory.
- The inversion of *G*[*k*] at each step is computationally expensive. For a problem of *n* variables, inversion of *G*[*k*] requires O(*n*^{3}) operations.
The MOE Truncated Newton method uses two techniques to overcome these problems:

- An iterative equation solver is used to solve the Newton Equations at each step. The iterative method has the advantage that it can be stopped at any iteration and a search direction reported at any time; far away from the minimum, a less accurate solution can therefore be used. Furthermore, since only matrix-vector multiplies are required of *G*[*k*], it need not be stored explicitly: a finite difference technique is used to evaluate Hessian-vector products.
- The Hessian, *G*[*k*], is replaced with an approximation matrix that is guaranteed to be positive definite. This approximation approaches the Hessian as the minimum is reached.
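The finite difference Hessian-vector trick mentioned above can be sketched in a few lines: *G*[*k*] *v* is approximated from two gradient evaluations, so the *n* × *n* matrix is never formed. This is an illustrative Python sketch, not the library's SVL code, and the step size `h` is a hypothetical choice.

```python
import numpy as np

def hessian_vector(grad, x, v, h=1e-6):
    """Approximate G(x) @ v by forward-differencing the gradient:
    G v ~ (grad(x + h v) - grad(x)) / h.  Only gradient calls are
    needed, so the n x n Hessian is never stored."""
    return (grad(x + h * v) - grad(x)) / h

# Sanity check against a quadratic whose Hessian G is known exactly.
G = np.array([[2.0, 0.5],
              [0.5, 1.0]])
grad = lambda x: G @ x               # gradient of (1/2) x^T G x
x = np.array([1.0, -2.0])
v = np.array([0.3, 0.7])
print(np.allclose(hessian_vector(grad, x, v), G @ v, atol=1e-4))  # True
```

Each iteration of the inner solver then costs one extra gradient evaluation per matrix-vector product, which is what makes the method practical for large molecular systems.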
As an example, we ran the optimization methods described above on the protein 6RXN from the Brookhaven Protein Data Bank. The AMBER 1989 parameter set was used with a cutoff of 7.5 Å and a distance-based dielectric. Each method began with 5 steps of Steepest Descent and was terminated when the RMS gradient fell below 0.0002. The line search methodology was held fixed for all of the tests; that is, only the selection of the descent direction differed. The following table summarizes the results of the experiments. For each method, we report the total number of iterations required, the total number of function evaluations, and the normalized elapsed time required.
Since the library is written entirely in SVL, it can be used to optimize functions written in SVL. If an objective function calculates the function value and gradient, it can be optimized with the Optimization Library. The MOE Optimization Library is a powerful tool that has been used in a number of applications, including energy minimization, Implicit Euler dynamics, partial charge methods, and nonlinear fitting.
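The calling convention described, an objective that returns both the value and the gradient, can be illustrated with a Python analogue (this is not SVL, and the objective, step size, and driver loop are all hypothetical stand-ins):

```python
import numpy as np

def objective(x):
    """A test objective returning (value, gradient) together --
    the calling convention the text describes (Python analogue)."""
    target = np.array([3.0, -1.0])
    f = np.sum((x - target) ** 2)
    g = 2.0 * (x - target)
    return f, g

# Minimal gradient-descent driver over such an objective (illustrative).
x = np.zeros(2)
for _ in range(200):
    f, g = objective(x)
    if np.linalg.norm(g) < 1e-8:
        break
    x = x - 0.1 * g
print(np.round(x, 6))  # -> [ 3. -1.]
```

Returning the value and gradient from a single call is a sensible design: in molecular mechanics the two share most of their computation, so evaluating them together roughly halves the cost per iteration.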