BIB-VERSION:: CS-TR-v2.0
ID:: CORNELLCS//TR93-1385
ENTRY:: 1993-10-14
ORGANIZATION:: Cornell University, Computer Science Department
LANGUAGE:: English
TITLE:: Centering, Trust Region, Reflective Techniques for Nonlinear 
        Minimization Subject to Bounds
AUTHOR:: Li, Yuying
DATE:: September 1993
PAGES:: 16
ABSTRACT::
Bound-constrained nonlinear minimization problems occur frequently in 
practice. Most existing methods belong to an active set type which can be slow 
for large scale problems. Recently, we proposed a new approach [7,6,8] which 
generates iterates within the strictly feasible region. The method in [8] is a 
trust region type and, unlike the existing trust region method for 
bound-constrained problems, the conditions for its strong convergence 
properties are consistent with algorithm implementation. A reflective 
technique can be included in the method. In this paper, we motivate 
techniques which are important for our new approach. Numerical experience on 
some medium size problems is included. 
END:: CORNELLCS//TR93-1385
BODY::
Centering, Trust Region, Reflective
Techniques for Nonlinear Minimization
Subject to Bounds*
Yuying Li**
TR 93-1385
September 1993
Department of Computer Science
Cornell University
Ithaca, NY 14853-7501
*This paper was written in preparation for the Sixth Leslie Fox Prize presentation in
June of 1993 in Oxford, England. A condensed version of the paper appeared in the
proceedings of the 1993 Conference on Scientific and Engineering Computing for
Chinese Young Scientists.
** Research partially supported by the Applied Mathematical Sciences Research
Program (KC-04-02) of the Office of Energy Research of the U.S. Department of
Energy under grant DE-FGO2-86ER25013.AOOO, and in part by NSF, AFOSR, and
ONR through grant DMS-8920550, and by the Cornell Theory Center which receives
major funding from the National Science Foundation and IBM Corporation, with
additional support from New York State and members of its Corporate Research
Institute.
Centering, Trust Region, Reflective Techniques for Nonlinear Miniinization
Subject to Bounds'
Yuying Li2
Computer Science Department
and
Advanced Computing Research Institute3
Cornell University
Ithaca, New York 14853
September 17, 1993
Abstract. Bound-constrained nonlinear minimization problems occur frequently in practice. Most
existing methods belong to an active set type which can be slow for large scale problems. Recently, we
proposed a new approach [7,6, 8] which generates iterates within the strictly feasible region. The method
in [8] is a trust region type and, unlike the existing trust region method for bound-constrained problems,
the conditions for its strong convergence propenies are consistent with algorithm implementation. A
reflective technique can be included in the method. In this paper, we motivate techniques which are
impoflant for our new approach. Numerical experience on some medium size problems is included.
1 This paper was written in preparation for the Sixth Leslie Fox Prize presentation in June of 1993 in Oxford, England. A
condensed version of the paper appeared in the proceedings of the 1993 Conference on Scientific and Engineering Computing
for Chinese Young Scientists.
2 Research partially supported by the Applied Mathematical Sciences Research Frogram (KC-O4?2) of the Office of Energy
Research of the U.S. l)epartsnent of Energy under grant DE-FGO2-86ER2Sol3Aooo, and in pan by NSF, AFOSR, and ONR
through grant DM5-8920550, and by the Cornell Theory Center which receives major funding from the National Science
Foundation and IBM Corporation, with additional support from New York State and members of its Corporate Research Institute.
? The Advanced Computing Research Institute is a unit of the Cornell Center for Theory and Simulation in Science and
Engineering Crheory Center).
1. Introduction. We consider an importantclass ofnoniinearminimizationproblems: minimization
ofa smoothnonlinear flinction subject to bound constraints on the variables. A bound constrained problem
can be described as
(l.1)			xW'? f(x) 1< x <u,
wherel E f? U ?--HcoJJ?, t E f? U [ooJ1?, 1 < t, and f: ?? ?. We denote the feasible set
?d=ef ?: 1< x < uJ andthestrictinteriorint(T) def ?:I < x <ttJ.
Solving this problem involves two tasks: one is to efficiently handle bounds imposed on the variables.
The other is the nonlinear aspect of the problem. The presence ofbound restrictions adds a combinatorial
nature to the problem and makes this problem more complicated but interesting.
Recently we have developed a new approach which has good computational performance and strong
theoretical properties [7, 6, 8]. The purpose of this paper is to discuss important techniques which
are essential to the development of our methods. The presentation of the paper is divided into three
components. In 2, we first indicate that for large problems, many existing approaches can be inefficient
because of the way bounds are handled. By examing a simpler problem, namely a linear program, we
indicate possible techniques to achieve better efficiency. Then, in 3, we will review the trust region
idea which is an appealing way to handle the nonlinear aspect of a problem. Moreover, we will illustrate
the gap, between the convergence theory and implementation, which exists for other trust region type
methods (e.g., [9]). Finally, in 4, we motivate and describe in detail our new approach which, we
believe, is superior to existing approaches in both regards. Some numerical experience is reported in 5
and conclusion remarks are given in 6.
The following assumption is made throughout the presentation:
(AS.l) Given an initial point xo E ?, we assume that ? is compact where ? is the level set,
i.e.,  = fx: x E ? and f(x) < f(x0)).
2. Techniques for Handling Linear Inequalities. There are many special algorithms tailored for
minimizing noniinear flinctions with bound constraints (e.g., [3, 7, 9, 10, 13, 15, 20]). Most existing
methods handle bounds on variables using an active set strategy.
Minimization algorithms typically generate a sequence of points fxkY. Each new point Xk+1 is a
step from the current point Xk Xh+1 = Xk + Sk. The step Sk is usually a good step which can result in
sufficient decrease of f(x).
An active set method for (1.1) seeks an improved point Xk+1 E ? by keeping the variables currently
at their bounds fixed, when possible. At the end of each iteration, a new variable may reaches one of its
bounds (and thus becomes a binding variable). If it is not possible to reduce f(x) under the requirement
that all the binding variables remain at their bounds, e.g., at a vertex, a decision is made to leave the
binding bound of a variable. If this is not possible, a local minimumof(l.l) is found. The combinatorial
nature of the active set technique makes it unsuitable for large-scale problems. Although techniques
allowing for more rapid change of the active set have been considered for recent active set type methods
(e.g.,[4, 9, 10]), the sequential boundary following nature of the active set approach remains. Therefore,
the number of iterations required to solve aproblem typically grows as the dimensionofthe problem (1.1)
increases.
In order to see how bound restrictions can be dealt with more efficiently, we temporarily simplify
the problem (1.1) by assuming that its objective flinction is linear. Under this assumption we have a very
simple linear programming problem.
Until about a decade ago, the only well-accepted efficient method to solving a general linear program-
ming problem had been the simplex method. The simplex method is essentially an active-set method. In
2
lilterior McthodThj?ctcry
Simplex Me?hodTra?ory
FIG. 1. Thajedortes ofthe Intertor Potnt Method and Simplex Method
1984, Karmaikar proposed a projective scaling method which has generated tremendous interest in op-
timization community. In addition to its polynomial complexity, Karmarkar's projective scaling method
has an important and attractive propefly: it typically reqwres a small (hetween 30-70) numberofiterations
to solve a large problem (tens of thousands of variables). This feature makes the method particularly
suitable for large-scale problems.
The simplex method and projective scaling method are very different. While the former is a finite
iterative method, the latter an infinite iterative method. Geometrically, the projective scaling method
looks distinctively different from the traditional simplex method: a solution is approximated from the
points within the center of the feasible region rather than following its boundaries, seeHG. 1.
Karmarkar's projective scaling algorithin has promoted a class of interior point algorithms for linear
programming problems. The aliine scaling method is an example. In fact, in practice, it is the afline
scaling method which has enjoyed the most use. The affine scaling method was proposed and analyzed
by Barnes in 1986 [2] and Vanderbei, Meketon and Freeman (1986) [23]. This method is a resurrection
of an old method, first suggested and analyzed by Dikin in 1967 [11]. The iterates generated by the alfine
scaling method stay within the strictly feasible region. Compared to the Karmarkar's projective scaling
method, the affine scaling method is simple, intuitive and works on the problem formulation directly.
Despite the lack ofpolynomial complexity [18], an alfine scaling method usually performs well in practice
[1, 23] and has the similar properties that a small number of iterations is typically needed to solve a large
problem.
Since our problem (1.1) is nonconvex and we are not concemed with its complexity issues, we are
interested in alline scaling techniques here. We use the box constraint to illustrate the basic idea of the
alfine scaling method: the centering scheme.
Assume that we have a strictly feasible Xk E int(Y), i.e., 1 < Xk < U.
As indicated in HG. 2, the sides of the box restrict the movement of the current point xk. Hence it
is reasonable to change units of each variable so that, in the new coordinates, the current feasible point
Xk becomes equally distant to all the nearest sides of the box: Xk --H D?emin(xk --H 1k,?k --H Xk) =
D??emin(xk --H 1k,1tk --H xk) = ewhere
= diag(min(x? --H 1k, Uk --H
The benefit of centering is that, in the new coordinates, sufficient progress can be made to reduce the
objective ftmction before a bound of a variable is reached.
3
xk
x - space
FIG. 2. Effect ofAffine Scaimg Thansfor?atton z =			x
A??space
In particular, the best direction to take, in the new coordinates, is the steepest descent: dk
(D??c)?lc. This corresponds to the scaled steepest direction dk = ?(D?e)?2c in the original
variable space. This direction is geometrically angled away from the approaching bound, see HO. 2.
Moving along dk, a step is determined from the current point to the nearest boundary and a large fraction
(e.g.,0.9) of this step is taken to stay strictly feasible.
From dk = ?(D??e)?2c, it is clear that, while the active set methods follow the boundary exactly
in a sequential fashion, the affine scaling method traces the boundary approximately but simultaneously.
This explains why the aliine scaling method typically takes fewer iterations to reach the neighborhood
of a solution (the total number of iterations is usually relatively insensitive to the problem size). For
nonlinear minimization problems, curvature information is a crucial factor in determining a good step.
Restricting the iterates to stay on the boundary may fail to produce such a step.
Although the aiiiine scaling method can approach the neighborhood of a solution quickly, it is a
linearly convergent method.
Our first objectives is to adapt the aifine scaling technique to the problem of minimizing a nonlinear
function subject to bounds in a manner that quadratic convergence can be achieved.
3. ?ust Region Idea. In this section, we concentrate on the other impoflant aspect of a nonlinear
bound constrained problem: the nonlinear function f(x). The nonlinearaspect ofan optimizationproblem
can be successftilly dealt with by the trust region idea.
The trust region idea dates back to Levenberg (l944,[l6j) and Marquadt (l963,[l7]) for nonlinear
least squares problems. Since then, the trust region methods have been developed for unconstrained
irni??nnn?1lneari			ineilnonili			uiTii7Mii			nil--H			if??-?tic			tinondiffe
II11rL.??.?????.??y consw??.? o+.????aedr mII?....????n, flOL?.ll?dt Sy6Wffl U t?qu??Jns, w? .????.???r-
entiable minimization.
Trust region methods form a respected class of algorithrns for solving unconstrained minimization
problems. Their popularity comes from the naturalness of the approach, strong convergence properties
and the development of reliable, efficient software.
The motivation of a trust region method for in??xE?n f(x) is very simple. Following Taylor's
theorem, within a sufficiently small neighborhood IIDksII < Ak (trust region) of the current point
xk, the nonlinear function f(xk + s) can be well approximated by the quadratic f(xk) + `kk(5) where
def T5+ lsTHks. Let us express the neighborhood by aboundA? on the steps. Then the original
= 9k 2
problem min? f(x) can be approximated solved by the following trust region subproblem:
(3.1)			def g?T? + 1 T
Bks : IIDksJI < A&1.
Here Bk is an approximation to the Hessian matrix ?f(xh), Dk is a scaling matrix and Ak is a trust region
size. The scaling matrix hk is not essential to the basic trust region idea. However, it is important if a
4
problem is badly scaled. The strong convergence properties are achieved when Sk is a good approximation
to a global solution of (3.1).
To determine the step 8k completely, we need to specily the trust region size Ak. The trust region size
Ak is updated in a simple manner based on the agreement between the nonlinear lunction f(x) --H f(xk)
and the quadratic approximation `kk(x). This agreement is measured by the ratio
J def
Pk = (f(xk + sk) --H f(xk))/?(sk).
f
An iteration with Pk > 1 15 said to be successful. Otherwise, the iteration is unsuccessful. The aim of trust
region size updating is to force P?k > ? and hence ensure sufficient reduction of the objective flinction. A
general trust region scheme for unconstrained minimization of f(x) is described in HG. 3.
Algorithm 0: Unconstrained Trust Region Method
Fork =0,1,..
1. Compute f(xk) and the model ?k
2. Define an ayProximate solution Sk to subproblem (3.1).
3. Compute Pk = (f(xk + sk) --H f(xk))/?(Sk).
4. ?P?k > ? then Xk+l = Xk + Sk. Otherwise Xk+1 = Xk.
5. Update the scaling matrix Dk and Ak.
Updating Trust Region Size
LetO<?<"< land?1 <1 <?begiven
1. 1?P?k < ithenA?+i E (O,?Ak].
2. If pf E (i,n)thenA?+i E [?lAk,Ak].
3. If P?k > ?thenA?+i E [Ak,?Ak].
?o.3. TruslRegwnMethodfor UnconsLramedMin?iza1ion
There are a few choices for the norm used to bound a step in the trust region subproblem (3.1). The
usual choice is 2-norm. Though it is not easy to compute a global solution to the trust region subproblem
with a ball constraint, many techniques are available to compute a sufficiently good approximation
[5, 7, 14, 19, 21, 22].
Trust region idea has been used in the context of bound constraints as well. The co-norm is a
natural choice for a linearly constrained problem, [9, 12]. In [9], the following quadratic programming
subproblem is solved:
?s??f?k(S) IIDksIIoo < Ak, I < ?k + S < uJ.
Unfortunately, there is a problem with this norm selection. The strong convergence properties
established for trust region type methods hinge upon a good approximation to a global solution of a trust
region subproblem. Since the computational methods for a quadratic programming subproblem usually
?niarantee a local solution at best. a wide ?an exists between the conver?nce theorv and imolementatioi
of trust region methods for bound constrained problems.
4. An Interior and Trust Region Approach. We have analyzed possible techniques for handling
bounds and the nonlinear aspect of(l.l). We have demonstrated some problems associated with the most
existing methods:
5
o+ possible inefficiency due to sequential boundary following behavior of active set methods for
large-scale problems;
o+ slow local convergence of the affine scaling method for linear programs;
o+ discrepancy between the theory and implementation of the trust region method type method for
(1.1).
We have also illustrated how the first problem can be overcome with the affine scaling technique. Based
on the above analysis, we have the following objectives:
o+ adapt the affine scaling technique to nonlinear minimization with bound constraints;
o+ achieve quadratical convergence;
o+ use the trust region idea to obtain strong convergence properties which are achievable in imple-
mentation.
First, we define a vector flinction v(x): ?fl			?? as follows.
DEHNITION 4.1. The vector v(x) E ?? is defined:
(i).			ffvAx)i<oandni<oothenvi def Xj?Uj.
(ti).			IfVf(x)i > Oandl?> --Hoo then Vj def Xj --H ij.
(tit).			IfvAx)i<oandtti=oothenvi def
def
(iv). ffvAx)i>?oandI?=--Hoothenv? = 1.
Following Matlab notation, for any s E ??, diag(s) denotes an n-by-n diagonal matrix with the
vector s defining the diagonal entries in their natural order. In addition, we define
d			1
(4.1)			D(x) = diag(?v()??2)
i.e., D-2 is a diagonal matrix with the jth diagonal component equal to Ivi(x)I.
We consider the following diagonal system of nonlinear equations:
(4.2)			D(x)?2Vf(x) = 0.
In all subsequent discussions, H(x) d=d ?f(x) and g(x) def Vf(x). Assume that x* satisfies the
equation (4.2). The nonlinearsystem (4.2) demands that ifa component ofy* is positive, the corresponding
variable is at its lower bound. If a component of Y* is negative, the corresponding variable is at its upper
bound. Hence, (4.2) is the first order necessary conditions for a minimizer of (1.1). Unlike the KKT
conditions used in many interior point methods, the system (4.2) does not involve strictly feasible dual
variables. System (4.2) is continuous but not everywhere differentiable. Nondifferentiability occurs when
Vj = 0; we avoid such points by restricting Xk E int(?).
Assume that Xk E int(?). A Newton step for (4.2) satisfies
(4.3)			Mkd%N=?gk
where is a Newton step in the new coordinates under the affine scaling transformation x def Dkx
(hence dN --H DkdNk) and
g?d?f D?'gk = diag(iv?i21)g?,
Mk ?? D?1VfkDk' + diag(g?)J??.
6
Here Jv(x) E ?nxn corresponds to the Jacobian of Iv(x)I. It is easy to see that J? is a diagonal matrix.
For example, if all the components of 1 and u are finite, J? diag(sgn(Vf)). If variable Xj has a finite
lower bound and an infinite upper bound (or vice-versa) and Vfj = 0, we define J2Vj = 0 at such a point.
The following lemma can be easily proved.
LEMMA 1. Assume that x* E ?.
(a) If x* ts a localminimizer of(1.1) then 9* = 0
(b) If x* is a local minimizer then M* is positive semi-definite and 9* = 0.
(c) If M* is positive definite and 9* = 0 then x* is a local minimizer of(1 .1).
The results of Lemma 1 indicate that computing a local minimizer of a bound constrained problem
(1.1) is the same as locating a point such that 9* = 0 and M* positive semidefinite. Therefore, loosely
speaking, we have transformed abound-constrained problem (1.1) to a problem offinding alocal minimizer
for some unconstaained problem.
Though 9 and M do not correspond to the gradient and Hessian of a specific nonlinear function,
Lemma 1 implies that a solution of the following trust region subproblem is a reasonable step
(4.4)			min???(s) : IIs112 < Ak),
where
`kk(s???9kTS?+IsTMks.
2
Let s = Dk?1s. Subproblem (4.4) is equivalent to the following problem in the original variable
space:
(4.5)			minf??(s) : IIDksII2 < Ak),
where
def 5T			+ 1 T
=			9k
def
Ck = D?diag(??)J??D?,
Mk def ?fk + Ck.
It is clear that Ck is a positive semi-definite diagonal matrix. This matrix contains the constraint
information. Moreover, in the neighborhood of a local minimizer, the Newton step to (4.2) is a solution
to the trust region subproblem (4.5) if the trust region size Ak is sufficiently large.
def
For the subsequent discussion, we use the notation: Pk = argn1ins????('k(s): IIDksII < Ak).
We have derived a trust region subproblem (4.5) to compute a good step. Now we discuss how to
choose the trust region size.
Since our quadratic model `kk(5) includes the constraint information and is not an approximation to
f(x) --H f(xk), we need to extend the definition of P?k By observing that `[)k(5) is an approximation to
f(xk + s) --H f(xk) + 2isTCks, we extend P1k as follows
J def f(xk+5k)--Hf(xk)+2lsTkC(xk)sk
Pk =			`t)k(5k)
7
Similar to unconstrained trust region methods, Pk1 measures the agreement between the nonlinear flinction
f(x) and its quadratic approximation. If P?k > ? for some'i > 0, the nonlinear function is reduced by at
least a fraction ? of that of the quadratic model. In this case, we say a step is successful. Otherwise a step
is unsuccessful.
Algorithm 1: An Interior Trust Region Method for Bound Constrained Problems
xo E int(?)
Fork =Ql,..
1. Compute A? g?, Hk, and C?; Detinethequadraticmodel `kk(8) g?Ts+ 2isT(Hk +Ck)s.
2. Compute Sk an approximate solution based on (4.5) with Xk + Sk E int(?).
3. Compute
j _ f(xk+sk)--Hf(xk)+2iskTcksk
--H			`kk(sk)
4. If P?k > ij then set xk+1 = xk + sk. Otherwise set xk+1 = Xk
5. Update the scaling matrix D? and Ak as specified.
Updating Trust Region Size Ak
LetO < it < ?< 1,?? <1 <?and0 <Atbegiven
1. 1?P?k <itthensetA?+i E (O,?Ak].
2. If P1k E (it? n) then set Ak+i E [?1Ak,Ak].
3. 11P1k > ?then
setA?+i E [Ak,?2Ak].
FIG. 4. An Interior Thust Region Melhodfor Mmurn?ation Subject to Bound?
In FIG. 4, we describe our trust region method. Comparing AIgorithm 1 with AIgorithm 0, the
descriptions ofthe two algorithms are almost the same except the definition ofPk?' Dk and the requirement
Xk + Sk E int(?). The requirement that xk + Sk E int(?) is necessary because we want approximations
?xk1 to approach a solution from int(?). Our new algorithm looks simple and elegant for a fairly
complicated problem. But will this algorithm compute a solution? To answer this question, we need
to demonstrate a step Sk, which satisfies the requirement Xk + Sk E int(F) and produces a sufficient
decrease of ?k(s), can be obtained based on the subproblem (4.5).
We answer this question next.
Affine Scaling Transformation x = Dkx and Reflection. In order to see why a reasonable step Sk,
Xk + sk E int(?) can be obtained subproblem (4.5), let us examine the propeflies of the global solutions
?Pk1 to (4.5).
We analyze the ith component of Pk as an example.
Assume that 1 Vki 1 = min(xkj --H I, t --H Xki). The optimality conditions say that, if Xj is at a boundary,
Vj = 0. This implies that, locally speaking, Xk is approaching the `correct' bound according to the
prediction of the gradient. In this case, (Dk))ii = ((D??e)2l )??. Since Pk is a global solution of (4.5), Pk
satisfies
(Mk + Ak)Dk?'pk = Dk?'gk.
(Dk?'CkDk?' + AkI)pk = Dk2(9k + Hkpk).
8
Written in another form
--H I.
FIG. 5. Centuing
It is easy to verify from the above equation that ?(D2k)iiPkiJ is bounded underthe assumption that Ak < Au
for some A? > 0. Hence, in the ith component, the effect of the scaling matrix Dk is similar to that of
D???. Therefore sufficient progress can he made along Pk in the ith component, e.g., FIG. 5.
Now suppose that IVkiI $ min(r?j --H ii, Uj --H Xkj). Locally, this implies that Xki is approaching an
`incorrect' bound according to g?? From the definition of Dk, the shape ofthe trust region allows a large
increment in the variable Xj. Moreover, the scaled steepest descent D??2g? usually directs away from
the bound of Xj in a sharp angle (unless the corresponding gradient is small which is a near degenerate
case). Since Pk and Dk?2Yk has an acute angle and, when the trust region size converges to zero, the
trust region solution converges to the scaled steepest descent direction D?2g?, the trust region solution
Pk will likely he pointing away from the closest bound of Xj as well. FIG. 6 displays this situation. Agaln,
the bound of Xj will not prevent a sufficient decrease of `kk(8).
Nonetheless, it is possible, that Pk points directly towards the closest bound of Xj as illustrated in
FIG. 7. From IVkiI $ min(xkj --H ij? flj --H Xki) and deflnition of Vk, Vkjdkj > 0. But YkiVki > 0 always.
Hence we have
(4.6)			?kjdkj > 0
The above inequality is important hecause it implies that, if only a small step is possible following Pk
the quadratic function ?k(s) can he further decreased by following the reflection pRk of Pk? i.e., pkR = Pk
R
except Pki = ?Pki The reflection technique is important under this circumstances hecause it guides the
iterations towards the center of the feasible region and leaves the `incorrect' bound (see FIG. 7).
Our alline scaling transformation x = Dkx differs from the affine scaling transformation used for
linear programming problems in two regards: Dk is based on the current gradient gk and a diagonal
component is the squareroot of the distance of the corresponding variable to its closest bound, if this
bound is correct according to 9k Our choice of Dk comes naturally from the Newton step ofthe nonlinear
systems characterizing the first order optimality conditions of (1.1). Therefore a global step blends
automatically into a Newton step locally and achieves fast local convergence. The advantage of defining
the affine scaling transformation based on the gradient 9k is also indicated by FIG. 6 and FIG. 7. The shape
9
?G. 6. Leave an Incorrect Bowid
FIG.?. Reflection
10
of trust region allows a step to leave the `incorrect' hound based on the current gradient and a reflection
technique can be incorporated to achieve this if necessary. For a nonlinear minimization problem, a
starting point far away from the center of the feasible region is often provided. Hence it is important for
an interior approach to quickly leave the `incorrect' constraint when applied to a nonlinear problem. This
suggests that the scaling matrix diag(iv?i) is more reasonable than diag(min(x? --H 1, u --H xk)) because the
latter forces the iterates to follow the `incorrect' bound. We have done some computational experiments
to investigate the choice between D??e2 and Dk. Our experience clearly indicates that Dk is the winner.
A reflection path is formally described as follows. We emphasize, however, a couple of reflections
are sufficient in practice.
Define the vector4
(4.7)
BRk = max[(l --H xk) .1 dk, (u --H xk) ./ dk)],
where the notation" .1" indicates component-wise division. Component i of vector BRk records the
positive distance form Xk to the breakpoint corresponding to variable Xkj in the direction dk. The reflection
path is defined in FIG. 8. Since only a single outer iteration is considered, we do not include the subscript
k with the variables in our description - dependence on k is assumed.
Definition: Reflection Path p(d)
[Let P0 --H 0 d' --H d set b0 --H xk.]
Fori=l,2,???,oo
1. Let P? be the distance to the nearest breakpoint along d':
P?=rmnfBR: B1?>O1
2. Define jth breakpoint: bi = b?-1 + (p? --H Pi-1)di.
3. Reflect to get new dir'n and update BR:
(a) di+1 = dt
(b) For each j such that (bi)j = Uj (or (b?)j = ii)
o+ BR(j) = BR(j)+ I?4j'?I
o+ (d?+1) --H
flo. 8. A Reflection Path p(d)
Conditions on a Step Sk. We have discussed why a good step can be obtained based on a solutionp?
of (4.5). It is not reasonable, however, to assume that the exact global solution is available, particularly for
large scale problems. Instead, approximation to a global solution can be computed. What are conditions
Sk need to satisfy in order to guarantee required optimality conditions? We address this issue here.
For simplicity we assume the quadratic model ?k(s) = vf(xk)Ts + 2? sT(?f(xk) + C(xk))s. Given
any step dk, a possible step-back may be necessary to stay strictly feasible. We use ?k* [dk] to denote the
step obtained from dk with a possible step-back. The exact definition of 0k* [dk] is given below. Let Tk*
For the purpose of computing BR we assume the following rules reg&ding arithmetic with infinities. tf a is a finite scalar
then a + oo = 00, a --H 00 = --H00, ?"a = 00 sgn(a), --H? = --H00 sgn(a), 0a = sgn(a) 00, --H? = 00, and = --H00,
wheresgn(a) = +1 if a> O,sgn(a) <Oif a <0.
11
a point satisfying (As.3)
a point satisfying (As.3) and (As.4)
a point satisfying (As.4)
FIG. 9. An Era??pIe: Sattsfactton ofBoth Condtttons on Steps
denote the minimizer along dk within the feasible trust region, i.e., Tk* = ar?fnin?f?5?(Td?): IIrDkdkII <
Ak,xk + rdk E F?, 0k ? [Oi, 1] for some 0 < Oj < 1 and 0k --H 1 = O(IIdkII). Then
(4.8)			def 0k7k*4k def			rk*dk			if xk + rk*dk E int(?),
OkTk*dk otherwise.
The above definition implies that 0k = 1 if xk + r?dk E int(f).
In [8], the convergence results of the trust region methods for (1.1) have been established under the
assumptions that a step 8k satisfies the following: given positive constants P Po, pq and p0q,
(AS.3)
(AS.4)
?k?(sk) <P'k?*[--HD?2g?]
IIDkskII < PoAk, Xk + Sk E int(?).
#)k(sk) < ????*[p?]
IIDkskII < p0?A?, xk + Sk E int(f).
The condition (AS.3) is necessary for the first order convergence and (AS.4) is necessary for the
second order convergence. Both are extensions of the convergence conditions of unconstrained trust
region methods. In particular, when 1 = --Hoo and n = cc, these two assumptions are exactly the same as
the ones required by trust region methods for unconstrained problems. However, there is an significant
difference: an exact trust region subproblem solution does not necessarily satisfy the condition (AS.3).
Nonetheless, this does not imply that our model problem is inappropriate. As discussed before, a
good approximation of a global solution of (4.5) usually satisfies (AS.3). Moreover, there are many ways
to compute steps satisfying both conditions. As an example, assuming that X9k satisfies (AS.3) and x?k
satisfies (AS.4), then the minimizer Xk* of ?k(s) along the path connecting the two points satisfies the
both conditions, see HG. 9. In [8], it has been indicated that the trust region size can be updated to help
satisfy (AS.3). The reflection technique described in 4 can also be used.
The following convergence propenies have been established in [8].
THEoREM2. Let the level set = fx E ?? : f(x) < f(xo), x E Tjbeconpactandf: ?
be twtce conttnuously dijferentiable on . Let fxkj be the sequence generated by Algonth'n 2 under
assu,nption `kk(s) def vJ(xk)Ts + 2lsT(Hk + Ck)sk on the model, and condttions (AS3) and (AS.4) on
the step Sk. Then
(a) The sequence tg?j converges to zero.
12
R
Pk
FIG. 10. Sattsfactton ofBoth Condtttons on Steps Through a Reflectton-Dogteg Path
(b) If every limit point is nondegenerate, then there is a limit point x* with Mt positive
semidefinite.
(c) If x* is an isolated nondegenerate limitpoint, then M* is positive semidefinite.
(d) If M* is nonsingularfor some limitpoint x* of fxk 1 then
o+ M* is positive definite,
o+ ?xk1 converges to x*,
o+ all iterations are eventually successful, and tAk) is bounded awayfrom zero,
o+ fxkj converges to x* quadratically is S& = ?kt[sNk] whenever IIDksNkII <Ak.
5. Implementation for Dense Problems. In this section, we repon some of our experiments with
small to medium dense problems. The method implemented is exactly as described as follows.
At each iteration, a step is computed in the following fashion: first a step of (4.3) is computed by
solving the flIll trust region problem. Then a minimum of the quadratic `kk(s) is located by following the
refiection-dogleg path p?: Xk Xk --H Q?*[D2kgka] xk + Q%[pk] Xk + Qk*[Pk] + pRk. See HG. 10.
Though this implementation is not practical for large problems (because it requires a solution to
the trust region subproblem), it indicates that the method works and the number of iteration does not
seem to increase with the problem size. For the collection of our testing problems, Qk*[--HD2kgk] is never
taken. Unlike the aliine scaling method for a linear programming problem, our method is quadratically
convergent.
The performance of the method for large scale problems (including subspace methods and iterative
approaches) is undercurrent investigation. The preliminary results are very encouraging. In [7], numerical
results of a line search based method with the same scaling matrix Dk is repofled for medium and large
ii'			iirniz?tii			F ?			tic flincti			hoiind?
SC,Lj mir???.????.?n 0 ?encr& Auaur?			?.Jns su?ject to
6. Conclusion. We have proposed a promising approach for solving large-scale nonlinear bound-
constrained problems. By staying within the strictly feasible region, we overcome two difficulties
associated with the existing active set methods: the gap between the theory and implementation of the
trust region type method for (1.1) and slow changes of active set for large scale problems. In addition,
13
Number of Function/Oradient Evaluations
PROB			n <50			n = 100
GENROSE U			43			23
GENROSEC			12			11
GENSINO U			27			22
GENSING C			17			16
CHAINSING U			26			21
CHAINSING C			17			16
DEGENSING U			26			21
DEGENSING C			24			30
GENWOOD U			71			65
GENWOODC			10			9
CHAINWOOD U			60			84
CHAINWOOD C			10			9
BROYDEN1AU			14			11
BROYDENlAC			10			9
BROYDEN1BU			7			7
BROYDEN1BC			9			9
BROYDEN2A U			20			19
BROYDEN2A C			20			17
BROYDEN2B U			9			9
BROYDEN2B C			16			15
TOINTBROYU			8			8
TOINTBROYC			10			10
CRAGGLEVY U			33			27
CRAGGLEVY C			28			25
AUGMLAGN U			26			59
AUGMLAGN C			29			31
BROWN1U			21			20
BROWN1C			31			30
BROWN3 U			9			9
BROWN3 C			10			9
BVPU			21			15
BVPC			20			16
VARU			12			9
VARC			12			23
T?? 1
Experiments with an Interior, Trust Region and Reflective Methodfor Boimd-consfrainedProbiems
14
our approach has the following important properties.
Firstly, the method is a natural generalization of the trust region method for unconstrained problems.
This is possible because of the consideration of the trust region subproblem (4.5). When 1 --Hoo and
u = oc, it becomes a trust region method for unconstrained problem. Hence techniques developed for
trust region methods of unconstrained problems are applicable for our method.
Secondly, the iterates generated by the algorithm stay in the strict feasible region and a centering
technique is incorporated. The reflection technique makes it easier for the iterates to stay in the strictly
feasible region and allows for the second order steps to be taken.
Our computational experience indicates, for the limited testing problems at least, the method works
well and has potential for large scale problems.
The basic idea of the approach can be extended to handle additional equality constraints.
15
REFERENCES
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[17]
[18]
[19]
[20]
[21]
[22]
[23]
I. ADiER? N. KARMA?eA?, M. RESENDE, AND G. VEIGA, An implementation of karmarkar's algorithmfor linear pro-
gramming, Math. Prog., 44(1989).
E. BARNES, A variotion on karmarkar's algorithmfor solving linearprogrammingproblems, Mathematical Programming,
36 (1986), pp. 17?182.
D. P. B?RThI?AS, Constrained cotimization and Lagrange multplier methods, Academic Press, London and New York,
1982.
Projected Newton methods for optimization problems with simple constraints, SlAM Journal on Control and
Optimization, 20(1982), pp. 221--H246.
R. H. BYRD AND R. B. ScHNABEL, Aporoximate solution ofthe trust regionproblem by minimization over two-dimensional
subspaces, Mathematical Programming, 40(1988), pp. 247--H263.
T. E CoLMAN AND Y It, On the convergenceofreflectiveNewtonmethodsfor large-scale nonlinearminimization subject
to bounds, Tech. Rep. TR 92-1314, Computer Science Eepanrnen? Cornell University, 1992.
A reflective Newton methodfor minimizing a quadraticfunct ion subject to bounds on the variables, Tech. Rep.
TR 92-1315, Computer Science Depanunent, Cornell University, 1992.
An interior, trust regionapproachfor nonlinearminimization subject to bounds, Tech. Rep. TR93-1342, Computer
Science Enpanment, Cornell University, 1993.
A. R. CoNN, N. I. M. Gouu), AND P. L. TorNT, Testing a class ofmethodsfor solving minimization problems with simple
bounds on the variables, Mathematics of Computation, 50(1988), pp.399430.
R. 5. DEMBO AND U. TLiLowi??ci, On the minimization of quadraticfunctions subject to box constraints, Tech. Rep.
Working paper series B 71, School of Organization and Management, Yale University, 1983.
I. DIKIN, Iterative solution ofproblems of linear and quadratic programming, Doklady Akademiia Nauk SSSR, 174
(1967), pp. 747--H748.
R. FLErcHER, An algorithmfor solving linearly constrained optimizationproblems, mprog, 2 (1972), pp. 133--H165.
R. FurrcHER AND M. P. JACKSON, Minnnization ofa quadraticfiwction ofmany variables subject only to lower and upper
bounds, Journal of the Institute for Mathematics and its Applications, 14 (1974), pp. 159--H174.
D. M. GAY, Computing optimal locally constrainedsteps, SlAM Journal on Scientific and Statistical Computing, 2(1981),
pp. 186197.
P. Giu. AND W. MuRRAY, Mintmization subject to bounds on the variables, Tech. Rep. Report NAC 71, National Physical
Laboratory, England, 1976,
K. LEvnNnERG,A methodfor the solution ofcertainproblemsin least squares, Quart. AppI. Math., 2(1944), pp. 16?168.
D. MARQUARDT, An algorithmfor least-squares estimation of nonlinear parameters, SlAM Journal on Applied Mathe-
matics, 11(1963), pp.431441.
N. MEGIDDO AND M. SHUB, Boundary behavior of interior point algorithms in linear programming, mor, 14 (1989).
pp. 97--H146.
J. J. MOR? AND D. SoRENSEN, Computing a trust region step, SlAM Journal on Scientific and Statistical Computing, 4
(1983), pp. 553--H572.
J. J. Mo? AND G. ToRAUX), Algorithmsfor bound constrained quadratic programmingproble ms, Numerische Mathe-
matik, 55 (1989), pp. 377400.
GA. ScHuLTz? R. B. ScHNABEL, AND R. H. BYRD, Afamily of trust-region-based algorithmsfor unconstrainedmini-
mization with strong global convergenceproperties, SlAM Journal on Numerical Analysis, 22(1) (1985), pp. 47?7.
D. SoRENsEN, Trust region methods for unconstrained optimization, SlAM Journal on Numerical Analysis, 19 (1982),
pp.409426.
R. J. VANDERBEt, MS. ME?ErroN, ANDB. A. FREEDMAN,A modtjcationofKarmarkar'slinearprngrammingalgorithm,
Algorithmica, 1(1986), pp. 395407.
16
