A Robust Modification of the Hestenes–Stiefel Conjugate Gradient Method with Strong Wolfe Line Search

Corresponding Author: Awad Abdelrahman, Department of Mathematics, Faculty of Mathematical and Computer Sciences, University of Gezira, Wad Madani, Sudan. Email: awad.abdalla26@yahoo.com

Abstract: Nonlinear conjugate gradient (CG) methods are extensively used for solving large-scale unconstrained optimization problems, and numerous scalings and modifications have been proposed in recent years to improve them. In this paper, a simple modification of the conjugate gradient parameter is proposed. The sufficient descent condition and the global convergence property of the modified method are established under the strong Wolfe line search. Numerical results show that the proposed formula is competitive when compared with other well-known CG parameters.


Introduction
The nonlinear conjugate gradient (CG) methods are utilized to find the minimum of a function in unconstrained optimization problems. In general, the method solves

$$\min_{x \in \mathbb{R}^n} f(x), \qquad (1.1)$$

where $f:\mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable nonlinear function whose gradient is denoted by $g(x) = \nabla f(x)$. The CG methods are given by an iterative scheme of the form

$$x_{k+1} = x_k + \alpha_k d_k, \qquad k = 0, 1, 2, \ldots, \qquad (1.2)$$

where $x_k$ is the $k$-th iterate, $\alpha_k > 0$ is a step length and $d_k$ is the conjugate gradient search direction,

$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \ge 1, \end{cases} \qquad (1.3)$$

where $\beta_k$ is a scalar. The step length $\alpha_k > 0$ is obtained by a one-dimensional search, known as the line search. The most common line searches are exact and inexact; the inexact line searches include those of Armijo (Fletcher, 1997), Wolfe (1969), Goldstein (1965) and the strong Wolfe conditions (Dai and Yuan, 1999; Hilstrom, 1977). In this paper, the strong Wolfe line search is used to compute $\alpha_k$; it requires

$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^\top d_k, \qquad |g(x_k + \alpha_k d_k)^\top d_k| \le \sigma |g_k^\top d_k|, \qquad (1.4)$$

where $0 < \delta < \sigma < 1$ are two constants. There are at least six well-known formulas for $\beta_k$, given below with $y_{k-1} = g_k - g_{k-1}$:

$$\beta_k^{HS} = \frac{g_k^\top y_{k-1}}{d_{k-1}^\top y_{k-1}} \ \text{(Hestenes and Stiefel, 1952)}, \qquad \beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2} \ \text{(Fletcher and Reeves, 1964)},$$

$$\beta_k^{PRP} = \frac{g_k^\top y_{k-1}}{\|g_{k-1}\|^2} \ \text{(Polak and Ribière, 1969)}, \qquad \beta_k^{CD} = \frac{-\|g_k\|^2}{d_{k-1}^\top g_{k-1}} \ \text{(Fletcher, 1997)},$$

$$\beta_k^{LS} = \frac{-g_k^\top y_{k-1}}{d_{k-1}^\top g_{k-1}} \ \text{(Liu and Storey, 1992)}, \qquad \beta_k^{DY} = \frac{\|g_k\|^2}{d_{k-1}^\top y_{k-1}} \ \text{(Dai and Yuan, 1999)}.$$
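To make the notation concrete, a short Python sketch of these six classical parameters follows; the helper name cg_betas and its arguments are illustrative, not from the paper.

```python
import numpy as np

def cg_betas(g_new, g_old, d_old):
    """Classical CG parameters (illustrative helper, not from the paper).

    g_new = g_k, g_old = g_{k-1}, d_old = d_{k-1}, y_old = g_k - g_{k-1}.
    """
    y_old = g_new - g_old
    return {
        "HS":  (g_new @ y_old) / (d_old @ y_old),    # Hestenes-Stiefel (1952)
        "FR":  (g_new @ g_new) / (g_old @ g_old),    # Fletcher-Reeves (1964)
        "PRP": (g_new @ y_old) / (g_old @ g_old),    # Polak-Ribiere (1969)
        "CD": -(g_new @ g_new) / (d_old @ g_old),    # Conjugate Descent (Fletcher, 1997)
        "LS": -(g_new @ y_old) / (d_old @ g_old),    # Liu-Storey (1992)
        "DY":  (g_new @ g_new) / (d_old @ y_old),    # Dai-Yuan (1999)
    }
```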
These parameters $\beta_k$ (Hestenes and Stiefel, 1952; Fletcher and Reeves, 1964; Polak and Ribière, 1969; Fletcher, 1997; Liu and Storey, 1992; Dai and Yuan, 1999) are equivalent when $f$ is a strongly convex quadratic function and the line search is exact. If $f$ is non-quadratic, each choice of $\beta_k$ leads to very different performance and convergence of the corresponding algorithm (Andrei, 2011). The convergence behavior of these $\beta_k$ formulas under various line search conditions has been studied by many authors over the years (Rivaie et al., 2012; Dolan and Moré, 2002; Goldstein, 1965; Powell, 1977; 1984; Dai and Yuan, 1999; Andrei, 2008; 2011; Hilstrom, 1977; Fletcher, 1997; Fletcher and Reeves, 1964; Liu and Storey, 1992; Hestenes and Stiefel, 1952; Zhang, 2009; Zoutendijk, 1970), and researchers continue to seek choices of $\beta_k$ that are numerically efficient while possessing global convergence and the sufficient descent condition. The numerical performance of the FR conjugate gradient method is often much slower than that of the PRP method, and only in a few cases is it faster. The global convergence of the FR method with exact line search was established by Zoutendijk (1970); Al-Baali (1985) proved that the FR method is globally convergent under the strong Wolfe condition when $\sigma < 0.5$, and Liu and Storey (1992) extended that result to $\sigma = 0.5$. The CD and DY methods are globally convergent under the strong Wolfe line search (Dai and Yuan, 1999; Hilstrom, 1977), and they have the same numerical performance as FR under exact line search. The PRP conjugate gradient method has good numerical performance (Powell, 1977) but lacks good convergence properties (Powell, 1984), and the same holds for the HS method. The global convergence of the PRP method for convex objective functions under exact line search was established by Polak and Ribière (1969). Powell (1986) proposed a counterexample showing that the PRP method can generate a non-convergent sequence for a non-convex function; the counterexample also applies to the HS method. Modifications of the HS method therefore form a natural basis for improving its performance and convergence properties, since the global convergence of the standard HS method under the usual inexact line searches has not yet been established. One important observation is that when $\|x_k - x_{k-1}\|$ is small, both the numerator and denominator of $\beta_k^{HS}$ become small, so that $\beta_k^{HS}$ may be unbounded (Dai, 2010). Qi et al. (1999) established the global convergence of a modified HS method in which $\beta_k$ takes a modified form. Perry (1977) observed that the search direction in the HS method can be written as

$$d_k = -P_k g_k, \qquad P_k = I - \frac{d_{k-1} y_{k-1}^\top}{d_{k-1}^\top y_{k-1}}.$$

Noting that $P_k^\top y_{k-1} = 0$, $P_k$ is a transformation that maps $\mathbb{R}^n$ into the null space of $y_{k-1}^\top$. This paper is organized as follows. Section 2 presents the underlying idea of the modification, the modified HS conjugate gradient method and its algorithm. Section 3 establishes the sufficient descent property and the global convergence property under the strong Wolfe line search with the parameter $\sigma = 0.1$. Section 4 presents preliminary numerical results.
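Perry's projection property quoted above can be checked directly; the following one-line verification is standard algebra, added here for completeness:

$$P_k^\top y_{k-1} = \Big(I - \frac{y_{k-1} d_{k-1}^\top}{d_{k-1}^\top y_{k-1}}\Big) y_{k-1} = y_{k-1} - y_{k-1}\,\frac{d_{k-1}^\top y_{k-1}}{d_{k-1}^\top y_{k-1}} = 0.$$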

The Modified Method
The nonlinear conjugate gradient method for unconstrained optimization is simple, has low memory requirements, and is very effective for large-scale optimization problems; the HS method is one of its most efficient variants. However, although the standard HS method has good numerical performance, it fails to converge for non-convex functions under inexact line search techniques. Therefore, to overcome this deficiency with a simple modification, a modified HS formula $\beta_k^{MN*}$ is defined by:

Step 5: If $\|g_{k+1}\| \le \varepsilon$, then stop.
Step 6: Set $k = k + 1$ and go to Step 3.
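A runnable sketch of this iteration is given below. It uses SciPy's strong-Wolfe line search; because the paper's modified parameter $\beta_k^{MN*}$ appears only in the displayed formula above, the classical HS parameter is used at the marked line as a stand-in placeholder, and names such as modified_hs_cg are illustrative.

```python
import numpy as np
from scipy.optimize import line_search

def modified_hs_cg(f, grad, x0, eps=1e-6, max_iter=10000):
    """CG skeleton following the steps above (illustrative sketch;
    the paper's beta_k^{MN*} should replace the HS placeholder)."""
    x = x0
    g = grad(x)
    d = -g
    for k in range(max_iter):
        if np.linalg.norm(g) <= eps:       # stopping test, cf. Step 5
            break
        # Strong Wolfe line search for alpha_k, cf. (1.4)
        alpha = line_search(f, grad, x, d, gfk=g)[0]
        if alpha is None:                  # line search failed to converge
            break
        x_new = x + alpha * d              # iterate update (1.2)
        g_new = grad(x_new)
        y = g_new - g
        beta = (g_new @ y) / (d @ y)       # placeholder: classical HS beta
        d = -g_new + beta * d              # direction update (1.3)
        x, g = x_new, g_new
    return x

# Illustrative usage: minimize the Rosenbrock function
from scipy.optimize import rosen, rosen_der
x_star = modified_hs_cg(rosen, rosen_der, np.zeros(5))
```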

Convergence Analysis
In this section, we analyze the convergence properties of $\beta_k^{MN*}$.

Sufficient Descent Conditions
Before giving the sufficient descent condition, the following assumptions are needed.

Assumption A
(i) The level set $S = \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$ is bounded.
(ii) In some neighborhood $N$ of $S$, $f$ is continuously differentiable and its gradient $g$ is Lipschitz continuous; that is, there exists a constant $L > 0$ such that

$$\|g(x) - g(y)\| \le L\,\|x - y\| \quad \text{for all } x, y \in N. \qquad (3.1)$$
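Throughout, sufficient descent refers to the standard condition, stated here for reference with a generic constant $c > 0$:

$$g_k^\top d_k \le -c\,\|g_k\|^2 \qquad \text{for all } k \ge 0.$$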

Global Convergence Properties
The following lemma, called the Zoutendijk condition, is usually used to prove the global convergence of CG methods. It was given by Zoutendijk (1970).

Lemma 3.2
Suppose that $x_0$ is an initial point for which Assumption A holds. Consider any method of the form (1.2) and (1.3), where $d_k$ is a descent direction and $\alpha_k$ satisfies the strong Wolfe conditions (1.4). Then

$$\sum_{k \ge 0} \frac{(g_k^\top d_k)^2}{\|d_k\|^2} < \infty.$$

Proof
From the second inequality of (1.4), with $y_k = g_{k+1} - g_k$, we get

$$d_k^\top y_k = g_{k+1}^\top d_k - g_k^\top d_k \ge (\sigma - 1)\, g_k^\top d_k > 0.$$

This, together with the Lipschitz condition (3.1), implies

$$(\sigma - 1)\, g_k^\top d_k \le d_k^\top y_k \le \|y_k\|\,\|d_k\| \le L \alpha_k \|d_k\|^2,$$

so that

$$\alpha_k \ge \frac{\sigma - 1}{L} \cdot \frac{g_k^\top d_k}{\|d_k\|^2} = \frac{1 - \sigma}{L} \cdot \frac{-g_k^\top d_k}{\|d_k\|^2}.$$

Substituting this lower bound into the first inequality of (1.4), noting that $f$ is bounded below on the level set $S$ by Assumption A, and summing over $k$ yields

$$\sum_{k \ge 0} \frac{(g_k^\top d_k)^2}{\|d_k\|^2} < \infty,$$

which completes the proof.