Memoryless Modified Symmetric Rank-One Method for Large-Scale Unconstrained Optimization

Problem statement: Memoryless quasi-Newton (QN) methods, which can be viewed as one-step limited-memory QN methods, are regarded as effective techniques for solving large-scale problems. In this study, we present a scaled memoryless modified Symmetric Rank-One (SR1) algorithm and investigate its numerical performance on large-scale unconstrained optimization problems. Approach: The basic idea is to apply the modified quasi-Newton (QN) equations, which use both the gradients and the function values at two successive points, within the framework of the scaled memoryless SR1 update, in which the modified SR1 update is reset, at every iteration, to a positive multiple of the identity matrix. The scaling of the identity is chosen so that the positive definiteness of the memoryless modified SR1 update is preserved. Results: Under suitable conditions, global convergence and the rate of convergence are established. Computational results, for a test set consisting of 73 unconstrained optimization problems, show that the proposed algorithm is very encouraging. Conclusion/Recommendations: In this study a memoryless QN method is developed for solving large-scale unconstrained optimization problems, in which the SR1 update based on the modified QN equation is applied. An important feature of the proposed method is that it preserves the positive definiteness of the updates. The presented method possesses global and R-linear convergence. Numerical results show that the proposed method is encouraging compared with the MMBFGS and FRSCG methods.


INTRODUCTION
We consider the following unconstrained optimization problem:

$$\min_{x \in \mathbb{R}^n} f(x) \qquad (1)$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is a twice continuously differentiable nonlinear function and n, the number of variables, is large. There are various iterative methods for solving problem (1); quasi-Newton (QN) methods are among the most widely used.
The following iterative method can be seen as the general QN procedure to solve the problem (1).
At the k-th iteration, calculate the search direction $p_k$ by solving the equation

$$B_k p_k = -g_k,$$

then set $x_{k+1} = x_k + \alpha_k p_k$, where $g_k$ denotes the gradient vector of f at $x_k$, $B_k$ is the secant approximation to $\nabla^2 f(x_k)$ and $\alpha_k$ is the step length determined by a line search. The matrix $B_k$ is usually required to be positive definite to ensure a descent direction for f. At every iteration, $B_k$ is updated to a new Hessian approximation $B_{k+1}$ satisfying the general QN equation:

$$B_{k+1} s_k = y_k$$

Where:
$s_k = x_{k+1} - x_k$
$y_k = g_{k+1} - g_k$
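As a concrete illustration of this procedure, the sketch below performs one generic QN step, with a backtracking Armijo loop standing in for the line search. The function names and the quadratic test problem are illustrative, not part of the paper.

```python
import numpy as np

def qn_step(f, grad, x, B, c1=1e-4, rho=0.5):
    """One generic quasi-Newton step: solve B p = -g, then backtrack.

    `B` is the current Hessian approximation (assumed positive definite);
    the Armijo backtracking loop stands in for any line search that
    produces a step length alpha_k with sufficient decrease.
    """
    g = grad(x)
    p = np.linalg.solve(B, -g)        # search direction: B_k p_k = -g_k
    alpha = 1.0
    while f(x + alpha * p) > f(x) + c1 * alpha * g.dot(p):
        alpha *= rho                  # shrink until sufficient decrease holds
    return x + alpha * p

# Illustration on a simple quadratic f(x) = 0.5 x^T A x
A = np.array([[3.0, 1.0], [1.0, 2.0]])
f = lambda x: 0.5 * x.dot(A).dot(x)
grad = lambda x: A.dot(x)
x = np.array([1.0, -1.0])
x_new = qn_step(f, grad, x, B=np.eye(2))   # B = I gives a (scaled) gradient step
```

With $B_k = I$ this reduces to steepest descent, which is exactly the situation a memoryless method improves upon by a cheap rank-one correction.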
We are interested in developing an algorithm for solving large-scale problems that uses both function and gradient values so as to exploit more accurate information. Memoryless QN methods were first introduced by Perry [10] and Shanno [12]. They can be considered as QN methods in which the approximation to the inverse of the Hessian is reset to the identity matrix at every iteration. The limited memory BFGS (LBFGS) method [9] and conjugate gradient methods are two important classes of methods for solving large-scale unconstrained optimization problems. Wei et al. [13] proposed the modified QN equation:

$$B_{k+1} s_k = \tilde{y}_k, \qquad \tilde{y}_k = y_k + \frac{\vartheta_k}{s_k^T A_k s_k} A_k s_k$$

Where:
$\vartheta_k = 2(f_k - f_{k+1}) + (g_k + g_{k+1})^T s_k$
and $A_k$ is a simple symmetric and positive definite matrix. The modified QN equation uses not only the gradient but also function value information in order to achieve higher-order accuracy in approximating the curvature of the objective function.
In this study, we consider the famous SR1 method:

$$B_{k+1} = B_k + \frac{(y_k - B_k s_k)(y_k - B_k s_k)^T}{(y_k - B_k s_k)^T s_k}$$

which makes a rank-one modification to the previous Hessian approximation $B_k$; it is therefore a simpler update and requires less computational effort per iteration. Conn et al. [4] and Khalfan et al. [7] investigated the computational and numerical behavior of SR1 methods. Their results showed that, in practice, when the SR1 update solves a given problem, its efficiency is at least as good as that of other QN methods. With this encouragement, it seems reasonable to extend the modified QN equations of Wei et al. [13] to the memoryless setting and obtain the memoryless modified SR1 update.
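A direct implementation of the SR1 update looks as follows; the skipping safeguard against a tiny denominator is the standard one from the SR1 literature, and the threshold `r` is our own illustrative choice, not from the paper.

```python
import numpy as np

def sr1_update(B, s, y, r=1e-8):
    """Standard SR1 rank-one update of a Hessian approximation:

        B+ = B + (y - B s)(y - B s)^T / ((y - B s)^T s)

    The update is skipped when the denominator is tiny relative to
    ||y - B s|| ||s||, the usual safeguard against breakdown.
    """
    v = y - B.dot(s)
    denom = v.dot(s)
    if abs(denom) < r * np.linalg.norm(v) * np.linalg.norm(s):
        return B                      # skip update: denominator too small
    return B + np.outer(v, v) / denom

# Sanity check: after the update, the secant condition B+ s = y holds
A = np.array([[3.0, 1.0], [1.0, 2.0]])
s = np.array([1.0, -1.0])
y = A.dot(s)                          # exact curvature of a quadratic
Bp = sr1_update(np.eye(2), s, y)
```
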

Modified QN equation:
We begin by reviewing the modified QN equation. Li and Fukushima [8] proposed the modified QN equation:

$$B_{k+1} s_k = y_k^*, \qquad y_k^* = y_k + t_k \|g_k\| s_k$$

where:

$$t_k = 1 + \max\left\{ -\frac{s_k^T y_k}{\|s_k\|^2},\, 0 \right\}$$

The modified QN equation takes advantage not only of the gradients but also of the function values. The proposed equation guarantees global convergence without a convexity assumption, but this modification does not outperform the BFGS update with the general QN equation $B_{k+1} s_k = y_k$. Inspired by this, Wei et al. [13] proposed the similar modified QN equation:

$$B_{k+1} s_k = \tilde{y}_k, \qquad \tilde{y}_k = y_k + \frac{\vartheta_k}{s_k^T u_k} u_k \qquad (6)$$

Where:
$\vartheta_k = 2(f_k - f_{k+1}) + (g_k + g_{k+1})^T s_k$
and $u_k \in \mathbb{R}^n$ is any vector such that $s_k^T u_k \neq 0$. By substituting $u_k$ with $s_k$ in (6), we obtain the following modified QN equation:

$$B_{k+1} s_k = \tilde{y}_k, \qquad \tilde{y}_k = y_k + \frac{\vartheta_k}{\|s_k\|^2} s_k \qquad (7)$$

With $\tilde{y}_k$ from (7), the SR1 update has the form:

$$B_{k+1} = B_k + \frac{(\tilde{y}_k - B_k s_k)(\tilde{y}_k - B_k s_k)^T}{(\tilde{y}_k - B_k s_k)^T s_k}$$

This modified SR1 update efficiently exploits both gradient and function information.
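The modified vector $\tilde{y}_k$ of Wei et al., as reconstructed here, can be sketched as follows. A handy sanity check is that for a quadratic objective the correction term $\vartheta_k$ vanishes exactly, so $\tilde{y}_k$ reduces to the ordinary gradient difference $y_k$.

```python
import numpy as np

def modified_y(fk, fk1, gk, gk1, sk, uk):
    """Wei-et-al.-style modified difference vector (as reconstructed here):

        theta_k = 2 (f_k - f_{k+1}) + (g_k + g_{k+1})^T s_k
        y~_k    = y_k + theta_k / (s_k^T u_k) * u_k

    `uk` is any vector with s_k^T u_k != 0; the text takes u_k = s_k.
    """
    yk = gk1 - gk
    theta = 2.0 * (fk - fk1) + (gk + gk1).dot(sk)
    return yk + (theta / sk.dot(uk)) * uk
```

For non-quadratic functions, $\vartheta_k$ carries third-order information about f along $s_k$, which is the source of the higher-order curvature accuracy mentioned above.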
Since the memoryless SR1 formula updated from the modified QN Eq. 7 may lose positive definiteness of the updated matrix, in the following we introduce a scaling factor and update the modified SR1 formula from a scaled identity matrix.
Scaling factor: In 1993 Dennis and Wolkowicz [5] suggested a measure (8) of a positive definite matrix:
Where:
A = An n×n positive definite matrix
ζ_A = The largest eigenvalue of A
Note that finding the optimal scaling factor for the modified SR1 update under the measure given by (8) is easier than finding it under the l2-norm condition number. Hence, in the following theorem, we seek the 'best' modified SR1 update from a positive multiple of the identity matrix that satisfies the modified secant Eq. 7 and preserves positive definiteness of the update.
Theorem 1: The modified SR1 matrix updated from a positive multiple of the identity matrix is the unique solution of the minimization of the measure (8) subject to the modified secant Eq. 7.

Proof: The proof follows from Theorem 1 of [14], with $H_k$ and $y_k$ replaced by the identity matrix and the modified $\tilde{y}_k$, respectively. Note that since the proof in [14] does not depend on $y_k$, the result remains true after this replacement.

Corollary 1: Let $\tilde{\lambda}_{k-1}$ be the scaling factor defined above. Then the scaled modified memoryless SR1 update $H_k$, given by (12) and (13), is the unique solution of the corresponding measure minimization subject to the modified secant Eq. 7.

MMSR1 Algorithm:
• Step 1: If $\|g_k\| \le \varepsilon$, stop
• Step 2: Compute the search direction: for k = 0, let $p_0 = -g_0$; for k > 0, compute $p_k = -H_k g_k$, where $\tilde{\lambda}_{k-1}$ and $H_k$ are given by (12) and (13)
• Step 3: Find a step length $\beta_k$ satisfying the Wolfe conditions and set $x_{k+1} = x_k + \beta_k p_k$
• Step 4: Set k = k+1 and go to Step 1
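Because $H_k$ differs from a scaled identity only by a rank-one term, the direction $p_k = -H_k g_k$ can be formed with a few inner products and O(n) storage, with no matrix ever held in memory. The sketch below assumes the generic memoryless SR1 form $H = \lambda I + vv^T/(v^T \tilde{y})$ with $v = s - \lambda \tilde{y}$, takes the scaling factor $\lambda$ as given, and adds a fallback branch of our own as a safeguard; it is an illustration of the matrix-free computation, not the paper's exact formulas (12) and (13).

```python
import numpy as np

def mmsr1_direction(g, s, y_tilde, lam):
    """Memoryless SR1 search direction, computed matrix-free.

    With the inverse approximation reset to lam * I each iteration,

        H = lam I + (s - lam y~)(s - lam y~)^T / ((s - lam y~)^T y~)
        p = -H g

    so p costs only a few inner products and O(n) storage.  `lam` is the
    scaling factor (assumed chosen so that H stays positive definite).
    """
    v = s - lam * y_tilde
    denom = v.dot(y_tilde)
    if abs(denom) < 1e-12 * np.linalg.norm(v) * np.linalg.norm(y_tilde):
        return -lam * g               # safeguard: scaled steepest descent
    return -(lam * g + (v.dot(g) / denom) * v)
```

The key point is that no n×n matrix is ever formed, which is what makes the method viable for n in the thousands.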

Convergence result:
We now establish the convergence of the MMSR1 Algorithm, proving global convergence and R-linear convergence under some standard conditions. For this purpose, we make the following assumptions:

Assumptions:
(i) f is twice continuously differentiable and the level set $\Omega = \{x : f(x) \le f(x_0)\}$ is contained in a convex set D.
(ii) Let G be the matrix of second derivatives of f. Then there exist positive constants $L_1$ and $L_2$ such that:

$$L_1 \|z\|^2 \le z^T G(x) z \le L_2 \|z\|^2$$

for all $z \in \mathbb{R}^n$ and all $x \in D$.
Theorem 2: Let f satisfy the Assumptions and let the sequence {x_k} be generated by the MMSR1 Algorithm. Then {x_k} converges globally to x*.

Proof: The Wolfe conditions in Step 3 of the MMSR1 Algorithm, together with the positive definiteness and boundedness of the memoryless modified SR1 matrix, imply that

$$f(x_{k+1}) \le f(x_k) - q\|g_k\|^2$$

for some positive constant q and all k. Since f is bounded below, it follows that $\sum_{k \ge 0} \|g_k\|^2 < \infty$; as a consequence, $\|g_k\|$ goes to zero, i.e., {x_k} converges to x*.
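The summation step can be written out explicitly. Starting from the sufficient-decrease inequality $f(x_{k+1}) \le f(x_k) - q\|g_k\|^2$ with $q > 0$, telescoping over the first N iterations gives:

```latex
q \sum_{k=0}^{N} \|g_k\|^2
\;\le\; \sum_{k=0}^{N} \bigl( f(x_k) - f(x_{k+1}) \bigr)
\;=\; f(x_0) - f(x_{N+1})
\;\le\; f(x_0) - \inf_{x} f(x) \;<\; \infty .
```

Letting $N \to \infty$ shows that the series $\sum \|g_k\|^2$ converges, hence $\|g_k\| \to 0$.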
Theorem 3: Let f satisfy the Assumptions and let the sequence {x_k} be generated by the MMSR1 Algorithm. Then there exists a constant 0 ≤ t < 1 such that:

$$f(x_k) - f(x^*) \le t^k \left( f(x_0) - f(x^*) \right) \qquad (20)$$

hence {x_k} converges R-linearly to x*.
Proof: Using Theorem 3.1 and Corollary 3.1 of [14], one can show that the memoryless modified SR1 matrices $H_k$ are uniformly bounded (23). By the Assumptions and the Wolfe conditions in Step 3 of the MMSR1 Algorithm, there then exists a positive constant c such that:

$$f(x_{k+1}) - f(x^*) \le (1 - c)\left( f(x_k) - f(x^*) \right) \qquad (24)$$

see for example [11]. Applying (24) recursively together with (23), we obtain (20). Finally, combining this with (18), the sequence {x_k} is also R-linearly convergent. □

Numerical results: We now present the computational performance of the MMSR1 Algorithm, using the following existing packages:
• The limited-memory QN package of Liu and Nocedal [9], modified to fit the modified QN equation. The package has variable storage capability controlled by the parameter m, where m is the number of stored updates; for our case, m = 1 is chosen. This method is therefore called the memoryless modified BFGS (MMBFGS) method in this study
• The conjugate gradient algorithm of Birgin and Martínez [2] (FRSCG), which is mainly a scaled variant of Perry's method [10]. The algorithm preserves the nice geometrical properties of Perry's direction and uses the Fletcher-Reeves formula. It is implemented in such a manner that the parameter scaling the gradient in the search direction is selected by means of a spectral formula
All the algorithms are implemented in Fortran 77 and stopped whenever $\|g_k\| \le \varepsilon$, where $\varepsilon = 10^{-5}$. Seventy-three large-scale unconstrained optimization functions have been tested from the CUTE collection [3], along with other large-scale optimization problems from [1]. The test problems have variable dimensions and we ran our algorithms with dimensions 1000 ≤ n ≤ 10000. This resulted in a total of 730 runs.
In Tables 1 and 2, we summarize our numerical experiments, comparing the geometric and arithmetic means of the numbers of iterations and function/gradient evaluations required to solve these problems by the MMSR1 Algorithm with the corresponding means for the MMBFGS and FRSCG methods.
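For reference, the geometric- and arithmetic-mean comparisons of the kind reported in such tables can be computed as follows. This is a minimal sketch; the function name and the ratio convention (A's mean divided by B's mean) are ours.

```python
import numpy as np

def summary_ratios(cost_a, cost_b):
    """Ratio of geometric means and ratio of arithmetic means of the
    per-problem costs (e.g. iteration counts) of solver A vs solver B,
    over problems both solved.  A ratio below 1 means A is cheaper on
    average under that mean."""
    a = np.asarray(cost_a, dtype=float)
    b = np.asarray(cost_b, dtype=float)
    geo = np.exp(np.mean(np.log(a)) - np.mean(np.log(b)))  # geomean(a)/geomean(b)
    arith = a.mean() / b.mean()
    return geo, arith
```

The geometric mean is less sensitive to a few very expensive problems than the arithmetic mean, which is why both are usually reported side by side.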
Also, in order to assess the performance of these algorithms, we evaluated them using the performance profiles of Dolan and Moré [6]. In Fig. 1 and 2 we compare the performance of MMSR1, MMBFGS and FRSCG with respect to the number of iterations and function/gradient calls, respectively.
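A Dolan-Moré performance profile can be computed from a problems-by-algorithms cost matrix as sketched below (our own minimal implementation, with failures encoded as `np.inf`):

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profile from a cost matrix `T`.

    T[p, a] is the cost (e.g. iterations) of algorithm `a` on problem `p`,
    with np.inf marking a failure.  Returns rho, where rho[t, a] is the
    fraction of problems on which algorithm a's cost is within a factor
    taus[t] of the best cost achieved by any algorithm on that problem.
    """
    T = np.asarray(T, dtype=float)
    best = T.min(axis=1, keepdims=True)   # best cost per problem
    ratios = T / best                     # performance ratios r_{p,a} >= 1
    return np.array([(ratios <= tau).mean(axis=0) for tau in taus])
```

Reading the profile: the value at τ = 1 is the fraction of problems on which the algorithm was (tied for) best, and the height of the curve for large τ is the fraction of problems it solved at all.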

DISCUSSION
In this study we presented a new method, in which the well-known SR1 update is computed in the framework of a memoryless method through a modified secant condition. Memoryless methods are very powerful for solving large-scale unconstrained optimization problems, characterized by low memory requirements and strong local and global convergence properties. The method uses both the gradient and function value information available at two successive iteration points and achieves high-order accuracy in approximating the second-order curvature of the minimizing function. Under very mild assumptions, the method is proved to be globally convergent, and the convergence of the MMSR1 algorithm is shown to be R-linear. The results presented in Tables 1 and 2 also imply that the MMSR1 Algorithm improves significantly over the MMBFGS and FRSCG methods. Specifically, the improvement of the MMSR1 Algorithm over MMBFGS is 12-40% on average, in terms of the number of iterations and function/gradient calls, respectively.
Similarly, the improvement of the MMSR1 algorithm over FRSCG is 40-50% on average, in terms of the number of iterations and function/gradient calls, respectively. Therefore the MMSR1 algorithm is on average 12-50% faster and cheaper than the MMBFGS and FRSCG methods. Moreover, the experimental results show that MMSR1 solves 80% of all problems, while MMBFGS and FRSCG solve 70% and 72% of all problems, respectively.
From Fig. 1 and 2, we observe that MMSR1 performs better than the MMBFGS and FRSCG methods. From the numerical results, the MMSR1 Algorithm is clearly better than the MMBFGS algorithm and vastly superior to the FRSCG algorithm.

CONCLUSION
We have presented a scaled memoryless modified SR1 algorithm, MMSR1, based on the modified QN equation, employing the modified SR1 formula within the memoryless QN framework for solving large-scale unconstrained optimization problems. We use a positive multiple of the identity matrix to update the modified SR1 matrix. The MMSR1 algorithm can be implemented without storing any matrices, i.e., without the O(n²) storage. We therefore conclude that by incorporating a simple scaling factor into the modified memoryless SR1 update we can preserve the positive definiteness of the modified SR1 matrix. Moreover, the numerical results show that the MMSR1 method performs significantly better than the other two methods, and the improvements of the MMSR1 algorithm over MMBFGS and FRSCG are evident in both function/gradient evaluations and iterations. We have also analyzed the global and local behavior of the scaled memoryless modified SR1 algorithm under some standard conditions; specifically, we have shown the R-linear convergence property of our method.