The long time behavior and the rate of convergence of symplectic convex algorithms obtained via splitting discretizations of inertial damping systems

Abstract

In this paper we propose new numerical algorithms for unconstrained optimization problems and we prove convergence rates for the objective function evaluated along the iterates.

Furthermore, our algorithms are based on splitting and symplectic methods, and they preserve the energy properties of the underlying continuous dynamical system, which contains a Hessian perturbation.
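For reference, the Hessian-driven damping systems studied in the cited literature (see [4], [8]) take the representative form

\[
\ddot{x}(t) + \frac{\alpha}{t}\,\dot{x}(t) + \beta\,\nabla^{2} f\big(x(t)\big)\,\dot{x}(t) + \nabla f\big(x(t)\big) = 0,
\]

where f is the convex objective, α > 0 is the vanishing damping coefficient and β ≥ 0 weighs the Hessian perturbation; setting β = 0 recovers the ODE of Su, Boyd and Candès [39] that models Nesterov's method. The precise system analyzed in the paper may differ in its coefficients.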

At the same time, we show that Nesterov's gradient method is equivalent to a Lie-Trotter splitting applied to a Hessian-driven damping system.
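To make the splitting idea concrete, below is a minimal numerical sketch, not the paper's actual scheme: the decomposition into sub-flows, the quadratic objective, and the parameter values alpha = 3 and h = 0.01 are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of a Lie-Trotter splitting for the vanishing-damping system
#   x''(t) + (alpha / t) x'(t) + grad f(x(t)) = 0,
# written as a first-order system in (x, v) and split into a damping flow,
# solved exactly, and a conservative flow, advanced by a symplectic Euler step.
alpha, h = 3.0, 1e-2                  # assumed damping coefficient and step size
A = np.diag([1.0, 10.0])              # f(x) = 0.5 * x^T A x, a convex quadratic
grad = lambda x: A @ x

x, v = np.array([1.0, 1.0]), np.zeros(2)
for k in range(1, 2001):
    t = k * h
    v *= (t / (t + h)) ** alpha       # sub-flow 1: v' = -(alpha/t) v, exact on [t, t+h]
    v -= h * grad(x)                  # sub-flow 2: v' = -grad f(x), x' = v,
    x += h * v                        #             one symplectic Euler step

print("f(x_N) =", 0.5 * x @ A @ x)    # objective value decays toward the minimum 0
```

The point of the composition is that each sub-flow can be advanced by an integrator suited to its structure: the damping part admits an exact solution, while the conservative part is treated symplectically, which is what lets the discrete scheme inherit the energy behavior of the continuous system.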

Finally, numerical experiments are presented to validate the theoretical results.

Authors

Cristian Daniel Alecsa
Tiberiu Popoviciu Institute of Numerical Analysis, Romanian Academy, Cluj-Napoca, Romania
Romanian Institute of Science and Technology, Cluj-Napoca, Romania

Keywords

unconstrained optimization; rate of convergence; inertial algorithms; Nesterov; convex function; Hessian; dynamical system; splitting operator; vanishing damping; Lie-Trotter.

Paper coordinates

Cristian-Daniel Alecsa, The long time behavior and the rate of convergence of symplectic convex algorithms obtained via splitting discretizations of inertial damping systems, arXiv:2001.10831, 2020.


References

[1] C.D. Alecsa, S.C. Laszlo, T. Pinta, An extension of the second order dynamical system that models Nesterov’s convex gradient method, 2019, arXiv:1908.02574.

[2] C.D. Alecsa, I. Boros, T. Pinta, New optimization algorithms for neural network training using operator splitting techniques, 2019, arXiv:1904.12952.

[3] F. Alvarez, H. Attouch, J. Bolte, P. Redont, A second-order gradient-like dissipative dynamical system with Hessian-driven damping. Application to optimization and mechanics, J. Math. Pures Appl. 81 (2002), no. 8, pp. 747-779.

[4] H. Attouch, Z. Chbani, J. Fadili, H. Riahi, First-order optimization algorithms via inertial systems with Hessian driven damping, 2019, arXiv:1907.10536.

[5] H. Attouch, A. Cabot, Asymptotic stabilization of inertial gradient dynamics with time-dependent viscosity, J. Differential Equations 263 (2017), pp. 5412-5458, doi:10.1016/j.jde.2017.06.024

[6] H. Attouch, A. Cabot, Convergence rates of inertial forward-backward algorithms, SIAM J. Optim. 28 (2018), no. 1, pp. 849-874, https://doi.org/10.1137/17M1114739.

[7] H. Attouch, X. Goudou, P. Redont, The heavy ball with friction method. I. The continuous dynamical system: global exploration of the local minima of a real-valued function by asymptotic analysis of a dissipative system, Commun. Contemp. Math. 2 (2000), pp. 1-34.

[8] H. Attouch, J. Peypouquet, P. Redont, Fast convex minimization via inertial dynamics with Hessian driven damping, J. Differential Equations 261 (2016), pp. 5734-5783.

[9] N. Bansal, A. Gupta, Potential-function proofs for gradient methods, Theory of Computing (2019), pp. 1-32.

[10] A. Batkai, P. Csomos, B. Farkas, G. Nickel, Operator splitting for non-autonomous evolution equations, J. Functional Analysis 260 (2011), pp. 2163-2190.

[11] A. Beck, Introduction to Nonlinear Optimization: Theory, Algorithms, and Applications with MATLAB, vol. 19, SIAM, 2014.

[12] S. Blanes, F. Casas, A. Murua, Splitting and composition methods in the numerical integration of differential equations, Bol. Soc. Esp. Mat. Apl. 45 (2008), pp. 89-145.

[13] R.I. Bot, R. Csetnek, S.C. Laszlo, A second order dynamical approach with variable damping to nonconvex smooth minimization, Appl. Anal., 2018, doi:10.1080/00036811.2018.1495330.

[14] A. Cabot, H. Engler, S. Gadat, On the long time behavior of second-order differential equations with asymptotically small dissipation, Trans. Amer. Math. Soc. 361 (2009), pp. 5983-6017.

[15] A. Cabot, H. Engler, S. Gadat, Second order differential equations with asymptotically small dissipation and piecewise flat potentials, Electron. J. Differential Equations 17 (2009), pp. 33-38.

[16] C. Castera, J. Bolte, C. Fevotte, E. Pauwels, An inertial Newton algorithm for deep learning, 2019, arXiv:1905.12278.

[17] A. Chambolle, C. Dossal, On the convergence of the iterates of FISTA, HAL Id: hal-01060130, 2014, https://hal.inria.fr/hal-01060130v3.

[18] L. Chen, H. Luo, First order optimization methods based on Hessian-driven Nesterov accelerated gradient flow, 2019, arXiv:1912.09276.

[19] A. Defazio, On the curved geometry of accelerated optimization, Adv. Neural Inf. Process. Syst. 32 (NeurIPS 2019), 2019.

[20] K. Feng, On difference schemes and symplectic geometry, Proceedings of the 5th International Symposium on Differential Geometry and Differential Equations, August 1984, Beijing (1985), pp. 42-58.

[21] G. Franca, D.P. Robinson, R. Vidal, Gradient flows and accelerated proximal splitting methods, 2019, arXiv:1908.00865.

[22] G. Franca, J. Sulam, D.P. Robinson, R. Vidal, Conformal symplectic and relativistic optimization, 2019, arXiv:1903.04100.

[23] E. Hairer, C. Lubich, G. Wanner, Geometric Numerical Integration. Structure-Preserving Algorithms for Ordinary Differential Equations, second edition, Springer-Verlag, Berlin Heidelberg, 2006, doi:10.1007/3-540-30666-8.

[24] E. Hairer, G. Wanner, Euler methods, explicit, implicit and symplectic. In: Encyclopedia of Applied and Computational Mathematics, B. Engquist ed., Springer, Berlin Heidelberg, 2015, pp. 451-455.

[25] E. Hansen, F. Kramer, A. Ostermann, A second-order positivity preserving scheme for semilinear parabolic problems, Appl. Num. Math. 62 (2012), pp. 1428-1435.

[26] E. Hansen, A. Ostermann, Exponential splitting for unbounded operators, Math. of Comput. 78 (2009), no. 267, pp. 1485-1496.

[27] S.C. Laszlo, Convergence rates for an inertial algorithm of gradient type associated to a smooth nonconvex minimization, 2018, arXiv:1807.00387.

[28] G. Marchuk, Some applications of splitting-up methods to the solution of mathematical physics problems, Aplikace Matematiky 13 (1968), pp. 103-132.

[29] M. Muehlebach, M.I. Jordan, A dynamical systems perspective on Nesterov acceleration, 2019, arXiv:1905.07436.

[30] Y. Nesterov, A method for solving a convex programming problem with convergence rate O(1/k^2) (in Russian), Soviet Math. Dokl. 27 (1983), no. 2, pp. 543-547.

[31] Y. Nesterov, Introductory lectures on convex optimization: A basic course, vol. 87 of Applied Optimization, Kluwer Academic Publishers, Boston, MA, 2004.

[32] N.C. Nguyen, P. Fernandez, R.M. Freund, J. Peraire, Accelerated residual methods for iterative solution of systems of equations, SIAM J. Sci. Comput. 40 (2018), pp. A3157-A3179.

[33] M. Laborde, A.M. Oberman, A Lyapunov analysis for accelerated gradient methods: from deterministic to stochastic case, 2019, arXiv:1908.07861.

[34] B.T. Polyak, Some methods of speeding up the convergence of iteration methods, USSR Comput. Math. Math. Phys. 4 (1964), pp. 1-17.

[35] J.M. Sanz-Serna, Symplectic methods. In: Encyclopedia of Applied and Computational Mathematics, B. Engquist ed., Springer, Berlin Heidelberg, 2015, pp. 1451-1458.

[36] B. Shi, S.S. Du, W.J. Su, M. I. Jordan, Acceleration via symplectic discretization of high-resolution differential equations, 2019, arXiv:1902.03694.

[37] B. Shi, S.S. Iyengar, A conservation law method based on optimization. In: Mathematical Theories of Machine Learning – Theory and Applications, Springer, Cham, 2019, pp. 63-85, https://doi.org/10.1007/978-3-030-17076-9_8.

[38] G. Strang, On the construction and comparison of difference schemes, SIAM J. Numer. Anal. 5 (1968), no. 3, pp. 506-517.

[39] W.J. Su, S. Boyd, E.J. Candes, A differential equation for modeling Nesterov’s accelerated gradient method: Theory and insights, J. Mach. Learn. Res. 17 (2016), pp. 1-43.

[40] T. Sun, P. Yin, D. Li, C. Huang, L. Guan, H. Jiang, Non-ergodic convergence analysis of heavy-ball algorithms, AAAI, 2019.

[41] H.F. Trotter, On the product of semi-groups of operators, Proc. Am. Math. Soc. 10 (1959), no. 4, pp. 545-551.

[42] P. Tseng, On accelerated proximal-gradient methods for convex-concave optimization, Manuscript, May 21, 2008, mit.edu/~dimitrib/PTseng/papers/apgm.pdf.

[43] R. de Vogelaere, Methods of integration which preserve the contact transformation property of the Hamiltonian equations, Report no. 4, Dept. Math., Univ. of Notre Dame, Notre Dame, Ind., 1956.

