... From Zero to Hero: How local curvature at artless initial conditions leads away from bad minima | Tony Bonnaire

From Zero to Hero: How local curvature at artless initial conditions leads away from bad minima

(Left) Phases of the gradient flow dynamics in the phase retrieval loss landscape for N going to infinity with a pictural representation of the Hessian eigenvalue distribution when varying the signal-to-noise ratio lpha. The red bar shows when an outlier exists in this distribution. (Right) Evolution of the local curvature: dynamics projected in the direction of least stability of the Hessian matrix (black arrows) in the intermediate (orange) regime of signal-to-noise ratio. Starting from an artless initial condition, gradient descent reaches a bad minimum. The green arrows indicate downward directions towards the good solution during the dynamics. At the end, the local curvature has become positive (red arrows).

Abstract

We provide an analytical study of the evolution of the Hessian during gradient descent dynamics, and relate a transition in its spectral properties to the ability of finding good minima. We focus on the phase retrieval problem as a case study for complex loss landscapes. We first characterize the high-dimensional limit where both the number M and the dimension N of the data are going to infinity at fixed signal-to-noise ratio alpha=M/N. For small alpha, the Hessian is uninformative with respect to the signal. For alpha larger than a critical value, the Hessian displays at short-times a downward direction pointing towards good minima. While descending, a transition in the spectrum takes place. The direction is lost and the system gets trapped in bad minima. Hence, the local landscape is benign and informative at first, before gradient descent brings the system into an uninformative maze. Through both theoretical analysis and numerical experiments, we show that this dynamical transition plays a crucial role for finite (even very large) N. It allows the system to recover the signal well before the algorithmic threshold corresponding to the infinite limit. Our analysis sheds light on this new mechanism that facilitates gradient descent dynamics in finite dimensions, and highlights the importance of a good initialization based on spectral properties for optimization in complex high-dimensional landscapes.

Publication
arXiv:2403.02418
Tony Bonnaire
Tony Bonnaire
Postdoctoral researcher