...
Among the recent breakthroughs in machine learning of the past few years is the field of generative models, which can create ever more realistic samples from finite datasets of images, videos, or sounds. At the forefront of this revolution are Diffusion Models (DMs), which exploit the gradient of the log-probability density (the score) to generate new samples. However, the reasons for their success still lack a theoretical understanding. In this talk, I will give a brief introduction to diffusion models and then delve into the analysis of a well-defined high-dimensional model: a mixture of two Gaussians. Using methods from statistical physics, we will exhibit the various transitions taking place during the generation dynamics. In particular, we first identify a ‘speciation’ transition, where the generated sample acquires its class structure, followed by a second transition, called ‘collapse’, where the trajectories become attracted to one of the training points. These theoretical findings, established in the high-dimensional limit of the Gaussian mixture model, will then be generalised and validated by experiments on realistic datasets.
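To make the setting concrete, here is a minimal sketch (not the speaker's exact setup) of score-based generation for a symmetric mixture of two unit-variance Gaussians, for which the score of the noised density is available in closed form. It assumes an Ornstein-Uhlenbeck forward process; the dimension d, horizon T, step count, and cluster mean mu are illustrative choices. Because it integrates the exact population score rather than an empirical score estimated from a finite training set, only the speciation effect (the sign of the overlap with mu locking in) is visible; the collapse onto individual training points requires the empirical score.

    import numpy as np

    # Illustrative parameters -- not taken from the talk.
    d, T, n_steps = 100, 5.0, 1000
    rng = np.random.default_rng(0)
    mu = np.ones(d) / np.sqrt(d)  # unit-norm cluster mean; data ~ 0.5*N(+mu, I) + 0.5*N(-mu, I)

    def score(x, t):
        """Exact score of the OU-noised mixture 0.5*N(+mu, I) + 0.5*N(-mu, I).

        Under dx = -x dt + sqrt(2) dB, the density at time t is a mixture of
        N(+/- exp(-t)*mu, gamma_t I) with gamma_t = exp(-2t) + (1 - exp(-2t)) = 1
        for unit-variance clusters, so the score simplifies accordingly.
        """
        m_t = np.exp(-t) * mu
        return m_t * np.tanh(x @ m_t) - x

    # Reverse-time integration (Euler-Maruyama): start from pure noise at t = T
    # and run the reverse SDE dx = [x + 2*score(x, t)] ds + sqrt(2) dB back to t ~ 0.
    dt = T / n_steps
    x = rng.standard_normal(d)
    for k in range(n_steps):
        t = T - k * dt
        x += (x + 2.0 * score(x, t)) * dt + np.sqrt(2.0 * dt) * rng.standard_normal(d)

    # The sign of the overlap indicates which cluster the trajectory "speciated" into.
    print("final overlap with +mu:", x @ mu)

In this toy setting the overlap with mu commits to a sign at an intermediate time, which is the speciation transition the abstract refers to; replacing the closed-form score with one built from n training points would additionally exhibit the late-time collapse onto a single training sample.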