Diffusion Models

In a nutshell:

A diffusion model generates an image by iteratively refining a noisy sample until it becomes a clean image.

Because every refinement step needs its own pass through the network, generation is slow.

Forward Process

Diffusion models take the input image $x_0$ and gradually add Gaussian noise to it over a series of $T$ steps. We call this the forward process. Note that it is unrelated to the forward pass of a neural network.

At each time step, a small amount of Gaussian noise is added to the sample. You start with the image and keep adding noise until, after enough steps, the sample is indistinguishable from pure noise.
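To make the noising concrete, here is a minimal NumPy sketch. It assumes a DDPM-style linear variance schedule and the closed-form shortcut $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$, neither of which is spelled out in these notes; the function names, the schedule endpoints and the 8x8 stand-in "image" are illustrative choices only.

```python
import numpy as np

def make_alpha_bar(T=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for an assumed linear noise schedule."""
    betas = np.linspace(beta_start, beta_end, T)  # per-step noise variances (assumption)
    return np.cumprod(1.0 - betas)                # alpha_bar_t = prod_{s<=t} (1 - beta_s)

def forward_noise(x0, t, alpha_bar, rng):
    """Sample x_t directly from x_0; t is 1-indexed (1..T)."""
    eps = rng.standard_normal(x0.shape)           # Gaussian noise, same shape as the image
    a = alpha_bar[t - 1]
    xt = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps
    return xt, eps

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))          # stand-in for an image scaled to [-1, 1]
alpha_bar = make_alpha_bar()

x_mid, _ = forward_noise(x0, 500, alpha_bar, rng)   # partly noised
x_end, _ = forward_noise(x0, 1000, alpha_bar, rng)  # essentially pure noise
```

With this schedule, `alpha_bar` shrinks monotonically towards zero, so by the last step almost none of the original image is left in the sample.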

Reverse Process

Here, a neural network is trained to reverse the noise-adding process. Given a noisy sample at time $t$, the network predicts a slightly less noisy sample, which is the opposite of what the forward process does at that step.
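The reverse loop can be sketched as follows. This assumes a DDPM-style update in which the network predicts the noise in the sample (one common parameterization, not necessarily the one meant in these notes) and the same assumed linear schedule. There is no trained network here: `predict_noise` is a hypothetical oracle that is exact only because the toy "dataset" is a single point `x0_true`.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)          # assumed linear noise schedule
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

x0_true = np.array([0.5, -0.3, 0.8, 0.1])   # toy "dataset": one data point

def predict_noise(xt, t):
    """Stand-in for the trained network (oracle, valid only for this toy data)."""
    a = alpha_bar[t]
    return (xt - np.sqrt(a) * x0_true) / np.sqrt(1.0 - a)

def reverse_step(xt, t, rng):
    """One denoising step from x_t to a slightly less noisy sample; t is a 0-based index."""
    eps_hat = predict_noise(xt, t)
    mean = (xt - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alphas[t])
    if t == 0:
        return mean                          # no noise is added at the final step
    return mean + np.sqrt(betas[t]) * rng.standard_normal(xt.shape)

rng = np.random.default_rng(0)
x = rng.standard_normal(x0_true.shape)       # start from pure Gaussian noise
for t in reversed(range(T)):                 # T sequential steps
    x = reverse_step(x, t, rng)
```

In this toy setup the loop ends at `x0_true`. In a real model `predict_noise` would be a learned network, and the `T` sequential calls to it are what make sampling slow.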

Resources

A very good way to understand image generation, from the autoregressive to the diffusion point of view.

Diffusion models from the perspective of samples lying on a data manifold and the probability density associated with them.

Diffusion models from the perspective of score matching and Langevin dynamics.

The spectrum between diffusion and autoregression.

What are Diffusion Models? (Lil'Log)

