Diffusion Models

In a nutshell:

A diffusion model generates an image by repeatedly refining a noisy sample until a clean image emerges.

  • This repeated sampling makes generation slow, because every refinement step needs its own pass through the model.

Forward Process

Diffusion models take the input image $x_0$ and gradually add Gaussian noise to it over a series of $T$ steps. We will call this the forward process. Notably, this is unrelated to the forward pass of a neural network.

At each time step, a small amount of Gaussian noise is added to the current sample. Starting from the clean image, this is repeated until the sample is indistinguishable from pure noise.
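The noising loop can be sketched in a few lines of NumPy. This is only an illustration, not a reference implementation: the function name `forward_process`, the linear noise schedule, and its default values are assumptions that do not come from these notes. The update rule used here (scale the sample down slightly, then add scaled Gaussian noise) is the common variance-preserving choice.

```python
import numpy as np

def forward_process(x0, num_steps=1000, beta_start=1e-4, beta_end=0.02, seed=0):
    """Return the list [x_0, x_1, ..., x_T] of progressively noisier samples.

    Assumption: a linear schedule of per-step noise variances (betas).
    """
    rng = np.random.default_rng(seed)
    betas = np.linspace(beta_start, beta_end, num_steps)
    x = np.asarray(x0, dtype=np.float64)
    trajectory = [x]
    for beta in betas:
        noise = rng.standard_normal(x.shape)
        # Shrink the current sample a little and mix in fresh Gaussian noise.
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * noise
        trajectory.append(x)
    return trajectory

# Usage: a random array stands in for an image scaled to [-1, 1].
image = np.random.default_rng(1).uniform(-1.0, 1.0, size=(32, 32))
samples = forward_process(image)
print(len(samples), samples[-1].mean(), samples[-1].std())
```

With this schedule the last sample should have a mean near 0 and a standard deviation near 1, meaning almost nothing of the original image is left.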

Reverse Process

Here a neural network is trained to reverse the noise-adding process. Given a noisy sample at time step $t$, the network learns to predict a slightly less noisy sample, which is the opposite of what the forward process does at that step.
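The sketch below shows, under stated assumptions, how a training pair could be built and how generation would loop over the time steps. `make_training_pair`, `sample`, and `denoise_step` are hypothetical names introduced for illustration; `denoise_step` stands in for the trained network, which is not implemented here, and the noise schedule is again an assumed linear one.

```python
import numpy as np

def make_training_pair(x0, t, betas, rng):
    """Run t forward steps and return (x_t, x_{t-1}).

    The network input would be x_t and its target the less noisy x_{t-1}.
    """
    if t < 1:
        raise ValueError("t must be at least 1")
    x_prev = np.asarray(x0, dtype=np.float64)
    x = x_prev
    for beta in betas[:t]:
        x_prev = x
        x = np.sqrt(1.0 - beta) * x_prev + np.sqrt(beta) * rng.standard_normal(x.shape)
    return x, x_prev

def sample(denoise_step, shape, num_steps, rng):
    """Start from pure noise and apply the denoiser once per time step."""
    x = rng.standard_normal(shape)
    for t in range(num_steps, 0, -1):
        # denoise_step(x, t) is a placeholder for the trained network.
        x = denoise_step(x, t)
    return x

# Usage with a dummy denoiser that merely shrinks the sample.
rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 50)  # assumed linear schedule
x_t, x_t_minus_1 = make_training_pair(np.ones((8, 8)), t=10, betas=betas, rng=rng)
generated = sample(lambda x, t: 0.5 * x, shape=(8, 8), num_steps=50, rng=rng)
```

The loop in `sample` calls the denoiser once for every time step, which is why generation is slow.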

Resources

A very good way to understand image generation, from the autoregressive to the diffusion point of view.
