YouTube video summary

Stanford CS236: Deep Generative Models I 2023 I Lecture 17 - Discrete Latent Variable Models

Artificial intelligence

06 May 2024 · 3 min summary · From Stanford Online

Score-Based Diffusion Models (SDMs)

SDMs are closely connected to denoising diffusion probabilistic models (DDPMs).
DDPMs can be interpreted as a VAE where the encoder adds noise to the data and the decoder denoises it.
Optimizing the evidence lower bound in DDPMs corresponds to learning a sequence of denoisers, similar to noise conditional score models.
The continuous-time (diffusion) version of DDPMs considers a continuous spectrum of noise levels rather than a fixed, finite set, allowing for more efficient sampling and likelihood evaluation.
The process of adding noise is described by a stochastic differential equation (SDE).
The drift term in the SDE becomes important when reversing the direction of time.
The reverse SDE has a drift term that involves the score of the perturbed data density at the corresponding time t.
Both the forward and reverse SDEs describe the same kind of trajectories, and the only difference is the direction of time.
Score-based models can be used to learn generative models by estimating score functions using a neural network.
The score-based MCMC method uses Langevin dynamics to generate samples from a density corresponding to a given time.
Discretizing the time axis in the score-based SDE leads to numerical errors, which can be reduced by interleaving Langevin dynamics steps that correct the samples at each time.
Score-based models can be converted into flow models by eliminating the noise at every step, which turns the SDE into an ODE and results in an infinitely deep, continuous-time normalizing flow model.
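
The reverse-time sampling and Langevin correction described above can be sketched on a one-dimensional toy problem. This is a minimal illustration, not the lecture's implementation: the data distribution N(MU0, S0^2), the constant diffusion coefficient G, the step counts, and the helper names (variance, score, pc_sample) are all assumptions chosen so that the score is known in closed form. In practice a trained network would stand in for score.

```python
import math
import random

# Toy setup (illustrative assumptions, not from the lecture):
# data ~ N(MU0, S0^2); forward SDE dx = G dw, so p_t = N(MU0, S0^2 + G^2 * t).
MU0, S0, G, T = 2.0, 0.5, 5.0, 1.0

def variance(t):
    """Variance of the perturbed density p_t."""
    return S0 ** 2 + G ** 2 * t

def score(x, t):
    """Exact score of p_t; a trained network s_theta(x, t) would replace this."""
    return -(x - MU0) / variance(t)

def pc_sample(rng, n_steps=500, corrector_scale=0.05):
    """Draw one sample by running the reverse SDE from t = T down to t = 0."""
    h = T / n_steps
    # Start from a prior that approximates p_T without looking at the data.
    x = rng.gauss(0.0, G * math.sqrt(T))
    for k in range(n_steps, 0, -1):
        t = k * h
        # Predictor: one Euler-Maruyama step of the reverse SDE, from t to t - h.
        x = x + G ** 2 * score(x, t) * h + G * math.sqrt(h) * rng.gauss(0.0, 1.0)
        # Corrector: one Langevin step targeting the density at time t - h.
        t_next = t - h
        eps = corrector_scale * variance(t_next)
        x = x + eps * score(x, t_next) + math.sqrt(2 * eps) * rng.gauss(0.0, 1.0)
    return x

rng = random.Random(0)
samples = [pc_sample(rng) for _ in range(500)]
mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
print(f"mean={mean:.2f} std={std:.2f}")
```

With these settings the sample mean and standard deviation should land close to MU0 and S0. Scaling the Langevin step size with the noise level is one plausible choice here, not something the summary specifies.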

Efficient Sampling Techniques

The sampling process in SDMs can be reinterpreted as solving an ODE, where the dynamics of the ODE are defined by the score function of the diffusion model.
This perspective allows for leveraging techniques from numerical analysis and scientific computing to improve sampling efficiency and generate higher-quality samples.
Consistency models are neural networks that directly output the solution of the ODE, enabling fast sampling procedures.
Parallel-in-time methods can further accelerate the sampling process by leveraging multiple GPUs to compute the solution of the ODE in parallel.
Distillation techniques can be used to train student models that can approximate the solution of the ODE in fewer steps, leading to even faster sampling.
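
The ODE view of sampling can be sketched in the same kind of toy setting. The example below assumes a one-dimensional Gaussian data distribution and a forward SDE dx = G dw, so the score and the exact ODE solution are both available in closed form; the names ode_sample and exact_flow are hypothetical. It uses a plain Euler solver, and any other ODE solver could be swapped in. In this toy, exact_flow plays the role that a consistency model would have to learn: a direct map from the noisy endpoint to the clean sample.

```python
import math

# Toy values, assumed for illustration: data ~ N(MU0, S0^2), forward SDE dx = G dw.
MU0, S0, G, T = 2.0, 0.5, 5.0, 1.0

def variance(t):
    """Variance of the perturbed density p_t."""
    return S0 ** 2 + G ** 2 * t

def score(x, t):
    """Exact score of p_t; a trained score network would replace this."""
    return -(x - MU0) / variance(t)

def ode_sample(x_T, n_steps):
    """Euler integration of dx/dt = -0.5 * G^2 * score(x, t) from t = T down to 0."""
    h = T / n_steps
    x = x_T
    for k in range(n_steps, 0, -1):
        # Stepping backward in time by h flips the sign of the drift.
        x = x + 0.5 * G ** 2 * score(x, k * h) * h
    return x

def exact_flow(x_T):
    """Closed-form solution of the same ODE for this Gaussian toy."""
    return MU0 + (x_T - MU0) * math.sqrt(variance(0.0) / variance(T))

x_T = 5.0
target = exact_flow(x_T)
err_coarse = abs(ode_sample(x_T, 10) - target)
err_fine = abs(ode_sample(x_T, 1000) - target)
print(f"error with 10 steps: {err_coarse:.4f}, with 1000 steps: {err_fine:.4f}")
```

Because no noise is injected, the same x_T always maps to the same output, and the trade-off between step count and accuracy becomes a standard numerical-integration question.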

Stable Diffusion

Stable Diffusion uses a latent diffusion model, which adds an extra encoder at the beginning of the model and a matching decoder at the end.
The diffusion model then operates on the encoder's low-resolution or low-dimensional latents, which makes training faster.
Stable Diffusion pre-trains the outer encoder and decoder and then keeps them fixed while training the diffusion model over the latent space.
To incorporate text into the model, a pre-trained language model is used to map the text to a vector representation, which is then fed into the neural network architecture.

Conditional Generation

To control the generation process without training a different model, the prior distribution of the generative model is combined with a classifier's likelihood to sample from the conditional distribution of images given a specific label.
Computing the denominator of the posterior distribution is intractable, making it difficult to directly sample from the posterior.
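Working at the level of scores simplifies the computation: the denominator does not depend on the image, so its gradient vanishes and the posterior score is just the sum of the prior score and the classifier score, allowing for easy incorporation of pre-trained models and classifiers.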
By modifying the drift in the SDE or ODE to include the score of the classifier, one can steer the generative process towards images that are consistent with a desired class or caption.
Classifier-free guidance is a technique that avoids explicit classifier training by taking the difference of the scores of two diffusion models, one conditioned on side information and the other not; this difference acts as an implicit classifier score.
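
The score-level view of guidance can be sketched with a toy example. Everything here is an assumption made for illustration: the unconditional model is a standard normal, the classifier is a logistic function with slope W, and the function names are hypothetical. The cfg_score helper uses one common parameterization of classifier-free guidance, where a scale of 1 recovers the conditional score. The summary itself does not give a formula.

```python
import math
import random

W = 3.0  # slope of the toy classifier p(y=1 | x) = sigmoid(W * x); assumed value

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def prior_score(x):
    """Score of the unconditional model, here p(x) = N(0, 1)."""
    return -x

def classifier_score(x):
    """d/dx log p(y=1 | x) for the logistic classifier."""
    return W * (1.0 - sigmoid(W * x))

def guided_score(x, scale=1.0):
    """Bayes' rule in score form; the intractable p(y) has zero gradient in x."""
    return prior_score(x) + scale * classifier_score(x)

def cfg_score(cond, uncond, scale):
    """Classifier-free guidance: cond - uncond acts as an implicit classifier score."""
    return uncond + scale * (cond - uncond)

def langevin(score_fn, rng, n_steps=300, eps=0.01):
    """Unadjusted Langevin dynamics driven by the given score function."""
    x = rng.gauss(0.0, 1.0)
    for _ in range(n_steps):
        x = x + eps * score_fn(x) + math.sqrt(2 * eps) * rng.gauss(0.0, 1.0)
    return x

rng = random.Random(0)
samples = [langevin(guided_score, rng) for _ in range(1000)]
guided_mean = sum(samples) / len(samples)
print(f"mean of guided samples: {guided_mean:.2f}")

# With exact scores, the conditional/unconditional difference is the classifier score.
x0 = 0.7
cond, uncond = guided_score(x0), prior_score(x0)
implicit = cond - uncond
```

The guided samples shift toward positive x, the region the toy classifier assigns to y = 1, even though the unconditional model was never retrained.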
