YouTube video summary

Stanford CS236: Deep Generative Models I 2023 I Lecture 12 - Energy Based Models

Artificial intelligence

06 May 2024 · 2 min summary · From Stanford Online

Energy-Based Models (EBMs)

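EBMs use an energy function to define the probability distribution of the data up to a normalizing constant (the partition function), which is generally intractable to compute.
Sampling from EBMs is challenging, especially in high dimensions, so training methods that draw samples from the model at every step are computationally expensive.
This motivates alternative training methods for EBMs that do not require sampling from the model during training.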
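The role of the normalizing constant can be illustrated with a toy example. The sketch below is not from the lecture: it assumes the convention p(x) = exp(-E(x)) / Z and a hypothetical one-dimensional double-well energy, and it estimates Z on a grid, which is feasible only because the example is one-dimensional.

```python
import math

def energy(x, a=1.0, b=2.0):
    """Hypothetical double-well energy; lower energy means higher probability."""
    return a * x**4 - b * x**2

def unnormalized_density(x):
    # exp(-E(x)) is cheap to evaluate at any single point.
    return math.exp(-energy(x))

def partition_function(lo=-5.0, hi=5.0, n=10_001):
    """Trapezoidal estimate of Z = integral of exp(-E(x)) dx on a 1D grid."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * unnormalized_density(lo + i * h)
    return total * h

Z = partition_function()

def density(x):
    """Normalized density; only available because Z could be computed here."""
    return unnormalized_density(x) / Z

# Ratios of probabilities never need Z.
ratio = unnormalized_density(1.0) / unnormalized_density(0.0)

# The same grid in 10 dimensions would need n**10 energy evaluations.
grid_points_10d = 10_001 ** 10
```

Comparing two points only needs the energy difference, while assigning an actual probability needs Z, and the grid-based estimate of Z grows exponentially with the dimension.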

Score Matching

The score function is the gradient of the log-density with respect to the input; it describes the same distribution through its gradients instead of the likelihood itself, and for an EBM it does not depend on the partition function.
The Fisher divergence between two probability densities can be used as a loss function for training EBMs.
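The Fisher divergence is the expected squared norm, under the data distribution, of the difference between the gradients of the log data density and the log model density.
Although the gradient of the log data density is unknown, integration by parts rewrites the divergence, up to a constant, in terms of the model's score and its derivatives only; the resulting loss can be estimated from data samples and optimized as a function of the model parameters.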
The loss function encourages the data points to be local maxima of the model's log-density (small gradient, negative curvature), ensuring a good fit of the model to the data.
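A small worked example may make the score matching loss concrete. The sketch below is an illustration rather than the lecture's code: it assumes a one-dimensional model with unnormalized log-density -0.5 * lam * (x - mu)^2, whose score is -lam * (x - mu) and whose score derivative is -lam, and it minimizes the sample estimate of E[0.5 * score^2 + d(score)/dx] by gradient descent. The names `mu`, `lam`, `fit`, and the synthetic data are hypothetical.

```python
import random

def score(x, mu, lam):
    """Model score d/dx log p(x) for the unnormalized Gaussian with precision lam."""
    return -lam * (x - mu)

def score_matching_loss(data, mu, lam):
    """Sample estimate of E_data[0.5 * s(x)^2 + ds/dx]; here ds/dx = -lam."""
    return sum(0.5 * score(x, mu, lam) ** 2 - lam for x in data) / len(data)

def fit(data, steps=5000, lr=0.05):
    """Gradient descent on the score matching loss using analytic gradients."""
    n = len(data)
    m1 = sum(data) / n                   # sample mean of x
    m2 = sum(x * x for x in data) / n    # sample mean of x^2
    mu, lam = 0.0, 1.0
    for _ in range(steps):
        second_moment = m2 - 2.0 * mu * m1 + mu * mu   # mean of (x - mu)^2
        grad_mu = -lam * lam * (m1 - mu)
        grad_lam = lam * second_moment - 1.0
        mu -= lr * grad_mu
        lam -= lr * grad_lam
    return mu, lam

random.seed(0)
data = [random.gauss(2.0, 1.5) for _ in range(2000)]  # synthetic "data"
mu_hat, lam_hat = fit(data)
```

For this model family the minimizer coincides with the sample mean and the (biased) sample variance, so the fit can be checked against those closed forms. No normalizing constant and no sampling from the model appear anywhere in the procedure.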

Contrastive Learning

An alternative training method for EBMs involves contrasting data to samples from a noise distribution rather than directly to samples from the model.
By parameterizing the discriminator in terms of an energy-based model, the optimal discriminator will force the energy-based model to match the data distribution.
Contrastive learning with EBMs involves distinguishing between real data and fake samples generated from a fixed noise distribution.
The noise distribution should be close to the data distribution for effective learning; if the two are easy to tell apart, the classification task provides little learning signal.
Training needs samples only from the noise distribution, not from the model; after training, the learned model is used on its own as an energy-based model.
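The claim about the optimal discriminator can be checked numerically. The sketch below assumes two hypothetical Gaussian densities standing in for the data and noise distributions; it uses the standard result that the best classifier between them is D(x) = p_data(x) / (p_data(x) + p_noise(x)) and inverts it to recover the data density from the noise density alone.

```python
import math

def normal_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def p_data(x):
    # Hypothetical data density, known here only so the result can be checked.
    return normal_pdf(x, 1.0, 0.5)

def p_noise(x):
    # Hypothetical noise density: easy to sample from and to evaluate.
    return normal_pdf(x, 0.0, 2.0)

def optimal_discriminator(x):
    """Probability that x came from the data rather than the noise."""
    return p_data(x) / (p_data(x) + p_noise(x))

def density_from_discriminator(x):
    """Invert D = p / (p + q) to get p = q * D / (1 - D)."""
    d = optimal_discriminator(x)
    return p_noise(x) * d / (1.0 - d)

points = [-1.0, 0.0, 0.5, 1.0, 2.0]
recovered = [density_from_discriminator(x) for x in points]
```

This is why writing the discriminator in terms of an energy-based model works: a discriminator of the form p_model / (p_model + p_noise) can only be optimal if p_model matches the data density.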

Noise Contrastive Estimation (NCE)

NCE is similar to a GAN in that it uses binary cross-entropy loss and is a likelihood-free method.
Unlike a GAN, NCE does not involve a minimax optimization and is more stable to train.
NCE requires the ability to evaluate the density of the noise distribution at the contrastive samples, while a GAN only requires the ability to sample from the generator.
In NCE, the discriminator is trained to distinguish between real data and samples from the noise distribution, and the energy function derived from the discriminator defines an energy-based model.
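The NCE recipe can be sketched end to end on a toy problem. Everything below is an assumption made for illustration: the model is a hypothetical one-dimensional EBM with log-density -0.5 * x^2 + a * x + c, where `c` is a learned scalar standing in for the negative log partition function; the noise is N(0, 2^2); and the data are synthetic draws from N(1, 1). The discriminator logit is log p_model(x) - log p_noise(x), trained with binary cross-entropy by plain gradient ascent.

```python
import math
import random

NOISE_STD = 2.0  # assumed noise distribution: N(0, NOISE_STD^2)

def log_noise(x):
    return -0.5 * (x / NOISE_STD) ** 2 - math.log(NOISE_STD * math.sqrt(2.0 * math.pi))

def log_model(x, a, c):
    """Hypothetical EBM: energy 0.5*x^2 - a*x, plus learned log-normalizer c."""
    return -0.5 * x * x + a * x + c

def sigmoid(t):
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def nce_fit(data, noise, steps=400, lr=0.2):
    a, c = 0.0, 0.0
    for _ in range(steps):
        grad_a = grad_c = 0.0
        for x in data:    # label 1: raise the logit on real data
            r = 1.0 - sigmoid(log_model(x, a, c) - log_noise(x))
            grad_a += r * x / len(data)
            grad_c += r / len(data)
        for y in noise:   # label 0: lower the logit on noise samples
            r = sigmoid(log_model(y, a, c) - log_noise(y))
            grad_a -= r * y / len(noise)
            grad_c -= r / len(noise)
        a += lr * grad_a  # gradient ascent on the log-likelihood of the labels
        c += lr * grad_c
    return a, c

random.seed(0)
data = [random.gauss(1.0, 1.0) for _ in range(2000)]
noise = [random.gauss(0.0, NOISE_STD) for _ in range(2000)]
a_hat, c_hat = nce_fit(data, noise)
```

In this setup `a_hat` should land near the data mean, and `c_hat` should land near the true negative log partition function, -0.5 * a^2 - 0.5 * log(2 * pi), even though the normalizer was never computed explicitly. The training loop draws samples only from the noise distribution and evaluates its density, with no minimax step.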

Flow Contrastive Estimation (FCE)

FCE is a variant of NCE where the noise distribution is defined by a normalizing flow model.
The flow model is trained adversarially to confuse the discriminator, making the classification problem harder and the noise distribution closer to the data distribution.
FCE provides both an energy-based model and a flow model, with the choice of which to use depending on the specific task.
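The effect of moving the flow closer to the data can be illustrated without running a full adversarial training loop. The sketch below rests on simplifying assumptions: the flow is a hypothetical one-dimensional affine map of a standard normal, the energy-based model is replaced by a stand-in equal to a known data density N(1, 0.5^2), and the discriminator is taken to be the optimal one, D = p / (p + q). It computes the classification objective V = E_p[log D] + E_q[log(1 - D)] for three flow settings.

```python
import math

def normal_logpdf(x, mean, std):
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))

def flow_logpdf(x, shift, log_scale):
    """Affine flow x = shift + exp(log_scale) * z with z ~ N(0, 1).
    Change of variables: log q(x) = log N(z; 0, 1) - log_scale."""
    z = (x - shift) * math.exp(-log_scale)
    return normal_logpdf(z, 0.0, 1.0) - log_scale

def log_add_exp(a, b):
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def value(log_p, log_q, lo=-10.0, hi=10.0, n=4001):
    """V = E_p[log D] + E_q[log(1 - D)] with D = p / (p + q), by trapezoidal rule."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        lp, lq = log_p(x), log_q(x)
        log_sum = log_add_exp(lp, lq)
        integrand = math.exp(lp) * (lp - log_sum) + math.exp(lq) * (lq - log_sum)
        total += (0.5 if i in (0, n - 1) else 1.0) * integrand
    return total * h

def log_p(x):
    # Stand-in for the EBM log-density, assumed equal to the data density.
    return normal_logpdf(x, 1.0, 0.5)

v_far = value(log_p, lambda x: flow_logpdf(x, -3.0, 0.0))
v_near = value(log_p, lambda x: flow_logpdf(x, 0.8, math.log(0.6)))
v_match = value(log_p, lambda x: flow_logpdf(x, 1.0, math.log(0.5)))
```

The objective is highest when the flow is far from the data (the discriminator wins easily) and falls to its minimum of -2 log 2 when the flow matches exactly, which is the direction the adversarially trained flow pushes in. Because the flow density is exact through the change-of-variables formula, it satisfies the NCE requirement that the noise density can be evaluated.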
