# Diffusion

Inspired by Yang Song's *Generative Modeling by Estimating Gradients of the Data Distribution*.

A clean PyTorch implementation of Denoising Diffusion Probabilistic Models (DDPM) with DDIM sampling and Classifier-Free Guidance (CFG), built on top of PyTorch Lightning.

> **Note:** This repository is a personal study project aimed at reproducing and understanding diffusion models from the ground up. The goal is to stay in sync with the literature and build intuition for the moving parts, not to provide a production-ready library.
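For orientation, the DDPM forward process that everything here builds on admits a closed form: a sample at timestep t is a schedule-weighted mix of the clean image and Gaussian noise. The sketch below is illustrative only; the function and variable names (`q_sample`, `alpha_bar`) are not this repository's API.

```python
import torch

def q_sample(x0, t, alpha_bar):
    """Forward process in closed form:
    x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps, eps ~ N(0, I)."""
    eps = torch.randn_like(x0)
    abar_t = alpha_bar[t].view(-1, 1, 1, 1)  # broadcast over (B, C, H, W)
    xt = abar_t.sqrt() * x0 + (1 - abar_t).sqrt() * eps
    return xt, eps

# Linear beta schedule over 1000 timesteps
betas = torch.linspace(1e-4, 0.02, 1000)
alpha_bar = torch.cumprod(1 - betas, dim=0)

x0 = torch.randn(4, 3, 32, 32)        # a batch of "images"
t = torch.randint(0, 1000, (4,))      # a random timestep per sample
xt, eps = q_sample(x0, t, alpha_bar)
```

The denoiser is then trained to predict `eps` from `xt` and `t`, which is the loss the training runs below minimize.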

## Results

### CelebA

*CelebA samples*

ResUNet (hidden_dim=192, depth=4, attention at 16×16 and 8×8), cosine schedule, 1000 timesteps, trained for 105k steps on 64×64 crops.

### CIFAR-10

*CIFAR-10 denoising*

ResUNet (hidden_dim=128, depth=4, attention at 16×16 and 4×4), cosine schedule, 1000 timesteps, trained for 400 epochs on 32×32 images.

### Imagenette (class-conditional)

*Imagenette samples*

ResUNet (hidden_dim=192, depth=4, attention at 16×16 and 8×8), cosine schedule, 1000 timesteps, trained for 120k steps on 64×64 images with CFG (guidance scale=4, class dropout=0.15).

## Features

- ResUNet backbone with time-conditioned residual blocks and multi-head self-attention
- DDPM ancestral sampling and DDIM deterministic sampling
- Classifier-Free Guidance (CFG) for class-conditional generation
- EMA weight averaging for stable, high-quality samples
- Cosine and linear noise schedules
- WandB integration with automatic sample logging during training
- Config-driven training via LightningCLI: no code changes needed to swap datasets or architectures
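The cosine schedule listed above is typically the Nichol & Dhariwal formulation, where the cumulative signal level follows a squared cosine and the per-step betas are derived from its ratios. A minimal sketch, assuming that formulation (the function name `cosine_alpha_bar` is illustrative, not this repository's API):

```python
import math
import torch

def cosine_alpha_bar(T: int, s: float = 0.008):
    """Cosine schedule (Nichol & Dhariwal): abar_t = f(t) / f(0),
    where f(t) = cos(((t/T + s) / (1 + s)) * pi/2) ** 2."""
    t = torch.arange(T + 1, dtype=torch.float64)
    f = torch.cos((t / T + s) / (1 + s) * math.pi / 2) ** 2
    abar = f / f[0]
    # Recover per-step betas from ratios of consecutive abar values,
    # clamped as in the paper to avoid singularities near t = T.
    betas = (1 - abar[1:] / abar[:-1]).clamp(max=0.999)
    return abar[1:], betas

abar, betas = cosine_alpha_bar(1000)
```

Compared with the linear schedule, this keeps more signal at early timesteps and destroys it more gradually, which is why it tends to work better at low resolutions.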

## Datasets

| Dataset      | Resolution | Conditional     |
|--------------|------------|-----------------|
| FashionMNIST | 28×28      | No              |
| CIFAR-10     | 32×32      | No              |
| Imagenette   | 64×64      | Yes (10 classes) |
| CelebA       | 64×64      | No              |

## Installation

Requires Python 3.14+ and uv.

```bash
git clone https://github.com/Soarxyn/diffusion
cd diffusion
uv sync
```

For GPU support (CUDA 13.0):

```bash
uv sync --extra-index-url https://download.pytorch.org/whl/cu130
```

## Training

Training is fully config-driven. Pick a preset or supply your own YAML:

```bash
# FashionMNIST (default, quick experiment)
uv run python -m diffusion fit --config config/default.yaml

# CIFAR-10
uv run python -m diffusion fit --config config/cifar10.yaml

# Imagenette with CFG
uv run python -m diffusion fit --config config/imagenette_cond.yaml

# CelebA
uv run python -m diffusion fit --config config/celeba.yaml
```

You can override any parameter directly from the command line:

```bash
uv run python -m diffusion fit --config config/celeba.yaml --model.lr 1e-4 --trainer.max_epochs 200
```

Checkpoints are saved to results/ and samples are logged to WandB every N steps.

*Training loss*

## Architecture

The model is a ResUNet with:

- Time embeddings injected into each residual block via scale-shift normalization
- Self-attention at configurable resolution levels
- Pixel-shuffle downsampling / nearest-neighbour upsampling
- Class embeddings added to the time embedding for conditional generation (CFG uses a null token with dropout during training)
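At sampling time, classifier-free guidance combines two forward passes of the same network, one conditioned on the class and one on the null token, and extrapolates between them. A minimal sketch of that combination, assuming an epsilon-predicting model; the names (`cfg_eps`, `null_token`) are illustrative, not this repository's API:

```python
import torch

def cfg_eps(model, x_t, t, y, guidance_scale: float = 4.0, null_token: int = 10):
    """Classifier-free guidance:
    eps = eps_uncond + w * (eps_cond - eps_uncond).
    With w > 1 this pushes samples toward the conditional mode."""
    eps_cond = model(x_t, t, y)                                  # class-conditional pass
    eps_uncond = model(x_t, t, torch.full_like(y, null_token))   # null-token pass
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)
```

The class-dropout rate used during training (0.15 in the Imagenette run above) is what teaches the model the unconditional branch in the first place.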

The diffusion process supports both linear and cosine noise schedules. At inference, both DDPM (full 1000-step) and DDIM (configurable steps, default 50) samplers are available.
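A single deterministic DDIM update (the eta = 0 case) first inverts the forward-process formula to estimate the clean image, then re-noises it to the previous timestep. A sketch under that assumption; `ddim_step` and its argument names are illustrative, not this repository's API:

```python
import torch

@torch.no_grad()
def ddim_step(eps, x_t, abar_t, abar_prev):
    """One deterministic DDIM update (eta = 0):
    predict x0 from the noise estimate, then map to timestep t-1."""
    x0_pred = (x_t - (1 - abar_t).sqrt() * eps) / abar_t.sqrt()
    return abar_prev.sqrt() * x0_pred + (1 - abar_prev).sqrt() * eps
```

Because the update is deterministic given the noise estimate, it can skip timesteps, which is what makes 50-step sampling possible against a 1000-step training schedule.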

## References

## Acknowledgements
