DDPM Project

This repository contains the implementation of Denoising Diffusion Probabilistic Models (DDPM).

Introduction

Denoising Diffusion Probabilistic Models (DDPM) are a class of generative models that gradually add noise to data and learn to reverse that process, so that new samples can be generated starting from pure noise. This repository provides an implementation of DDPM.
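
To make the forward (noising) half of that process concrete, here is a minimal, dependency-free sketch. It is not taken from this repository's code: the function names are hypothetical, and the linear schedule from 1e-4 to 0.02 over 1000 steps is assumed from the original DDPM paper. It uses the closed form x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps to jump directly to any noise level.

```python
import math
import random

def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (values assumed from the DDPM paper)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def q_sample(x0, t, alpha_bars, rng=random):
    """Draw x_t from q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)."""
    signal = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [signal * v + noise_scale * rng.gauss(0.0, 1.0) for v in x0]

betas = linear_beta_schedule()
alpha_bars = cumulative_alphas(betas)

x0 = [0.5, -0.25, 1.0, 0.0]              # a toy "image" of four pixel values
x_early = q_sample(x0, 10, alpha_bars)   # still close to x0
x_late = q_sample(x0, 999, alpha_bars)   # almost pure Gaussian noise
```

Because alpha_bar_t shrinks toward zero as t grows, late timesteps keep almost none of the original signal, which is why sampling can start from pure noise.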

Installation

To install the necessary dependencies, run:

pip install -r requirements.txt

Usage

To train the model, use the following command:

python train.py

To generate samples, use:

python generate.py

Game

To help explain the model and its workings, we're building a small game in which the user plays the role of the U-Net reverse-diffusion model and is tasked with denoising images whose noise is made of grids of lines.

Use learndiffusion.vercel.app to access an early version of the game. You can also contribute to the game by checking out the diffusion_game branch. A model showcase will also be added: the model's weights will be downloaded from the internet, and the model files will be installed and loaded into a Gradio interface for direct inference on the Vercel deployment. Feel free to contribute changes for this; an issue is open.

Explanations and Mathematics

  • slides from presentation :
  • notes/explanations : HERE
  • a cute lab talk ppt:
  • plato's allegory : <link to REPUBLIC>

Resources

Papers for background

  • UNET Paper for Biomedical Segmentation
  • Autoencoder
  • Variational Autoencoder
  • Markov Hierarchical VAE
  • Introductory Lectures on Diffusion Process

YouTube videos and courses

Mathematics

  • Outliers
  • Omar Jahil

PyTorch Implementation

Pretrained Weights

Weights for the model can be found in pretrained_weights.

For loading the pretrained weights:

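import torch

# SimpleUnet is this repository's model class; define or import it before running this
model2 = SimpleUnet()
model2.load_state_dict(torch.load("/content/drive/MyDrive/Research Work/mlsa/DDPM/model_weights.pth", map_location="cpu"))
model2.eval()  # switch to inference mode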

For making inferences (TODO: the sampling function still has errors, including boolean errors; issues may be opened for others to solve as an exercise if needed):

import torch
from torchvision.utils import save_image

device = "cuda" if torch.cuda.is_available() else "cpu"  # fall back to CPU when no GPU is present
num_samples = 8  # Number of images to generate
image_size = (3, 32, 32)  # Example for CIFAR10
noise = torch.randn(num_samples, *image_size, device=device)

model2.to(device)
# Generate images by denoising
with torch.no_grad():
    generated_images = model2.sample(noise)

# Save the generated images
save_image(generated_images, "generated_images.png", nrow=4, normalize=True)
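
Since the sampling function is flagged as unfinished, a reference sketch of the standard DDPM ancestral sampling loop may help when debugging it. This is not the repository's SimpleUnet.sample method. All names here (make_schedule, p_sample_step, sample, dummy_predict_noise) are hypothetical, the schedule values are assumed from the DDPM paper, and a dummy noise predictor stands in for the trained network so the example runs without PyTorch. The point to check against a real implementation is the t == 0 branch: fresh noise is added at every step except the last.

```python
import math
import random

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear betas and their cumulative products alpha_bar_t (assumed values)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    betas = [beta_start + i * step for i in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars

def p_sample_step(x_t, t, predicted_noise, betas, alpha_bars, rng=random):
    """One reverse step x_t -> x_{t-1}, given the model's noise prediction."""
    beta_t = betas[t]
    coef = beta_t / math.sqrt(1.0 - alpha_bars[t])
    inv_sqrt_alpha = 1.0 / math.sqrt(1.0 - beta_t)
    mean = [inv_sqrt_alpha * (x - coef * eps) for x, eps in zip(x_t, predicted_noise)]
    if t == 0:
        return mean  # final step: return the mean, no fresh noise
    sigma_t = math.sqrt(beta_t)  # the sigma_t^2 = beta_t variance choice
    return [m + sigma_t * rng.gauss(0.0, 1.0) for m in mean]

def sample(predict_noise, num_values, betas, alpha_bars, rng=random):
    """Start from Gaussian noise and apply reverse steps for t = T-1 down to 0."""
    x = [rng.gauss(0.0, 1.0) for _ in range(num_values)]
    for t in reversed(range(len(betas))):
        x = p_sample_step(x, t, predict_noise(x, t), betas, alpha_bars, rng)
    return x

def dummy_predict_noise(x_t, t):
    """Placeholder for the trained network: always predicts zero noise."""
    return [0.0 for _ in x_t]

betas, alpha_bars = make_schedule(num_steps=50)
result = sample(dummy_predict_noise, 4, betas, alpha_bars)
```

In a tensor-based version, the same branch is usually written as a mask or a plain Python `if t > 0`, applied per timestep rather than by comparing a whole tensor in a boolean context.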

Contributing

Contributions are welcome! Please open an issue or submit a pull request.

Future Ideas

  • Make the model ONNX-compatible for training and inference on Intel GPUs
  • Build a Stable Diffusion text-to-image model using a CLIP implementation
  • Train the current model on a much larger dataset so that it generalizes better and captures more nuance