Task: Number-To-Image
Dataset: ylecun/mnist
Total training time: ~10 minutes
Inputs: Number (0-9)
Outputs: 32x32 image
Params: ~391k
Framework: PyTorch, diffusers
Author: Paul Courneya (Harley-ml)
MNIST-IMG-390k is a ~390k-parameter model trained to generate a 32x32 image of a handwritten digit from an input number (0-9).
| Parameter | Value |
|---|---|
| image_size | 32 |
| in_channels | 1 |
| out_channels | 1 |
| num_classes | 10 |
| block_out_channels | [12, 16, 20] |
| layers_per_block | 8 |
| norm_num_groups | 4 |
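
The values in the table can be sanity-checked with a few lines of plain Python. This is a minimal sketch, not the model's actual code: the `TinyUNetConfig` class and both helper functions are hypothetical names, and the sketch assumes a U-Net-style layout in which the resolution halves between consecutive levels of `block_out_channels`, which the card does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TinyUNetConfig:
    # Values copied from the parameter table; the class itself is hypothetical.
    image_size: int = 32
    in_channels: int = 1
    out_channels: int = 1
    num_classes: int = 10
    block_out_channels: tuple = (12, 16, 20)
    layers_per_block: int = 8
    norm_num_groups: int = 4


def level_resolutions(cfg):
    # Assumption: each level after the first works at half the previous resolution.
    return [cfg.image_size // (2 ** i) for i in range(len(cfg.block_out_channels))]


def group_norm_compatible(cfg):
    # Group normalization needs every channel count to be divisible by the group count.
    return all(c % cfg.norm_num_groups == 0 for c in cfg.block_out_channels)


cfg = TinyUNetConfig()
resolutions = level_resolutions(cfg)
print(resolutions)
print(group_norm_compatible(cfg))
```

Under that assumption the three levels would run at 32, 16, and 8 pixels, and all three channel widths (12, 16, 20) divide evenly into 4 normalization groups.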
Tiny diffusion gremlin architecture. Compact enough to run on mortal hardware instead of a datacenter powered by melting glaciers.
MNIST-IMG was trained on Google Colaboratory (NVIDIA Tesla T4) for ~10 minutes with a batch size of 64 for 10 epochs.
Loss ended at ~0.39.
Note: I can't provide the raw training logs, as I lost them somewhere after training. Sorry!
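For a rough sense of scale, the training figures above translate into an approximate number of optimizer steps. This is a back-of-the-envelope sketch: it assumes training used the standard 60,000-image MNIST training split and kept the last partial batch of each epoch, neither of which is stated here.

```python
import math

TRAIN_IMAGES = 60_000   # assumption: standard MNIST training split
BATCH_SIZE = 64         # from the card
EPOCHS = 10             # from the card
TRAIN_MINUTES = 10      # approximate, from the card

# Assumption: the last partial batch of each epoch is kept, hence the ceiling.
steps_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
steps_per_second = total_steps / (TRAIN_MINUTES * 60)

print(f"{steps_per_epoch} steps per epoch, {total_steps} steps in total")
print(f"about {steps_per_second:.1f} steps per second")
```

Under these assumptions the run works out to roughly 9,400 optimizer steps at about 15 to 16 steps per second on the T4.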
Sample images were generated at 1000 decoding steps and at 200 decoding steps.
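One common way to decode in fewer steps is to visit an evenly strided subset of the training timesteps. The sketch below illustrates that idea only. It assumes the model was trained with 1000 diffusion timesteps, and `strided_timesteps` is a hypothetical helper, not a function from `inference.py` or from diffusers.

```python
def strided_timesteps(num_train_timesteps, num_inference_steps):
    """Return evenly strided timesteps in descending (noisiest first) order."""
    if not 0 < num_inference_steps <= num_train_timesteps:
        raise ValueError("num_inference_steps must be in 1..num_train_timesteps")
    stride = num_train_timesteps // num_inference_steps
    ascending = [i * stride for i in range(num_inference_steps)]
    return ascending[::-1]


# Assumption: 1000 training timesteps.
full_schedule = strided_timesteps(1000, 1000)
short_schedule = strided_timesteps(1000, 200)

print(len(full_schedule), full_schedule[:3])
print(len(short_schedule), short_schedule[:3])
```

With 200 steps the sampler would visit every fifth timestep, so decoding takes about a fifth of the model evaluations, at some cost in sample quality.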
To generate images, use the `inference.py` script in the repo.
@misc{mnist-img-390k,
title = {MNIST-IMG-390k: a Tiny Diffusion Model for Generating Handwritten Digits},
author = {Paul Courneya; Harley-ml},
year = {2026},
url = {https://huggingface.co/Harley-ml/MNIST-IMG-390k}
}