This is a severely undertrained research network, intended as a proof of concept for the architecture. It was trained on ~700 example images for 2000 epochs, reaching a minimum MSE loss of ~0.06. Generation is unconditional: the model has no text conditioning yet and simply generates something plausible from the flow objective. This repo is meant only as a demo of a <100M-parameter model that can achieve strong color balance and low loss on pixel diffusion. The next step is scaling up the data.
A semi-custom network based on the following paper: Simpler Diffusion (SiD2)
This network uses the optimal transport flow matching objective outlined in Flow Matching for Generative Modeling
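As a rough illustration of the objective named above, here is a dependency-free sketch of how one training pair can be built under the optimal transport path. It assumes the simplified case where the minimum noise scale is zero, so the path between a noise sample x0 and a data sample x1 is a straight line and the regression target is the constant velocity x1 - x0. The function names and the tiny stand-in "image" are hypothetical, and the time convention in train.py may differ.

```python
import random


def flow_matching_pair(x1, t, rng):
    """Build (x_t, target_velocity) for one flattened sample x1 at time t in [0, 1].

    Assumes the straight-line (optimal transport) path with zero minimum noise:
        x_t = (1 - t) * x0 + t * x1,   target = x1 - x0
    """
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]                 # noise endpoint
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]  # point on the path
    target = [b - a for a, b in zip(x0, x1)]               # velocity along the path
    return x_t, target


def mse(pred, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)


rng = random.Random(0)
x1 = [0.5, -0.25, 1.0]          # hypothetical stand-in for a flattened image
t = rng.random()                # time sampled uniformly from [0, 1)
x_t, target = flow_matching_pair(x1, t, rng)

# A real model would predict the velocity from (x_t, t); here a dummy
# prediction of all zeros is scored just to show the loss computation.
loss = mse([0.0] * len(x1), target)
```

In a real training loop the dummy prediction would be replaced by the network output, and the MSE against the target velocity is the loss being minimized.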
xATGLU layers are used instead of linear layers at the entry to the transformer MLP, following Expanded Gating Ranges Improve Activation Functions
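To make the gating idea concrete, the following is a scalar sketch of an arctan gate with an expanded range. It is written from the general description in the cited paper and is not taken from this repo's code. The assumption here is that the base gate arctan(g)/pi + 0.5 lies in (0, 1) and is stretched to (-alpha, 1 + alpha) by a trainable alpha. In an actual layer, the gate and value inputs would be the two halves of a single linear projection. The function names are hypothetical.

```python
import math


def expanded_arctan_gate(g, alpha=0.0):
    """Arctan gate with its range expanded from (0, 1) to (-alpha, 1 + alpha)."""
    base = math.atan(g) / math.pi + 0.5        # base gate in (0, 1)
    return base * (1.0 + 2.0 * alpha) - alpha  # stretched symmetrically about 0.5


def xatglu_scalar(gate_in, value_in, alpha=0.0):
    """Gated linear unit on scalars: the gate path modulates the value path."""
    return expanded_arctan_gate(gate_in, alpha) * value_in


# With alpha = 0 the gate is a plain arctan gate; a positive alpha lets the
# gate go slightly negative or slightly above 1.
plain = expanded_arctan_gate(-1e6, alpha=0.0)
expanded = expanded_arctan_gate(-1e6, alpha=0.5)
```

Initializing alpha at zero recovers the unexpanded gate, so the expansion only takes effect if training moves alpha away from zero.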
python train.py
This trains a new image network on the provided dataset. Currently the entire dataset is loaded into GPU memory; it is defined in the preload_dataset function.
python test_sample.py step_1799.safetensors
Here, step_1799.safetensors is the model checkpoint to run inference with. The script always generates a sample grid of 16x16 images.
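The sampler used by test_sample.py is not described here. As general background, flow matching models are commonly sampled by starting from noise and integrating the learned velocity field from t = 0 to t = 1 with an ODE solver. The sketch below uses plain Euler steps and a hypothetical toy velocity field in place of a trained network, purely to show the shape of the loop. The step count and all names are assumptions.

```python
def euler_sample(velocity, x0, steps=10):
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for k in range(steps):
        t = k / steps                      # current time, never reaches 1.0
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy stand-in for a trained model: the exact straight-line velocity that
# carries any point to a fixed target by t = 1.
target = [0.5, -0.25, 1.0]


def toy_velocity(x, t):
    return [(b - a) / (1.0 - t) for a, b in zip(x, target)]


sample = euler_sample(toy_velocity, [0.0, 0.0, 0.0])
```

With a trained network, velocity would wrap a forward pass of the model, and the starting point would be Gaussian noise shaped like an image batch.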