cloneofsimo
/

lavenderflow-5.6B


cloneofsimo committed on May 22

Commit

bd69d39


1 Parent(s): ff36cb0

Update README.md

Files changed (1)

README.md +5 -0

README.md CHANGED

@@ -34,6 +34,11 @@ Foundation models seems to only belong in companies. I tell you, while it is tem
 "You can't do this, you need team of engineers and researchers to make billion-scale foundation models"
 it is not the case, and one bored grad student can obviously very easily pull this off. Now you know, you know.
+# Warning: This is an extremely undertrained model
+This is a very early preview of my attempt to recreate SD3, trained on a single node with a batch size of 128 for only ~550k steps. Don't expect SoTA image quality on par with SD3, Midjourney, or DALL-E 3 from this model.
+I am only one person, with one 8xH100 node. *They have clusters of GPUs*.
 # What did you do and how long did it take?
 1. My first steps were to implement everything in torch and see if it works in MNIST, CIFAR-10. This took about 8 hours, including reading papers, implementing DiT, etc, everything.