Commit bd69d39 by cloneofsimo
Parent: ff36cb0
Update README.md
README.md CHANGED
@@ -34,6 +34,11 @@ Foundation models seems to only belong in companies. I tell you, while it is tem
 "You can't do this, you need team of engineers and researchers to make billion-scale foundation models"
 it is not the case, and one bored grad student can obviously very easily pull this off. Now you know, you know.
 
+# Warning: This is extremely undertrained model
+
+This is very early preview of my attempt to recreate SD3, trained on single-node, with batch size of 128, for only ~550k steps. Don't expect SoTA SD3, midjourney, dalle3 quality images with this model.
+I am only one person, with one 8xH100 node. *They have clusters of gpus*.
+
 # What did you do and how long did it take?
 
 1. My first steps were to implement everything in torch and see if it works in MNIST, CIFAR-10. This took about 8 hours, including reading papers, implementing DiT, etc, everything.