rwightman HF staff committed on
Commit
427ed3f
1 Parent(s): dc526d9

Update README.md

  1. README.md +28 -3
README.md CHANGED
@@ -83,7 +83,7 @@ This model was trained with one of (see table in intro):
83
 
84
  All models were trained with a global batch size of 81920 for 64 checkpoint intervals of 203.7M samples for a total of ~13B samples seen over training.
85
 
86
- For 256x256 models, a slurm script w/ srun below was used on 20 8-GPU nodes (Stability), switching to 40 4-GPU nodes for time on JUWELS.
87
 
88
  ```
89
  /opt/slurm/sbin/srun --cpu_bind=v --accel-bind=gn python -m training.main \
@@ -129,6 +129,8 @@ The models achieve between 70.8 and 71.7 zero-shot top-1 accuracy on ImageNet-1k
129
 
130
  An initial round of benchmarks has been performed on a wider range of datasets, to be viewable at https://github.com/LAION-AI/CLIP_benchmark/blob/main/benchmark/results.ipynb
131
 
132
  # Acknowledgements
133
 
134
  Acknowledging [stability.ai](https://stability.ai/) and the Gauss Centre for Supercomputing e.V. (http://gauss-centre.eu) for funding this part of work by providing computing time through the John von Neumann Institute for Computing (NIC) on the GCS Supercomputer JUWELS Booster at Jülich Supercomputing Centre (JSC).
@@ -137,8 +139,31 @@ Acknowledging [stability.ai](https://stability.ai/) and the Gauss Centre for Sup
137
 
138
  **BibTeX:**
139
 
140
- In addition to forthcoming LAION-5B (https://laion.ai/blog/laion-5b/) paper, please cite:
141
-
142
 
143
  OpenCLIP software
144
  ```bibtex
 
83
 
84
  All models were trained with a global batch size of 81920 for 64 checkpoint intervals of 203.7M samples for a total of ~13B samples seen over training.
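The figures in the sentence above can be cross-checked with a little arithmetic. The sketch below is illustrative only: the function name `training_budget` is hypothetical, the step counts are derived from the rounded "203.7M" figure, and the per-GPU batch size assumes plain data parallelism with the global batch split evenly across 160 GPUs (20 × 8 or 40 × 4, the node counts given for the 256x256 runs) and no gradient accumulation, none of which is stated in the README.

```python
# Figures quoted in the README text.
GLOBAL_BATCH_SIZE = 81_920
CHECKPOINT_INTERVALS = 64
SAMPLES_PER_INTERVAL = 203_700_000  # "203.7M" in the source, so already rounded


def training_budget(num_gpus: int) -> dict:
    """Derive totals from the quoted figures (hypothetical helper).

    Assumes the global batch is split evenly across GPUs with no
    gradient accumulation; the README does not confirm this.
    """
    total_samples = CHECKPOINT_INTERVALS * SAMPLES_PER_INTERVAL
    return {
        "total_samples": total_samples,
        "steps_per_interval": SAMPLES_PER_INTERVAL / GLOBAL_BATCH_SIZE,
        "total_steps": total_samples / GLOBAL_BATCH_SIZE,
        "per_gpu_batch": GLOBAL_BATCH_SIZE / num_gpus,
    }


budget = training_budget(num_gpus=20 * 8)
print(f"total samples:      {budget['total_samples'] / 1e9:.2f}B")
print(f"steps per interval: {budget['steps_per_interval']:.0f}")
print(f"per-GPU batch size: {budget['per_gpu_batch']:.0f}")
```

Under these assumptions, 64 intervals of 203.7M samples give about 13.04B samples, consistent with the "~13B" quoted, and both node layouts work out to the same per-GPU batch because each totals 160 GPUs.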
85
 
86
+ For 256x256 models, a slurm script w/ srun below was used on 20 8-GPU (A100 40GB) nodes (Stability), switching to 40 4-GPU nodes for time on JUWELS.
87
 
88
  ```
89
  /opt/slurm/sbin/srun --cpu_bind=v --accel-bind=gn python -m training.main \
 
129
 
130
  An initial round of benchmarks has been performed on a wider range of datasets, to be viewable at https://github.com/LAION-AI/CLIP_benchmark/blob/main/benchmark/results.ipynb
131
 
132
+ As part of exploring increased augmentation + regularization, early evaluations suggest that `augreg` trained models evaluate well over a wider range of resolutions. This is especially true for the 320x320 LAION-A model: the augreg run scored lower than the non-augreg run when evaluated at the train resolution of 320x320 (71.3 vs 71.7), but improves to 72.2 when evaluated at 384x384, whereas the non-augreg run drops to 71.0 at 384x384.
133
+
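To make the resolution comparison above easier to read, the four quoted accuracies can be tabulated and differenced. This is only a sketch over the numbers stated in the text; the names `TOP1` and `resolution_delta` are hypothetical and no model is actually evaluated here.

```python
# Zero-shot ImageNet-1k top-1 (%) for the 320x320 LAION-A runs,
# copied from the paragraph above; keys are evaluation resolutions.
TOP1 = {
    "augreg": {320: 71.3, 384: 72.2},
    "non-augreg": {320: 71.7, 384: 71.0},
}


def resolution_delta(run: str, train_res: int = 320, eval_res: int = 384) -> float:
    """Change in top-1 when evaluating at eval_res instead of train_res."""
    return round(TOP1[run][eval_res] - TOP1[run][train_res], 1)


for run in TOP1:
    print(f"{run}: {resolution_delta(run):+.1f} points at 384x384 vs 320x320")
```

On the quoted figures, the augreg run gains 0.9 points when moved from 320x320 to 384x384, while the non-augreg run loses 0.7 points.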
134
  # Acknowledgements
135
 
136
  Acknowledging [stability.ai](https://stability.ai/) and the Gauss Centre for Supercomputing e.V. (http://gauss-centre.eu) for funding this part of work by providing computing time through the John von Neumann Institute for Computing (NIC) on the GCS Supercomputer JUWELS Booster at Jülich Supercomputing Centre (JSC).
 
139
 
140
  **BibTeX:**
141
 
142
+ LAION-5B
143
+ ```bibtex
144
+ @inproceedings{schuhmann2022laionb,
145
+ title={{LAION}-5B: An open large-scale dataset for training next generation image-text models},
146
+ author={Christoph Schuhmann and
147
+ Romain Beaumont and
148
+ Richard Vencu and
149
+ Cade W Gordon and
150
+ Ross Wightman and
151
+ Mehdi Cherti and
152
+ Theo Coombes and
153
+ Aarush Katta and
154
+ Clayton Mullis and
155
+ Mitchell Wortsman and
156
+ Patrick Schramowski and
157
+ Srivatsa R Kundurthy and
158
+ Katherine Crowson and
159
+ Ludwig Schmidt and
160
+ Robert Kaczmarczyk and
161
+ Jenia Jitsev},
162
+ booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
163
+ year={2022},
164
+ url={https://openreview.net/forum?id=M3Y74vmsMcY}
165
+ }
166
+ ```
167
 
168
  OpenCLIP software
169
  ```bibtex