rwightman (HF staff) committed
Commit d5be347 (1 parent: 3e473ec)

Update README.md

Files changed (1): README.md (+26 -4)
README.md CHANGED
@@ -83,7 +83,7 @@ This model was trained with one of (see table in intro):
 
 All models were trained with a global batch size of 81920 for 64 checkpoint intervals of 203.7M samples for a total of ~13B samples seen over training.
 
-For 256x256 models, a slurm script w/ srun below was used on 20 8-GPU nodes (Stability), switching to 40 4-GPU nodes for time on JUWELS.
+For 256x256 models, a slurm script w/ srun below was used on 20 8-GPU (A100 40GB) nodes (Stability), switching to 40 4-GPU nodes for time on JUWELS.
 
 ```
 /opt/slurm/sbin/srun --cpu_bind=v --accel-bind=gn python -m training.main \
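
As a cross-check on the sample budget in the hunk above: 64 intervals of 203.7M samples is ~13.0B samples, and at a global batch size of 81920 each interval is roughly 2.5k optimizer steps. A minimal sketch of that arithmetic (the step counts are derived here, not stated in the README):

```python
# Figures quoted in the README: global batch 81920, 64 checkpoint
# intervals of 203.7M samples each, ~13B samples seen in total.
global_batch = 81_920
intervals = 64
samples_per_interval = 203.7e6

total_samples = intervals * samples_per_interval          # ~13.04B samples
steps_per_interval = samples_per_interval / global_batch  # ~2487 optimizer steps
total_steps = intervals * steps_per_interval              # ~159k steps overall

print(f"total samples ~= {total_samples / 1e9:.2f}B")
print(f"steps per interval ~= {steps_per_interval:,.0f}")
print(f"total steps ~= {total_steps:,.0f}")
```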
@@ -129,7 +129,7 @@ The models achieve between 70.8 and 71.7 zero-shot top-1 accuracy on ImageNet-1k
 
 An initial round of benchmarks has been performed on a wider range of datasets, viewable at https://github.com/LAION-AI/CLIP_benchmark/blob/main/benchmark/results.ipynb
 
-As part of exploring increased augmentation + regularization, more analysis is required but early tests indicate the `augreg` models evaluate well over a wider range of resolutions than the non augreg models. Especially the 320x320 LAION-A model, where the augreg disappointed at 320x320 w/ 71.3, but passes the non augreg 71.7 w/ a 72.2 when evaluated at 384x384 (non augreg drops to 71.0 at 384x384).
+As part of exploring increased augmentation + regularization, early evaluations suggest that `augreg` trained models evaluate well over a wider range of resolutions. This is especially true for the 320x320 LAION-A model, where the augreg run was lower than the non-augreg when evaluated at the train resolution of 320x320 (71.3 vs 71.7), but improves to 72.2 when evaluated at 384x384 (the non-augreg drops to 71.0 at 384x384).
 
 # Acknowledgements
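
The multi-resolution comparison behind those augreg numbers can be run along these lines with `open_clip`; a minimal sketch assuming the `force_image_size` argument of `create_model_and_transforms`, with model/pretrained tags that are illustrative and may not match this repo's exact checkpoint names:

```python
import open_clip

# Evaluate one checkpoint at its native resolution and at 384x384.
# force_image_size overrides the model's input size; the returned val
# preprocessing (Resize/CenterCrop) follows the forced size.
for size in (320, 384):
    model, _, preprocess = open_clip.create_model_and_transforms(
        "convnext_base_w_320",                          # illustrative model tag
        pretrained="laion_aesthetic_s13b_b82k_augreg",  # illustrative pretrained tag
        force_image_size=size,
    )
    model.eval()
    tokenizer = open_clip.get_tokenizer("convnext_base_w_320")
    # ... run the standard zero-shot ImageNet-1k protocol here, encoding
    # images via `preprocess` and class prompts via `tokenizer` ...
    print(size, preprocess.transforms)
```

Sweeping both the augreg and non-augreg pretrained tags over (320, 384) gives the four accuracies the paragraph above compares.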
 
@@ -139,8 +139,30 @@ Acknowledging [stability.ai](https://stability.ai/) and the Gauss Centre for Supercomputing
 
 **BibTeX:**
 
-In addition to forthcoming LAION-5B (https://laion.ai/blog/laion-5b/) paper, please cite:
-
+```bibtex
+@inproceedings{schuhmann2022laionb,
+  title={{LAION}-5B: An open large-scale dataset for training next generation image-text models},
+  author={Christoph Schuhmann and
+          Romain Beaumont and
+          Richard Vencu and
+          Cade W Gordon and
+          Ross Wightman and
+          Mehdi Cherti and
+          Theo Coombes and
+          Aarush Katta and
+          Clayton Mullis and
+          Mitchell Wortsman and
+          Patrick Schramowski and
+          Srivatsa R Kundurthy and
+          Katherine Crowson and
+          Ludwig Schmidt and
+          Robert Kaczmarczyk and
+          Jenia Jitsev},
+  booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
+  year={2022},
+  url={https://openreview.net/forum?id=M3Y74vmsMcY}
+}
+```
 
 OpenCLIP software
 ```bibtex