We train and release "Cerule", a tiny yet powerful Vision Language Model based on the newly released Google's [Gemma-2b](https://huggingface.co/google/gemma-2b) and Google's [SigLIP](https://huggingface.co/google/siglip-so400m-patch14-384).

We utilise highly efficient data selection techniques with:

```
- Pretraining stage : 650K images (a LAION subset)
- Finetuning stage  : 695K images (SVIT-mix-665K - Bunny mix modified by BAAI)
```
The training setup was `4xA100's 80GB` and took ~6 hours to pretrain and ~13 hours to finetune. We modify and adapt the training code from [Bunny](https://github.com/BAAI-DCAI/Bunny).

🚨 Training code, data and more details to release soon!
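Back-of-envelope, the reported dataset sizes and wall-clock times imply the following per-stage throughput. This is only a rough sketch from the numbers above; batch sizes and the number of epochs are not stated, so a single pass over the data is assumed:

```python
# Rough throughput implied by the reported Cerule runs on 4x A100 80GB.
# Assumes one pass over each dataset (epochs are not stated in the README).
pretrain_images, pretrain_hours = 650_000, 6
finetune_images, finetune_hours = 695_000, 13

pretrain_rate = pretrain_images / pretrain_hours  # images processed per hour
finetune_rate = finetune_images / finetune_hours

print(f"pretrain: ~{pretrain_rate:,.0f} images/hour")
print(f"finetune: ~{finetune_rate:,.0f} images/hour")
```

The finetuning stage processes images at roughly half the pretraining rate, which is consistent with finetuning updating more of the model than the pretraining (projector-focused) stage typically does in this family of setups.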
---
| Image | Example |