Pclanglais committed
Commit aa2e23e
1 Parent(s): cc381a1

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -17,7 +17,7 @@ language:
 <img src="https://raw.githubusercontent.com/Pleias/logos/d6152d7943905da32a1e04fdfd7708ed9c7eed5e/PleIAs%201_0%20Full%20Logo%20(Black).png" style="width: 80%; margin: 0 auto; display: inline-block;"/>
 </div>
 
-**Pleias-3b-Preview** is an early preview of a 3 billion parameter base model trained by [Pleias](https://huggingface.co/PleIAs) on [Common Corpus](https://huggingface.co/datasets/PleIAs/common_corpus).
+**Pleias-3b-Preview** is an early preview of a 3 billion parameter base model trained by [Pleias](https://huggingface.co/PleIAs) on [Common Corpus](https://huggingface.co/datasets/PleIAs/common_corpus). Pleias-3b-Preview was pretrained at Jean Zay (compute grant n°GC011015451) with support from [Etalab](https://www.etalab.gouv.fr/).
 
 Like all the base and specialized models from Pleias, Pleias-3b-Preview has been trained only on open data that is either out of copyright (public domain) or under a permissive license.
 
@@ -40,7 +40,7 @@ Text generation is currently able to support a range of creative writing tasks i
 Pleias-3b-Preview has been successfully adapted for continuous pretraining and full fine-tuning on document-processing tasks such as RAG, translation, or OCR correction. Given the small size of the model, we do not recommend fine-tuning methods based on LoRA.
 
 ## Training
-Pleias-3b-Preview was fully pretrained at Jean Zay on 192 H100s for about 20 days (compute grant n°GC011015451). The training code relied on Nanotron, the open-source library from Hugging Face. We provide the complete settings as a YAML file as part of our release.
+Pleias-3b-Preview was fully pretrained at Jean Zay on 192 H100s for about 20 days (compute grant n°GC011015451) with support from [Etalab](https://www.etalab.gouv.fr/). The training code relied on Nanotron, the open-source library from Hugging Face. We provide the complete settings as a YAML file as part of our release.
 
 The training schedule includes 518,000 steps (batch size 1,024) on a filtered and enhanced version of Common Corpus (1,086,324,736,000 tokens).
 
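
The card recommends full fine-tuning over LoRA for this model size. Below is a minimal sketch of that approach with the Hugging Face `Trainer`; the repository id (`PleIAs/Pleias-3b-Preview`), the dataset, the sequence length, and all hyperparameters are illustrative assumptions, not settings from the Pleias release.

```python
# Hypothetical full fine-tuning sketch (no LoRA), per the model card's recommendation.
# Repo id, dataset, and hyperparameters are assumptions for illustration only.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_id = "PleIAs/Pleias-3b-Preview"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # llama-style tokenizers often lack a pad token
model = AutoModelForCausalLM.from_pretrained(model_id)

# Any plain-text corpus for the target task (RAG, translation, OCR correction) would do.
dataset = load_dataset("text", data_files={"train": "train.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal LM objective

args = TrainingArguments(
    output_dir="pleias-3b-finetuned",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=16,
    learning_rate=2e-5,
    num_train_epochs=1,
    bf16=True,
    logging_steps=50,
)

Trainer(model=model, args=args, train_dataset=tokenized, data_collator=collator).train()
```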
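
As a sanity check on the Training section, the quoted step count, batch size, and total token count are mutually consistent if each training sequence holds 2,048 tokens; that sequence length is an inference from the arithmetic, not a figure stated in the card.

```python
# Consistency check for the figures quoted in the Training section.
steps = 518_000
batch_size = 1_024
total_tokens = 1_086_324_736_000

# 518,000 steps * 1,024 sequences * 2,048 tokens/sequence = 1,086,324,736,000 tokens.
inferred_seq_len = total_tokens // (steps * batch_size)
print(inferred_seq_len)  # 2048 (inferred, not stated in the card)
assert steps * batch_size * inferred_seq_len == total_tokens

# Rough compute budget implied by "192 H100s for about 20 days".
gpu_hours = 192 * 20 * 24
print(gpu_hours)  # ~92,160 H100-hours (approximate, since "about 20 days")
```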