Pclanglais committed
Commit: aa2e23e
Parent(s): cc381a1
Update README.md
README.md CHANGED

@@ -17,7 +17,7 @@ language:
 <img src="https://raw.githubusercontent.com/Pleias/logos/d6152d7943905da32a1e04fdfd7708ed9c7eed5e/PleIAs%201_0%20Full%20Logo%20(Black).png" style="width: 80%; margin: 0 auto; display: inline-block;"/>
 </div>

-**Pleias-3b-Preview** is an early preview of a 3-billion-parameter base model trained by [Pleias](https://huggingface.co/PleIAs) on [Common Corpus](https://huggingface.co/datasets/PleIAs/common_corpus).
+**Pleias-3b-Preview** is an early preview of a 3-billion-parameter base model trained by [Pleias](https://huggingface.co/PleIAs) on [Common Corpus](https://huggingface.co/datasets/PleIAs/common_corpus). Pleias-3b-Preview was pretrained at Jean Zay (compute grant n°GC011015451) with support from [Etalab](https://www.etalab.gouv.fr/).

 Like all the base and specialized models from Pleias, Pleias-3b-Preview has only been trained on open data that is out of copyright (public domain) or under a permissive license.

@@ -40,7 +40,7 @@ Text generation is currently able to support a range of creative writing tasks i
 Pleias-3b-Preview has been successfully adapted for continuous pretraining and full fine-tuning on document-processing tasks such as RAG, translation, or OCR correction. Given the small size of the model, we do not recommend fine-tuning methods based on LoRA.

 ## Training
-Pleias-3b-Preview was fully pretrained at Jean Zay on 192 H100s for about 20 days (compute grant n°GC011015451). Training code relied on Nanotron, the open-source library from Hugging Face. We provide the complete settings as a YAML file as part of our release.
+Pleias-3b-Preview was fully pretrained at Jean Zay on 192 H100s for about 20 days (compute grant n°GC011015451) with support from [Etalab](https://www.etalab.gouv.fr/). Training code relied on Nanotron, the open-source library from Hugging Face. We provide the complete settings as a YAML file as part of our release.

 The training schedule comprises 518,000 steps (batch size 1,024) over a filtered and enhanced version of Common Corpus (1,086,324,736,000 tokens).
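As a quick illustration of the text-generation use mentioned in the second hunk, here is a minimal sketch using the Hugging Face `transformers` library. The repository id `PleIAs/Pleias-3b-Preview` is an assumption inferred from the organisation and model names in this README, and the prompt and sampling settings are placeholders, not values taken from the release.

```python
# Minimal text-generation sketch for the base model.
# NOTE: the repository id is an assumption based on the names in the README;
# adjust it if the published checkpoint uses a different id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PleIAs/Pleias-3b-Preview"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A base model simply continues the prompt; there is no chat template here.
inputs = tokenizer("Once upon a time, in a small archive,", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```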
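A back-of-the-envelope check of the figures quoted in the Training section, written as a short Python sketch. The 2,048-token sequence length is an assumption (it is not stated in this diff); it is simply the value that makes the step count, batch size, and total token count agree.

```python
# Sanity-check the training-schedule arithmetic quoted in the README.
steps = 518_000            # optimizer steps
batch_size = 1_024         # sequences per step
sequence_length = 2_048    # tokens per sequence -- assumed, not stated in the diff

total_tokens = steps * batch_size * sequence_length
print(f"{total_tokens:,} tokens")    # 1,086,324,736,000 -- matches the quoted figure

# Rough compute budget implied by "192 H100s for about 20 days".
gpu_hours = 192 * 20 * 24
print(f"~{gpu_hours:,} H100-hours")  # ~92,160 GPU-hours
```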