louisbrulenaudet committed · Commit 0becd46 · Parent(s): 07cb9c2 · Update README.md

README.md CHANGED
@@ -18,9 +18,9 @@ tags:
 ---
 <img src="assets/thumbnail.webp">
 
-# Romulus,
+# Romulus, continually pre-trained models for French law.
 
-Romulus is a series of
+Romulus is a series of continually pre-trained models enriched in French law and intended to serve as the basis for a fine-tuning process on labeled data. Please note that these models have not been aligned for the production of usable text as they stand, and will certainly need to be fine-tuned for the desired tasks in order to produce satisfactory results.
 
 The training corpus is made up of around 34,864,949 tokens (calculated with the meta-llama/Meta-Llama-3.1-8B-Instruct tokenizer).
 
@@ -210,7 +210,7 @@ If you use this code in your research, please use the following BibTeX entry.
 ```BibTeX
 @misc{louisbrulenaudet2024,
 author = {Louis Brulé Naudet},
-title = {Romulus,
+title = {Romulus, continually pre-trained models for French law},
 year = {2024},
 howpublished = {\url{https://huggingface.co/datasets/louisbrulenaudet/Romulus-cpt-fr}},
 }
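
Since the updated description positions Romulus as a base for later fine-tuning rather than an aligned assistant, here is a minimal sketch of loading such a checkpoint as a starting point. The model id below is hypothetical: the commit only links the companion dataset, so substitute the actual checkpoint name from the hub.

```python
# Minimal sketch: load a Romulus checkpoint as a base for fine-tuning.
# NOTE: "louisbrulenaudet/Romulus-cpt-llama-3.1-8b" is a hypothetical id;
# the commit only references the dataset, not a specific model repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "louisbrulenaudet/Romulus-cpt-llama-3.1-8b"  # hypothetical

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# As the README notes, the raw model is not aligned for direct generation;
# treat it as the initialization for supervised fine-tuning on labeled data.
```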
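The token count quoted in the diff could be reproduced along these lines, assuming the dataset exposes a `train` split with a `text` column (the column name is an assumption, not stated in the commit) and that you have accepted the gated Llama 3.1 license on the hub:

```python
# Sketch: recompute the corpus size with the tokenizer named in the README.
# Assumes a "train" split and a "text" column; adjust to the real schema.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
dataset = load_dataset("louisbrulenaudet/Romulus-cpt-fr", split="train")

total_tokens = 0
for record in dataset:
    # add_special_tokens=False counts corpus tokens only, without BOS/EOS.
    total_tokens += len(tokenizer(record["text"], add_special_tokens=False)["input_ids"])

print(f"Approximate corpus size: {total_tokens:,} tokens")  # ~34,864,949 per the README
```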