Update README.md
README.md
@@ -16,8 +16,7 @@ The model was pre-trained on a English and Dutch mC4 cleaned.
 
 ## Finetuning
 
-The model was finetuned on
-
-
-Note: multi-direction. Prepend either `translate Dutch to English: `
-or `translate English to Dutch: `
+The model was finetuned on CCMatrix, validated on Tatoeba.
+* **128-max token length**
+* Only the first 25M sentences of CCMatrix were used, both en->nl and en->nl (total 50M sentences).
+* Note: multi-direction. Prepend either `translate Dutch to English: ` or `translate English to Dutch: `
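The multi-direction note in the updated README can be illustrated with a minimal sketch. Only the two prefix strings come from the README; the helper name `build_input`, the `PREFIXES` table, and the language codes are assumptions made for illustration, and the model call itself is not shown.

```python
# Task prefixes quoted from the README note; everything else here is illustrative.
PREFIXES = {
    ("nl", "en"): "translate Dutch to English: ",
    ("en", "nl"): "translate English to Dutch: ",
}


def build_input(text: str, source: str, target: str) -> str:
    """Prepend the direction prefix to a sentence (hypothetical helper)."""
    try:
        prefix = PREFIXES[(source, target)]
    except KeyError:
        # Only the two directions named in the README are supported.
        raise ValueError(f"unsupported direction: {source}->{target}") from None
    return prefix + text.strip()


print(build_input("Hoe gaat het?", "nl", "en"))
```

The resulting string is what would be passed to the tokenizer. Given the 128-token maximum listed in the README, longer inputs would presumably need to be truncated or split before translation.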