Update README.md
README.md
CHANGED
@@ -47,7 +47,7 @@ set a seed for reproducibility:
 
 ## Dataset
 
-The training data is the respective subset of the data used for [occiglot-7b-eu5](https://huggingface.co/occiglot/occiglot-7b-eu5), i.e.
+The training data is the respective subset of the data used for [occiglot-7b-eu5](https://huggingface.co/occiglot/occiglot-7b-eu5), i.e. French plus English and Code.
 
 The data distribution by language (estimated) is as follows:
 - English: ~34%