Update README.md
README.md
@@ -175,7 +175,7 @@ The dataset has the following language distribution:
 |Es|41.38%|
 |Ca|41.79%|
 
-Note:
+Note: A small amount of English data was kept to avoid catastrophic forgetting.
 
 ## Training procedure
 
@@ -204,8 +204,8 @@ The training lasted a total of 320 hours on 8 NVIDIA H100 GPUs with 80GB RAM.
 
 ### Framework versions
 
-- Transformers 4.30.2
 - Pytorch 2.0.0
+- Transformers 4.30.2
 - Datasets 2.13.1
 - Tokenizers 0.13.3
 
@@ -218,7 +218,7 @@ The Language Technologies Unit from Barcelona Supercomputing Center.
 For further information, please send an email to <langtech@bsc.es>.
 
 ### Copyright
-Copyright
+Copyright(c) 2023 by Language Technologies Unit, Barcelona Supercomputing Center.
 
 ### License
 [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)