versae committed on
Commit e3f2e3d
1 Parent(s): 6a06b78

Update README

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -10,8 +10,8 @@ widget:
  ---
 
  - [Version beta](https://huggingface.co/bertin-project/bertin-roberta-base-spanish/tree/beta): July 15th, 2021
- - [Version 1.0](https://huggingface.co/bertin-project/bertin-roberta-base-spanish/tree/1.0) (current): July 26th, 2021
-
+ - [Version v1](https://huggingface.co/bertin-project/bertin-roberta-base-spanish/tree/v1): July 26th, 2021
+ - [Version v1-512](https://huggingface.co/bertin-project/bertin-roberta-base-spanish/tree/v1-512): July 26th, 2021
 
  # BERTIN
 
@@ -252,7 +252,7 @@ In addition to the tasks above, we also trained the [`beta`](https://huggingface
 
  Results for PAWS-X seem surprising given the large differences in performance. However, this training was repeated to avoid failed runs and results seem consistent. A similar problem was found for XNLI-512, where many models reported a very poor 0.3333 accuracy on a first run (and even a second, in the case of BSC-BNE). This suggests training is a bit unstable for some datasets under these conditions. Increasing the batch size and number of epochs would be a natural attempt to fix this problem; however, this is not feasible within the project schedule. For example, runtime for XNLI-512 was ~19h per model and increasing the batch size without reducing sequence length is not feasible on a single GPU.
 
- We are also releasing the fine-tuned models for `Gaussian`-512 and making it our version 1.0 (current).
+ We are also releasing the fine-tuned models for `Gaussian`-512 and making our version v1 default to a sequence length of 128, since it experimentally shows better performance on the fill-mask task, while also releasing the 512 sequence length version (v1-512) for fine-tuning.
 
  - POS: [`bertin-project/bertin-base-pos-conll2002-es`](https://huggingface.co/bertin-project/bertin-base-pos-conll2002-es/)
  - NER: [`bertin-project/bertin-base-ner-conll2002-es`](https://huggingface.co/bertin-project/bertin-base-ner-conll2002-es/)
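
Since this change splits the release into a `v1` revision (128 sequence length, reported as better on fill-mask) and a `v1-512` revision (for fine-tuning), a reader may want to pin one of them explicitly. The sketch below is only an illustration, not part of the commit: it assumes the `v1` and `v1-512` branches linked above are still available on the Hub and that `transformers` is installed. The names `REVISIONS`, `load_kwargs`, and `build_fill_mask` are hypothetical helpers introduced here.

```python
from typing import Dict

MODEL_ID = "bertin-project/bertin-roberta-base-spanish"

# Revision names copied from the version list in the README diff.
# Assumption: both branches still exist on the Hugging Face Hub.
REVISIONS: Dict[str, str] = {
    "fill-mask": "v1",        # 128 sequence length
    "fine-tuning": "v1-512",  # 512 sequence length
}


def load_kwargs(use_case: str) -> Dict[str, str]:
    """Return keyword arguments that pin the revision for a use case."""
    if use_case not in REVISIONS:
        raise ValueError(f"unknown use case: {use_case!r}")
    return {"model": MODEL_ID, "revision": REVISIONS[use_case]}


def build_fill_mask():
    """Create a fill-mask pipeline pinned to the v1 revision.

    The import is local so the helpers above work without `transformers`.
    Calling this function downloads the model and needs network access.
    """
    from transformers import pipeline

    return pipeline("fill-mask", **load_kwargs("fill-mask"))
```

Calling `build_fill_mask()("Fui a la librería a comprar un <mask>.")` would then run the `v1` model. For fine-tuning, the same `revision` value from `load_kwargs("fine-tuning")` can be passed to `from_pretrained` on the relevant `transformers` model class.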