julien-c (HF staff) committed on
Commit 02977b6 (1 parent: bc27c62)

Migrate model card from transformers-repo

Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/ixa-ehu/berteus-base-cased/README.md

Files changed (1)
  1. README.md +28 -0
README.md ADDED
@@ -0,0 +1,28 @@
+ ---
+ language: eu
+ ---
+
+ # BERTeus base cased
+
+ This is the Basque language pretrained model presented in [Give your Text Representation Models some Love: the Case for Basque](https://arxiv.org/pdf/2004.00033.pdf). The model was trained on a Basque corpus comprising news articles crawled from online newspapers and the Basque Wikipedia. The training corpus contains 224.6 million tokens, of which 35 million come from Wikipedia.
+
+ BERTeus has been tested on four downstream tasks for Basque: part-of-speech (POS) tagging, named entity recognition (NER), sentiment analysis, and topic classification. It improves on the state of the art in all four. A summary of the results is given below:
+
+
+ | Downstream task | BERTeus | mBERT | Previous SOTA |
+ | --------------- | ------- | ----- | ------------- |
+ | Topic Classification | **76.77** | 68.42 | 63.00 |
+ | Sentiment | **78.10** | 71.02 | 74.02 |
+ | POS | **97.76** | 96.37 | 96.10 |
+ | NER | **87.06** | 81.52 | 76.72 |
+
+
+ If using this model, please cite the following paper:
+ ```
+ @inproceedings{agerri2020give,
+ title={Give your Text Representation Models some Love: the Case for Basque},
+ author={Rodrigo Agerri and I{\~n}aki San Vicente and Jon Ander Campos and Ander Barrena and Xabier Saralegi and Aitor Soroa and Eneko Agirre},
+ booktitle={Proceedings of the 12th International Conference on Language Resources and Evaluation},
+ year={2020}
+ }
+ ```
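
For readers who want to check the "improving the state of the art" claim against the numbers, the gains in the results table can be recomputed with a few lines of Python. This is only a sketch: the scores are copied verbatim from the table in the README, while the names `RESULTS` and `gains` are hypothetical helpers introduced here and are not part of the model card or of any library. The table does not state which metric each task uses, so the differences are reported simply as score points.

```python
# Scores copied from the results table in the README (metric per task is
# not stated there, so values are treated as plain score points).
RESULTS = {
    "Topic Classification": {"BERTeus": 76.77, "mBERT": 68.42, "Previous SOTA": 63.00},
    "Sentiment": {"BERTeus": 78.10, "mBERT": 71.02, "Previous SOTA": 74.02},
    "POS": {"BERTeus": 97.76, "mBERT": 96.37, "Previous SOTA": 96.10},
    "NER": {"BERTeus": 87.06, "mBERT": 81.52, "Previous SOTA": 76.72},
}


def gains(results, system="BERTeus"):
    """Return, per task, the absolute gain of `system` over each other column."""
    out = {}
    for task, scores in results.items():
        out[task] = {
            baseline: round(scores[system] - score, 2)
            for baseline, score in scores.items()
            if baseline != system
        }
    return out


if __name__ == "__main__":
    # Print one line per task, e.g. the gain over mBERT and over the previous SOTA.
    for task, deltas in gains(RESULTS).items():
        summary = ", ".join(f"{delta:+.2f} vs {name}" for name, delta in deltas.items())
        print(f"{task}: {summary}")
```

Every difference comes out positive, which is consistent with the bolded BERTeus column in the table.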