finiteautomata committed on
Commit
d383d61
1 Parent(s): 353662c

Create README.md

  1. README.md +65 -0
README.md ADDED
---
language:
- es

tags:
- twitter
- sentiment-analysis

---
# POS Tagging model for Spanish/English
## robertuito-pos

Repository: [https://github.com/pysentimiento/pysentimiento/](https://github.com/finiteautomata/pysentimiento/)

This model was trained on the Spanish/English split of the [LinCE POS corpus](https://ritual.uh.edu/lince/), a code-switching benchmark. The base model is [RoBERTuito](https://github.com/pysentimiento/robertuito), a RoBERTa model trained on Spanish tweets.
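
A minimal usage sketch follows. It assumes the `transformers` library is installed and that the model is available on the Hugging Face Hub under the identifier `pysentimiento/robertuito-pos`; this README does not state the Hub identifier, so treat that name as a placeholder. The helper names `load_tagger` and `to_pairs` are illustrative and not part of any library. The sketch uses the generic `token-classification` pipeline without aggregation, so predictions are returned per subword token rather than per word, and no tweet preprocessing is applied.

```python
from typing import Dict, List, Tuple

# Assumption: the Hub identifier is not given in this README; adjust as needed.
MODEL_ID = "pysentimiento/robertuito-pos"


def load_tagger(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads weights on first use)."""
    # Imported lazily so the helpers below work without transformers installed.
    from transformers import pipeline

    return pipeline("token-classification", model=model_id)


def to_pairs(predictions: List[Dict]) -> List[Tuple[str, str]]:
    """Reduce the pipeline's per-token dicts to (token text, predicted tag) pairs."""
    return [(p["word"], p["entity"]) for p in predictions]
```

With network access, `to_pairs(load_tagger()("Hola, how are you?"))` would return one pair per subword token. Grouping subwords back into words is left out of this sketch.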

## Results

Results are taken from the LinCE leaderboard.

| Model      | Sentiment | NER      | POS      |
|:-----------|:----------|:---------|:---------|
| RoBERTuito | **60.6**  | 68.5     | 97.2     |
| XLM Large  | --        | **69.5** | **97.2** |
| XLM Base   | --        | 64.9     | 97.0     |
| C2S mBERT  | 59.1      | 64.6     | 96.9     |
| mBERT      | 56.4      | 64.0     | 97.1     |
| BERT       | 58.4      | 61.1     | 96.9     |
| BETO       | 56.5      | --       | --       |

## Citation

If you use this model in your research, please cite the pysentimiento, RoBERTuito, and LinCE papers:

```
@misc{perez2021pysentimiento,
  title={pysentimiento: A Python Toolkit for Sentiment Analysis and SocialNLP tasks},
  author={Juan Manuel Pérez and Juan Carlos Giudici and Franco Luque},
  year={2021},
  eprint={2106.09462},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

@misc{perez2021robertuito,
  title={RoBERTuito: a pre-trained language model for social media text in Spanish},
  author={Juan Manuel Pérez and Damián A. Furman and Laura Alonso Alemany and Franco Luque},
  year={2021},
  eprint={2111.09453},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

@inproceedings{aguilar2020lince,
  title={LinCE: A Centralized Benchmark for Linguistic Code-switching Evaluation},
  author={Aguilar, Gustavo and Kar, Sudipta and Solorio, Thamar},
  booktitle={Proceedings of the 12th Language Resources and Evaluation Conference},
  pages={1803--1813},
  year={2020}
}
```