
	---
	language:
	- es

	tags:
	- twitter
	- pos-tagging

	---
	# POS Tagging model for Spanish/English
	## robertuito-pos

	Repository: [https://github.com/pysentimiento/pysentimiento/](https://github.com/pysentimiento/pysentimiento/)


	Model trained on the Spanish/English split of the [LinCE POS corpus](https://ritual.uh.edu/lince/), a code-switched benchmark. The base model is [RoBERTuito](https://github.com/pysentimiento/robertuito), a RoBERTa model trained on Spanish tweets.

	## Usage

	We suggest using this model directly from the `pysentimiento` library, because it does not work properly with the `transformers` pipeline due to tokenization issues.

	```python
	from pysentimiento import create_analyzer

	# Build the POS tagging analyzer (the model is loaded on creation)
	pos_analyzer = create_analyzer("pos", lang="es")

	# Tag a sentence; each entry carries the tag, the token text and its character offsets
	pos_analyzer.predict("Quiero que esto funcione correctamente! @perezjotaeme")
	# Output:
	# [{'type': 'PROPN', 'text': 'Quiero', 'start': 0, 'end': 6},
	#  {'type': 'SCONJ', 'text': 'que', 'start': 7, 'end': 10},
	#  {'type': 'PRON', 'text': 'esto', 'start': 11, 'end': 15},
	#  {'type': 'VERB', 'text': 'funcione', 'start': 16, 'end': 24},
	#  {'type': 'ADV', 'text': 'correctamente', 'start': 25, 'end': 38},
	#  {'type': 'PUNCT', 'text': '!', 'start': 38, 'end': 39},
	#  {'type': 'NOUN', 'text': '@perezjotaeme', 'start': 40, 'end': 53}]
	```
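
The entries shown in the example output can be post-processed with plain Python. The snippet below is a minimal sketch that assumes the prediction is a list of dictionaries with `type`, `text`, `start` and `end` keys, exactly as printed above; depending on the installed `pysentimiento` version the return value may be wrapped differently. The helper names `to_pairs` and `offsets_match` are hypothetical and not part of the library, and the data is copied from the example so that no model needs to be loaded.

```python
# Sample data copied from the usage example; no model is loaded here.
text = "Quiero que esto funcione correctamente! @perezjotaeme"
tagged = [
    {"type": "PROPN", "text": "Quiero", "start": 0, "end": 6},
    {"type": "SCONJ", "text": "que", "start": 7, "end": 10},
    {"type": "PRON", "text": "esto", "start": 11, "end": 15},
    {"type": "VERB", "text": "funcione", "start": 16, "end": 24},
    {"type": "ADV", "text": "correctamente", "start": 25, "end": 38},
    {"type": "PUNCT", "text": "!", "start": 38, "end": 39},
    {"type": "NOUN", "text": "@perezjotaeme", "start": 40, "end": 53},
]


def to_pairs(tokens):
    """Hypothetical helper: return (token, tag) tuples from the span dictionaries."""
    return [(tok["text"], tok["type"]) for tok in tokens]


def offsets_match(source, tokens):
    """Hypothetical helper: check that each start/end pair slices back to the token text."""
    return all(source[tok["start"]:tok["end"]] == tok["text"] for tok in tokens)


pairs = to_pairs(tagged)

# Print the sentence in the common word/TAG notation
print(" ".join(f"{word}/{tag}" for word, tag in pairs))

# Sanity-check the character offsets against the input string
print(offsets_match(text, tagged))
```

Because `start` and `end` are character offsets into the input string, the tags can be aligned with the original text without re-tokenizing it.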


	## Results

	Results are taken from the LinCE leaderboard.

	| Model      | Sentiment | NER  | POS  |
	|:-----------|:----------|:-----|:-----|
	| RoBERTuito | 60.6      | 68.5 | 97.2 |
	| XLM Large  | --        | 69.5 | 97.2 |
	| XLM Base   | --        | 64.9 | 97.0 |
	| C2S mBERT  | 59.1      | 64.6 | 96.9 |
	| mBERT      | 56.4      | 64.0 | 97.1 |
	| BERT       | 58.4      | 61.1 | 96.9 |
	| BETO       | 56.5      | --   | --   |



	## Citation

	If you use this model in your research, please cite the pysentimiento, RoBERTuito, and LinCE papers:

	```
	@misc{perez2021pysentimiento,
	title={pysentimiento: A Python Toolkit for Sentiment Analysis and SocialNLP tasks},
	author={Juan Manuel Pérez and Juan Carlos Giudici and Franco Luque},
	year={2021},
	eprint={2106.09462},
	archivePrefix={arXiv},
	primaryClass={cs.CL}
	}
	@inproceedings{ortega2019overview,
	title={Overview of the task on irony detection in Spanish variants},
	author={Ortega-Bueno, Reynier and Rangel, Francisco and Hern{\'a}ndez Far{\i}as, D and Rosso, Paolo and Montes-y-G{\'o}mez, Manuel and Medina Pagola, Jos{\'e} E},
	booktitle={Proceedings of the Iberian languages evaluation forum (IberLEF 2019), co-located with 34th conference of the Spanish Society for natural language processing (SEPLN 2019). CEUR-WS. org},
	volume={2421},
	pages={229--256},
	year={2019}
	}

	@inproceedings{aguilar2020lince,
	title={LinCE: A Centralized Benchmark for Linguistic Code-switching Evaluation},
	author={Aguilar, Gustavo and Kar, Sudipta and Solorio, Thamar},
	booktitle={Proceedings of the 12th Language Resources and Evaluation Conference},
	pages={1803--1813},
	year={2020}
	}
	```