---
pipeline_tag: masked-language-modeling
tags:
- online social networks
- twitter
- spanish
language: es
license: apache-2.0
widget:
- text: "Las <mask> causan hipoxia."
  example_title: "Natural Language Inference"
---
This is the BERTuit model, as presented in the article [BERTuit: Understanding Spanish language in Twitter through a native transformer](https://arxiv.org/abs/2204.03465).
Before tokenization, replace user tags and URLs with "\<usr\>" and "\<url\>", respectively.
Tokenize text with the base `RobertaTokenizer` class.
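
The replacement step described above can be sketched as follows. This is only an illustration: the helper name `normalize_tweet` and both regular expressions are assumptions, not the preprocessing code used to train BERTuit, so adjust the patterns if your data needs stricter matching.

```python
import re

# Assumed patterns (not taken from the BERTuit article):
# a URL starts with http://, https:// or www. and runs to the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
# A user tag is "@" followed by word characters, not preceded by a word
# character, so e-mail addresses are left untouched.
USER_PATTERN = re.compile(r"(?<!\w)@\w+")


def normalize_tweet(text: str) -> str:
    """Replace URLs with <url> and user tags with <usr> (hypothetical helper)."""
    # URLs are replaced first so that an "@" inside a URL is not
    # mistaken for a user tag.
    text = URL_PATTERN.sub("<url>", text)
    return USER_PATTERN.sub("<usr>", text)


print(normalize_tweet("@juan mira esto https://t.co/abc123 :)"))
```

The normalized string can then be passed to the tokenizer. Note that this simple URL pattern also swallows punctuation glued to the end of a link, such as a trailing period.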