Commit d20173e · 1 Parent(s): e16bf00
Update README.md
README.md CHANGED
@@ -1,8 +1,8 @@
 # Twitter-XLM-Roberta-base
-This is a XLM-Roberta-base model trained on ~198M multilingual tweets, described and evaluated in the [reference paper](https://arxiv.org/abs/2104.12250). To evaluate this and other LMs on Twitter-specific data, please refer to the [main repository](https://github.com/cardiffnlp/xlm-t).
+This is a XLM-Roberta-base model trained on ~198M multilingual tweets, described and evaluated in the [reference paper](https://arxiv.org/abs/2104.12250). To evaluate this and other LMs on Twitter-specific data, please refer to the [main repository](https://github.com/cardiffnlp/xlm-t). A usage example is provided below.
+
+## Computing tweet similarity
 
-## Preprocess Text
-Replace usernames and links for placeholders: "@user" and "http".
 ```python
 def preprocess(text):
     new_text = []
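The hunk cuts off after the first two lines of `preprocess`, so the rest of the function is not visible in this commit view. Going only by the removed README sentence (usernames and links are replaced with the placeholders "@user" and "http"), one way the function could be completed is sketched below. The loop body is an assumption for illustration, not text taken from the diff.

```python
def preprocess(text):
    """Replace usernames and links with the placeholders "@user" and "http"."""
    new_text = []
    # Assumed body: the diff hunk shows only the two lines above.
    for token in text.split(" "):
        # A token such as "@alice" becomes "@user"; a bare "@" is left alone.
        if token.startswith("@") and len(token) > 1:
            token = "@user"
        # Any token starting with "http" (http:// or https://) becomes "http".
        elif token.startswith("http"):
            token = "http"
        new_text.append(token)
    return " ".join(new_text)


if __name__ == "__main__":
    print(preprocess("@alice check https://t.co/x now"))
```

Splitting on single spaces keeps the original spacing intact when the tokens are joined back together, which is why the sketch uses `split(" ")` rather than `split()`.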