luisespinosa committed on
Commit
8eacbc5
•
1 Parent(s): aa7cb4c

Update README.md

Files changed (1)
  1. README.md +11 -1
README.md CHANGED
@@ -1,4 +1,4 @@
- # Twitter-roBERTa-base

  This is a roBERTa-base model trained on ~58M tweets and finetuned for the Sentiment Analysis task at Semeval 2018.
  For full description: [_TweetEval_ benchmark (Findings of EMNLP 2020)](https://arxiv.org/pdf/2010.12421.pdf).
@@ -6,6 +6,15 @@ To evaluate this and other models on Twitter-specific data, please refer to the

  ## Example of classification

  ```python
  from transformers import AutoModelForSequenceClassification
  from transformers import TFAutoModelForSequenceClassification
@@ -37,6 +46,7 @@ model = AutoModelForSequenceClassification.from_pretrained(MODEL)
  model.save_pretrained(MODEL)

  text = "Good night 😊"
  encoded_input = tokenizer(text, return_tensors='pt')
  output = model(**encoded_input)
  scores = output[0][0].detach().numpy()
 
+ # Twitter-roBERTa-base for Sentiment Analysis

  This is a roBERTa-base model trained on ~58M tweets and finetuned for the Sentiment Analysis task at Semeval 2018.
  For full description: [_TweetEval_ benchmark (Findings of EMNLP 2020)](https://arxiv.org/pdf/2010.12421.pdf).

  ## Example of classification

+ # Preprocess text (username and link placeholders)
+ def preprocess(text):
+     new_text = []
+     for t in text.split(" "):
+         t = '@user' if t.startswith('@') and len(t) > 1 else t
+         t = 'http' if t.startswith('http') else t
+         new_text.append(t)
+     return " ".join(new_text)
+
  ```python
  from transformers import AutoModelForSequenceClassification
  from transformers import TFAutoModelForSequenceClassification

  model.save_pretrained(MODEL)

  text = "Good night 😊"
+ text = preprocess(text)
  encoded_input = tokenizer(text, return_tensors='pt')
  output = model(**encoded_input)
  scores = output[0][0].detach().numpy()
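
To make the effect of the added `preprocess` step concrete, here is a minimal standalone sketch that runs the function from this commit on its own, without loading the model. The function body follows the diff, with indentation assumed to be standard 4-space Python; the sample tweet and the `example.com` link are hypothetical and only serve as illustration.

```python
def preprocess(text):
    """Replace user mentions and links with the '@user' and 'http' placeholders."""
    new_text = []
    for t in text.split(" "):
        # A token starting with '@' and longer than one character is treated as a mention.
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        # A token starting with 'http' is treated as a link.
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)


# Hypothetical tweet, not taken from the README.
sample = "@alice check https://example.com @ 5pm"
print(preprocess(sample))  # prints "@user check http @ 5pm"

# The README's own example contains no mentions or links, so it passes through unchanged.
print(preprocess("Good night 😊"))  # prints "Good night 😊"
```

Because the function splits on single spaces and replaces whole tokens, any punctuation attached to a mention or link (for example `@bob,`) is replaced along with it, and a lone `@` is left as is.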