# Sentiment Analysis of English Tweets with BERTsent

**BERTsent**: a fine-tuned **BERT**-based **sent**iment classifier for English-language tweets.

BERTsent is trained on the SemEval 2017 corpus (more than 39k tweets) and is based on [bertweet-base](https://github.com/VinAIResearch/BERTweet), which was pre-trained on 850M English Tweets (cased) and an additional 23M COVID-19 English Tweets (cased). The base model was pre-trained using the [RoBERTa](https://github.com/pytorch/fairseq/blob/master/examples/roberta/README.md) pre-training procedure.

Output labels:

 - 0 represents "negative" sentiment
 - 1 represents "neutral" sentiment
 - 2 represents "positive" sentiment
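
The label ids above can be turned into readable names with a small lookup. This is a minimal sketch: the names `ID2LABEL` and `label_name` are illustrative and are not part of BERTsent or the transformers library.

```python
# Mapping taken from the "Output labels" list above.
# ID2LABEL and label_name are hypothetical helper names, not part of the model.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}


def label_name(label_id: int) -> str:
    """Return the sentiment name for a BERTsent output label id."""
    # int() also accepts NumPy integer types such as the result of np.argmax.
    return ID2LABEL[int(label_id)]


print(label_name(0))
```

Passing the result of `np.argmax` on the model's prediction to `label_name` would give the sentiment as a string instead of an integer.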

## COVID-19 tweets specific task

For example, the model distinguishes "covid" (neutral sentiment) from "I have covid" (negative sentiment).

## Cite

If you use BERTsent in your project/research, please cite the following article:  
Lamsal, R., Harwood, A., & Read, M. R. (2022). [Twitter conversations predict the daily confirmed COVID-19 cases](https://arxiv.org/abs/2206.10471). arXiv preprint arXiv:2206.10471.

    @article{lamsal2022twitter,
      title={Twitter conversations predict the daily confirmed COVID-19 cases},
      author={Lamsal, Rabindra and Harwood, Aaron and Read, Maria Rodriguez},
      journal={Applied Soft Computing},
      volume={129},
      pages={109603},
      year={2022},
      publisher={Elsevier}
    }

## Using the model

Install transformers and emoji, if not already installed:

    # terminal
    pip install transformers
    pip install emoji  # for converting emoticons or emojis into text

    # notebooks (Colab, Kaggle)
    !pip install transformers
    !pip install emoji

Load the BERTsent tokenizer and model through the transformers library:

    from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
      
    tokenizer = AutoTokenizer.from_pretrained("rabindralamsal/BERTsent")
    
    model = TFAutoModelForSequenceClassification.from_pretrained("rabindralamsal/BERTsent")

Import TensorFlow and numpy:

    import tensorflow as tf
    import numpy as np

We have installed and imported everything needed for the sentiment analysis. Let's predict the sentiment of an example tweet:

    example_tweet = "The NEET exams show our Govt in a poor light: unresponsiveness to genuine concerns; admit cards not delivered to aspirants in time; failure to provide centres in towns they reside, thus requiring unnecessary & risky travels. What a disgrace to treat our #Covid warriors like this!"
    # This tweet is available on Twitter with the ID 1435793872588738560

    # Tokenize the tweet into a TensorFlow tensor of token ids
    # (named input_ids to avoid shadowing the built-in input)
    input_ids = tokenizer.encode(example_tweet, return_tensors="tf")
    # The first element of the model output holds the raw logits
    output = model.predict(input_ids)[0]
    # Convert logits to probabilities over the three labels
    prediction = tf.nn.softmax(output, axis=1).numpy()
    # Index of the most probable label (0, 1, or 2)
    sentiment = np.argmax(prediction)

    print(prediction)
    print(sentiment)

Output: 

    [[0.972672164440155  0.023684727028012276 0.003643065458163619]]
    0
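
The probabilities printed above come from applying softmax to the model's raw logits. As a rough illustration of that step, here is a plain-Python sketch; the `softmax` helper and the logit values are hypothetical and were not produced by BERTsent.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    # Subtracting the maximum avoids overflow in exp() without changing the result.
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one tweet, ordered as labels 0, 1, 2.
logits = [3.2, -0.5, -2.4]
probs = softmax(logits)

# The predicted label is the index of the largest probability.
predicted = max(range(len(probs)), key=probs.__getitem__)

print(probs)
print(predicted)
```

Because softmax preserves the ordering of the logits, the predicted label is the same whether the argmax is taken over the logits or over the probabilities; the softmax is only needed when the probabilities themselves are of interest.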