cappuch committed
Commit 09db3e6
1 Parent(s): 8de7ea8

Update README.md

Files changed (1)
  1. README.md +8 -39
README.md CHANGED
@@ -9,57 +9,26 @@ datasets:
  license: mit
  ---
 
- Connect me on LinkedIn
- - [linkedin.com/in/arpanghoshal](https://www.linkedin.com/in/arpanghoshal)
-
-
- ## What is GoEmotions
-
- Dataset labelled 58000 Reddit comments with 28 emotions
-
- - admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise + neutral
-
-
- ## What is RoBERTa
-
- RoBERTa builds on BERT’s language masking strategy and modifies key hyperparameters in BERT, including removing BERT’s next-sentence pretraining objective, and training with much larger mini-batches and learning rates. RoBERTa was also trained on an order of magnitude more data than BERT, for a longer amount of time. This allows RoBERTa representations to generalize even better to downstream tasks compared to BERT.
-
-
- ## Hyperparameters
-
- | Parameter | |
- | ----------------- | :---: |
- | Learning rate | 5e-5 |
- | Epochs | 10 |
- | Max Seq Length | 50 |
- | Batch size | 16 |
- | Warmup Proportion | 0.1 |
- | Epsilon | 1e-8 |
-
-
- ## Results
-
- Best Result of `Macro F1` - 49.30%
-
+ ## What is the GoEmotions Dataset?
+
+ The dataset is comprised of 58000 Reddit comments with 28 emotions.
+
+ - admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise
+
  ## Usage
 
  ```python
 
  from transformers import RobertaTokenizerFast, TFRobertaForSequenceClassification, pipeline
 
- tokenizer = RobertaTokenizerFast.from_pretrained("arpanghoshal/EmoRoBERTa")
- model = TFRobertaForSequenceClassification.from_pretrained("arpanghoshal/EmoRoBERTa")
+ tokenizer = RobertaTokenizerFast.from_pretrained("cappuch/EmoRoBERTa_Retrain")
+ model = TFRobertaForSequenceClassification.from_pretrained("cappuch/EmoRoBERTa_Retrain")
 
  emotion = pipeline('sentiment-analysis',
- model='arpanghoshal/EmoRoBERTa')
+ model='cappuch/EmoRoBERTa_Retrain')
 
- emotion_labels = emotion("Thanks for using it.")
+ emotion_labels = emotion("Hello!")
  print(emotion_labels)
 
- ```
- Output
-
- ```
- [{'label': 'gratitude', 'score': 0.9964383244514465}]
+ #[{'label': 'neutral', 'score': 0.9964383244514465}]
  ```
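
The updated usage snippet loads a tokenizer and a model with `from_pretrained` but then builds the pipeline from the repo id string, so the checkpoint is resolved twice. Below is a minimal sketch of an equivalent call that reuses the already-loaded objects; passing the `model`/`tokenizer` objects and `framework="tf"` to `pipeline` is an assumption about the intended usage, not part of this commit.

```python
from transformers import RobertaTokenizerFast, TFRobertaForSequenceClassification, pipeline

# Model id taken from the diff above.
repo_id = "cappuch/EmoRoBERTa_Retrain"

# Load the TF checkpoint and its fast tokenizer once.
tokenizer = RobertaTokenizerFast.from_pretrained(repo_id)
model = TFRobertaForSequenceClassification.from_pretrained(repo_id)

# Assumption: hand the already-loaded objects to the pipeline instead of the
# repo id string, so nothing is fetched a second time.
emotion = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer, framework="tf")

print(emotion("Hello!"))
# The commit's README shows output of the form:
# [{'label': 'neutral', 'score': 0.99...}]
```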