---
language: "en"
tags:
- distilroberta
- sentiment
- emotion
- twitter
- reddit
widget:
- text: "Oh wow. I didn't know that."
- text: "This movie always makes me cry.."
- text: "Oh Happy Day"
---
## Description
With this model, you can classify emotions in English text data. The model was trained on 6 diverse datasets (see Appendix) and predicts Ekman's 6 basic emotions, plus a neutral class:
1) anger
2) disgust
3) fear
4) joy
5) neutral
6) sadness
7) surprise
The model is a fine-tuned checkpoint of DistilRoBERTa-base.
## Application
a) Run the emotion model on a single text example with 3 lines of code, using Hugging Face's pipeline command on Google Colab:
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/j-hartmann/emotion-english-distilroberta-base/blob/main/simple_emotion_pipeline.ipynb)
b) Run the emotion model on multiple examples and full datasets (e.g., .csv files) on Google Colab:
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/j-hartmann/emotion-english-distilroberta-base/blob/main/emotion_prediction_example.ipynb)
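For readers working outside Colab, the pipeline approach can be sketched locally as follows. This is a minimal sketch, not a copy of the notebooks: the Hub model id is an assumption inferred from the GitHub path in the links above, `top_emotion`, `classify`, and `main` are hypothetical helper names, and `top_k=None` (return scores for all classes) assumes a recent `transformers` release; older releases used `return_all_scores=True` instead.

```python
from typing import Dict, List

# Assumed Hub model id, inferred from the GitHub path in the Colab links above.
MODEL_ID = "j-hartmann/emotion-english-distilroberta-base"


def top_emotion(scores: List[Dict]) -> str:
    """Return the label with the highest score from one text's list of class scores."""
    return max(scores, key=lambda item: item["score"])["label"]


def classify(texts: List[str]) -> List[List[Dict]]:
    """Score each text against all classes (downloads the model on first use)."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    # top_k=None asks for scores of every class, not only the best one.
    classifier = pipeline("text-classification", model=MODEL_ID, top_k=None)
    return classifier(texts)


def main() -> None:
    """Classify two of the widget examples and print the top emotion for each."""
    examples = ["Oh wow. I didn't know that.", "This movie always makes me cry.."]
    for text, scores in zip(examples, classify(examples)):
        print(text, "->", top_emotion(scores))
```

Calling `main()` requires `transformers` and a backend such as PyTorch to be installed. Rows read from a .csv file can be passed to `classify` as a plain list of strings in the same way.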
## Contact
Please reach out to jochen.hartmann@uni-hamburg.de if you have any questions or feedback.
Thanks to Samuel Domdey and chrsiebert for their support in making this model available.
## Appendix
Please find an overview of the datasets used for training below. All datasets contain English text. The table summarizes which emotions are available in each of the datasets.
|Name|anger|disgust|fear|joy|neutral|sadness|surprise|
|---|---|---|---|---|---|---|---|
|Crowdflower (2016)|Yes|-|-|Yes|Yes|Yes|Yes|
|Emotion Dataset, Elvis et al. (2018)|Yes|-|Yes|Yes|-|Yes|Yes|
|GoEmotions, Demszky et al. (2020)|Yes|Yes|Yes|Yes|Yes|Yes|Yes|
|ISEAR, Vikash (2018)|Yes|Yes|Yes|Yes|-|Yes|-|
|MELD, Poria et al. (2019)|Yes|Yes|Yes|Yes|Yes|Yes|Yes|
|SemEval-2018, EI-reg, Mohammad et al. (2018)|Yes|-|Yes|Yes|-|Yes|-|
The datasets represent a diverse collection of text types. Specifically, they contain emotion labels for texts from Twitter, Reddit, student self-reports, and utterances from TV dialogues. As MELD (Multimodal EmotionLines Dataset) extends the popular EmotionLines dataset, EmotionLines itself is not included here.
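To see at a glance how many of the six datasets cover each emotion, the table above can be tallied with a short script. This is only an illustrative sketch: the `COVERAGE` dictionary is transcribed by hand from the table, and the names `EMOTIONS`, `COVERAGE`, and `datasets_per_emotion` are hypothetical. It says nothing about the number of examples per class, which the table does not report.

```python
from typing import Dict, List

EMOTIONS = ["anger", "disgust", "fear", "joy", "neutral", "sadness", "surprise"]

# One row per dataset, transcribed from the table above; True means "Yes".
COVERAGE: Dict[str, List[bool]] = {
    "Crowdflower": [True, False, False, True, True, True, True],
    "Emotion Dataset": [True, False, True, True, False, True, True],
    "GoEmotions": [True, True, True, True, True, True, True],
    "ISEAR": [True, True, True, True, False, True, False],
    "MELD": [True, True, True, True, True, True, True],
    "SemEval-2018, EI-reg": [True, False, True, True, False, True, False],
}


def datasets_per_emotion(coverage: Dict[str, List[bool]]) -> Dict[str, int]:
    """Count, for each emotion, how many datasets provide that label."""
    return {
        emotion: sum(row[i] for row in coverage.values())
        for i, emotion in enumerate(EMOTIONS)
    }


counts = datasets_per_emotion(COVERAGE)
print(counts)
```

In this tally, anger, joy, and sadness appear in all six datasets, while disgust and neutral appear in three each.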