j-hartmann/emotion-english-distilroberta-base

	---
	language: "en"
	tags:
	- sentiment
	- emotion
	- twitter

	widget:
	- text: "Oh wow. I didn't know that."
	- text: "This movie always makes me cry.."
	- text: "Oh Happy Day"

	---

	## Description

	With this model, you can classify emotions in English text data. The model was trained on 6 diverse datasets and predicts 7 emotions:

	1) anger
	2) disgust
	3) fear
	4) joy
	5) neutral
	6) sadness
	7) surprise

	The model is a fine-tuned checkpoint of DistilRoBERTa-base. The emotions reflect Ekman's 6 universal emotions, plus a neutral class.

	## Application

	a) Run the emotion model with 3 lines of code on a single text example using Hugging Face's pipeline command on Google Colab:

	[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/j-hartmann/emotion-english-distilroberta-base/blob/main/simple_emotion_pipeline.ipynb)
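
For readers who want to try the same idea outside Colab, here is a minimal sketch of a local pipeline call. It assumes a reasonably recent `transformers` release (one that accepts `top_k=None`) with a backend such as PyTorch installed, and that the checkpoint is available on the Hugging Face Hub under the ID shown. The helper names `load_classifier`, `top_emotion`, and `classify` are illustrative only and are not taken from the notebook.

```python
from typing import List

# Assumed Hub ID, built from the author and model names of this card.
MODEL_ID = "j-hartmann/emotion-english-distilroberta-base"


def load_classifier():
    """Build a text-classification pipeline for the emotion model."""
    # Imported lazily so the helpers below can be used without transformers.
    from transformers import pipeline

    # top_k=None asks the pipeline for the scores of all labels.
    return pipeline("text-classification", model=MODEL_ID, top_k=None)


def top_emotion(scores: List[dict]) -> str:
    """Return the label with the highest score from one prediction."""
    return max(scores, key=lambda item: item["score"])["label"]


def classify(text: str, classifier=None) -> str:
    """Classify a single text and return its most likely emotion."""
    if classifier is None:
        classifier = load_classifier()
    # A list input yields one list of {"label", "score"} dicts per text.
    scores = classifier([text])[0]
    return top_emotion(scores)
```

Calling `classify("Oh wow. I didn't know that.")` downloads the checkpoint on first use and returns one of the seven labels listed in the description. Passing a prebuilt `classifier` avoids reloading the model on every call.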

	b) Run the emotion model on multiple examples and full datasets (e.g., .csv files) on Google Colab:

	[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/j-hartmann/emotion-english-distilroberta-base/blob/main/emotion_prediction_example.ipynb)
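
As a rough illustration of the batch case, the sketch below labels every row of a .csv file using only Python's standard `csv` module. It is not the notebook's code. The input column name `text`, the output column name `emotion`, and the function names are assumptions made for this example. The `classifier` argument is expected to behave like a `transformers` text-classification pipeline created with `top_k=None`, that is, it returns one list of `{"label", "score"}` dicts per input text.

```python
import csv
from typing import Callable, List


def label_texts(texts: List[str], classifier: Callable) -> List[str]:
    """Return the highest-scoring emotion label for each text."""
    # Assumes classifier(texts) yields, per text, a list of
    # {"label": ..., "score": ...} dicts covering all emotions.
    predictions = classifier(texts)
    return [max(scores, key=lambda s: s["score"])["label"] for scores in predictions]


def label_csv(in_path: str, out_path: str, classifier: Callable,
              text_column: str = "text") -> int:
    """Copy a CSV file, adding an 'emotion' column; return the row count."""
    with open(in_path, newline="", encoding="utf-8") as src:
        rows = list(csv.DictReader(src))
    if not rows:
        return 0  # nothing to label, so no output file is written

    labels = label_texts([row[text_column] for row in rows], classifier)
    for row, label in zip(rows, labels):
        row["emotion"] = label  # hypothetical output column name

    with open(out_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

For large files it may be preferable to read and classify the rows in chunks rather than loading everything at once. This version keeps the whole file in memory for simplicity.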

	## Contact

	Please reach out to jochen.hartmann@uni-hamburg.de if you have any questions or feedback.

	Thanks to Samuel Domdey and chrsiebert for their support in making this model available.

	## Appendix

	Please find an overview of the datasets used for training below. All datasets contain English text. The table summarizes which emotions are available in each of the datasets.

	|Name|anger|disgust|fear|joy|neutral|sadness|surprise|
	|---|---|---|---|---|---|---|---|
	|Crowdflower (2016)|Yes|-|-|Yes|Yes|Yes|Yes|
	|Emotion Dataset, Elvis et al. (2018)|Yes|Yes|Yes|Yes|-|Yes|Yes|
	|GoEmotions, Demszky et al. (2020)|Yes|Yes|Yes|Yes|Yes|Yes|Yes|
	|ISEAR, Vikash (2018)|Yes|Yes|Yes|Yes|-|Yes|-|
	|MELD, Poria et al. (2019)|Yes|Yes|Yes|Yes|Yes|Yes|Yes|
	|SemEval-2018, EI-reg, Mohammad et al. (2018)|Yes|-|Yes|Yes|-|Yes|-|

	The datasets represent a diverse set of text types. Specifically, they contain emotion labels for texts from Twitter, Reddit, student self-reports, and utterances from TV dialogues. As MELD (Multimodal EmotionLines Dataset) extends the EmotionLines dataset, EmotionLines itself is not included here.