HebEMO - Emotion Recognition Model for Modern Hebrew

HebEMO is a tool that detects polarity and extracts emotions from modern Hebrew user-generated content (UGC). It was trained on a unique COVID-19-related dataset that we collected and annotated.

HebEMO achieved a weighted average F1-score of 0.96 for polarity classification. Emotion detection reached F1-scores of 0.78-0.97, with the exception of surprise, which the model failed to capture (F1 = 0.41). These results exceed the best previously reported performance, even when compared with results reported for English.

Emotion UGC Data Description

Our UGC data includes comments posted on news articles collected from three major Israeli news sites between January 2020 and August 2020. The total size of the data is ~150 MB, including over 7 million words and 350K sentences. About 2,000 sentences were annotated by crowd members (3-10 annotators per sentence) for overall sentiment (polarity) and eight emotions: anger, disgust, anticipation, fear, joy, sadness, surprise, and trust. The table below shows the proportion of sentences in which each emotion appeared (the table labels anticipation as "expectation" and joy as "happy").

| | anger | disgust | expectation | fear | happy | sadness | surprise | trust | sentiment |
|---|---|---|---|---|---|---|---|---|---|
| ratio | 0.78 | 0.83 | 0.58 | 0.45 | 0.12 | 0.59 | 0.17 | 0.11 | 0.25 |

Performance

Emotion Recognition

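| emotion | f1-score | precision | recall |
|---|---|---|---|
| anger | 0.96 | 0.99 | 0.93 |
| disgust | 0.97 | 0.98 | 0.96 |
| anticipation | 0.82 | 0.80 | 0.87 |
| fear | 0.79 | 0.88 | 0.72 |
| joy | 0.90 | 0.97 | 0.84 |
| sadness | 0.90 | 0.86 | 0.94 |
| surprise | 0.40 | 0.44 | 0.37 |
| trust | 0.83 | 0.86 | 0.80 |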

The above metrics are for the positive class (that is, the emotion is reflected in the text).
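As a quick sanity check on the table above, F1 is the harmonic mean of precision and recall for the positive class. The sketch below recomputes it from the rounded precision and recall values in the table. The helper name `f1_from_pr` is only illustrative, and small differences (about 0.01) from the reported F1 are expected because the inputs are already rounded.

```python
# (precision, recall, reported F1) for the positive class, copied from the table above.
reported = {
    "anger": (0.99, 0.93, 0.96),
    "disgust": (0.98, 0.96, 0.97),
    "anticipation": (0.80, 0.87, 0.82),
    "fear": (0.88, 0.72, 0.79),
    "joy": (0.97, 0.84, 0.90),
    "sadness": (0.86, 0.94, 0.90),
    "surprise": (0.44, 0.37, 0.40),
    "trust": (0.86, 0.80, 0.83),
}


def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (illustrative helper)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recompute F1 for every emotion from its precision and recall.
recomputed = {
    emotion: f1_from_pr(precision, recall)
    for emotion, (precision, recall, _) in reported.items()
}

for emotion, (_, _, f1) in reported.items():
    print(f"{emotion:<12} reported={f1:.2f} recomputed={recomputed[emotion]:.3f}")
```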

Sentiment (Polarity) Analysis

| | precision | recall | f1-score |
|---|---|---|---|
| neutral | 0.83 | 0.56 | 0.67 |
| positive | 0.96 | 0.92 | 0.94 |
| negative | 0.97 | 0.99 | 0.98 |
| accuracy | | | 0.97 |
| macro avg | 0.92 | 0.82 | 0.86 |
| weighted avg | 0.96 | 0.97 | 0.96 |
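The macro average row treats all three classes equally: it is the unweighted mean of the per-class values. A minimal sketch that recomputes it from the rounded per-class numbers above (`macro_average` is an illustrative helper; the weighted average cannot be reproduced this way because it needs per-class sample counts, which are not listed here):

```python
# Per-class scores copied from the sentiment table above.
per_class = {
    "neutral": {"precision": 0.83, "recall": 0.56, "f1-score": 0.67},
    "positive": {"precision": 0.96, "recall": 0.92, "f1-score": 0.94},
    "negative": {"precision": 0.97, "recall": 0.99, "f1-score": 0.98},
}


def macro_average(scores, metric):
    """Unweighted mean of one metric across all classes (illustrative helper)."""
    values = [row[metric] for row in scores.values()]
    return sum(values) / len(values)


macro = {
    metric: macro_average(per_class, metric)
    for metric in ("precision", "recall", "f1-score")
}

for metric, value in macro.items():
    print(f"macro avg {metric}: {value:.2f}")
```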

The sentiment (polarity) analysis model is also available on AWS. For more information, visit AWS's Git repository.

How to use

Emotion Recognition Model

An online version of the model can be found on Hugging Face Spaces or as a Colab notebook.

# !pip install pyplutchik==0.0.7
# !pip install transformers==4.14.1

!git clone https://github.com/avichaychriqui/HeBERT.git
from HeBERT.src.HebEMO import HebEMO
HebEMO_model = HebEMO()

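# Analyze texts from a file; returns the analysis as a pandas.DataFrame
HebEMO_model.hebemo(input_path = 'data/text_example.txt')

# Analyze a single string ("Life is beautiful and happy") and plot the result
hebEMO_df = HebEMO_model.hebemo(text='החיים יפים ומאושרים', plot=True)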

For the sentiment classification model (polarity only):

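from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Optional: load the tokenizer and model directly (the pipeline below loads them by name)
tokenizer = AutoTokenizer.from_pretrained("avichr/heBERT_sentiment_analysis")  # same as the 'avichr/heBERT' tokenizer
model = AutoModelForSequenceClassification.from_pretrained("avichr/heBERT_sentiment_analysis")

# Build a pipeline that returns scores for all labels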
sentiment_analysis = pipeline(
    "sentiment-analysis",
    model="avichr/heBERT_sentiment_analysis",
    tokenizer="avichr/heBERT_sentiment_analysis",
    return_all_scores = True
)

# "I am debating what to eat for lunch"
sentiment_analysis('אני מתלבט מה לאכול לארוחת צהריים')
# [[{'label': 'neutral', 'score': 0.9978172183036804},
#   {'label': 'positive', 'score': 0.0014792329166084528},
#   {'label': 'negative', 'score': 0.0007035882445052266}]]

# "Coffee is tasty"
sentiment_analysis('קפה זה טעים')
# [[{'label': 'neutral', 'score': 0.00047328314394690096},
#   {'label': 'positive', 'score': 0.9994067549705505},
#   {'label': 'negative', 'score': 0.00011996887042187154}]]

# "I do not love the world"
sentiment_analysis('אני לא אוהב את העולם')
# [[{'label': 'neutral', 'score': 9.214012970915064e-05},
#   {'label': 'positive', 'score': 8.876807987689972e-05},
#   {'label': 'negative', 'score': 0.9998190999031067}]]
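With `return_all_scores = True`, the pipeline returns one list of label/score dictionaries per input text, as in the outputs above. A minimal sketch of picking the most likely label from that structure follows. `top_label` is an illustrative helper (not part of transformers), and the sample data is copied from the first output above rather than produced by running the model.

```python
# Sample pipeline output for a single input text, copied from the first example above.
sample_output = [[
    {"label": "neutral", "score": 0.9978172183036804},
    {"label": "positive", "score": 0.0014792329166084528},
    {"label": "negative", "score": 0.0007035882445052266},
]]


def top_label(scores):
    """Return (label, score) for the highest-scoring entry (illustrative helper)."""
    best = max(scores, key=lambda entry: entry["score"])
    return best["label"], best["score"]


# One (label, score) pair per input text.
predictions = [top_label(scores) for scores in sample_output]
print(predictions)
```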

Contact us

Avichay Chriqui
Inbal Yahav
The Coller Semitic Languages AI Lab
Thank you, תודה, شكرا

If you use this model, please cite us as:

Chriqui, A., & Yahav, I. (2022). HeBERT & HebEMO: a Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition. INFORMS Journal on Data Science, forthcoming.

@article{chriqui2021hebert,
  title={HeBERT \& HebEMO: a Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition},
  author={Chriqui, Avihay and Yahav, Inbal},
  journal={INFORMS Journal on Data Science},
  year={2022}
}