---
license: apache-2.0
tags:
- setfit
- sentence-transformers
- text-classification
pipeline_tag: text-classification
---
## General description of the model
Unlike a classical sentiment classifier, this model measures the sentiment expressed towards a particular entity on a particular, pre-determined topic.
```python
model = ...
text = "I pity Facebook for their lack of commitment against global warming, I like Google for its support of increased education"
# In this example the sentiment depends both on the entity (Google or Facebook)
# and on the topic (education or climate change), so there are two distinct sentiments.

# Predict the sentiment towards Facebook (entity) on Climate change (topic)
sentiment, probability = model.predict(text, topic="climate change", entity="Facebook")
# sentiment = "negative"

# Predict the sentiment towards Google (entity) on Education (topic)
sentiment, probability = model.predict(text, topic="education", entity="Google")
# sentiment = "positive"

# Predict the sentiment towards Google (entity) on Climate change (topic)
sentiment, probability = model.predict(text, topic="climate change", entity="Google")
# sentiment = "neutral" / "not_found"

# Predict the sentiment towards Facebook (entity) on Education (topic)
sentiment, probability = model.predict(text, topic="education", entity="Facebook")
# sentiment = "neutral" / "not_found"
```
## Training
This is a [SetFit model](https://github.com/huggingface/setfit) that can be used for sentiment classification.
The model has been trained using an efficient few-shot learning technique that involves:
1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
The training data can be downloaded from [here](https://docs.google.com/spreadsheets/d/1BVDardwVs04ZWmc5_Eg62Lyr_w_OuXysQwhne8ErkoA/edit?usp=sharing).
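The contrastive step (1. above) relies on turning a handful of labelled examples into similar/dissimilar sentence pairs. The snippet below is a simplified, hypothetical illustration of that pair-generation idea, not the exact SetFit implementation (the example texts, labels, and helper name are assumptions):

```python
from itertools import combinations

# A few labelled (text, label) examples, as used in few-shot training
examples = [
    ("I love Google's education programs", "positive"),
    ("Google invests heavily in schools", "positive"),
    ("Facebook ignores climate change", "negative"),
    ("Facebook's climate record is poor", "negative"),
]

def make_contrastive_pairs(examples):
    """Build (text_a, text_b, similarity) pairs: 1 if the two examples
    share a label, 0 otherwise. The Sentence Transformer is then
    fine-tuned to pull same-label pairs together and push
    different-label pairs apart in embedding space."""
    pairs = []
    for (t1, l1), (t2, l2) in combinations(examples, 2):
        pairs.append((t1, t2, 1 if l1 == l2 else 0))
    return pairs

pairs = make_contrastive_pairs(examples)
# 4 examples yield C(4, 2) = 6 pairs: 2 same-label, 4 different-label
```

A classification head is then trained on the embeddings produced by the fine-tuned encoder (step 2).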
## Usage and Inference
For a global overview of the pipeline used for inference please refer to this [colab notebook](https://colab.research.google.com/drive/1GgEGrhQZfA1pbcB9Zl0VtV7L5wXdh6vj?usp=sharing)
## Model Performance
The performance of the model on our internal test set is:
* Accuracy: 0.68
* Balanced_Accuracy: 0.45
* MCC: 0.37
* F1: 0.49
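For reference, balanced accuracy and MCC (Matthews correlation coefficient) can be computed from confusion-matrix counts as sketched below for the binary case. The counts are toy values for illustration only, not the actual test-set results:

```python
import math

# Toy binary confusion-matrix counts (hypothetical, not the real test set)
tp, tn, fp, fn = 30, 20, 10, 15

accuracy = (tp + tn) / (tp + tn + fp + fn)

# Balanced accuracy: mean of per-class recall, robust to class imbalance
recall_pos = tp / (tp + fn)          # sensitivity
recall_neg = tn / (tn + fp)          # specificity
balanced_accuracy = (recall_pos + recall_neg) / 2

# Matthews correlation coefficient: correlation between predictions and truth
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

# F1 score for the positive class: harmonic mean of precision and recall
precision = tp / (tp + fp)
f1 = 2 * precision * recall_pos / (precision + recall_pos)
```

Note that balanced accuracy being well below plain accuracy, as in the table above, indicates that the test set is imbalanced and the model does better on the majority class.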
## Potential weaknesses of the model
* As the model has been trained on short texts, it is difficult to predict how it will behave on long texts.
* Although the model is robust to typos and able to deal with synonyms, the entities and topics must be stated as explicitly as possible.
* The model may have difficulty detecting very abstract or complex topics; fine-tuning the model can mitigate this problem.
* The model may have difficulty capturing elements that are very specific to a given context.
## BibTeX entry and citation info
```bibtex
@misc{sentiment_entity_topic_2023,
  author   = {HasiMichael, Solofo, Bruce, Sitwala},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title    = {Sentiment Classification toward Entity and Topics},
  year     = {2023},
  month    = {4},
  version  = {0}
}
```