HasinMDG
/

XT-Sentiment-XLM-Roberta-Large

	---
	license: apache-2.0
	tags:
	- setfit
	- sentence-transformers
	- text-classification
	pipeline_tag: text-classification
	---

	## General description of the model

	Unlike a classical sentiment classifier, this model measures the sentiment expressed towards a particular entity on a particular, pre-determined topic.


	```python
	model = ...  # placeholder: load the model as shown in the Colab notebook under "Usage and Inference"

	text = "I pity Facebook for their lack of commitment against global warming , I like google for its support of increased education"
	# The sentiment in this text depends on both the entity (Google or Facebook)
	# and the topic (education or climate change), so it carries two distinct sentiments.

	# Predict the sentiment towards Facebook (entity) on climate change (topic)
	sentiment, probability = model.predict(text, topic="climate change", entity="Facebook")
	# sentiment = "negative"

	# Predict the sentiment towards Google (entity) on education (topic)
	sentiment, probability = model.predict(text, topic="education", entity="Google")
	# sentiment = "positive"

	# Predict the sentiment towards Google (entity) on climate change (topic)
	sentiment, probability = model.predict(text, topic="climate change", entity="Google")
	# sentiment = "neutral" / "not_found"

	# Predict the sentiment towards Facebook (entity) on education (topic)
	sentiment, probability = model.predict(text, topic="education", entity="Facebook")
	# sentiment = "neutral" / "not_found"

	```
	## Training
	This is a [SetFit model](https://github.com/huggingface/setfit) that can be used for sentiment classification.
	The model was trained with an efficient few-shot learning technique that involves:

	1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
	2. Training a classification head with features from the fine-tuned Sentence Transformer.

	The training data can be downloaded from [this spreadsheet](https://docs.google.com/spreadsheets/d/1BVDardwVs04ZWmc5_Eg62Lyr_w_OuXysQwhne8ErkoA/edit?usp=sharing).
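
The contrastive fine-tuning in step 1 works on sentence pairs built from the labeled examples: pairs from the same class are pulled together in embedding space, and pairs from different classes are pushed apart. The sketch below illustrates that pairing idea only. It is not the SetFit implementation (the library does its own pair sampling internally), and the function name `generate_contrastive_pairs` and the example sentences are hypothetical.

```python
import itertools
import random


def generate_contrastive_pairs(examples, seed=0):
    """Build (text_a, text_b, label) pairs from few-shot examples.

    label is 1.0 when both texts share a class (similar pair) and
    0.0 when they belong to different classes (dissimilar pair).
    For simplicity this enumerates every pair; a real trainer would sample.
    """
    rng = random.Random(seed)
    pairs = []
    for (text_a, class_a), (text_b, class_b) in itertools.combinations(examples, 2):
        pairs.append((text_a, text_b, 1.0 if class_a == class_b else 0.0))
    rng.shuffle(pairs)  # mix similar and dissimilar pairs
    return pairs


# Hypothetical few-shot examples, not taken from the actual training data.
few_shot_examples = [
    ("I like Google for its support of education", "positive"),
    ("Great initiative from the company", "positive"),
    ("I pity Facebook for their lack of commitment", "negative"),
    ("Their policy is a disgrace", "negative"),
]

pairs = generate_contrastive_pairs(few_shot_examples)
for text_a, text_b, label in pairs:
    print(label, "|", text_a, "|", text_b)
```

With only a handful of labeled texts, pairing multiplies the number of training signals available to the Sentence Transformer, which is what makes the few-shot setting workable.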

	## Usage and Inference
	For a global overview of the inference pipeline, please refer to this [colab notebook](https://colab.research.google.com/drive/1GgEGrhQZfA1pbcB9Zl0VtV7L5wXdh6vj?usp=sharing).

	## Model Performance
	On our internal test set, the model achieves:
	* Accuracy: 0.68
	* Balanced accuracy: 0.45
	* MCC (Matthews correlation coefficient): 0.37
	* F1: 0.49
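
To help read these numbers, here is a minimal, dependency-free sketch of how the four metrics can be computed for a multi-class classifier. It rests on assumptions: the card does not say how F1 is averaged, so macro averaging is assumed here, and the labels in the usage example are made up rather than drawn from the internal test set. A gap like the one between accuracy (0.68) and balanced accuracy (0.45) is typical when classes are unevenly represented or when the less frequent classes are harder to predict.

```python
import math
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy, balanced accuracy, macro F1 and multi-class MCC."""
    labels = sorted(set(y_true) | set(y_pred))
    n = len(y_true)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Per-class count of correct predictions (true positives).
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n_correct = sum(correct.values())

    accuracy = n_correct / n

    # Balanced accuracy: mean recall over the classes present in y_true.
    recalls = [correct[c] / true_counts[c] for c in labels if true_counts[c] > 0]
    balanced_accuracy = sum(recalls) / len(recalls)

    # Macro F1 (assumed averaging): per-class F1 = 2*TP / (2*TP + FP + FN),
    # where 2*TP + FP + FN equals true_counts[c] + pred_counts[c].
    f1_scores = [2 * correct[c] / (true_counts[c] + pred_counts[c]) for c in labels]
    macro_f1 = sum(f1_scores) / len(f1_scores)

    # Multi-class Matthews correlation coefficient.
    cov_tp = n_correct * n - sum(pred_counts[c] * true_counts[c] for c in labels)
    cov_pp = n * n - sum(pred_counts[c] ** 2 for c in labels)
    cov_tt = n * n - sum(true_counts[c] ** 2 for c in labels)
    denom = math.sqrt(cov_pp * cov_tt)
    mcc = cov_tp / denom if denom else 0.0  # 0.0 by convention when undefined

    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "f1": macro_f1,
        "mcc": mcc,
    }


# Made-up, imbalanced labels: a classifier that always answers "neutral".
y_true = ["neutral"] * 4 + ["positive", "negative"]
y_pred = ["neutral"] * 6
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy case the always-"neutral" classifier is right on 4 of 6 samples, yet its balanced accuracy is only 1/3 and its MCC is 0, which shows why the card reports more than plain accuracy.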

	## Potential weaknesses of the model

	* The model was trained on short texts, so its behavior on long texts is hard to predict.
	* Although the model is robust to typos and can handle synonyms, entities and topics must be stated as explicitly as possible.
	* The model may struggle to detect very abstract or complex topics; fine-tuning the model can address this.
	* The model may have difficulty capturing elements that are very specific to a given context.
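
Given the first two points, one possible workaround for long inputs is to split the text into sentences and keep only those that explicitly name the entity before running the model on each of them. This is an assumption on our side, not a step described in the training or inference pipeline. The helper `sentences_mentioning` is hypothetical, and its sentence splitter is deliberately naive.

```python
import re


def sentences_mentioning(text, entity):
    """Return the sentences of `text` that mention `entity` (case-insensitive)."""
    # Naive split on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(r"\b" + re.escape(entity) + r"\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


long_text = (
    "The conference covered many subjects. "
    "I pity Facebook for their lack of commitment against global warming. "
    "I like google for its support of increased education."
)

facebook_sentences = sentences_mentioning(long_text, "Facebook")
google_sentences = sentences_mentioning(long_text, "Google")
print(facebook_sentences)
print(google_sentences)
```

Each returned sentence is short and names the entity explicitly, which is closer to the kind of input the model was trained on. Exact string matching will miss pronouns and aliases, so this filter trades recall for explicitness.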

	## BibTeX entry and citation info

	```bibtex
	author = {HasiMichael, Solofo, Bruce, Sitwala},
	keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
	title = {Sentiment Classification toward Entity and Topics},
	year = {2023/04},
	version = {0}
	```