	---
	license: apache-2.0
	datasets:
	- agentlans/twitter-sentiment-meta-analysis
	language:
	- en
	base_model:
	- microsoft/deberta-v3-xsmall
	pipeline_tag: text-classification
	---
	# DeBERTa-v3 Twitter Sentiment Models

	This model is one of two DeBERTa-v3 variants (xsmall and base) fine-tuned for sentiment regression on English tweets.

	## Model Details

	- Model Architecture: DeBERTa-v3
	- Variants:
	  - xsmall
	  - base
	- Task: Sentiment regression
	- Language: English
	- License: Apache 2.0

	## Intended Use

	These models are designed for fine-grained sentiment analysis of English tweets. They output a continuous sentiment score rather than discrete categories:

	- A negative score indicates negative sentiment.
	- A score of zero indicates neutral sentiment.
	- A positive score indicates positive sentiment.
	- The absolute value of the score indicates how strong the sentiment is.
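The interpretation above can be sketched as a small helper that turns a score into a label and a strength. The function name `describe_score` and the `neutral_band` cutoff are hypothetical and chosen only for illustration; the model card does not define a neutral threshold.

```python
def describe_score(score, neutral_band=0.25):
    """Turn a continuous sentiment score into a (label, strength) pair.

    neutral_band is a hypothetical cutoff chosen for illustration only;
    scores whose magnitude falls inside it are reported as neutral.
    """
    strength = abs(score)
    if strength <= neutral_band:
        label = "neutral"
    elif score < 0:
        label = "negative"
    else:
        label = "positive"
    return label, strength


# Scores taken from the example output in this model card
for s in (-2.28, 0.06, 2.66):
    print(s, describe_score(s))
```

Tune or drop the neutral band depending on how much near-zero noise your application can tolerate.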

	## Training Data

	The models were fine-tuned on a dataset of English tweets collected between September 2009 and January 2010. The sentiment scores were derived from a meta-analysis of 10 different sentiment classifiers using principal component analysis. Find the dataset at [agentlans/twitter-sentiment-meta-analysis](https://huggingface.co/datasets/agentlans/twitter-sentiment-meta-analysis).
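To give a rough idea of how several classifier outputs can be combined with principal component analysis, here is a minimal sketch using only the standard library. It is not the dataset author's pipeline: the function `first_pc_scores`, the power-iteration approach, and the toy numbers are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def first_pc_scores(rows, iterations=200):
    """Project standardized rows onto their first principal component.

    rows: one list per tweet, one value per classifier (hypothetical data).
    """
    columns = list(zip(*rows))
    n_cols = len(columns)
    means = [mean(c) for c in columns]
    stds = [pstdev(c) or 1.0 for c in columns]  # guard against constant columns
    z = [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

    # Correlation matrix of the standardized classifier outputs
    n = len(z)
    cov = [[sum(r[i] * r[j] for r in z) / n for j in range(n_cols)]
           for i in range(n_cols)]

    # Power iteration approximates the dominant eigenvector
    vec = [1.0] * n_cols
    for _ in range(iterations):
        vec = [sum(cov[i][j] * vec[j] for j in range(n_cols)) for i in range(n_cols)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]

    # Orient the component so higher classifier outputs give higher scores
    if sum(vec) < 0:
        vec = [-v for v in vec]
    return [sum(a * b for a, b in zip(row, vec)) for row in z]


# Toy example: four tweets scored by three made-up classifiers
toy = [[-0.9, -0.8, -1.0], [-0.1, 0.0, 0.2], [0.4, 0.5, 0.3], [0.9, 1.0, 0.8]]
print([round(s, 2) for s in first_pc_scores(toy)])
```

Because the columns are standardized first, the resulting scores are centered on zero, which is consistent with the signed scores described under Intended Use.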

	## How to use

	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification
	import torch

	model_name = "agentlans/deberta-v3-xsmall-tweet-sentiment"

	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForSequenceClassification.from_pretrained(model_name)

	# Run on the GPU when one is available, otherwise fall back to the CPU
	device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
	model = model.to(device)

	def sentiment(text):
	    """Return the model's sentiment score for each input text.

	    `text` may be a single string or a list of strings; the result is always a list.
	    """
	    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True).to(device)
	    with torch.no_grad():
	        # The model has a single regression output, so drop only the last dimension;
	        # a plain squeeze() would collapse a one-item batch to a scalar.
	        logits = model(**inputs).logits.squeeze(-1).cpu()
	    return logits.tolist()

	# Example usage
	text = [x.strip() for x in """
	I absolutely despise this product and regret ever purchasing it.
	The service at that restaurant was terrible and ruined our entire evening.
	I'm feeling a bit under the weather today, but it's not too bad.
	The weather is quite average today, neither good nor bad.
	The movie was okay, I didn't love it but I didn't hate it either.
	I'm looking forward to the weekend, it should be nice to relax.
	This new coffee shop has a really pleasant atmosphere and friendly staff.
	I'm thrilled with my new job and the opportunities it presents!
	The concert last night was absolutely incredible, easily the best I've ever seen.
	I'm overjoyed and grateful for all the love and support from my friends and family.
	""".strip().split("\n")]

	for x, s in zip(text, sentiment(text)):
	    print(f"Text: {x}\nSentiment: {round(s, 2)}\n")
	```

	Output:
	```text
	Text: I absolutely despise this product and regret ever purchasing it.
	Sentiment: -2.28

	Text: The service at that restaurant was terrible and ruined our entire evening.
	Sentiment: -2.38

	Text: I'm feeling a bit under the weather today, but it's not too bad.
	Sentiment: 0.25

	Text: The weather is quite average today, neither good nor bad.
	Sentiment: -0.14

	Text: The movie was okay, I didn't love it but I didn't hate it either.
	Sentiment: 0.06

	Text: I'm looking forward to the weekend, it should be nice to relax.
	Sentiment: 2.06

	Text: This new coffee shop has a really pleasant atmosphere and friendly staff.
	Sentiment: 2.48

	Text: I'm thrilled with my new job and the opportunities it presents!
	Sentiment: 2.66

	Text: The concert last night was absolutely incredible, easily the best I've ever seen.
	Sentiment: 2.68

	Text: I'm overjoyed and grateful for all the love and support from my friends and family.
	Sentiment: 2.65
	```
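The example above scores all texts in a single batch. For larger collections, one option is to score them in fixed-size chunks so that memory use stays bounded. The helpers `chunks` and `score_in_batches` below are hypothetical names, and `fake_scorer` is a stand-in so the sketch runs without downloading a model; in practice you would pass the `sentiment` function instead.

```python
def chunks(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def score_in_batches(texts, score_fn, batch_size=32):
    """Apply `score_fn` to `texts` one chunk at a time and collect the scores."""
    scores = []
    for batch in chunks(texts, batch_size):
        result = score_fn(batch)
        # A scorer may return a bare number for a one-item batch; normalise to a list
        if not isinstance(result, list):
            result = [result]
        scores.extend(result)
    return scores


# Stand-in scorer (an assumption for illustration): text length as a fake score
def fake_scorer(batch):
    return [float(len(t)) for t in batch]


print(score_in_batches(["a", "bb", "ccc"], fake_scorer, batch_size=2))
```

A batch size of 32 is only a placeholder; a suitable value depends on tweet length and available memory.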

	## Performance

	Root mean squared error (RMSE) on the evaluation set (lower is better):
	- xsmall: 0.2560
	- base: 0.1938
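For reference, RMSE is the square root of the mean squared difference between predicted and reference scores. The sketch below shows the calculation on made-up numbers; it does not reproduce the evaluation above, and the function name `rmse` is chosen for illustration.

```python
import math


def rmse(predictions, targets):
    """Root mean squared error between two equal-length sequences of scores."""
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up predictions and reference scores, for illustration only
predicted = [-2.1, 0.3, 2.4]
reference = [-2.3, 0.1, 2.6]
print(round(rmse(predicted, reference), 4))
```

Since RMSE is expressed in the same units as the sentiment score, it can be read as a typical prediction error on that scale.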

	## Limitations

	- English only
	- Trained specifically on tweets; may not generalize well to other text types
	- Scores each text in isolation, without broader context beyond the individual tweet
	- May struggle to detect sarcasm or nuanced sentiment

	## Ethical Considerations

	- Potential biases in the training data related to the time period and Twitter user demographics
	- Risk of misuse for large-scale sentiment monitoring without consent