Sindhi Sentiment Analysis Model

A text classification model that detects positive, negative, and neutral sentiment in Sindhi language text. This is one of the first publicly available sentiment analysis models for the Sindhi language on Hugging Face.

Model Description

This model was trained on a custom Sindhi sentiment dataset collected from Sindhi newspaper corpora. It classifies Sindhi text into three sentiment categories:

✅ Positive
❌ Negative
😐 Neutral

Model Details

Property	Details
Language	Sindhi (`sd`)
Script	Arabic (Sindhi Perso-Arabic alphabet, conventionally written in Naskh style)
Task	Sentiment Analysis / Text Classification
Labels	Positive, Negative, Neutral
License	MIT
Developer	Ali Nawaz
Institution	Shaikh Ayaz University

Training Data

The model was trained on the Sindhi Sentiment Analysis Dataset, which contains 1,898 Sindhi sentences collected from Sindhi newspaper corpora using a semi-supervised pipeline and then manually verified.

Column	Description
Sindhi Text	Original Sindhi sentence
English Translation	English translation
Sentiment	Label: Positive / Negative / Neutral
Source	Newspaper/corpus source
Verified	Manual verification status
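
As a rough illustration of the column layout above, the sketch below parses a few rows with Python's standard csv module and counts labels among verified rows. The CSV format, the sample rows, the source name, and the "Yes"/"No" values in the Verified column are all assumptions made for illustration; the actual dataset file may be organized differently.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the column layout described above.
# The placeholder rows, the source name, and the "Yes"/"No" values
# are illustrative assumptions, not real dataset content.
SAMPLE_CSV = """Sindhi Text,English Translation,Sentiment,Source,Verified
هي ڪتاب تمام سٺو آهي,This book is very good,Positive,example-source,Yes
<sindhi sentence 2>,<translation 2>,Negative,example-source,Yes
<sindhi sentence 3>,<translation 3>,Neutral,example-source,No
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def label_counts(rows, verified_only=True):
    """Count sentiment labels, optionally keeping only verified rows."""
    kept = [r for r in rows if not verified_only or r["Verified"] == "Yes"]
    return Counter(r["Sentiment"] for r in kept)


rows = load_rows(SAMPLE_CSV)
print(label_counts(rows))                       # verified rows only
print(label_counts(rows, verified_only=False))  # all rows
```

Checking the label distribution this way is a quick first step before training or evaluation, since a dataset of this size can easily be imbalanced.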

How to Use

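from transformers import pipeline

# Download the model from the Hugging Face Hub and wrap it in a text-classification pipeline
classifier = pipeline("text-classification", model="alinawazmahar/sindhi-sentiment")

# Classify one Sindhi sentence ("This book is very good")
result = classifier("هي ڪتاب تمام سٺو آهي")
print(result)
# Example output (the exact score will vary):
# [{'label': 'Positive', 'score': 0.95}]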

Or load manually:

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("alinawazmahar/sindhi-sentiment")
model = AutoModelForSequenceClassification.from_pretrained("alinawazmahar/sindhi-sentiment")
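
Loading the tokenizer and model by hand leaves the forward pass and decoding to the caller. The following is a minimal sketch of those steps, assuming a PyTorch backend; the helper names `softmax`, `top_label`, and `classify` are hypothetical, and the logits and label order in the final line are made up so that the decoding step can be checked without downloading anything.

```python
import math

MODEL_ID = "alinawazmahar/sindhi-sentiment"


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify(text):
    """Tokenize, run one forward pass, and decode the prediction.

    Requires torch and transformers; downloads the model on first use.
    """
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()  # inference mode: disables dropout

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    # The label mapping is read from the model's own config,
    # so no label order is assumed here.
    return top_label(logits, model.config.id2label)


# Offline check of the decoding step with made-up logits and label order
print(top_label([0.1, 2.3, -1.0], {0: "Negative", 1: "Positive", 2: "Neutral"}))
```

Calling `classify("هي ڪتاب تمام سٺو آهي")` would return a label and probability comparable to the pipeline output, although `classify` reloads the model on every call; for repeated use, load the tokenizer and model once and reuse them.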

Live Demo

Try the model interactively on the Hugging Face Space:
👉 alinawazmahar/sindhi-sentiment (Space)

Intended Use

Sentiment analysis of Sindhi news articles
Social media monitoring in Sindhi
NLP research on low-resource South Asian languages
Educational and academic research
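
For monitoring-style uses such as those listed above, per-sentence predictions usually need to be aggregated. The sketch below tallies labels from a list of predictions in the shape the pipeline returns (one dict with `label` and `score` per input). The function name `summarize`, the `min_score` cutoff, and the sample predictions are hypothetical and chosen only for illustration.

```python
from collections import Counter


def summarize(results, min_score=0.0):
    """Tally predicted labels, skipping predictions scored below min_score."""
    counts = Counter(r["label"] for r in results if r["score"] >= min_score)
    total = sum(counts.values())
    # Share of each label among the predictions that were kept
    shares = {label: n / total for label, n in counts.items()} if total else {}
    return counts, shares


# Made-up predictions in the pipeline's output shape
sample_results = [
    {"label": "Positive", "score": 0.91},
    {"label": "Negative", "score": 0.78},
    {"label": "Positive", "score": 0.55},
    {"label": "Neutral", "score": 0.40},
]

counts, shares = summarize(sample_results, min_score=0.5)
print(counts)
print(shares)
```

Dropping low-confidence predictions before counting is one possible design choice; whether a cutoff helps, and at what value, would need to be checked on held-out data.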

Limitations

Trained on newspaper text; may perform differently on informal/social media Sindhi
Dataset size is relatively small (1,898 sentences)
Roman Sindhi (Latin script) is not supported; the model handles Arabic-script input only
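
Because only Arabic-script input is supported, it can help to screen text before sending it to the model. The sketch below is one possible pre-check, not part of the model itself: it measures what fraction of a string's letters fall in the Arabic Unicode blocks. The function names and the 0.5 threshold are assumptions, and the check cannot tell Sindhi apart from other Arabic-script languages such as Urdu or Arabic.

```python
# Unicode blocks that hold Arabic-script letters
# (standard Sindhi letters fall in the first block)
ARABIC_RANGES = (
    (0x0600, 0x06FF),  # Arabic
    (0x0750, 0x077F),  # Arabic Supplement
    (0x08A0, 0x08FF),  # Arabic Extended-A
    (0xFB50, 0xFDFF),  # Arabic Presentation Forms-A
    (0xFE70, 0xFEFF),  # Arabic Presentation Forms-B
)


def is_arabic_char(ch):
    """True if the character's code point lies in an Arabic-script block."""
    code = ord(ch)
    return any(lo <= code <= hi for lo, hi in ARABIC_RANGES)


def arabic_script_ratio(text):
    """Fraction of alphabetic characters that are Arabic-script letters."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(1 for ch in letters if is_arabic_char(ch)) / len(letters)


def is_supported_input(text, threshold=0.5):
    """Heuristic: accept text whose letters are mostly Arabic script."""
    return arabic_script_ratio(text) >= threshold


print(is_supported_input("هي ڪتاب تمام سٺو آهي"))        # Arabic-script Sindhi
print(is_supported_input("hee kitab tamam sutho aahe"))  # rough romanization
```

Inputs that fail the check can be skipped or flagged rather than classified, since predictions on Latin-script text are not expected to be meaningful.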

Citation

If you use this model or dataset in your research, please cite:

@misc{alinawaz2025sindhi,
  author = {Ali Nawaz},
  title  = {Sindhi Sentiment Analysis Model},
  year   = {2025},
  publisher = {Hugging Face},
  url    = {https://huggingface.co/alinawazmahar/sindhi-sentiment},
  institution = {Shaikh Ayaz University}
}

Acknowledgements

The dataset was collected from Sindhi newspaper corpora. The model was developed as part of NLP research at Shaikh Ayaz University.
