metadata
title: Entity Sentiment Classification
emoji: π
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
Entity Sentiment Classification
Classify sentiment (positive, neutral, negative) for named entities in news article text using fine-tuned DistilBERT models.
Three classification modes:
- marker: entity wrapped in [E]...[/E] special tokens, single-sequence input
- qa_m: multi-class question answering, using the question "What do you think of the sentiment of {entity}?"
- qa_b: binary question answering, with three hypotheses per entity and the label chosen by argmax of P(yes)
Plus a fastText baseline using marker-mode text.
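The input formats described above can be sketched as follows. This is a minimal illustration, not the project's actual preprocessing: the function names and the handling of multiple mentions are assumptions, and the QA-B hypothesis wording is deliberately left out because it is not given here. Only the [E]...[/E] markers, the QA-M question, and the argmax over P(yes) come from the description above.

```python
LABELS = ("positive", "neutral", "negative")


def mark_entity(text, positions):
    """Wrap every mention of one entity in [E]...[/E] markers (marker mode)."""
    marked = text
    # Insert from right to left so that earlier offsets stay valid.
    for pos in sorted(positions, key=lambda p: p["offset"], reverse=True):
        start = pos["offset"]
        end = start + pos["length"]
        marked = marked[:start] + "[E]" + marked[start:end] + "[/E]" + marked[end:]
    return marked


def qa_m_question(entity_text):
    """Build the QA-M question that is paired with the article text."""
    return f"What do you think of the sentiment of {entity_text}?"


def qa_b_decision(p_yes):
    """Pick the label whose hypothesis has the highest P(yes) (QA-B mode).

    p_yes is a hypothetical dict mapping each label to the model's
    P(yes) for that label's hypothesis.
    """
    return max(LABELS, key=lambda label: p_yes[label])
```

For example, `mark_entity("Google had solid earnings.", [{"offset": 0, "length": 6}])` wraps the first six characters, and `qa_b_decision` simply compares the three P(yes) scores without requiring them to sum to one.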
Report: report.pdf.
Live demo: https://huggingface.co/spaces/lamossta/entity-sentiment-classification
Logs: https://telemetry.betterstack.com/team/t529434/tail?s=2383648
Setup
1. Environment Variables
Create a .env file in the project root:
HF_TOKEN=<your huggingface token> (not needed for inference)
BETTERSTACK_SOURCE_TOKEN=<your betterstack token>
2. Run
docker compose up
This will install dependencies, download models from HuggingFace, and start the backend and frontend. Everything is accessible on port 7860:
- Frontend: http://localhost:7860
- API: http://localhost:7860/api/
API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /predict | POST | Classify entities using the marker model |
| /predict-all-models | POST | Classify entities using all available models |
| /health | GET | Health check |
| /docs | GET | Interactive Swagger UI API docs |
Sending Requests
Request Format
[
{
"id": 0,
"text": "Google had solid Q4 2025 earnings but Microsoft's were not great.",
"entities": [
{
"entity_id": 0,
"entity_text": "Google",
"entity_type": "company",
"positions": [
{"position_text": "Google", "length": 6, "offset": 0}
]
},
{
"entity_id": 1,
"entity_text": "Microsoft",
"entity_type": "company",
"positions": [
{"position_text": "Microsoft", "length": 9, "offset": 38}
]
}
]
}
]
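Because each position carries both an offset and the expected text, a payload can be sanity-checked before it is sent. The sketch below assumes that offset is a 0-based character index into text, which matches "Google" at offset 0 with length 6; `find_position_mismatches` is a hypothetical helper, and whether the API performs a similar check itself is not stated here.

```python
def find_position_mismatches(documents):
    """Return (doc id, entity id, offset) for each position whose span
    in the text does not equal its position_text."""
    mismatches = []
    for doc in documents:
        text = doc["text"]
        for entity in doc["entities"]:
            for pos in entity["positions"]:
                start = pos["offset"]
                span = text[start:start + pos["length"]]
                if span != pos["position_text"]:
                    mismatches.append((doc["id"], entity["entity_id"], start))
    return mismatches
```

An empty list means every position lines up with the text; any returned tuple points at an entity whose offset or length needs correcting.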
Response Format
[
{
"id": 0,
"entities": [
{"entity_id": 0, "entity_text": "Google", "classification": "positive"},
{"entity_id": 1, "entity_text": "Microsoft", "classification": "negative"}
]
}
]
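A response in this shape can be flattened into a lookup table keyed by document and entity id. This is a small convenience sketch; `labels_by_entity` is a hypothetical name and not part of the API.

```python
def labels_by_entity(response):
    """Map (document id, entity id) to the predicted classification."""
    return {
        (doc["id"], entity["entity_id"]): entity["classification"]
        for doc in response
        for entity in doc["entities"]
    }
```

With the response above, the key `(0, 0)` would map to "positive" and `(0, 1)` to "negative".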
Local
curl -X POST http://localhost:7860/api/predict \
-H "Content-Type: application/json" \
-d @sample_input.json
curl -X POST http://localhost:7860/api/predict-all-models \
-H "Content-Type: application/json" \
-d @sample_input.json
HuggingFace Spaces
curl -X POST https://<your-space>.hf.space/predict \
-H "Content-Type: application/json" \
-d @sample_input.json
curl -X POST https://<your-space>.hf.space/predict-all-models \
-H "Content-Type: application/json" \
-d @sample_input.json
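The same requests can be sent from Python using only the standard library. This is a sketch under assumptions: `build_predict_request` and `predict` are hypothetical helper names, the 30-second timeout is arbitrary, and the base URL must be whichever prefix applies to the deployment (for example http://localhost:7860/api for the local setup).

```python
import json
import urllib.request


def build_predict_request(base_url, payload, endpoint="predict"):
    """Build a JSON POST request for /predict or /predict-all-models."""
    url = f"{base_url.rstrip('/')}/{endpoint}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url, payload, endpoint="predict", timeout=30):
    """Send the payload and return the decoded JSON response."""
    request = build_predict_request(base_url, payload, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, after loading sample_input.json with `json.load`, calling `predict("http://localhost:7860/api", payload, "predict-all-models")` mirrors the second local curl command.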
Notebooks
- notebooks/data_preprocessing_analysis.ipynb: data hygiene checks on data/data_raw.json
- notebooks/data_augmentation_analysis.ipynb: article length and label distribution analysis
- notebooks/data_splits_analysis.ipynb: train/val/test splitting strategy
- notebooks/train_marker.ipynb: fine-tunes marker mode
- notebooks/train_qa_m.ipynb: fine-tunes QA-M mode
- notebooks/train_qa_b.ipynb: fine-tunes QA-B mode
- notebooks/train_fasttext.ipynb: trains fastText baseline