BGMISentiment - gold_v3 (ONNX)

3-class sentiment classifier for BGMI / Indian-esports YouTube live chat: fast, code-mixed (Hindi + English + Hinglish) gaming chat. Fine-tuned from distilbert-base-multilingual-cased and exported to ONNX (fp32 + INT8) for cheap CPU serving.

  • Classes: 0 negative · 1 neutral · 2 positive. signed = P(pos) − P(neg).
  • Trained on: ~208k silver rows (random 500/video × 416 streams + slang/emoji prior), combined with a human-reviewed gold set, 70/15/15 split, with gold-distribution class weights [neg 1.2, neu 1.5, pos 0.5].
  • Gold eval (n=666), calibrated (NEU_GATE=0.55): accuracy 0.84, macro-F1 0.75, neutral recall 0.71, positive recall 0.96.
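
The signed score defined above can be computed directly from the class probabilities. The following is a minimal sketch, assuming `logits` has shape (batch, 3) in the class order 0 negative, 1 neutral, 2 positive. The helper names `softmax` and `signed_score` and the example logits are illustrative and not part of this repo.

```python
import numpy as np

NEG, NEU, POS = 0, 1, 2  # class indices from the list above


def softmax(logits: np.ndarray) -> np.ndarray:
    """Row-wise softmax, shifted by the row max for numerical stability."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def signed_score(probs: np.ndarray) -> np.ndarray:
    """signed = P(pos) - P(neg), one value in [-1, 1] per message."""
    return probs[:, POS] - probs[:, NEG]


# Made-up logits for three messages (illustration only, not model output).
example_logits = np.array([[2.0, 0.0, -2.0],
                           [0.0, 0.0, 0.0],
                           [-1.0, 0.5, 3.0]])
probs = softmax(example_logits)
signed = signed_score(probs)
print(signed)
```

A score near -1 indicates strongly negative chat, near +1 strongly positive, and values near 0 cover both neutral messages and messages where positive and negative mass cancel out.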

Files

Path                                      Size    Use
model.onnx                                541 MB  fp32 ONNX (self-contained)
onnx_int8/model_quantized.onnx            136 MB  INT8, the deployed artifact (p50 ~7 ms CPU)
tokenizer.json, vocab.txt, *config*.json  -       WordPiece tokenizer + config
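
To check the p50 latency quoted for the INT8 model on your own hardware, a timing loop along these lines may help. This is a sketch, not the benchmark behind the number above: the helper `p50_latency_ms`, the warm-up and run counts, and the stand-in workload are all assumptions. In practice `fn` would wrap a `sess.run(...)` call on a representative batch.

```python
import statistics
import time
from typing import Callable


def p50_latency_ms(fn: Callable[[], object], warmup: int = 10, runs: int = 100) -> float:
    """Median wall-clock time of fn() in milliseconds over `runs` calls."""
    for _ in range(warmup):  # untimed calls so one-off setup cost is excluded
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Stand-in workload so the sketch runs without the model files.
latency = p50_latency_ms(lambda: sum(range(1000)), warmup=2, runs=20)
print(f"p50: {latency:.3f} ms")
```

Results depend on batch size, sequence length, and thread settings, so a measurement is only comparable to the table when those match the serving setup.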

Usage (ONNX Runtime)

import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("<this-repo>")
sess = ort.InferenceSession("onnx_int8/model_quantized.onnx", providers=["CPUExecutionProvider"])
enc = tok(["soul clutch insane", "godlike choked again"], padding=True, truncation=True,
          max_length=64, return_tensors="np")
# Feed only the inputs the ONNX graph actually declares.
input_names = {i.name for i in sess.get_inputs()}
logits = sess.run(None, {k: v for k, v in enc.items() if k in input_names})[0]
# Softmax over the class axis, shifted by the row max for numerical stability.
z = logits - logits.max(-1, keepdims=True)
p = np.exp(z) / np.exp(z).sum(-1, keepdims=True)
print(p.argmax(-1))  # raw argmax, no neutral gate: 0 neg / 1 neu / 2 pos

Notes & limitations

  • Tuned for short, slangy, romanized-Hindi esports chat; not a general sentiment model.
  • Domain slang is blended at inference (e.g. "demon"/"goat" = praise). Serving applies a calibrated neutral gate; raw argmax over-predicts neutral.
  • Dataset: companion BGMI Live-Chat Sentiment dataset on Kaggle.
  • Source data = public YouTube live chat; non-commercial / research use.
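
The exact gating rule used in serving is not spelled out here. One plausible reading, sketched below purely as an assumption, is that a message is labelled neutral only when P(neu) reaches the gate (0.55 in the gold eval), and otherwise falls to whichever of negative or positive is more likely. The function `gated_predict` and the example probabilities are hypothetical.

```python
import numpy as np

NEG, NEU, POS = 0, 1, 2
NEU_GATE = 0.55  # value quoted for the gold eval; how serving applies it is assumed


def gated_predict(probs: np.ndarray, gate: float = NEU_GATE) -> np.ndarray:
    """Assumed rule: keep neutral only if P(neu) >= gate, else pick neg vs pos."""
    polar = np.where(probs[:, POS] >= probs[:, NEG], POS, NEG)
    return np.where(probs[:, NEU] >= gate, NEU, polar)


# Made-up probabilities (illustration only, not model output).
probs = np.array([[0.10, 0.60, 0.30],   # confident neutral: stays neutral
                  [0.15, 0.45, 0.40],   # weak neutral: becomes positive
                  [0.42, 0.48, 0.10]])  # weak neutral: becomes negative
preds = gated_predict(probs)
print(preds)             # [1 2 0]
print(probs.argmax(-1))  # [1 1 1], raw argmax labels every row neutral
```

Under this reading, the gate trades some neutral recall for fewer polar messages being absorbed into the neutral class.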