HalleluBERT_large_sentiment_analysis
This model is a fine-tuned version of HalleluBERT/HalleluBERT_large for Hebrew sentiment analysis.
The model was trained on the NNLP-IL/HebrewSentiment dataset.
Final evaluation results:
- Loss: 0.3670
- Accuracy: 0.8924
- Macro F1: 0.8918
- Weighted F1: 0.8922
🚀 Use this model
Quickstart with pipeline
from transformers import pipeline

classifier = pipeline(
    task="text-classification",
    model="haimgoldfisher/HalleluBERT_large_sentiment_analysis",
    tokenizer="haimgoldfisher/HalleluBERT_large_sentiment_analysis",
    top_k=None,  # return scores for every label (replaces the deprecated return_all_scores=True)
)

# "The service was excellent and the food was very tasty!"
text = "השירות היה מצוין והאוכל היה טעים מאוד!"
print(classifier(text))
# Example output (scores are illustrative):
# [[{'label': 'positive', 'score': 0.98}, {'label': 'neutral', 'score': 0.01}, {'label': 'negative', 'score': 0.01}]]
Direct loading with AutoModel
For batching, custom thresholds, or export workflows:
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "haimgoldfisher/HalleluBERT_large_sentiment_analysis"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

texts = [
    "השירות היה מצוין והאוכל היה טעים מאוד!",  # "The service was excellent and the food was very tasty!"
    "החוויה הייתה מאכזבת והמחיר היה גבוה מדי.",  # "The experience was disappointing and the price was too high."
    "ההזמנה הגיעה בזמן.",  # "The order arrived on time."
]

inputs = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")

# Inference only: skip gradient tracking
with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1)
preds = probs.argmax(dim=-1)
labels = [model.config.id2label[p.item()] for p in preds]

for text, label, prob in zip(texts, labels, probs):
    print(f"{label}\t({prob.max():.3f})\t{text}")
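The "custom thresholds" use case mentioned above can be sketched without loading the model. The helper below is a hypothetical illustration, not part of any library API: it takes per-label probabilities (for example, one row of `probs` paired with the names in `model.config.id2label`) and returns the top label only when its probability clears a cutoff. The function name, the 0.6 cutoff, and the "uncertain" fallback label are all assumptions chosen for the example.

```python
def apply_threshold(scores, threshold=0.6, fallback="uncertain"):
    """Return the most probable label if it meets the threshold, else the fallback.

    scores: dict mapping label name -> probability (assumed input shape).
    """
    if not scores:
        raise ValueError("scores must contain at least one label")
    best_label = max(scores, key=scores.get)
    return best_label if scores[best_label] >= threshold else fallback


# Made-up probabilities for illustration only
confident = {"positive": 0.98, "neutral": 0.01, "negative": 0.01}
borderline = {"positive": 0.45, "neutral": 0.40, "negative": 0.15}

print(apply_threshold(confident))   # positive
print(apply_threshold(borderline))  # uncertain
```

A suitable cutoff depends on the application; tuning it on a held-out validation set is one reasonable approach.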
GPU / half-precision
HalleluBERT-Large has roughly 355M parameters, so running in fp16 on a GPU is worthwhile and can roughly double throughput, depending on hardware and batch size:
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

device = "cuda" if torch.cuda.is_available() else "cpu"

# Use half precision only on GPU; keep float32 on CPU
model = AutoModelForSequenceClassification.from_pretrained(
    "haimgoldfisher/HalleluBERT_large_sentiment_analysis",
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)
🌐 Deploy
Inference Providers status: This model isn't currently deployed by any Hugging Face Inference Provider (Novita, Together, Hyperbolic, etc.), so the serverless widget on the model page may show "This model isn't deployed by any Inference Provider." You can request provider support here — react with 👍 to the discussion to upvote.
In the meantime, all four options below work today:
Option 1 — Inference Endpoints (recommended, HF-hosted)
Inference Endpoints run on dedicated HF infrastructure and work for any model on the Hub, with no provider listing required. Click Deploy → Inference Endpoints on the model page. If you work from the command line, authenticate first:
huggingface-cli login
Recommended starting config for a Large-size model:
- Hardware: GPU T4 (cost-efficient) or A10G (low-latency)
- CPU fallback: Intel Sapphire Rapids — only for < 10 req/min
- Replicas: 1 (autoscale 1→3)
- Task: text-classification
- Max input length: 128 tokens
Call it once running:
curl https://<your-endpoint>.endpoints.huggingface.cloud \
-H "Authorization: Bearer $HF_TOKEN" \
-H "Content-Type: application/json" \
-d '{"inputs": "השירות היה מצוין והאוכל היה טעים מאוד!"}'
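The same request can be issued from Python. The sketch below uses only the standard library and mirrors the curl call above. The function name, the placeholder endpoint URL, and the placeholder token are assumptions for illustration; the request is built but not sent, so the snippet runs without a live endpoint.

```python
import json
import urllib.request


def build_request(endpoint_url, token, text):
    """Build (but do not send) a POST request equivalent to the curl call."""
    body = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values: substitute your own endpoint URL and HF token.
req = build_request(
    "https://your-endpoint.endpoints.huggingface.cloud",
    "hf_xxx",
    "השירות היה מצוין והאוכל היה טעים מאוד!",
)
print(req.get_method(), req.full_url)

# To send it against a running endpoint:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```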
Option 2 — Docker (self-hosted with TEI)
text-embeddings-inference (TEI) supports BERT/RoBERTa sequence classifiers and is typically among the lowest-latency self-hosted options:
docker run -p 8080:80 \
-v $PWD/data:/data \
--gpus all \
ghcr.io/huggingface/text-embeddings-inference:1.5 \
--model-id haimgoldfisher/HalleluBERT_large_sentiment_analysis
Call it:
curl http://localhost:8080/predict \
-H 'Content-Type: application/json' \
-d '{"inputs": "השירות היה מצוין והאוכל היה טעים מאוד!"}'
Option 3 — Minimal FastAPI server
For full control or to add custom pre/post-processing:
# server.py
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()

# Load the pipeline once at startup so every request reuses the same model.
clf = pipeline(
    "text-classification",
    model="haimgoldfisher/HalleluBERT_large_sentiment_analysis",
    top_k=None,  # scores for every label (replaces the deprecated return_all_scores=True)
    device=0,  # set to -1 for CPU
)


class Payload(BaseModel):
    inputs: str | list[str]  # a single text or a batch of texts


@app.post("/predict")
def predict(p: Payload):
    return clf(p.inputs)
pip install fastapi uvicorn transformers torch
uvicorn server:app --host 0.0.0.0 --port 8080
Option 4 — ONNX / quantized for edge & CPU
A Large model is heavy on CPU — ONNX + INT8 quantization typically cuts latency by 3–4×:
from optimum.onnxruntime import ORTModelForSequenceClassification
from optimum.onnxruntime.configuration import AutoQuantizationConfig
from optimum.onnxruntime import ORTQuantizer
from transformers import AutoTokenizer
model_id = "haimgoldfisher/HalleluBERT_large_sentiment_analysis"
# Export to ONNX
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model.save_pretrained("./onnx-halleluBERT-sentiment")
tokenizer.save_pretrained("./onnx-halleluBERT-sentiment")
# INT8 dynamic quantization
quantizer = ORTQuantizer.from_pretrained("./onnx-halleluBERT-sentiment")
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
quantizer.quantize(save_dir="./onnx-halleluBERT-sentiment-int8", quantization_config=qconfig)
Model description
This model performs sentiment classification for Hebrew text.
It is based on HalleluBERT Large, a RoBERTa-style transformer model pretrained specifically for Hebrew.
The model was fine-tuned for a 3-class sentiment classification task:
- Positive
- Negative
- Neutral
A classification head was added on top of the [CLS] token representation, and the entire model was fine-tuned end-to-end.
Intended uses & limitations
Intended uses
This model is suitable for:
- Sentiment analysis of Hebrew text
- Social media monitoring
- Customer feedback analysis
- Review classification
- General Hebrew NLP research
Limitations
- The model was trained on a specific sentiment dataset and may not generalize perfectly to all domains.
- Performance may degrade on:
  - highly informal slang
  - mixed Hebrew/English text
  - very long documents
- The model assumes single-sentence or short paragraph inputs.
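One possible workaround for long documents, given the short-input assumption above, is to split the text into sentence-sized chunks, classify each chunk, and average the per-label probabilities. The sketch below shows only the splitting and averaging steps in plain Python. Both helper names, the 40-word limit, and the averaging strategy are assumptions for illustration rather than anything the model card prescribes, and the per-chunk scores in the example are made up.

```python
import re


def split_into_chunks(text, max_words=40):
    """Split on sentence punctuation, then pack whole sentences into chunks.

    A chunk is closed before it would exceed max_words; a single sentence
    longer than max_words is kept whole (tokenizer truncation still applies).
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current = [], []
    for sentence in sentences:
        words = sentence.split()
        if current and len(current) + len(words) > max_words:
            chunks.append(" ".join(current))
            current = []
        current.extend(words)
    if current:
        chunks.append(" ".join(current))
    return chunks


def average_scores(per_chunk_scores):
    """Average label probabilities across chunks (each item: dict label -> probability)."""
    n = len(per_chunk_scores)
    return {
        label: sum(scores[label] for scores in per_chunk_scores) / n
        for label in per_chunk_scores[0]
    }


chunks = split_into_chunks("A b c. D e f. G h.", max_words=5)
print(chunks)  # ['A b c.', 'D e f. G h.']

# Made-up per-chunk probabilities standing in for real model output
doc_scores = average_scores([
    {"positive": 0.8, "negative": 0.2},
    {"positive": 0.4, "negative": 0.6},
])
print(max(doc_scores, key=doc_scores.get))  # positive
```

Averaging treats every chunk equally; weighting by chunk length or taking the most frequent chunk label are alternative aggregation choices.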
Training and evaluation data
Training was performed using the HebrewSentiment dataset: https://github.com/NNLP-IL/HebrewSentiment
The dataset contains labeled Hebrew sentences with sentiment annotations.
Dataset characteristics:
- Language: Hebrew
- Task: sentiment classification
- Labels:
  - Positive
  - Negative
  - Neutral
The dataset was split into:
- Training set
- Validation set
Evaluation metrics:
- Accuracy
- Macro F1
- Weighted F1
Macro F1 was used as the primary metric for model selection, since it better reflects performance across imbalanced classes.
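To see why macro F1 is the stricter choice on imbalanced data, the sketch below computes both averages in plain Python. The labels and predictions are invented for illustration and are not drawn from HebrewSentiment. Macro F1 averages the per-class F1 scores equally, while weighted F1 weights each class by its support, so a badly predicted minority class pulls macro F1 down much further.

```python
def per_class_f1(y_true, y_pred, label):
    """F1 for one class, computed from true/false positives and false negatives."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_and_weighted_f1(y_true, y_pred):
    """Return (macro F1, support-weighted F1) over the classes present in y_true."""
    labels = sorted(set(y_true))
    f1 = {lab: per_class_f1(y_true, y_pred, lab) for lab in labels}
    support = {lab: y_true.count(lab) for lab in labels}
    macro = sum(f1.values()) / len(labels)
    weighted = sum(f1[lab] * support[lab] for lab in labels) / len(y_true)
    return macro, weighted


# Toy data: 8 positive, 1 negative, 1 neutral; the single neutral example is misclassified
y_true = ["positive"] * 8 + ["negative", "neutral"]
y_pred = ["positive"] * 8 + ["negative", "positive"]

macro, weighted = macro_and_weighted_f1(y_true, y_pred)
print(f"macro F1 = {macro:.3f}, weighted F1 = {weighted:.3f}")
# macro F1 = 0.647, weighted F1 = 0.853
```

Missing the one neutral example costs a third of the macro score but barely moves the weighted score, which is why macro F1 is the more sensitive signal for model selection here.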
Framework versions
- Transformers 5.7.0
- PyTorch 2.11.0+cu130
- Datasets 4.8.5
- Tokenizers 0.22.2