AlephBERT Hebrew Intent Classifier · ONNX
ONNX-optimized variant of spivi87/alephbert-intent-he
for ~50 ms CPU inference. Identical weights and labels — just packaged as a
runtime-portable graph.
For most users: prefer the PyTorch repo
plus transformers.pipeline("text-classification", ...). It's a 5-line snippet,
and Hugging Face handles tokenization and label mapping for you. Use this repo
when you need predictable CPU latency (production webhooks, edge devices,
free-tier servers).
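The PyTorch route recommended above can be sketched roughly as follows. This is a minimal sketch, assuming `transformers` and a PyTorch backend are installed and that the model ID spivi87/alephbert-intent-he resolves on the Hub; the helper name `load_intent_pipeline` is hypothetical and not part of either repo.

```python
def load_intent_pipeline(model_id="spivi87/alephbert-intent-he"):
    """Build a text-classification pipeline for the PyTorch checkpoint.

    Hypothetical helper for illustration; the model ID is taken from this card.
    """
    # Imported lazily so the heavy dependency is only required when called.
    from transformers import pipeline

    # The pipeline loads the tokenizer, model, and id2label mapping itself.
    return pipeline("text-classification", model=model_id)
```

Calling `load_intent_pipeline()("תוסיף חלב וביצים")` downloads the model on first use and returns a list of dicts with `label` and `score` keys.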
Usage
import numpy as np
import onnxruntime as ort
from tokenizers import Tokenizer
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
tok = Tokenizer.from_file("tokenizer.json")
enc = tok.encode("תוסיף חלב וביצים")
# IMPORTANT: use enc.attention_mask — the tokenizer pads to 128 by default,
# so a naive all-ones mask attends to PAD tokens and tanks accuracy to ~16%.
input_ids = np.array([enc.ids], dtype=np.int64)
attention_mask = np.array([enc.attention_mask], dtype=np.int64)
logits = session.run(
    None, {"input_ids": input_ids, "attention_mask": attention_mask}
)[0]
probs = np.exp(logits[0] - logits[0].max())
probs /= probs.sum()
# id2label is in config.json (same as the PyTorch repo)
import json
with open("config.json", encoding="utf-8") as f:  # close the file after reading
    config = json.load(f)
id2label = {int(k): v for k, v in config["id2label"].items()}
print(id2label[int(np.argmax(probs))]) # → GROCERY_REQUEST
print(f"confidence: {float(probs.max()):.3f}")
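The warning about PAD tokens in the snippet above can be illustrated with numpy alone. `enc.attention_mask` remains the preferred source; the sketch below only shows what a correct mask looks like if all you have is padded IDs. It assumes the PAD token has ID 0, which is common for BERT vocabularies but should be checked against tokenizer.json. The helper name `mask_from_ids` and the token IDs are made up for illustration.

```python
import numpy as np

# Assumption: PAD has token ID 0. Verify against tokenizer.json before use.
PAD_ID = 0


def mask_from_ids(input_ids, pad_id=PAD_ID):
    """Return 1 for real tokens and 0 for PAD positions (int64, same shape)."""
    ids = np.asarray(input_ids, dtype=np.int64)
    return (ids != pad_id).astype(np.int64)


# Made-up token IDs: three real tokens followed by two PAD positions.
ids = np.array([[2, 1045, 3, 0, 0]], dtype=np.int64)
print(mask_from_ids(ids))  # [[1 1 1 0 0]]
print(np.ones_like(ids))   # [[1 1 1 1 1]]  (the naive mask that attends to PAD)
```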
Batched inference
texts = ["תוסיף חלב", "מה ברשימה?", "סיימתי קניות"]
encs = tok.encode_batch(texts)
input_ids = np.array([e.ids for e in encs], dtype=np.int64)
attention_mask = np.array([e.attention_mask for e in encs], dtype=np.int64)
logits = session.run(
    None, {"input_ids": input_ids, "attention_mask": attention_mask}
)[0]
preds = np.argmax(logits, axis=-1)
print([id2label[int(p)] for p in preds])
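The batched example above stops at argmax. If per-text confidences are also wanted, as in the single-text example, one possible approach is a row-wise softmax. This is a sketch under stated assumptions: the helper names `softmax` and `top_predictions` are hypothetical, and the logits and the two-entry label map in the usage lines are made-up values, not model output (only GROCERY_REQUEST appears in this card; OTHER is a placeholder).

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def top_predictions(logits, id2label):
    """Return one (label, confidence) pair per row of `logits`."""
    probs = softmax(logits)
    best = probs.argmax(axis=-1)
    return [(id2label[int(i)], float(row[i])) for row, i in zip(probs, best)]


# Made-up logits and a placeholder label map, for illustration only.
demo_labels = {0: "GROCERY_REQUEST", 1: "OTHER"}
demo_logits = np.array([[2.0, 0.0], [0.0, 3.0]])
for label, conf in top_predictions(demo_logits, demo_labels):
    print(f"{label}: {conf:.3f}")
```

With the real model, pass the `logits` array returned by `session.run` and the `id2label` dict loaded from config.json.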
Performance
On Apple M3 (CPU, ONNX Runtime 1.x): ~50 ms per single inference, with latency
scaling linearly with batch size. See spivi87/alephbert-intent-he
for accuracy / F1. The ONNX export is validated to match PyTorch logits within
atol=1e-4 on the test sentences.
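A parity check of this kind could look like the sketch below. It is not the author's validation script (that lives in the GitHub repo); it only shows one way to compare two logit arrays at an absolute tolerance. Setting `rtol=0` is an assumption, since the card states only `atol`, and the function names and sample arrays are hypothetical.

```python
import numpy as np


def max_abs_diff(onnx_logits, torch_logits):
    """Largest element-wise absolute difference between the two arrays."""
    a = np.asarray(onnx_logits, dtype=np.float64)
    b = np.asarray(torch_logits, dtype=np.float64)
    return float(np.max(np.abs(a - b)))


def logits_match(onnx_logits, torch_logits, atol=1e-4):
    """True if shapes agree and every logit differs by at most `atol`."""
    a = np.asarray(onnx_logits, dtype=np.float64)
    b = np.asarray(torch_logits, dtype=np.float64)
    if a.shape != b.shape:
        return False
    # rtol=0.0 is an assumption: the tolerance is purely absolute.
    return bool(np.allclose(a, b, rtol=0.0, atol=atol))


# Made-up logits, for illustration only (not model output).
reference = np.array([[1.0, -2.0, 0.5]])
close = reference + 5e-5   # within atol=1e-4
far = reference + 1e-3     # outside atol=1e-4
print(logits_match(close, reference), logits_match(far, reference))
```

In practice, `onnx_logits` would come from `session.run` and `torch_logits` from the PyTorch model on the same tokenized inputs.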
Attribution & License
Apache 2.0. Built on onlplab/alephbert-base
(also Apache 2.0). See the GitHub repo
for the full reproducible recipe.