Sawb — AraBERT Base + Glossary Augmentation (bert-base-arabertv02)

Part of the Sawb Arabic Cultural Hallucination Detection Collection for ICAIRE 2026 Track 3.

Overview

Sawb — AraBERT Base + Glossary is a binary classifier fine-tuned from aubmindlab/bert-base-arabertv02 (125M parameters) on the Sawb dataset augmented with 1,076 examples synthesized from the ICAIRE AI Glossary.

This model explores how glossary-synthesized training data affects a smaller (base) encoder. The augmentation expands the training set from 1,828 to 2,904 examples by adding 1,076 definition-style examples synthesized from the 1,188-term ICAIRE AI Glossary.

Key finding: Adding glossary examples caused a performance regression relative to the same AraBERT base model trained without glossary augmentation: F1 dropped from 0.9599 to 0.9246, a decrease of 0.0353. The regression is attributed to a format mismatch between the definition-style glossary inputs and the conversational QA examples in the original training set. The AraBERT-Large + Glossary model (HassanB4/sawb, 355M parameters) handles this format diversity more robustly.

Model Architecture

| Property | Value |
|---|---|
| Base model | `aubmindlab/bert-base-arabertv02` |
| Architecture | `BertForSequenceClassification` |
| Parameters | 125M |
| Labels | `LABEL_1` = hallucination, `LABEL_0` = not hallucination |
| Max sequence length | 512 tokens |
| Input format | `السؤال: {question}\n\nإجابة النموذج: {answer[:500]}` |
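
The input format above can be wrapped in a small helper. This is a minimal sketch: `format_input` and `max_answer_chars` are hypothetical names, not part of the released code, and the sketch assumes the 500-character cut applies to the answer only, as the `answer[:500]` slice in the template suggests.

```python
def format_input(question: str, answer: str, max_answer_chars: int = 500) -> str:
    """Build the classifier input string from a question and an LLM answer.

    Hypothetical helper that mirrors the input template in the table above.
    """
    # Only the answer is cut to its first 500 characters; the question is kept whole.
    return f"السؤال: {question}\n\nإجابة النموذج: {answer[:max_answer_chars]}"


# A 600-character answer is shortened to 500 characters before tokenization.
example = format_input("ما هو الذكاء الاصطناعي؟", "أ" * 600)
print(example[:60])
```

The character cut happens before tokenization; the 512-token limit is a separate, later truncation applied by the tokenizer.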

Training

| Hyperparameter | Value |
|---|---|
| Training examples | 2,904 (1,828 original + 1,076 glossary-synthesized) |
| Epochs | 3 |
| Learning rate | 2×10⁻⁵ |
| Batch size | 8 per device |
| Gradient accumulation | 4 steps (effective batch: 32) |
| LR schedule | Cosine |
| Optimizer | AdamW |
| Model selection | Best macro F1 on validation set |
| Framework | Hugging Face Transformers |
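
To make the batch-size and schedule rows concrete, the sketch below recomputes the effective batch size and evaluates a plain cosine decay. It rests on assumptions the table does not state: a single training device, no warmup phase, and decay to zero. The step count is hypothetical and chosen only to show the shape of the curve; this is not the training script itself.

```python
import math

# Values taken from the hyperparameter table above.
BASE_LR = 2e-5
PER_DEVICE_BATCH = 8
GRAD_ACCUM_STEPS = 4

# Assumption: one device, so the effective batch is 8 * 4 = 32.
NUM_DEVICES = 1
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES


def cosine_lr(step: int, total_steps: int, base_lr: float = BASE_LR) -> float:
    """Cosine decay from base_lr at step 0 down to 0 at total_steps.

    Assumption: no warmup, which the table does not specify either way.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical number of optimizer steps, for illustration only.
total_steps = 270
for step in (0, 135, 270):
    print(step, f"{cosine_lr(step, total_steps):.2e}")
```

The learning rate starts at the base value, reaches half of it at the midpoint, and ends at zero.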

Evaluation Results

| Metric | Value |
|---|---|
| Macro F1 (validation, θ=0.50) | 0.9246 |
| Task | Binary classification (hallucination / not) |
| Evaluation set | 457 Arabic (question, LLM answer) pairs |
| Optimal threshold | 0.50 (shifted from 0.30 due to glossary distribution) |
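
Macro F1 is the unweighted mean of the per-class F1 scores, so both classes count equally regardless of how many examples each has. The sketch below shows how the score depends on the decision threshold θ. The labels and probabilities are invented toy values, not Sawb validation data, and the function names are hypothetical.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 for one class, computed from true positives, false positives, false negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the F1 scores for class 0 and class 1."""
    return (f1_for_class(y_true, y_pred, 0) + f1_for_class(y_true, y_pred, 1)) / 2


def predict(probs, threshold=0.50):
    """Label 1 (hallucination) when the probability exceeds the threshold."""
    return [1 if p > threshold else 0 for p in probs]


# Toy data for illustration only; not taken from the evaluation set.
y_true = [1, 1, 0, 0]
probs = [0.9, 0.4, 0.2, 0.6]

for theta in (0.30, 0.50):
    print(theta, round(macro_f1(y_true, predict(probs, theta)), 4))
```

Sweeping θ over a validation set in this way is one common method for choosing an operating threshold; the card does not say how the reported value was selected.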

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("HassanB4/sawb-arabert-glossary")
model = AutoModelForSequenceClassification.from_pretrained("HassanB4/sawb-arabert-glossary")
model.eval()

question = "كيف تُطبَّق مبادئ أخلاقيات الذكاء الاصطناعي في القضاء الإسلامي؟"
answer = "يجب تطبيق AI Act الأوروبي على المحاكم الإسلامية..."

text = f"السؤال: {question}\n\nإجابة النموذج: {answer[:500]}"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    logits = model(**inputs).logits

# Index 1 is LABEL_1 (hallucination); index 0 is LABEL_0 (not hallucination).
prob_hallucination = torch.softmax(logits, dim=-1)[0, 1].item()
is_hallucination = prob_hallucination > 0.50  # optimal threshold for this model

print(f"Hallucination probability: {prob_hallucination:.3f}")
print(f"Is hallucination: {is_hallucination}")
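
The softmax step in the snippet above can be checked by hand. The dependency-free sketch below computes the two-class softmax and shows that, with a threshold of 0.50, the decision is the same as picking the larger logit. The logit values and function names are hypothetical, chosen only for illustration.

```python
import math


def hallucination_probability(logit_0: float, logit_1: float) -> float:
    """Two-class softmax probability of LABEL_1 given the two raw logits."""
    # Subtract the larger logit first for numerical stability.
    m = max(logit_0, logit_1)
    e0 = math.exp(logit_0 - m)
    e1 = math.exp(logit_1 - m)
    return e1 / (e0 + e1)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical logits for LABEL_0 and LABEL_1.
logit_0, logit_1 = -1.2, 0.8
prob = hallucination_probability(logit_0, logit_1)

# For two classes, softmax reduces to a sigmoid of the logit difference.
print(f"{prob:.3f}", f"{sigmoid(logit_1 - logit_0):.3f}")

# At a 0.50 threshold the decision matches "logit_1 is the larger logit".
print(prob > 0.50, logit_1 > logit_0)
```

Any other threshold, such as 0.30, no longer coincides with the larger-logit rule, which is why the snippet above thresholds the probability explicitly.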