Sawb — DeepSeek-R1-Distill-Llama-8B (LoRA SFT — Explanation Model)
Part of the Sawb Arabic Cultural Hallucination Detection Collection for ICAIRE 2026 Track 3.
Overview
Sawb — DeepSeek-R1 is a LoRA-adapted generative model fine-tuned to produce structured Arabic explanations for cultural hallucinations detected in LLM outputs. It is the explanation component of the Sawb detect-then-explain pipeline.
The Sawb pipeline works as follows:
- Detection: The HassanB4/sawb model (AraBERT-Large + Glossary, 355M params) classifies each (Arabic question, LLM answer) pair as hallucination or not
- Explanation: For detected hallucinations, this model generates a case-specific Arabic explanation citing exact phrases from the LLM's answer to explain why it is culturally incorrect
This model is fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Llama-8B (8B parameters) using supervised fine-tuning (SFT) on the Sawb Arabic Cultural Hallucination Dataset.
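The two-stage flow described above can be sketched as a small piece of control logic. This is only an illustration: `detect_then_explain`, `toy_detect`, and `toy_explain` are hypothetical names, and the toy functions stand in for the real detector and this explanation model so the flow can be run without loading any weights.

```python
from typing import Callable, Optional

# Hypothetical callables standing in for the two Sawb models.
Detector = Callable[[str, str], bool]   # (question, answer) -> is_hallucination
Explainer = Callable[[str, str], str]   # (question, answer) -> explanation JSON text


def detect_then_explain(
    question: str,
    answer: str,
    detect: Detector,
    explain: Explainer,
) -> Optional[str]:
    """Run the explainer only when the detector flags the pair."""
    if not detect(question, answer):
        return None  # nothing to explain for answers that pass detection
    return explain(question, answer)


# Toy stand-ins (assumptions, not the real models) to exercise the control flow.
def toy_detect(question: str, answer: str) -> bool:
    return "AI Act" in answer


def toy_explain(question: str, answer: str) -> str:
    return '{"is_hallucination": true}'
```

Keeping the 8B explainer behind the much smaller detector means the expensive generation step runs only for flagged pairs.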
Model Architecture
| Property | Value |
|---|---|
| Base model | deepseek-ai/DeepSeek-R1-Distill-Llama-8B |
| Fine-tuning method | LoRA (Low-Rank Adaptation) |
| LoRA rank (r) | 16 |
| LoRA alpha (α) | 32 |
| LoRA dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Task type | Causal Language Modeling |
| Parameters (base) | 8B |
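To give a feel for what these settings imply, the following sketch estimates the number of trainable LoRA parameters and the LoRA scaling factor. The projection shapes are an assumption taken from the Llama-3.1-8B configuration that the base model inherits (hidden size 4096, MLP size 14336, 8 key/value heads of dimension 128, 32 decoder layers); they are not stated in this card.

```python
# Assumed Llama-3.1-8B dimensions (not listed in this model card).
HIDDEN, MLP, KV, LAYERS = 4096, 14336, 8 * 128, 32

# (in_features, out_features) of each targeted projection in one decoder layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV),
    "v_proj": (HIDDEN, KV),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}

RANK, ALPHA = 16, 32  # values from the table above


def lora_params(rank: int, shapes: dict, layers: int) -> int:
    """Each adapted matrix adds A (rank x in) and B (out x rank)."""
    per_layer = sum(rank * (n_in + n_out) for n_in, n_out in shapes.values())
    return per_layer * layers


trainable = lora_params(RANK, TARGET_SHAPES, LAYERS)
scaling = ALPHA / RANK  # factor applied to the low-rank update
print(trainable, scaling)
```

Under these assumptions the adapter holds roughly 42 million trainable parameters, about 0.5% of the 8B base, and the low-rank update is scaled by 2.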
Training
| Hyperparameter | Value |
|---|---|
| Training examples | 1,828 |
| Method | Supervised Fine-Tuning (SFT) |
| Framework | PEFT + TRL |
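For readers who want to reproduce a similar setup, the adapter settings listed under Model Architecture map onto a PEFT `LoraConfig` roughly as shown below. This is a sketch reconstructed from the tables, not the author's training script; learning rate, epochs, batch size, and other trainer arguments are not stated in this card and are deliberately left out.

```python
from peft import LoraConfig

# Adapter settings copied from the Model Architecture table.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```

A config like this is typically passed to a TRL SFT trainer through its `peft_config` argument.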
Output Format
The model generates structured JSON with a case-specific Arabic explanation:
{
  "is_hallucination": true,
  "category": "religious_misrepresentation",
  "explanation_ar": "استشهدت الإجابة بحديث 'الخالق هو الله وحده، وما سواه مخلوق لا يملك خلقاً' ونسبته للنبي صلى الله عليه وسلم، وهذا حديث مكذوب لا أصل له في كتب السنة.",
  "confidence": 0.9
}
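Generated text is not guaranteed to be clean JSON, so downstream code usually needs a tolerant parser. The sketch below is one possible approach: it assumes the model may emit a `<think>...</think>` reasoning block or stray text around the object (common for R1-style models, but not confirmed for this adapter), and `extract_json` is a hypothetical helper name.

```python
import json
import re
from typing import Any, Dict, Optional


def extract_json(response: str) -> Optional[Dict[str, Any]]:
    """Return the first-to-last brace span parsed as a JSON object, or None."""
    # Drop a reasoning block if one is present (assumption about the output).
    text = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL)
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        parsed = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None
```

Returning `None` instead of raising lets a batch job log and skip malformed generations.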
Hallucination Categories
| Category | Description |
|---|---|
| ethical_framework_mismatch | Applies EU AI Act / GDPR instead of Maqasid al-Shariah |
| religious_misrepresentation | Fabricated or unverifiable hadith, inaccurate Islamic rulings |
| historical_inaccuracy | Omits Arab AI contributions (KACST, SDAIA, MBZUAI, Vision 2030) |
| social_norms_violation | Applies Western social standards ignoring Gulf/Islamic norms |
| dialectal_confusion | Responds in wrong dialect or refuses the requested dialect |
| regional_context_errors | Uses Western examples in a Saudi/Gulf-specific context |
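Once an output has been parsed into a dictionary, it can be checked against the fields shown under Output Format and the six categories above. The following is a minimal sketch: `validate_explanation` is a hypothetical helper, and the requirement that `confidence` lies in [0, 1] is an assumption based on the single example value of 0.9.

```python
from typing import Any, Dict, List

# The six category labels from the table above.
CATEGORIES = (
    "ethical_framework_mismatch",
    "religious_misrepresentation",
    "historical_inaccuracy",
    "social_norms_violation",
    "dialectal_confusion",
    "regional_context_errors",
)


def validate_explanation(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the record looks valid."""
    problems = []
    if not isinstance(record.get("is_hallucination"), bool):
        problems.append("is_hallucination must be a boolean")
    if record.get("category") not in CATEGORIES:
        problems.append("category is not one of the six Sawb categories")
    explanation = record.get("explanation_ar")
    if not isinstance(explanation, str) or not explanation.strip():
        problems.append("explanation_ar must be a non-empty string")
    confidence = record.get("confidence")
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    # Assumed range: the card only shows one example value (0.9).
    if not is_number or not 0.0 <= confidence <= 1.0:
        problems.append("confidence must be a number in [0, 1]")
    return problems
```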
Usage
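from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

base_model = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"
adapter = "HassanB4/sawb-deepseek-r1"

# Load the base model in half precision, then attach the LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)
model.eval()

# System prompt (English gloss): "You are an arbiter specialized in detecting
# cultural hallucination in large language models. Your task: analyze a
# (question, answer) pair and explain why the answer is a cultural hallucination
# by citing specific phrases from the answer. Output your answer as JSON only."
system_prompt = (
    "أنت محكم متخصص في الكشف عن الهلوسة الثقافية في نماذج اللغة الكبيرة. "
    "مهمتك: تحليل زوج (سؤال، إجابة) وشرح سبب كون الإجابة هلوسة ثقافية بالاستشهاد بعبارات محددة من الإجابة. "
    "أخرج إجابتك بتنسيق JSON فقط."
)

# Question: "How are AI ethics principles applied in the Islamic judiciary?"
question = "كيف تُطبَّق مبادئ أخلاقيات الذكاء الاصطناعي في القضاء الإسلامي؟"
# Answer: "The European AI Act must be applied to Islamic courts to ensure transparency..."
answer = "يجب تطبيق AI Act الأوروبي على المحاكم الإسلامية لضمان الشفافية..."

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": f"السؤال: {question}\n\nإجابة النموذج: {answer}"},
]

# add_generation_prompt=True opens the assistant turn so the model answers
# instead of continuing the user message; return_dict=True also returns the
# attention mask.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.1, do_sample=True)

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
print(response)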
Dataset
Trained on HassanB4/sawb-arabic-hallucination-dataset.