ESPRIT-Derja-Qwen3-8B-v2

The first bilingual Arabic/Arabizi instruct model for the Tunisian dialect (Derja).

Developed by the Artificial Intelligence Directorate, ESPRIT School of Engineering, Tunis, Tunisia.

Description

ESPRIT-Derja-Qwen3-8B-v2 is a language model fine-tuned to understand and generate the Tunisian dialect (Derja) in both Arabic script and Arabizi (Latin transliteration). It is built on Qwen3-8B (instruct) and fine-tuned with LoRA on a dataset of 7,013 bilingual instruction examples.

The model responds naturally in authentic Tunisian dialect and covers everyday conversation, culture, cooking, technology, sports, and Tunisian idiomatic expressions.

Features

Bilingual Arabic/Arabizi: understands and responds in Tunisian Arabic script and in Arabizi (3=ع, 7=ح, 9=ق, 5=خ, 8=غ, 2=ء)
Authentic dialect: Tunisian vocabulary (barcha, yezzi, kifech, 3lech, ya5i, bech, mta3, 5ater)
Instruction following: follows instructions and responds conversationally
8K-token context: supports long conversations
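
The digit convention in the feature list can be made concrete with a small sketch. It only substitutes the six digits listed above and leaves every other character untouched, so it is an illustration of the mapping rather than a real transliterator; the function name `expand_arabizi_digits` is hypothetical and not part of the model or its tooling.

```python
# Digit-to-letter mapping copied from the feature list above.
ARABIZI_DIGITS = {
    "3": "ع",
    "7": "ح",
    "9": "ق",
    "5": "خ",
    "8": "غ",
    "2": "ء",
}


def expand_arabizi_digits(text: str) -> str:
    """Replace Arabizi digits with their Arabic letters.

    Tokens made only of digits (for example a year) are treated as
    real numbers and left unchanged. This is a simplifying assumption.
    """
    out = []
    for token in text.split(" "):
        if token.isdigit():
            out.append(token)
        else:
            out.append("".join(ARABIZI_DIGITS.get(ch, ch) for ch in token))
    return " ".join(out)


# Example: the digits inside words are expanded, the year is kept.
print(expand_arabizi_digits("3lech ya5i 2024"))
```

Latin letters are deliberately left as they are: a full Arabizi-to-Arabic conversion would need vowel and context handling that this sketch does not attempt.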

Examples

Arabizi:

User: chnou ma3netha barcha?
Assistant: barcha ta3ni kther, w nesta3mlouha lamma n7ebou ngoulou illi fi 7aja bzaaf.

Tunisian Arabic:

User: كيفاش نطيبو كسكسي بالعلوش؟
Assistant: الكسكسي بالعلوش لازمك تخلي اللحم يتقلى بالزيت الزيتون والبصلة، وبعد تحطلو الطماطم والفلفل والتوابل. الكسكسي تفتلو وتعديه بالبخار مرتين، وبعد تزيد عليه المرقة..

Technical details

Parameter	Value
Base model	Qwen/Qwen3-8B (instruct)
Method	LoRA (Low-Rank Adaptation)
LoRA rank	32
LoRA alpha	64
Target modules	q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Epochs	10
Effective batch size	16 (4 x 4 gradient accumulation steps)
Learning rate	2e-4 (cosine schedule)
Precision	bfloat16
Final loss	0.012
Training time	~6h on NVIDIA A100 80GB
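
The numbers in the table imply a few derived quantities. The sketch below works them out under stated assumptions: that the "4 x 4" entry means a batch of 4 with 4 gradient accumulation steps, that the standard LoRA scaling of alpha / rank applies, and that the last partial batch of each epoch is kept. The step counts are therefore estimates, not values reported by the authors.

```python
import math

# Values copied from the table above.
lora_rank = 32
lora_alpha = 64
per_step_batch = 4      # assumption: first factor of "4 x 4"
grad_accum_steps = 4    # assumption: second factor of "4 x 4"
epochs = 10
num_examples = 7013     # dataset size stated in this card

# Standard LoRA multiplies the low-rank update by alpha / rank.
lora_scaling = lora_alpha / lora_rank

# One optimizer step sees this many examples.
effective_batch = per_step_batch * grad_accum_steps

# Assumption: the final partial batch of each epoch is not dropped.
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * epochs

print(f"LoRA scaling:    {lora_scaling}")
print(f"Effective batch: {effective_batch}")
print(f"Steps per epoch: {steps_per_epoch}")
print(f"Total steps:     {total_steps}")
```

With these assumptions the update is scaled by 2.0 and training runs for roughly 4,390 optimizer steps.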

Dataset: ESPRIT-Derja-Instruct

The training dataset contains 7,013 instruction examples, built in two phases:

GPT-4o distillation (3,494 pairs): from a sample of 3,000 texts drawn from the tunisian-derja-unified-raw-corpus (802K raw texts), GPT-4o generated instruction/response pairs in both Tunisian Arabic and Arabizi. Each pair yields 2 examples (Arabic + Arabizi), for 6,988 examples.
Manual data (25 examples): conversational examples written by hand to anchor the style.
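
The two-phase composition can be checked with a few lines of arithmetic, together with a sketch of how one distilled pair might expand into two training examples. The record layout and field names (`DistilledPair`, `script`, `instruction`, `response`) are hypothetical: the card does not describe the dataset schema, and the placeholder strings stand in for real text.

```python
from dataclasses import dataclass

DISTILLED_PAIRS = 3494   # from the card
MANUAL_EXAMPLES = 25     # from the card


@dataclass
class DistilledPair:
    # Hypothetical schema: one instruction/response in each script.
    instruction_arabic: str
    response_arabic: str
    instruction_arabizi: str
    response_arabizi: str


def expand(pair: DistilledPair) -> list[dict]:
    """Turn one bilingual pair into two single-script training examples."""
    return [
        {
            "script": "arabic",
            "instruction": pair.instruction_arabic,
            "response": pair.response_arabic,
        },
        {
            "script": "arabizi",
            "instruction": pair.instruction_arabizi,
            "response": pair.response_arabizi,
        },
    ]


# Each pair yields 2 examples; the manual examples are added on top.
distilled_examples = DISTILLED_PAIRS * 2
total_examples = distilled_examples + MANUAL_EXAMPLES

placeholder = DistilledPair(
    "<instruction in Arabic script>",
    "<response in Arabic script>",
    "<instruction in Arabizi>",
    "<response in Arabizi>",
)
print(len(expand(placeholder)), distilled_examples, total_examples)
```

The totals reproduce the figures in the card: 6,988 distilled examples plus 25 manual ones gives 7,013.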

The dataset will be published separately on Hugging Face.

Infrastructure

Trained and deployed on ESPRIT's sovereign infrastructure:

Hardware: NVIDIA DGX A100 (1 A100 80GB GPU)
Inference: vLLM 0.9.2 on the ESPRIT AI Forge platform
API: OpenAI-compatible (same endpoints, same SDKs)

Usage

With vLLM

VLLM_USE_V1=0 vllm serve ESPRIT-Group/ESPRIT-Derja-Qwen3-8B-v2 \
  --gpu-memory-utilization 0.40 --max-model-len 8192 --port 8009 \
  --served-model-name ESPRIT-Derja-Qwen3-8B-v2 --trust-remote-code
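
Once the server started by the command above is running, it can be queried through the OpenAI-compatible chat completions route. The sketch below uses only the Python standard library. It assumes the server is reachable on localhost; the port and model name come from the `--port` and `--served-model-name` flags above, and the sampling values mirror those used elsewhere in this card. The helper names `build_chat_request` and `chat` are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8009/v1"     # assumption: server runs locally
MODEL_NAME = "ESPRIT-Derja-Qwen3-8B-v2"   # matches --served-model-name


def build_chat_request(user_text, system_prompt=None,
                       max_tokens=300, temperature=0.7):
    """Build a POST request for the /chat/completions route."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_text})
    payload = {
        "model": MODEL_NAME,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_text, system_prompt=None):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(user_text, system_prompt)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `chat("chnou ma3netha barcha?")` against a running server should return the model's reply as a string. Building the request separately from sending it keeps the payload easy to inspect without a live server.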

With transformers

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the weights in bfloat16; device_map="auto" places them on the available device(s).
model = AutoModelForCausalLM.from_pretrained(
    "ESPRIT-Group/ESPRIT-Derja-Qwen3-8B-v2",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("ESPRIT-Group/ESPRIT-Derja-Qwen3-8B-v2")

# The system prompt sets the persona and asks for replies in Tunisian dialect.
messages = [
    {"role": "system", "content": "Inti ESPRIT-Derja, modele IA ta3 ESPRIT Tunisie. Toujours jaweb bel dialecte tounsi."},
    {"role": "user", "content": "chnou ma3netha barcha?"}
]

# Render the conversation with the model's chat template, then tokenize it.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# Sample up to 300 new tokens.
outputs = model.generate(**inputs, max_new_tokens=300, temperature=0.7, do_sample=True)
# Decoding outputs[0] prints the prompt followed by the generated reply.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Recommended system prompt

Inti ESPRIT-Derja, modele IA ta3 ESPRIT Tunisie. Toujours jaweb bel dialecte tounsi (darija tunisienne).
Ista3mel l-arabizi (3=ع, 7=ح, 9=ق, 5=خ, 8=غ) walla l-3arbiya tounsiya 7asb ki yekteblek el user.
Jaweb b tari9a tabi3iya, kima tounsi ya7ki m3a sa7bou.
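
The prompt above asks the model to answer in whichever script the user writes in. A rough way to check that behaviour on sample outputs is to compare the dominant script of the question and the reply. The sketch below is a heuristic under stated assumptions: it counts characters in the basic Arabic Unicode block (U+0600 to U+06FF) against ASCII letters, and it cannot tell Arabizi from other Latin-script text such as French. The function names are hypothetical.

```python
def detect_script(text: str) -> str:
    """Return "arabic", "latin", or "unknown" based on character counts."""
    arabic = sum(1 for ch in text if "\u0600" <= ch <= "\u06FF")
    latin = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    if arabic == 0 and latin == 0:
        return "unknown"
    return "arabic" if arabic >= latin else "latin"


def reply_matches_user_script(user_text: str, reply_text: str) -> bool:
    """True when the question and the reply share the same dominant script."""
    return detect_script(user_text) == detect_script(reply_text)


# Examples taken from this card.
print(detect_script("chnou ma3netha barcha?"))
print(detect_script("كيفاش نطيبو كسكسي بالعلوش؟"))
```

Arabizi digits such as 3 or 7 are ignored by the count, so a short Arabizi message is still classified by its Latin letters.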

Limitations

The model may occasionally mix in vocabulary from other Maghrebi dialects (Algerian, Moroccan)
Factual answers are not reliable: the model is optimized for conversation in dialect, not for factual accuracy
The 7K-example dataset is modest; a v3 with 20K+ examples and DPO is planned
Some invented words may appear (lexical hallucinations)

Versions

Version	Base	Dataset	Method	Date
v1	Qwen3-8B	407 examples	LoRA r=16	June 2026
v2 (current)	Qwen3-8B	7,013 examples	LoRA r=32	June 2026
v3 (planned)	Qwen3-8B	20K+ examples	Continued pretraining + SFT + DPO	June 2026

Citation

@misc{esprit-derja-2026,
  title={ESPRIT-Derja-Qwen3-8B-v2: Arabizi-Aware Instruction Tuning for Tunisian Arabic},
  author={Zerai, Mourad},
  year={2026},
  publisher={ESPRIT School of Engineering},
  url={https://huggingface.co/ESPRIT-Group/ESPRIT-Derja-Qwen3-8B-v2}
}

License

Apache 2.0

Contact

Mourad Zerai - Artificial Intelligence Directorate, ESPRIT School of Engineering, Tunis
Hugging Face organization: ESPRIT-Group
