Instructions for using AyoubChLin/lfm2.5-8b-saudi-dialect with libraries and local apps.

Libraries

How to use AyoubChLin/lfm2.5-8b-saudi-dialect with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="AyoubChLin/lfm2.5-8b-saudi-dialect")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("AyoubChLin/lfm2.5-8b-saudi-dialect")
model = AutoModelForCausalLM.from_pretrained("AyoubChLin/lfm2.5-8b-saudi-dialect")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use AyoubChLin/lfm2.5-8b-saudi-dialect with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "AyoubChLin/lfm2.5-8b-saudi-dialect"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AyoubChLin/lfm2.5-8b-saudi-dialect",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
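
The same request can be made from Python. The sketch below uses only the standard library and mirrors the curl call above; the helper names `build_chat_request` and `send_chat_request` are hypothetical, and the response parsing assumes the server replies in the usual OpenAI chat-completions shape.

```python
import json
import urllib.request

MODEL_ID = "AyoubChLin/lfm2.5-8b-saudi-dialect"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the assistant's reply text.

    Assumption: the server is already running and answers in the
    OpenAI chat-completions response format.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]
```

With the server running, `send_chat_request(build_chat_request("What is the capital of France?"))` should return the reply text. Passing a different `base_url` targets a server on another port.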

Use Docker

docker model run hf.co/AyoubChLin/lfm2.5-8b-saudi-dialect

SGLang

How to use AyoubChLin/lfm2.5-8b-saudi-dialect with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "AyoubChLin/lfm2.5-8b-saudi-dialect" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AyoubChLin/lfm2.5-8b-saudi-dialect",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "AyoubChLin/lfm2.5-8b-saudi-dialect" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AyoubChLin/lfm2.5-8b-saudi-dialect",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner

How to use AyoubChLin/lfm2.5-8b-saudi-dialect with Docker Model Runner:

```
docker model run hf.co/AyoubChLin/lfm2.5-8b-saudi-dialect
```

LFM2.5-8B Saudi Dialect

AyoubChLin/lfm2.5-8b-saudi-dialect is a Saudi Arabic conversational fine-tune of LiquidAI/LFM2.5-8B-A1B.

The model was fine-tuned to produce more natural Saudi dialect responses in chat-style conversations. It is intended for Arabic dialogue, informal Saudi phrasing, and assistant-style responses using a Saudi Arabic system prompt.

Model Details

Field	Value
Base model	`LiquidAI/LFM2.5-8B-A1B`
Fine-tuned model	`AyoubChLin/lfm2.5-8b-saudi-dialect`
Dataset	`HeshamHaroon/saudi-dialect-conversations`
Dataset size	3,545 examples
Train split	3,474 examples
Evaluation split	71 examples
Fine-tuning method	Supervised fine-tuning with LoRA
Final format	Merged model
Precision	bf16
Quantization	None
Max sequence length	10,244 tokens
Language	Arabic
Dialect focus	Saudi Arabic
License	Apache 2.0

Intended Use

This model is intended for Saudi Arabic conversational use cases, including:

Saudi dialect chatbots
Arabic assistant responses with Saudi phrasing
Dialogue generation
Informal Saudi Arabic conversation
Domain-specific Saudi Arabic assistant prototypes

Example system prompt used during fine-tuning:

أنت مساعد مفيد يتحدث باللهجة السعودية.

(English: "You are a helpful assistant who speaks the Saudi dialect.")

Dataset

The model was fine-tuned on:

HeshamHaroon/saudi-dialect-conversations

The dataset contains multi-turn Saudi Arabic conversations with metadata such as scenario, topic, complexity, and English summary. During preprocessing, each conversation was rendered with the model chat template. A Saudi Arabic system message was injected when missing.
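
The system-message injection described above can be sketched as follows. This is an illustration, not the author's preprocessing code: the function name `ensure_system_message` is hypothetical, and checking only the first message for a `system` role is an assumption about how "missing" was detected.

```python
SAUDI_SYSTEM_PROMPT = "أنت مساعد مفيد يتحدث باللهجة السعودية."


def ensure_system_message(messages, system_prompt=SAUDI_SYSTEM_PROMPT):
    """Return a copy of `messages` that starts with a system message.

    Assumption: a conversation counts as "missing" a system message
    when its first turn does not have the role "system".
    """
    if messages and messages[0].get("role") == "system":
        return list(messages)
    return [{"role": "system", "content": system_prompt}] + list(messages)


# The result can then be rendered with the model chat template, e.g.
# tokenizer.apply_chat_template(ensure_system_message(conv), tokenize=False)
```

Returning a new list leaves the original dataset row unmodified, which keeps the step safe to re-run.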

Example conversational style includes casual Saudi phrases such as:

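هلا والله (a warm "hello")
وش سالفتك؟ ("What's your story?" / "What's going on with you?")
ايه والله ("Yes, indeed")
الله يعطيك العافية ("May God give you strength", said to thank someone for their effort)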

Training Setup

The model was trained with supervised fine-tuning using LoRA adapters. The base model was loaded in bf16 without 4-bit quantization, and Flash Attention 2 was enabled.

LoRA Configuration

Parameter	Value
LoRA rank	128
LoRA alpha	254
LoRA dropout	0.05
Bias	none
Task type	Causal LM
Trainable parameters	38,535,168
Total parameters	8,506,391,296
Trainable percentage	0.4530%
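
The trainable percentage in the table follows directly from the two parameter counts. The short check below reproduces it and also computes the LoRA scaling factor, assuming the standard `alpha / rank` scaling rather than a rank-stabilized variant (the card does not say which was used).

```python
TRAINABLE_PARAMS = 38_535_168
TOTAL_PARAMS = 8_506_391_296
LORA_RANK = 128
LORA_ALPHA = 254

# Share of all parameters that receive gradient updates.
trainable_pct = 100 * TRAINABLE_PARAMS / TOTAL_PARAMS

# Assumption: standard LoRA scaling (alpha / rank), not rsLoRA.
lora_scaling = LORA_ALPHA / LORA_RANK

print(f"Trainable percentage: {trainable_pct:.4f}%")
print(f"LoRA scaling factor: {lora_scaling:.4f}")
```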

Target Modules

LoRA was applied to the following modules:

q_proj
k_proj
v_proj
out_proj
in_proj
conv.in_proj
conv.out_proj
gate_proj
up_proj
down_proj

Training Hyperparameters

Parameter	Value
Epochs	6
Per-device train batch size	8
Per-device eval batch size	8
Gradient accumulation steps	8
Effective batch size	64
Learning rate	2e-4
LR scheduler	cosine
Warmup ratio	0.05
Optimizer	adamw_torch_fused
Precision	bf16
FP16	false
Max sequence length	10,244
Evaluation strategy	steps
Eval steps	70
Save steps	70
Save total limit	2
Logging steps	10
Dataset packing	false
Dataloader workers	4
Seed	42
Flash Attention 2	enabled
Gradient checkpointing	disabled
Quantization	none
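
The effective batch size and the total number of optimization steps can be derived from the values above. The sketch below is illustrative arithmetic, not the training script. It assumes a single GPU, that a final partial accumulation window still triggers an optimizer step, and that the warmup step count is rounded up from the warmup ratio.

```python
import math

TRAIN_EXAMPLES = 3_474
PER_DEVICE_BATCH = 8
GRAD_ACCUM_STEPS = 8
EPOCHS = 6
WARMUP_RATIO = 0.05
NUM_DEVICES = 1  # assumption: the run used a single GPU

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES

# Micro-batches the dataloader yields per epoch (last one may be partial).
batches_per_epoch = math.ceil(TRAIN_EXAMPLES / (PER_DEVICE_BATCH * NUM_DEVICES))

# Assumption: a trailing partial accumulation window counts as one step.
steps_per_epoch = math.ceil(batches_per_epoch / GRAD_ACCUM_STEPS)
total_steps = steps_per_epoch * EPOCHS

# Assumption: warmup steps are rounded up.
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)

print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
```

Under these assumptions the total matches the 330 optimization steps reported in the training results.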

Training Environment

Component	Value
GPU	NVIDIA H200
VRAM	150.1 GB
PyTorch	2.8.0+cu129
CUDA	12.9
Transformers	5.12.1
PEFT	0.19.1
Attention implementation	Flash Attention 2
Training tracker	Weights & Biases
Runtime	756 seconds (~12.6 minutes)
Throughput	27.6 samples/sec

Note: The notebook was prepared for an A100 target, but the recorded run was executed on an NVIDIA H200 with 150.1 GB VRAM.
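
The runtime and throughput figures are consistent with the dataset size and epoch count. The check below assumes that throughput is computed as training samples processed divided by total runtime. The tokens-per-second figure is an extra derived estimate, not a value reported by the trainer, and uses the "tokens seen" count from the final evaluation.

```python
TRAIN_EXAMPLES = 3_474
EPOCHS = 6
RUNTIME_SECONDS = 756
TOKENS_SEEN = 3_736_326  # value logged at the final evaluation

runtime_minutes = RUNTIME_SECONDS / 60
samples_per_second = TRAIN_EXAMPLES * EPOCHS / RUNTIME_SECONDS

# Rough estimate only: the runtime also includes evaluation passes.
tokens_per_second = TOKENS_SEEN / RUNTIME_SECONDS

print(f"{runtime_minutes:.1f} min, {samples_per_second:.1f} samples/sec")
print(f"approx. {tokens_per_second:.0f} tokens/sec")
```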

Training Results

Training completed successfully for 6 epochs and 330 optimization steps.

Metric	Value
Final training loss, logged step 330	1.0633
Overall train loss reported by trainer	1.5250
Final validation loss, step 330	1.7088
Best validation loss	1.6409 at step 140
Final train mean token accuracy	0.7597
Final eval mean token accuracy	0.6545
Best eval mean token accuracy	0.6584 at step 210
Total training steps	330
Final epoch	6
Total tokens seen at final eval	3,736,326

Evaluation Progress

Step	Training Loss	Validation Loss	Eval Mean Token Accuracy	Tokens Seen
70	1.6887	1.7474	0.6430	793,936
140	1.4219	1.6409	0.6540	1,588,892
210	1.2429	1.6442	0.6584	2,384,614
280	1.0833	1.6863	0.6562	3,171,273
330	1.0633	1.7088	0.6545	3,736,326

Notes on the Results

The training loss decreased consistently during the run, from 4.6616 at the first logged step to 1.0633 at step 330. Train mean token accuracy also improved steadily, reaching 0.7597 at the final logged step.

Validation performance improved early in training, with the best validation loss appearing at step 140 and the best evaluation mean token accuracy appearing at step 210. After step 140, the training loss continued to decrease while validation loss rose gradually, from 1.6409 to 1.7088 at step 330. This widening gap suggests that the final checkpoint is more strongly adapted to the training distribution, while an earlier checkpoint around steps 140–210 may generalize slightly better on the small held-out validation split.
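
The checkpoint comparison can be made concrete using the values from the Evaluation Progress table. The sketch below picks the best evaluation step by each metric and prints the train/validation gap. The perplexity column assumes the reported loss is a mean per-token cross-entropy in nats, which the card does not state explicitly.

```python
import math

# (step, training loss, validation loss, eval mean token accuracy)
EVAL_LOG = [
    (70, 1.6887, 1.7474, 0.6430),
    (140, 1.4219, 1.6409, 0.6540),
    (210, 1.2429, 1.6442, 0.6584),
    (280, 1.0833, 1.6863, 0.6562),
    (330, 1.0633, 1.7088, 0.6545),
]

best_by_loss = min(EVAL_LOG, key=lambda row: row[2])
best_by_accuracy = max(EVAL_LOG, key=lambda row: row[3])

for step, train_loss, val_loss, _accuracy in EVAL_LOG:
    gap = val_loss - train_loss  # grows as the model fits the training set
    perplexity = math.exp(val_loss)  # assumption: loss is in nats per token
    print(f"step {step}: gap={gap:.4f}, val perplexity={perplexity:.2f}")

print("best validation loss at step", best_by_loss[0])
print("best eval accuracy at step", best_by_accuracy[0])
```

The gap between validation and training loss increases at every evaluation step, which is the pattern the notes above describe.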

Because the evaluation split contains only 71 examples, these metrics should be treated as training diagnostics rather than a full benchmark. A stronger evaluation should include human review by native Saudi Arabic speakers, dialect naturalness scoring, response helpfulness scoring, safety checks, and comparisons against the base model.

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "AyoubChLin/lfm2.5-8b-saudi-dialect"

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

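messages = [
    {
        "role": "system",
        # "You are a helpful assistant who speaks the Saudi dialect."
        "content": "أنت مساعد مفيد يتحدث باللهجة السعودية."
    },
    {
        "role": "user",
        # "Hi, what would you advise me to do if I want to learn programming?"
        "content": "هلا، وش تنصحني أسوي إذا أبي أتعلم برمجة؟"
    }
]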

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.05,
)

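# Decode only the newly generated tokens so the prompt is not echoed back.
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))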

Recommended Generation Settings

For natural Saudi conversational responses:

generation_config = {
    "max_new_tokens": 256,
    "temperature": 0.7,
    "top_p": 0.9,
    "do_sample": True,
    "repetition_penalty": 1.05,
}

For more deterministic assistant-style responses:

generation_config = {
    "max_new_tokens": 256,
    "temperature": 0.3,
    "top_p": 0.8,
    "do_sample": True,
    "repetition_penalty": 1.05,
}
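
One way to switch between the two settings is a small preset lookup. This is a convenience sketch, not part of the model's API: the preset names and the `generation_kwargs` helper are hypothetical.

```python
GENERATION_PRESETS = {
    # Natural Saudi conversational responses.
    "conversational": {
        "max_new_tokens": 256,
        "temperature": 0.7,
        "top_p": 0.9,
        "do_sample": True,
        "repetition_penalty": 1.05,
    },
    # More deterministic assistant-style responses.
    "deterministic": {
        "max_new_tokens": 256,
        "temperature": 0.3,
        "top_p": 0.8,
        "do_sample": True,
        "repetition_penalty": 1.05,
    },
}


def generation_kwargs(style="conversational", **overrides):
    """Return a fresh copy of a preset with optional overrides applied."""
    if style not in GENERATION_PRESETS:
        raise ValueError(f"Unknown preset: {style!r}")
    return {**GENERATION_PRESETS[style], **overrides}


# Example (requires a loaded model and tokenized inputs):
# outputs = model.generate(**inputs, **generation_kwargs("deterministic"))
```

Returning a copy means per-call overrides such as `max_new_tokens=64` do not alter the shared presets.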

Limitations

This model is a specialized Saudi dialect fine-tune and may not be optimal for:

Non-Saudi Arabic dialects
Formal Modern Standard Arabic tasks
Safety-critical domains
Legal, medical, or financial advice
Factual questions requiring up-to-date information
Long-context reasoning beyond the fine-tuning distribution

The model may also reflect biases, inaccuracies, or style artifacts present in the training dataset.

Evaluation

The reported evaluation used validation loss and mean token accuracy on a small held-out split of 71 examples.

For future releases, stronger evaluation should include:

Human evaluation by native Saudi Arabic speakers
Dialect naturalness scoring
Response helpfulness scoring
Safety evaluation
Comparison against the base model
Saudi dialect benchmark prompts
Evaluation on prompts outside the training dataset distribution

Citation

Base model:

@misc{liquidai_lfm25_8b_a1b,
  title = {LFM2.5-8B-A1B},
  author = {Liquid AI},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/LiquidAI/LFM2.5-8B-A1B}}
}

Dataset:

@misc{saudi_dialect_conversations,
  title = {Saudi Dialect Conversations},
  author = {HeshamHaroon},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/datasets/HeshamHaroon/saudi-dialect-conversations}}
}

Disclaimer

This model is provided for research and development purposes. Outputs should be reviewed before use in production systems, especially in sensitive or high-stakes applications.