LFM2.5-8B Saudi Dialect

AyoubChLin/lfm2.5-8b-saudi-dialect is a Saudi Arabic conversational fine-tune of LiquidAI/LFM2.5-8B-A1B.

The model was fine-tuned to produce more natural Saudi dialect responses in chat-style conversations. It is intended for Arabic dialogue, informal Saudi phrasing, and assistant-style responses using a Saudi Arabic system prompt.

Model Details

| Field | Value |
| --- | --- |
| Base model | LiquidAI/LFM2.5-8B-A1B |
| Fine-tuned model | AyoubChLin/lfm2.5-8b-saudi-dialect |
| Dataset | HeshamHaroon/saudi-dialect-conversations |
| Dataset size | 3,545 examples |
| Train split | 3,474 examples |
| Evaluation split | 71 examples |
| Fine-tuning method | Supervised fine-tuning with LoRA |
| Final format | Merged model |
| Precision | bf16 |
| Quantization | None |
| Max sequence length | 10,244 tokens |
| Language | Arabic |
| Dialect focus | Saudi Arabic |
| License | Apache 2.0 |

Intended Use

This model is intended for Saudi Arabic conversational use cases, including:

  • Saudi dialect chatbots
  • Arabic assistant responses with Saudi phrasing
  • Dialogue generation
  • Informal Saudi Arabic conversation
  • Domain-specific Saudi Arabic assistant prototypes

Example system prompt used during fine-tuning (in English: "You are a helpful assistant who speaks in the Saudi dialect."):

أنت مساعد مفيد يتحدث باللهجة السعودية.

Dataset

The model was fine-tuned on:

HeshamHaroon/saudi-dialect-conversations

The dataset contains multi-turn Saudi Arabic conversations with metadata such as scenario, topic, complexity, and an English summary. During preprocessing, each conversation was rendered with the model's chat template, and a Saudi Arabic system message was injected whenever a conversation did not already have one.
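The preprocessing described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names `ensure_system_message` and `render_conversation` are hypothetical, and the sketch assumes each conversation is already a list of `{"role", "content"}` dictionaries.

```python
# Hypothetical helpers; the original preprocessing code is not shown in this card.
SAUDI_SYSTEM_PROMPT = "أنت مساعد مفيد يتحدث باللهجة السعودية."


def ensure_system_message(messages, system_prompt=SAUDI_SYSTEM_PROMPT):
    """Return a copy of `messages` that starts with a system turn."""
    if messages and messages[0].get("role") == "system":
        return list(messages)  # already has a system message, keep it
    return [{"role": "system", "content": system_prompt}, *messages]


def render_conversation(tokenizer, messages):
    """Render one conversation to text with the tokenizer's chat template."""
    return tokenizer.apply_chat_template(
        ensure_system_message(messages),
        tokenize=False,
    )


conversation = [
    {"role": "user", "content": "هلا والله"},
    {"role": "assistant", "content": "هلا والله، وش سالفتك؟"},
]
prepared = ensure_system_message(conversation)
print([m["role"] for m in prepared])  # → ['system', 'user', 'assistant']
```

`render_conversation` needs a loaded tokenizer, so only the system-message step is exercised here.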

Example conversational style includes casual Saudi phrases such as:

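هلا والله (roughly "hello, welcome")
وش سالفتك؟ (roughly "what's your story?")
ايه والله (roughly "yes, really")
الله يعطيك العافية (roughly "may God give you strength", a common way to thank someone)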

Training Setup

The model was trained with supervised fine-tuning using LoRA adapters. The base model was loaded in bf16 without 4-bit quantization, and Flash Attention 2 was enabled.

LoRA Configuration

| Parameter | Value |
| --- | --- |
| LoRA rank | 128 |
| LoRA alpha | 254 |
| LoRA dropout | 0.05 |
| Bias | none |
| Task type | Causal LM |
| Trainable parameters | 38,535,168 |
| Total parameters | 8,506,391,296 |
| Trainable percentage | 0.4530% |

Target Modules

LoRA was applied to the following modules:

q_proj
k_proj
v_proj
out_proj
in_proj
conv.in_proj
conv.out_proj
gate_proj
up_proj
down_proj
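As a rough sketch, the LoRA settings and target modules listed above map onto PEFT's `LoraConfig` as shown below. The exact call used in the training notebook is not shown in this card, so the wrapper `build_lora_config` is a hypothetical name, and PEFT is imported lazily so the arithmetic runs without it.

```python
# Values copied from the tables above; the wrapper function is illustrative only.
LORA_KWARGS = {
    "r": 128,
    "lora_alpha": 254,
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "CAUSAL_LM",
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "out_proj", "in_proj",
        "conv.in_proj", "conv.out_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}


def build_lora_config():
    """Build a PEFT LoraConfig from the listed settings (requires `peft`)."""
    from peft import LoraConfig  # lazy import: only needed when actually training
    return LoraConfig(**LORA_KWARGS)


# With standard (non rank-stabilized) LoRA, updates are scaled by alpha / r.
scaling = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]

# Share of parameters that are trainable, from the reported counts.
trainable_pct = 100 * 38_535_168 / 8_506_391_296

print(f"scaling={scaling}, trainable={trainable_pct:.4f}%")
```

The computed share agrees with the reported 0.4530%.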

Training Hyperparameters

| Parameter | Value |
| --- | --- |
| Epochs | 6 |
| Per-device train batch size | 8 |
| Per-device eval batch size | 8 |
| Gradient accumulation steps | 8 |
| Effective batch size | 64 |
| Learning rate | 2e-4 |
| LR scheduler | cosine |
| Warmup ratio | 0.05 |
| Optimizer | adamw_torch_fused |
| Precision | bf16 |
| FP16 | false |
| Max sequence length | 10,244 |
| Evaluation strategy | steps |
| Eval steps | 70 |
| Save steps | 70 |
| Save total limit | 2 |
| Logging steps | 10 |
| Dataset packing | false |
| Dataloader workers | 4 |
| Seed | 42 |
| Flash Attention 2 | enabled |
| Gradient checkpointing | disabled |
| Quantization | none |
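The batch settings above can be cross-checked against the reported step count with a little arithmetic. The sketch below assumes a single GPU and assumes that a partial final accumulation window still counts as one optimizer step; under those assumptions the numbers reproduce the 330 steps reported for this run.

```python
import math

# Values from the tables in this card.
train_examples = 3_474
per_device_batch = 8
grad_accum = 8
epochs = 6
eval_every = 70

# Assumption: one device, so effective batch = per-device batch * accumulation.
effective_batch = per_device_batch * grad_accum

# Assumption: the last, partially filled accumulation window counts as a step.
batches_per_epoch = math.ceil(train_examples / per_device_batch)
steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)
total_steps = steps_per_epoch * epochs

# Scheduled evaluations fall on multiples of `eval_every`.
scheduled_evals = list(range(eval_every, total_steps + 1, eval_every))

print(effective_batch, batches_per_epoch, steps_per_epoch, total_steps)
print(scheduled_evals)
```

The scheduled evaluations land on steps 70, 140, 210, and 280; the step-330 row in the results comes from the end of training rather than from this schedule.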

Training Environment

| Component | Value |
| --- | --- |
| GPU | NVIDIA H200 |
| VRAM | 150.1 GB |
| PyTorch | 2.8.0+cu129 |
| CUDA | 12.9 |
| Transformers | 5.12.1 |
| PEFT | 0.19.1 |
| Attention implementation | Flash Attention 2 |
| Training tracker | Weights & Biases |
| Runtime | 756 seconds (~12.6 minutes) |
| Throughput | 27.6 samples/sec |

Note: The notebook was prepared for an A100 target, but the recorded run was executed on an NVIDIA H200 with 150.1 GB VRAM.
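The runtime and throughput figures can be sanity-checked from the other numbers in this card. The token rate below is only an approximation, since it divides the total tokens seen at the final evaluation by a runtime that may also include evaluation and checkpointing time.

```python
# Values from this card.
train_examples = 3_474
epochs = 6
runtime_seconds = 756
total_tokens_seen = 3_736_326  # "Total tokens seen at final eval"

runtime_minutes = runtime_seconds / 60
samples_per_second = train_examples * epochs / runtime_seconds

# Approximate: runtime may include evaluation and checkpoint saving.
tokens_per_second = total_tokens_seen / runtime_seconds

print(f"{runtime_minutes:.1f} min, {samples_per_second:.1f} samples/s, "
      f"{tokens_per_second:.0f} tokens/s")
```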

Training Results

Training completed successfully for 6 epochs and 330 optimization steps.

| Metric | Value |
| --- | --- |
| Final training loss (logged at step 330) | 1.0633 |
| Overall train loss reported by trainer | 1.5250 |
| Final validation loss (step 330) | 1.7088 |
| Best validation loss | 1.6409 at step 140 |
| Final train mean token accuracy | 0.7597 |
| Final eval mean token accuracy | 0.6545 |
| Best eval mean token accuracy | 0.6584 at step 210 |
| Total training steps | 330 |
| Final epoch | 6 |
| Total tokens seen at final eval | 3,736,326 |

Evaluation Progress

| Step | Training Loss | Validation Loss | Eval Mean Token Accuracy | Tokens Seen |
| --- | --- | --- | --- | --- |
| 70 | 1.6887 | 1.7474 | 0.6430 | 793,936 |
| 140 | 1.4219 | 1.6409 | 0.6540 | 1,588,892 |
| 210 | 1.2429 | 1.6442 | 0.6584 | 2,384,614 |
| 280 | 1.0833 | 1.6863 | 0.6562 | 3,171,273 |
| 330 | 1.0633 | 1.7088 | 0.6545 | 3,736,326 |

Notes on the Results

The training loss decreased consistently during the run, from 4.6616 at the first logged step to 1.0633 at step 330. Train mean token accuracy also improved steadily, reaching 0.7597 at the final logged step.

Validation performance improved early in training, with the best validation loss appearing at step 140 and the best evaluation mean token accuracy appearing at step 210. After step 140, the training loss continued to decrease (1.4219 to 1.0633) while validation loss rose from 1.6409 to 1.7088, a pattern consistent with mild overfitting. The final checkpoint is therefore more strongly adapted to the training distribution, while an earlier checkpoint around steps 140–210 may generalize slightly better on the small held-out validation split.
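The checkpoint comparison above can be reproduced from the evaluation table with a few lines. This is a small illustrative sketch; the field names in `eval_log` are chosen here and are not the trainer's log keys.

```python
# Rows copied from the Evaluation Progress table (field names are illustrative).
eval_log = [
    {"step": 70, "train_loss": 1.6887, "val_loss": 1.7474, "eval_acc": 0.6430},
    {"step": 140, "train_loss": 1.4219, "val_loss": 1.6409, "eval_acc": 0.6540},
    {"step": 210, "train_loss": 1.2429, "val_loss": 1.6442, "eval_acc": 0.6584},
    {"step": 280, "train_loss": 1.0833, "val_loss": 1.6863, "eval_acc": 0.6562},
    {"step": 330, "train_loss": 1.0633, "val_loss": 1.7088, "eval_acc": 0.6545},
]

# Pick the evaluation step with the lowest validation loss and highest accuracy.
best_by_loss = min(eval_log, key=lambda row: row["val_loss"])
best_by_acc = max(eval_log, key=lambda row: row["eval_acc"])

# Gap between validation and training loss; a widening gap hints at overfitting.
gaps = {row["step"]: round(row["val_loss"] - row["train_loss"], 4) for row in eval_log}

print(best_by_loss["step"], best_by_acc["step"])  # → 140 210
print(gaps)
```

The gap between validation and training loss grows at every evaluation step. Whether an earlier checkpoint is still available depends on the run's save settings; the card lists a save limit of 2.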

Because the evaluation split contains only 71 examples, these metrics should be treated as training diagnostics rather than a full benchmark. A stronger evaluation should include human review by native Saudi Arabic speakers, dialect naturalness scoring, response helpfulness scoring, safety checks, and comparisons against the base model.

Usage

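from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "AyoubChLin/lfm2.5-8b-saudi-dialect"

# Load the tokenizer that ships with the fine-tuned model.
tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True,
)

# Load the merged model in bf16 and let Accelerate place it on available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Use the same Saudi Arabic system prompt that was used during fine-tuning.
messages = [
    {
        "role": "system",
        "content": "أنت مساعد مفيد يتحدث باللهجة السعودية."
    },
    {
        "role": "user",
        "content": "هلا، وش تنصحني أسوي إذا أبي أتعلم برمجة؟"
    }
]

# Render the conversation to a prompt string that ends with the assistant turn header.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Tokenize the prompt and move the tensors to the model's device.
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Sample a reply of up to 256 new tokens.
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.05,
)

# Note: outputs[0] holds the prompt tokens followed by the generated reply.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))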
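Because a causal language model's `generate` output contains the prompt tokens followed by the new tokens, the final `print` above shows the whole conversation. If only the reply is wanted, one option is to slice off the prompt before decoding. The helper name `new_tokens_only` is hypothetical, and plain lists stand in for tensors so the sketch runs on its own.

```python
def new_tokens_only(output_ids, prompt_length):
    """Drop the first `prompt_length` ids, keeping only the generated tokens."""
    return output_ids[prompt_length:]


# With the objects from the usage example this would look roughly like:
#   prompt_length = inputs["input_ids"].shape[-1]
#   reply_ids = new_tokens_only(outputs[0], prompt_length)
#   print(tokenizer.decode(reply_ids, skip_special_tokens=True))

# Stand-in example with plain lists: 3 prompt ids followed by 2 generated ids.
demo_output = [11, 12, 13, 101, 102]
demo_reply = new_tokens_only(demo_output, prompt_length=3)
print(demo_reply)  # → [101, 102]
```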

Recommended Generation Settings

For natural Saudi conversational responses:

generation_config = {
    "max_new_tokens": 256,
    "temperature": 0.7,
    "top_p": 0.9,
    "do_sample": True,
    "repetition_penalty": 1.05,
}

For more conservative, less varied assistant-style responses (sampling is still enabled, so outputs are not fully deterministic):

generation_config = {
    "max_new_tokens": 256,
    "temperature": 0.3,
    "top_p": 0.8,
    "do_sample": True,
    "repetition_penalty": 1.05,
}
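One way to use these settings is to keep both as named presets and unpack the chosen one into `model.generate`. The preset names and the `generation_kwargs` helper below are illustrative choices, not part of the model or of Transformers.

```python
# The two settings from this section, stored under illustrative preset names.
GENERATION_PRESETS = {
    "conversational": {
        "max_new_tokens": 256,
        "temperature": 0.7,
        "top_p": 0.9,
        "do_sample": True,
        "repetition_penalty": 1.05,
    },
    "conservative": {
        "max_new_tokens": 256,
        "temperature": 0.3,
        "top_p": 0.8,
        "do_sample": True,
        "repetition_penalty": 1.05,
    },
}


def generation_kwargs(preset="conversational", **overrides):
    """Return a copy of a preset, with optional per-call overrides."""
    if preset not in GENERATION_PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    return {**GENERATION_PRESETS[preset], **overrides}


# With a loaded model and tokenized inputs this would be, roughly:
#   outputs = model.generate(**inputs, **generation_kwargs("conservative"))
short_reply = generation_kwargs("conservative", max_new_tokens=64)
print(short_reply["temperature"], short_reply["max_new_tokens"])  # → 0.3 64
```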

Limitations

This model is a specialized Saudi dialect fine-tune and may not be optimal for:

  • Non-Saudi Arabic dialects
  • Formal Modern Standard Arabic tasks
  • Safety-critical domains
  • Legal, medical, or financial advice
  • Factual questions requiring up-to-date information
  • Long-context reasoning beyond the fine-tuning distribution

The model may also reflect biases, inaccuracies, or style artifacts present in the training dataset.

Evaluation

The reported evaluation used validation loss and mean token accuracy on a small held-out split of 71 examples.

For future releases, stronger evaluation should include:

  • Human evaluation by native Saudi Arabic speakers
  • Dialect naturalness scoring
  • Response helpfulness scoring
  • Safety evaluation
  • Comparison against the base model
  • Saudi dialect benchmark prompts
  • Evaluation on prompts outside the training dataset distribution
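As one possible starting point for the base-model comparison, the sketch below collects side-by-side responses for a fixed prompt set so that reviewers can score them. Everything here is hypothetical: `compare_models` is an illustrative helper, and the two lambda functions are placeholders standing in for real calls to the base and fine-tuned models.

```python
def compare_models(prompts, generators):
    """Collect one response per model for each prompt.

    `generators` maps a model label to a callable that takes a prompt string
    and returns a response string.
    """
    rows = []
    for prompt in prompts:
        row = {"prompt": prompt}
        for label, generate in generators.items():
            row[label] = generate(prompt)
        rows.append(row)
    return rows


# Placeholder generators; in practice each would wrap a loaded model.
stub_generators = {
    "base": lambda prompt: f"[base reply to] {prompt}",
    "fine_tuned": lambda prompt: f"[fine-tuned reply to] {prompt}",
}

benchmark_prompts = ["وش سالفتك؟", "هلا والله"]
comparison = compare_models(benchmark_prompts, stub_generators)
print(len(comparison), sorted(comparison[0]))
```

The resulting rows could then be shuffled and shown to native Saudi Arabic speakers without the model labels.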

Citation

Base model:

@misc{liquidai_lfm25_8b_a1b,
  title = {LFM2.5-8B-A1B},
  author = {Liquid AI},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/LiquidAI/LFM2.5-8B-A1B}}
}

Dataset:

@misc{saudi_dialect_conversations,
  title = {Saudi Dialect Conversations},
  author = {HeshamHaroon},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/datasets/HeshamHaroon/saudi-dialect-conversations}}
}

Disclaimer

This model is provided for research and development purposes. Outputs should be reviewed before use in production systems, especially in sensitive or high-stakes applications.

