🇵🇰 Urdu LLaMA — Pakistan's First Fine-tuned Urdu LLM
Llama 3.2-1B fine-tuned on a Pakistani corpus — Urdu news, literature, Islamic knowledge, history, culture and code-switching (Urdu+English)
Model Summary
Urdu LLaMA is Pakistan's first open-source instruction-following language model fine-tuned specifically on Pakistani and Urdu content. It is built on top of meta-llama/Llama-3.2-1B-Instruct and fine-tuned using QLoRA (4-bit quantization + LoRA) on a curated Pakistani corpus.
The model understands and generates text in:
- 🇵🇰 Urdu (primary language)
- 🌐 English (secondary)
- 💬 Code-switching (Urdu + English mixed — as spoken by Pakistanis daily)
🚀 Quick Start
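from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Nimra28/urdu-llama-pakistan"

# Load the tokenizer and the model weights in half precision.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

def ask_urdu(question):
    # Llama 3 chat layout: system prompt, user question, then an open assistant turn.
    # System prompt in English: "You are a helpful Pakistani AI assistant
    # who can answer in Urdu and English."
    prompt = f"""<|begin_of_text|><|start_header_id|>system<|end_header_id|>
آپ ایک مددگار پاکستانی AI اسسٹنٹ ہیں جو اردو اور انگریزی میں جواب دے سکتے ہیں۔<|eot_id|><|start_header_id|>user<|end_header_id|>
{question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
"""
    # The prompt already starts with <|begin_of_text|>, so do not let the
    # tokenizer prepend a second beginning-of-text token.
    inputs = tokenizer(
        prompt, return_tensors="pt", add_special_tokens=False
    ).to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=200,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Decode only the newly generated tokens, not the echoed prompt.
    response = tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:],
        skip_special_tokens=True,
    )
    return response.strip()

# Try it
print(ask_urdu("پاکستان کی تاریخ بتائیں"))   # "Tell me the history of Pakistan"
print(ask_urdu("علامہ اقبال کون تھے؟"))       # "Who was Allama Iqbal?"
print(ask_urdu("Machine learning کیا ہے؟"))   # "What is machine learning?"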
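The prompt in `ask_urdu` is written out by hand. As a rough sketch, the same layout can be produced from a list of role/content messages, which makes it easier to add earlier turns. `build_prompt` is a hypothetical helper introduced here for illustration; it is not part of transformers, and it simply mirrors the header and newline layout of the Quick Start prompt. The tokenizer's own `apply_chat_template` method is the usual alternative, though its output may differ from this hand-written layout.

```python
def build_prompt(messages):
    """Render chat messages in the hand-written layout used by ask_urdu."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n"
            f"{message['content']}<|eot_id|>"
        )
    # Leave an open assistant turn so the model writes the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n")
    return "".join(parts)


messages = [
    {
        "role": "system",
        "content": "آپ ایک مددگار پاکستانی AI اسسٹنٹ ہیں جو اردو اور انگریزی میں جواب دے سکتے ہیں۔",
    },
    {"role": "user", "content": "پاکستان کب آزاد ہوا؟"},
]

prompt = build_prompt(messages)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of the f-string inside `ask_urdu`.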
📚 Training Data
The model was fine-tuned on a curated Pakistani corpus covering:
| Domain | Description |
|---|---|
| 📰 Urdu News | Pakistani current affairs, politics, sports |
| 📚 Urdu Literature | Poetry by Iqbal, Ghalib, Mir, Faiz Ahmed Faiz |
| 🕌 Islamic Knowledge | Namaz, Zakat, Hajj, Ramadan, Quran knowledge |
| 🏛️ Pakistani History | Independence, Quaid-e-Azam, national leaders |
| 🎓 Education | CSS exam, MDCAT, academic topics |
| 🍛 Culture & Food | Traditions, weddings, cuisine, festivals |
| 💬 Code-switching | Natural Urdu+English mixed conversations |
| 🌍 Geography | Provinces, cities, K2, national landmarks |
⚙️ Training Details
| Parameter | Value |
|---|---|
| Base Model | unsloth/Llama-3.2-1B-Instruct (Unsloth's mirror of meta-llama/Llama-3.2-1B-Instruct) |
| Method | QLoRA (4-bit quantization + LoRA) |
| LoRA Rank | r=16, alpha=32 |
| LoRA Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| LoRA Dropout | 0.05 |
| Training Epochs | 3 |
| Learning Rate | 2e-4 |
| Batch Size | 2 (effective 8 with gradient accumulation) |
| Max Sequence Length | 512 tokens |
| Optimizer | paged_adamw_8bit |
| Hardware | Google Colab T4 GPU (free tier) |
| Training Time | ~35 minutes |
| Quantization | 4-bit NF4 with double quantization |
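A few numbers follow directly from the table. The sketch below works out the gradient accumulation steps implied by the batch sizes, the LoRA scaling factor (assuming the standard `alpha / r` scaling), and the approximate number of trainable adapter parameters. The layer dimensions are assumptions taken from the public Llama-3.2-1B configuration, not from this card, so treat the parameter count as an estimate.

```python
# Values taken from the Training Details table.
per_device_batch = 2
effective_batch = 8
lora_rank = 16
lora_alpha = 32

# Assumed Llama-3.2-1B dimensions (from the public model config, not this card).
hidden = 2048          # hidden size
intermediate = 8192    # MLP intermediate size
kv_dim = 512           # 8 key/value heads x 64 head dimension
num_layers = 16

# Batches accumulated before each optimizer step.
grad_accum_steps = effective_batch // per_device_batch

# Standard LoRA scaling applied to the low-rank update.
lora_scale = lora_alpha / lora_rank

# (in_features, out_features) of each adapted projection in one layer.
target_shapes = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, intermediate),
    "up_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}


def lora_params(in_features, out_features, rank):
    """Parameters in one adapter: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, lora_rank) for i, o in target_shapes.values())
trainable = per_layer * num_layers

print(f"gradient accumulation steps: {grad_accum_steps}")
print(f"LoRA scale (alpha / r): {lora_scale}")
print(f"trainable LoRA parameters: {trainable:,}")
```

Under these assumptions the adapters add roughly 11 million trainable parameters, a small fraction of the 1B-parameter base model, which is consistent with training fitting on a single T4 GPU.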
📊 Evaluation
The model was evaluated on held-out Urdu test examples:
- Perplexity: significantly lower than the base model on Urdu text
- Qualitative evaluation: the model responds correctly in Urdu to Pakistani cultural, historical, and Islamic questions
- Code-switching: the model handles natural Urdu+English mixing as spoken by Pakistanis
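The card does not say how perplexity was computed. A common definition is the exponential of the mean negative log-likelihood per token, sketched below in plain Python. The function name and the toy input are illustrative assumptions, not the author's evaluation code or results.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) over a sequence of tokens.

    token_log_probs holds the natural-log probability the model assigned
    to each reference token.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Toy check: a model that gives every token probability 1/4
# is as uncertain as a uniform choice among 4 options.
uniform_quarter = [math.log(0.25)] * 10
print(perplexity(uniform_quarter))
```

Lower values mean the model assigns higher probability to the held-out text, so a fine-tuned model that fits Urdu better than its base should score lower on the same test set.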
💬 Example Outputs
Question: پاکستان کب آزاد ہوا؟
پاکستان 14 اگست 1947 کو برطانوی استعمار سے آزاد ہوا۔ یہ دن ہر سال یوم آزادی کے طور پر منایا جاتا ہے۔ قائداعظم محمد علی جناح پاکستان کے پہلے گورنر جنرل بنے۔
Question: Machine learning کیا ہے؟
Machine learning یعنی مشینی سیکھ ایک ایسی technology ہے جس میں computer خود بخود data سے سیکھتا ہے۔ جیسے ہم experience سے سیکھتے ہیں، ویسے ہی machine بھی examples سے patterns پہچاننا سیکھتی ہے۔
Question: علامہ اقبال کون تھے؟
علامہ اقبال پاکستان کے قومی شاعر اور مفکر تھے۔ انہوں نے 1930 میں مسلمانوں کے لیے ایک الگ وطن کا خواب پیش کیا۔ انہیں شاعر مشرق اور مفکر پاکستان کہا جاتا ہے۔
⚠️ Limitations
- Fine-tuned on a small dataset (~30 curated examples); a larger dataset should improve quality
- May occasionally mix Urdu and English even when pure Urdu is requested
- Islamic knowledge is general and is not a substitute for qualified scholars
- Political opinions reflect the training data; use with caution
🔮 Future Work
- Scale training data to 10,000+ Urdu examples
- Add Urdu news datasets from Dawn, Geo, ARY
- Fine-tune on Urdu Wikipedia dump
- Add Punjabi, Sindhi, Pashto support
- Release a larger 7B-parameter version
- Benchmark on Urdu NLP tasks
📄 Citation
If you use this model, please cite:
@misc{urdu-llama-pakistan-2024,
  title={Urdu LLaMA: Pakistan's First Fine-tuned Urdu Language Model},
  author={Nimra Tariq},
  year={2024},
  publisher={HuggingFace},
  url={https://huggingface.co/Nimra28/urdu-llama-pakistan}
}
👩‍💻 Developed By
Nimra Tariq — AI Engineer & Assistant Professor, Superior University, Pakistan
- 🐙 GitHub: github.com/nimra-pixel
- 🤗 HuggingFace: huggingface.co/Nimra28
📜 License
This model is built on Llama 3.2 and follows the Llama 3.2 Community License.