KoshurAI v3 — Kashmiri ↔ English Translation (LoRA Adapter)
⚠️ Gated model. Request access to download weights.
A LoRA adapter for bidirectional Kashmiri ↔ English translation,
built on top of Faizaniqbal/KoshurAI_Tarjuma_v2,
which is itself a Gemma 3 (4.5B) model continually pretrained on 2.8M tokens of Kashmiri text.
On the FLORES-200 devtest (1,012 sentences), KoshurAI v3 scores higher than NLLB-200 distilled-600M on COMET in both translation directions, while NLLB-200 scores higher on BLEU.
Model Details
| Field | Value |
|---|---|
| Author | Faizan Iqbal (@Faizaniqbal) |
| Base model | Faizaniqbal/KoshurAI_Tarjuma_v2 |
| Adapter type | LoRA (QLoRA training) |
| Architecture | Gemma3ForCausalLM + PEFT LoRA |
| Languages | Kashmiri (ks · kas_Arab), English (en) |
| License | Apache-2.0 |
| Training data | 16,637 curated bidirectional EN↔KS sentence pairs |
| Training compute | Google Colab GPU |
Model Tree
google/gemma-3-4b-pt
└─ google/gemma-3-4b-it
   └─ sarvamai/sarvam-translate
      └─ Faizaniqbal/KoshurAI_Tarjuma_v2   ← 2.8M Kashmiri pretraining
         └─ Faizaniqbal/KoshurAI_Tarjuma_v3   ← this adapter (SFT)
Quickstart
Install
pip install transformers peft accelerate bitsandbytes sentencepiece
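Because the weights are gated, downloads only work from an authenticated Hugging Face session once access has been granted. A minimal sketch, assuming the access token is stored in the `HF_TOKEN` environment variable; `get_hf_token` and `authenticate` are illustrative helper names, not part of this repository:

```python
import os


def get_hf_token(env_var="HF_TOKEN"):
    """Read the access token from the environment (assumed variable name)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"{env_var} is not set; request access on the model page, "
            "then export a token with read permission."
        )
    return token


def authenticate():
    """Log this process in to the Hugging Face Hub before loading the model."""
    # Imported lazily so the helper above works without huggingface_hub installed.
    from huggingface_hub import login

    login(token=get_hf_token())
```

Call `authenticate()` once before any `from_pretrained` call.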
Load & Translate
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

BASE_MODEL = "Faizaniqbal/KoshurAI_Tarjuma_v2"
ADAPTER = "Faizaniqbal/KoshurAI_Tarjuma_v3"

# 4-bit NF4 quantization with bfloat16 compute.
bnb_cfg = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tok = AutoTokenizer.from_pretrained(BASE_MODEL)
tok.pad_token = tok.eos_token

# Load the quantized base model, then attach the LoRA adapter.
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, quantization_config=bnb_cfg, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER)
model.eval()

def translate(text, direction="en2ks"):
    """Translate one sentence; direction is "en2ks" or "ks2en"."""
    prefix = "Translate to Kashmiri: " if direction == "en2ks" else "Translate to English: "
    # Gemma turn markup; generation continues after the model turn marker.
    prompt = f"<start_of_turn>user\n{prefix}{text}<end_of_turn>\n<start_of_turn>model\n"
    # Move inputs to the model's device rather than hard-coding "cuda".
    inputs = tok(prompt, return_tensors="pt", truncation=True, max_length=512).to(model.device)
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=150,
            min_new_tokens=5,
            do_sample=False,
            repetition_penalty=1.1,
        )
    # Decode only the newly generated tokens, skipping the prompt.
    return tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True).strip()

print(translate("The dog is sleeping.", "en2ks"))
print(translate("ہونٛد چھُ شُنٛگِتھ", "ks2en"))
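The prompt that `translate` sends to the model can be built and inspected without loading any weights. A small sketch using the same instruction prefixes and Gemma turn markup as the quickstart; `build_prompt` and `PREFIXES` are illustrative names, and unlike the quickstart (which treats any value other than `"en2ks"` as Kashmiri to English) this version rejects unknown directions:

```python
PREFIXES = {
    "en2ks": "Translate to Kashmiri: ",
    "ks2en": "Translate to English: ",
}


def build_prompt(text, direction="en2ks"):
    """Return the single-turn prompt string for one sentence."""
    if direction not in PREFIXES:
        raise ValueError(f"direction must be one of {sorted(PREFIXES)}")
    text = text.strip()
    if not text:
        raise ValueError("text is empty")
    return (
        "<start_of_turn>user\n"
        f"{PREFIXES[direction]}{text}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


print(build_prompt("The dog is sleeping."))
```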
Training
Stage 1 — Kashmiri Pretraining (base model)
The base model (KoshurAI_Tarjuma_v2) was continually pretrained on
2.8 million tokens of Kashmiri text from publicly available sources
(literature, journalism, academic texts, religious scholarship). This stage
was intended to strengthen the model's knowledge of Kashmiri before translation fine-tuning.
Stage 2 — SFT for Translation (this adapter)
This LoRA adapter was trained on 16,637 curated bidirectional sentence pairs (EN→KS + KS→EN) so that the model translates explicitly on instruction in both directions.
| Split | Records |
|---|---|
| Base SFT corpus (v2) | 15,527 |
| New pairs (v3) | 1,110 |
| Total | 16,637 |
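The card does not document the record format used for SFT, nor whether the 16,637 figure counts sentence pairs or direction-expanded records. As a hypothetical illustration only, one sentence pair could be expanded into two prompt/completion records using the instruction prefixes from the quickstart; `make_sft_records` and the field names are assumptions:

```python
def make_sft_records(english, kashmiri):
    """Expand one sentence pair into one record per translation direction."""
    return [
        {"prompt": "Translate to Kashmiri: " + english, "completion": kashmiri},
        {"prompt": "Translate to English: " + kashmiri, "completion": english},
    ]


# Record counts from the table above.
BASE_SFT_RECORDS = 15_527
NEW_V3_RECORDS = 1_110
total_records = BASE_SFT_RECORDS + NEW_V3_RECORDS
print(total_records)  # 16637
```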
Training Configuration
| Hyperparameter | Value |
|---|---|
| Base model | Faizaniqbal/KoshurAI_Tarjuma_v2 |
| LoRA rank (r) | 16 |
| LoRA alpha | 16 |
| LoRA dropout | 0.05 |
| LoRA target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Quantization | 4-bit NF4 (BitsAndBytes) |
| Compute dtype | bfloat16 |
| Epochs | 2 |
| Learning rate | 1e-4 |
| Effective batch size | 8 (per-device batch 2 × gradient accumulation 4) |
| Max sequence length | 512 tokens |
| Optimizer | paged_adamw_8bit |
| LR scheduler | Cosine |
| Warmup steps | 100 |
| Weight decay | 0.01 |
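The table maps fairly directly onto `peft.LoraConfig` and `transformers.TrainingArguments` keyword arguments. The sketch below is a reconstruction from the table, not the author's training script: `task_type`, `output_dir`, and the single-GPU step estimate (all 16,637 records used for training, no held-out split) are assumptions. The 512-token limit would be applied at tokenization time and is not shown.

```python
import math

LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "task_type": "CAUSAL_LM",  # assumption: not stated in the table
}

TRAINING_KWARGS = {
    "output_dir": "koshurai-v3-lora",  # hypothetical path
    "num_train_epochs": 2,
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "optim": "paged_adamw_8bit",
    "lr_scheduler_type": "cosine",
    "warmup_steps": 100,
    "weight_decay": 0.01,
}


def build_configs():
    """Instantiate the config objects (requires peft and transformers)."""
    from peft import LoraConfig
    from transformers import TrainingArguments

    return LoraConfig(**LORA_KWARGS), TrainingArguments(**TRAINING_KWARGS)


def schedule_summary(num_records, kwargs=TRAINING_KWARGS, num_devices=1):
    """Approximate optimizer-step counts implied by the hyperparameters."""
    effective_batch = (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )
    steps_per_epoch = math.ceil(num_records / effective_batch)
    total_steps = steps_per_epoch * kwargs["num_train_epochs"]
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_fraction": kwargs["warmup_steps"] / total_steps,
    }


print(schedule_summary(16_637))
```

Under these assumptions, the 100 warmup steps cover roughly the first 2 to 3 percent of training.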
Evaluation — FLORES-200 Devtest (1,012 sentences)
| Direction | Model | BLEU | COMET |
|---|---|---|---|
| KS→EN | KoshurAI v3 (ours) | 15.74 | 0.6982 ✅ |
| KS→EN | NLLB-200 distilled-600M | 16.28 | 0.6741 |
| EN→KS | KoshurAI v3 (ours) | 30.37¹ | 0.6604 ✅ |
| EN→KS | NLLB-200 distilled-600M | 39.65¹ | 0.6431 |
¹ EN→KS BLEU is character-level (tokenize='char'), standard for Arabic-script output.
COMET = Unbabel/wmt22-comet-da system score.
KoshurAI v3 scores higher than NLLB-200 distilled-600M on COMET in both directions (+0.0241 KS→EN, +0.0173 EN→KS), while NLLB-200 scores higher on BLEU in both.
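Scores of this kind can be computed with `sacrebleu` and `unbabel-comet`. The sketch below is not the author's evaluation script: the character-level tokenizer for EN→KS follows the footnote, while the `13a` tokenizer for KS→EN (the sacreBLEU default) and the helper names are assumptions.

```python
def bleu_tokenizer(direction):
    """Pick the sacreBLEU tokenizer for a translation direction."""
    if direction == "en2ks":
        return "char"  # character-level, per the footnote above
    if direction == "ks2en":
        return "13a"  # assumption: sacreBLEU default for English output
    raise ValueError("direction must be 'en2ks' or 'ks2en'")


def score_bleu(hypotheses, references, direction):
    """Corpus-level BLEU over parallel lists of hypothesis/reference strings."""
    import sacrebleu  # pip install sacrebleu

    result = sacrebleu.corpus_bleu(
        hypotheses, [references], tokenize=bleu_tokenizer(direction)
    )
    return result.score


def score_comet(sources, hypotheses, references, batch_size=8, gpus=1):
    """System-level COMET score using Unbabel/wmt22-comet-da."""
    from comet import download_model, load_from_checkpoint  # pip install unbabel-comet

    model = load_from_checkpoint(download_model("Unbabel/wmt22-comet-da"))
    data = [
        {"src": s, "mt": h, "ref": r}
        for s, h, r in zip(sources, hypotheses, references)
    ]
    return model.predict(data, batch_size=batch_size, gpus=gpus).system_score
```

Because the EN→KS and KS→EN BLEU figures use different tokenizers, they are not comparable with each other, only with the NLLB-200 row for the same direction.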
Sample Translations (EN→KS)
| English | KoshurAI v3 |
|---|---|
| They include the Netherlands, with Anna Jochemsen finishing ninth. | تِیَم چھُ نیدرلینڈس شامِل کَران اَینا جوکیمسن فِنِشِنگ نائنتھ سیتھ |
| Hershey and Chase used phages, or viruses, to implant their own DNA. | ۂرشے تہٕ چیسن کٔرۍ فیگ تہٕ جَراثیم منٛز پنُن ڈی این اے اَزناوُنہِ خٲطر |
| They usually have special food, drink and entertainment offers. | تِیَمَن چھُ اکثر خاص کھٮ۪ن، چیٖز تہٕ تفریح پیش کَرنہِ یِوان |
Inference Settings
| Parameter | Value |
|---|---|
| do_sample | False (greedy) |
| max_new_tokens | 150 (EN→KS) / 200 (KS→EN) |
| min_new_tokens | 5 |
| repetition_penalty | 1.1 |
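These settings differ by direction only in `max_new_tokens`, so they can be kept in one small helper and passed to `model.generate`. A minimal sketch; `generation_kwargs` is an illustrative name:

```python
MAX_NEW_TOKENS = {"en2ks": 150, "ks2en": 200}


def generation_kwargs(direction):
    """Return the decoding settings from the table for one direction."""
    if direction not in MAX_NEW_TOKENS:
        raise ValueError(f"direction must be one of {sorted(MAX_NEW_TOKENS)}")
    return {
        "do_sample": False,  # greedy decoding
        "max_new_tokens": MAX_NEW_TOKENS[direction],
        "min_new_tokens": 5,
        "repetition_penalty": 1.1,
    }


print(generation_kwargs("ks2en"))
```

Usage with the quickstart objects would look like `model.generate(**inputs, **generation_kwargs("ks2en"))`.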
Hardware Requirements
| Setting | VRAM |
|---|---|
| 4-bit inference (recommended) | ~6–8 GB |
| Colab free tier (T4) | ✅ with 4-bit |
| Colab L4 / A100 | ✅ comfortable |
Limitations
- Trained on sentence-level pairs (≤ 512 tokens); long-form translation unsupported.
- Performance on technical, legal, or dialectal Kashmiri is unverified.
- No human evaluation conducted; COMET and BLEU are automatic metrics only.
- Reported scores use 4-bit quantized inference; full-precision inference may yield higher scores.
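Since the adapter is trained on sentence-level pairs, one possible workaround for longer inputs is to split the text into sentences and translate each one separately. This is an untested sketch, not a supported feature: the regex splitter is naive (it will split after abbreviations such as "Dr."), context across sentences is lost, and `translate_fn` is assumed to have the same signature as the quickstart `translate`.

```python
import re

# Split after Latin sentence punctuation or the Arabic-script full stop (U+06D4).
_SENTENCE_END = re.compile(r"(?<=[.!?۔])\s+")


def split_sentences(text):
    """Naively split text into sentences on terminal punctuation."""
    return [s.strip() for s in _SENTENCE_END.split(text.strip()) if s.strip()]


def translate_long(text, translate_fn, direction="en2ks"):
    """Translate sentence by sentence and join the results with spaces."""
    return " ".join(translate_fn(s, direction) for s in split_sentences(text))


# Stub translator for demonstration; pass the real `translate` in practice.
print(translate_long("First one. Second one!", lambda s, d: s.upper()))
```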
Citation
If you use this model, please cite:
@misc{iqbal2026koshurai,
  title        = {KoshurAI v3: A Fine-Tuned Neural Machine Translation System
                  for Kashmiri--English Bidirectional Translation},
  author       = {Iqbal, Faizan},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/Faizaniqbal/KoshurAI_Tarjuma_v3}},
  note         = {LoRA adapter fine-tuned from Faizaniqbal/KoshurAI_Tarjuma_v2}
}
This work fine-tunes the model by Malik & Nissar; please also cite:
@misc{malik2026koshurkouter,
  title        = {Koshur Kouter KS-EN v1: A Merged QLoRA Kashmiri--English Translation Model},
  author       = {Malik, Haq Nawaz and Nissar, Nahfid},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/Omarrran/koshur-kouter-ks-en_v1}},
  note         = {Fine-tuned from sarvamai/sarvam-translate}
}
And the original base model:
@misc{sarvam2025translate,
  title        = {Sarvam-Translate},
  author       = {{Sarvam AI}},
  howpublished = {\url{https://huggingface.co/sarvamai/sarvam-translate}}
}
Acknowledgements
This model builds on Omarrran/koshur-kouter-ks-en_v1, which was fine-tuned
by Haq Nawaz Malik & Nahfid Nissar (2026),
itself built on sarvamai/sarvam-translate (Gemma 3, 4.5B) by Sarvam AI.
Evaluated on FLORES-200 devtest. COMET scored using Unbabel/wmt22-comet-da.