
🇲🇦 Terjman-Nano-v2.0 (77M) 🚀

Terjman-Nano-v2.0 is an improved version of atlasia/Terjman-Nano-v1, built on the Transformer architecture and fine-tuned for high-quality, accurate English-to-Moroccan Darija translation.

Compared to v1, this version has been trained on a larger and more refined dataset, leading to improved translation performance. The Terjman-v2 family achieves results competitive with gpt-4o-2024-08-06 on TerjamaBench, an evaluation benchmark for English-Moroccan Darija translation that places particular emphasis on cultural nuance.

🚀 Features

✅ Fine-tuned for English → Moroccan Darija translation.
✅ State-of-the-art performance among open-source models of its size.
✅ Compatible with 🤗 Transformers and easily deployable on various hardware setups.

🔥 Performance Comparison

The following table compares Terjman-Nano-v2.0 against proprietary and open-source models using BLEU, chrF, and TER scores. Higher BLEU/chrF and lower TER indicate better translation quality.

| Model | Size | BLEU ↑ | chrF ↑ | TER ↓ |
|---|---|---|---|---|
| **Proprietary Models** | | | | |
| gemini-exp-1206 | * | 30.69 | 54.16 | 67.62 |
| claude-3-5-sonnet-20241022 | * | 30.51 | 51.80 | 67.42 |
| gpt-4o-2024-08-06 | * | 28.30 | 50.13 | 71.77 |
| **Open-Source Models** | | | | |
| Terjman-Ultra-v2.0 | 1.3B | 25.00 | 44.70 | 77.20 |
| Terjman-Supreme-v2.0 | 3.3B | 23.43 | 44.57 | 78.17 |
| Terjman-Large-v2.0 | 240M | 22.67 | 42.57 | 83.00 |
| Terjman-Nano-v2.0 (this model) | 77M | 18.84 | 38.41 | 94.73 |
| atlasia/Terjman-Large-v1.2 | 240M | 16.33 | 37.10 | 89.13 |
| MBZUAI-Paris/Atlas-Chat-9B | 9B | 14.80 | 35.26 | 93.95 |
| facebook/nllb-200-3.3B | 3.3B | 14.76 | 34.17 | 94.33 |
| atlasia/Terjman-Nano | 77M | 9.98 | 26.55 | 106.49 |
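
The BLEU, chrF, and TER scores can be computed with the sacrebleu library. The snippet below is a minimal sketch of how such scores are obtained; the sentences are placeholders, not the TerjamaBench evaluation script:

# pip install sacrebleu
import sacrebleu

# Placeholder model outputs and gold references (one reference set).
hypotheses = ["صباح الخير!", "الطقس زوين اليوم فجنيف."]
references = [["صباح الخير!", "الجو زوين اليوم فجنيف."]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)
ter = sacrebleu.corpus_ter(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}  chrF: {chrf.score:.2f}  TER: {ter.score:.2f}")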

🔬 Model Details

  • Base Model: atlasia/Terjman-Nano-v1
  • Architecture: Transformer-based sequence-to-sequence model
  • Training Data: A high-quality parallel English–Moroccan Darija corpus
  • Training Precision: Mixed FP16 for efficient training and inference (see the loading sketch below)
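
If memory or latency matters, the weights can also be loaded in half precision. This is a minimal sketch assuming a CUDA GPU is available; the model also runs in full precision on CPU:

import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "BounharAbdelaziz/Terjman-Nano-v2.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Assumption: a CUDA device is available; drop torch_dtype and .to("cuda") for CPU.
model = AutoModelForSeq2SeqLM.from_pretrained(model_name, torch_dtype=torch.float16).to("cuda")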

🚀 How to Use

You can use the model with the Hugging Face Transformers library:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "BounharAbdelaziz/Terjman-Nano-v2.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def translate(text):
    # Tokenize the English input and generate the Darija translation with default settings.
    inputs = tokenizer(text, return_tensors="pt")
    output = model.generate(**inputs)
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example translation
text = "Hello there! Today the weather is so nice in Geneva, couldn't ask for more to enjoy the holidays :)"
translation = translate(text)
print("Translation:", translation)
# prints: صباح الخير! اليوم الطقس زوين بزاف فجنيف، ما قدرش نطلب أكثر باش نستمتع بالعطلات :)
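
For multiple sentences or longer inputs, generation can be batched and tuned. The beam size and token limit below are illustrative assumptions, not the settings used to produce the benchmark scores:

sentences = [
    "How are you today?",
    "This dish is delicious, thank you!",
]
# Pad the batch so all sequences share the same length, then translate them together.
inputs = tokenizer(sentences, return_tensors="pt", padding=True)
outputs = model.generate(**inputs, num_beams=4, max_new_tokens=128)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))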

๐Ÿ–ฅ๏ธ Deployment

Run in a Hugging Face Space

Try the model interactively in the Terjman-Nano Space 🤗

Use with Text Generation Inference (TGI)

For fast inference, you can serve the model with Hugging Face Text Generation Inference (TGI). Note that the text-generation pip package installs the Python client; the text-generation-launcher command comes with a TGI server installation (typically the official Docker image or a source build):

pip install text-generation  # Python client for a running TGI server
text-generation-launcher --model-id BounharAbdelaziz/Terjman-Nano-v2.0  # requires a TGI server installation
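
Once a TGI server is running, it can be queried from Python with the text-generation client. This is a minimal sketch assuming the server listens locally on port 8080; the URL and generation parameters are illustrative:

from text_generation import Client

# Assumption: a TGI server for this model is reachable at this address.
client = Client("http://127.0.0.1:8080")
response = client.generate("Hello there!", max_new_tokens=64)
print(response.generated_text)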

Run Locally with Transformers & PyTorch

pip install transformers torch
python -c "from transformers import pipeline; print(pipeline('translation', model='BounharAbdelaziz/Terjman-Nano-v2.0')('Hello there!'))"
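
For more than a one-off call, the same pipeline can be built once and reused. A minimal script version of the one-liner above:

from transformers import pipeline

# Build the translation pipeline once and reuse it for every request.
translator = pipeline("translation", model="BounharAbdelaziz/Terjman-Nano-v2.0")

results = translator(["Hello there!", "See you tomorrow."])
for result in results:
    print(result["translation_text"])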

Deploy on an API Server

Use FastAPI to serve translations as an API:

from fastapi import FastAPI
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

app = FastAPI()

# Load the tokenizer and model once at startup so every request reuses them.
model_name = "BounharAbdelaziz/Terjman-Nano-v2.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

@app.get("/translate/")
def translate(text: str):
    inputs = tokenizer(text, return_tensors="pt")
    output = model.generate(**inputs)
    return {"translation": tokenizer.decode(output[0], skip_special_tokens=True)}
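
Assuming the code above is saved as app.py, it can be served with uvicorn (pip install fastapi uvicorn, then uvicorn app:app --port 8000) and queried from any HTTP client. The file name and port here are illustrative:

import requests

# Call the local FastAPI server defined above (assumed to be running on port 8000).
response = requests.get(
    "http://127.0.0.1:8000/translate/",
    params={"text": "Hello there!"},
)
print(response.json()["translation"])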

๐Ÿ› ๏ธ Training Details Hyperparameters**

The model was fine-tuned using the following training settings; an equivalent 🤗 Transformers configuration sketch follows the list:

  • Learning Rate: 0.0001
  • Training Batch Size: 64
  • Evaluation Batch Size: 64
  • Seed: 42
  • Gradient Accumulation Steps: 4
  • Total Effective Batch Size: 256
  • Optimizer: AdamW (Torch) with betas=(0.9,0.999), epsilon=1e-08
  • Learning Rate Scheduler: Linear
  • Warmup Ratio: 0.1
  • Epochs: 5
  • Precision: Mixed FP16 for efficient training
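
Expressed as 🤗 Transformers training arguments, these settings correspond roughly to the sketch below. This is not the original training script; dataset handling, evaluation, logging, and checkpointing options are omitted:

from transformers import Seq2SeqTrainingArguments

# Hyperparameters from the list above; everything else is left at its default.
training_args = Seq2SeqTrainingArguments(
    output_dir="terjman-nano-v2",     # placeholder output directory
    learning_rate=1e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=4,    # 64 x 4 = 256 effective batch size
    num_train_epochs=5,
    optim="adamw_torch",              # AdamW with betas=(0.9, 0.999), eps=1e-8 (defaults)
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    seed=42,
    fp16=True,                        # mixed-precision training
)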

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.21.0

📜 License

This model is released under the CC BY-NC (Creative Commons Attribution-NonCommercial) license, meaning it can be used for research and personal projects but not for commercial purposes. For commercial use, please get in touch :)

Citation

@misc{terjman-v2,
  title        = {Terjman-v2: High-Quality English-Moroccan Darija Translation Model},
  author       = {Abdelaziz Bounhar},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/BounharAbdelaziz/Terjman-Nano-v2.0}},
  license      = {CC BY-NC}
}