
Terjman-Supreme (3.3B)

Our model is built upon the powerful Transformer architecture, leveraging state-of-the-art natural language processing techniques. It is a fine-tuned version of facebook/nllb-200-3.3B on the darija_english dataset, enhanced with curated corpora to ensure high-quality and accurate translations.

It achieves the following results on the evaluation set:

  • Loss: 2.3687
  • Bleu: 5.6718
  • Gen Len: 39.9504

The fine-tuning was conducted on an A100-40GB GPU and took 57 hours.

Try it out on our dedicated Terjman-Supreme Space 🤗

Usage

Using our model for translation is simple and straightforward. You can integrate it into your projects or workflows via the Hugging Face Transformers library. Here's a basic example of how to use the model in Python:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("atlasia/Terjman-Supreme")
model = AutoModelForSeq2SeqLM.from_pretrained("atlasia/Terjman-Supreme")

# Define the English text you want to translate into Moroccan Darija
input_text = "Your English text goes here."

# Tokenize the input text
input_tokens = tokenizer(input_text, return_tensors="pt", padding=True, truncation=True)

# Perform translation
output_tokens = model.generate(**input_tokens)

# Decode the output tokens
output_text = tokenizer.decode(output_tokens[0], skip_special_tokens=True)

print("Translation:", output_text)

Example

Let's see an example of translating English into Moroccan Darija:

Input: "Hi my friend, can you tell me a joke in moroccan darija? I'd be happy to hear that from you!"

Output: "أهلا صاحبي، واش تقدر تقول لي نكتة بالدارجة المغربية؟ غادي نكون فرحان باش نسمعها منك!"

Limitations

This version has some limitations, mainly due to the tokenizer. We're currently collecting more data with the aim of continuous improvement.

Feedback

We're continuously striving to improve our model's performance and usability, and we will keep improving it incrementally. If you have any feedback, suggestions, or issues, please don't hesitate to reach out to us.

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 3e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 25
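
For reference, here is a minimal sketch of how these settings map onto Transformers' Seq2SeqTrainingArguments. The output_dir is a placeholder, and the evaluation/precision settings are illustrative assumptions, not taken from this card:

from transformers import Seq2SeqTrainingArguments

# Illustrative mapping of the hyperparameters listed above.
training_args = Seq2SeqTrainingArguments(
    output_dir="terjman-supreme",       # placeholder path, not the authors' actual directory
    learning_rate=3e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,      # effective train batch size of 4
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    num_train_epochs=25,
    seed=42,
    predict_with_generate=True,         # needed to compute BLEU and Gen Len during evaluation
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer's default optimizer,
# so no extra optimizer configuration is required to match the listed settings.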

Training results

| Training Loss | Epoch   | Step   | Validation Loss | Bleu   | Gen Len |
|---------------|---------|--------|-----------------|--------|---------|
| 2.7591        | 1.0000  | 8968   | 2.5401          | 4.6975 | 39.4298 |
| 2.593         | 1.9999  | 17936  | 2.4204          | 5.4569 | 40.4573 |
| 2.6088        | 2.9999  | 26904  | 2.3929          | 5.5729 | 40.7879 |
| 2.5744        | 4.0     | 35873  | 2.3821          | 5.567  | 40.27   |
| 2.5658        | 5.0000  | 44841  | 2.3752          | 5.6589 | 40.3085 |
| 2.5786        | 5.9999  | 53809  | 2.3725          | 5.5713 | 39.9532 |
| 2.5197        | 6.9999  | 62777  | 2.3705          | 5.5688 | 39.9642 |
| 2.5599        | 8.0     | 71746  | 2.3703          | 5.644  | 39.9449 |
| 2.5763        | 9.0000  | 80714  | 2.3692          | 5.5916 | 40.3581 |
| 2.5734        | 9.9999  | 89682  | 2.3695          | 5.6339 | 40.0248 |
| 2.5461        | 10.9999 | 98650  | 2.3689          | 5.6151 | 40.0055 |
| 2.5712        | 12.0    | 107619 | 2.3690          | 5.5447 | 40.3554 |
| 2.5338        | 13.0000 | 116587 | 2.3685          | 5.6284 | 40.0138 |
| 2.5733        | 13.9999 | 125555 | 2.3693          | 5.7858 | 40.4105 |
| 2.5621        | 14.9999 | 134523 | 2.3684          | 5.6093 | 39.9614 |
| 2.5205        | 16.0    | 143492 | 2.3685          | 5.6545 | 40.3444 |
| 2.5815        | 17.0000 | 152460 | 2.3686          | 5.7027 | 40.3416 |
| 2.5666        | 17.9999 | 161428 | 2.3686          | 5.6684 | 39.9284 |
| 2.5721        | 18.9999 | 170396 | 2.3683          | 5.6005 | 40.0028 |
| 2.5325        | 20.0    | 179365 | 2.3685          | 5.688  | 40.3223 |
| 2.5385        | 21.0000 | 188333 | 2.3684          | 5.6137 | 39.989  |
| 2.5399        | 21.9999 | 197301 | 2.3681          | 5.6324 | 40.3499 |
| 2.5501        | 22.9999 | 206269 | 2.3679          | 5.6608 | 39.9725 |
| 2.5419        | 24.0    | 215238 | 2.3679          | 5.6112 | 40.3196 |
| 2.5583        | 24.9993 | 224200 | 2.3687          | 5.6718 | 39.9504 |

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1