mBART-50 English ↔ Pnar Translation (Gold Dataset)

Model Description

This model is a fine-tuned version of Facebook's mBART-50 multilingual sequence-to-sequence model for machine translation between English and Pnar.

The model was trained on a curated Gold parallel dataset consisting of English–Pnar sentence pairs and evaluated on a held-out test set.

Base Model

Model: facebook/mbart-large-50
Architecture: Transformer Encoder-Decoder
Framework: Hugging Face Transformers

Training Details

Dataset

Language Pair: English ↔ Pnar
Training Dataset: Gold Parallel Corpus
Validation Dataset: Held-out Gold Validation Set
Test Dataset: 604 parallel sentence pairs

Training Configuration

Epochs: 5
Learning Rate: 5e-5
Batch Size: 16
Mixed Precision: FP16
Optimizer: AdamW
Maximum Sequence Length: 160
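
As a rough illustration, the settings above could be collected into keyword arguments of the kind accepted by `Seq2SeqTrainingArguments` in Hugging Face Transformers. This is a sketch, not the original training script: the output directory is a hypothetical name, and treating the batch size of 16 as a per-device value is an assumption, since the card does not say how it was counted.

```python
# Hypothetical mapping of the listed hyperparameters onto Trainer-style
# keyword arguments. Values come from the list above; names marked as
# assumptions are not confirmed by the model card.
training_kwargs = {
    "output_dir": "mbart-pnar-gold-run",   # hypothetical path
    "num_train_epochs": 5,
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 16,     # assumption: 16 is per device
    "fp16": True,                          # mixed precision (needs a GPU)
    "optim": "adamw_torch",                # AdamW optimizer
}

# The sequence limit is applied at tokenization time, not in the arguments.
MAX_SEQ_LENGTH = 160

for name, value in training_kwargs.items():
    print(f"{name} = {value}")
```

With Transformers installed, these could be passed as `Seq2SeqTrainingArguments(**training_kwargs)`, while `MAX_SEQ_LENGTH` would go to the tokenizer as `max_length`.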

Best Validation Performance

Epoch	Validation Loss
1	2.2268
2	1.7741
3	1.7491
4	1.8374
5	1.9467

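The lowest validation loss (1.7491) was reached at epoch 3. Loss rose again in epochs 4 and 5, which suggests the model began to overfit after that point.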
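
The card does not say whether early stopping or best-checkpoint selection was used. The sketch below simply shows how the validation losses in the table could drive either choice; `early_stop_epoch` and its `patience` parameter are illustrative names, not part of the original training setup.

```python
# Validation losses copied from the table above (epoch -> loss).
val_loss = {1: 2.2268, 2: 1.7741, 3: 1.7491, 4: 1.8374, 5: 1.9467}

# Best-checkpoint selection: the epoch with the lowest validation loss.
best_epoch = min(val_loss, key=val_loss.get)


def early_stop_epoch(losses, patience=1):
    """Return the epoch at which training would stop, or None if it never does.

    Training stops once the loss has failed to improve on the best value
    seen so far for `patience` consecutive epochs.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in sorted(losses.items()):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


print("best epoch:", best_epoch)
print("stop with patience=1:", early_stop_epoch(val_loss, patience=1))
print("stop with patience=2:", early_stop_epoch(val_loss, patience=2))
```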

Evaluation

Evaluation was performed on a held-out test set containing 604 English–Pnar sentence pairs. Metrics were computed in both translation directions. For BLEU, ChrF, and COMET, higher is better; TER is an edit rate, so lower is better.

English → Pnar

Metric	Score
BLEU	25.80
ChrF	49.16
TER	58.84
COMET	0.6731

Pnar → English

Metric	Score
BLEU	9.19
ChrF	27.76
TER	94.73
COMET	0.4500
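
To give a feel for what the ChrF column measures, here is a simplified, sentence-level character n-gram F-score in plain Python. It is an illustration only: the card does not say which tool produced the reported scores, and this sketch is not the implementation behind them (standard tools such as sacreBLEU differ in details such as whitespace handling and corpus-level aggregation). The function name `simple_chrf` and the example strings are hypothetical.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams, ignoring spaces."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str,
                max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF: F-beta over averaged char n-gram precision and recall."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one side is too short to have n-grams of this order
        overlap = sum((hyp & ref).values())  # clipped matching n-grams
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # beta = 2 weights recall more heavily than precision.
    return 100 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


# Placeholder strings, not real model outputs.
print(simple_chrf("the chairs are ready", "the chairs are ready"))
print(simple_chrf("the chairs are ready", "the chairs were arranged"))
```

Because the score works on characters rather than whole words, partially correct word forms still earn credit, which is one reason ChrF is often reported alongside BLEU for low-resource languages.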

Usage

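from transformers import MBartForConditionalGeneration, MBart50Tokenizer

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub.
model = MBartForConditionalGeneration.from_pretrained("FithaAsma/mbart-pnar-gold")
tokenizer = MBart50Tokenizer.from_pretrained("FithaAsma/mbart-pnar-gold")

# English source sentence. "en_XX" is the mBART-50 language code for English.
tokenizer.src_lang = "en_XX"
text = "Please arrange the chairs before the guests arrive."

# Tokenize with the same 160-token limit used during training.
inputs = tokenizer(
    text,
    return_tensors="pt",
    truncation=True,
    max_length=160
)

# Decode with beam search. Pnar is not one of the 50 mBART-50 languages and
# this card does not document a target-language token, so none is forced here.
outputs = model.generate(
    **inputs,
    max_length=160,
    num_beams=4
)

# Convert the generated token IDs back to text.
translation = tokenizer.decode(
    outputs[0],
    skip_special_tokens=True
)

print(translation)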

Limitations

Performance is substantially stronger in the English → Pnar direction (BLEU 25.80, ChrF 49.16, TER 58.84) than in the Pnar → English direction (BLEU 9.19, ChrF 27.76, TER 94.73).
The model was trained on a limited-resource language pair and may struggle with domain-specific terminology, named entities, or highly complex sentences.
Additional training data and larger-scale fine-tuning may further improve translation quality.

Citation

If you use this model, please cite:

Liu et al. (2020), Facebook AI Research. Multilingual Denoising Pre-training for Neural Machine Translation (mBART).
This repository's associated project and dataset.

License

Please ensure compliance with the licenses of the original mBART model and the datasets used for training.


