mBART-50 English ↔ Pnar Translation (Gold Dataset)

Model Description

This model is a fine-tuned version of Facebook's mBART-50 multilingual sequence-to-sequence model for machine translation between English and Pnar.

The model was trained on a curated Gold parallel dataset consisting of English–Pnar sentence pairs and evaluated on a held-out test set.

Base Model

  • Model: facebook/mbart-large-50
  • Architecture: Transformer Encoder-Decoder
  • Framework: Hugging Face Transformers

Training Details

Dataset

  • Language Pair: English ↔ Pnar
  • Training Dataset: Gold Parallel Corpus
  • Validation Dataset: Held-out Gold Validation Set
  • Test Dataset: 604 parallel sentence pairs

Training Configuration

  • Epochs: 5
  • Learning Rate: 5e-5
  • Batch Size: 16
  • Mixed Precision: FP16
  • Optimizer: AdamW
  • Maximum Sequence Length: 160
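
The settings above could map onto Hugging Face training arguments roughly as follows. This is a minimal sketch, not the author's training script: the keys follow the parameter names of `Seq2SeqTrainingArguments`, while the output directory, the reading of the batch size as per-device, and the use of the 160-token limit at generation time are assumptions.

```python
# Hyperparameters from the list above, keyed by the matching
# Seq2SeqTrainingArguments parameter names (an assumed mapping).
training_kwargs = {
    "output_dir": "mbart-pnar-gold",     # hypothetical output path
    "num_train_epochs": 5,
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 16,   # assumes 16 is the per-device size
    "fp16": True,                        # mixed precision; requires a CUDA GPU
    "predict_with_generate": True,       # assumption: decode during evaluation
    "generation_max_length": 160,        # assumption: same limit as the inputs
}

# Applied when tokenizing source and target sentences.
MAX_SEQ_LENGTH = 160

# With transformers installed, the dictionary would be passed as:
#   from transformers import Seq2SeqTrainingArguments
#   args = Seq2SeqTrainingArguments(**training_kwargs)
# The Trainer uses an AdamW optimizer by default, so no extra argument is needed.
```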

Best Validation Performance

Epoch   Validation Loss
1       2.2268
2       1.7741
3       1.7491
4       1.8374
5       1.9467

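The lowest validation loss (1.7491) was reached at epoch 3. Loss rose again in epochs 4 and 5, which suggests the model began to overfit after the third epoch.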
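
Picking a checkpoint from these numbers can be sketched in a few lines. The helper names below are illustrative, and the card does not state whether the published weights come from epoch 3 or from the final epoch.

```python
# Validation loss per epoch, copied from the table above.
val_loss = {1: 2.2268, 2: 1.7741, 3: 1.7491, 4: 1.8374, 5: 1.9467}


def best_epoch(losses):
    """Return the epoch with the lowest validation loss."""
    return min(losses, key=losses.get)


def epochs_since_best(losses):
    """Count how many epochs ran after the best one (a simple overfitting signal)."""
    return max(losses) - best_epoch(losses)


best = best_epoch(val_loss)
print(best, val_loss[best])          # 3 1.7491
print(epochs_since_best(val_loss))   # 2
```

With an early-stopping patience of one epoch, training on this curve would have stopped after epoch 4.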

Evaluation

Evaluation was performed on a held-out test set containing 604 English–Pnar sentence pairs. Metrics were computed in both translation directions.

English → Pnar

Metric   Score
BLEU     25.80
ChrF     49.16
TER      58.84
COMET    0.6731

Pnar → English

Metric   Score
BLEU     9.19
ChrF     27.76
TER      94.73
COMET    0.4500
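
Unlike the other metrics, TER is an error rate, so lower is better: it reports the number of edits needed to turn the system output into the reference, as a percentage of the reference length. The sketch below is a simplified word-level edit rate that illustrates how to read such a score. It is not the tool used for the numbers above (the card does not name one), and it omits TER's block-shift operation, so it will generally be at least as high as a true TER score on the same tokens. The example sentences are English placeholders, not test-set data.

```python
def word_edit_rate(hypothesis: str, reference: str) -> float:
    """Word-level edit distance as a percentage of the reference length.

    Simplified stand-in for TER: counts insertions, deletions, and
    substitutions, but not block shifts.
    """
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Standard dynamic-programming edit distance, one row at a time.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # drop a hypothesis word
                            curr[j - 1] + 1,      # add a missing reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One missing word out of four reference words.
print(word_edit_rate("the chairs ready", "the chairs are ready"))  # 25.0
```

Read this way, a score near 95 for Pnar → English means the output needs almost as many word edits as the reference has words.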

Usage

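from transformers import MBartForConditionalGeneration, MBart50Tokenizer

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub.
model = MBartForConditionalGeneration.from_pretrained("FithaAsma/mbart-pnar-gold")
tokenizer = MBart50Tokenizer.from_pretrained("FithaAsma/mbart-pnar-gold")

# English source sentence. MBart50Tokenizer treats input as English (en_XX)
# unless the saved tokenizer config specifies another source language.
text = "Please arrange the chairs before the guests arrive."

# Tokenize with the same 160-token limit used during training.
inputs = tokenizer(
    text,
    return_tensors="pt",
    truncation=True,
    max_length=160
)

# Beam search decoding. This card does not state which language code stands
# in for Pnar, so the target-language token is left to the checkpoint's
# saved generation settings.
outputs = model.generate(
    **inputs,
    max_length=160,
    num_beams=4
)

# Convert the generated token IDs back to text, dropping special tokens.
translation = tokenizer.decode(
    outputs[0],
    skip_special_tokens=True
)

print(translation)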

Limitations

  • Performance is substantially stronger in the English → Pnar direction (BLEU 25.80, ChrF 49.16) than in the reverse direction (BLEU 9.19, ChrF 27.76).
  • The model was trained on a limited-resource language pair and may struggle with domain-specific terminology, named entities, or highly complex sentences.
  • Additional training data and larger-scale fine-tuning may further improve translation quality.

Citation

If you use this model, please cite:

  • Facebook AI Research. mBART: Multilingual Denoising Pre-training for Neural Machine Translation.
  • This repository's associated project and dataset.

License

Please ensure compliance with the licenses of the original mBART model and the datasets used for training.
