English-Hausa Neural Machine Translation

This model is a fine-tuned version of facebook/nllb-200-distilled-600M for English to Hausa translation.

Model description

  • Model Architecture: NLLB-200 (No Language Left Behind)
  • Base Model: facebook/nllb-200-distilled-600M
  • Fine-tuning Dataset: TICO-19 English-Hausa parallel corpus
  • BLEU Score: 61.48
  • Languages: English (eng_Latn) → Hausa (hau_Latn)
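
The fine-tuning data consists of English-Hausa sentence pairs using the language codes listed above. As a rough illustration, the snippet below shows how such a parallel pair can be tokenized for seq2seq training with the NLLB tokenizer; the example sentences and preprocessing settings are assumptions for illustration, not the actual fine-tuning script.

from transformers import AutoTokenizer

# Hedged preprocessing sketch; the actual fine-tuning pipeline may differ
tokenizer = AutoTokenizer.from_pretrained(
    "facebook/nllb-200-distilled-600M",
    src_lang="eng_Latn",  # English source, as listed above
    tgt_lang="hau_Latn",  # Hausa target, as listed above
)

# One English-Hausa pair in the style of the TICO-19 parallel corpus (illustrative)
english = "Wash your hands regularly."
hausa = "Wanke hannuwanka akai-akai."

# text_target tokenizes the Hausa side as labels for seq2seq training
batch = tokenizer(
    english,
    text_target=hausa,
    max_length=128,
    truncation=True,
    return_tensors="pt",
)
print(batch["input_ids"].shape, batch["labels"].shape)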

Intended uses & limitations

This model is designed for translating English text to Hausa. It performs best on:

  • General domain text
  • Short to medium-length sentences

Training and Evaluation

Training Hyperparameters

The model was fine-tuned with the following hyperparameters; a configuration sketch follows the list:

  • Learning rate: 1e-5
  • Batch size: 16
  • Number of epochs: 30
  • Weight decay: 0.01
  • Maximum sequence length: 128
  • Beam size: 10
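
These values map onto the Transformers Seq2SeqTrainingArguments roughly as follows. This is a hedged reconstruction from the list above rather than the exact training script; auxiliary settings such as the output directory are assumptions.

from transformers import Seq2SeqTrainingArguments

# Hedged reconstruction of the hyperparameters listed above;
# other settings used in the actual run are not known
training_args = Seq2SeqTrainingArguments(
    output_dir="english-hausa-nllb",  # assumed output directory
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=30,
    weight_decay=0.01,
    predict_with_generate=True,       # generate during evaluation so BLEU can be computed
    generation_max_length=128,        # maximum sequence length
    generation_num_beams=10,          # beam size
)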

Training Results

  • BLEU score: 61.48
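
The reported score can be reproduced in form (though not in value, since the evaluation set is not included here) with the sacreBLEU implementation in the evaluate library; the sentences below are placeholders, not items from the actual test set.

import evaluate

# Sketch of BLEU scoring with sacreBLEU; the real evaluation data and any
# post-processing behind the reported 61.48 are not shown here
bleu = evaluate.load("sacrebleu")

predictions = ["Sannu, yaya kake?"]   # model outputs (placeholder)
references = [["Sannu, yaya kake?"]]  # reference translations (placeholder)

result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 2))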

How to use

Here's how to use the model with the Transformers library:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load model and tokenizer; setting src_lang here makes the tokenizer
# prepend the correct eng_Latn language token to the input
model_name = "mide6x/english-hausa-nllb"
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Prepare text for translation
text = "Hello, how are you?"
inputs = tokenizer(text, return_tensors="pt", padding=True)

# Get the Hausa language token ID to force as the first generated token
hausa_token_id = tokenizer.convert_tokens_to_ids("hau_Latn")

# Generate translation
outputs = model.generate(
    **inputs,
    forced_bos_token_id=hausa_token_id,
    max_length=128,
    num_beams=10,
    early_stopping=True
)

# Decode the translation
translation = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(f"English: {text}")
print(f"Hausa: {translation}")

Limitations and Biases

  • Performance may degrade on very long sentences or complex grammatical structures, since training inputs were limited to 128 tokens
  • The model inherits any biases present in the training data

Citation

If you use this model, please cite:

  • The original NLLB-200 paper
  • The TICO-19 dataset

Author

GenZ AI
