
eng-mal-translator

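This model is a fine-tuned version of facebook/nllb-200-distilled-600M, trained on the Govardhan-06/flores_eng_mal dataset. After the final (10th) training epoch it achieves the following results on the evaluation set: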

  • Loss: 1.1660
  • Bleu: 16.9895

Overview

This project fine-tunes the facebook/nllb-200-distilled-600M checkpoint, loaded through Hugging Face's Transformers library, on a custom English-Malayalam parallel dataset. The goal is accurate translation of English input text into Malayalam.

Dataset Used

The training data is a curated dataset of parallel English-Malayalam text pairs, used for both training and evaluation of the translation model. It is available at https://huggingface.co/datasets/Govardhan-06/flores_eng_mal
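As a minimal sketch, the dataset can be pulled from the Hub with the `datasets` library and inspected before training. The split names and column names are not documented in this card, so the helper below only reports whatever it finds. `load_parallel_data` and `describe_splits` are hypothetical helper names introduced for illustration.

```python
DATASET_ID = "Govardhan-06/flores_eng_mal"  # taken from the dataset URL above


def load_parallel_data(dataset_id=DATASET_ID):
    """Download the dataset from the Hugging Face Hub (needs network access)."""
    # Imported lazily so this file can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(dataset_id)


def describe_splits(dataset_dict):
    """Map each split name to (row count, column names)."""
    return {
        name: (len(split), split.column_names)
        for name, split in dataset_dict.items()
    }
```

Calling `describe_splits(load_parallel_data())` should show which splits exist and which columns hold the English and Malayalam text.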

Model Used

The base model is facebook/nllb-200-distilled-600M, a distilled 600M-parameter NLLB-200 checkpoint, chosen for its efficiency and performance on sequence-to-sequence tasks.

Functionality

Given English input text, the model generates the corresponding Malayalam translation.
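A minimal inference sketch with the Transformers library follows. The repository id in `MODEL_ID` is a placeholder, since this card does not state the full Hub path, and `load_translator` and `translate` are hypothetical helper names. The language codes `eng_Latn` and `mal_Mlym` are the standard NLLB codes for English and Malayalam. The sketch assumes the fine-tuned checkpoint keeps the NLLB tokenizer.

```python
MODEL_ID = "your-namespace/eng-mal-translator"  # placeholder: replace with the real repo id
SRC_LANG = "eng_Latn"  # NLLB code for English
TGT_LANG = "mal_Mlym"  # NLLB code for Malayalam


def load_translator(model_id=MODEL_ID):
    """Load the tokenizer and model (downloads weights on first use)."""
    # Imported lazily so the module can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang=SRC_LANG)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def translate(text, tokenizer, model, max_new_tokens=128):
    """Translate a single English string into Malayalam."""
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        # Force the decoder to start with the Malayalam language token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

Typical use would be `tokenizer, model = load_translator()` followed by `translate("How are you?", tokenizer, model)`. The `max_new_tokens` cap of 128 is an arbitrary choice for short sentences.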

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
  • mixed_precision_training: Native AMP
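The hyperparameters above map onto `Seq2SeqTrainingArguments` roughly as sketched below. This is a reconstruction under stated assumptions, not the author's actual training script: `output_dir` is a hypothetical path, and `eval_strategy="epoch"` and `predict_with_generate=True` are inferred from the per-epoch BLEU scores rather than listed in the card. `build_training_args` is a hypothetical helper name.

```python
# Values copied from the hyperparameter list; argument names follow
# transformers.Seq2SeqTrainingArguments (Transformers 4.42).
TRAINING_KWARGS = dict(
    output_dir="eng-mal-translator",  # hypothetical output path
    learning_rate=3e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    fp16=True,                   # "Native AMP"; requires a CUDA GPU
    eval_strategy="epoch",       # assumption: one evaluation row per epoch
    predict_with_generate=True,  # assumption: BLEU is scored on generated text
)


def build_training_args(**overrides):
    """Create the arguments object; keyword overrides replace the defaults."""
    # Imported lazily so the constants can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**{**TRAINING_KWARGS, **overrides})
```

On a machine without a GPU, `build_training_args(fp16=False)` avoids the mixed-precision requirement.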

Training results

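| Training Loss | Epoch | Step | Validation Loss | Bleu    |
|---------------|-------|------|-----------------|---------|
| No log        | 1.0   | 226  | 1.1084          | 15.0719 |
| No log        | 2.0   | 452  | 1.0917          | 16.3698 |
| 1.1672        | 3.0   | 678  | 1.0952          | 16.2931 |
| 1.1672        | 4.0   | 904  | 1.0994          | 16.7858 |
| 0.8967        | 5.0   | 1130 | 1.1154          | 16.5906 |
| 0.8967        | 6.0   | 1356 | 1.1300          | 17.7039 |
| 0.7415        | 7.0   | 1582 | 1.1414          | 16.8886 |
| 0.7415        | 8.0   | 1808 | 1.1523          | 17.1442 |
| 0.6532        | 9.0   | 2034 | 1.1628          | 16.9454 |
| 0.6532        | 10.0  | 2260 | 1.1660          | 16.9895 |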
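The short script below reads the results table to find the epoch with the highest BLEU and the epoch with the lowest validation loss, and to estimate the training-set size from the step count. The size estimate assumes a single device, no gradient accumulation, and a final partial batch that is kept. None of these is confirmed by the card.

```python
# (epoch, step, validation_loss, bleu) rows copied from the results table.
RESULTS = [
    (1, 226, 1.1084, 15.0719),
    (2, 452, 1.0917, 16.3698),
    (3, 678, 1.0952, 16.2931),
    (4, 904, 1.0994, 16.7858),
    (5, 1130, 1.1154, 16.5906),
    (6, 1356, 1.1300, 17.7039),
    (7, 1582, 1.1414, 16.8886),
    (8, 1808, 1.1523, 17.1442),
    (9, 2034, 1.1628, 16.9454),
    (10, 2260, 1.1660, 16.9895),
]

best_bleu = max(RESULTS, key=lambda row: row[3])
lowest_loss = min(RESULTS, key=lambda row: row[2])

# Steps per epoch times batch size bounds the number of training examples.
steps_per_epoch = RESULTS[0][1]
batch_size = 8  # train_batch_size from the hyperparameter list
max_examples = steps_per_epoch * batch_size
min_examples = (steps_per_epoch - 1) * batch_size + 1

print(f"Best BLEU: {best_bleu[3]} at epoch {best_bleu[0]}")
print(f"Lowest validation loss: {lowest_loss[2]} at epoch {lowest_loss[0]}")
print(f"Estimated training examples: {min_examples} to {max_examples}")
```

Validation loss is lowest at epoch 2 and rises afterwards while training loss keeps falling, whereas BLEU peaks at epoch 6. The reported final-epoch checkpoint is therefore not the best one by either measure.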

Framework versions

  • Transformers 4.42.3
  • PyTorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size

  • Parameters: 615M
  • Tensor type: F32
  • Weights format: Safetensors