Seq2Seq Model for Next Token Prediction (English-Hausa Translation)

Overview

This Seq2Seq model performs next-token prediction for English-to-Hausa translation. It was built to explore and implement the Seq2Seq architecture as part of a course assignment.

Assignment Context

This model was developed as part of an assignment that required:

Training a Seq2Seq model for next token prediction.
Comparing performance with an LSTM model using BLEU and ChrF scores.
Publishing the model on Hugging Face Hub.

Model Architecture

Type: Seq2Seq (Encoder-Decoder)
Hidden Dimensions: 256
Layers: 2
Embedding Layer: Sized to the tokenizer vocabulary
Loss Function: Cross-Entropy Loss with padding token ignored
Optimizer: Adam with learning rate of 0.001
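
The "padding token ignored" part of the loss can be illustrated with a minimal pure-Python sketch. It is not the training code used for this model: the function name masked_cross_entropy and the pad_id argument are hypothetical, and the logits are plain lists rather than tensors. The idea is only that positions whose target is the padding token are skipped, and the mean is taken over the remaining positions.

```python
import math

def masked_cross_entropy(logits, targets, pad_id):
    """Mean cross-entropy over non-padding target positions.

    logits:  list of per-position score lists (one score per vocabulary entry)
    targets: list of gold token ids, same length as logits
    pad_id:  id of the padding token (hypothetical; depends on the tokenizer)
    """
    total, count = 0.0, 0
    for scores, target in zip(logits, targets):
        if target == pad_id:
            continue  # padded positions contribute nothing to the loss
        # log-sum-exp with max subtraction for numerical stability
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[target]  # negative log-probability of the target
        count += 1
    # Assumption of this sketch: return 0.0 when every position is padding
    return total / count if count else 0.0

# Two positions; the second target is padding (pad_id=0) and is skipped
loss = masked_cross_entropy([[0.0, 0.0, 0.0], [5.0, 1.0, 1.0]], [1, 0], pad_id=0)
print(round(loss, 4))
```

In PyTorch, this behaviour is commonly obtained with torch.nn.CrossEntropyLoss(ignore_index=pad_id); the card does not say which implementation was used here.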

Training Details

Dataset: opus100 English-Hausa dataset
Epochs: 10
Batch Size: 32
Training Loss: Converged to near zero
Validation Loss: Changed little across epochs

Evaluation Metrics

Epoch	BLEU Score	ChrF Score
1	0.0998	32.03
10	0.0998	32.03

The Seq2Seq model's scores did not change between epoch 1 and epoch 10: BLEU stayed at 0.0998 and ChrF at 32.03.
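
For readers unfamiliar with ChrF, the sketch below shows the core idea: a character n-gram F-score that weights recall more heavily than precision. It is a simplified sentence-level version written for illustration, not the tool used to produce the scores above (the card does not name one), so it will not reproduce 32.03. The function names, the defaults max_n=6 and beta=2.0, and the example strings are assumptions of this sketch.

```python
from collections import Counter

def char_ngrams(text, n):
    """Count character n-grams of length n, ignoring spaces."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))

def chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified sentence-level chrF on a 0-100 scale."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip n-gram orders longer than either string
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: beta > 1 gives recall more weight than precision
    return 100 * (1 + beta**2) * p * r / (beta**2 * p + r)

# Illustrative strings only; not taken from the evaluation set
print(chrf("ina kwana", "ina kwana"))
print(chrf("ina kwana", "ina wuni"))
```

Established implementations differ in details such as whitespace handling and corpus-level aggregation, so a maintained library is the better choice for reported scores.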

Usage

To load and use the model in your project, use the following code:

from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("AppalanaiduSaketi/LSTM-model-based-translator")
model = AutoModel.from_pretrained("AppalanaiduSaketi/LSTM-model-based-translator")
