Seq2Seq Model for Next Token Prediction (English-Hausa Translation)

Overview

This is a Seq2Seq model developed for next token prediction in an English-to-Hausa translation task. It was created to explore and implement the Seq2Seq architecture as part of a course assignment.

Assignment Context

This model was developed as part of an assignment that required:

  1. Training a Seq2Seq model for next token prediction.
  2. Comparing performance with an LSTM model using BLEU and ChrF scores.
  3. Publishing the model on Hugging Face Hub.

Model Architecture

  • Type: Seq2Seq (Encoder-Decoder)
  • Hidden Dimensions: 256
  • Layers: 2
  • Embedding Layer: Vocabulary size taken from the tokenizer
  • Loss Function: Cross-Entropy Loss with padding token ignored
  • Optimizer: Adam with learning rate of 0.001
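
The loss listed above, cross-entropy with the padding token ignored, can be illustrated with a small pure-Python sketch. This is not the training code from this repository; the function name `masked_cross_entropy`, the padding id `PAD_ID = 0`, and the toy logits are assumptions chosen for illustration.

```python
import math

PAD_ID = 0  # assumption: padding token id; the real value comes from the tokenizer


def masked_cross_entropy(logits, targets, pad_id=PAD_ID):
    """Mean cross-entropy over non-padding positions.

    logits:  list of per-position score lists (one score per vocabulary entry)
    targets: list of gold token ids, same length as logits
    """
    total, count = 0.0, 0
    for scores, target in zip(logits, targets):
        if target == pad_id:
            continue  # padding positions contribute nothing to the loss
        # log-sum-exp with the max subtracted for numerical stability
        peak = max(scores)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_norm - scores[target]
        count += 1
    return total / count if count else 0.0


# Toy example: vocabulary of 3 tokens; the last position is padding.
logits = [[0.0, 2.0, 0.0], [0.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
targets = [1, 2, PAD_ID]
loss = masked_cross_entropy(logits, targets)
print(round(loss, 4))
```

In PyTorch, the same masking is usually obtained by passing `ignore_index` to `torch.nn.CrossEntropyLoss`, so padded positions are excluded from the average.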

Training Details

  • Dataset: opus100 English-Hausa dataset
  • Epochs: 10
  • Batch Size: 32
  • Training Loss: Converged to near zero over the 10 epochs
  • Validation Loss: Minimal change across epochs

Evaluation Metrics

Epoch   BLEU Score   ChrF Score
1       0.0998       32.03
10      0.0998       32.03

The Seq2Seq model's scores did not change between epoch 1 and epoch 10: BLEU stayed at 0.0998 and ChrF at 32.03.
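
For readers unfamiliar with ChrF, the sketch below shows the idea behind it: a character n-gram F-score with recall weighted more heavily than precision. It is a simplified illustration, not the scoring code used for the numbers above, and it will not reproduce the output of standard tools exactly. The function names, the averaging over n-gram orders, and the toy strings are assumptions.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams, ignoring spaces (a simplifying assumption)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF on a 0-100 scale for one hypothesis/reference pair."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip orders longer than either string
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: beta = 2 weights recall more heavily than precision
    return 100 * (1 + beta**2) * p * r / (beta**2 * p + r)


# Toy strings for illustration only; not taken from the dataset.
print(simple_chrf("ina kwana", "ina kwana"))
print(simple_chrf("ina kwana", "ina kwano"))
```

An exact match scores 100 and strings with no shared characters score 0, so a value such as 32.03 indicates partial character-level overlap with the references.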

Usage

To load and use the model in your project, use the following code:

from transformers import AutoModel, AutoTokenizer

# Hub repository ID in "user/name" form (no leading slash)
repo_id = "AppalanaiduSaketi/LSTM-model-based-translator"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModel.from_pretrained(repo_id)