English-Chinese Translator with LSTM and Seq-to-Seq Models

Welcome to my project! 🎉 Here, I've built a translator that converts English to Traditional Chinese and vice versa using LSTM and sequence-to-sequence deep learning models. It's an exciting dive into the world of natural language processing (NLP) and machine translation.


Project Overview

This project includes:

  • Seq-to-Seq Model:
    A sequence-to-sequence model with encoder-decoder architecture for language translation.

  • LSTM Model:
    An LSTM-based translation model designed to handle sequential data and long-term dependencies.

  • Evaluation Metrics:

    • BLEU Score: Measures the quality of machine translation by comparing model output with reference translations.
    • ChrF Score: A character n-gram F-score. Because it works at the character level, it is particularly useful for languages like Chinese that are written without spaces between words.
  • Dataset:
    A bilingual dataset of 1000+ English-Traditional Chinese sentence pairs, split for training, validation, and testing.
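
To make the ChrF idea concrete, here is a minimal sketch of a character n-gram F-score in plain Python. It is a simplified illustration, not the evaluation code used in this project: the names `char_ngrams` and `simple_chrf` are hypothetical, whitespace is ignored, and precision and recall are averaged over n-gram orders 1 to 6 with beta = 2 (recall weighted more than precision). For reported scores, a maintained implementation such as sacreBLEU is the safer choice.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams, ignoring whitespace (a simplification)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified ChrF: F-beta over averaged character n-gram precision/recall.

    N-gram orders for which either string is too short are skipped.
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


# Illustrative sentences only; they are not taken from the project dataset.
print(simple_chrf("我喜歡貓", "我喜歡貓"))  # identical strings score 1.0
print(simple_chrf("我喜歡狗", "我喜歡貓"))  # partial overlap, between 0 and 1
```

Because the comparison is made on characters, no Chinese word segmenter is needed before scoring, which is the main practical appeal of this metric here.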


How It Works

  1. Data Preparation:

    • Sentences are tokenized and padded to maintain uniformity.
    • The dataset is split into training, validation, and test sets.
  2. Model Training:

    • Two models are trained:
      • LSTM Model for baseline performance.
      • Seq-to-Seq Model with an attention mechanism for enhanced results.
  3. Translation Process:

    • Input a sentence in English or Chinese.
    • The model generates a translation in the target language.
  4. Evaluation:

    • Use BLEU and ChrF scores to validate model performance.
    • Plot training and validation loss curves to monitor learning.
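
The data preparation step above can be sketched in plain Python. This is a minimal illustration under stated assumptions rather than the notebooks' actual code: the helper names, the `<pad>`/`<unk>` tokens, whitespace tokenization for English, per-character tokenization for Chinese, and the split fractions are all hypothetical choices.

```python
import random

PAD, UNK = "<pad>", "<unk>"  # assumed special tokens


def tokenize(sentence, lang):
    """Whitespace tokens for English; one token per character for Chinese."""
    if lang == "en":
        return sentence.lower().split()
    return [ch for ch in sentence if not ch.isspace()]


def build_vocab(token_lists):
    """Map each token to an integer id; ids 0 and 1 are reserved."""
    vocab = {PAD: 0, UNK: 1}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode_and_pad(token_lists, vocab, max_len):
    """Convert tokens to ids, then truncate or right-pad to max_len."""
    batch = []
    for tokens in token_lists:
        ids = [vocab.get(tok, vocab[UNK]) for tok in tokens][:max_len]
        batch.append(ids + [vocab[PAD]] * (max_len - len(ids)))
    return batch


def split_pairs(pairs, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle sentence pairs and split into train, validation, and test."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    n_test = max(1, int(len(shuffled) * test_fraction))
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


# Illustrative pairs only; they are not taken from the project dataset.
pairs = [("I like cats", "我喜歡貓"), ("Good morning", "早安"), ("Thank you", "謝謝")]
en_tokens = [tokenize(en, "en") for en, _ in pairs]
zh_tokens = [tokenize(zh, "zh") for _, zh in pairs]
en_vocab = build_vocab(en_tokens)
en_ids = encode_and_pad(en_tokens, en_vocab, max_len=4)
train_pairs, val_pairs, test_pairs = split_pairs(pairs)
print(en_ids)
```

Padding every sentence to the same length lets a batch be stored as a rectangular array, and reserving id 0 for padding makes it easy to mask those positions out of the loss.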

What I Learned

  • Designing and implementing Seq-to-Seq and LSTM models for translation.
  • Working with bilingual datasets and tokenizing for both English and Chinese.
  • Understanding and applying BLEU and ChrF scores to evaluate translation quality.
  • Managing challenges of long sequences and context switching in language models.
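
The attention mechanism mentioned under Model Training is one common way to cope with long sequences: instead of squeezing the whole source sentence into a single vector, the decoder looks back at every encoder state at each step. The sketch below shows plain dot-product attention for one decoder step. Which attention variant the project uses is not stated, so the dot-product scoring and the function names here are assumptions for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(decoder_state, encoder_states):
    """Return (context vector, attention weights) for one decoder step."""
    # Score each encoder state by its dot product with the decoder state.
    scores = [
        sum(d * e for d, e in zip(decoder_state, enc))
        for enc in encoder_states
    ]
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of encoder states.
    dim = len(decoder_state)
    context = [
        sum(w * enc[i] for w, enc in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return context, weights


# Toy vectors standing in for three encoder hidden states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.0, 2.0]
context, weights = dot_product_attention(decoder_state, encoder_states)
print(weights)
print(context)
```

The weights sum to 1, so the context vector is a weighted average of the encoder states, with the most relevant source positions contributing the most.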

Results

  • BLEU Score: [Add the result here]
  • ChrF Score: [Add the result here]

The scores show promising results, with potential for further optimization.


Future Work

  1. Experimenting with Transformers:
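    Implementing Transformer-based encoder-decoder models to enhance translation quality. (BERT and GPT are encoder-only and decoder-only, respectively, so they are not drop-in translation models.)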

  2. Expanding Dataset:
    Adding more sentence pairs to improve fluency and context handling.

  3. Multi-language Translation:
    Extending support to other languages like Spanish or French.


Usage

  1. Clone this repository:
    git clone <repository-link>
    
  2. Install required libraries:
    pip install -r requirements.txt
    
  3. Run the training notebooks (requires Jupyter):
    jupyter nbconvert --to notebook --execute Seq_to_seq_code.ipynb
    jupyter nbconvert --to notebook --execute LSTM_code.ipynb
    
  4. Input your test sentences and generate translations.

Thanks for exploring my project! 🌟 Feel free to fork the repo, try it out, and share your feedback. 😊
