README

Overview

This project implements a language translation model using GPT-2, capable of translating between Icelandic and English. The pipeline includes data preprocessing, model training, evaluation, and an interactive user interface for translations.

Features

Text Preprocessing: Tokenization and padding for uniform input size.

Model Training: Training a GPT-2 model on paired Icelandic-English sentences.

Evaluation: Perplexity-based validation of model performance.

Interactive Interface: An easy-to-use widget for real-time translations.
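
As a quick illustration of the preprocessing step, the sketch below tokenizes a pair of sentences and pads them to a uniform length (the 128-token limit matches the Max Length parameter listed under Key Parameters; the exact code in train_model.py may differ):

from transformers import GPT2Tokenizer

# GPT-2 ships without a padding token, so reuse the end-of-text token.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

batch = tokenizer(
    ["Hello, how are you?", "Halló, hvernig hefur þú það?"],
    padding="max_length",  # pad every sequence to the same length
    truncation=True,       # drop tokens beyond max_length
    max_length=128,
    return_tensors="pt",
)
print(batch["input_ids"].shape)  # torch.Size([2, 128])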

Installation

Prerequisites

Ensure you have the following installed:

Python (>= 3.8)

PyTorch

Transformers library by Hugging Face

ipywidgets (for the translation interface)

Steps

Clone the repository:

git clone <repository-url>
cd <repository-directory>

Install the required libraries:

pip install -r requirements.txt
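
For reference, a minimal requirements.txt consistent with the prerequisites above might look like the following (versions are deliberately unpinned here; pin whatever matches your environment):

torch
transformers
ipywidgets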

Ensure GPU availability for faster training (optional but recommended).
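
You can check from Python whether PyTorch can see a GPU:

import torch

print(torch.cuda.is_available())  # True if a CUDA-capable GPU is usable
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of the detected GPU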

Usage

Training the Model

Prepare your dataset with English-Icelandic sentence pairs.

Run the script to preprocess the data and train the model:

python train_model.py

The trained model and tokenizer will be saved in the ./trained_gpt2 directory.
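
train_model.py is the authoritative script. As a rough sketch of what this stage involves, assuming a hypothetical list of (English, Icelandic) pairs and an illustrative "english = icelandic" prompt format (the script's actual data format, separator, and learning rate may differ):

import torch
from torch.utils.data import DataLoader
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Hypothetical dataset: (English, Icelandic) sentence pairs.
pairs = [("Hello, how are you?", "Halló, hvernig hefur þú það?")]

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = GPT2LMHeadModel.from_pretrained("gpt2")
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Frame translation as language modeling over joined sentence pairs.
texts = [f"{en} = {isl}" for en, isl in pairs]
enc = tokenizer(texts, padding="max_length", truncation=True,
                max_length=128, return_tensors="pt")

loader = DataLoader(list(zip(enc["input_ids"], enc["attention_mask"])),
                    batch_size=4, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)  # placeholder value

model.train()
for epoch in range(10):
    for input_ids, attention_mask in loader:
        input_ids = input_ids.to(device)
        attention_mask = attention_mask.to(device)
        # With labels == input_ids the model computes the LM loss itself;
        # a production script would also mask padding positions in the labels.
        loss = model(input_ids, attention_mask=attention_mask,
                     labels=input_ids).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

model.save_pretrained("./trained_gpt2")
tokenizer.save_pretrained("./trained_gpt2")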

Evaluating the Model

Evaluate the trained model using validation data:

python evaluate_model.py

The script computes perplexity to measure model performance.
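
Perplexity is the exponential of the average cross-entropy loss, so the core of the evaluation (evaluate_model.py is authoritative; the validation sentence and prompt format below are illustrative) reduces to:

import math
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("./trained_gpt2")
model = GPT2LMHeadModel.from_pretrained("./trained_gpt2")
model.eval()

val_texts = ["Hello, how are you? = Halló, hvernig hefur þú það?"]

losses = []
with torch.no_grad():
    for text in val_texts:
        enc = tokenizer(text, return_tensors="pt",
                        truncation=True, max_length=128)
        # The returned loss is the mean cross-entropy over the tokens.
        losses.append(model(**enc, labels=enc["input_ids"]).loss.item())

print(f"Perplexity: {math.exp(sum(losses) / len(losses)):.2f}")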

Running the Interactive Interface

Launch Jupyter Notebook or JupyterLab.

Open the file interactive_translation.ipynb.

Enter a sentence in English or Icelandic, and view the translation in real time.
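
The notebook is the reference implementation; a minimal widget of this kind, loading the saved model and reusing the same hypothetical "source =" prompt format as above, could look like:

import ipywidgets as widgets
from IPython.display import display
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("./trained_gpt2")
model = GPT2LMHeadModel.from_pretrained("./trained_gpt2")
model.eval()

text_box = widgets.Text(description="Input:", placeholder="Type a sentence")
button = widgets.Button(description="Translate")
output = widgets.Output()

def on_click(_):
    with output:
        output.clear_output()
        prompt = text_box.value + " ="  # hypothetical prompt format
        enc = tokenizer(prompt, return_tensors="pt")
        ids = model.generate(**enc, max_length=128, num_beams=5,
                             early_stopping=True,
                             pad_token_id=tokenizer.eos_token_id)
        print(tokenizer.decode(ids[0], skip_special_tokens=True))

button.on_click(on_click)
display(text_box, button, output)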

File Structure

train_model.py: Contains code for data preprocessing, model training, and saving.

evaluate_model.py: Evaluates model performance using perplexity.

interactive_translation.ipynb: Interactive interface for testing translations.

requirements.txt: List of required Python packages.

trained_gpt2/: Directory to save trained model and tokenizer.

Key Parameters

Max Length: Maximum token length for inputs (default: 128).

Learning Rate: not specified here; see train_model.py for the value used.

Batch Size: 4 (both training and validation).

Epochs: 10.

Beam Search: Used for generating translations, with a beam size of 5.
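
These values map onto a Hugging Face generate call roughly as follows (a sketch, reusing the model and tokenizer loaded in the sections above):

ids = model.generate(
    **tokenizer("Hello, how are you? =", return_tensors="pt"),
    max_length=128,       # Max Length
    num_beams=5,          # beam size for beam search
    early_stopping=True,  # stop once all beams have finished
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(ids[0], skip_special_tokens=True))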

Future Improvements

Expand dataset to include additional language pairs.

Optimize the model for faster inference.

Integrate the application into a web-based interface.

Acknowledgements

Hugging Face for providing the GPT-2 model and libraries.

PyTorch for enabling seamless implementation and training.

License

This project is licensed under the MIT License. See the LICENSE file for details.
