# t5-small-finetuned-opus-books
This model is a fine-tuned version of google-t5/t5-small on the Opus Books dataset. The model is trained to perform various text-to-text tasks such as translation, summarization, and text generation, with a focus on book-related content.
## Model Description

The T5 (Text-To-Text Transfer Transformer) architecture casts every NLP task into a text-to-text format, allowing a single model to be fine-tuned for different NLP tasks. This particular model, `t5-small-finetuned-opus-books`, is fine-tuned on a subset of the Opus Books dataset, which includes multilingual parallel texts derived from books.
**Key Features:**
- Base Model: google-t5/t5-small
- Architecture: T5 (Text-to-Text Transfer Transformer) - a versatile model that can be adapted to a variety of NLP tasks.
- Fine-tuning: Adapted for tasks involving book-related content such as translation, summarization, and paraphrasing.
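To make the text-to-text framing above concrete: T5 selects the task via a short prefix prepended to the input string. The prefixes below follow the standard T5 convention; the helper function itself is illustrative, not part of the model's API.

```python
# T5 frames every task as string-to-string: the task is selected by a
# short prefix prepended to the input (a T5 convention; this helper
# is illustrative, not part of the model's API).
def make_t5_input(task_prefix: str, text: str) -> str:
    """Prepend a T5-style task prefix to the raw input text."""
    return f"{task_prefix}: {text}"

print(make_t5_input("translate English to French", "The book is on the table."))
# translate English to French: The book is on the table.
print(make_t5_input("summarize", "Chapter one introduces the narrator."))
# summarize: Chapter one introduces the narrator.
```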
## Intended Uses & Limitations

**Intended Uses:**
- Book Translation: Translating text from one language to another, particularly for book content.
- Summarization: Generating concise summaries of longer texts, particularly from books.
- Text Generation: Creating new text based on given prompts, useful for creative writing or content generation related to books.
**Limitations:**
- Generalization: The model is fine-tuned on a specific dataset, which may limit its performance on texts that are vastly different from book-related content.
- BLEU Score: At 3.1445, the model's translation performance is modest compared to dedicated translation models.
- Generation Length: The generated text tends to be relatively short (average length: 17.716 tokens), which may not be ideal for tasks requiring longer outputs.
## Training and Evaluation Data

**Dataset:**
- Training Data: The Opus Books dataset, which includes multilingual text pairs from various books. This dataset is particularly rich in literary and academic content, making it suitable for training models focused on book-related NLP tasks.
- Evaluation Data: A subset of the Opus Books dataset was used to evaluate the model's performance.
**Data Characteristics:**
- Domain: Book content, including fiction and non-fiction across various genres.
- Languages: Multilingual, with a focus on pairs involving English.
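For reference, each Opus Books record pairs aligned sentences keyed by language code. A sketch of the record layout and a typical train/eval split follows; the field names match the `datasets` hub schema for `opus_books`, while the sentence text is illustrative.

```python
# Shape of one opus_books record (schema per the Hugging Face hub;
# the sentence text here is illustrative):
example = {
    "id": "42",
    "translation": {
        "en": "He opened the book and began to read.",
        "fr": "Il ouvrit le livre et se mit à lire.",
    },
}

# Loading and splitting would look like this (commented out to avoid
# a network download in this sketch):
# from datasets import load_dataset
# books = load_dataset("opus_books", "en-fr")
# books = books["train"].train_test_split(test_size=0.2, seed=42)

print(example["translation"]["en"])
```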
## Training Procedure

**Training Hyperparameters:**
- Learning Rate: 2e-05
- Train Batch Size: 16
- Eval Batch Size: 16
- Seed: 42
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- Learning Rate Scheduler Type: Linear
- Number of Epochs: 2
- Mixed Precision Training: Native AMP (Automatic Mixed Precision) to optimize training time and memory usage.
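Mapped onto the `transformers` library, the list above corresponds roughly to the following `Seq2SeqTrainingArguments` fields, shown here as a plain dict so the mapping is explicit (field names assume transformers 4.x; the Adam betas and epsilon listed above are the library defaults, so they need no explicit setting).

```python
# Hyperparameters from the card, keyed by the corresponding
# Seq2SeqTrainingArguments field names (transformers 4.x naming).
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "num_train_epochs": 2,
    "lr_scheduler_type": "linear",
    "fp16": True,  # native AMP mixed precision
}

# In a real run these would be passed straight through:
# from transformers import Seq2SeqTrainingArguments
# args = Seq2SeqTrainingArguments(
#     output_dir="t5-small-finetuned-opus-books", **training_kwargs
# )
print(training_kwargs["learning_rate"])
```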
**Training Results:**

| Training Loss | Epoch | Step | Validation Loss | BLEU | Gen Len |
|---|---|---|---|---|---|
| 2.2831 | 1.0 | 500 | 1.9888 | 3.0971 | 17.7335 |
| 2.2008 | 2.0 | 1000 | 1.9653 | 3.1445 | 17.716 |
- Final Validation Loss: 1.9653
- Final BLEU Score: 3.1445
- Generation Length: Average of 17.716 tokens per output.
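For context on what a low BLEU score like 3.14 reflects: BLEU is built from clipped n-gram precisions (up to 4-grams, combined geometrically with a brevity penalty). A minimal sketch of the clipped-precision component:

```python
from collections import Counter

def clipped_ngram_precision(candidate: str, reference: str, n: int = 1) -> float:
    """One BLEU component: n-gram counts in the candidate, clipped by
    the reference counts, divided by total candidate n-grams."""
    cand = candidate.split()
    ref = reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    clipped = sum(min(count, ref_ngrams[g]) for g, count in cand_ngrams.items())
    total = sum(cand_ngrams.values())
    return clipped / total if total else 0.0

print(clipped_ngram_precision("le chat est assis", "le chat est assis sur le tapis"))
# 1.0 -- every candidate unigram appears in the reference
```

Clipping matters: a candidate that repeats "the" three times against a reference containing it once scores only 1/3, not 1.0.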
## Framework Versions
- Transformers: 4.42.4
- PyTorch: 2.3.1+cu121
- Datasets: 2.21.0
- Tokenizers: 0.19.1
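To reproduce this environment, the pinned versions can be installed with pip (a sketch; the `+cu121` PyTorch build comes from the separate CUDA wheel index):

```shell
pip install "transformers==4.42.4" "datasets==2.21.0" "tokenizers==0.19.1"
pip install "torch==2.3.1" --index-url https://download.pytorch.org/whl/cu121
```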
## Usage

You can use this model with a Hugging Face `pipeline` for text-to-text tasks:

```python
from transformers import pipeline

translator = pipeline(
    "translation_en_to_fr",
    model="ashaduzzaman/t5-small-finetuned-opus-books",
)

# Example usage: translation. T5 expects the task prefix in the input text.
text = "translate English to French: Legumes share resources with nitrogen-fixing bacteria."
print(translator(text))
```
## Acknowledgments
This model was developed using the Hugging Face Transformers library and fine-tuned on the Opus Books dataset. Special thanks to the Opus project for providing a rich source of multilingual book content.