# t5-small-finetuned-opus-books
This model is a fine-tuned version of google-t5/t5-small on the Opus Books dataset. The model is trained to perform various text-to-text tasks such as translation, summarization, and text generation, with a focus on book-related content.
## Model Description

The T5 (Text-To-Text Transfer Transformer) architecture casts every NLP task into a text-to-text format, allowing a single model to be fine-tuned for different NLP tasks. This particular model, `t5-small-finetuned-opus-books`, is fine-tuned on a subset of the Opus Books dataset, which includes multilingual parallel texts derived from books.
**Key Features:**
- Base Model: google-t5/t5-small
- Architecture: T5 (Text-to-Text Transfer Transformer) - a versatile model that can be adapted to a variety of NLP tasks.
- Fine-tuning: Adapted for tasks involving book-related content such as translation, summarization, and paraphrasing.
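To make the text-to-text framing above concrete: T5 selects the task via a short prefix prepended to the input string. The prefixes below follow the standard T5 convention; the helper function itself is illustrative, not part of the model's API.

```python
# T5 frames every task as string-to-string: the task is selected by a
# short prefix prepended to the input (a T5 convention; this helper
# is illustrative, not part of the model's API).
def make_t5_input(task_prefix: str, text: str) -> str:
    """Prepend a T5-style task prefix to the raw input text."""
    return f"{task_prefix}: {text}"

print(make_t5_input("translate English to French", "The book is on the table."))
# translate English to French: The book is on the table.
print(make_t5_input("summarize", "Chapter one introduces the narrator."))
# summarize: Chapter one introduces the narrator.
```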
## Intended Uses & Limitations

**Intended Uses:**
- Book Translation: Translating text from one language to another, particularly for book content.
- Summarization: Generating concise summaries of longer texts, particularly from books.
- Text Generation: Creating new text based on given prompts, useful for creative writing or content generation related to books.
**Limitations:**
- Generalization: The model is fine-tuned on a specific dataset, which may limit its performance on texts that are vastly different from book-related content.
- BLEU Score: At 3.1445, the model's translation performance is modest compared to dedicated translation models.
- Generation Length: The generated text tends to be relatively short (average length: 17.716 tokens), which may not be ideal for tasks requiring longer outputs.
## Training and Evaluation Data

**Dataset:**
- Training Data: The Opus Books dataset, which includes multilingual text pairs from various books. This dataset is particularly rich in literary and academic content, making it suitable for training models focused on book-related NLP tasks.
- Evaluation Data: A subset of the Opus Books dataset was used to evaluate the model's performance.
**Data Characteristics:**
- Domain: Book content, including fiction and non-fiction across various genres.
- Languages: Multilingual, with a focus on pairs involving English.
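For reference, each Opus Books record pairs aligned sentences keyed by language code. A sketch of the record layout and a typical train/eval split follows; the field names match the `datasets` hub schema for `opus_books`, while the sentence text is illustrative.

```python
# Shape of one opus_books record (schema per the Hugging Face hub;
# the sentence text here is illustrative):
example = {
    "id": "42",
    "translation": {
        "en": "He opened the book and began to read.",
        "fr": "Il ouvrit le livre et se mit à lire.",
    },
}

# Loading and splitting would look like this (commented out to avoid
# a network download in this sketch):
# from datasets import load_dataset
# books = load_dataset("opus_books", "en-fr")
# books = books["train"].train_test_split(test_size=0.2, seed=42)

print(example["translation"]["en"])
```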
## Training Procedure

**Training Hyperparameters:**
- Learning Rate: 2e-05
- Train Batch Size: 16
- Eval Batch Size: 16
- Seed: 42
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- Learning Rate Scheduler Type: Linear
- Number of Epochs: 2
- Mixed Precision Training: Native AMP (Automatic Mixed Precision) to optimize training time and memory usage.
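Mapped onto the `transformers` library, the list above corresponds roughly to the following `Seq2SeqTrainingArguments` fields, shown here as a plain dict so the mapping is explicit (field names assume transformers 4.x; the Adam betas and epsilon listed above are the library defaults, so they need no explicit setting).

```python
# Hyperparameters from the card, keyed by the corresponding
# Seq2SeqTrainingArguments field names (transformers 4.x naming).
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "num_train_epochs": 2,
    "lr_scheduler_type": "linear",
    "fp16": True,  # native AMP mixed precision
}

# In a real run these would be passed straight through:
# from transformers import Seq2SeqTrainingArguments
# args = Seq2SeqTrainingArguments(
#     output_dir="t5-small-finetuned-opus-books", **training_kwargs
# )
print(training_kwargs["learning_rate"])
```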
**Training Results:**

| Training Loss | Epoch | Step | Validation Loss | BLEU | Gen Len |
|---|---|---|---|---|---|
| 2.2831 | 1.0 | 500 | 1.9888 | 3.0971 | 17.7335 |
| 2.2008 | 2.0 | 1000 | 1.9653 | 3.1445 | 17.716 |
- Final Validation Loss: 1.9653
- Final BLEU Score: 3.1445
- Generation Length: Average of 17.716 tokens per output.
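For context on what a low BLEU score like 3.14 reflects: BLEU is built from clipped n-gram precisions (up to 4-grams, combined geometrically with a brevity penalty). A minimal sketch of the clipped-precision component:

```python
from collections import Counter

def clipped_ngram_precision(candidate: str, reference: str, n: int = 1) -> float:
    """One BLEU component: n-gram counts in the candidate, clipped by
    the reference counts, divided by total candidate n-grams."""
    cand = candidate.split()
    ref = reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    clipped = sum(min(count, ref_ngrams[g]) for g, count in cand_ngrams.items())
    total = sum(cand_ngrams.values())
    return clipped / total if total else 0.0

print(clipped_ngram_precision("le chat est assis", "le chat est assis sur le tapis"))
# 1.0 -- every candidate unigram appears in the reference
```

Clipping matters: a candidate that repeats "the" three times against a reference containing it once scores only 1/3, not 1.0.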
## Framework Versions
- Transformers: 4.42.4
- PyTorch: 2.3.1+cu121
- Datasets: 2.21.0
- Tokenizers: 0.19.1
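To reproduce this environment, the pinned versions can be installed with pip (a sketch; the `+cu121` PyTorch build comes from the separate CUDA wheel index):

```shell
pip install "transformers==4.42.4" "datasets==2.21.0" "tokenizers==0.19.1"
pip install "torch==2.3.1" --index-url https://download.pytorch.org/whl/cu121
```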
## Usage

You can use this model with a Hugging Face `pipeline` for text-to-text tasks:

```python
from transformers import pipeline

translator = pipeline(
    "translation_en_to_fr",
    model="ashaduzzaman/t5-small-finetuned-opus-books",
)

# Example usage: translation. T5 expects the task prefix in the input text.
text = "translate English to French: Legumes share resources with nitrogen-fixing bacteria."
print(translator(text))
```
## Acknowledgments
This model was developed using the Hugging Face Transformers library and fine-tuned on the Opus Books dataset. Special thanks to the Opus project for providing a rich source of multilingual book content.