---
license: mit
datasets:
- HariprasathSB/tamil_summarization
language:
- en
- ta
tags:
- summarization
- translation
---
# Tamil Summarization and English-to-Tamil Translation Model
## Overview
This repository contains a model fine-tuned with the Hugging Face Transformers library for two tasks: Tamil summarization and English-to-Tamil translation. This README describes the model and shows how to use it.
## Model Details
- **Model Name**: suriya7/Tamil-Summarization
- **Model Type**: Summarization, Translation
- **Framework**: Hugging Face Transformers
- **Original Model**: [Mr-Vicky-01/Fine_tune_english_to_tamil](https://huggingface.co/Mr-Vicky-01/Fine_tune_english_to_tamil)
- **Fine-tuning Dataset**: [HariprasathSB/tamil_summarization](https://huggingface.co/datasets/HariprasathSB/tamil_summarization)
- **Languages Supported**: English, Tamil
## Model Performance
![W&B Chart 23_3_2024, 11_46_59 pm.png](https://cdn-uploads.huggingface.co/production/uploads/65ae9249e50627e40c159b16/82PwF19H9V9o1CVoYuuJo.png)
## Usage
### Installation
You can install the necessary dependencies using pip:
```bash
pip install transformers torch
```
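The inference example requests PyTorch tensors (`return_tensors="pt"`), so PyTorch has to be importable alongside Transformers. A minimal sketch for checking this before loading the model is shown below; the helper name `missing_packages` is hypothetical and not part of any library.

```python
import importlib.util


def missing_packages(names=("transformers", "torch")):
    """Return the names from `names` that cannot be found in this environment."""
    # find_spec returns None when a top-level package is not installed
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```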
### Inference
Below is an example of how to use the model for both summarization and translation tasks:
```python
# Load the tokenizer and model directly from the Hugging Face Hub
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("suriya7/Tamil-Summarization")
model = AutoModelForSeq2SeqLM.from_pretrained("suriya7/Tamil-Summarization")

# Example: English-to-Tamil translation
input_text = "This is an example English sentence."
# Calling the tokenizer returns an encoding object that exposes .input_ids
# (tokenizer.encode returns a bare tensor, which has no .input_ids attribute)
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
outputs = model.generate(input_ids, max_length=128)
translated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Translated Tamil Sentence:", translated_text)

# Example: Tamil summarization
tamil_article = "தமிழ் உரையினை சுருக்கமாக சுருக்கமாக உரையிடுவது எப்படி?"
# truncation=True cuts inputs that exceed the tokenizer's maximum length
tamil_input_ids = tokenizer(tamil_article, return_tensors="pt", truncation=True).input_ids
summary_ids = model.generate(tamil_input_ids, max_length=128)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print("Summarized Tamil Text:", summary)
```
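Both tasks follow the same tokenize, generate, decode steps, so they can share one helper. The sketch below is one possible wrapper, not the author's method: the function name `run_seq2seq`, the 512-token input limit, and `num_beams=4` are assumptions chosen for illustration, and only `max_length=128` comes from the example above.

```python
def run_seq2seq(text, tokenizer, model,
                max_input_length=512,   # assumed input limit, adjust to the model
                max_output_length=128,  # same output limit as the example above
                num_beams=4):           # assumed beam width, not from the model card
    """Tokenize `text`, generate with `model`, and return the decoded string."""
    # Truncate long inputs so they fit within the assumed input limit
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_length,
    )
    # Pass input_ids and attention_mask together to generate()
    output_ids = model.generate(
        **inputs,
        max_length=max_output_length,
        num_beams=num_beams,
    )
    # Decode the first (and only) generated sequence
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With `tokenizer` and `model` loaded as in the example above, `run_seq2seq("This is an example English sentence.", tokenizer, model)` would return the translation, and passing a Tamil article would return its summary.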
## Model Output
- **For translation tasks, the model outputs translated text in Tamil.**
- **For summarization tasks, the model outputs a summarized version of the input Tamil text.**