---
license: mit
datasets:
- HariprasathSB/tamil_summarization
language:
- en
- ta
tags:
- summarization
- translation
---

# Tamil Summarization and English-to-Tamil Translation Model

## Overview

This repository contains a model fine-tuned with the Hugging Face Transformers library for two tasks: Tamil summarization and English-to-Tamil translation. This README describes the model and shows how to use it.

## Model Details

- **Model Name**: suriya7/Tamil-Summarization
- **Model Type**: Summarization, Translation
- **Framework**: Hugging Face Transformers
- **Original Model**: [Mr-Vicky-01/Fine_tune_english_to_tamil](https://huggingface.co/Mr-Vicky-01/Fine_tune_english_to_tamil)
- **Fine-tuning Dataset**: [HariprasathSB/tamil_summarization](https://huggingface.co/datasets/HariprasathSB/tamil_summarization)
- **Languages Supported**: English, Tamil

## Model Performance

![W&B Chart 23_3_2024, 11_46_59 pm.png](https://cdn-uploads.huggingface.co/production/uploads/65ae9249e50627e40c159b16/82PwF19H9V9o1CVoYuuJo.png)

## Usage

### Installation

Install the dependencies with pip. The inference example returns PyTorch tensors (`return_tensors="pt"`), so PyTorch is needed alongside Transformers:

```bash
pip install transformers torch
```

### Inference

The example below uses the same model for both tasks, translation and summarization:

```python
# Load the model and tokenizer directly from the Hugging Face Hub
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("suriya7/Tamil-Summarization")
model = AutoModelForSeq2SeqLM.from_pretrained("suriya7/Tamil-Summarization")

# Example: English-to-Tamil translation
input_text = "This is an example English sentence."
# Call the tokenizer directly: it returns an encoding with an `input_ids` field
# (tokenizer.encode returns the tensor itself, which has no `.input_ids`)
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
outputs = model.generate(input_ids, max_length=128)
translated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Translated Tamil Sentence:", translated_text)

# Example: Tamil summarization
tamil_article = "தமிழ் உரையினை சுருக்கமாக சுருக்கமாக உரையிடுவது எப்படி?"
# truncation=True cuts inputs that exceed the tokenizer's maximum length
tamil_input_ids = tokenizer(tamil_article, return_tensors="pt", truncation=True).input_ids
summary_ids = model.generate(tamil_input_ids, max_length=128)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print("Summarized Tamil Text:", summary)
```

## Model Output

- For translation tasks, the model outputs the Tamil translation of the English input.
- For summarization tasks, the model outputs a summarized version of the input Tamil text.
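
### Handling long articles

The summarization example passes `truncation=True`, so any text beyond the tokenizer's limit is dropped before the model sees it. One possible workaround, sketched below, is to split a long article into sentence-aligned chunks and summarize each chunk separately. This is an illustration and not part of the model's own pipeline: the helper names `split_into_chunks` and `summarize_long_text`, the `max_chars` budget (a rough character-based stand-in for the real token limit), and the punctuation-based sentence splitting are all assumptions.

```python
import re


def split_into_chunks(text, max_chars=1000):
    """Split text into chunks of at most max_chars characters, breaking at sentence ends.

    A single sentence longer than max_chars is kept whole as its own chunk.
    max_chars is a hypothetical proxy for the tokenizer's token limit.
    """
    # Split after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the budget: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def summarize_long_text(text, summarize_fn, max_chars=1000):
    """Summarize each chunk with summarize_fn and join the partial summaries."""
    return " ".join(summarize_fn(chunk) for chunk in split_into_chunks(text, max_chars))
```

Here `summarize_fn` is any function that maps a string to its summary. With the model loaded as in the inference example, it could wrap the tokenizer call, `model.generate`, and `tokenizer.decode`. Joining per-chunk summaries is a simple heuristic, and the combined result may read less smoothly than a summary produced from a single input.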