Telugu Colloquial to English Translator
This model is fine-tuned to translate colloquial Telugu expressions into English. It was developed for the SAWiT.AI Hackathon 2025.
Model Details
- Base Model: facebook/nllb-200-distilled-600M
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Dataset: Custom curated dataset of colloquial Telugu expressions
- Purpose: Accurately translate naturally spoken Telugu into English, with emphasis on slang and colloquial expressions
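To give a rough sense of why LoRA is a lightweight way to adapt a 600M-parameter base model, the sketch below counts the extra trainable parameters LoRA adds to a single weight matrix. The rank of 8 and the 1024 x 1024 matrix size are illustrative assumptions; this card does not state the LoRA rank or which layers were adapted.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA freezes the original matrix W and learns a low-rank update B @ A,
    where A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * d_in + d_out * rank


# Hypothetical values for illustration only (not taken from this model card).
d_in, d_out, rank = 1024, 1024, 8

full = d_in * d_out                          # parameters in the frozen matrix
extra = lora_param_count(d_in, d_out, rank)  # parameters LoRA trains instead

print(f"frozen: {full}, trainable LoRA: {extra}, ratio: {extra / full:.4f}")
```

Under these assumed sizes, the adapter for one matrix is a small fraction of the matrix it modifies, which is what makes the fine-tuned weights cheap to train and share.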
Usage
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
# Load model and tokenizer
model_name = "your-username/telugu-colloquial-translator"  # Replace with your username
# src_lang tells the NLLB tokenizer to tag the input as Telugu
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="tel_Telu")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
# Translate a colloquial Telugu phrase
phrase = "Enti mawa, ela unnaavu?"
inputs = tokenizer(phrase, return_tensors="pt")
outputs = model.generate(
    **inputs,
    # Force the first generated token to be the English language code (target language)
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),
    max_length=128,
    num_beams=5,
)
translation = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(f"Telugu: {phrase}")
print(f"English: {translation}")
Dataset
The dataset consists of colloquial Telugu expressions gathered from:
- Everyday conversations
- Social media content
- Movie and TV show dialogue
- Youth slang and expressions
These expressions represent how Telugu is naturally spoken by native speakers in informal contexts, rather than formal written Telugu.
Evaluation
This model was evaluated on its ability to accurately translate:
- Slang terms and idioms
- Colloquial expressions
- Informal grammar patterns
- Code-mixed Telugu (with English words)
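This card does not say which metric was used, so the following is only a sketch of how scoring per category could be organized. It uses a simple unigram F1 as a stand-in metric. The `unigram_f1` and `evaluate_by_category` helpers, the example records, and the stub translator are all hypothetical; in practice the `translate` callable would wrap the model and the examples would come from a held-out test set.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, Iterable, Tuple


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Token-overlap F1 between two lowercased, whitespace-split strings."""
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    if not hyp or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evaluate_by_category(
    examples: Iterable[Tuple[str, str, str]],
    translate: Callable[[str], str],
) -> Dict[str, float]:
    """Average unigram F1 per category.

    Each example is (category, source_phrase, reference_translation).
    """
    scores = defaultdict(list)
    for category, source, reference in examples:
        scores[category].append(unigram_f1(translate(source), reference))
    return {cat: sum(vals) / len(vals) for cat, vals in scores.items()}


# Hypothetical test items; the references are illustrative, not from the dataset.
examples = [
    ("slang", "Enti mawa, ela unnaavu?", "Hey dude, how are you?"),
    ("code-mixed", "Nenu office ki late ayyanu", "I got late to the office"),
]

# Stub translator standing in for the model so the sketch runs offline.
stub_outputs = {
    "Enti mawa, ela unnaavu?": "Hey dude, how are you?",
    "Nenu office ki late ayyanu": "I was late to office",
}
report = evaluate_by_category(examples, stub_outputs.__getitem__)
print(report)
```

Reporting one score per category, rather than a single average, makes it easier to see whether slang, idioms, or code-mixed input is the weak spot.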
License
This model is shared under the Apache 2.0 license.