Vakya-Mini-Extended-100M (vakya-v1.2.1)
Vakya is a lightweight English-to-Hindi (en-hi) translation model, fine-tuned with supervised fine-tuning (SFT) on the falcon-h1-tiny-multilingual-100m-instruct base model. It offers quick and fairly accurate Hindi translations of English sentences. Its small size (108M parameters) allows it to run comfortably on laptop-grade GPUs.
Vakya-Mini-Extended belongs to the first generation of Lightweight Indic Translator (LIT) models, which can provide accurate and fast translation in local, memory-constrained environments. The Extended model was trained further on a larger translation corpus, which improves overall translation quality.
The Vakya series will be made available in the following sizes: Mini (100M), Standard (270M), and Large (500M).
Estimated parameters: ~100M
Architecture: Falcon-H1
Intended use: English-Hindi Translations
Training data
- Source: en-hi-instruct-structured dataset (https://huggingface.co/datasets/DireDreadlord/en-hi-instruct-structured)
- Rows: ~1,660,000, templated with a custom .jinja chat format
- Training: trained for 2,000 steps on an RTX 3050 (4GB VRAM)
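As a rough illustration of what templating a translation pair into a chat format can look like, the sketch below wraps one English-Hindi pair in user/assistant messages. The helper name `to_chat_example` is hypothetical, and reusing the prompt wording from the Usage section is an assumption; the actual .jinja template and dataset column names are not documented here.

```python
# Hypothetical sketch: turn one (English, Hindi) pair into chat-style messages.
# The prompt wording is borrowed from the Usage section; whether training used
# exactly this wording is an assumption.
PROMPT_PREFIX = "Translate the following English sentence into Hindi:\n\n"


def to_chat_example(english: str, hindi: str) -> list:
    """Return a two-turn conversation: the user request and the reference translation."""
    return [
        {"role": "user", "content": PROMPT_PREFIX + english},
        {"role": "assistant", "content": hindi},
    ]


example = to_chat_example("I work at the market.", "मैं बाज़ार में काम करता हूँ।")
for turn in example:
    print(turn["role"], "->", turn["content"])
```

A list of messages in this shape can be passed to `tokenizer.apply_chat_template`, which renders it with whatever chat template the tokenizer ships.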
Usage
Install requirements:
pip install -r requirements.txt
pip install transformers datasets accelerate safetensors
Usage (Hugging Face Hub)
You can load the model directly from the Hugging Face Hub:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
device = "cuda" if torch.cuda.is_available() else "cpu"
model_id = "DireDreadlord/Vakya-Mini-Extended-100M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, dtype="auto")
model.eval()
model.to(device)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
    model.resize_token_embeddings(len(tokenizer))
sentence = "I work at the market."
messages = [
    {
        "role": "user",
        "content": "Translate the following English sentence into Hindi:\n\n" + sentence,
    }
]
# return_dict=True ensures a dict (input_ids, attention_mask) is returned, so the .items() call below works
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt", return_dict=True)
inputs = {k: v.to(device) for k, v in inputs.items()}
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
prompt_text = tokenizer.decode(inputs["input_ids"][0], skip_special_tokens=True)
full_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
if full_text.startswith(prompt_text):
    output_text = full_text[len(prompt_text):].strip()
else:
    output_text = full_text
print(output_text)
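The snippet above removes the prompt by comparing decoded strings. An alternative, sketched here, is to drop the prompt tokens by position before decoding, which avoids relying on the decoded prompt being an exact prefix of the decoded output. The names `build_messages` and `translate` are illustrative helpers, not part of the model's API; `model`, `tokenizer`, and `device` are assumed to be set up as in the example above.

```python
PROMPT_PREFIX = "Translate the following English sentence into Hindi:\n\n"


def build_messages(sentence):
    """Wrap an English sentence in the single-turn chat format used above."""
    return [{"role": "user", "content": PROMPT_PREFIX + sentence}]


def translate(sentence, model, tokenizer, device="cpu", max_new_tokens=128):
    """Greedy-decode a translation and return only the newly generated text."""
    inputs = tokenizer.apply_chat_template(
        build_messages(sentence),
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,  # dict-like result with input_ids and attention_mask
    ).to(device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # For a causal LM, generate() returns prompt + continuation; skip the prompt tokens.
    prompt_len = inputs["input_ids"].shape[1]
    new_tokens = outputs[0][prompt_len:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

With the model loaded as shown earlier, a call would look like `print(translate("I work at the market.", model, tokenizer, device))`.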
Limitations
- The model is very small (108M parameters), so it may hallucinate or produce inaccurate translations.
- The model is intended for experimental use only and should be used in accordance with its license.
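Because a model of this size can hallucinate, a cheap sanity check on the output may be useful before trusting a translation. The sketch below measures how much of the output's alphabetic content falls in the Unicode Devanagari block (U+0900 to U+097F). The function names and the 0.8 threshold are illustrative assumptions, not something shipped with the model, and the check only catches wrong-script or empty output, not mistranslations.

```python
def devanagari_ratio(text: str) -> float:
    """Fraction of alphabetic characters that lie in the Devanagari block."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    devanagari = sum(1 for ch in letters if "\u0900" <= ch <= "\u097F")
    return devanagari / len(letters)


def looks_like_hindi(text: str, threshold: float = 0.8) -> bool:
    """Heuristic: True if most letters are Devanagari (threshold is an assumption)."""
    return devanagari_ratio(text) >= threshold


print(looks_like_hindi("मैं बाज़ार में काम करता हूँ।"))
print(looks_like_hindi("I work at the market."))
```

Outputs that fail the check can be retried or flagged for manual review.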