Sarvam-1 — Hinglish Grapheme-to-Phoneme (G2P) LoRA

A LoRA adapter for sarvamai/sarvam-1 (2B) that performs grapheme-to-phoneme (G2P) conversion on normalized, code-mixed Hindi/English text and emits IPA. It is distilled from espeak-ng, so a single LLM can handle both text normalization (see the companion TN adapter) and phonemization, giving a unified TTS front-end.

Part of fast-indic-tts.

Live demo: https://huggingface.co/spaces/AK04-IXR/fast-indic-tts

Results

On a held-out split (n=60), the adapter reproduces the espeak-ng reference with a 0.00% phoneme error rate (PER) and 100% exact phoneme match. In other words, it generalizes the phonemizer's deterministic mapping to unseen sentences.
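For readers who want to reproduce this kind of comparison, the following is a minimal sketch of how PER and exact-match rate could be computed against espeak-ng reference strings. It is not the evaluation script used for the numbers above. The helper names are hypothetical, and it assumes that each non-whitespace character counts as one phoneme symbol, which is a simplification (stress marks and length marks are counted separately).

```python
from typing import List, Sequence


def to_symbols(ipa: str) -> List[str]:
    # Assumption: one non-whitespace character == one phoneme symbol.
    return [c for c in ipa if not c.isspace()]


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Corpus-level PER: total edits divided by total reference symbols."""
    total_edits = 0
    total_len = 0
    for ref, hyp in zip(refs, hyps):
        r, h = to_symbols(ref), to_symbols(hyp)
        total_edits += edit_distance(r, h)
        total_len += len(r)
    return total_edits / total_len if total_len else 0.0


def exact_match_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Fraction of sentences whose whitespace-normalized IPA is identical."""
    if not refs:
        return 0.0
    hits = sum(" ".join(r.split()) == " ".join(h.split())
               for r, h in zip(refs, hyps))
    return hits / len(refs)
```

A PER of 0.00% with 100% exact match corresponds to `phoneme_error_rate` returning `0.0` and `exact_match_rate` returning `1.0` over the held-out pairs.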

Usage

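from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and tokenizer, then attach the G2P LoRA adapter.
tok = AutoTokenizer.from_pretrained("sarvamai/sarvam-1")
m = AutoModelForCausalLM.from_pretrained("sarvamai/sarvam-1")
m = PeftModel.from_pretrained(m, "AK04-IXR/sarvam1-hinglish-g2p-lora")
m.eval()

# The input must already be normalized (numbers and acronyms spelled out).
prompt = "Input: Mera flight ticket pee-en-aar eight three nine two hai.\nOutput:"
ids = tok(prompt, return_tensors="pt").to(m.device)
# Greedy decoding keeps the output deterministic.
out = m.generate(**ids, max_new_tokens=160, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
prompt_len = ids["input_ids"].shape[1]
print(tok.decode(out[0][prompt_len:], skip_special_tokens=True))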
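When calling the adapter on many sentences, it can help to wrap the prompt template and the output cleanup in small helpers. The sketch below reuses the `Input: ... / Output:` template from the usage snippet. The function names are hypothetical, and it assumes the IPA for one sentence fits on a single line, so anything the model generates after the first non-empty line is discarded.

```python
def build_prompt(text: str) -> str:
    """Wrap normalized text in the Input/Output template used above."""
    return f"Input: {text.strip()}\nOutput:"


def extract_ipa(generated: str) -> str:
    """Return the first non-empty line of the decoded continuation.

    Assumption: the IPA for one sentence is a single line, so later
    lines (for example a hallucinated next "Input:") are dropped.
    """
    for line in generated.splitlines():
        line = line.strip()
        if line:
            return line
    return ""
```

With these helpers, the last line of the usage snippet could pass its decoded string through `extract_ipa` before printing.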

Training

The adapter is a LoRA (r=16, α=32; 0.94% of parameters) trained on ~7k (text → IPA) pairs phonemized by espeak-ng (en-us), for 3 epochs in bf16 on a single A100.
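To give a feel for what r and α mean in terms of size, here is a small arithmetic sketch. For a weight matrix of shape d_in × d_out, standard LoRA adds two low-rank factors with r × (d_in + d_out) parameters in total, and scales the update by α / r. The layer shapes in the example are hypothetical placeholders, not the actual sarvam-1 projection sizes, and the model card does not say which modules were adapted.

```python
from typing import Iterable, Tuple


def lora_added_params(layer_shapes: Iterable[Tuple[int, int]], r: int) -> int:
    """Parameters added by LoRA: A is (r x d_in), B is (d_out x r) per layer."""
    return sum(r * (d_in + d_out) for d_in, d_out in layer_shapes)


def lora_scaling(alpha: float, r: int) -> float:
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return alpha / r


def trainable_fraction(added: int, base_params: int) -> float:
    """Share of trainable (adapter) parameters relative to all parameters."""
    return added / (base_params + added)


# Hypothetical example: four square 2048 x 2048 projections adapted at r=16.
example_shapes = [(2048, 2048)] * 4
example_added = lora_added_params(example_shapes, r=16)  # 262144
example_scale = lora_scaling(alpha=32, r=16)             # 2.0
```

With r=16 and α=32 as stated above, the update is scaled by 2.0 regardless of which layers are adapted.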

Limitations

The adapter is distilled from espeak-ng, so it matches but does not surpass that reference. It was trained on Latin-script normalized text (Devanagari-carrier lines held out), and code-switched phonemization with per-span language ID remains an open problem.
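Because the training text is Latin-script, a caller may want to flag inputs that contain Devanagari before sending them to the adapter. The following is a minimal sketch of such a guard, not part of the released code. The function names are hypothetical, and the check relies only on the Devanagari Unicode block (U+0900 to U+097F).

```python
def contains_devanagari(text: str) -> bool:
    """True if any character falls in the Devanagari block U+0900..U+097F."""
    return any("\u0900" <= ch <= "\u097f" for ch in text)


def is_supported_input(text: str) -> bool:
    """Assumption: only Latin-script (romanized) input is in-distribution."""
    return bool(text.strip()) and not contains_devanagari(text)
```

For example, `is_supported_input("Mera flight ticket")` accepts the romanized sentence, while a sentence containing "मेरा" would be flagged for transliteration or a different front-end.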
