Model Name: open-vakgyata

Model Overview: open-vakgyata is an open-source language identification model that takes speech input and classifies it as one of the supported Indian languages listed below.

Supported Languages:

Language          Code
English (India)   en-IN
Hindi             hi-IN
Odia              or-IN
Bengali           bn-IN
Tamil             ta-IN
Telugu            te-IN
Kannada           kn-IN
Malayalam         ml-IN
Marathi           mr-IN
Gujarati          gu-IN

Specification:

  • Supported Sampling Rate: 16000 Hz
  • Recommended Audio Format: 16 kHz, 16-bit PCM
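
Before running inference, it can help to confirm that a file already matches the recommended format. The following is a minimal sketch using only Python's standard `wave` module; the helper name `check_wav_format` and its return fields are illustrative assumptions and not part of the model's API. It only reads uncompressed PCM WAV headers, so other containers or encodings would need a different tool.

```python
import wave


def check_wav_format(path, expected_rate=16000, expected_sample_width=2):
    """Read a PCM WAV header and report whether it matches 16 kHz, 16-bit.

    Hypothetical helper for illustration; not shipped with the model.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()      # samples per second
        width = wav.getsampwidth()     # bytes per sample (2 bytes = 16-bit)
        channels = wav.getnchannels()
    return {
        "sample_rate": rate,
        "bits_per_sample": width * 8,
        "channels": channels,
        "matches": rate == expected_rate and width == expected_sample_width,
    }
```

A file for which `check_wav_format("path/to/audio.wav")["matches"]` is false would need to be resampled or re-encoded before being passed to the model.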

Usage:

from transformers import Wav2Vec2ForSequenceClassification, AutoFeatureExtractor
import torch

device = "cpu"  # use "cuda" if a GPU is available

model_id = "onecxi/open-vakgyata"

processor = AutoFeatureExtractor.from_pretrained(model_id)
model = Wav2Vec2ForSequenceClassification.from_pretrained(model_id).to(device)

Inference:

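import torchaudio

# Load the audio; torchaudio returns a (channels, samples) tensor and the file's sample rate
audio, sr = torchaudio.load("path/to/audio.wav")

# Downmix to mono and resample to the supported 16 kHz rate if needed
audio = audio.mean(dim=0)
target_sr = 16000
if sr != target_sr:
    audio = torchaudio.functional.resample(audio, sr, target_sr)

# Process the waveform and move to the appropriate device
inputs = processor(audio.numpy(), sampling_rate=target_sr, return_tensors="pt").to(device)

# Perform inference
with torch.no_grad():
    logits = model(**inputs).logits

# Get language probabilities and look up the most likely label
probs = logits.softmax(dim=-1)[0].cpu()
language = model.config.id2label[int(probs.argmax())]

print(language)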
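
The snippet above reports only the single most likely language. When the top prediction is uncertain, it may be useful to see the runner-up candidates as well. The sketch below ranks labels by softmax probability using plain Python; the function name `top_k_languages` is a hypothetical helper, and it assumes the logits have first been converted to a flat list of floats.

```python
import math


def top_k_languages(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs.

    `logits` is a flat list of floats, one per class.
    `id2label` maps class index -> label string.
    Hypothetical helper for illustration; not part of the model's API.
    """
    # Numerically stable softmax: subtract the max before exponentiating
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Sort class indices by descending probability and keep the first k
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]
```

With the variables from the inference code, this could be called as `top_k_languages(logits[0].tolist(), model.config.id2label)`.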