---
license: creativeml-openrail-m
language:
- en
- hi
pipeline_tag: automatic-speech-recognition
---
---
language:
- hi
license: apache-2.0
tags:
- whisper-event
metrics:
- wer
model-index:
- name: LLM-HINDI-LARGE - Manan Raval
  results:
  - task:
      type: automatic-speech-recognition
      name: Automatic Speech Recognition
    dataset:
      name: google/fleurs
      type: google/fleurs
      config: hi_in
      split: test
    metrics:
    - type: wer
      value: 12.33
      name: WER
---

## Usage

To transcribe a single audio file with this model, use the following snippet:

```python
>>> import torch
>>> from transformers import pipeline

>>> # path to the audio file to be transcribed
>>> audio = "/path/to/audio.format"
>>> # use the first GPU if one is available, otherwise fall back to the CPU
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"

>>> # long recordings are split into 30-second chunks before decoding
>>> transcribe = pipeline(task="automatic-speech-recognition", model="web30india/LLM-Hindi-Large", chunk_length_s=30, device=device)
>>> # force the decoder to transcribe in Hindi rather than auto-detect the language or translate
>>> transcribe.model.config.forced_decoder_ids = transcribe.tokenizer.get_decoder_prompt_ids(language="hi", task="transcribe")

>>> print('Transcription: ', transcribe(audio)["text"])
```
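
To check a transcription against a known reference, word error rate (WER, the metric listed in this card's metadata) can be computed by hand. The block below is a minimal sketch and not the evaluation script behind the reported score: `word_error_rate` is a hypothetical helper, it splits on whitespace only, and it applies no text normalization, so its numbers may differ from those produced by a normalizing evaluation pipeline.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are whitespace-separated and compared exactly
    (no lowercasing, punctuation stripping, or other normalization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical reference/hypothesis pair, for illustration only.
reference = "यह एक परीक्षण है"
hypothesis = "यह एक परीक्षा है"
print(f"WER: {word_error_rate(reference, hypothesis):.2%}")
```

The function returns a fraction, so multiply by 100 when comparing against a percentage-style figure. In practice, `hypothesis` would be the `"text"` field returned by the pipeline call above.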

## Acknowledgement

This work was done at Virtual Height IT Services Pvt. Ltd.