metadata

widget:
  - example_title: Sample Iban audio
    src: ibf_003_014.wav

Whisper Small for Bahasa Iban - Meisin Lee

This model is a fine-tuned version of openai/whisper-small on the Iban Speech Corpus. More specifically, this Iban ASR model is fine-tuned starting from the most similar language, which in this case is Malay. It achieves the following results on the evaluation set:

Loss: 0.257025
Wer Ortho: 0.158626
Wer: 0.158781
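
For readers unfamiliar with the two WER figures, the sketch below shows one common way such numbers are computed: word-level edit distance divided by the number of reference words. It is an illustration under assumptions, not the evaluation script used for this model. In particular, `simple_normalize` is a hypothetical stand-in (lowercasing and stripping punctuation) for whatever normalizer separates "Wer Ortho" from "Wer", and the example strings are placeholders rather than corpus data.

```python
import re


def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two word lists (row-by-row DP)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate as a fraction of the reference word count."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(ref_words, hyp_words) / len(ref_words)


def simple_normalize(text):
    """Hypothetical normalizer: lowercase and strip punctuation."""
    return re.sub(r"[^\w\s]", "", text.lower())


# Placeholder strings, not taken from the Iban Speech Corpus
reference = "Good morning, Sarawak."
hypothesis = "good morning sarawak"

# Orthographic WER compares the raw text, so case and punctuation count as errors
wer_ortho = wer(reference, hypothesis)
# Normalized WER compares the text after normalization
wer_norm = wer(simple_normalize(reference), simple_normalize(hypothesis))
```

In this toy case every word differs in case or punctuation, so the orthographic score is much worse than the normalized one. The two figures reported above are nearly identical, which suggests that casing and punctuation contribute little to the error on this evaluation set.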

How to Get Started with the Model

Use the code below to run the model in inference mode.

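from transformers import pipeline
import torch

# Use the first GPU if one is available, otherwise fall back to CPU
device = "cuda:0" if torch.cuda.is_available() else "cpu"

pipe = pipeline(
    "automatic-speech-recognition",
    model="meisin123/whisper-small-iban",
    chunk_length_s=30,
    device=device,
)

audio_file = "audio.mp3"  # replace with the path to your own audio

# The pipeline returns a dict; the transcription is stored under the "text" key
transcribed_text = pipe(audio_file, batch_size=16)["text"]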
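
The `chunk_length_s=30` argument above lets the pipeline transcribe recordings longer than Whisper's 30-second input window by cutting them into overlapping chunks. The sketch below approximates how the chunk boundaries fall, assuming the documented default overlap of `chunk_length_s / 6` on each side. The function `chunk_windows` is a hypothetical helper written for illustration; it is not part of transformers, and the real pipeline works on sample indices rather than seconds.

```python
def chunk_windows(duration_s, chunk_length_s=30.0, stride_length_s=None):
    """Approximate (start, end) times in seconds of overlapping chunks."""
    if stride_length_s is None:
        # Assumed default overlap on each side of a chunk
        stride_length_s = chunk_length_s / 6
    # Consecutive chunks advance by the chunk length minus both overlaps
    step = chunk_length_s - 2 * stride_length_s
    if step <= 0:
        raise ValueError("stride is too large for the chunk length")

    windows = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_length_s, duration_s)
        windows.append((start, end))
        if end >= duration_s:
            break  # the last chunk reached the end of the audio
        start += step
    return windows


# A 70-second recording with 30-second chunks and 5-second overlaps
windows = chunk_windows(70.0)
```

With these assumptions, a 70-second file is covered by three chunks that overlap by 10 seconds, and `batch_size=16` lets up to 16 such chunks be decoded together.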

Training Details

Training Data

The model is trained on the Iban Speech Corpus. The dataset is available on Hugging Face; more information here. Iban is an under-resourced language. The Iban language (jaku Iban) is spoken by the Iban, one of the Dayak ethnic groups, who live in Brunei, the Indonesian province of West Kalimantan, and the Malaysian state of Sarawak. It belongs to the Malayic subgroup, a Malayo-Polynesian branch of the Austronesian language family.

Evaluation

Performance and Limitations

There is still a lot of room for improvement in this Iban ASR model.

The accuracy of the model can be further improved with more training data. As Iban is an under-resourced language, there is limited audio data to train on.
Currently, the model is not able to handle code-switched speech. If the audio contains a mix of English and Iban, the model performs poorly on the English portion.

Model Card Contact

For more information, please contact the author at meisin123@gmail.com