Model Details

Model Description

Whisper large-v3 fine-tuned with LoRA on the Hindi portion of the Common Voice 13.0 dataset.

Model Sources

Uses

  • Automatic Speech Recognition (ASR)

Direct Use

import torch
from peft import PeftModel, PeftConfig
from transformers import WhisperForConditionalGeneration, WhisperProcessor, pipeline

peft_model_id = "kasunw/whisper-large-v3-hindi"

peft_config = PeftConfig.from_pretrained(peft_model_id)
# Use half precision on GPU; fall back to float32 on CPU.
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

# Load the base Whisper model named in the adapter config, then attach the LoRA adapter.
model = WhisperForConditionalGeneration.from_pretrained(
    peft_config.base_model_name_or_path, device_map="auto", torch_dtype=torch_dtype
)
model = PeftModel.from_pretrained(model, peft_model_id)
model.config.use_cache = True

processor = WhisperProcessor.from_pretrained(peft_config.base_model_name_or_path, language="Hindi", task="transcribe")

pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    max_new_tokens=128,
    chunk_length_s=30,
    batch_size=16,
    return_timestamps=True,
    torch_dtype=torch_dtype,
    device=model.device,
)

path_to_audio = "audio.mp3"

result = pipe(path_to_audio)
print(result["text"])

Training Details

Training Data

Hindi portion of Common Voice 13.0

Training Procedure

Training followed the instructions given in this notebook.

Training Hyperparameters

  • per_device_train_batch_size=16
  • gradient_accumulation_steps=1
  • learning_rate=1e-5
  • warmup_steps=50
  • fp16=True
  • max_steps=1000
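
As a rough illustration of what the values above imply, the sketch below collects them in a plain dictionary and derives the effective batch size and the number of training examples processed. The key names match the list above; they are also the names of keyword arguments accepted by `Seq2SeqTrainingArguments` in transformers, but whether the author passed them that way is an assumption. The device count is not stated in this card, so `num_devices = 1` is also an assumption.

```python
# Values copied from the hyperparameter list above.
training_kwargs = {
    "per_device_train_batch_size": 16,
    "gradient_accumulation_steps": 1,
    "learning_rate": 1e-5,
    "warmup_steps": 50,
    "fp16": True,
    "max_steps": 1000,
}


def effective_batch_size(kwargs: dict, num_devices: int = 1) -> int:
    """Examples contributing to each optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


# Assumption: a single GPU. The card does not state the device count.
num_devices = 1

batch = effective_batch_size(training_kwargs, num_devices)
examples_seen = batch * training_kwargs["max_steps"]
warmup_fraction = training_kwargs["warmup_steps"] / training_kwargs["max_steps"]

print(f"effective batch size: {batch}")
print(f"examples processed:   {examples_seen}")
print(f"warmup fraction:      {warmup_fraction:.2%}")
```

With a different number of devices, the effective batch size and the number of examples processed scale linearly.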

Metrics

  • word error rate (WER)
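
For readers unfamiliar with the metric, here is a minimal sketch of how WER is defined: the word-level edit distance (substitutions, deletions, insertions) between the reference and the hypothesis, divided by the number of reference words. This card does not say which implementation or text normalization was used for evaluation, so the function below (`word_error_rate`, a hypothetical name) is illustrative only; in practice a library such as `jiwer` or the `evaluate` package's `wer` metric is more common.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion of a reference word
                    curr[j - 1] + 1,     # insertion of a hypothesis word
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
print(word_error_rate("a b c d", "a x c"))
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words.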