|
---
library_name: peft
---
|
|
|
## Config |
|
```python
from datasets import DatasetDict, load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoFeatureExtractor,
    AutoModelForSpeechSeq2Seq,
    AutoProcessor,
    AutoTokenizer,
)

model_name_or_path = "openai/whisper-large-v2"
language = "Marathi"
language_abbr = "mr"
task = "transcribe"
dataset_name = "mozilla-foundation/common_voice_11_0"

# Common Voice 11.0 Marathi: train+validation for training, test for evaluation.
common_voice = DatasetDict()
common_voice["train"] = load_dataset(dataset_name, language_abbr, split="train+validation", use_auth_token=True)
common_voice["test"] = load_dataset(dataset_name, language_abbr, split="test", use_auth_token=True)

# Feature extractor (audio -> log-mel features), tokenizer, and combined processor.
feature_extractor = AutoFeatureExtractor.from_pretrained(model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, language=language, task=task)
processor = AutoProcessor.from_pretrained(model_name_or_path, language=language, task=task)

# Load the base model in 8-bit and attach a LoRA adapter on the attention
# query/value projections; only the adapter weights are trained.
model = AutoModelForSpeechSeq2Seq.from_pretrained(model_name_or_path, load_in_8bit=True, device_map="auto")
config = LoraConfig(r=32, lora_alpha=64, target_modules=["q_proj", "v_proj"], lora_dropout=0.05, bias="none")
model = get_peft_model(model, config)
model.print_trainable_parameters()
# trainable params: 15728640 || all params: 1559033600 || trainable%: 1.0088711365810203
```
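The card stops at the adapter setup, so for context here is a preprocessing and training sketch in the spirit of the PEFT int8 Whisper example. The collator and every `Seq2SeqTrainingArguments` value below are illustrative assumptions, not settings recorded in this card (the PEFT example additionally calls `prepare_model_for_int8_training` on the 8-bit model before wrapping it, which is not shown in the snippet above):

```python
from dataclasses import dataclass
from typing import Any

from datasets import Audio
from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments

# Resample audio to the 16 kHz Whisper expects, then turn each example into
# log-mel input features plus tokenized label ids.
common_voice = common_voice.cast_column("audio", Audio(sampling_rate=16_000))

def prepare_dataset(batch):
    audio = batch["audio"]
    batch["input_features"] = feature_extractor(
        audio["array"], sampling_rate=audio["sampling_rate"]
    ).input_features[0]
    batch["labels"] = tokenizer(batch["sentence"]).input_ids
    return batch

common_voice = common_voice.map(prepare_dataset, remove_columns=common_voice.column_names["train"])

@dataclass
class DataCollatorSpeechSeq2SeqWithPadding:
    """Pads input features and labels separately; label padding becomes -100 so it is ignored by the loss."""
    processor: Any

    def __call__(self, features):
        input_features = [{"input_features": f["input_features"]} for f in features]
        batch = self.processor.feature_extractor.pad(input_features, return_tensors="pt")
        label_features = [{"input_ids": f["labels"]} for f in features]
        labels_batch = self.processor.tokenizer.pad(label_features, return_tensors="pt")
        labels = labels_batch["input_ids"].masked_fill(labels_batch.attention_mask.ne(1), -100)
        batch["labels"] = labels
        return batch

# Hypothetical hyperparameters -- the card does not record the ones actually used.
training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-large-v2-mr-lora",
    per_device_train_batch_size=8,
    learning_rate=1e-3,
    num_train_epochs=3,
    fp16=True,
    remove_unused_columns=False,  # required for PEFT-wrapped models
    label_names=["labels"],
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=common_voice["train"],
    eval_dataset=common_voice["test"],
    data_collator=DataCollatorSpeechSeq2SeqWithPadding(processor=processor),
)
trainer.train()
```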
|
|
|
## Training procedure |
|
|
|
|
|
The following `bitsandbytes` quantization config was used during training; an equivalent `BitsAndBytesConfig` sketch follows the list:
|
- load_in_8bit: True |
|
- load_in_4bit: False |
|
- llm_int8_threshold: 6.0 |
|
- llm_int8_skip_modules: None |
|
- llm_int8_enable_fp32_cpu_offload: False |
|
- llm_int8_has_fp16_weight: False |
|
- bnb_4bit_quant_type: fp4 |
|
- bnb_4bit_use_double_quant: False |
|
- bnb_4bit_compute_dtype: float32 |
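
For reference, the same settings can be expressed as a `transformers.BitsAndBytesConfig` and passed via `quantization_config` instead of the bare `load_in_8bit=True` flag used above. A minimal sketch; note that the `bnb_4bit_*` fields are library defaults and inactive in 8-bit mode:

```python
import torch
from transformers import AutoModelForSpeechSeq2Seq, BitsAndBytesConfig

# Mirrors the recorded quantization config line for line.
bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,
    load_in_4bit=False,
    llm_int8_threshold=6.0,
    llm_int8_skip_modules=None,
    llm_int8_enable_fp32_cpu_offload=False,
    llm_int8_has_fp16_weight=False,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float32,
)

model = AutoModelForSpeechSeq2Seq.from_pretrained(
    "openai/whisper-large-v2", quantization_config=bnb_config, device_map="auto"
)
```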
|
### Framework versions |
|
|
|
|
|
- PEFT 0.5.0 |
|
|
|
|
|
## Evaluation

- WER: 38.514602540132806
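
To run inference, load the 8-bit base model and attach the PEFT adapter on top. A minimal sketch, where `peft_model_id` is a placeholder for wherever this adapter is hosted:

```python
import torch
from datasets import Audio, load_dataset
from peft import PeftModel
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor

peft_model_id = "<user>/<this-adapter-repo>"  # placeholder: replace with this adapter's repo id

base_model = AutoModelForSpeechSeq2Seq.from_pretrained(
    "openai/whisper-large-v2", load_in_8bit=True, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, peft_model_id)
processor = AutoProcessor.from_pretrained("openai/whisper-large-v2", language="Marathi", task="transcribe")

# Grab one 16 kHz test sample to transcribe.
ds = load_dataset("mozilla-foundation/common_voice_11_0", "mr", split="test", streaming=True, use_auth_token=True)
sample = next(iter(ds.cast_column("audio", Audio(sampling_rate=16_000))))["audio"]

inputs = processor(sample["array"], sampling_rate=16_000, return_tensors="pt")
forced_ids = processor.get_decoder_prompt_ids(language="Marathi", task="transcribe")
with torch.no_grad(), torch.autocast(device_type="cuda"):
    generated = model.generate(
        input_features=inputs.input_features.to(model.device),
        forced_decoder_ids=forced_ids,
        max_new_tokens=128,
    )
print(processor.batch_decode(generated, skip_special_tokens=True)[0])
```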
|
|