marianbasti committed
Commit 0d8f5da
1 Parent(s): 02881c0

Update README.md

Files changed (1):
  1. README.md +16 -2
README.md CHANGED
@@ -40,7 +40,7 @@ from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline
 from datasets import load_dataset
 device = "cuda:0" if torch.cuda.is_available() else "cpu"
 torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32
-model_id = "distil-whisper/distil-large-v2"
+model_id = "marianbasti/distil-whisper-large-v3-es"
 model = AutoModelForSpeechSeq2Seq.from_pretrained(
     model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
 )
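For context, the changed line sits inside the README's usage snippet, which (per the surrounding context lines and the `print(result["text"])` in the next hunk header) follows the standard distil-whisper inference pattern. Below is a minimal end-to-end sketch with the new checkpoint, assuming that pattern; the Spanish Common Voice sample is an illustrative stand-in, not taken from the README:

```python
import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline
from datasets import load_dataset

device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

# New checkpoint introduced by this commit
model_id = "marianbasti/distil-whisper-large-v3-es"

model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
)
model.to(device)

processor = AutoProcessor.from_pretrained(model_id)

pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    max_new_tokens=128,
    torch_dtype=torch_dtype,
    device=device,
)

# Illustrative input: any dict with "array" and "sampling_rate" keys, or a
# path to an audio file, works here. Spanish Common Voice is an assumption
# (it is also the training set named below); it is gated on the Hub, so it
# may require `huggingface-cli login` and, on newer datasets versions,
# trust_remote_code=True.
dataset = load_dataset(
    "mozilla-foundation/common_voice_16_1", "es", split="validation", streaming=True
)
sample = next(iter(dataset))["audio"]

result = pipe(sample)
print(result["text"])
```

Passing `torch_dtype=torch.float16` on GPU halves memory use and speeds up generation; on CPU the snippet falls back to float32.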
@@ -155,7 +155,7 @@ print(result["text"])
 ```
 ## Training
 
-The model was trained for 40,000 optimisation steps (or four epochs), and the following training parameters:
+The model was trained for 40,000 optimisation steps (or four epochs) on a single RTX 3090 for ~30 hours, using the following training parameters:
 ```
 --teacher_model_name_or_path "openai/whisper-large-v3"
 --train_dataset_name "mozilla-foundation/common_voice_16_1"
@@ -174,6 +174,20 @@ The model was trained for 40,000 optimisation steps (or four epochs), and the fo
 --logging_steps 25
 --save_total_limit 1
 --max_steps 40000
+--wer_threshold 10
+--per_device_train_batch_size 8
+--per_device_eval_batch_size 8
+--dataloader_num_workers 12
+--preprocessing_num_workers 12
+--output_dir "./"
+--do_train
+--do_eval
+--gradient_checkpointing
+--predict_with_generate
+--overwrite_output_dir
+--use_pseudo_labels "false"
+--freeze_encoder
+--streaming False
 ```
 
 ## Results
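These argument names match Hugging Face's distil-whisper training script (`run_distillation.py` in https://github.com/huggingface/distil-whisper); the README lists only the flags, so the launch command below is a hypothetical reconstruction, not something taken from the commit. As a rough consistency check on the numbers above: 40,000 steps at a per-device batch size of 8 on one GPU is ~320,000 examples seen, so four epochs implies a training set of roughly 80,000 examples (assuming no gradient accumulation, which the listed flags do not mention).

```bash
# Hypothetical launch command assembled from the flags listed in the README.
# The script name and any flags not shown in the diff are assumptions.
python run_distillation.py \
  --teacher_model_name_or_path "openai/whisper-large-v3" \
  --train_dataset_name "mozilla-foundation/common_voice_16_1" \
  --max_steps 40000 \
  --logging_steps 25 \
  --save_total_limit 1 \
  --wer_threshold 10 \
  --per_device_train_batch_size 8 \
  --per_device_eval_batch_size 8 \
  --dataloader_num_workers 12 \
  --preprocessing_num_workers 12 \
  --output_dir "./" \
  --do_train \
  --do_eval \
  --gradient_checkpointing \
  --predict_with_generate \
  --overwrite_output_dir \
  --use_pseudo_labels "false" \
  --freeze_encoder \
  --streaming False
```

`--freeze_encoder` follows the usual distil-whisper recipe: the encoder is kept fixed (initialised from the teacher) and only the student decoder is trained.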
 
40
  from datasets import load_dataset
41
  device = "cuda:0" if torch.cuda.is_available() else "cpu"
42
  torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32
43
+ model_id = "marianbasti/distil-whisper-large-v3-es"
44
  model = AutoModelForSpeechSeq2Seq.from_pretrained(
45
  model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
46
  )
 
155
  ```
156
  ## Training
157
 
158
+ The model was trained for 40,000 optimisation steps (or four epochs), on a single RTX3090 for ~30 hours, using the following training parameters:
159
  ```
160
  --teacher_model_name_or_path "openai/whisper-large-v3"
161
  --train_dataset_name "mozilla-foundation/common_voice_16_1"
 
174
  --logging_steps 25
175
  --save_total_limit 1
176
  --max_steps 40000
177
+ --wer_threshold 10
178
+ --per_device_train_batch_size 8
179
+ --per_device_eval_batch_size 8
180
+ --dataloader_num_workers 12
181
+ --preprocessing_num_workers 12
182
+ --output_dir "./"
183
+ --do_train
184
+ --do_eval
185
+ --gradient_checkpointing
186
+ --predict_with_generate
187
+ --overwrite_output_dir
188
+ --use_pseudo_labels "false"
189
+ --freeze_encoder
190
+ --streaming False
191
  ```
192
 
193
  ## Results