---
language:
- ta
metrics:
- wer
library_name: transformers
pipeline_tag: automatic-speech-recognition
---
# Model Card: whisper-large-v2 fine-tuned for Tamil

<!-- Provide a quick summary of what the model is/does. -->

This is a fine-tuned version of the whisper-large-v2 model for the Tamil language.

#### Training Hyperparameters

- **Training regime:** fp16 mixed precision, with the `Seq2SeqTrainingArguments` configuration below:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./pretrainedwhisper-medium-native-v2",  # change to a repo name of your choice
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,  # increase by 2x for every 2x decrease in batch size
    learning_rate=1e-5,
    warmup_steps=200,
    max_steps=2000,
    gradient_checkpointing=True,   # trade extra compute for lower memory use
    fp16=True,                     # mixed-precision training
    evaluation_strategy="steps",   # renamed to eval_strategy in newer transformers releases
    per_device_eval_batch_size=8,
    predict_with_generate=True,    # generate full sequences during evaluation
    generation_max_length=225,
    save_steps=500,
    eval_steps=500,
    logging_steps=25,
    report_to=["tensorboard"],
    load_best_model_at_end=True,
    metric_for_best_model="wer",   # lower WER is better
    greater_is_better=False,
    push_to_hub=True,
    optim="adamw_bnb_8bit",        # 8-bit AdamW from bitsandbytes to reduce optimizer memory
)
```
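
These arguments would typically be passed to a `Seq2SeqTrainer` together with a WER-based `compute_metrics`, since `metric_for_best_model="wer"` and `predict_with_generate=True` are set. The sketch below is a minimal illustration under that assumption; the base checkpoint, datasets, and data collator are placeholders, not values defined in this card.

```python
# Minimal sketch of wiring training_args into Seq2SeqTrainer (illustrative only).
import evaluate
from transformers import (
    Seq2SeqTrainer,
    WhisperForConditionalGeneration,
    WhisperProcessor,
)

processor = WhisperProcessor.from_pretrained(
    "openai/whisper-large-v2", language="Tamil", task="transcribe"
)
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v2")
wer_metric = evaluate.load("wer")

def compute_metrics(pred):
    # Restore padded label ids, decode predictions and references to text,
    # then compute the word error rate (WER) as a percentage.
    label_ids = pred.label_ids
    label_ids[label_ids == -100] = processor.tokenizer.pad_token_id
    pred_str = processor.batch_decode(pred.predictions, skip_special_tokens=True)
    label_str = processor.batch_decode(label_ids, skip_special_tokens=True)
    return {"wer": 100 * wer_metric.compute(predictions=pred_str, references=label_str)}

trainer = Seq2SeqTrainer(
    args=training_args,
    model=model,
    train_dataset=train_dataset,   # placeholder: preprocessed Tamil ASR train split
    eval_dataset=eval_dataset,     # placeholder: matching eval split
    data_collator=data_collator,   # placeholder: padding collator for speech seq2seq
    compute_metrics=compute_metrics,
    tokenizer=processor.feature_extractor,
)
trainer.train()
```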

### Model Architecture and Objective

The model follows the Whisper encoder-decoder architecture: the encoder produces embeddings from the speech input, and the decoder generates the textual output.
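
As a usage sketch matching the `automatic-speech-recognition` pipeline tag in the metadata above, transcription could look like the following; the repository id and audio path are placeholders, not values taken from this card.

```python
# Minimal inference sketch (illustrative); replace the placeholders with real values.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="your-username/your-fine-tuned-whisper",  # placeholder repo id
    generate_kwargs={"language": "tamil", "task": "transcribe"},
)

print(asr("sample_tamil.wav")["text"])  # placeholder audio file path
```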