---
title: Polish Whisper
emoji: πŸƒ
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 4.8.0
app_file: app.py
pinned: false
license: apache-2.0
---
Possible model improvements
(a) Model-centric approach:
- The biggest improvement would most likely come from using a larger Whisper architecture.
- Increase the batch size and train for longer. We could use a scheduler to raise the batch size steadily until the model stabilizes completely.
- Multi-head training: we could train on all languages with a shared part of the architecture, which could improve generalization and let us use much more data.
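
The batch-size scheduler idea above could be sketched roughly as follows. This is only an illustration, not code from this Space: it reads "raise it" as raising the batch size, and the function name `batch_size_at`, the doubling rule, and all default values are assumptions chosen for the example.

```python
def batch_size_at(step, start=8, maximum=64, double_every=1000):
    """Return the batch size to use at a given training step.

    The size starts at `start` and doubles every `double_every` steps,
    capped at `maximum`. All defaults are illustrative assumptions.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    doublings = step // double_every
    return min(maximum, start * 2 ** doublings)


# Print the schedule at a few sample steps.
for step in (0, 999, 1000, 2500, 10_000):
    print(step, batch_size_at(step))
```

A training loop could call such a function whenever it rebuilds its data loader. Whether a growing batch size actually helps here would have to be checked experimentally.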
(b) Data-centric approach:
- We could use a dataset with better phonetic annotation, such as the TIMIT dataset.
- We could use more data, and more diverse data. Most of the current files were recorded with a laptop microphone, which can hurt predictions on audio from other sources.
- Add noise and other transformations to the dataset.
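
As one possible form of the noise augmentation mentioned above, the sketch below mixes Gaussian white noise into a mono signal at a chosen signal-to-noise ratio. It is a dependency-free illustration, not the pipeline used in this Space: the function name `add_white_noise`, the choice of white noise, and the 10 dB value are all assumptions. In practice an array library would usually replace the plain Python lists.

```python
import math
import random


def add_white_noise(samples, snr_db, rng=None):
    """Return a copy of `samples` with Gaussian white noise mixed in.

    `samples` is a sequence of floats (mono audio) and `snr_db` is the
    target signal-to-noise ratio in decibels. Hypothetical helper.
    """
    if rng is None:
        rng = random.Random()
    if not samples:
        return []
    signal_power = sum(s * s for s in samples) / len(samples)
    # SNR(dB) = 10 * log10(signal_power / noise_power), solved for noise_power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise_std = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, noise_std) for s in samples]


# One second of a 440 Hz tone at 16 kHz, augmented at 10 dB SNR.
clean = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
noisy = add_white_noise(clean, snr_db=10, rng=random.Random(0))
print(len(noisy))
```

Lower `snr_db` values give noisier audio. Sampling the SNR at random per training example is a common way to vary the augmentation strength.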
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference