xls-r-300m-sv-robust / prepare_dataset_lm.py
"""Prepare and upload the dataset used to train a Swedish n-gram language model (LM) that boosts ASR."""
# To get started, see this Colab notebook:
# https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/Boosting_Wav2Vec2_with_n_grams_in_Transformers.ipynb#scrollTo=IrAzjWc3Ok2l
# The actual code is in the notebook train_n_gram_lm_with_KenLM.ipynb.