AutoNLP supports fine-tuning of speech models, so you can easily train an automatic speech recognition (ASR) model.
Let’s assume our data is in CSV format and looks something like the following:
|sentence|path|
|---|---|
|hello, how are you?|a1.mp3|
|i am fine|a2.mp3|
|training asr models|a3.mp3|
Here, we see only three samples but you can have as many samples as you like: 5000, 10000, 100000 or even a million or more! Please note that the specified audio files must exist on disk.
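Because every audio file listed in the CSV must exist on disk, it can be worth checking this before uploading. The following is a minimal sketch using only the Python standard library; the function name `find_missing_audio` is hypothetical, and the column name `path` and the audio directory are assumptions that should be adjusted to your data.

```python
import csv
from pathlib import Path


def find_missing_audio(csv_path, audio_dir, path_column="path"):
    """Return the audio file names listed in the CSV that are not found in audio_dir."""
    audio_dir = Path(audio_dir).expanduser()
    missing = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        # DictReader handles quoted fields, e.g. sentences that contain commas.
        for row in csv.DictReader(f):
            if not (audio_dir / row[path_column]).is_file():
                missing.append(row[path_column])
    return missing
```

For example, `find_missing_audio("train.csv", "~/audio_data/clips")` returns an empty list when every listed clip is present.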
Once you have the data in the format specified above, you are ready to train models using AutoNLP. Yes, it’s that easy.
The first step is to log in to AutoNLP:
$ autonlp login --api-key YOUR_HUGGING_FACE_API_TOKEN
If you do not know your Hugging Face API token, create an account on huggingface.co; you will find the token in your account settings. Please do not share your API token with anyone!
Once you have logged in, you can create a new project:
$ autonlp create_project --name speech1 --language fr --task speech_recognition
When creating the project, you can choose the language with the --language parameter.
The next step is to upload files. Here, column mapping is very important: the columns from the original data are mapped to AutoNLP column names. In the data above, the original columns are “sentence” and “path”. We do not need more columns for a speech recognition problem.
The AutoNLP columns for a speech recognition model are text and path.
The original columns, thus, need to be mapped to text and path. This is done in the upload command. You also need to tell AutoNLP what kind of split you are uploading: train or valid.
$ autonlp upload --project speech1 --split train \
    --col_mapping sentence:text,path:path --files train.csv --path_to_audio ~/audio_data/clips
Similarly, upload the validation file:
$ autonlp upload --project speech1 --split valid \
    --col_mapping sentence:text,path:path --files valid.csv --path_to_audio ~/audio_data/clips
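If you start from a single CSV rather than separate train and validation files, the two files can be produced with a short script. This is only a sketch under stated assumptions: the function name `split_csv`, the 10% validation fraction, and the fixed seed are illustrative choices, not something AutoNLP prescribes.

```python
import csv
import random


def split_csv(source, train_out, valid_out, valid_fraction=0.1, seed=42):
    """Shuffle the rows of `source` and write them to two CSVs sharing its header.

    Returns a (train_count, valid_count) tuple.
    """
    with open(source, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)

    # A seeded shuffle keeps the split reproducible between runs.
    random.Random(seed).shuffle(rows)
    # Keep at least one validation row, but never more rows than exist.
    n_valid = min(len(rows), max(1, int(len(rows) * valid_fraction)))

    for out_path, subset in ((valid_out, rows[:n_valid]), (train_out, rows[n_valid:])):
        with open(out_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(subset)
    return len(rows) - n_valid, n_valid
```

Calling `split_csv("data.csv", "train.csv", "valid.csv")` writes both files with the same column names as the source, so the same column mapping applies to each.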
Column mapping is always from original column to AutoNLP column (original_column:autonlp_column).
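To make the `original_column:autonlp_column` format concrete, here is a small sketch of how such a mapping string can be read and checked against a CSV header. It only illustrates the format; it is not AutoNLP's own parser, and the function names `parse_col_mapping` and `missing_columns` are hypothetical.

```python
def parse_col_mapping(mapping):
    """Turn 'sentence:text,path:path' into {'sentence': 'text', 'path': 'path'}."""
    result = {}
    for pair in mapping.split(","):
        original, sep, target = pair.strip().partition(":")
        if not sep or not original or not target:
            raise ValueError(f"expected original_column:autonlp_column, got {pair!r}")
        result[original] = target
    return result


def missing_columns(mapping, header):
    """Return the original columns named in the mapping that are absent from the header."""
    return [col for col in parse_col_mapping(mapping) if col not in header]
```

With a header of `sentence` and `path`, `missing_columns("sentence:text,path:path", ["sentence", "path"])` returns an empty list; a non-empty result points to a misspelled original column name.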
Please note that you can upload multiple files by separating their paths with commas; however, the column names must be the same in each file.
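Before uploading several files at once, you may want to confirm that they share the same column names. The sketch below compares CSV headers using the standard library; the function names are hypothetical, and treating column order as irrelevant is an assumption rather than documented AutoNLP behaviour.

```python
import csv


def read_header(csv_path):
    """Return the first row of the CSV as a list of column names."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return next(csv.reader(f), [])


def headers_match(csv_paths):
    """Return True if every file has the same set of column names."""
    headers = [set(read_header(p)) for p in csv_paths]
    return all(h == headers[0] for h in headers)
```

For instance, `headers_match(["train_part1.csv", "train_part2.csv"])` (file names chosen for illustration) should be True before both paths are passed, comma-separated, to the upload command.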
Once you have uploaded the files successfully, you can start training by using the train command:
$ autonlp train --project speech1
And that’s it!
Your model will start training and you can monitor the training if you wish.