Grapheme-to-Phoneme Model

#1
by bofenghuang - opened

Hi, thanks for this fantastic work! I'm curious to know how you converted the transcriptions in MCV into phonemes. Could you share a bit about the process?

Laboratoire de Mécanique des Structures et des Systèmes Couplés org
edited Nov 14, 2023

Dear Bofeng,

Thanks for your interest in our work!

After investigating several solutions for the G2P task (including fairly recent neural approaches), we ran bootphon/phonemizer with the EspeakBackend on all the text data of CommonVoice (GitHub: bootphon/phonemizer), and then generated the vocab.json needed by the tokenizer for wav2vec2.

This extends readily to any language or dataset with text transcriptions, as long as the phonemizer backend is accurate enough for that language not to introduce errors in the grapheme-to-phoneme conversion.
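For readers who want to try this, here is a minimal sketch of the two steps described above: phonemizing text with the EspeakBackend, then collecting the phoneme symbols into a vocab.json. It is not the exact script used for this model. The language code "fr-fr", the separator layout (phones split by spaces, words by "|"), the special tokens "<pad>" and "<unk>", and the helper names phonemize_texts, build_vocab and save_vocab are all assumptions made for illustration.

```python
import json


def phonemize_texts(texts, language="fr-fr"):
    """Phonemize a list of sentences with eSpeak through bootphon/phonemizer."""
    # Imported lazily: needs `pip install phonemizer` plus a system espeak-ng install.
    from phonemizer.backend import EspeakBackend
    from phonemizer.separator import Separator

    backend = EspeakBackend(language, preserve_punctuation=False, with_stress=False)
    # Assumed layout: phones separated by spaces, words separated by "|".
    separator = Separator(phone=" ", word="| ", syllable="")
    return backend.phonemize(texts, separator=separator, strip=True)


def build_vocab(phoneme_strings, special_tokens=("<pad>", "<unk>")):
    """Map every whitespace-separated symbol to an integer id."""
    symbols = set()
    for line in phoneme_strings:
        symbols.update(line.split())
    # Special tokens come first (assumed convention), then symbols in sorted order.
    ordered = list(special_tokens) + sorted(symbols - set(special_tokens))
    return {token: index for index, token in enumerate(ordered)}


def save_vocab(vocab, path="vocab.json"):
    """Write the vocabulary as UTF-8 JSON, keeping IPA characters readable."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(vocab, handle, ensure_ascii=False, indent=2)
```

Typical use would be `save_vocab(build_vocab(phonemize_texts(sentences)))`, where `sentences` is the list of transcriptions. The special tokens expected by a given tokenizer may differ, so check them against its configuration.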

Hope this helps,

Best regards,


Éric Bavu
Senior Researcher / Full Professor - Acoustics
Cnam/LMSSC




Thanks for your detailed response, Eric!

I've also experimented with phonemizer and found its results much more precise than other tools such as epitran (https://github.com/dmort27/epitran). However, it takes a bit more time. I'll launch it on my dataset to see if the running time is acceptable.

Laboratoire de Mécanique des Structures et des Systèmes Couplés org
edited Nov 14, 2023

@bofenghuang: if you want to reduce the time spent on G2P data preparation, you can use the map function of Hugging Face datasets with the batched=True option, and prepare the data once and for all before training. This creates a dataset in which a phonetic transcription attribute is added alongside the audio and the text transcription.

It doesn't take long to go through CommonVoice 13 fr, for example, which corresponds to 2.5k hours of audio (and this is done once and for all before training, so it does not slow down the training process).
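A minimal sketch of this batched preparation, under a few assumptions: the text column is named "sentence" (as in the Common Voice datasets on the Hub), the new column is called "phonemes", and phonemize_fn stands for any function that takes a list of strings and returns a list of phoneme strings (for example a thin wrapper around EspeakBackend.phonemize). The helper names add_phonemes and prepare_dataset are hypothetical.

```python
def add_phonemes(batch, phonemize_fn, text_column="sentence"):
    """Batched map function: receives a dict of lists, returns the new column."""
    # One call per batch instead of one call per sentence keeps the overhead low.
    return {"phonemes": phonemize_fn(batch[text_column])}


def prepare_dataset(dataset, phonemize_fn, batch_size=1000):
    """Add a 'phonemes' column to a Hugging Face `datasets.Dataset`."""
    return dataset.map(
        add_phonemes,
        batched=True,
        batch_size=batch_size,
        fn_kwargs={"phonemize_fn": phonemize_fn},
    )
```

The result can then be written with `save_to_disk` and reloaded at training time, so the phonemization cost is paid only once. Passing `num_proc` to `map` may speed things up further, provided the phonemizer backend behaves well across processes.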

Best Regards,


Éric Bavu
Senior Researcher / Full Professor - Acoustics
Cnam/LMSSC




bofenghuang changed discussion status to closed
