Question about Wav2vec 2.0 - RoBERT integration

#1
by aciobanitei - opened

Hi,

I have a fine-tuned wav2vec 2.0 model for Romanian ASR. I would also like to integrate a language model, but I don't quite understand how to do it.
I can run the wav2vec model to obtain the text from a sound file.
I can also evaluate the wav2vec model against a dataset.
I can run this model on a text and obtain some numbers, but I don't understand what they mean.

In some tutorials, I found integrations between wav2vec and KenLM language models (https://huggingface.co/blog/wav2vec2-with-ngram).
I understand that I need to use Wav2Vec2ProcessorWithLM, but I don't understand which values to pass there, since I don't have an .arpa file for KenLM.
If this kind of validation cannot be done straightforwardly, I would at least like to know how to translate the numbers returned by outputs = model(**inputs) into actual text.
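For the last question, a minimal sketch of what those numbers are: wav2vec 2.0 outputs a logits matrix (one row of vocabulary scores per audio frame), and greedy CTC decoding turns it into text by taking the argmax per frame, collapsing repeats, and dropping blanks. The toy vocabulary and scores below are invented for illustration:

```python
# Hypothetical toy example: greedy CTC decoding of a wav2vec2-style
# logits matrix (time steps x vocabulary). The vocabulary and the
# scores below are made up for illustration only.

VOCAB = ["<pad>", "a", "n", " "]  # index 0 acts as the CTC blank token

# One row of scores per audio frame; the argmax per row is a token id.
logits = [
    [0.1, 2.0, 0.3, 0.1],  # -> "a"
    [0.1, 2.5, 0.2, 0.1],  # -> "a" (repeat, collapsed)
    [3.0, 0.1, 0.2, 0.1],  # -> blank (dropped)
    [0.1, 0.2, 2.2, 0.1],  # -> "n"
]

def greedy_ctc_decode(logits, vocab, blank_id=0):
    """Argmax per frame, collapse consecutive repeats, drop blanks."""
    ids = [max(range(len(row)), key=row.__getitem__) for row in logits]
    out, prev = [], None
    for i in ids:
        if i != prev and i != blank_id:
            out.append(vocab[i])
        prev = i
    return "".join(out)

print(greedy_ctc_decode(logits, VOCAB))  # -> "an"
```

With the real model, the equivalent is roughly `predicted_ids = torch.argmax(outputs.logits, dim=-1)` followed by `processor.batch_decode(predicted_ids)`.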

Regards

Hello,

I think you are looking for an LM suitable for generating text. The RoBERT-large model is trained on the MLM and NSP tasks. While this model is capable of generating text (e.g., by appending a MASK token to the end of the string at each step), I think a GPT-2 model is more suitable for your task. Have a look at the following models: https://huggingface.co/readerbench/RoGPT2-medium , https://huggingface.co/readerbench/RoGPT2-base , https://huggingface.co/readerbench/RoGPT2-large .
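One common way to combine such a causal LM with an ASR model is n-best rescoring: have wav2vec2 produce several candidate transcripts, score each with the LM, and pick the candidate with the best combined score. A minimal sketch, with a hypothetical `lm_log_prob` standing in for the sentence log-probability you would compute from RoGPT2 (all numbers below are invented for illustration):

```python
# Hypothetical sketch of n-best rescoring: combine each hypothesis's
# acoustic score with a language-model score. The lm_log_prob stand-in
# below is invented; in practice it would come from a causal LM such as
# readerbench/RoGPT2-base (e.g. the negative loss times token count
# from AutoModelForCausalLM).

def lm_log_prob(sentence):
    # Stand-in LM: pretend sentences containing "buna" are more fluent.
    return (1.0 if "buna" in sentence else 0.0) - 0.1 * len(sentence.split())

def rescore(hypotheses, lm_weight=0.5):
    """hypotheses: list of (text, acoustic_log_prob). Returns best text."""
    def combined(h):
        text, acoustic = h
        return acoustic + lm_weight * lm_log_prob(text)
    return max(hypotheses, key=combined)[0]

nbest = [
    ("buna ziua", -3.2),  # acoustic log-probabilities from the ASR model
    ("bune ziua", -3.0),
]
print(rescore(nbest))  # -> "buna ziua"
```

The `lm_weight` parameter controls how much the LM score is trusted relative to the acoustic score; it is usually tuned on a held-out set.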
