Training dataset for the N-gram LM
#3 by kamil - opened
Hello!
Thank you for your amazing contribution. Could you please tell me which dataset you used for the LM training?
Was it just a merge of all the sentences you used for the W2V2 fine-tuning?
Thanks
Hi @kamil ,
Thank you for your interest!
Yes, I only used the sentences from the ASR training set for the N-gram LM. To improve performance, however, you can collect more text data and combine it with the existing set.
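For anyone who wants to try combining extra text with the ASR training sentences, here is a minimal sketch of how the merge could look. The function name `build_lm_corpus`, the example sentences, and the normalization (lowercasing, whitespace collapsing) are assumptions for illustration, not the preprocessing actually used for this model; the normalization should be adapted to match the acoustic model's vocabulary.

```python
from typing import Iterable, List


def build_lm_corpus(*sources: Iterable[str]) -> List[str]:
    """Merge several sentence collections into one deduplicated corpus.

    Hypothetical helper: the normalization below is an assumption and
    should mirror whatever text cleaning the ASR training set received.
    """
    seen = set()
    corpus = []
    for source in sources:
        for sentence in source:
            # Lowercase and collapse runs of whitespace into single spaces.
            normalized = " ".join(sentence.lower().split())
            # Skip empty lines and sentences already in the corpus.
            if normalized and normalized not in seen:
                seen.add(normalized)
                corpus.append(normalized)
    return corpus


# Placeholder data standing in for the real sentence sets.
asr_train_sentences = ["Hello everyone", "Thank you very much"]
extra_sentences = ["thank you  very much", "See you soon"]

corpus = build_lm_corpus(asr_train_sentences, extra_sentences)
```

The resulting list can then be written to a plain text file with one sentence per line, which is the input layout N-gram toolkits such as KenLM typically read.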
bofenghuang changed discussion status to closed