Size of the training data?

#1
by JayKe700 - opened

Hi alvanlii

I tried to reproduce your experimental results and found that the amount of training data does not match. From your epochs, training steps, and batch size, I infer that your training set contains about 180,000 samples in total. However, the three datasets you listed (Common Voice 11 Canto Train Set, CantoMap, Cantonse-ASR) give me only about 97,600 samples in total. Have I missed any information?
Looking forward to your reply.

Thanks
Jayke

Hi Jayke, I doubled the training data and applied different augmentation to the duplicated set. Sorry, I should have been clearer about that.
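
For anyone checking the same arithmetic, here is a minimal sketch of how a per-epoch sample count can be estimated from the training configuration, and how doubling the listed data compares. The step count, batch size, and epoch count below are hypothetical placeholders, not the values from this model's training run, and `estimate_samples_per_epoch` is an illustrative helper name. The estimate is only approximate, since it ignores details such as a partially filled final batch.

```python
def estimate_samples_per_epoch(total_steps, batch_size, epochs, grad_accum=1):
    """Rough number of samples seen per epoch.

    Each optimizer step consumes batch_size * grad_accum samples, so the
    total samples seen is total_steps * batch_size * grad_accum; dividing
    by the number of epochs gives the approximate dataset size.
    """
    return total_steps * batch_size * grad_accum / epochs


# Hypothetical configuration, chosen only to illustrate the calculation.
hypothetical_steps = 15_000
hypothetical_batch_size = 36
hypothetical_epochs = 3

estimated = estimate_samples_per_epoch(
    hypothetical_steps, hypothetical_batch_size, hypothetical_epochs
)

# Sample count reported in the thread for the three listed datasets.
listed_samples = 97_600

# Duplicating the data (with different augmentation on the copy)
# doubles the number of samples seen per epoch.
doubled_samples = 2 * listed_samples

print(f"estimated from config: {estimated:,.0f}")
print(f"listed datasets:       {listed_samples:,}")
print(f"after doubling:        {doubled_samples:,}")
```

With real values substituted in, an estimate close to twice the listed total would be consistent with the duplicated, differently augmented set described above.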

Hi alvanlii, thanks for your reply. I got it.

alvanlii changed discussion status to closed
