Training dataset + Hyperparameters
Hello,
Thank you for making this public. It looks like there has been a recent rise in non-English models.
Do you plan to make the dataset public? If not, would it be possible to publish at least a small portion of it, so that others can see how a similar dataset could be modeled in different languages?
Could you also provide some details on the training procedure: the hyperparameters, your hardware setup, and the total time it took to finish training?
Thank you!
I would also love to see the data made public, as I would like to reproduce it with different models. Thanks
Hey @ all.
We are already planning to publish part of the dataset that we used for training. This data was created entirely by augmenting an existing top English dataset.
I think the dataset should make our approach clearer for the open-source community.
Best Regards,
David