Training time and resources

#1 by adendek

Dear authors,

Could you please share the resources (time and hardware) that were needed to train or finetune this model?

We trained on a single V100 GPU in Google Colab with a batch size of 100. On average, an epoch took about 2 seconds, and one sweep consisted of 10 epochs.

However, the model is small enough to fit on even an 8 GB GPU, so training would also work there, just significantly slower.
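For reference, here is a minimal PyTorch sketch of a run matching the numbers above (batch size 100, 10 epochs, GPU if available). This is not the authors' actual script: the model architecture, dataset, and learning rate are placeholders.

```python
import time

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder model and synthetic data; only the batch size (100) and
# epoch count (10) come from the discussion above.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
dataset = TensorDataset(torch.randn(10_000, 32), torch.randint(0, 2, (10_000,)))
loader = DataLoader(dataset, batch_size=100, shuffle=True)

device = "cuda" if torch.cuda.is_available() else "cpu"  # V100 in the Colab run
model.to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # placeholder LR
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):  # 10 epochs per sweep, as described above
    start = time.time()
    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: {time.time() - start:.1f}s")  # ~2 s/epoch on a V100
```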
