How much data did you use?

#1
by kevinpro - opened

365 * 16 = 5840

I noticed that you trained the model for 1 epoch (365 steps) with a batch size of 16.

Does that mean you only trained on 5,840 samples?

Lightblue KK. org

We use sample packing, so it's 5,840 sequences, each packed to 8,096 tokens. Each packed sequence contains multiple samples, so the number of original samples is larger than 5,840.

Thank you for your explanation!
My understanding is that you used all the data: 76,338 + 669 + 6,206 samples. Also, due to packing, these samples were compiled into training data consisting of 8,096 * 5,840 tokens. Please let me know if my understanding is correct.
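The arithmetic in this thread can be checked with a quick sketch. It assumes that the batch size of 16 is the effective global batch size (no extra gradient accumulation or multi-GPU multiplier) and that every packed sequence is completely full, so the token count is an upper bound rather than an exact figure. The variable names are illustrative only and are not taken from the actual training config.

```python
# Rough sanity check of the numbers quoted in this thread.
# Assumptions (not confirmed by the authors): batch size 16 is the effective
# global batch size, and every packed sequence is filled to the full length.

steps = 365            # optimizer steps in the single epoch
batch_size = 16        # sequences per step (assumed effective batch size)
packed_length = 8096   # tokens per packed sequence, as quoted in the reply

# Dataset sizes quoted in the follow-up comment
dataset_sizes = [76_338, 669, 6_206]

packed_sequences = steps * batch_size            # sequences seen in one epoch
total_samples = sum(dataset_sizes)               # original (unpacked) samples
max_tokens = packed_sequences * packed_length    # upper bound on training tokens

# Averages implied by the numbers above
samples_per_sequence = total_samples / packed_sequences
max_tokens_per_sample = max_tokens / total_samples

print(f"packed sequences:       {packed_sequences:,}")
print(f"original samples:       {total_samples:,}")
print(f"token upper bound:      {max_tokens:,}")
print(f"samples per sequence:   {samples_per_sequence:.2f}")
print(f"max tokens per sample:  {max_tokens_per_sample:.1f}")
```

Under these assumptions, about 14 original samples would fit into each packed sequence on average, which is consistent with 5,840 packed sequences covering all 83,213 samples.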
