Confusion about the amount of data used in the training process.

#15
by StevenTu - opened

While reading Section 4.2 "Training" of the paper, I noticed that Orca2 was trained on more than 6 million data points, over a considerable number of epochs. Is training at this scale necessary? Also, what is the quality of these 6 million-plus data points, and how diverse are they?

