Number of training tokens

#74
by aynetdia

Hi,

The technical report states that phi-1.5 was trained on a dataset of 30 billion tokens (Section 1); however, Table 1 and Section 2.3 indicate that it was trained for 150 billion tokens. Does this mean the model went over the 30B-token synthetic dataset 5 times, i.e. that pre-training lasted 5 epochs?
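As a quick sanity check of that arithmetic, here is a minimal sketch (the token counts are the figures quoted from the report):

```python
# Figures quoted in the phi-1.5 technical report
dataset_tokens = 30e9   # Section 1: ~30B-token training dataset
total_tokens = 150e9    # Table 1 / Section 2.3: 150B tokens seen during pre-training

# If the discrepancy is explained by repeated passes over the data,
# the implied number of epochs is the ratio of the two figures.
implied_epochs = total_tokens / dataset_tokens
print(f"Implied epochs: {implied_epochs:.0f}")  # -> Implied epochs: 5
```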

Best,
Ansar

Microsoft org

Hi @aynetdia! I hope everything is going well with you.

Exactly, it was pre-trained for 5 epochs.

Regards,
Gustavo.

gugarosa changed discussion status to closed
