Checkpoint "step200000-tokens838B" seems like fully trained model

#4
by Cartinoe5930 - opened

Thank you for your great work!

I evaluated the checkpoints on the Korean benchmark Haerae Bench to analyze how OLMo's multilingual ability evolves over the pre-training steps. The results revealed that the performance of the "step200000-tokens838B" checkpoint is identical to that of the fully trained "main" checkpoint. There seems to have been an error when the "step200000-tokens838B" checkpoint was saved. Please check whether there are any errors in the "step200000-tokens838B" checkpoint!
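For what it's worth, one way to confirm whether two checkpoint revisions actually contain the same weights is to compare their parameter tensors directly rather than relying on benchmark scores. A minimal sketch (the `allenai/OLMo-7B` repo id and the `transformers` loading shown in the comments are assumptions for illustration; the helper itself works on any dict of arrays):

```python
import numpy as np

def state_dicts_match(a, b, atol=0.0):
    """Return True if two checkpoint state dicts have identical keys
    and element-wise (near-)identical weights."""
    if a.keys() != b.keys():
        return False
    return all(np.allclose(np.asarray(a[k]), np.asarray(b[k]), atol=atol)
               for k in a)

# Illustrative usage (downloads both models, so it is left commented out):
# from transformers import AutoModelForCausalLM
# step = AutoModelForCausalLM.from_pretrained(
#     "allenai/OLMo-7B", revision="step200000-tokens838B")
# main = AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B")
# print(state_dicts_match(
#     {k: v.numpy() for k, v in step.state_dict().items()},
#     {k: v.numpy() for k, v in main.state_dict().items()}))
```

If this returns True for an intermediate checkpoint compared against "main", the upload was almost certainly duplicated rather than the training having converged early.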

(Attached image: table of Haerae Bench results across checkpoints)

Cartinoe5930 changed discussion title from Revision "step200000-tokens838B" seems like fully trained model to Checkpoint "step200000-tokens838B" seems like fully trained model
Allen Institute for AI org

Thank you for pointing this out! Unfortunately you are right. Something went wrong with the upload-to-HF job for some of the checkpoints, which we will investigate.

Allen Institute for AI org

Steps 200k to 251k incorrectly match the fully trained model. We will try to fix them quickly. Please let us know if you see any other incorrect checkpoints.

Thank you for your kind response! Unfortunately, my experiment only covered the models shown in the table, so I am not sure about the other checkpoints. 🥲 I will let you know if I see any other strange results in future experiments!

Allen Institute for AI org

Steps 200k to 251k are now updated, as far as I can tell. Please let us know if you encounter any other issues.

shanearora changed discussion status to closed
