jinwonkim93
Scheduler implementation of Continual Pre-Training of Large Language Models: How to (re)warm your model? (#1273)
8430db2 unverified