bartpotrykus's picture
30m training steps with linear learning rate scheduler applied after 15m steps
6881971