No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models Paper • 2307.06440 • Published Jul 12