Common Corpus • Collection • The largest public domain dataset for training LLMs. • 27 items • Updated 9 days ago • 106
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling • Paper • 2311.00430 • Published Nov 1, 2023 • 53