DPO datasets for DE Collection • A collection of DPO datasets for the German (DE) language • 6 items • Updated 12 days ago • 1
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method Paper • 2402.17193 • Published Feb 27, 2024 • 23
Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens Paper • 2401.17377 • Published Jan 30, 2024 • 31
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling Paper • 2311.00430 • Published Nov 1, 2023 • 53
ModuleFormer: Learning Modular Large Language Models From Uncurated Data Paper • 2306.04640 • Published Jun 7, 2023 • 7
Retentive Network: A Successor to Transformer for Large Language Models Paper • 2307.08621 • Published Jul 17, 2023 • 165
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis Paper • 2307.01952 • Published Jul 4, 2023 • 71
TART: A plug-and-play Transformer module for task-agnostic reasoning Paper • 2306.07536 • Published Jun 13, 2023 • 10