ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated 24 days ago • 122
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state-of-the-art open post-training recipes. • 32 items • Updated 6 days ago • 64
MagicQuill: An Intelligent Interactive Image Editing System Paper • 2411.09703 • Published Nov 14, 2024 • 63
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published Nov 7, 2024 • 113
OpenCoder Collection OpenCoder is an open and reproducible code LLM family which matches the performance of top-tier code LLMs. • 8 items • Updated Nov 23, 2024 • 79
LLM-Assisted Code Cleaning For Training Accurate Code Generators Paper • 2311.14904 • Published Nov 25, 2023 • 4
LayerSkip Collection Models continually pretrained using LayerSkip - https://arxiv.org/abs/2404.16710 • 8 items • Updated Nov 21, 2024 • 46
Direct Preference Optimization Datasets Collection Datasets suitable for DPO based on having 'chosen', 'rejected', and 'prompt' columns. Created using librarian-bots/dataset-column-search-api • 4412 items • Updated 25 days ago • 6
CursorCore: Assist Programming through Aligning Anything Paper • 2410.07002 • Published Oct 9, 2024 • 13
Enhancing Training Efficiency Using Packing with Flash Attention Paper • 2407.09105 • Published Jul 12, 2024 • 14