sentence-transformers-from-synthetic-data Collection Example of using distilabel to generate synthetic triplets data for fine-tuning a Sentence Transformer model • 3 items • Updated 1 day ago • 15
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization Paper • 2405.15071 • Published 9 days ago • 30
Article Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models Mar 20 • 24
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model Paper • 2405.04434 • Published 25 days ago • 10
Article seemore: Implement a Vision Language Model from Scratch By AviSoori1x • 20 days ago • 42
Article The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare Apr 19 • 70
Eurus Collection Advancing LLM Reasoning Generalists with Preference Trees • 11 items • Updated Apr 15 • 22
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 58
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6 • 176
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14 • 122
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking Paper • 2403.09629 • Published Mar 14 • 54
Simple and Scalable Strategies to Continually Pre-train Large Language Models Paper • 2403.08763 • Published Mar 13 • 48
Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks Paper • 2403.05185 • Published Mar 8 • 19
Common 7B Language Models Already Possess Strong Math Capabilities Paper • 2403.04706 • Published Mar 7 • 16
UDOP Collection UDOP is a general multimodal model for document AI • 4 items • Updated 10 days ago • 20
RoleCraft-GLM: Advancing Personalized Role-Playing in Large Language Models Paper • 2401.09432 • Published Dec 17, 2023 • 2
RoleEval: A Bilingual Role Evaluation Benchmark for Large Language Models Paper • 2312.16132 • Published Dec 26, 2023 • 2
Instruction-tuned Language Models are Better Knowledge Learners Paper • 2402.12847 • Published Feb 20 • 25
LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models Paper • 2310.08659 • Published Oct 12, 2023 • 20
datasets-SPIN Collection Generated synthetic data used to finetune SPIN. • 8 items • Updated Feb 9 • 10
The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs Paper • 2210.14986 • Published Oct 26, 2022 • 4
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5 • 61
ReGAL: Refactoring Programs to Discover Generalizable Abstractions Paper • 2401.16467 • Published Jan 29 • 7
Convergent Learning: Do different neural networks learn the same representations? Paper • 1511.07543 • Published Nov 24, 2015 • 2
Medical Merges Collection Playful merges that try to improve small medical LMs by merging them with models with higher reasoning capabilities. • 35 items • Updated Mar 5 • 2
Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding Paper • 2306.02858 • Published Jun 5, 2023 • 14
Adapting Large Language Models via Reading Comprehension Paper • 2309.09530 • Published Sep 18, 2023 • 70
Preference Datasets for DPO Collection This collection contains a list of curated preference datasets for DPO fine-tuning for intent alignment of LLMs • 7 items • Updated Apr 4 • 20
Comparing DPO with IPO and KTO Collection A collection of chat models to explore the differences between three alignment techniques: DPO, IPO, and KTO. • 56 items • Updated Jan 9 • 31
Model Merging Collection Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 29 items • Updated 1 day ago • 181
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws Paper • 2401.00448 • Published Dec 31, 2023 • 25
Generative AI for Math: Part I -- MathPile: A Billion-Token-Scale Pretraining Corpus for Math Paper • 2312.17120 • Published Dec 28, 2023 • 24
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models Paper • 2312.06585 • Published Dec 11, 2023 • 26
Awesome feedback datasets Collection A curated list of datasets with human or AI feedback. Useful for training reward models or applying techniques like DPO. • 19 items • Updated Apr 12 • 53
Orca 2: Teaching Small Language Models How to Reason Paper • 2311.11045 • Published Nov 18, 2023 • 69
ChatDoctor: A Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge Paper • 2303.14070 • Published Mar 24, 2023 • 9
The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning Paper • 2312.01552 • Published Dec 4, 2023 • 26
MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records Paper • 2308.14089 • Published Aug 27, 2023 • 24
Medical QA Datasets Collection A collection of medical question answering (QA) datasets • 19 items • Updated Oct 31, 2023 • 16