view article Article makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch By AviSoori1x • 26 days ago • 25
Model Merging Collection Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 29 items • Updated 3 days ago • 181
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2 • 103
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models Paper • 2404.18796 • Published Apr 29 • 63
view article Article Expanding Model Context and Creating Chat Models with a Single Click By maywell • Apr 28 • 33
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22 • 122
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22 • 239
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM Paper • 2403.07816 • Published Mar 12 • 37
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 58
ORPO Collection This is the official collection of "ORPO: Monolithic Preference Optimization without Reference Model". • 5 items • Updated Apr 12 • 10
Matryoshka Embedding Models Collection https://huggingface.co/blog/matryoshka • 12 items • Updated 18 days ago • 10
Self-Discover: Large Language Models Self-Compose Reasoning Structures Paper • 2402.03620 • Published Feb 6 • 102
PokéLLMon: A Human-Parity Agent for Pokémon Battles with Large Language Models Paper • 2402.01118 • Published Feb 2 • 28
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research Paper • 2402.00159 • Published Jan 31 • 55
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling Paper • 2401.16380 • Published Jan 29 • 46
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation Paper • 2401.08417 • Published Jan 16 • 27
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models Paper • 2401.01335 • Published Jan 2 • 61
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling Paper • 2312.15166 • Published Dec 23, 2023 • 55
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 131
Awesome feedback datasets Collection A curated list of datasets with human or AI feedback. Useful for training reward models or applying techniques like DPO. • 19 items • Updated Apr 12 • 53
Journal Club Collection Candidate papers to read in the H4 journal club • 54 items • Updated Apr 21 • 24
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision Paper • 2312.09390 • Published Dec 14, 2023 • 32
TinyGSM: achieving >80% on GSM8k with small language models Paper • 2312.09241 • Published Dec 14, 2023 • 34
Awesome SFT datasets Collection A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12 • 94
Orca 2: Teaching Small Language Models How to Reason Paper • 2311.11045 • Published Nov 18, 2023 • 69
Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2 Paper • 2311.10702 • Published Nov 17, 2023 • 17
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models Paper • 2310.08491 • Published Oct 12, 2023 • 50
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models Paper • 2309.03883 • Published Sep 7, 2023 • 14
Large Language Models Cannot Self-Correct Reasoning Yet Paper • 2310.01798 • Published Oct 3, 2023 • 30
Textbooks Are All You Need II: phi-1.5 technical report Paper • 2309.05463 • Published Sep 11, 2023 • 84
Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding Paper • 2307.15337 • Published Jul 28, 2023 • 34
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback Paper • 2307.15217 • Published Jul 27, 2023 • 34
Measuring Faithfulness in Chain-of-Thought Reasoning Paper • 2307.13702 • Published Jul 17, 2023 • 26
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets Paper • 2307.10928 • Published Jul 20, 2023 • 11
Llama 2: Open Foundation and Fine-Tuned Chat Models Paper • 2307.09288 • Published Jul 18, 2023 • 235
Flacuna: Unleashing the Problem Solving Power of Vicuna using FLAN Fine-Tuning Paper • 2307.02053 • Published Jul 5, 2023 • 23