Would I Lie To You? Inference Time Alignment of Language Models using Direct Preference Heads Paper • 2405.20053 • Published 14 days ago • 2
Granite Code Models Collection A series of code models trained by IBM licensed under Apache 2.0 license. We release both the base pretrained and instruct models. • 18 items • Updated 14 days ago • 143
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Apr 18 • 584
Orca-Math: Unlocking the potential of SLMs in Grade School Math Paper • 2402.14830 • Published Feb 16 • 23
OpenMath Collection A collection of models and datasets introduced in "OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset" • 15 items • Updated Feb 19 • 29
Self-Discover: Large Language Models Self-Compose Reasoning Structures Paper • 2402.03620 • Published Feb 6 • 104
Awesome reward models Collection A curated collection of reward models to use with techniques like rejection sampling and RLHF / RLAIF • 4 items • Updated Apr 12 • 4
Journal Club Collection Candidate papers to read in the H4 journal club • 54 items • Updated Apr 21 • 24
Mixtral HQQ Quantized Models Collection 4-bit and 2-bit Mixtral models quantized using https://github.com/mobiusml/hqq • 9 items • Updated Mar 29 • 14
LLaMA Beyond English: An Empirical Study on Language Capability Transfer Paper • 2401.01055 • Published Jan 2 • 50