Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Paper • 2402.09844 • Published Feb 15 • 15
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published 6 days ago • 213
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 51
Article Orchestration of Experts: The First-Principle Multi-Model System By alirezamsh • 12 days ago • 8
Article Pollen-Vision: Unified interface for Zero-Shot vision models in robotics Mar 25 • 6
Article Open Source All About Data Processing, Dataverse By EujeongChoi • 24 days ago • 2
SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning Paper • 2401.16013 • Published Jan 29 • 17
QuRating: Selecting High-Quality Data for Training Language Models Paper • 2402.09739 • Published Feb 15 • 3
Simple linear attention language models balance the recall-throughput tradeoff Paper • 2402.18668 • Published Feb 28 • 16
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated 16 days ago • 286
Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning Paper • 2402.03046 • Published Feb 5 • 4
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5 • 60
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity Paper • 2401.17072 • Published Jan 30 • 22
Leaderboards and benchmarks ✨ Collection Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 55 items • Updated 12 days ago • 50
Paloma Collection Dataset and baseline models for Paloma, a benchmark of language model fit to 585 textual domains • 8 items • Updated Feb 1 • 13
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 252
Paloma: A Benchmark for Evaluating Language Model Fit Paper • 2312.10523 • Published Dec 16, 2023 • 11
Journal Club Collection Candidate papers to read in the H4 journal club • 54 items • Updated 7 days ago • 23
Recent models: last 100 repos, sorted by creation date Collection The last 100 repos I have created. Sorted by creation date descending, so the most recently created repos appear at the top. • 121 items • Updated Jan 31 • 438
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI Paper • 2311.16502 • Published Nov 27, 2023 • 33
Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2 Paper • 2311.10702 • Published Nov 17, 2023 • 17
System 2 Attention (is something you might need too) Paper • 2311.11829 • Published Nov 20, 2023 • 38
Top 10% instruction tuning datasets Collection Collects datasets with 'instruction' in the name and more than 1 download and in the top 10% for the number of likes • 13 items • Updated Sep 25, 2023 • 5
Orca 2: Teaching Small Language Models How to Reason Paper • 2311.11045 • Published Nov 18, 2023 • 68
zephyr story Collection sources mentioned by hf.co/thomwolf tweet: x.com/Thom_Wolf/status/1720503998518640703 • 8 items • Updated Jan 24 • 15
Historical - Spaces of the Week Collection All Spaces of the Week...from all weeks • 636 items • Updated Jan 17 • 18
BitNet: Scaling 1-bit Transformers for Large Language Models Paper • 2310.11453 • Published Oct 17, 2023 • 93
LLM Leaderboard best models ❤️‍🔥 Collection A daily uploaded list of models with best evaluations on the LLM leaderboard: • 63 items • Updated 3 days ago • 283
A Function Interpretation Benchmark for Evaluating Interpretability Methods Paper • 2309.03886 • Published Sep 7, 2023 • 1
Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers Paper • 2309.08532 • Published Sep 15, 2023 • 50
On the Origin of LLMs: An Evolutionary Tree and Graph for 15,821 Large Language Models Paper • 2307.09793 • Published Jul 19, 2023 • 45
Llama 2: Open Foundation and Fine-Tuned Chat Models Paper • 2307.09288 • Published Jul 18, 2023 • 232
How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources Paper • 2306.04751 • Published Jun 7, 2023 • 4
INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models Paper • 2306.04757 • Published Jun 7, 2023 • 4
PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization Paper • 2306.05087 • Published Jun 8, 2023 • 5
Improving Open Language Models by Learning from Organic Interactions Paper • 2306.04707 • Published Jun 7, 2023 • 3
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena Paper • 2306.05685 • Published Jun 9, 2023 • 20