BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks Jun 18 • 41
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22 • 118
Article A failed experiment: Infini-Attention, and why we should keep trying? Aug 14 • 50
Agentless: Demystifying LLM-based Software Engineering Agents Paper • 2407.01489 • Published Jul 1 • 42
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25 • 86
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions Paper • 2406.15877 • Published Jun 22 • 45
Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24 • 177
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations Paper • 2405.18392 • Published May 28 • 12
Leaderboards and benchmarks ✨ Collection Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 78 items • Updated 1 day ago • 89