The Unreasonable Ineffectiveness of the Deeper Layers Paper • 2403.17887 • Published Mar 26, 2024 • 83
Scalable Pre-training of Large Autoregressive Image Models Paper • 2401.08541 • Published Jan 16, 2024 • 38
codellama/CodeLlama-13b-Instruct-hf Text Generation • 13B • Updated Apr 12, 2024 • 1.99k • 160