Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering Paper • 2411.11504 • Published Nov 18 • 19
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training Paper • 2411.13476 • Published Nov 20 • 15
Hymba: A Hybrid-head Architecture for Small Language Models Paper • 2411.13676 • Published Nov 20 • 38
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published Nov 22 • 55
Star Attention: Efficient LLM Inference over Long Sequences Paper • 2411.17116 • Published 29 days ago • 47
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? Paper • 2411.16489 • Published 29 days ago • 40
nGPT: Normalized Transformer with Representation Learning on the Hypersphere Paper • 2410.01131 • Published Oct 1 • 9
Training Large Language Models to Reason in a Continuous Latent Space Paper • 2412.06769 • Published 15 days ago • 61
Weighted-Reward Preference Optimization for Implicit Model Fusion Paper • 2412.03187 • Published 20 days ago • 9
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models Paper • 2412.11605 • Published 8 days ago • 15
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN Paper • 2412.13795 • Published 6 days ago • 18
A Post-Training Enhanced Optimization Approach for Small Language Models Paper • 2411.02939 • Published Nov 5