Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey Paper • 2412.18619 • Published 17 days ago • 42
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published 23 days ago • 70
JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation Paper • 2411.07975 • Published Nov 12, 2024 • 27
LLM-based Optimization of Compound AI Systems: A Survey Paper • 2410.16392 • Published Oct 21, 2024 • 14
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Nov 28, 2024 • 451
view article Article Meet Yi-Coder: A Small but Mighty LLM for Code By lorinma • Sep 4, 2024 • 14
view article Article Understanding Vector Quantization in VQ-VAE By ariG23498 • Aug 28, 2024 • 13
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming Paper • 2408.16725 • Published Aug 29, 2024 • 52
LongRecipe: Recipe for Efficient Long Context Generalization in Large Languge Models Paper • 2409.00509 • Published Aug 31, 2024 • 37
Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts Paper • 2408.15664 • Published Aug 28, 2024 • 11
I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm Paper • 2408.08072 • Published Aug 15, 2024 • 32
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents Paper • 2407.16741 • Published Jul 23, 2024 • 68
Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing Paper • 2407.08770 • Published Jul 11, 2024 • 19
Human-like Episodic Memory for Infinite Context LLMs Paper • 2407.09450 • Published Jul 12, 2024 • 59