- Can LLMs Maintain Fundamental Abilities under KV Cache Compression? (arXiv:2502.01941)
- FastKV: KV Cache Compression for Fast Long-Context Processing with Token-Selective Propagation (arXiv:2502.01068)
- Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch (arXiv:2501.18512)
- Optimizing Large Language Model Training Using FP4 Quantization (arXiv:2501.17116)
- Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models (arXiv:2501.12370)
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (arXiv:2501.12948)
- O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning (arXiv:2501.12570)
- Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback (arXiv:2501.12895)
- VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding (arXiv:2501.13106)
- Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training (arXiv:2501.11425)
- Towards Best Practices for Open Datasets for LLM Training (arXiv:2501.08365)
- Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models (arXiv:2501.09686)