QuZO: Quantized Zeroth-Order Fine-Tuning for Large Language Models Paper • 2502.12346 • Published Feb 17, 2025 • 1
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients Paper • 2407.08296 • Published Jul 11, 2024 • 34
When are 1.58 bits enough? A Bottom-up Exploration of BitNet Quantization Paper • 2411.05882 • Published Nov 8, 2024 • 1
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 615
1-bit AI Infra: Part 1.1, Fast and Lossless BitNet b1.58 Inference on CPUs Paper • 2410.16144 • Published Oct 21, 2024 • 5
BitNet: Scaling 1-bit Transformers for Large Language Models Paper • 2310.11453 • Published Oct 17, 2023 • 101
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published 4 days ago • 87
FANformer: Improving Large Language Models Through Effective Periodicity Modeling Paper • 2502.21309 • Published Feb 28, 2025 • 1
Cobra: Efficient Line Art COlorization with BRoAder References Paper • 2504.12240 • Published 6 days ago • 26
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft Paper • 2504.08388 • Published 12 days ago • 39
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published 11 days ago • 120
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published 14 days ago • 148
Efficient Model Selection for Time Series Forecasting via LLMs Paper • 2504.02119 • Published 20 days ago • 16