Focusing on What Matters: Saliency-Harnessing Accurate Routing for Diffusion MoE Paper • 2606.26938 • Published 11 days ago • 5
stefanocarrera/autophagycode_M_mercury_Qwen3-8B_lr0.0001_c142_surplexity_t1_g3_run2 Updated Jun 3 • 1
OSCAR: Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization Paper • 2605.17757 • Published May 18 • 66
WorldKV: Efficient World Memory with World Retrieval and Compression Paper • 2605.22718 • Published May 21 • 42
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published May 12 • 196
Code-as-Room: Generating 3D Rooms from Top-Down View Images via Agentic Code Synthesis Paper • 2605.18451 • Published May 18 • 41
Learning to Foresee: Unveiling the Unlocking Efficiency of On-Policy Distillation Paper • 2605.11739 • Published May 13 • 60
ai-safety-institute/Qwen3.6-27B-gender_secret_female-merged Text Generation • 27B • Updated May 14 • 548 • 1