Zikun Li

zikun-li

AI & ML interests

None yet

Organizations

None yet

zikun-li's activity

upvoted a paper 10 days ago

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published 12 days ago • 133

upvoted a paper 20 days ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published 25 days ago • 106

upvoted 3 papers 24 days ago

Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published 26 days ago • 26

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published 27 days ago • 61

Baichuan-Omni-1.5 Technical Report

Paper • 2501.15368 • Published 27 days ago • 61

upvoted 4 papers 29 days ago

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22 • 97

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

Paper • 2501.13106 • Published about 1 month ago • 83

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published about 1 month ago • 328

Improving Video Generation with Human Feedback

Paper • 2501.13918 • Published 30 days ago • 49

upvoted 7 papers about 1 month ago

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Paper • 2501.05366 • Published Jan 9 • 95

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published Jan 8 • 90

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 257

The GAN is dead; long live the GAN! A Modern GAN Baseline

Paper • 2501.05441 • Published Jan 9 • 89

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Paper • 2501.06186 • Published Jan 10 • 61

VideoRAG: Retrieval-Augmented Generation over Video Corpus

Paper • 2501.05874 • Published Jan 10 • 67

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13 • 91

upvoted a paper 4 months ago

Sample-Efficient Alignment for LLMs

Paper • 2411.01493 • Published Nov 3, 2024 • 11

upvoted 3 papers 5 months ago

TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention

Paper • 2410.05076 • Published Oct 7, 2024 • 8

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 137

HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Paper • 2409.16191 • Published Sep 24, 2024 • 42