ldwang · ftgreat

AI & ML interests

LLM, MLLM, Infra

Recent Activity

upvoted an article 7 days ago

Open-R1: a fully open reproduction of DeepSeek-R1

liked a dataset 7 days ago

dgslibisey/MuSiQue

commented on a paper 7 days ago

Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training

ldwang's activity

upvoted an article 7 days ago

Open-R1: a fully open reproduction of DeepSeek-R1

Article • Jan 28 • 836

upvoted a paper 7 days ago

Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training

Paper • 2503.18929 • Published 16 days ago • 3

upvoted a paper 9 days ago

Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback

Paper • 2503.22230 • Published 13 days ago • 43

upvoted 2 collections 22 days ago

OpenSeek

OpenSeek • 3 items • Updated Feb 25 • 2

Aquila

22 items • Updated Mar 9 • 4

upvoted a collection about 1 month ago

SimpleRL

The collection for the Project "Simple Reinforcement Learning for Reasoning" • 2 items • Updated Feb 19 • 6

upvoted 2 papers about 1 month ago

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published Jan 26 • 69

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 148

upvoted a collection about 2 months ago

DeepSeek-R1

8 items • Updated Jan 21 • 600

upvoted an article about 2 months ago

Large-scale Near-deduplication Behind BigCode

Article • May 16, 2023 • 23

upvoted a paper 3 months ago

Free Process Rewards without Process Labels

Paper • 2412.01981 • Published Dec 2, 2024 • 35

upvoted a collection 3 months ago

OpenCoder Datasets

OpenCoder datasets! • 6 items • Updated Nov 15, 2024 • 40

upvoted a paper 3 months ago

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published Nov 7, 2024 • 124

upvoted 2 collections 3 months ago

OpenCoder Model

OpenCoder Models • 9 items • Updated Nov 19, 2024 • 10

MiscModels

6 items • Updated Mar 7 • 1

upvoted a paper 3 months ago

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25, 2024 • 96

upvoted a collection 3 months ago

Qwen2.5

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated Feb 26 • 585