Yuxiang Zhang's picture

1 4

Yuxiang Zhang

TokerZ

·

AI & ML interests

LLM-based Agent, RL, Large Reasoning Model

Recent Activity

upvoted a paper about 1 month ago

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

upvoted a paper about 2 months ago

o1-Coder: an o1 Replication for Coding

authored a paper about 2 months ago

Agent models: Internalizing Chain-of-Action Generation into Reasoning models

View all activity

Organizations

None yet

TokerZ's activity

upvoted a paper about 1 month ago

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 122

upvoted a paper about 2 months ago

o1-Coder: an o1 Replication for Coding

Paper • 2412.00154 • Published Nov 29, 2024 • 45

authored a paper about 2 months ago

Agent models: Internalizing Chain-of-Action Generation into Reasoning models

Paper • 2503.06580 • Published Mar 9 • 17

upvoted a paper about 2 months ago

Agent models: Internalizing Chain-of-Action Generation into Reasoning models

Paper • 2503.06580 • Published Mar 9 • 17

commented a paper about 2 months ago

Agent models: Internalizing Chain-of-Action Generation into Reasoning models

Paper • 2503.06580 • Published Mar 9 • 17 •

authored a paper 3 months ago

Don't Command, Cultivate: An Exploratory Study of System-2 Alignment

Paper • 2411.17075 • Published Nov 26, 2024 • 1

authored a paper 4 months ago

OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning

Paper • 2412.16849 • Published Dec 22, 2024 • 9

upvoted a paper 4 months ago

OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning

Paper • 2412.16849 • Published Dec 22, 2024 • 9

authored 2 papers 5 months ago

TeleChat Technical Report

Paper • 2401.03804 • Published Jan 8, 2024 • 8

o1-Coder: an o1 Replication for Coding

Paper • 2412.00154 • Published Nov 29, 2024 • 45