Yihe Deng PRO

ydeng9

https://yihe-deng.notion.site/Yihe-Deng-167ab2d2c1fb80b3a76dfb120f716c84

Yihe__Deng

AI & ML interests

LLM post-training

Recent Activity

published a dataset 6 days ago

ydeng9/llavaone_grpo_v2

published a dataset 6 days ago

ydeng9/OpenVLThinker_sft_iter2

upvoted a paper 11 days ago

When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning

ydeng9's activity

New activity in ydeng9/OpenVLThinker-7B 19 days ago

Highlight code

#2 opened 20 days ago by

New activity in ydeng9/OpenVLThinker-7B 20 days ago

Add library name and pipeline tag

#1 opened 21 days ago by

commented a paper 21 days ago

OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement

Paper • 2503.17352 • Published 24 days ago • 21

New activity in DuoGuard/DuoGuard-1.5B-transfer 2 months ago

Add link to code

#1 opened 2 months ago by

New activity in DuoGuard/DuoGuard-1B-Llama-3.2-transfer 2 months ago

Add link to Github repository

#1 opened 2 months ago by

New activity in DuoGuard/DuoGuard-0.5B 2 months ago

Add link to Github repository

#3 opened 2 months ago by

Add library name

#2 opened 2 months ago by

Add link to paper, add pipeline tag

#1 opened 2 months ago by

commented a paper 2 months ago

DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails

Paper • 2502.05163 • Published Feb 7 • 22

commented a paper 6 months ago

Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning

Paper • 2410.22304 • Published Oct 29, 2024 • 18

commented a paper 9 months ago

MIRAI: Evaluating LLM Agents for Event Forecasting

Paper • 2407.01231 • Published Jul 1, 2024 • 18

commented a paper 10 months ago

MIRAI: Evaluating LLM Agents for Event Forecasting

Paper • 2407.01231 • Published Jul 1, 2024 • 18

New activity in UCLA-AGI/zephyr-7b-sft-full-SPIN-iter1 about 1 year ago

Training code

#1 opened about 1 year ago by

New activity in UCLA-AGI/zephyr-7b-sft-full-SPIN-iter2 about 1 year ago

How to reproduce the results ?

#1 opened over 1 year ago by

New activity in open-llm-leaderboard-old/results over 1 year ago

Delete UCLA-AGI/test

#21 opened over 1 year ago by

Delete UCLA-AGI/zephyr-7b-sft-full-spin-iter1

#20 opened over 1 year ago by