19

Yuan

MinakamiYuki

AI & ML interests

None yet

Organizations

None yet

Collections 1

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 139

Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models

Paper • 2409.18943 • Published Sep 27, 2024 • 29

From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge

Paper • 2411.16594 • Published Nov 25, 2024 • 40

Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 38

Models

None public yet

Datasets

None public yet
