arxiv:2501.02790
Shentao Yang
shentaoyang
AI & ML interests
Generative AI, Large Language Models, RLHF, RLAIF, Reinforcement Learning
Recent Activity
authored a paper 1 day ago
Preference-grounded Token-level Guidance for Language Model Fine-tuning
authored a paper 1 day ago
A Dense Reward View on Aligning Text-to-Image Diffusion with Preference
authored a paper 1 day ago
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
Organizations
None yet