arxiv:2606.04923
Xuekang Wang
wxk123
AI & ML interests
LLM Safety
Recent Activity
authored a paper 1 day ago
Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning
updated a model 6 months ago
wxk123/llama-3.2-1b-instruct-augmented