CodeI/O Collection Collection for CodeI/O @ https://codei-o.github.io/ • 15 items • Updated Feb 13 • 6
VersaPRM Collection Collection of VersaPRMs using various training configurations • 8 items • Updated Feb 8 • 1
SEABO: A Simple Search-Based Method for Offline Imitation Learning Paper • 2402.03807 • Published Feb 6, 2024
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published Feb 10 • 147
PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation Paper • 2306.03615 • Published Jun 6, 2023
A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning Paper • 2410.14660 • Published Oct 18, 2024
RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors Paper • 2412.10713 • Published Dec 14, 2024
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-72B-Instruct-style2 Viewer • Updated Feb 4 • 6.82k • 61
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-72B-Instruct-style1 Viewer • Updated Feb 4 • 6.82k • 64
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-7B-Instruct-style2 Viewer • Updated Jan 9 • 6.82k • 88
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-7B-Instruct-style1 Viewer • Updated Jan 9 • 6.82k • 82
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-7B-Instruct-style2 Viewer • Updated Jan 9 • 6.82k • 84
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-7B-Instruct-style1 Viewer • Updated Jan 9 • 6.82k • 95
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-1.5B-Instruct-style2 Viewer • Updated Jan 9 • 6.82k • 85
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-1.5B-Instruct-style1 Viewer • Updated Jan 9 • 6.82k • 84
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-1.5B-Instruct-style2 Viewer • Updated Jan 9 • 6.82k • 87
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-1.5B-Instruct-style1 Viewer • Updated Jan 9 • 6.82k • 81