- Natural Language Reinforcement Learning
  Paper • 2411.14251 • Published • 29
- Benjamin-eecs/Llama-3.1-8B-Instruct-NLRL-TicTacToe-Value
  Feature Extraction • Updated • 18
- Benjamin-eecs/Llama-3.1-8B-Instruct-NLRL-TicTacToe-Policy
  Feature Extraction • Updated • 16
- Waterhorse/Llama-3.1-8B-Instruct-NLRL-Breakthrough-Value
  Feature Extraction • Updated • 14
Bo Liu
Benjamin-eecs
AI & ML interests
Reinforcement Learning, Reasoning, Machine Learning Systems
Recent Activity
upvoted a paper 9 days ago
Self-rewarding correction for mathematical reasoning
liked a Space 13 days ago
bigcomputer/SWE-Arena
authored a paper 13 days ago
EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine
Organizations
Collections 1
models 2
datasets None public yet