arxiv:2105.06413
Alexey G
grib0ed0v
AI & ML interests
LLM / RLHF / AI4Everything.
Recent Activity
liked a Space 9 days ago: Qwen/Qwen2.5-Coder-Artifacts
upvoted a paper 9 days ago: Large Language Models Can Self-Improve in Long-context Reasoning
Organizations
None yet
Papers
3
Models
12
grib0ed0v/ppo-LunarLander-v2 • Reinforcement Learning • 3
grib0ed0v/rl_course_vizdoom_health_gathering_supreme • Reinforcement Learning
grib0ed0v/ppo-LunarLander-v2-unit8 • Reinforcement Learning
grib0ed0v/poca-SoccerTwos • Reinforcement Learning • 45
grib0ed0v/a2c-PandaReachDense-v3 • Reinforcement Learning • 1
grib0ed0v/ppo-PyramidsRND • Reinforcement Learning • 7
grib0ed0v/ppo-SnowballTarget • Reinforcement Learning • 20
grib0ed0v/Reinforce-Pixelcopter-PLE-v0 • Reinforcement Learning
grib0ed0v/Reinforce-CartPole-v1 • Reinforcement Learning
grib0ed0v/dqn-SpaceInvadersNoFrameskip-v4 • Reinforcement Learning • 5
Datasets
None public yet