A2C Agent playing PandaReachJointsDense-v3

This is a trained model of an A2C agent playing PandaReachJointsDense-v3 using the stable-baselines3 library.

📊 Training Results & Analysis

The full training metrics, including the 100% success rate and rapid convergence reported for this model, are in the Weights & Biases training report: 👉 Full Weights & Biases Training Report

💻 Usage (with Stable-baselines3)

You can load this model and watch it run in a rendered environment using the code below:

import gymnasium as gym
import panda_gym
from huggingface_sb3 import load_from_hub
from stable_baselines3 import A2C

# Download the model from the Hub
repo_id = "bsarmento/a2c-panda-reach-td1"
filename = "a2c_panda_reach_model.zip"
checkpoint = load_from_hub(repo_id, filename)

# Load the model into memory
model = A2C.load(checkpoint)

# Create the environment in human render mode
env = gym.make("PandaReachJointsDense-v3", render_mode="human")

# Run the trained agent for 1000 steps, resetting whenever an episode ends
obs, info = env.reset()
for _ in range(1000):
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()

# Close the render window and release the simulator
env.close()

Evaluation results

mean_reward on PandaReachJointsDense-v3 (self-reported): -1.80 +/- 1.71
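
A mean_reward figure of this form is a mean and standard deviation of per-episode returns. The sketch below shows one way such a number could be computed for any model exposing `predict` and any Gymnasium-style environment. The helper names `run_episodes` and `summarize`, the default of 10 episodes, and the use of the population standard deviation are assumptions for illustration; the card does not state how the reported value was obtained.

```python
import statistics


def run_episodes(model, env, n_episodes=10):
    """Roll out `model` in `env` and return the undiscounted return of each episode.

    `n_episodes=10` is an illustrative default, not a value taken from the model card.
    """
    episode_returns = []
    for _ in range(n_episodes):
        obs, info = env.reset()
        done = False
        total = 0.0
        while not done:
            # Deterministic actions, matching the usage snippet on this card
            action, _states = model.predict(obs, deterministic=True)
            obs, reward, terminated, truncated, info = env.step(action)
            total += float(reward)
            done = terminated or truncated
        episode_returns.append(total)
    return episode_returns


def summarize(episode_returns):
    """Return (mean, population standard deviation) of the episode returns."""
    return statistics.fmean(episode_returns), statistics.pstdev(episode_returns)
```

With a loaded `model` and an environment created without `render_mode="human"`, `mean, std = summarize(run_episodes(model, env))` followed by `print(f"{mean:.2f} +/- {std:.2f}")` prints a value in the same format as the one reported here. Stable-Baselines3 also ships `stable_baselines3.common.evaluation.evaluate_policy`, which returns a mean and standard deviation of episode rewards and may be the more convenient choice in practice.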