A2C Agent playing PandaReachJointsDense-v3
This is a trained model of an A2C agent playing PandaReachJointsDense-v3 using the stable-baselines3 library.
Training Results & Analysis
The complete training metrics, including the reported 100% success rate and rapid convergence, are available in the Full Weights & Biases Training Report.
Usage (with Stable-Baselines3)
You can load and evaluate this model using the code below:
import gymnasium as gym
import panda_gym  # importing registers the Panda environments with Gymnasium
from huggingface_sb3 import load_from_hub
from stable_baselines3 import A2C
# Download the model from the Hub
repo_id = "bsarmento/a2c-panda-reach-td1"
filename = "a2c_panda_reach_model.zip"
checkpoint = load_from_hub(repo_id, filename)
# Load the model into memory
model = A2C.load(checkpoint)
# Create the environment in human render mode
env = gym.make("PandaReachJointsDense-v3", render_mode="human")
# Enjoy the trained agent
obs, info = env.reset()
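for _ in range(1000):
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    # Start a new episode once the current one ends
    if terminated or truncated:
        obs, info = env.reset()
# Release the rendering window and simulator resources
env.close()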
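To collect per-episode returns rather than only watch the agent, the loop above can be wrapped in a small helper. The following is a minimal sketch and not part of the original model card: `run_episodes` and its `predict` parameter are hypothetical names, and the function assumes only the Gymnasium-style `reset`/`step` interface used above.

```python
from typing import Any, Callable, List


def run_episodes(env: Any, predict: Callable[[Any], Any], n_episodes: int = 10) -> List[float]:
    """Run full episodes and return the undiscounted return of each one.

    Hypothetical helper: `env` is any object with Gymnasium-style
    reset()/step() methods, and `predict` maps an observation to an action.
    """
    returns = []
    for _ in range(n_episodes):
        obs, info = env.reset()
        episode_return = 0.0
        done = False
        while not done:
            action = predict(obs)
            obs, reward, terminated, truncated, info = env.step(action)
            episode_return += float(reward)
            # An episode ends when the task terminates or the time limit truncates it
            done = terminated or truncated
        returns.append(episode_return)
    return returns
```

With the `model` and `env` created above, this could be called as `run_episodes(env, lambda obs: model.predict(obs, deterministic=True)[0])`.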
Evaluation results
- mean_reward on PandaReachJointsDense-v3 (self-reported): -1.80 +/- 1.71
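The figure above has the form mean +/- standard deviation. As a minimal sketch of how such a summary can be computed from a list of per-episode returns, the snippet below assumes the population standard deviation (the default behaviour of NumPy's `np.std`). The function name and the sample returns are hypothetical and are not the model's actual evaluation data.

```python
from statistics import fmean, pstdev
from typing import Sequence


def summarize_returns(returns: Sequence[float]) -> str:
    """Format episode returns as 'mean +/- std' using the population std."""
    mean_reward = fmean(returns)
    std_reward = pstdev(returns)
    return f"{mean_reward:.2f} +/- {std_reward:.2f}"


# Hypothetical returns, for illustration only
example_returns = [-1.0, -2.0, -3.0]
print(summarize_returns(example_returns))  # prints: -2.00 +/- 0.82
```

Because the dense reward variant of this task yields negative rewards, a mean closer to zero indicates better performance.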