How to use costus/a2c-CartPole-v1 with stable-baselines3:
from huggingface_sb3 import load_from_hub

checkpoint = load_from_hub(
    repo_id="costus/a2c-CartPole-v1",
    filename="{MODEL FILENAME}.zip",
)
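Once `load_from_hub` has returned a local checkpoint path, the usual next step is to load it with `A2C.load` and roll the policy out in the environment. The sketch below shows one way to do that. `run_episode` is a hypothetical helper (not part of stable-baselines3), and it only assumes an SB3-style `predict` method and a Gymnasium-style `reset`/`step` interface. The checkpoint filename is left as the placeholder from the snippet above, because the card does not state it.

```python
def run_episode(model, env, max_steps=500, seed=None):
    """Roll out a single episode and return its total reward.

    Assumptions (hypothetical helper, not an SB3 function):
      - model.predict(obs, deterministic=True) returns (action, state)
      - env.reset(seed=...) returns (obs, info)
      - env.step(action) returns (obs, reward, terminated, truncated, info)
    """
    obs, _info = env.reset(seed=seed)
    total_reward = 0.0
    for _ in range(max_steps):
        # Deterministic actions are the usual choice when evaluating a policy.
        action, _state = model.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, _info = env.step(action)
        total_reward += float(reward)
        if terminated or truncated:
            break
    return total_reward


# Possible usage with the real libraries (left commented out because the
# checkpoint filename is not given in the card):
#
#   import gymnasium as gym
#   from huggingface_sb3 import load_from_hub
#   from stable_baselines3 import A2C
#
#   checkpoint = load_from_hub(
#       repo_id="costus/a2c-CartPole-v1",
#       filename="{MODEL FILENAME}.zip",
#   )
#   model = A2C.load(checkpoint)
#   env = gym.make("CartPole-v1")
#   print(run_episode(model, env))
#   env.close()
```

CartPole-v1 gives a reward of 1 per step and truncates episodes at 500 steps, so a policy that balances the pole for the whole episode should return 500.0 here.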
A2C agent for solving the CartPole-v1 game
This is an A2C agent that solves CartPole-v1 using the stable-baselines3 library.
Prerequisites
pip install gymnasium
pip install stable-baselines3
pip install huggingface-sb3
pip install moviepy
import os
import sys
import gymnasium as gym
from stable_baselines3 import A2C
# Configuration and parameters
# Number of training epochs; can be overridden by the first command-line argument
epochs = 10
if len(sys.argv) >= 2:
    epochs = int(sys.argv[1])
# Path definitions
base_path = os.getcwd()
models_dir = os.path.join(base_path, 'models', 'A2C')
logs_dir = os.path.join(base_path, 'logs')
# Create the directories if they do not exist
os.makedirs(models_dir, exist_ok=True)
os.makedirs(logs_dir, exist_ok=True)
# Environment initialization
env = gym.make('CartPole-v1')
# A2C model
model = A2C('MlpPolicy', env, verbose=1, tensorboard_log=logs_dir)
# Training loop
timesteps_per_epoch = 10000
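for i in range(1, epochs + 1):
    # reset_num_timesteps=False keeps the timestep counter (and the
    # TensorBoard curve) continuous across successive learn() calls
    model.learn(total_timesteps=timesteps_per_epoch, reset_num_timesteps=False, tb_log_name="A2C_run")
    # Save one checkpoint per epoch, named after the cumulative timestep count
    current_step = i * timesteps_per_epoch
    model_path = os.path.join(models_dir, f"a2c_cartpole_{current_step}")
    model.save(model_path)
    print(f"Model saved: {model_path}")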
env.close()
print("Training finished!")
...
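The training loop above writes one checkpoint per epoch, named after the cumulative timestep count. A minimal sketch for locating the most recent one is shown below, assuming stable-baselines3 appended its usual `.zip` extension when saving. `latest_checkpoint` is a hypothetical helper, not part of the library, and the `a2c_cartpole_` prefix is taken from the script above.

```python
import os
import re


def latest_checkpoint(models_dir, prefix="a2c_cartpole_"):
    """Return the path of the checkpoint with the highest timestep count.

    Hypothetical helper: expects files named "<prefix><steps>.zip" and
    returns None when no such file exists in models_dir.
    """
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)\.zip$")
    best_step, best_path = -1, None
    for name in os.listdir(models_dir):
        match = pattern.match(name)
        if match is None:
            continue
        # Compare numerically, so 100000 ranks above 20000.
        step = int(match.group(1))
        if step > best_step:
            best_step, best_path = step, os.path.join(models_dir, name)
    return best_path
```

The returned path could then be passed to `A2C.load(path, env=env)` to resume training or to evaluate the agent. Comparing the step counts as integers avoids the ordering problem a plain string sort would have.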
Evaluation results
- mean_reward on CartPole-v1 (self-reported): 500.00 +/- 0.00
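A figure of the form "mean +/- std" is typically the mean and standard deviation of per-episode returns over a set of evaluation episodes. The sketch below shows that arithmetic using only the standard library. It assumes the population standard deviation; the card does not state how many episodes were evaluated or which estimator was used, and `summarize_returns` is a hypothetical name.

```python
from statistics import fmean, pstdev


def summarize_returns(episode_returns):
    """Format per-episode returns as 'mean +/- std' (population std)."""
    mean = fmean(episode_returns)
    std = pstdev(episode_returns)
    return f"{mean:.2f} +/- {std:.2f}"


# Ten illustrative episodes that all reach the CartPole-v1 cap of 500 steps.
print(summarize_returns([500.0] * 10))  # prints "500.00 +/- 0.00"
```

A standard deviation of 0.00 means every evaluation episode reached the same return, which for CartPole-v1 is the 500-step cap.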