---
library_name: stable-baselines3
tags:
- AntBulletEnv-v0
- deep-reinforcement-learning
- reinforcement-learning
- stable-baselines3
model-index:
- name: A2C
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: AntBulletEnv-v0
      type: AntBulletEnv-v0
    metrics:
    - type: mean_reward
      value: 1834.41 +/- 107.15
      name: mean_reward
      verified: false
---
# **A2C** Agent playing **AntBulletEnv-v0**
This is a trained model of an **A2C** agent playing **AntBulletEnv-v0**
using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3).
## Usage (with Stable-baselines3)
The script below reproduces training: it creates a vectorized, normalized environment, trains an A2C agent for 1.5M timesteps, and saves both the model and the `VecNormalize` statistics.
```python
import pybullet_envs  # registers the PyBullet envs (including AntBulletEnv-v0) with gym

from stable_baselines3 import A2C
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.vec_env import VecNormalize

env_id = "AntBulletEnv-v0"

# Create 4 parallel environments
env = make_vec_env(env_id, n_envs=4)

# Add this wrapper to normalize the observations and the rewards
env = VecNormalize(env, norm_obs=True, norm_reward=True, clip_obs=10)

# Create the A2C model
model = A2C(
    policy="MlpPolicy",
    env=env,
    gae_lambda=0.9,
    gamma=0.99,
    learning_rate=0.00096,
    max_grad_norm=0.5,
    n_steps=8,
    vf_coef=0.4,
    ent_coef=0.0,
    seed=11,
    policy_kwargs=dict(log_std_init=-2, ortho_init=False),
    normalize_advantage=False,
    use_rms_prop=True,
    use_sde=True,
    verbose=1,
)

# Train the agent
model.learn(total_timesteps=1_500_000)

# Save the model and the VecNormalize statistics
model.save("a2c-AntBulletEnv-v0")
env.save("vec_normalize.pkl")
```
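To evaluate the trained agent, reload both the policy weights and the `VecNormalize` statistics, and freeze the statistics at test time. A minimal sketch, assuming the two files written by the script above:

```python
import gym
import pybullet_envs  # registers AntBulletEnv-v0 with gym

from stable_baselines3 import A2C
from stable_baselines3.common.evaluation import evaluate_policy
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

# Recreate the environment and load the saved normalization statistics
eval_env = DummyVecEnv([lambda: gym.make("AntBulletEnv-v0")])
eval_env = VecNormalize.load("vec_normalize.pkl", eval_env)

# Do not update the statistics during evaluation, and report the raw reward
eval_env.training = False
eval_env.norm_reward = False

model = A2C.load("a2c-AntBulletEnv-v0", env=eval_env)

mean_reward, std_reward = evaluate_policy(model, eval_env, n_eval_episodes=10)
print(f"mean_reward = {mean_reward:.2f} +/- {std_reward:.2f}")
```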
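Alternatively, the checkpoint can be downloaded from the Hub with `huggingface_sb3.load_from_hub`. The `repo_id` and `filename` below are assumptions based on this card's name; adjust them to the actual files listed in the repository:

```python
from huggingface_sb3 import load_from_hub
from stable_baselines3 import A2C

# repo_id and filename are assumed from this card's name, not verified
checkpoint = load_from_hub(
    repo_id="asuzuki/a2c-AntBulletEnv-v0",
    filename="a2c-AntBulletEnv-v0.zip",
)
model = A2C.load(checkpoint)
```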