zjowowen committed on
Commit
e8134c5
1 Parent(s): 5fe63a3

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +17 -68
README.md CHANGED
@@ -6,9 +6,9 @@ tags:
6
  - deep-reinforcement-learning
7
  - reinforcement-learning
8
  - DI-engine
9
- - QbertNoFrameskip-v4-DQN
10
  benchmark_name: OpenAI/Gym/Atari
11
- task_name: QbertNoFrameskip-v4-DQN
12
  pipeline_tag: reinforcement-learning
13
  model-index:
14
  - name: DQN
@@ -17,19 +17,19 @@ model-index:
17
  type: reinforcement-learning
18
  name: reinforcement-learning
19
  dataset:
20
- name: OpenAI/Gym/Atari-QbertNoFrameskip-v4-DQN
21
- type: OpenAI/Gym/Atari-QbertNoFrameskip-v4-DQN
22
  metrics:
23
  - type: mean_reward
24
- value: 6087.5 +/- 1837.5
25
  name: mean_reward
26
  ---
27
 
28
- # Play **QbertNoFrameskip-v4-DQN** with **DQN** Policy
29
 
30
  ## Model Description
31
  <!-- Provide a longer summary of what this model is. -->
32
- This is a simple **DQN** implementation to OpenAI/Gym/Atari **QbertNoFrameskip-v4-DQN** using the [DI-engine library](https://github.com/opendilab/di-engine) and the [DI-zoo](https://github.com/opendilab/DI-engine/tree/main/dizoo).
33
 
34
  **DI-engine** is a python library for solving general decision intelligence problems, which is based on implementations of reinforcement learning framework using PyTorch or JAX. This library aims to standardize the reinforcement learning framework across different algorithms, benchmarks, environments, and to support both academic researches and prototype applications. Besides, self-customized training pipelines and applications are supported by reusing different abstraction levels of DI-engine reinforcement learning framework.
35
 
@@ -60,23 +60,7 @@ python3 -u run.py
60
  ```
61
  **run.py**
62
  ```python
63
- from ding.bonus import DQNAgent
64
- from ding.config import Config
65
- from easydict import EasyDict
66
- import torch
67
-
68
- # Pull model from files which are git cloned from huggingface
69
- policy_state_dict = torch.load("pytorch_model.bin", map_location=torch.device("cpu"))
70
- cfg = EasyDict(Config.file_to_dict("policy_config.py"))
71
- # Instantiate the agent
72
- agent = DQNAgent(
73
- env="QbertNoFrameskip", exp_name="QbertNoFrameskip-v4-DQN", cfg=cfg.exp_config, policy_state_dict=policy_state_dict
74
- )
75
- # Continue training
76
- agent.train(step=5000)
77
- # Render the new agent performance
78
- agent.deploy(enable_save_replay=True)
79
-
80
  ```
81
  </details>
82
 
@@ -91,20 +75,7 @@ python3 -u run.py
91
  ```
92
  **run.py**
93
  ```python
94
- from ding.bonus import DQNAgent
95
- from huggingface_ding import pull_model_from_hub
96
-
97
- # Pull model from Hugggingface hub
98
- policy_state_dict, cfg = pull_model_from_hub(repo_id="OpenDILabCommunity/QbertNoFrameskip")
99
- # Instantiate the agent
100
- agent = DQNAgent(
101
- env="QbertNoFrameskip", exp_name="QbertNoFrameskip-v4-DQN", cfg=cfg.exp_config, policy_state_dict=policy_state_dict
102
- )
103
- # Continue training
104
- agent.train(step=5000)
105
- # Render the new agent performance
106
- agent.deploy(enable_save_replay=True)
107
-
108
  ```
109
  </details>
110
 
@@ -121,31 +92,7 @@ python3 -u train.py
121
  ```
122
  **train.py**
123
  ```python
124
- from ding.bonus import DQNAgent
125
- from huggingface_ding import push_model_to_hub
126
-
127
- # Instantiate the agent
128
- agent = DQNAgent(env="QbertNoFrameskip", exp_name="QbertNoFrameskip-v4-DQN")
129
- # Train the agent
130
- return_ = agent.train(step=int(10000000), collector_env_num=8, evaluator_env_num=8, debug=False)
131
- print("-----wandb url is----:", return_.wandb_url)
132
- # Push model to huggingface hub
133
- push_model_to_hub(
134
- agent=agent.best,
135
- env_name="OpenAI/Gym/Atari",
136
- task_name="QbertNoFrameskip-v4-DQN",
137
- algo_name="DQN",
138
- wandb_url=return_.wandb_url,
139
- github_repo_url="https://github.com/opendilab/DI-engine",
140
- github_doc_model_url="https://di-engine-docs.readthedocs.io/en/latest/12_policies/dqn.html",
141
- github_doc_env_url="https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html",
142
- installation_guide="pip3 install DI-engine[common_env]",
143
- usage_file_by_git_clone="./dqn/qbert_dqn_deploy.py",
144
- usage_file_by_huggingface_ding="./dqn/qbert_dqn_download.py",
145
- train_file="./dqn/qbert_dqn.py",
146
- repo_id="OpenDILabCommunity/QbertNoFrameskip-v4-DQN"
147
- )
148
-
149
  ```
150
  </details>
151
 
@@ -172,7 +119,8 @@ exp_config = {
172
  'env_id': 'QbertNoFrameskip-v4',
173
  'collector_env_num': 8,
174
  'evaluator_env_num': 8,
175
- 'fram_stack': 4
176
  },
177
  'policy': {
178
  'model': {
@@ -214,6 +162,7 @@ exp_config = {
214
  'render_freq': -1,
215
  'mode': 'train_iter'
216
  },
217
  'cfg_type': 'InteractionSerialEvaluatorDict',
218
  'stop_value': 30000,
219
  'n_episode': 8
@@ -258,7 +207,7 @@ exp_config = {
258
 
259
  **Training Procedure**
260
  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
261
- - **Weights & Biases (wandb):** [monitor link](https://wandb.ai/ruoyugao/QbertNoFrameskip-v4-DQN)
262
 
263
  ## Model Information
264
  <!-- Provide the basic links for the model. -->
@@ -268,13 +217,13 @@ exp_config = {
268
  - **Demo:** [video](https://huggingface.co/OpenDILabCommunity/QbertNoFrameskip-v4-DQN/blob/main/replay.mp4)
269
  <!-- Provide the size information for the model. -->
270
  - **Parameters total size:** 55703.03 KB
271
- - **Last Update Date:** 2023-06-14
272
 
273
  ## Environments
274
  <!-- Address questions around what environment the model is intended to be trained and deployed at, including the necessary information needed to be provided for future users. -->
275
  - **Benchmark:** OpenAI/Gym/Atari
276
- - **Task:** QbertNoFrameskip-v4-DQN
277
  - **Gym version:** 0.25.1
278
  - **DI-engine version:** v0.4.8
279
- - **PyTorch version:** 1.7.1
280
  - **Doc**: [DI-engine-docs Environments link](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html)
 
6
  - deep-reinforcement-learning
7
  - reinforcement-learning
8
  - DI-engine
9
+ - QbertNoFrameskip-v4
10
  benchmark_name: OpenAI/Gym/Atari
11
+ task_name: QbertNoFrameskip-v4
12
  pipeline_tag: reinforcement-learning
13
  model-index:
14
  - name: DQN
 
17
  type: reinforcement-learning
18
  name: reinforcement-learning
19
  dataset:
20
+ name: OpenAI/Gym/Atari-QbertNoFrameskip-v4
21
+ type: OpenAI/Gym/Atari-QbertNoFrameskip-v4
22
  metrics:
23
  - type: mean_reward
24
+ value: 16375.0 +/- 0.0
25
  name: mean_reward
26
  ---
27
 
28
+ # Play **QbertNoFrameskip-v4** with **DQN** Policy
29
 
30
  ## Model Description
31
  <!-- Provide a longer summary of what this model is. -->
32
+ This is a simple **DQN** implementation to OpenAI/Gym/Atari **QbertNoFrameskip-v4** using the [DI-engine library](https://github.com/opendilab/di-engine) and the [DI-zoo](https://github.com/opendilab/DI-engine/tree/main/dizoo).
33
 
34
  **DI-engine** is a python library for solving general decision intelligence problems, which is based on implementations of reinforcement learning framework using PyTorch or JAX. This library aims to standardize the reinforcement learning framework across different algorithms, benchmarks, environments, and to support both academic researches and prototype applications. Besides, self-customized training pipelines and applications are supported by reusing different abstraction levels of DI-engine reinforcement learning framework.
35
 
 
60
  ```
61
  **run.py**
62
  ```python
63
+ # [More Information Needed]
64
  ```
65
  </details>
66
 
 
75
  ```
76
  **run.py**
77
  ```python
78
+ # [More Information Needed]
79
  ```
80
  </details>
81
 
 
92
  ```
93
  **train.py**
94
  ```python
95
+ # [More Information Needed]
96
  ```
97
  </details>
98
 
 
119
  'env_id': 'QbertNoFrameskip-v4',
120
  'collector_env_num': 8,
121
  'evaluator_env_num': 8,
122
+ 'fram_stack': 4,
123
+ 'env_wrapper': 'atari_default'
124
  },
125
  'policy': {
126
  'model': {
 
162
  'render_freq': -1,
163
  'mode': 'train_iter'
164
  },
165
+ 'figure_path': None,
166
  'cfg_type': 'InteractionSerialEvaluatorDict',
167
  'stop_value': 30000,
168
  'n_episode': 8
 
207
 
208
  **Training Procedure**
209
  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
210
+ - **Weights & Biases (wandb):** [monitor link](https://wandb.ai/zjowowen/QbertNoFrameskip-v4-DQN)
211
 
212
  ## Model Information
213
  <!-- Provide the basic links for the model. -->
 
217
  - **Demo:** [video](https://huggingface.co/OpenDILabCommunity/QbertNoFrameskip-v4-DQN/blob/main/replay.mp4)
218
  <!-- Provide the size information for the model. -->
219
  - **Parameters total size:** 55703.03 KB
220
+ - **Last Update Date:** 2023-07-23
221
 
222
  ## Environments
223
  <!-- Address questions around what environment the model is intended to be trained and deployed at, including the necessary information needed to be provided for future users. -->
224
  - **Benchmark:** OpenAI/Gym/Atari
225
+ - **Task:** QbertNoFrameskip-v4
226
  - **Gym version:** 0.25.1
227
  - **DI-engine version:** v0.4.8
228
+ - **PyTorch version:** 2.0.1+cu117
229
  - **Doc**: [DI-engine-docs Environments link](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html)
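
The `mean_reward` metric in the model-index metadata changes from `6087.5 +/- 1837.5` to `16375.0 +/- 0.0` in this commit. As a minimal sketch for comparing the two values programmatically, the snippet below parses that `mean +/- std` string. The `RewardStat` type and the `parse_mean_reward` helper are hypothetical names introduced here for illustration, and the string format is assumed only from the two values visible in the diff; neither is part of DI-engine or `huggingface_hub`.

```python
import re
from typing import NamedTuple


class RewardStat(NamedTuple):
    """Hypothetical container for a 'mean +/- std' metric value."""
    mean: float
    std: float


def parse_mean_reward(value: str) -> RewardStat:
    """Parse a string such as '6087.5 +/- 1837.5' into (mean, std).

    The format is an assumption based on the values shown in the diff.
    """
    pattern = r"\s*([-+]?\d+(?:\.\d+)?)\s*\+/-\s*(\d+(?:\.\d+)?)\s*"
    match = re.fullmatch(pattern, value)
    if match is None:
        raise ValueError(f"unrecognized mean_reward format: {value!r}")
    return RewardStat(float(match.group(1)), float(match.group(2)))


# Values copied from the removed and added metadata lines.
old = parse_mean_reward("6087.5 +/- 1837.5")
new = parse_mean_reward("16375.0 +/- 0.0")

print(f"mean_reward changed by {new.mean - old.mean:+.1f}")
```

A reported standard deviation of `0.0` only says that the evaluated returns did not vary; the diff does not state how many episodes were used for the new figure.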