JiajingChen committed on
Commit b35bb01
1 Parent(s): 22b1fc0
Upload README.md with huggingface_hub
README.md CHANGED
@@ -1,37 +1,56 @@
 ---
-library_name:
+library_name: sample-factory
 tags:
-- PandaReachDense-v3
 - deep-reinforcement-learning
 - reinforcement-learning
--
+- sample-factory
 model-index:
-- name:
+- name: APPO
   results:
   - task:
       type: reinforcement-learning
       name: reinforcement-learning
     dataset:
-      name:
-      type:
+      name: doom_health_gathering_supreme
+      type: doom_health_gathering_supreme
     metrics:
     - type: mean_reward
-      value:
+      value: 9.92 +/- 2.85
       name: mean_reward
       verified: false
 ---
 
-
-This is a trained model of a **A2C** agent playing **PandaReachDense-v3**
-using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3).
+A(n) **APPO** model trained on the **doom_health_gathering_supreme** environment.
 
-
-
+This model was trained using Sample-Factory 2.0: https://github.com/alex-petrenko/sample-factory.
+Documentation for how to use Sample-Factory can be found at https://www.samplefactory.dev/
 
 
-
-from stable_baselines3 import ...
-from huggingface_sb3 import load_from_hub
+## Downloading the model
 
-
+After installing Sample-Factory, download the model with:
 ```
+python -m sample_factory.huggingface.load_from_hub -r JiajingChen/1
+```
+
+
+## Using the model
+
+To run the model after download, use the `enjoy` script corresponding to this environment:
+```
+python -m .usr.local.lib.python3.10.dist-packages.colab_kernel_launcher --algo=APPO --env=doom_health_gathering_supreme --train_dir=./train_dir --experiment=1
+```
+
+
+You can also upload models to the Hugging Face Hub using the same script with the `--push_to_hub` flag.
+See https://www.samplefactory.dev/10-huggingface/huggingface/ for more details
+
+## Training with this model
+
+To continue training with this model, use the `train` script corresponding to this environment:
+```
+python -m .usr.local.lib.python3.10.dist-packages.colab_kernel_launcher --algo=APPO --env=doom_health_gathering_supreme --train_dir=./train_dir --experiment=1 --restart_behavior=resume --train_for_env_steps=10000000000
+```
+
+
+Note, you may have to adjust `--train_for_env_steps` to a suitably high number as the experiment will resume at the number of steps it concluded at.
+
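
Review note on the `enjoy` and `train` commands recorded in this diff: both pass `.usr.local.lib.python3.10.dist-packages.colab_kernel_launcher` to `python -m`. That string looks like the file path of the Colab kernel launcher with slashes turned into dots, and `python -m` does not accept module names that start with a dot, so the commands are unlikely to run as written. The sketch below rebuilds the three README commands as argument lists. It assumes the intended entry points are the VizDoom example modules from the Sample-Factory repository, `sf_examples.vizdoom.enjoy_vizdoom` and `sf_examples.vizdoom.train_vizdoom`. Those two names do not appear in this commit and should be checked against the installed Sample-Factory version. All flags and values are copied from the README, and the helper function names are hypothetical.

```python
import shlex
import sys

# Values copied from the README added in this commit.
REPO_ID = "JiajingChen/1"
COMMON_FLAGS = [
    "--algo=APPO",
    "--env=doom_health_gathering_supreme",
    "--train_dir=./train_dir",
    "--experiment=1",
]

# ASSUMPTION: these module names are not in the README. They are the VizDoom
# example entry points from the Sample-Factory repository, substituted here
# for the Colab launcher path recorded in the diff.
ENJOY_MODULE = "sf_examples.vizdoom.enjoy_vizdoom"
TRAIN_MODULE = "sf_examples.vizdoom.train_vizdoom"


def module_command(module, flags):
    """Return an argv list that runs `module` with the current interpreter."""
    if module.startswith("."):
        # `python -m` expects an absolute dotted module name, not a path.
        raise ValueError(f"not an importable module name: {module!r}")
    return [sys.executable, "-m", module, *flags]


def download_command():
    """Command from the 'Downloading the model' section."""
    return module_command(
        "sample_factory.huggingface.load_from_hub", ["-r", REPO_ID]
    )


def enjoy_command():
    """Command from 'Using the model', with the assumed enjoy module."""
    return module_command(ENJOY_MODULE, COMMON_FLAGS)


def resume_command(total_env_steps):
    """Command from 'Training with this model', with the assumed train module."""
    extra = [
        "--restart_behavior=resume",
        f"--train_for_env_steps={total_env_steps}",
    ]
    return module_command(TRAIN_MODULE, COMMON_FLAGS + extra)


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is downloaded
    # or trained by accident.
    for argv in (
        download_command(),
        enjoy_command(),
        resume_command(10_000_000_000),
    ):
        print(shlex.join(argv))
```

The commands are printed rather than executed so the sketch runs without Sample-Factory or VizDoom installed. To actually run one, pass the returned list to `subprocess.run(argv, check=True)`.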