willco-afk committed
Commit 63dcee5 • 1 Parent(s): f5e1757
Upload folder using huggingface_hub

Browse files
- README.md +8 -100
- q-learning.pkl +2 -2
- replay.mp4 +0 -0
- results.json +1 -0
README.md
CHANGED
@@ -1,102 +1,10 @@
- ---
- tags:
- - reinforcement-learning
- - q-learning
- - frozenlake
- license: mit
- library: gym
- ---

- # Q-Learning
-
- - **Environment**: FrozenLake-v1 (4x4 grid, no slippery surface)
- - **Algorithm**: Q-learning
- - **Action space**: 4 discrete actions (left, down, right, up)
- - **State space**: 16 discrete states (grid cells)
- - **Training duration**: Approximately [X hours] of training time.
-
- ## Usage
-
- To use this model, you can load the trained Q-learning model from Hugging Face and run it in your environment.
-
- ```python
- import gym
- from huggingface_hub import hf_hub_download
- import pickle
-
- # Load the model
- model_path = hf_hub_download(repo_id="willco-afk/q-FrozenLake-v1-4x4-noSlippery", filename="q-learning.pkl")
-
- with open(model_path, 'rb') as f:
-     model = pickle.load(f)
-
- # Setup the environment
- env = gym.make("FrozenLake-v1", is_slippery=False)
-
- # Run your agent
- state = env.reset()
- done = False
-
- while not done:
-     action = model["qtable"].argmax(axis=1)[state]  # Choose the action with the highest Q-value
-     state, reward, done, info = env.step(action)
-
-     if done:
-         print(f"Episode finished with reward: {reward}")
- ```
-
- # Q-Learning Model for FrozenLake
-
- This model is a **Q-learning** agent trained to solve the **FrozenLake-v1** environment from OpenAI Gym.
-
- ## Model Description
-
- The model uses Q-learning, a reinforcement learning algorithm, to navigate the FrozenLake environment. The agent learns by interacting with the environment, receiving rewards or penalties, and updating its Q-table accordingly.
-
- - **Environment**: FrozenLake-v1 (4x4 grid, no slippery surface)
- - **Algorithm**: Q-learning
- - **Action space**: 4 discrete actions (left, down, right, up)
- - **State space**: 16 discrete states (grid cells)
- - **Training duration**: Approximately [X hours] of training time.
-
- ## Usage
-
- To use this model, you can load the trained Q-learning model from Hugging Face and run it in your environment.
-
- ```python
- import gym
- from huggingface_hub import hf_hub_download
- import pickle
-
- # Load the model
- model_path = hf_hub_download(repo_id="willco-afk/q-FrozenLake-v1-4x4-noSlippery", filename="q-learning.pkl")
-
- with open(model_path, 'rb') as f:
-     model = pickle.load(f)
-
- # Setup the environment
- env = gym.make("FrozenLake-v1", is_slippery=False)
-
- # Run your agent
- state = env.reset()
- done = False
-
- while not done:
-     action = model["qtable"].argmax(axis=1)[state]  # Choose the action with the highest Q-value
-     state, reward, done, info = env.step(action)
-
-     if done:
-         print(f"Episode finished with reward: {reward}")
- ```
-
- @misc{q-learning-frozenlake,
-   author = {William Copper},
-   title = {Q-Learning for FrozenLake-v1},
-   year = {2024},
-   howpublished = {\url{https://huggingface.co/willco-afk/q-FrozenLake-v1-4x4-noSlippery}},
- }
+ # **Q-Learning** Agent playing FrozenLake-v1-4x4-no_slippery
+ This is a trained model of a **Q-Learning** agent playing **FrozenLake-v1-4x4-no_slippery**.

+ ## Usage
+ ```python
+ model = load_from_hub(repo_id="willco-afk/q-FrozenLake-v1-4x4-noSlippery", filename="q-learning.pkl")
+ env = gym.make(model["env_id"])
+ ```
+
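Note: `load_from_hub` is not defined in the new README snippet, and the diff does not show where it comes from. Below is a minimal sketch of such a helper, built on `huggingface_hub.hf_hub_download` and `pickle` the same way the old README loaded the model; the `qtable` key, the dict layout of the pickle, and the greedy rollout are assumptions, not guaranteed by this commit.

```python
import pickle

import gym  # the snippets in this repo use the classic Gym reset/step API
from huggingface_hub import hf_hub_download


def load_from_hub(repo_id: str, filename: str) -> dict:
    """Download the pickled Q-learning model from the Hub and unpickle it."""
    model_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(model_path, "rb") as f:
        return pickle.load(f)


model = load_from_hub(repo_id="willco-afk/q-FrozenLake-v1-4x4-noSlippery", filename="q-learning.pkl")
env = gym.make(model["env_id"], is_slippery=False)  # non-slippery, as in the old README

# Greedy rollout with the learned Q-table (assumed shape: n_states x n_actions).
state = env.reset()
done = False
while not done:
    action = model["qtable"][state].argmax()  # pick the action with the highest Q-value
    state, reward, done, info = env.step(action)
print(f"Episode finished with reward: {reward}")
```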
q-learning.pkl
CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:
- size
+ oid sha256:9e98a2445193951bbe55c4d271cd3af3deb7f204d9f8899f7872794548ec2640
+ size 914
replay.mp4
ADDED
Binary file (31.1 kB)
results.json
ADDED
@@ -0,0 +1 @@
+ {"env_id": "FrozenLake-v1", "mean_reward": 1.0, "n_eval_episodes": 100, "eval_datetime": "2024-12-22T15:27:53.029538"}
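The added results.json records a greedy-policy evaluation (mean reward 1.0 over 100 episodes). For context, here is a rough sketch of a script that could produce a file with these keys; only the JSON field names come from the file added in this commit, while the loading and evaluation code is an assumption in the style of the README snippets, not the script that was actually used.

```python
import json
import pickle
from datetime import datetime

import gym  # classic Gym API (reset -> state, step -> 4-tuple), as in the README snippets
import numpy as np
from huggingface_hub import hf_hub_download


def evaluate_greedy(qtable: np.ndarray, env_id: str, n_eval_episodes: int = 100) -> dict:
    """Roll out the greedy policy for n_eval_episodes and summarize the results."""
    env = gym.make(env_id, is_slippery=False)  # non-slippery, matching this repo
    rewards = []
    for _ in range(n_eval_episodes):
        state = env.reset()
        done, episode_reward = False, 0.0
        while not done:
            state, reward, done, info = env.step(int(np.argmax(qtable[state])))
            episode_reward += reward
        rewards.append(episode_reward)
    return {
        "env_id": env_id,
        "mean_reward": float(np.mean(rewards)),
        "n_eval_episodes": n_eval_episodes,
        "eval_datetime": datetime.now().isoformat(),
    }


# Assumes the pickle holds a dict with a "qtable" array, as sketched above.
model_path = hf_hub_download(repo_id="willco-afk/q-FrozenLake-v1-4x4-noSlippery", filename="q-learning.pkl")
with open(model_path, "rb") as f:
    model = pickle.load(f)

with open("results.json", "w") as f:
    json.dump(evaluate_greedy(model["qtable"], "FrozenLake-v1"), f)
```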