Commit f5e1757 by willco-afk
1 Parent(s): ef3ddcc
Update README.md
README.md CHANGED
@@ -1,3 +1,55 @@
+---
+tags:
+- reinforcement-learning
+- q-learning
+- frozenlake
+license: mit
+library: gym
+---
+
+# Q-Learning Model for FrozenLake
+
+This model is a **Q-learning** agent trained to solve the **FrozenLake-v1** environment from OpenAI Gym.
+
+## Model Description
+
+The model uses tabular Q-learning, a reinforcement learning algorithm, to navigate the FrozenLake environment. The agent learns by interacting with the environment and updating its Q-table from the rewards it receives (1 for reaching the goal, 0 otherwise).
+
+- **Environment**: FrozenLake-v1 (4x4 grid, non-slippery)
+- **Algorithm**: Q-learning
+- **Action space**: 4 discrete actions (left, down, right, up)
+- **State space**: 16 discrete states (grid cells)
+- **Training duration**: Approximately [X hours] of training time.
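
The Q-table update described above can be illustrated with a minimal sketch. This is not the training script behind this model: the hand-written `step` function stands in for the Gym environment and assumes the default 4x4 map, and the hyperparameters (`alpha`, `gamma`, `epsilon`, `episodes`) and the helper names `train` and `greedy_rollout` are illustrative assumptions.

```python
import random

# Assumed default 4x4 FrozenLake layout: S start, F frozen, H hole, G goal.
GRID = ["SFFF", "FHFH", "FFFH", "HFFG"]
N_ROWS, N_COLS = 4, 4
N_STATES, N_ACTIONS = 16, 4
# Action order as listed above: 0 left, 1 down, 2 right, 3 up.
MOVES = [(0, -1), (1, 0), (0, 1), (-1, 0)]


def step(state, action):
    """Deterministic (non-slippery) move; returns (next_state, reward, done)."""
    row, col = divmod(state, N_COLS)
    d_row, d_col = MOVES[action]
    row = min(max(row + d_row, 0), N_ROWS - 1)  # moving into a wall keeps the agent in place
    col = min(max(col + d_col, 0), N_COLS - 1)
    tile = GRID[row][col]
    return row * N_COLS + col, float(tile == "G"), tile in "HG"


def train(episodes=5000, alpha=0.5, gamma=0.95, epsilon=0.3, max_steps=100, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative settings)."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(N_ACTIONS)
            else:
                # Greedy action, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a in range(N_ACTIONS) if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_rollout(q, max_steps=100):
    """Follow the highest-valued action from the start; returns the final reward."""
    state, reward = 0, 0.0
    for _ in range(max_steps):
        action = max(range(N_ACTIONS), key=lambda a: q[state][a])
        state, reward, done = step(state, action)
        if done:
            break
    return reward
```

Calling `greedy_rollout(train())` runs one greedy episode with the learned table. The pure-Python grid keeps the sketch free of a Gym dependency; swapping `step` for calls to a real environment leaves the update rule unchanged.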
+
+## Usage
+
+To use this model, download the pickled Q-table from the Hugging Face Hub and run it in the FrozenLake environment.
+
+```python
+import gym
+from huggingface_hub import hf_hub_download
+import pickle
+
+# Load the model
+model_path = hf_hub_download(repo_id="willco-afk/q-FrozenLake-v1-4x4-noSlippery", filename="q-learning.pkl")
+
+with open(model_path, 'rb') as f:
+    model = pickle.load(f)
+
+# Setup the environment
+env = gym.make("FrozenLake-v1", is_slippery=False)
+
+# Run your agent (classic Gym API, gym < 0.26: reset() returns the state, step() returns 4 values)
+state = env.reset()
+done = False
+
+while not done:
+    action = int(model["qtable"][state].argmax())  # Choose the action with the highest Q-value
+    state, reward, done, info = env.step(action)
+
+    if done:
+        print(f"Episode finished with reward: {reward}")
+```
 # Q-Learning Model for FrozenLake
 
 This model is a **Q-learning** agent trained to solve the **FrozenLake-v1** environment from OpenAI Gym.