Add model card for ablation_baseline_MiniHack_Room_5x5_v0_20250919-143836
README.md
CHANGED
@@ -19,7 +19,7 @@ This repository contains a complete Sequential Skill RL model trained on NetHack

 ### 1. PPO Policy (`ppo_policy.pth`)
 - **Type**: Proximal Policy Optimization agent
-- **Environment**: MiniHack-Room-
+- **Environment**: MiniHack-Room-5x5-v0
 - **Training Steps**: 50,000
 - **Features**:
 - Curiosity-driven exploration: True
@@ -52,7 +52,7 @@ hmm_data = torch.load('hmm_model.pth', map_location=device)

 # Use for inference or continued training
 results = train_online_ppo_with_pretrained_models(
-    env_name="MiniHack-Room-
+    env_name="MiniHack-Room-5x5-v0",
     vae_repo_id="CatkinChen/nethack-vae-hmm",
     hmm_repo_id="CatkinChen/nethack-hmm",
     test_mode=True
@@ -61,10 +61,10 @@ results = train_online_ppo_with_pretrained_models(

 ## Training Configuration

-- **Environment**: MiniHack-Room-
+- **Environment**: MiniHack-Room-5x5-v0
 - **Learning Rate**: 0.0005
 - **Batch Size**: 32
-- **Training Time**:
+- **Training Time**: 0.02 seconds
 - **Device**: cuda
 - **Seed**: None

@@ -74,4 +74,4 @@ Training completed successfully with the following configuration:
 - Curiosity-driven exploration: True
 - Random Network Distillation: False

-Generated on: 2025-09-19 14:
+Generated on: 2025-09-19 14:38:50
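
The changed lines are all values filled into the card's fields (environment ID, training time, generation timestamp), which suggests the README is rendered from the run's configuration. The following is a minimal sketch of how such a section could be rendered. It is not the repository's actual generator: `TrainingConfig` and `render_training_config` are hypothetical names introduced here for illustration, and only the field labels and values come from the diff.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TrainingConfig:
    """Hypothetical container for the fields shown in the model card."""
    env_name: str
    learning_rate: float
    batch_size: int
    training_time_s: float
    device: str
    seed: Optional[int] = None


def render_training_config(cfg: TrainingConfig, generated_on: datetime) -> str:
    """Render the 'Training Configuration' section and footer as Markdown."""
    lines = [
        "## Training Configuration",
        "",
        f"- **Environment**: {cfg.env_name}",
        f"- **Learning Rate**: {cfg.learning_rate}",
        f"- **Batch Size**: {cfg.batch_size}",
        f"- **Training Time**: {cfg.training_time_s:.2f} seconds",
        f"- **Device**: {cfg.device}",
        f"- **Seed**: {cfg.seed}",
        "",
        f"Generated on: {generated_on:%Y-%m-%d %H:%M:%S}",
    ]
    return "\n".join(lines)


# Values taken from the "+" lines of the diff.
card = render_training_config(
    TrainingConfig(
        env_name="MiniHack-Room-5x5-v0",
        learning_rate=0.0005,
        batch_size=32,
        training_time_s=0.02,
        device="cuda",
    ),
    generated_on=datetime(2025, 9, 19, 14, 38, 50),
)
print(card)
```

Keeping the values in one typed object and formatting them in a single place would make it harder for the card to end up with an environment name in one section and a different one in another.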