ppo-Pyramids / README.md

sam522

metadata

tags:
  - ML-Agents-Pyramids
  - ppo
  - deep-reinforcement-learning
  - reinforcement-learning
  - ml-agents
model-index:
  - name: PPO
    results:
      - task:
          type: reinforcement-learning
          name: reinforcement-learning
        dataset:
          name: ML-Agents-Pyramids
          type: ML-Agents-Pyramids
        metrics:
          - type: mean_reward
            value: 5.10 +/- 0.85
            name: mean_reward
            verified: false

PPO Agent playing ML-Agents-Pyramids

This is a trained model of a PPO agent playing ML-Agents-Pyramids using Unity ML-Agents.

Usage

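import torch

# Load the checkpoint on CPU. Rebuilding the policy also requires the
# network architecture, which this snippet does not define.
checkpoint = torch.load("model.pt", map_location="cpu")

# If the checkpoint is a dict (for example a state_dict), list its keys
# to see which tensors it contains.
if isinstance(checkpoint, dict):
    print(list(checkpoint.keys()))

# The model can be used with the Pyramids environment.
# See the repository for complete usage instructions.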

Training Results

Mean reward: 5.10 ± 0.85
Average pyramids completed: 5.0 per episode
Training episodes: 3,000
Target achievement: ✅ SUCCESS (mean reward 5.10 vs. target 1.75)
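
The figures above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the ± value is a standard deviation over evaluation episodes, and both the helper name `parse_mean_std` and the conservative "mean minus one standard deviation" comparison are assumptions rather than part of the original training code.

```python
import re


def parse_mean_std(text):
    """Parse a string such as '5.10 +/- 0.85' or '5.10 ± 0.85' into (mean, std)."""
    pattern = r"\s*(-?\d+(?:\.\d+)?)\s*(?:\+/-|±)\s*(\d+(?:\.\d+)?)\s*"
    match = re.fullmatch(pattern, text)
    if match is None:
        raise ValueError(f"unrecognized reward format: {text!r}")
    return float(match.group(1)), float(match.group(2))


TARGET = 1.75  # target stated in this model card

mean, std = parse_mean_std("5.10 +/- 0.85")
lower_bound = mean - std  # conservative estimate: one std below the mean (assumption)

print(f"mean={mean:.2f}, std={std:.2f}, lower bound={lower_bound:.2f}")
print("target met:", lower_bound >= TARGET)
```

Even under this conservative reading, the reported reward stays above the 1.75 target.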

Algorithm Details

Algorithm: Proximal Policy Optimization (PPO)
Environment: ML-Agents-Pyramids
Task: Multi-step pyramid completion with curiosity-driven exploration
Network: Deep neural network with curiosity mechanism
Training Framework: PyTorch
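
For readers unfamiliar with PPO, the core of the algorithm is its clipped surrogate objective. The following is a minimal pure-Python sketch of that objective, not the code used to train this model. The function name `ppo_clip_objective` is hypothetical, and `clip_eps=0.2` is a commonly used default; the value used for this model is not stated.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized)."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Unchanged policy: the ratio is 1, so the objective equals the advantage.
print(ppo_clip_objective([0.0], [0.0], [2.0]))

# Doubling an action's probability with a positive advantage: the gain is
# capped at (1 + clip_eps) * advantage.
print(ppo_clip_objective([math.log(2.0)], [0.0], [1.0]))
```

The clipping removes the incentive to move the new policy far from the old one within a single update, which is what keeps PPO updates stable.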

Task Description

The agent learns to:

Find and press buttons to spawn pyramids
Navigate to pyramids and knock them over
Collect gold bricks from fallen pyramids
Repeat efficiently to maximize score

This complex task requires:

Exploration in a sparse-reward environment
Multi-step planning and execution
Spatial navigation and object interaction
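
To illustrate how curiosity can help with sparse rewards, the sketch below shows one common form of intrinsic reward: the prediction error of a forward model over next-state features. This is an assumption about the general technique, not a description of this model's curiosity module, whose details are not given here. The function names, the `scale` factor, and the `strength` weight are all hypothetical.

```python
def intrinsic_reward(predicted_next, actual_next, scale=0.5):
    """Curiosity bonus: scaled squared prediction error of a forward model."""
    if len(predicted_next) != len(actual_next):
        raise ValueError("feature vectors must have the same length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))
    return scale * squared_error


def total_reward(extrinsic, predicted_next, actual_next, strength=0.02):
    """Add the weighted curiosity bonus to the (often zero) environment reward."""
    return extrinsic + strength * intrinsic_reward(predicted_next, actual_next)


# A poorly predicted transition earns a bonus even with no environment reward.
print(total_reward(0.0, [0.0, 0.0], [1.0, 1.0]))

# A perfectly predicted transition earns only the environment reward.
print(total_reward(1.0, [1.0, 2.0], [1.0, 2.0]))
```

Transitions the agent cannot yet predict yield a bonus, which nudges it toward unexplored states until the environment reward becomes reachable.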

Performance Milestones

Episodes 0-500: Learning basic movement and object interaction
Episodes 500-1500: Developing pyramid completion strategy
Episodes 1500-3000: Optimizing efficiency and consistency

Training Environment

Environment: ML-Agents-Pyramids
Framework: Custom PyTorch implementation with ML-Agents compatibility
Training date: 2025-09-05
Course: Hugging Face Deep RL Course Unit 5

This model was trained as part of the Hugging Face Deep RL Course.