PPO Agent playing SnowballTarget

This is a trained model of a PPO agent playing SnowballTarget using the Unity ML-Agents library.

Results

[INFO] SnowballTarget. Step: 400000. Time Elapsed: 903.639 s. Mean Reward: 25.591. Std of Reward: 1.992.

Hyperparameters

%%file /content/ml-agents/config/ppo/SnowballTarget.yaml

behaviors:
  SnowballTarget:
    trainer_type: ppo
    summary_freq: 10000
    keep_checkpoints: 10
    checkpoint_interval: 50000
    max_steps: 400000
    time_horizon: 32
    threaded: true
    hyperparameters:
      learning_rate: 0.0003
      learning_rate_schedule: linear
      batch_size: 128
      buffer_size: 2048
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
    network_settings:
      normalize: false
      hidden_units: 256
      num_layers: 3
      vis_encode_type: nature_cnn
    reward_signals:
      extrinsic:
        gamma: 0.9
        strength: 1.0
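
To make the interaction between a few of these settings easier to see, here is a small sketch that derives some quantities from the values above. It is not part of ML-Agents. The function names are hypothetical, the learning-rate formula assumes a plain straight-line decay to zero at max_steps (the library's exact schedule may differ, for example by using a small non-zero floor), and 1 / (1 - gamma) is only a common rule of thumb for how far ahead the discounted return effectively looks.

```python
# Values copied from the SnowballTarget configuration above.
BUFFER_SIZE = 2048
BATCH_SIZE = 128
NUM_EPOCH = 3
MAX_STEPS = 400_000
LEARNING_RATE = 0.0003
GAMMA = 0.9


def minibatches_per_pass(buffer_size: int, batch_size: int) -> int:
    """Number of minibatches in one pass over a full experience buffer."""
    return buffer_size // batch_size


def gradient_steps_per_update(buffer_size: int, batch_size: int, num_epoch: int) -> int:
    """Gradient steps per policy update, assuming num_epoch full passes over the buffer."""
    return minibatches_per_pass(buffer_size, batch_size) * num_epoch


def linear_learning_rate(step: int, initial_rate: float, max_steps: int) -> float:
    """Assumed schedule: straight-line decay from initial_rate to 0 at max_steps."""
    progress = min(max(step, 0), max_steps) / max_steps
    return initial_rate * (1.0 - progress)


def effective_horizon(gamma: float) -> float:
    """Rule-of-thumb horizon (in steps) implied by the discount factor."""
    return 1.0 / (1.0 - gamma)


if __name__ == "__main__":
    print("minibatches per pass:", minibatches_per_pass(BUFFER_SIZE, BATCH_SIZE))
    print("gradient steps per update:",
          gradient_steps_per_update(BUFFER_SIZE, BATCH_SIZE, NUM_EPOCH))
    print("learning rate halfway through training:",
          linear_learning_rate(MAX_STEPS // 2, LEARNING_RATE, MAX_STEPS))
    print("approximate reward horizon in steps:", effective_horizon(GAMMA))
```

Under these assumptions, each policy update splits the 2048-step buffer into 16 minibatches of 128 and makes 3 passes over them, and gamma: 0.9 weights rewards over roughly the next ten steps, which fits a task with short-term feedback such as hitting a target.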

Resume the training

mlagents-learn <your_configuration_file_path.yaml> --run-id=<run_id> --resume
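
If you launch training from a notebook or script rather than a terminal, the same command can be assembled in Python. This is only a sketch: build_resume_command is a hypothetical helper, and the config path and run id in the example are placeholders to replace with your own. Only the --run-id and --resume flags shown above are used.

```python
import shlex
from typing import List


def build_resume_command(config_path: str, run_id: str) -> List[str]:
    """Build the argument list that resumes an existing ML-Agents run."""
    if not config_path or not run_id:
        raise ValueError("config_path and run_id must be non-empty")
    return ["mlagents-learn", config_path, f"--run-id={run_id}", "--resume"]


if __name__ == "__main__":
    # Placeholder values: substitute your own configuration file and run id.
    cmd = build_resume_command("./config/ppo/SnowballTarget.yaml", "SnowballTarget1")
    # Print the command for inspection; to actually launch it you could call
    # subprocess.run(cmd, check=True) in an environment with ML-Agents installed.
    print(shlex.join(cmd))
```

The run id must match the one used for the original run so that the trainer can find the existing checkpoints.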

Watch your Agent play

You can watch your agent playing directly in your browser:

If the environment is one of the official ML-Agents environments, go to https://huggingface.co/unity and then:
Step 1: Find your model_id: enrique2701/ppo-SnowballTarget
Step 2: Select your .nn or .onnx file
Step 3: Click on Watch the agent play 👀