POCA SoccerTwos Self-Play Agent

This repository contains a trained POCA self-play agent for the SoccerTwos environment.

The model was trained using:

  • MA-POCA (Multi-Agent POsthumous Credit Assignment), a multi-agent trainer with a centralized critic
  • Self-play training
  • Unity ML-Agents
  • Multi-agent reinforcement learning

Environment

The agent was trained on the official Unity SoccerTwos environment.


Training Details

Trainer

  • POCA (multi-agent centralized critic)

Environment

  • SoccerTwos
  • 2v2 competitive soccer environment

Training Setup

  • Self-play enabled
  • Mixed cooperative and competitive multi-agent training (teammates cooperate, teams compete)
  • Long-horizon reinforcement learning (max_steps: 50000000, time_horizon: 1000)
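
As a rough illustration of what "long-horizon" means with the discount factor listed under Hyperparameters (gamma: 0.99), the sketch below computes how much a reward received some number of steps in the future still counts toward the current return. The function names are made up for this example and are not part of ML-Agents; only the gamma value comes from this README.

```python
import math

GAMMA = 0.99  # extrinsic reward discount, copied from the config in this README


def discount_weight(gamma: float, steps_ahead: int) -> float:
    """Weight applied to a reward received `steps_ahead` steps in the future."""
    return gamma ** steps_ahead


def steps_until_weight(gamma: float, weight: float) -> int:
    """Smallest number of steps after which the discount weight is <= `weight`."""
    return math.ceil(math.log(weight) / math.log(gamma))


if __name__ == "__main__":
    # Hypothetical probe points, chosen only for illustration.
    print(f"weight after 100 steps: {discount_weight(GAMMA, 100):.3f}")
    print(f"steps until weight <= 0.5: {steps_until_weight(GAMMA, 0.5)}")
```

Under this assumption, a goal scored well over a hundred steps after an action contributes only a small fraction of its reward to that action's return, which is one reason credit assignment in this environment is hard.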

Hyperparameters

behaviors:
  SoccerTwos:
    trainer_type: poca

    hyperparameters:
      batch_size: 2048
      buffer_size: 20480
      learning_rate: 0.0003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: constant

    network_settings:
      normalize: false
      hidden_units: 512
      num_layers: 2
      vis_encode_type: simple

    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0

    keep_checkpoints: 5
    max_steps: 50000000
    time_horizon: 1000
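
The relationship between batch_size, buffer_size, num_epoch and max_steps can be sketched with a little arithmetic. This is a minimal sketch, assuming the trainer updates once each time the buffer fills and then makes num_epoch passes over it in minibatches of batch_size, as the ML-Agents configuration docs describe; the helper names are hypothetical and the real counts in a training run may differ.

```python
# Values copied from the hyperparameters in this README.
BATCH_SIZE = 2048
BUFFER_SIZE = 20480
NUM_EPOCH = 3
MAX_STEPS = 50_000_000


def minibatches_per_epoch(buffer_size: int, batch_size: int) -> int:
    """Number of minibatches in one pass over a full buffer."""
    return buffer_size // batch_size


def gradient_steps_per_update(buffer_size: int, batch_size: int, num_epoch: int) -> int:
    """Gradient steps taken each time the buffer fills (assumed update rule)."""
    return minibatches_per_epoch(buffer_size, batch_size) * num_epoch


def approx_policy_updates(max_steps: int, buffer_size: int) -> int:
    """Approximate number of buffer-fill updates over the whole run."""
    return max_steps // buffer_size


if __name__ == "__main__":
    print("minibatches per epoch:", minibatches_per_epoch(BUFFER_SIZE, BATCH_SIZE))
    print("gradient steps per update:",
          gradient_steps_per_update(BUFFER_SIZE, BATCH_SIZE, NUM_EPOCH))
    print("approx. policy updates:", approx_policy_updates(MAX_STEPS, BUFFER_SIZE))
```

With the values above, the buffer holds ten minibatches, so each update would take roughly thirty gradient steps under these assumptions.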