An APPO model trained on the atari_atlantis environment.
This model was trained using Sample-Factory 2.0: https://github.com/alex-petrenko/sample-factory. Documentation for how to use Sample-Factory can be found at https://www.samplefactory.dev/
Downloading the model
After installing Sample-Factory, download the model with:
python -m sample_factory.huggingface.load_from_hub -r MattStammers/appo-atari-atlantis
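If you want the checkpoint placed in a specific experiments directory, the loader script also accepts a destination flag. A minimal sketch, assuming the -d/--train_dir option as described in the Sample-Factory Hugging Face documentation (adjust the path to your setup):
python -m sample_factory.huggingface.load_from_hub -r MattStammers/appo-atari-atlantis -d ./train_dir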
Using the model
To run the model after download, use the enjoy script corresponding to this environment:
python -m sf_examples.atari.enjoy_atari --algo=APPO --env=atari_atlantis --train_dir=./train_dir --experiment=appo-atari-atlantis
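By default the enjoy script renders episodes to the screen. For headless evaluation over a fixed number of episodes, something along these lines should work (--no_render and --max_num_episodes are standard Sample-Factory enjoy flags; treat this particular combination as a sketch rather than a verified invocation):
python -m sf_examples.atari.enjoy_atari --algo=APPO --env=atari_atlantis --train_dir=./train_dir --experiment=appo-atari-atlantis --no_render --max_num_episodes=10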
You can also upload models to the Hugging Face Hub using the same script with the --push_to_hub flag, as shown in the example below.
See https://www.samplefactory.dev/10-huggingface/huggingface/ for more details
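For example, pushing a locally trained checkpoint back to the Hub might look like this (the --hf_repository argument follows the Sample-Factory Hugging Face docs; <username>/<repo> is a placeholder you must replace with your own repository):
python -m sf_examples.atari.enjoy_atari --algo=APPO --env=atari_atlantis --train_dir=./train_dir --experiment=appo-atari-atlantis --push_to_hub --hf_repository=<username>/<repo>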
Training with this model
To continue training with this model, use the train script corresponding to this environment:
python -m sf_examples.atari.train_atari --algo=APPO --env=atari_atlantis --train_dir=./train_dir --experiment=appo-atari-atlantis --restart_behavior=resume --train_for_env_steps=10000000000
Note that you may need to set --train_for_env_steps to a suitably high number, as the experiment will resume from the step count at which it previously concluded.
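For example, since this checkpoint was trained to 10 million steps, asking for roughly 10 million more could look like the following (20000000 is an illustrative target, not a recommended budget):
python -m sf_examples.atari.train_atari --algo=APPO --env=atari_atlantis --train_dir=./train_dir --experiment=appo-atari-atlantis --restart_behavior=resume --train_for_env_steps=20000000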
SOTA Performance
This model, like the others in this series, was trained for 10 million environment steps to establish a baseline. Interestingly, it reaches SOTA performance even at this budget, suggesting that Atlantis is a relatively easy game to beat.
For more information on this environment, see https://www.endtoend.ai/envs/gym/atari/atlantis/. Because rewards are plentiful and the Gorgon ships have to pass four times before coming into attack range, SOTA performance is relatively easy to reach in this environment.
I have now compared this result against TQC, SAC, and DQN models, all of which underperformed PPO. I now consider this Atari environment solved.
Evaluation results
- mean_reward on atari_atlantis (self-reported): 927640.00 +/- 10444.54