What is the training command for this model (mainly about the 2B)?

#2 opened by ksridhar

Hey @edbeeching,

Thank you very much for these policies and for the JAT dataset created with them.

I'm trying to create similar policies for some Atari envs outside the Atari 57.

Is the following the Sample Factory training command used to create them:

python -m sf_examples.atari.train_atari --algo=APPO --env=${ENV} --train_for_env_steps=2000000000 --experiment="atari_2B_${ENV}_1111"

In particular, does the 2B in the model name mean 2 billion train_for_env_steps? The default value of that argument appears to be 100 million.

Thank you,
Kaustubh

Actually, never mind, I think I found the training command here: https://huggingface.co/edbeeching/atari_2B_atari_mspacman_1111/blob/main/cfg.json#L124
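In case it helps others, here is a small sketch for fetching that cfg.json and pretty-printing it locally to read the stored training arguments; the curl / json.tool approach is just my suggestion, not something from the model card:

# Download the config from the Hub (the resolve URL mirrors the blob link above)
curl -L -o cfg.json https://huggingface.co/edbeeching/atari_2B_atari_mspacman_1111/resolve/main/cfg.json
# Pretty-print it; each training argument is stored as a top-level JSON key
python -m json.tool cfg.json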

One last question: when I try to run the command above, I get the following error:

train_atari.py: error: unrecognized arguments: --env_agents=512

Can I just remove --env_agents=512? @edbeeching

Edit:

The reason I got the above error was because I was using sf_examples.atari.train_atari instead of sf_examples.envpool.atari.train_envpool_atari.
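For the envpool entry point to work, the envpool package itself needs to be installed alongside Sample Factory. A minimal setup sketch, assuming a standard pip environment (the exact package list is my assumption, not something stated in the thread):

# Install Sample Factory plus envpool, the batched Atari simulator used by train_envpool_atari
pip install sample-factory envpool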

The correct training command is:

python -m sf_examples.envpool.atari.train_envpool_atari \
  --seed=1111 --experiment=atari_2B_${ENV}_1111 --env=${ENV} --train_for_seconds=3600000 \
  --algo=APPO --gamma=0.99 --num_workers=4 --num_envs_per_worker=1 --worker_num_splits=1 \
  --env_agents=512 --benchmark=False --max_grad_norm=0.0 --decorrelate_experience_max_seconds=1 \
  --encoder_conv_architecture=convnet_atari --encoder_conv_mlp_layers 512 --nonlinearity=relu \
  --num_policies=1 --normalize_input=True --normalize_input_keys obs --normalize_returns=True \
  --async_rl=True --batched_sampling=True --train_for_env_steps=2000000000 --save_milestones_sec=1200 \
  --train_dir train_dir --rollout 64 --exploration_loss_coeff 0.0004677351413 --num_epochs 2 \
  --batch_size 1024 --num_batches_per_epoch 8 --learning_rate 0.0003033891184
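Since the goal is to train policies for several Atari games, here is a minimal bash sketch for sweeping that same command over a list of environments. The env names in the loop are placeholders I chose; as far as I understand, games outside the standard Atari 57 would first need to be registered in the sf_examples envpool Atari env list.

# Shared hyperparameters taken from the command above; only --env and --experiment vary per run
FLAGS="--seed=1111 --train_for_seconds=3600000 --algo=APPO --gamma=0.99 \
  --num_workers=4 --num_envs_per_worker=1 --worker_num_splits=1 --env_agents=512 \
  --benchmark=False --max_grad_norm=0.0 --decorrelate_experience_max_seconds=1 \
  --encoder_conv_architecture=convnet_atari --encoder_conv_mlp_layers 512 --nonlinearity=relu \
  --num_policies=1 --normalize_input=True --normalize_input_keys obs --normalize_returns=True \
  --async_rl=True --batched_sampling=True --train_for_env_steps=2000000000 --save_milestones_sec=1200 \
  --train_dir train_dir --rollout 64 --exploration_loss_coeff 0.0004677351413 --num_epochs 2 \
  --batch_size 1024 --num_batches_per_epoch 8 --learning_rate 0.0003033891184"

# Placeholder env names; runs execute sequentially, one 2B-step training per game
for ENV in atari_pong atari_breakout; do
  python -m sf_examples.envpool.atari.train_envpool_atari --env=${ENV} --experiment=atari_2B_${ENV}_1111 ${FLAGS}
done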
ksridhar changed discussion status to closed
