second experiment with PPO model: more epochs, ent_coef tuning 0ec2485 adhisetiawan committed on Feb 23, 2023