RL Environment Wrappers

class simulate.RLEnv

( scene: Scene, time_step: typing.Optional[float] = 0.03333333333333333, frame_skip: typing.Optional[int] = 4 )

The basic RL environment wrapper for a Simulate scene, following the Gym API.
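As a rough illustration of how the two timing defaults combine: the default `time_step` of 0.03333333333333333 is 1/30 s. The sketch below assumes that `frame_skip` is the number of simulation frames advanced per call to `step`, which is the conventional meaning but is not stated in this reference. The helper name `simulated_seconds_per_step` is hypothetical and not part of the library.

```python
def simulated_seconds_per_step(time_step: float = 1 / 30, frame_skip: int = 4) -> float:
    """Simulated time covered by one call to step().

    Assumption: frame_skip counts the simulation frames advanced per step.
    This reference does not state that explicitly.
    """
    return time_step * frame_skip


# With the documented defaults, one step covers 4 frames of 1/30 s each.
seconds = simulated_seconds_per_step()
# Number of agent decisions per simulated second under the same assumption.
steps_per_second = 1 / seconds
```

Under that assumption, the defaults give roughly 0.133 s of simulated time per step, or about 7.5 agent decisions per simulated second.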

reset

( ) → obs (Dict)

Returns

obs (Dict)

the observation of the environment after reset.

Resets the actors and the scene of the environment.

sample_action

( ) → action

Returns

action

TODO

Samples an action from the actors in the environment. This function loads the configuration of maps and actors to return the correct shape across multiple configurations.

step

( action: typing.Union[typing.Dict, typing.List, numpy.ndarray] ) → observation (Dict)

Parameters

action (Dict, List, or numpy.ndarray) — a dict with actuator tags as keys and, as values, a Tensor of shape (n_show, n_actors, n_actions). TODO: verify.

Returns

observation (Dict): TODO
reward (float): TODO
done (bool): TODO
info: TODO

The step function for the environment. It follows the OpenAI Gym API.
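The `reset`, `sample_action`, and `step` methods compose into the usual Gym-style rollout loop. The sketch below is a minimal illustration, not the library's implementation: `StubEnv` is a hypothetical stand-in that only mimics the call shapes, and the four-tuple return of `step` (observation, reward, done, info) is an assumption based on the Returns list and the classic Gym API, since the exact return structure is still marked TODO here.

```python
import random
from typing import Any, Dict, List, Tuple


class StubEnv:
    """Hypothetical stand-in with reset / sample_action / step methods.

    This is NOT simulate.RLEnv; it only mimics the documented call shapes.
    """

    def __init__(self, episode_length: int = 5) -> None:
        self.episode_length = episode_length
        self.t = 0

    def reset(self) -> Dict[str, Any]:
        self.t = 0
        return {"step_count": self.t}

    def sample_action(self) -> List[int]:
        # Placeholder action: one of three discrete choices.
        return [random.randrange(3)]

    def step(self, action) -> Tuple[Dict[str, Any], float, bool, Dict[str, Any]]:
        # Assumed Gym-style return: (observation, reward, done, info).
        self.t += 1
        obs = {"step_count": self.t}
        done = self.t >= self.episode_length
        return obs, 1.0, done, {}


def run_episode(env) -> float:
    """Run one episode with random actions and return the summed reward."""
    obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = env.sample_action()
        obs, reward, done, info = env.step(action)
        total_reward += reward
    return total_reward


episode_return = run_episode(StubEnv(episode_length=5))
```

If the real environment returns the same four-tuple, `run_episode` should work unchanged when passed an `RLEnv` built from a scene, but that should be checked against the library itself.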

class simulate.ParallelRLEnv

( scene_or_map_fn: typing.Union[typing.Callable, simulate.scene.Scene], n_maps: typing.Optional[int] = 1, n_show: typing.Optional[int] = 1, time_step: typing.Optional[float] = 0.03333333333333333, frame_skip: typing.Optional[int] = 4, **engine_kwargs )

RL environment wrapper for a Simulate scene. Uses functionality from the VecEnv in Stable Baselines3. For more information on VecEnv, see https://stable-baselines3.readthedocs.io/en/master/guide/vec_envs.html

reset

( ) → obs (Dict)

Returns

obs (Dict)

the observation of the environment after reset.

Resets the actors and the scene of the environment.

sample_action

( ) → action

Returns

action

TODO

Samples an action from the actors in the environment. This function loads the configuration of maps and actors to return the correct shape across multiple configurations.

step

( action: typing.Union[typing.Dict, typing.List, numpy.ndarray] ) → observation (Dict)

Parameters

action (Dict, List, or numpy.ndarray) — a dict with actuator tags as keys and, as values, a Tensor of shape (n_show, n_actors, n_actions). TODO: verify.

Returns

observation (Dict): TODO
reward (float): TODO
done (bool): TODO
info: TODO

The step function for the environment. It follows the OpenAI Gym API.
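To make the documented action layout concrete, the sketch below builds a dict keyed by actuator tag whose values are nested lists of shape (n_show, n_actors, n_actions). The helper names and the tag `"actuator"` are hypothetical, and the layout itself is the one this reference still marks "TODO: verify", so treat it as an illustration of the shape only.

```python
from typing import Dict, List

Nested = List[List[List[float]]]


def make_zero_actions(tags: List[str], n_show: int, n_actors: int, n_actions: int) -> Dict[str, Nested]:
    """Build all-zero actions: one (n_show, n_actors, n_actions) block per tag."""
    return {
        tag: [[[0.0] * n_actions for _ in range(n_actors)] for _ in range(n_show)]
        for tag in tags
    }


def nested_shape(values) -> tuple:
    """Return the shape of a regular nested list, e.g. (2, 3, 1)."""
    shape = []
    while isinstance(values, list):
        shape.append(len(values))
        if not values:
            break
        values = values[0]
    return tuple(shape)


# Two shown maps, three actors per map, one action value per actor.
actions = make_zero_actions(["actuator"], n_show=2, n_actors=3, n_actions=1)
shape = nested_shape(actions["actuator"])
```

With NumPy, the same block could be created as `numpy.zeros((n_show, n_actors, n_actions))`; plain lists are used here only to keep the sketch dependency-free.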

class simulate.MultiProcessRLEnv

( env_fn: typing.Callable, n_parallel: int, starting_port: int = 55001 )

Parameters

env_fn (Callable) — a generator function that returns an RLEnv or ParallelRLEnv, used to create instances of the desired environment.
n_parallel (int) — the number of executable instances to create.
starting_port (int) — initial communication port for spawned executables.

Multi-process RL environment wrapper for a Simulate scene. Spawns multiple backend executables to run in parallel, optionally with multiple maps in each. Uses functionality from the VecEnv in Stable Baselines3. For more information on VecEnv, see https://stable-baselines3.readthedocs.io/en/master/guide/vec_envs.html
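The relationship between `env_fn`, `n_parallel`, and `starting_port` can be sketched as a factory called once per instance. Two points below are assumptions rather than documented behavior: that `env_fn` receives the port as its argument, and that instance `i` uses port `starting_port + i`. `StubProcessEnv` and `spawn_envs` are hypothetical names used only for illustration.

```python
from typing import Callable, List


class StubProcessEnv:
    """Hypothetical placeholder for an RLEnv / ParallelRLEnv instance."""

    def __init__(self, port: int) -> None:
        self.port = port


def env_fn(port: int) -> StubProcessEnv:
    # Assumption: the factory is handed the communication port to use.
    return StubProcessEnv(port)


def spawn_envs(
    env_fn: Callable[[int], StubProcessEnv],
    n_parallel: int,
    starting_port: int = 55001,
) -> List[StubProcessEnv]:
    # Assumption: instance i communicates on starting_port + i.
    return [env_fn(starting_port + i) for i in range(n_parallel)]


envs = spawn_envs(env_fn, n_parallel=4)
ports = [env.port for env in envs]
```

Passing a factory rather than a ready-made environment lets each instance be constructed with its own port, so the spawned executables do not contend for the same one.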

step

( actions: typing.Union[numpy.array, NoneType] = None ) → all_observation (Dict)

Parameters

actions (Dict or List, defaults to None) — a dict with actuator tags as keys and, as values, a Tensor of shape (n_show, n_actors, n_actions). TODO: verify.

Returns

all_observation (Dict): TODO
all_reward (float): TODO
all_done (bool): TODO
all_info: TODO

The step function for the environment. It follows the OpenAI Gym API.
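Because the return values here are still marked TODO, the sketch below shows only one plausible way the per-process results could be gathered into the `all_*` values. It assumes each process yields a Gym-style (observation, reward, done, info) tuple and collects them into lists, which may differ from what the library actually returns. `combine_step_results` is a hypothetical helper.

```python
from typing import Any, Dict, List, Tuple

StepResult = Tuple[Dict[str, Any], float, bool, Dict[str, Any]]


def combine_step_results(results: List[StepResult]):
    """Gather per-process step results into combined containers.

    Assumption: one (obs, reward, done, info) tuple per process. Observations
    are grouped by key; rewards, dones, and infos are kept as lists in
    process order.
    """
    all_observation: Dict[str, List[Any]] = {}
    all_reward: List[float] = []
    all_done: List[bool] = []
    all_info: List[Dict[str, Any]] = []
    for obs, reward, done, info in results:
        for key, value in obs.items():
            all_observation.setdefault(key, []).append(value)
        all_reward.append(reward)
        all_done.append(done)
        all_info.append(info)
    return all_observation, all_reward, all_done, all_info


# Two made-up per-process results, purely for illustration.
per_process = [
    ({"camera": "frame_a"}, 0.5, False, {}),
    ({"camera": "frame_b"}, 1.0, True, {}),
]
all_observation, all_reward, all_done, all_info = combine_step_results(per_process)
```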