RL Environment Wrappers
class simulate.RLEnv
( scene: Scene, time_step: typing.Optional[float] = 0.03333333333333333, frame_skip: typing.Optional[int] = 4 )
The basic RL environment wrapper for a Simulate scene, following the Gym API.
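As a rough illustration of the two defaults in the signature, the arithmetic below assumes that one call to step advances the simulation by frame_skip frames of time_step seconds each. That interpretation is an assumption, not something this page states.

```python
# Defaults taken from the RLEnv signature above.
time_step = 1 / 30   # 0.03333333333333333 seconds per simulation frame
frame_skip = 4       # frames advanced per environment step (assumed meaning)

# Simulated time covered by a single environment step, under that assumption.
seconds_per_step = time_step * frame_skip

# Number of environment steps needed to cover one simulated second.
steps_per_simulated_second = 1 / seconds_per_step

print(f"{seconds_per_step:.4f} s of simulated time per step")
print(f"{steps_per_simulated_second:.2f} steps per simulated second")
```

Under this reading, raising frame_skip makes each agent decision cover more simulated time, at the cost of coarser control.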
Close the scene.
Return a class attribute by name.
reset
( ) → obs (Dict)
Returns
obs (Dict): The observation of the environment after reset.
Resets the actors and the scene of the environment.
sample_action
( ) → action (list[list[list[float]]])
Returns
action (list[list[list[float]]]): Nested lists of actions; the dimensions are n-maps, n-actors, action-dim.
Samples an action from the actors in the environment. This function loads the configuration of maps and actors to return the correct shape across multiple configurations.
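To make the nested return shape concrete, here is a minimal sketch that builds a list with the same (n-maps, n-actors, action-dim) layout. The helper sample_nested_action and its uniform sampling range are hypothetical stand-ins, not part of the library.

```python
import random
from typing import List


def sample_nested_action(
    n_maps: int, n_actors: int, action_dim: int, low: float = -1.0, high: float = 1.0
) -> List[List[List[float]]]:
    """Hypothetical helper: random actions laid out as [map][actor][action component]."""
    return [
        [[random.uniform(low, high) for _ in range(action_dim)] for _ in range(n_actors)]
        for _ in range(n_maps)
    ]


action = sample_nested_action(n_maps=2, n_actors=3, action_dim=4)

# Recover the three dimensions from the nested lists.
shape = (len(action), len(action[0]), len(action[0][0]))
print(shape)
```

Indexing action[m][a] then gives the action vector for actor a on map m.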
step
( action: typing.Union[typing.Dict, typing.List, numpy.ndarray] ) → observation (Dict)
The step function for the environment; it follows the OpenAI Gym API.
TODO verify: a dict with actuator tags as keys and, as values, a Tensor of shape (n_show, n_actors, n_actions).
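The reset, sample_action and step methods can be combined into a Gym-style rollout loop. The sketch below runs against StubEnv, a hypothetical stand-in defined here so the example works without a Simulate backend. It assumes step returns the (observation, reward, done, info) tuple documented for step_recv_async.

```python
import random
from typing import Dict, List, Tuple


class StubEnv:
    """Hypothetical stand-in; only the method names mirror the documented API."""

    def __init__(self, n_actors: int = 1, action_dim: int = 2, episode_length: int = 5):
        self.n_actors = n_actors
        self.action_dim = action_dim
        self.episode_length = episode_length
        self._t = 0

    def reset(self) -> Dict:
        self._t = 0
        return {"t": self._t}

    def sample_action(self) -> List[List[List[float]]]:
        # One map, n_actors actors, action_dim components per actor.
        return [
            [[random.uniform(-1.0, 1.0) for _ in range(self.action_dim)]
             for _ in range(self.n_actors)]
        ]

    def step(self, action) -> Tuple[Dict, float, bool, Dict]:
        # Assumed Gym-style return value: (observation, reward, done, info).
        self._t += 1
        done = self._t >= self.episode_length
        return {"t": self._t}, 1.0, done, {}


def rollout(env, max_steps: int = 100) -> float:
    """Run one episode with random actions and return the summed reward."""
    env.reset()
    total = 0.0
    for _ in range(max_steps):
        obs, reward, done, info = env.step(env.sample_action())
        total += reward
        if done:
            break
    return total


total_reward = rollout(StubEnv(episode_length=5))
print(total_reward)
```

With a real environment, the same loop would apply once the stub is replaced by an RLEnv built from a Scene.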
Step the environment asynchronously.
step_recv_async
( ) → observation (Dict)
Returns
observation (Dict): A dictionary containing the observation from the environment.
reward (float): The reward for the action.
done (bool): Whether the episode has ended.
info (Dict): A dictionary of additional information.
Receive the response from the environment asynchronously.
step_send_async
( action: typing.Union[typing.Dict, typing.List, numpy.ndarray] )
Send action for execution asynchronously.
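The send/receive split lets the caller do other work while a step is in flight. The sketch below illustrates that calling pattern with AsyncStubEnv, a hypothetical thread-backed stand-in. How the real wrapper communicates with its backend is not described here, so the threading is purely an assumption for illustration.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict, Optional, Tuple


class AsyncStubEnv:
    """Hypothetical stand-in; only the method names mirror the documented API."""

    def __init__(self) -> None:
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending: Optional[Future] = None
        self._t = 0

    def _simulate(self, action) -> Tuple[Dict, float, bool, Dict]:
        # Pretend backend work: reward is the sum of the action components.
        self._t += 1
        return {"t": self._t}, float(sum(action)), self._t >= 3, {}

    def step_send_async(self, action) -> None:
        if self._pending is not None:
            raise RuntimeError("previous step has not been received yet")
        self._pending = self._pool.submit(self._simulate, action)

    def step_recv_async(self) -> Tuple[Dict, float, bool, Dict]:
        if self._pending is None:
            raise RuntimeError("no step was sent")
        result = self._pending.result()  # blocks until the step has finished
        self._pending = None
        return result

    def close(self) -> None:
        self._pool.shutdown()


env = AsyncStubEnv()
rewards = []
done = False
while not done:
    env.step_send_async([0.5, 0.5])
    steps_so_far = len(rewards)  # any work that does not need the step result
    obs, reward, done, info = env.step_recv_async()
    rewards.append(reward)
env.close()
print(rewards)
```

Each step_send_async call is paired with exactly one step_recv_async call; the stub raises an error if that pairing is violated.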
class simulate.ParallelRLEnv
( map_fn: typing.Union[typing.Callable, simulate.scene.Scene], n_maps: typing.Optional[int] = 1, n_show: typing.Optional[int] = 1, time_step: typing.Optional[float] = 0.03333333333333333, frame_skip: typing.Optional[int] = 4, **engine_kwargs )
RL environment wrapper for a Simulate scene. Uses functionality from the VecEnv in Stable Baselines3. For more information on VecEnv, see https://stable-baselines3.readthedocs.io/en/master/guide/vec_envs.html
Close the environment.
env_is_wrapped
( wrapper_class: typing.Type[gym.core.Wrapper], indices: typing.Union[NoneType, int, typing.Iterable[int]] = None )
Check if the environment is wrapped.
reset
( ) → obs (Dict)
Returns
obs (Dict): The observation of the environment after reset.
Resets the actors and the scene of the environment.
sample_action
( ) → action (list[list[list[float]]])
Returns
action (list[list[list[float]]]): Nested lists of actions; the dimensions are n-maps, n-actors, action-dim.
Samples an action from the actors in the environment. This function loads the configuration of maps and actors to return the correct shape across multiple configurations.
step
( action: typing.Union[typing.Dict, typing.List, numpy.ndarray] ) → all_observation (List[Dict])
Returns
all_observation (List[Dict]): A list of dicts of observations for each sensor.
all_reward (List[float]): All the rewards for the current step.
all_done (List[bool]): Whether each episode is done.
all_info (List[Dict]): A list of dicts of additional information.
The step function for the environment; it follows the OpenAI Gym API.
TODO verify: a dict with actuator tags as keys and, as values, a Tensor of shape (n_show, n_actors, n_actions).
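Because the parallel step returns one reward and one done flag per environment, episode returns have to be tracked per index. A minimal sketch of that bookkeeping follows. The function accumulate_returns is a hypothetical helper, and it assumes a finished environment starts a fresh episode on the next step, as Stable Baselines3 VecEnvs do.

```python
from typing import List


def accumulate_returns(
    running: List[float], all_reward: List[float], all_done: List[bool]
) -> List[float]:
    """Add this step's rewards to the running totals (in place).

    Returns the totals of the episodes that ended on this step and resets
    their running totals to zero.
    """
    finished = []
    for i, (reward, done) in enumerate(zip(all_reward, all_done)):
        running[i] += reward
        if done:
            finished.append(running[i])
            running[i] = 0.0
    return finished


# Two parallel environments, two consecutive steps of made-up data.
running = [0.0, 0.0]
first_finished = accumulate_returns(running, [1.0, 2.0], [False, True])
second_finished = accumulate_returns(running, [0.5, 0.5], [True, False])
print(first_finished, second_finished, running)
```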
step_recv_async
( ) → obs (Dict)
Returns
obs (Dict): A dict of observations for each sensor.
reward (float): The reward for the current step.
done (bool): Whether the episode is done.
info (Dict): A dict of additional information.
Receive the response of a step from the environment asynchronously.
step_send_async
( action: typing.Union[typing.Dict, typing.List, numpy.ndarray] )
Send a step to the environment asynchronously.
class simulate.MultiProcessRLEnv
( env_fn: typing.Callable, n_parallel: int, starting_port: int = 55001 )
Multi-process RL environment wrapper for a Simulate scene. Spawns multiple backend executables to run in parallel, with optional support for multiple maps in each. Uses functionality from the VecEnv in Stable Baselines3. For more information on VecEnv, see https://stable-baselines3.readthedocs.io/en/master/guide/vec_envs.html
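One plausible way to read the env_fn, n_parallel and starting_port arguments is that each backend process gets its own port, counted up from starting_port, and env_fn is called once per port. That allocation scheme is an assumption, not confirmed by this page. The names assign_ports and make_stub_env below are hypothetical.

```python
from typing import Dict, List


def assign_ports(starting_port: int = 55001, n_parallel: int = 4) -> List[int]:
    """Hypothetical scheme: one consecutive port per parallel backend."""
    return [starting_port + i for i in range(n_parallel)]


def make_stub_env(port: int) -> Dict[str, int]:
    """Hypothetical env_fn stand-in; a real one would build an environment on this port."""
    return {"port": port}


ports = assign_ports(starting_port=55001, n_parallel=4)
envs = [make_stub_env(port) for port in ports]
print(ports)
```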
step
( actions: typing.Union[list, <built-in function array>, NoneType] = None ) → all_observation (Dict)
The step function for the environment; it follows the OpenAI Gym API.