Vector Environment Wrappers
- class gymnasium.experimental.vector.VectorWrapper(env: VectorEnv)[source]
Wraps the vectorized environment to allow a modular transformation.
This class is the base class for all wrappers of vectorized environments. Subclasses can override some methods to change the behavior of the original vectorized environment without touching the original code.
Note
Don't forget to call super().__init__(env) if the subclass overrides __init__().
Initialize the vectorized environment wrapper.
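For illustration, a minimal sketch of a custom vector wrapper; the StepCountWrapper name is hypothetical, and the SyncVectorEnv construction assumes the experimental vector API:
>>> import gymnasium as gym
>>> from gymnasium.experimental.vector import SyncVectorEnv, VectorWrapper
>>> class StepCountWrapper(VectorWrapper):
...     # Hypothetical wrapper that counts calls to step()
...     def __init__(self, env):
...         super().__init__(env)  # required when overriding __init__
...         self.step_count = 0
...     def step(self, actions):
...         self.step_count += 1
...         return self.env.step(actions)
>>> envs = StepCountWrapper(SyncVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(3)]))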
Vector Observation Wrappers
- class gymnasium.experimental.vector.VectorObservationWrapper(env: VectorEnv)[source]
Wraps the vectorized environment to allow a modular transformation of the observation. Equivalent to gym.ObservationWrapper for vectorized environments.
Initialize the vectorized environment wrapper.
- class gymnasium.experimental.wrappers.vector.LambdaObservationV0(env: VectorEnv, vector_func: Callable[[ObsType], Any], single_func: Callable[[ObsType], Any], observation_space: Space | None = None)[source]
Transforms an observation via a function provided to the wrapper.
The function vector_func will be applied to all vector observations. If the observations from vector_func are outside the bounds of the env's observation space, provide an observation_space.
- Parameters:
env – The vector environment to wrap
vector_func – A function that will transform the vector observation. If this transformed observation is outside the observation space of env.observation_space, then provide an observation_space.
single_func – A function that will transform an individual observation.
observation_space – The observation space of the wrapper; if None, it is assumed to be the same as env.observation_space.
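For example, a sketch that adds small Gaussian noise to every observation; CartPole-v1 and the experimental SyncVectorEnv are assumptions here, and if the noise can push values outside env.observation_space an explicit observation_space should be passed:
>>> import numpy as np
>>> import gymnasium as gym
>>> from gymnasium.experimental.vector import SyncVectorEnv
>>> from gymnasium.experimental.wrappers.vector import LambdaObservationV0
>>> noise = lambda obs: obs + np.random.normal(0.0, 0.01, obs.shape).astype(obs.dtype)
>>> envs = SyncVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(2)])
>>> envs = LambdaObservationV0(envs, vector_func=noise, single_func=noise)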
- class gymnasium.experimental.wrappers.vector.FilterObservationV0(env: VectorEnv, filter_keys: Sequence[str | int])[source]
Vector wrapper for filtering dict or tuple observation spaces.
- Parameters:
env – The vector environment to wrap
filter_keys – The subspaces to be included; use a list of strings for Dict spaces or of integers for Tuple spaces, respectively
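A short sketch; dict_envs is a hypothetical vector environment whose Dict observation space contains (at least) the keys "image" and "state":
>>> from gymnasium.experimental.wrappers.vector import FilterObservationV0
>>> envs = FilterObservationV0(dict_envs, filter_keys=["image"])  # keep only the "image" subspace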
- class gymnasium.experimental.wrappers.vector.FlattenObservationV0(env: VectorEnv)[source]
Observation wrapper that flattens the observation.
- Parameters:
env – The vector environment to wrap
- class gymnasium.experimental.wrappers.vector.GrayscaleObservationV0(env: VectorEnv, keep_dim: bool = False)[source]
Observation wrapper that converts an RGB image to grayscale.
- Parameters:
env – The vector environment to wrap
keep_dim – Whether to keep the channel dimension in the observation; if True, each observation keeps a trailing channel axis (e.g. shape (H, W, 1)), otherwise the axis is dropped (shape (H, W))
- class gymnasium.experimental.wrappers.vector.ResizeObservationV0(env: VectorEnv, shape: tuple[int, ...])[source]
Resizes image observations to the given shape using OpenCV.
- Parameters:
env – The vector environment to wrap
shape – The resized observation shape
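These image wrappers are commonly chained; a preprocessing sketch assuming CarRacing-v2 (which returns RGB frames and requires the box2d extra):
>>> import gymnasium as gym
>>> from gymnasium.experimental.vector import SyncVectorEnv
>>> from gymnasium.experimental.wrappers.vector import GrayscaleObservationV0, ResizeObservationV0
>>> envs = SyncVectorEnv([lambda: gym.make("CarRacing-v2") for _ in range(2)])
>>> envs = ResizeObservationV0(envs, shape=(84, 84))    # downscale via OpenCV
>>> envs = GrayscaleObservationV0(envs, keep_dim=True)  # RGB -> one channel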
- class gymnasium.experimental.wrappers.vector.ReshapeObservationV0(env: VectorEnv, shape: int | tuple[int, ...])[source]
Reshapes array-based observations to the given shape.
- Parameters:
env – The vector environment to wrap
shape – The target observation shape
- class gymnasium.experimental.wrappers.vector.RescaleObservationV0(env: VectorEnv, min_obs: np.floating | np.integer | np.ndarray, max_obs: np.floating | np.integer | np.ndarray)[source]
Linearly rescales observations to lie between a minimum and maximum value.
- Parameters:
env – The vector environment to wrap
min_obs – The new minimum observation bound
max_obs – The new maximum observation bound
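For instance, Pendulum-v1 has a bounded Box observation space, so its observations can be mapped into [-1, 1] (a sketch assuming the experimental SyncVectorEnv):
>>> import gymnasium as gym
>>> from gymnasium.experimental.vector import SyncVectorEnv
>>> from gymnasium.experimental.wrappers.vector import RescaleObservationV0
>>> envs = SyncVectorEnv([lambda: gym.make("Pendulum-v1") for _ in range(2)])
>>> envs = RescaleObservationV0(envs, min_obs=-1.0, max_obs=1.0)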
Vector Action Wrappers
- class gymnasium.experimental.vector.VectorActionWrapper(env: VectorEnv)[source]
Wraps the vectorized environment to allow a modular transformation of the actions. Equivalent of ActionWrapper for vectorized environments.
Initialize the vectorized environment wrapper.
- class gymnasium.experimental.wrappers.vector.LambdaActionV0(env: VectorEnv, func: Callable[[ActType], Any], action_space: Space | None = None)[source]
Transforms an action via a function provided to the wrapper.
The function func will be applied to all vector actions. If the actions from func are outside the bounds of the env's action space, provide an action_space.
- Parameters:
env – The vector environment to wrap
func – A function that will transform an action. If this transformed action is outside the action space of env.action_space, then provide an action_space.
action_space – The action space of the wrapper; if None, it is assumed to be the same as env.action_space.
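A sketch that negates every action before it reaches the sub-environments; Pendulum-v1 is assumed, and since its Box action space is symmetric, the result stays in bounds and no explicit action_space is needed:
>>> import numpy as np
>>> import gymnasium as gym
>>> from gymnasium.experimental.vector import SyncVectorEnv
>>> from gymnasium.experimental.wrappers.vector import LambdaActionV0
>>> envs = SyncVectorEnv([lambda: gym.make("Pendulum-v1") for _ in range(2)])
>>> envs = LambdaActionV0(envs, func=lambda actions: -np.asarray(actions))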
- class gymnasium.experimental.wrappers.vector.ClipActionV0(env: VectorEnv)[source]
Clips the continuous actions within the valid Box action space bounds.
- Parameters:
env – The vector environment to wrap
- class gymnasium.experimental.wrappers.vector.RescaleActionV0(env: VectorEnv, min_action: float | int | np.ndarray, max_action: float | int | np.ndarray)[source]
Affinely rescales the continuous action space of the environment to the range [min_action, max_action].
- Parameters:
env (Env) – The vector environment to wrap
min_action (float, int or np.ndarray) – The min values for each action. This may be a numpy array or a scalar.
max_action (float, int or np.ndarray) – The max values for each action. This may be a numpy array or a scalar.
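The two wrappers are often combined so an agent can emit actions in [-1, 1] regardless of the environment's native bounds; a sketch using Pendulum-v1, whose actions lie in [-2, 2]:
>>> import gymnasium as gym
>>> from gymnasium.experimental.vector import SyncVectorEnv
>>> from gymnasium.experimental.wrappers.vector import ClipActionV0, RescaleActionV0
>>> envs = SyncVectorEnv([lambda: gym.make("Pendulum-v1") for _ in range(4)])
>>> envs = RescaleActionV0(envs, min_action=-1.0, max_action=1.0)
>>> envs = ClipActionV0(envs)  # clip stray agent outputs to the new [-1, 1] bounds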
Vector Reward Wrappers
- class gymnasium.experimental.vector.VectorRewardWrapper(env: VectorEnv)[source]
Wraps the vectorized environment to allow a modular transformation of the reward. Equivalent of RewardWrapper for vectorized environments.
Initialize the vectorized environment wrapper.
- class gymnasium.experimental.wrappers.vector.LambdaRewardV0(env: VectorEnv, func: Callable[[ArrayType], ArrayType])[source]
A reward wrapper that allows a custom function to modify the step reward.
- Parameters:
env (Env) – The vector environment to wrap
func (Callable) – The function to apply to the reward
- class gymnasium.experimental.wrappers.vector.ClipRewardV0(env: VectorEnv, min_reward: float | np.ndarray | None = None, max_reward: float | np.ndarray | None = None)[source]
A wrapper that clips the rewards for an environment between an upper and lower bound.
- Parameters:
env – The vector environment to wrap
min_reward – The min reward for each step
max_reward – The max reward for each step
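A sketch that scales rewards and then clips them into [-1, 1], Atari-style; both wrappers act on the per-step reward array of the vector environment (CartPole-v1 and the experimental SyncVectorEnv are assumptions):
>>> import gymnasium as gym
>>> from gymnasium.experimental.vector import SyncVectorEnv
>>> from gymnasium.experimental.wrappers.vector import ClipRewardV0, LambdaRewardV0
>>> envs = SyncVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(2)])
>>> envs = LambdaRewardV0(envs, func=lambda r: 0.1 * r)         # scale rewards
>>> envs = ClipRewardV0(envs, min_reward=-1.0, max_reward=1.0)  # then clip them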
More Vector Wrappers
- class gymnasium.experimental.wrappers.vector.RecordEpisodeStatisticsV0(env: VectorEnv, deque_size: int = 100)[source]
This wrapper will keep track of cumulative rewards and episode lengths.
At the end of an episode, the statistics of the episode will be added to info using the key episode. For vectorized environments, the key _episode is also used, indicating whether the env at the respective index has the episode statistics.
After the completion of an episode, info will look like this:
>>> info = {
...     ...
...     "episode": {
...         "r": "<cumulative reward>",
...         "l": "<episode length>",
...         "t": "<elapsed time since beginning of episode>"
...     },
... }
For a vectorized environment the output will be in the form of:
>>> infos = {
...     ...
...     "episode": {
...         "r": "<array of cumulative reward for each done sub-environment>",
...         "l": "<array of episode length for each done sub-environment>",
...         "t": "<array of elapsed time since beginning of episode for each done sub-environment>"
...     },
...     "_episode": "<boolean array of length num-envs>"
... }
Moreover, the most recent rewards and episode lengths are stored in buffers that can be accessed via wrapped_env.return_queue and wrapped_env.length_queue respectively.
- Variables:
return_queue – The cumulative rewards of the last deque_size-many episodes
length_queue – The lengths of the last deque_size-many episodes
- Parameters:
env (Env) – The environment to apply the wrapper
deque_size – The size of the buffers return_queue and length_queue
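A usage sketch following the vectorized infos format above (the SyncVectorEnv construction is an assumption):
>>> import gymnasium as gym
>>> from gymnasium.experimental.vector import SyncVectorEnv
>>> from gymnasium.experimental.wrappers.vector import RecordEpisodeStatisticsV0
>>> envs = SyncVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(4)])
>>> envs = RecordEpisodeStatisticsV0(envs, deque_size=100)
>>> obs, infos = envs.reset(seed=0)
>>> for _ in range(500):
...     obs, rew, term, trunc, infos = envs.step(envs.action_space.sample())
...     if "episode" in infos:  # at least one sub-environment finished
...         print(infos["episode"]["r"][infos["_episode"]])
>>> returns = list(envs.return_queue)  # returns of the most recent episodes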
- class gymnasium.experimental.wrappers.vector.DictInfoToListV0(env: VectorEnv)[source]
Converts infos of vectorized environments from dict to List[dict].
This wrapper converts the info format of a vector environment from a dictionary to a list of dictionaries. This wrapper is intended to be used around vectorized environments. If using other wrappers that perform operations on info, such as RecordEpisodeStatistics, this needs to be the outermost wrapper, i.e. DictInfoToListV0(RecordEpisodeStatisticsV0(vector_env)).
Example:
>>> import numpy as np
>>> dict_info = {
...     "k": np.array([0., 0., 0.5, 0.3]),
...     "_k": np.array([False, False, True, True])
... }
>>> list_info = [{}, {}, {"k": 0.5}, {"k": 0.3}]
- Parameters:
env (Env) – The environment to apply the wrapper
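A usage sketch showing the converted per-environment infos (the SyncVectorEnv construction is an assumption):
>>> import gymnasium as gym
>>> from gymnasium.experimental.vector import SyncVectorEnv
>>> from gymnasium.experimental.wrappers.vector import DictInfoToListV0
>>> envs = DictInfoToListV0(SyncVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(3)]))
>>> obs, infos = envs.reset(seed=0)
>>> assert isinstance(infos, list) and len(infos) == 3  # one dict per sub-env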
- class gymnasium.experimental.wrappers.vector.VectorizeLambdaObservationV0(env: VectorEnv, wrapper: type[LambdaObservationV0], **kwargs: Any)[source]
Vectorizes a single-agent lambda observation wrapper for vector environments.
- Parameters:
env – The vector environment to wrap.
wrapper – The wrapper to vectorize
**kwargs – Keyword arguments for the wrapper
- class gymnasium.experimental.wrappers.vector.VectorizeLambdaActionV0(env: VectorEnv, wrapper: type[LambdaActionV0], **kwargs: Any)[source]
Vectorizes a single-agent lambda action wrapper for vector environments.
- Parameters:
env – The vector environment to wrap
wrapper – The wrapper to vectorize
**kwargs – Arguments for the LambdaActionV0 wrapper
- class gymnasium.experimental.wrappers.vector.VectorizeLambdaRewardV0(env: VectorEnv, wrapper: type[LambdaRewardV0], **kwargs: Any)[source]
Vectorizes a single-agent lambda reward wrapper for vector environments.
- Parameters:
env – The vector environment to wrap.
wrapper – The wrapper to vectorize
**kwargs – Keyword arguments for the wrapper
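These three wrappers share the same pattern: pass the single-agent lambda wrapper class plus its keyword arguments, and the transformation is applied in every sub-environment. A sketch assuming the single-agent gymnasium.experimental.wrappers.LambdaObservationV0; if the chosen wrapper requires an explicit observation_space, pass it as an extra keyword argument:
>>> import numpy as np
>>> import gymnasium as gym
>>> from gymnasium.experimental import wrappers
>>> from gymnasium.experimental.vector import SyncVectorEnv
>>> from gymnasium.experimental.wrappers.vector import VectorizeLambdaObservationV0
>>> envs = SyncVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(2)])
>>> envs = VectorizeLambdaObservationV0(
...     envs, wrappers.LambdaObservationV0, func=lambda obs: np.clip(obs, -1.0, 1.0)
... )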
- class gymnasium.experimental.wrappers.vector.JaxToNumpyV0(env: VectorEnv)[source]
Wraps a Jax-based vector environment so that it can be interacted with through numpy arrays.
Notes
A vectorized version of gymnasium.experimental.wrappers.JaxToNumpyV0.
Actions must be provided as numpy arrays, and observations, rewards, terminations and truncations will be returned as numpy arrays.
- Parameters:
env – The Jax-based vector environment to wrap
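A usage sketch; jax_envs is a placeholder for a Jax-based vector environment:
>>> import numpy as np
>>> from gymnasium.experimental.wrappers.vector import JaxToNumpyV0
>>> envs = JaxToNumpyV0(jax_envs)  # `jax_envs` is hypothetical
>>> obs, infos = envs.reset()      # obs is now a numpy array
>>> actions = np.asarray(envs.action_space.sample())
>>> obs, rew, term, trunc, infos = envs.step(actions)  # all numpy arrays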
- class gymnasium.experimental.wrappers.vector.JaxToTorchV0(env: VectorEnv, device: Device | None = None)[source]
Wraps a Jax-based vector environment so that it can be interacted with through PyTorch Tensors.
Actions must be provided as PyTorch Tensors, and observations, rewards, terminations and truncations will be returned as PyTorch Tensors.
- Parameters:
env – The Jax-based vector environment to wrap
device – The device the torch Tensors should be moved to
- class gymnasium.experimental.wrappers.vector.NumpyToTorchV0(env: VectorEnv, device: Device | None = None)[source]
Wraps a numpy-based vector environment so that it can be interacted with through PyTorch Tensors.
- Parameters:
env – The numpy-based vector environment to wrap
device – The device the torch Tensors should be moved to
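A usage sketch that moves a numpy-based vector environment's interface onto torch Tensors (the SyncVectorEnv construction is an assumption):
>>> import gymnasium as gym
>>> import torch
>>> from gymnasium.experimental.vector import SyncVectorEnv
>>> from gymnasium.experimental.wrappers.vector import NumpyToTorchV0
>>> device = "cuda" if torch.cuda.is_available() else "cpu"
>>> envs = SyncVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(2)])
>>> envs = NumpyToTorchV0(envs, device=device)
>>> obs, infos = envs.reset(seed=0)  # obs is a torch.Tensor on `device`
>>> actions = torch.zeros(2, dtype=torch.int64, device=device)
>>> obs, rew, term, trunc, infos = envs.step(actions)  # torch Tensors throughout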