Observation Wrappers

class gymnasium.ObservationWrapper(env: Env[ObsType, ActType])

Modify observations from Env.reset() and Env.step() using observation() function.

To apply a function only to the observation before passing it to the learning code, inherit from ObservationWrapper and override the observation() method to implement the transformation. The transformed observations must remain valid members of the env observation space. If they do not, specify the new observation space of the wrapper by setting self.observation_space in the __init__() method of your wrapper.

Parameters:

env – Environment to be wrapped.

observation(observation: ObsType) → WrapperObsType

Returns a modified observation.

Parameters:

observation – The env observation

Returns:

The modified observation
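The routing described above can be sketched without Gymnasium installed. The classes below (CountingEnv, SketchObservationWrapper, DoubleObservation) are hypothetical stand-ins, not the library's implementation; they only illustrate that both reset() and step() pass the observation through observation().

```python
class CountingEnv:
    """Hypothetical stand-in environment whose observation is a step counter."""

    def reset(self):
        self.t = 0
        return self.t, {}

    def step(self, action):
        self.t += 1
        # (observation, reward, terminated, truncated, info)
        return self.t, 1.0, False, False, {}


class SketchObservationWrapper:
    """Simplified stand-in for the wrapper pattern, not Gymnasium's code."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        obs, info = self.env.reset()
        return self.observation(obs), info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        return self.observation(obs), reward, terminated, truncated, info

    def observation(self, observation):
        raise NotImplementedError


class DoubleObservation(SketchObservationWrapper):
    """Only the transformation itself has to be written by the user."""

    def observation(self, observation):
        return 2 * observation


env = DoubleObservation(CountingEnv())
first_obs, _ = env.reset()
second_obs, reward, *_ = env.step(0)
```

The reward and the termination flags pass through unchanged; only the observation is transformed.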

Implemented Wrappers

class gymnasium.wrappers.TransformObservation(env: gym.Env[ObsType, ActType], func: Callable[[ObsType], Any], observation_space: gym.Space[WrapperObsType] | None)

Applies a function to the observation received from the environment’s Env.reset() and Env.step() that is passed back to the user.

The function func will be applied to all observations. If the observations from func are outside the bounds of the env’s observation space, provide an updated observation_space.

A vector version of the wrapper exists: gymnasium.wrappers.vector.TransformObservation.

Example

>>> import gymnasium as gym
>>> from gymnasium.wrappers import TransformObservation
>>> import numpy as np
>>> np.random.seed(0)
>>> env = gym.make("CartPole-v1")
>>> env.reset(seed=42)
(array([ 0.0273956 , -0.00611216,  0.03585979,  0.0197368 ], dtype=float32), {})
>>> env = gym.make("CartPole-v1")
>>> env = TransformObservation(env, lambda obs: obs + 0.1 * np.random.random(obs.shape), env.observation_space)
>>> env.reset(seed=42)
(array([0.08227695, 0.06540678, 0.09613613, 0.07422512]), {})
Change logs:
  • v0.15.4 - Initially added

  • v1.0.0 - Add requirement of observation_space

Parameters:
  • env – The environment to wrap

  • func – A function that transforms an observation. If the transformed observation falls outside env.observation_space, provide an observation_space.

  • observation_space – The observation space of the wrapper. If None, it is assumed to be the same as env.observation_space.
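Why an updated observation_space may be needed can be seen with plain numbers. This is a minimal sketch in which lists stand in for Box bounds; within_bounds and double are hypothetical helpers, and the bounds are those of a Pendulum-like space.

```python
def within_bounds(obs, low, high):
    """Return True if every element of obs lies inside [low, high]."""
    return all(lo <= x <= hi for x, lo, hi in zip(obs, low, high))


def double(obs):
    """Example transformation: scale every element by two."""
    return [2.0 * x for x in obs]


low, high = [-1.0, -1.0, -8.0], [1.0, 1.0, 8.0]
obs = [0.9, -0.5, 6.0]
transformed = double(obs)

# The transformed observation no longer fits the original bounds.
fits_old_space = within_bounds(transformed, low, high)

# Doubling is increasing, so the new bounds are the old bounds doubled.
new_low, new_high = double(low), double(high)
fits_new_space = within_bounds(transformed, new_low, new_high)
```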

class gymnasium.wrappers.DelayObservation(env: Env[ObsType, ActType], delay: int)

Adds a delay to the returned observation from the environment.

Until delay timesteps have elapsed, the returned observation is an array of zeros with the same shape as the observation space.

No vector version of the wrapper exists.

Note

Random delay values are not supported. If you are interested in this feature, please raise an issue or open a pull request.

Example

>>> import gymnasium as gym
>>> from gymnasium.wrappers import DelayObservation
>>> env = gym.make("CartPole-v1")
>>> env.reset(seed=123)
(array([ 0.01823519, -0.0446179 , -0.02796401, -0.03156282], dtype=float32), {})
>>> env = DelayObservation(env, delay=2)
>>> env.reset(seed=123)
(array([0., 0., 0., 0.], dtype=float32), {})
>>> env.step(env.action_space.sample())
(array([0., 0., 0., 0.], dtype=float32), 1.0, False, False, {})
>>> env.step(env.action_space.sample())
(array([ 0.01823519, -0.0446179 , -0.02796401, -0.03156282], dtype=float32), 1.0, False, False, {})
Change logs:
  • v1.0.0 - Initially added

Parameters:
  • env – The environment to wrap

  • delay – The number of timesteps to delay observations
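The delay behaviour can be pictured as a first-in, first-out queue. The sketch below is an illustration of the idea under that assumption, not the wrapper's source; DelayBuffer is a hypothetical helper and plain lists stand in for arrays.

```python
from collections import deque


class DelayBuffer:
    """Hypothetical helper: returns the observation from `delay` steps ago."""

    def __init__(self, delay, zero_obs):
        self.delay = delay
        self.zero_obs = zero_obs
        self.queue = deque()

    def reset(self):
        self.queue.clear()

    def push(self, obs):
        self.queue.append(obs)
        if len(self.queue) > self.delay:
            # Enough observations have been seen: release the oldest one.
            return self.queue.popleft()
        # Still inside the delay window: return zeros instead.
        return self.zero_obs


buffer = DelayBuffer(delay=2, zero_obs=[0.0, 0.0])
buffer.reset()
outputs = [buffer.push([float(t), float(t)]) for t in range(1, 5)]
```

With delay=2, the first two outputs are zeros and the third output is the first observation, which mirrors the doctest above.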

class gymnasium.wrappers.DtypeObservation(env: Env[ObsType, ActType], dtype: Any)

Modifies the dtype of an observation array to a specified dtype.

Note

This is only compatible with Box, Discrete, MultiDiscrete and MultiBinary observation spaces.

A vector version of the wrapper exists: gymnasium.wrappers.vector.DtypeObservation.

Change logs:
  • v1.0.0 - Initially added

Parameters:
  • env – The environment to wrap

  • dtype – The new dtype of the observation

class gymnasium.wrappers.FilterObservation(env: gym.Env[ObsType, ActType], filter_keys: Sequence[str | int])

Filters a Dict or Tuple observation space by a set of keys or indexes.

A vector version of the wrapper exists: gymnasium.wrappers.vector.FilterObservation.

Example

>>> import gymnasium as gym
>>> from gymnasium.wrappers import FilterObservation
>>> env = gym.make("CartPole-v1")
>>> env = gym.wrappers.TimeAwareObservation(env, flatten=False)
>>> env.observation_space
Dict('obs': Box([-4.8000002e+00 -3.4028235e+38 -4.1887903e-01 -3.4028235e+38], [4.8000002e+00 3.4028235e+38 4.1887903e-01 3.4028235e+38], (4,), float32), 'time': Box(0, 500, (1,), int32))
>>> env.reset(seed=42)
({'obs': array([ 0.0273956 , -0.00611216,  0.03585979,  0.0197368 ], dtype=float32), 'time': array([0], dtype=int32)}, {})
>>> env = FilterObservation(env, filter_keys=['time'])
>>> env.reset(seed=42)
({'time': array([0], dtype=int32)}, {})
>>> env.step(0)
({'time': array([1], dtype=int32)}, 1.0, False, False, {})
Change logs:
  • v0.12.3 - Initially added, originally called FilterObservationWrapper

  • v1.0.0 - Rename to FilterObservation and add support for tuple observation spaces with integer filter_keys

Parameters:
  • env – The environment to wrap

  • filter_keys – The set of subspaces to include. Use a list of strings for Dict spaces and a list of integers for Tuple spaces.
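The difference between string keys and integer indexes can be sketched on plain Python containers. filter_observation is a hypothetical helper written for illustration; it does not reproduce how the wrapper updates the observation space.

```python
def filter_observation(obs, filter_keys):
    """Keep only the selected entries of a dict or tuple observation."""
    if isinstance(obs, dict):
        return {key: obs[key] for key in filter_keys}
    return tuple(obs[index] for index in filter_keys)


dict_obs = {"obs": [0.03, -0.01], "time": [0]}
tuple_obs = ([0.03, -0.01], [0])

only_time = filter_observation(dict_obs, ["time"])  # string keys for a dict
only_first = filter_observation(tuple_obs, [0])     # integer indexes for a tuple
```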

class gymnasium.wrappers.FlattenObservation(env: Env[ObsType, ActType])

Flattens the environment’s observation space and each observation from reset and step functions.

A vector version of the wrapper exists: gymnasium.wrappers.vector.FlattenObservation.

Example

>>> import gymnasium as gym
>>> from gymnasium.wrappers import FlattenObservation
>>> env = gym.make("CarRacing-v2")
>>> env.observation_space.shape
(96, 96, 3)
>>> env = FlattenObservation(env)
>>> env.observation_space.shape
(27648,)
>>> obs, _ = env.reset()
>>> obs.shape
(27648,)
Change logs:
  • v0.15.0 - Initially added

Parameters:

env – The environment to wrap
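The flattened size in the example is the product of the original dimensions. The sketch below checks that arithmetic and flattens a tiny nested list in row-major order; flatten here is a hypothetical helper, not gymnasium.spaces.utils.flatten().

```python
import math


def flatten(nested):
    """Recursively flatten nested lists into one flat list (row-major order)."""
    if not isinstance(nested, list):
        return [nested]
    flat = []
    for item in nested:
        flat.extend(flatten(item))
    return flat


# 96 * 96 * 3 elements end up in a single dimension.
flat_size = math.prod((96, 96, 3))

# A tiny stand-in image of shape (2, 2, 3).
tiny_image = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]
flat_image = flatten(tiny_image)
```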

class gymnasium.wrappers.FrameStackObservation(env: gym.Env[ObsType, ActType], stack_size: int, *, padding_type: str | ObsType = 'reset')

Stacks the observations from the last N time steps in a rolling manner.

For example, if the number of stacks is 4, then the returned observation contains the most recent 4 observations. For environment ‘Pendulum-v1’, the original observation is an array with shape [3], so if we stack 4 observations, the processed observation has shape [4, 3].

Users have options for the padded observation used:

  • “reset” (default) - The reset value is repeated

  • “zero” - A “zero”-like instance of the observation space

  • custom - An instance of the observation space

No vector version of the wrapper exists.

Example

>>> import gymnasium as gym
>>> import numpy as np
>>> from gymnasium.wrappers import FrameStackObservation
>>> env = gym.make("CarRacing-v2")
>>> env = FrameStackObservation(env, stack_size=4)
>>> env.observation_space
Box(0, 255, (4, 96, 96, 3), uint8)
>>> obs, _ = env.reset()
>>> obs.shape
(4, 96, 96, 3)
Example with different padding observations:
>>> env = gym.make("CartPole-v1")
>>> env.reset(seed=123)
(array([ 0.01823519, -0.0446179 , -0.02796401, -0.03156282], dtype=float32), {})
>>> stacked_env = FrameStackObservation(env, 3)   # the default is padding_type="reset"
>>> stacked_env.reset(seed=123)
(array([[ 0.01823519, -0.0446179 , -0.02796401, -0.03156282],
       [ 0.01823519, -0.0446179 , -0.02796401, -0.03156282],
       [ 0.01823519, -0.0446179 , -0.02796401, -0.03156282]],
      dtype=float32), {})
>>> stacked_env = FrameStackObservation(env, 3, padding_type="zero")
>>> stacked_env.reset(seed=123)
(array([[ 0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.01823519, -0.0446179 , -0.02796401, -0.03156282]],
      dtype=float32), {})
>>> stacked_env = FrameStackObservation(env, 3, padding_type=np.array([1, -1, 0, 2], dtype=np.float32))
>>> stacked_env.reset(seed=123)
(array([[ 1.        , -1.        ,  0.        ,  2.        ],
       [ 1.        , -1.        ,  0.        ,  2.        ],
       [ 0.01823519, -0.0446179 , -0.02796401, -0.03156282]],
      dtype=float32), {})
Change logs:
  • v0.15.0 - Initially added as FrameStack with support for lz4

  • v1.0.0 - Renamed to FrameStackObservation; removed lz4 and LazyFrame support and added the padding_type parameter

Parameters:
  • env – The environment to apply the wrapper

  • stack_size – The number of frames to stack.

  • padding_type – The padding type to use when stacking the observations, options: “reset”, “zero”, custom obs
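The rolling stack and its padding options can be sketched with a bounded deque. FrameStacker is a hypothetical helper that assumes flat list observations; it illustrates the behaviour shown in the examples rather than the wrapper's implementation.

```python
from collections import deque


class FrameStacker:
    """Hypothetical rolling stack with "reset", "zero" or custom padding."""

    def __init__(self, stack_size, padding_type="reset"):
        self.stack_size = stack_size
        self.padding_type = padding_type
        self.frames = deque(maxlen=stack_size)

    def reset(self, first_obs):
        if self.padding_type == "reset":
            padding = first_obs
        elif self.padding_type == "zero":
            padding = [0.0] * len(first_obs)
        else:
            padding = self.padding_type  # custom padding observation
        self.frames.clear()
        for _ in range(self.stack_size - 1):
            self.frames.append(list(padding))
        self.frames.append(list(first_obs))
        return list(self.frames)

    def step(self, obs):
        # maxlen makes the oldest frame drop out automatically.
        self.frames.append(list(obs))
        return list(self.frames)


stacker = FrameStacker(stack_size=3, padding_type="zero")
after_reset = stacker.reset([1.0, 2.0])
after_step = stacker.step([3.0, 4.0])
```

The newest observation is always the last row, so padding only appears at the start of an episode.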

class gymnasium.wrappers.GrayscaleObservation(env: Env[ObsType, ActType], keep_dim: bool = False)

Converts an image observation computed by reset and step from RGB to Grayscale.

If keep_dim is True, the channel dimension is kept as a singleton dimension.

A vector version of the wrapper exists: gymnasium.wrappers.vector.GrayscaleObservation.

Example

>>> import gymnasium as gym
>>> from gymnasium.wrappers import GrayscaleObservation
>>> env = gym.make("CarRacing-v2")
>>> env.observation_space.shape
(96, 96, 3)
>>> grayscale_env = GrayscaleObservation(env)
>>> grayscale_env.observation_space.shape
(96, 96)
>>> grayscale_env = GrayscaleObservation(env, keep_dim=True)
>>> grayscale_env.observation_space.shape
(96, 96, 1)
Change logs:
  • v0.15.0 - Initially added, originally called GrayScaleObservation

  • v1.0.0 - Renamed to GrayscaleObservation

Parameters:
  • env – The environment to wrap

  • keep_dim – Whether to keep the channel dimension in the observation. If True, len(obs.shape) == 3; otherwise len(obs.shape) == 2
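The effect of keep_dim on the observation shape can be sketched on a tiny image. The luma weights below are an assumption chosen for illustration and may differ from the coefficients the wrapper uses; to_grayscale is a hypothetical helper operating on nested lists.

```python
LUMA_WEIGHTS = (0.299, 0.587, 0.114)  # assumed weights, for illustration only


def to_grayscale(image, keep_dim=False):
    """Convert an RGB image (rows of (r, g, b) pixels) to grayscale."""
    gray = []
    for row in image:
        gray_row = []
        for pixel in row:
            value = sum(w * c for w, c in zip(LUMA_WEIGHTS, pixel))
            # keep_dim=True keeps a trailing channel axis of length one.
            gray_row.append([value] if keep_dim else value)
        gray.append(gray_row)
    return gray


image = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (0, 0, 0)]]  # shape (2, 2, 3)
gray_2d = to_grayscale(image)                 # shape (2, 2)
gray_3d = to_grayscale(image, keep_dim=True)  # shape (2, 2, 1)
```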

class gymnasium.wrappers.MaxAndSkipObservation(env: Env[ObsType, ActType], skip: int = 4)

Repeats the action for skip steps and returns the element-wise maximum of the last two observations.

No vector version of the wrapper exists.

Example

>>> import gymnasium as gym
>>> import numpy as np
>>> from gymnasium.wrappers import MaxAndSkipObservation
>>> env = gym.make("CartPole-v1")
>>> obs0, *_ = env.reset(seed=123)
>>> obs1, *_ = env.step(1)
>>> obs2, *_ = env.step(1)
>>> obs3, *_ = env.step(1)
>>> obs4, *_ = env.step(1)
>>> skip_and_max_obs = np.max(np.stack([obs3, obs4], axis=0), axis=0)
>>> env = gym.make("CartPole-v1")
>>> wrapped_env = MaxAndSkipObservation(env)
>>> wrapped_obs0, *_ = wrapped_env.reset(seed=123)
>>> wrapped_obs1, *_ = wrapped_env.step(1)
>>> np.all(obs0 == wrapped_obs0)
True
>>> np.all(wrapped_obs1 == skip_and_max_obs)
True
Change logs:
  • v1.0.0 - Initially added

Parameters:
  • env (Env) – The environment to apply the wrapper

  • skip – The number of frames to skip
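The max-and-skip step can be sketched as a loop over the underlying step function. max_and_skip_step and fake_step are hypothetical; summing the rewards and ignoring termination are assumptions of this sketch, not statements about the wrapper.

```python
def max_and_skip_step(step_fn, action, skip=4):
    """Repeat `action` for `skip` steps and max over the last two observations."""
    total_reward = 0.0
    last_two = []
    for _ in range(skip):
        obs, reward = step_fn(action)
        total_reward += reward  # assumption: rewards are accumulated
        last_two = (last_two + [obs])[-2:]
    # Element-wise maximum across the retained observations.
    max_obs = [max(values) for values in zip(*last_two)]
    return max_obs, total_reward


observations = iter([[1.0, 9.0], [2.0, 8.0], [3.0, 7.0], [4.0, 6.0]])


def fake_step(action):
    """Stand-in for env.step that yields scripted observations."""
    return next(observations), 1.0


max_obs, total_reward = max_and_skip_step(fake_step, action=1)
```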

class gymnasium.wrappers.NormalizeObservation(env: Env[ObsType, ActType], epsilon: float = 1e-8)

Normalizes observations to be centered at the mean with unit variance.

The update_running_mean property freezes or resumes the running-mean calculation of the observation statistics. If True (default), the RunningMeanStd is updated every time step() or reset() is called. If False, the calculated statistics are used but no longer updated; this may be useful during evaluation.

A vector version of the wrapper exists: gymnasium.wrappers.vector.NormalizeObservation.

Note

The normalization depends on past trajectories and observations will not be normalized correctly if the wrapper was newly instantiated or the policy was changed recently.

Example

>>> import numpy as np
>>> import gymnasium as gym
>>> from gymnasium.wrappers import NormalizeObservation
>>> env = gym.make("CartPole-v1")
>>> obs, info = env.reset(seed=123)
>>> term, trunc = False, False
>>> while not (term or trunc):
...     obs, _, term, trunc, _ = env.step(1)
...
>>> obs
array([ 0.1511158 ,  1.7183299 , -0.25533703, -2.8914354 ], dtype=float32)
>>> env = gym.make("CartPole-v1")
>>> env = NormalizeObservation(env)
>>> obs, info = env.reset(seed=123)
>>> term, trunc = False, False
>>> while not (term or trunc):
...     obs, _, term, trunc, _ = env.step(1)
...
>>> obs
array([ 2.0059888,  1.5676788, -1.9944268, -1.6120394], dtype=float32)
Change logs:
  • v0.21.0 - Initially added

  • v1.0.0 - Add update_running_mean attribute to allow disabling updates of the running mean / standard deviation, particularly useful at evaluation time.

Parameters:
  • env (Env) – The environment to apply the wrapper

  • epsilon – A stability parameter that is used when scaling the observations.
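The role of epsilon and of freezing the statistics can be sketched for a scalar observation. RunningStats is a hypothetical tracker based on Welford's algorithm and is not the library's RunningMeanStd; the update flag plays the part of update_running_mean.

```python
import math


class RunningStats:
    """Hypothetical scalar running mean/variance tracker (Welford's algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def var(self):
        return self.m2 / self.count if self.count else 1.0


def normalize(x, stats, epsilon=1e-8, update=True):
    """Centre and scale x; epsilon keeps the division stable when var is ~0."""
    if update:
        stats.update(x)
    return (x - stats.mean) / math.sqrt(stats.var + epsilon)


stats = RunningStats()
for value in [2.0, 4.0, 6.0]:
    normalize(value, stats)

# Evaluation-style call: the statistics are used but left untouched.
frozen = normalize(8.0, stats, update=False)
```

Early outputs depend on very few samples, which is why a freshly created wrapper does not normalize well.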

class gymnasium.wrappers.AddRenderObservation(env: Env[ObsType, ActType], render_only: bool = True, render_key: str = 'pixels', obs_key: str = 'state')

Includes the rendered observations in the environment’s observations.

Notes

This was previously called PixelObservationWrapper.

No vector version of the wrapper exists.

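Example - Replace the observation with the rendered image:
>>> import gymnasium as gym
>>> import numpy as np
>>> from gymnasium.wrappers import AddRenderObservation
>>> env = gym.make("CartPole-v1", render_mode="rgb_array")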
>>> env = AddRenderObservation(env, render_only=True)
>>> env.observation_space
Box(0, 255, (400, 600, 3), uint8)
>>> obs, _ = env.reset(seed=123)
>>> image = env.render()
>>> np.all(obs == image)
True
>>> obs, *_ = env.step(env.action_space.sample())
>>> image = env.render()
>>> np.all(obs == image)
True
Example - Add the rendered image to the original observation as a dictionary item:
>>> env = gym.make("CartPole-v1", render_mode="rgb_array")
>>> env = AddRenderObservation(env, render_only=False)
>>> env.observation_space
Dict('pixels': Box(0, 255, (400, 600, 3), uint8), 'state': Box([-4.8000002e+00 -3.4028235e+38 -4.1887903e-01 -3.4028235e+38], [4.8000002e+00 3.4028235e+38 4.1887903e-01 3.4028235e+38], (4,), float32))
>>> obs, info = env.reset(seed=123)
>>> obs.keys()
dict_keys(['state', 'pixels'])
>>> obs["state"]
array([ 0.01823519, -0.0446179 , -0.02796401, -0.03156282], dtype=float32)
>>> np.all(obs["pixels"] == env.render())
True
>>> obs, reward, terminates, truncates, info = env.step(env.action_space.sample())
>>> image = env.render()
>>> np.all(obs["pixels"] == image)
True
Change logs:
  • v0.15.0 - Initially added as PixelObservationWrapper

  • v1.0.0 - Renamed to AddRenderObservation

Parameters:
  • env – The environment to wrap.

  • render_only (bool) – If True (default), the original observation returned by the wrapped environment is discarded and the observation is only the rendered image. If False, the observation is a dictionary containing both the original observation (under obs_key) and the rendered image (under render_key).

  • render_key – Optional custom string specifying the pixel key. Defaults to “pixels”.

  • obs_key – Optional custom string specifying the obs key. Defaults to “state”.

class gymnasium.wrappers.ResizeObservation(env: Env[ObsType, ActType], shape: tuple[int, int])

Resizes image observations using OpenCV to a specified shape.

A vector version of the wrapper exists: gymnasium.wrappers.vector.ResizeObservation.

Example

>>> import gymnasium as gym
>>> from gymnasium.wrappers import ResizeObservation
>>> env = gym.make("CarRacing-v2")
>>> env.observation_space.shape
(96, 96, 3)
>>> resized_env = ResizeObservation(env, (32, 32))
>>> resized_env.observation_space.shape
(32, 32, 3)
Change logs:
  • v0.12.6 - Initially added

  • v1.0.0 - Requires shape with a tuple of two integers

Parameters:
  • env – The environment to wrap

  • shape – The resized observation shape

class gymnasium.wrappers.ReshapeObservation(env: gym.Env[ObsType, ActType], shape: int | tuple[int, ...])

Reshapes array-based observations to a specified shape.

A vector version of the wrapper exists: gymnasium.wrappers.vector.ReshapeObservation.

Example

>>> import gymnasium as gym
>>> from gymnasium.wrappers import ReshapeObservation
>>> env = gym.make("CarRacing-v2")
>>> env.observation_space.shape
(96, 96, 3)
>>> reshape_env = ReshapeObservation(env, (24, 4, 96, 1, 3))
>>> reshape_env.observation_space.shape
(24, 4, 96, 1, 3)
Change logs:
  • v1.0.0 - Initially added

Parameters:
  • env – The environment to wrap

  • shape – The new shape of the observations
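Reshaping only rearranges existing elements, so the new shape needs the same total number of elements as the original. The check below is a minimal sketch of that constraint; can_reshape is a hypothetical helper and does not handle an integer shape.

```python
import math


def can_reshape(old_shape, new_shape):
    """A reshape is only possible when the element counts match."""
    return math.prod(old_shape) == math.prod(new_shape)


# The shape from the example: 24 * 4 * 96 * 1 * 3 == 96 * 96 * 3.
valid = can_reshape((96, 96, 3), (24, 4, 96, 1, 3))

# Changing the number of pixels is a resize, not a reshape.
invalid = can_reshape((96, 96, 3), (32, 32, 3))
```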

class gymnasium.wrappers.RescaleObservation(env: gym.Env[ObsType, ActType], min_obs: np.floating | np.integer | np.ndarray, max_obs: np.floating | np.integer | np.ndarray)

Affinely (linearly) rescales a Box observation space of the environment to within the range of [min_obs, max_obs].

A vector version of the wrapper exists: gymnasium.wrappers.vector.RescaleObservation.

Example

>>> import gymnasium as gym
>>> import numpy as np
>>> from gymnasium.wrappers import RescaleObservation
>>> env = gym.make("Pendulum-v1")
>>> env.observation_space
Box([-1. -1. -8.], [1. 1. 8.], (3,), float32)
>>> env = RescaleObservation(env, np.array([-2, -1, -10], dtype=np.float32), np.array([1, 0, 1], dtype=np.float32))
>>> env.observation_space
Box([ -2.  -1. -10.], [1. 0. 1.], (3,), float32)
Change logs:
  • v1.0.0 - Initially added

Parameters:
  • env – The environment to wrap

  • min_obs – The new minimum observation bound

  • max_obs – The new maximum observation bound
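An affine rescaling maps the old lower bound to min_obs and the old upper bound to max_obs, element by element. The sketch below applies that map to the Pendulum bounds from the example; rescale is a hypothetical helper that assumes finite bounds and uses lists in place of arrays.

```python
def rescale(obs, low, high, min_obs, max_obs):
    """Map each element affinely from [low, high] to [min_obs, max_obs]."""
    return [
        new_lo + (x - lo) * (new_hi - new_lo) / (hi - lo)
        for x, lo, hi, new_lo, new_hi in zip(obs, low, high, min_obs, max_obs)
    ]


low, high = [-1.0, -1.0, -8.0], [1.0, 1.0, 8.0]
min_obs, max_obs = [-2.0, -1.0, -10.0], [1.0, 0.0, 1.0]

at_low = rescale(low, low, high, min_obs, max_obs)    # old lower bound
at_high = rescale(high, low, high, min_obs, max_obs)  # old upper bound
at_mid = rescale([0.0, 0.0, 0.0], low, high, min_obs, max_obs)
```

The midpoint of the old range lands on the midpoint of the new range, as expected for an affine map.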

class gymnasium.wrappers.TimeAwareObservation(env: Env[ObsType, ActType], flatten: bool = True, normalize_time: bool = False, *, dict_time_key: str = 'time')

Augment the observation with the number of time steps taken within an episode.

If normalize_time is True, time is represented as a normalized value in the range [0, 1]; if False, it is the current timestep as an integer.

For environments with Dict observation spaces, the time information is automatically added in the key “time” (can be changed through dict_time_key) and for environments with Tuple observation space, the time information is added as the final element in the tuple. Otherwise, the observation space is transformed into a Dict observation space with two keys, “obs” for the base environment’s observation and “time” for the time information.

To flatten the observation, use the flatten parameter which will use the gymnasium.spaces.utils.flatten() function.

No vector version of the wrapper exists.

Example

>>> import gymnasium as gym
>>> from gymnasium.wrappers import TimeAwareObservation
>>> env = gym.make("CartPole-v1")
>>> env = TimeAwareObservation(env)
>>> env.observation_space
Box([-4.80000019e+00 -3.40282347e+38 -4.18879032e-01 -3.40282347e+38
  0.00000000e+00], [4.80000019e+00 3.40282347e+38 4.18879032e-01 3.40282347e+38
 5.00000000e+02], (5,), float64)
>>> env.reset(seed=42)[0]
array([ 0.0273956 , -0.00611216,  0.03585979,  0.0197368 ,  0.        ])
>>> _ = env.action_space.seed(42)
>>> env.step(env.action_space.sample())[0]
array([ 0.02727336, -0.20172954,  0.03625453,  0.32351476,  1.        ])
Normalize time observation space example:
>>> env = gym.make('CartPole-v1')
>>> env = TimeAwareObservation(env, normalize_time=True)
>>> env.observation_space
Box([-4.8000002e+00 -3.4028235e+38 -4.1887903e-01 -3.4028235e+38
  0.0000000e+00], [4.8000002e+00 3.4028235e+38 4.1887903e-01 3.4028235e+38 1.0000000e+00], (5,), float32)
>>> env.reset(seed=42)[0]
array([ 0.0273956 , -0.00611216,  0.03585979,  0.0197368 ,  0.        ],
      dtype=float32)
>>> _ = env.action_space.seed(42)
>>> env.step(env.action_space.sample())[0]
array([ 0.02727336, -0.20172954,  0.03625453,  0.32351476,  0.002     ],
      dtype=float32)
Unflattened (Dict) observation space example:
>>> env = gym.make("CartPole-v1")
>>> env = TimeAwareObservation(env, flatten=False)
>>> env.observation_space
Dict('obs': Box([-4.8000002e+00 -3.4028235e+38 -4.1887903e-01 -3.4028235e+38], [4.8000002e+00 3.4028235e+38 4.1887903e-01 3.4028235e+38], (4,), float32), 'time': Box(0, 500, (1,), int32))
>>> env.reset(seed=42)[0]
{'obs': array([ 0.0273956 , -0.00611216,  0.03585979,  0.0197368 ], dtype=float32), 'time': array([0], dtype=int32)}
>>> _ = env.action_space.seed(42)
>>> env.step(env.action_space.sample())[0]
{'obs': array([ 0.02727336, -0.20172954,  0.03625453,  0.32351476], dtype=float32), 'time': array([1], dtype=int32)}
Change logs:
  • v0.18.0 - Initially added

  • v1.0.0 - Remove vector environment support, add flatten and normalize_time parameters

Parameters:
  • env – The environment to apply the wrapper

  • flatten – Flatten the observation to a Box of a single dimension

  • normalize_time – If True, return time in the range [0, 1]; otherwise return time as the integer number of timesteps taken in the episode

  • dict_time_key – For environment with a Dict observation space, the key for the time space. By default, “time”.
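The time feature in the flattened examples can be reproduced by hand. The sketch assumes a limit of 500 timesteps, matching the upper bound shown for CartPole-v1; append_time is a hypothetical helper and the observation values are shortened for readability.

```python
def append_time(obs, elapsed_steps, max_timesteps, normalize_time=False):
    """Return obs with the time feature appended as the final element."""
    if normalize_time:
        time = elapsed_steps / max_timesteps
    else:
        time = elapsed_steps
    return list(obs) + [time]


obs = [0.027, -0.201, 0.036, 0.323]

# After one step: integer time 1, or 1 / 500 when normalized.
raw = append_time(obs, elapsed_steps=1, max_timesteps=500)
normalized = append_time(obs, elapsed_steps=1, max_timesteps=500, normalize_time=True)
```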