Wrappers#
Module of wrapper classes.
Wrappers are a convenient way to modify an existing environment without having to alter the underlying code directly.
Using wrappers will allow you to avoid a lot of boilerplate code and make your environment more modular. Wrappers can
also be chained to combine their effects.
Most environments that are generated via gymnasium.make() will already be wrapped by default.
In order to wrap an environment, you must first initialize a base environment. Then you can pass this environment along with (possibly optional) parameters to the wrapper’s constructor.
>>> import gymnasium as gym
>>> from gymnasium.wrappers import RescaleAction
>>> base_env = gym.make("BipedalWalker-v3")
>>> base_env.action_space
Box([-1. -1. -1. -1.], [1. 1. 1. 1.], (4,), float32)
>>> wrapped_env = RescaleAction(base_env, min_action=0, max_action=1)
>>> wrapped_env.action_space
Box([0. 0. 0. 0.], [1. 1. 1. 1.], (4,), float32)
You can access the environment underneath the first wrapper by using the gymnasium.Wrapper.env attribute. Because the gymnasium.Wrapper class inherits from gymnasium.Env, gymnasium.Wrapper.env can itself be another wrapper.
>>> wrapped_env
<RescaleAction<TimeLimit<OrderEnforcing<BipedalWalker<BipedalWalker-v3>>>>>
>>> wrapped_env.env
<TimeLimit<OrderEnforcing<BipedalWalker<BipedalWalker-v3>>>>
If you want to get to the environment underneath all of the layers of wrappers, you can use the gymnasium.Wrapper.unwrapped attribute. If the environment is already a bare environment, the unwrapped attribute will just return the environment itself.
>>> wrapped_env
<RescaleAction<TimeLimit<OrderEnforcing<BipedalWalker<BipedalWalker-v3>>>>>
>>> wrapped_env.unwrapped
<gymnasium.envs.box2d.bipedal_walker.BipedalWalker object at 0x7f87d70712d0>
There are three common things you might want a wrapper to do:
- Transform actions before applying them to the base environment
- Transform observations that are returned by the base environment
- Transform rewards that are returned by the base environment
Such wrappers can be easily implemented by inheriting from gymnasium.ActionWrapper, gymnasium.ObservationWrapper, or gymnasium.RewardWrapper and implementing the respective transformation.
If you need a wrapper to do more complicated tasks, you can inherit from the gymnasium.Wrapper class directly.
If you’d like to implement your own custom wrapper, check out the corresponding tutorial.
gymnasium.Wrapper#
- class gymnasium.Wrapper(env: Env[ObsType, ActType])#
Wraps a gymnasium.Env to allow a modular transformation of the step() and reset() methods.
This class is the base class of all wrappers to change the behavior of the underlying environment. Wrappers that inherit from this class can modify the action_space, observation_space, reward_range and metadata attributes, without changing the underlying environment’s attributes. Moreover, the behavior of the step() and reset() methods can be changed by these wrappers.
Some attributes (spec, render_mode, np_random) will point back to the wrapper’s environment (i.e. to the corresponding attributes of env).
Note
If you inherit from Wrapper, don’t forget to call super().__init__(env)
- Parameters:
env – The environment to wrap
Methods#
- gymnasium.Wrapper.step(self, action: WrapperActType) -> tuple[WrapperObsType, SupportsFloat, bool, bool, dict[str, Any]]#
Uses the step() of the env; this method can be overwritten to change the returned data.
- gymnasium.Wrapper.reset(self, *, seed: int | None = None, options: dict[str, Any] | None = None) -> tuple[WrapperObsType, dict[str, Any]]#
Uses the reset() of the env; this method can be overwritten to change the returned data.
- gymnasium.Wrapper.close(self)#
Closes the wrapper and env.
Attributes#
- property Wrapper.action_space: spaces.Space[ActType] | spaces.Space[WrapperActType]#
Returns the Env action_space unless it has been overwritten, in which case the wrapper action_space is used.
- property Wrapper.observation_space: spaces.Space[ObsType] | spaces.Space[WrapperObsType]#
Returns the Env observation_space unless it has been overwritten, in which case the wrapper observation_space is used.
- property Wrapper.reward_range: tuple[SupportsFloat, SupportsFloat]#
Returns the Env reward_range unless it has been overwritten, in which case the wrapper reward_range is used.
- gymnasium.Wrapper.env#
The environment one level underneath this wrapper.
This may itself be a wrapped environment. To obtain the environment underneath all layers of wrappers, use gymnasium.Wrapper.unwrapped.
- property Wrapper.unwrapped: Env[ObsType, ActType]#
Returns the base environment of the wrapper.
This will be the bare gymnasium.Env environment, underneath all layers of wrappers.
Gymnasium Wrappers#
Gymnasium provides a number of commonly used wrappers, listed below. More information on a particular wrapper can be found on the page for its wrapper type.
| Name | Type | Description |
|---|---|---|
| | Misc Wrapper | Implements the common preprocessing applied to Atari environments |
| | Misc Wrapper | The wrapped environment will automatically reset when the terminated or truncated state is reached. |
| | Action Wrapper | Clip the continuous action to the valid bound specified by the environment’s action_space |
| | Misc Wrapper | Provides compatibility for environments written in the OpenAI Gym v0.21 API to look like Gymnasium environments |
| | Observation Wrapper | Filters a dictionary observation space to only the requested keys |
| | Observation Wrapper | An observation wrapper that flattens the observation |
| | Observation Wrapper | An observation wrapper that stacks the observations in a rolling manner. |
| | Observation Wrapper | Convert the image observation from RGB to gray scale. |
| | Misc Wrapper | Allows human-like rendering for environments that support “rgb_array” rendering |
| | Observation Wrapper | This wrapper will normalize observations s.t. each coordinate is centered with unit variance. |
| | Reward Wrapper | This wrapper will normalize immediate rewards s.t. their exponential moving average has a fixed variance. |
| | Misc Wrapper | This will produce an error if step or render is called before reset |
| | Observation Wrapper | Augments observations with pixel values obtained via render; these can be added to or replace the environment’s observation. |
| | Misc Wrapper | This will keep track of cumulative rewards and episode lengths, returning them at the end. |
| | Misc Wrapper | This wrapper will record videos of rollouts. |
| | Misc Wrapper | Enables list versions of render modes, i.e. “rgb_array_list” for “rgb_array”, such that the renderings for each step are saved in a list until render is called. |
| | Action Wrapper | Rescales the continuous action space of the environment to a range [min_action, max_action], where min_action and max_action are numpy arrays or floats. |
| | Observation Wrapper | This wrapper works on environments with image observations (or more generally observations of shape AxBxC) and resizes the observation to the shape given by the tuple shape. |
| | Misc Wrapper | Modifies an environment step function from (old) done to the (new) termination / truncation API. |
| | Observation Wrapper | Augment the observation with the current time step in the trajectory (by appending it to the observation). |
| | Misc Wrapper | This wrapper will emit a truncated signal if the specified number of steps is exceeded in an episode. |
| | Observation Wrapper | This wrapper will apply a function to observations |
| | Reward Wrapper | This wrapper will apply a function to rewards |
| | Misc Wrapper | This wrapper will convert the info of a vectorized environment from the dict format to a list of dictionaries where the i-th dictionary contains info of the i-th environment. |