Action Wrappers

Base Class

class gymnasium.ActionWrapper(env: Env[ObsType, ActType])

Superclass of wrappers that can modify the action before step().

If you would like to apply a function to the action before passing it to the base environment, inherit from ActionWrapper and override the method action() to implement that transformation. The transformation defined in that method must return values that lie in the base environment’s action space. However, its domain might differ from the original action space. In that case, you need to specify the new action space of the wrapper by setting action_space in the __init__() method of your wrapper.

Among others, Gymnasium provides the action wrappers gymnasium.wrappers.ClipAction and gymnasium.wrappers.RescaleAction for clipping and rescaling actions.

Parameters: env – Environment to be wrapped.

action(action: WrapperActType) → ActType

Returns a modified action before step() is called.

Parameters: action – The original step() actions
Returns: The modified actions

Available Action Wrappers

class gymnasium.wrappers.TransformAction(env: Env[ObsType, ActType], func: Callable[[WrapperActType], ActType], action_space: Space[WrapperActType] | None)

Applies a function to the action before passing the modified value to the environment step function.

A vector version of the wrapper exists: gymnasium.wrappers.vector.TransformAction.

Example

>>> import numpy as np
>>> import gymnasium as gym
>>> from gymnasium.wrappers import TransformAction
>>> env = gym.make("MountainCarContinuous-v0")
>>> _ = env.reset(seed=123)
>>> obs, *_= env.step(np.array([0.0, 1.0]))
>>> obs
array([-4.6397772e-01, -4.4808415e-04], dtype=float32)
>>> env = gym.make("MountainCarContinuous-v0")
>>> env = TransformAction(env, lambda a: 0.5 * a + 0.1, env.action_space)
>>> _ = env.reset(seed=123)
>>> obs, *_= env.step(np.array([0.0, 1.0]))
>>> obs
array([-4.6382770e-01, -2.9808417e-04], dtype=float32)

Change logs:

v1.0.0 - Initially added

Parameters:

env – The environment to wrap
func – Function to apply to the step()’s action
action_space – The updated action space of the wrapper given the function.

class gymnasium.wrappers.ClipAction(env: Env[ObsType, ActType])

Clips the action passed to step() so that it lies within the environment’s action_space.

A vector version of the wrapper exists: gymnasium.wrappers.vector.ClipAction.

Example

>>> import gymnasium as gym
>>> from gymnasium.wrappers import ClipAction
>>> import numpy as np
>>> env = gym.make("Hopper-v4", disable_env_checker=True)
>>> env = ClipAction(env)
>>> env.action_space
Box(-inf, inf, (3,), float32)
>>> _ = env.reset(seed=42)
>>> _ = env.step(np.array([5.0, -2.0, 0.0], dtype=np.float32))
... # Executes the action np.array([1.0, -1.0, 0]) in the base environment

Change logs:

v0.12.6 - Initially added
v1.0.0 - The wrapper's action space is updated to infinite bounds, which is technically correct because any action is accepted and then clipped

Parameters: env – The environment to wrap
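
The clipping itself amounts to an element-wise clamp against the base action space's bounds. The sketch below reproduces the arithmetic with NumPy alone, assuming bounds of [-1, 1] in each of three dimensions (as in the Hopper example above); it illustrates the operation and is not the wrapper's source code.

```python
import numpy as np

# Assumed bounds of the base environment's Box action space.
low = np.array([-1.0, -1.0, -1.0], dtype=np.float32)
high = np.array([1.0, 1.0, 1.0], dtype=np.float32)

requested = np.array([5.0, -2.0, 0.0], dtype=np.float32)
# Each component is clamped independently to its [low, high] interval.
executed = np.clip(requested, low, high)
```

Here executed holds the values 1.0, -1.0 and 0.0, which is the action the base environment receives.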

class gymnasium.wrappers.RescaleAction(env: Env[ObsType, ActType], min_action: floating | integer | ndarray, max_action: floating | integer | ndarray)

Affinely (linearly) rescales the environment’s Box action space to the range [min_action, max_action].

The base environment env must have an action space of type spaces.Box. If min_action or max_action is a NumPy array, its shape must match the shape of the environment’s action space.

A vector version of the wrapper exists: gymnasium.wrappers.vector.RescaleAction.

Example

>>> import gymnasium as gym
>>> from gymnasium.wrappers import RescaleAction
>>> import numpy as np
>>> env = gym.make("Hopper-v4", disable_env_checker=True)
>>> _ = env.reset(seed=42)
>>> obs, _, _, _, _ = env.step(np.array([1, 1, 1], dtype=np.float32))
>>> _ = env.reset(seed=42)
>>> min_action = -0.5
>>> max_action = np.array([0.0, 0.5, 0.75], dtype=np.float32)
>>> wrapped_env = RescaleAction(env, min_action=min_action, max_action=max_action)
>>> wrapped_env_obs, _, _, _, _ = wrapped_env.step(max_action)
>>> np.all(obs == wrapped_env_obs)
np.True_

Change logs:

v0.15.4 - Initially added

Parameters:

env (Env) – The environment to wrap
min_action (float, int or np.ndarray) – The min values for each action. This may be a numpy array or a scalar.
max_action (float, int or np.ndarray) – The max values for each action. This may be a numpy array or a scalar.
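
The affine map behind this wrapper sends min_action to the base space's lower bound and max_action to its upper bound. The sketch below writes that map out with NumPy, assuming base bounds of [-1, 1] per dimension as in the Hopper example; rescale_action is a hypothetical helper and not necessarily how the library computes it internally.

```python
import numpy as np


def rescale_action(action, min_action, max_action, low, high):
    """Affinely map `action` from [min_action, max_action] onto [low, high]."""
    action = np.asarray(action, dtype=np.float64)
    # Position of the action inside the wrapper's range, between 0 and 1.
    fraction = (action - min_action) / (max_action - min_action)
    return low + (high - low) * fraction


# Assumed base bounds of [-1, 1] in each of three dimensions.
low = np.full(3, -1.0)
high = np.full(3, 1.0)
min_action = -0.5  # scalar, broadcast across all dimensions
max_action = np.array([0.0, 0.5, 0.75])

at_max = rescale_action(max_action, min_action, max_action, low, high)
at_min = rescale_action(np.full(3, min_action), min_action, max_action, low, high)
at_mid = rescale_action((min_action + max_action) / 2, min_action, max_action, low, high)
```

Under these assumptions at_max equals the upper bound in every dimension, which is consistent with the example above, where stepping the wrapped environment with max_action matches stepping the base environment with np.array([1, 1, 1]).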

class gymnasium.wrappers.StickyAction(env: Env[ObsType, ActType], repeat_action_probability: float, repeat_action_duration: int | tuple[int, int] = 1)

Adds a probability that the previous action is repeated in place of the action passed to step().

This wrapper follows the implementation proposed by Machado et al., 2018 in Section 5.2 on page 12, and adds the possibility to repeat the action for more than one step.

No vector version of the wrapper exists.

Example

>>> import gymnasium as gym
>>> from gymnasium.wrappers import StickyAction
>>> env = gym.make("CartPole-v1")
>>> env = StickyAction(env, repeat_action_probability=0.9)
>>> env.reset(seed=123)
(array([ 0.01823519, -0.0446179 , -0.02796401, -0.03156282], dtype=float32), {})
>>> env.step(1)
(array([ 0.01734283,  0.15089367, -0.02859527, -0.33293587], dtype=float32), 1.0, False, False, {})
>>> env.step(0)
(array([ 0.0203607 ,  0.34641072, -0.03525399, -0.6344974 ], dtype=float32), 1.0, False, False, {})
>>> env.step(1)
(array([ 0.02728892,  0.5420062 , -0.04794393, -0.9380709 ], dtype=float32), 1.0, False, False, {})
>>> env.step(0)
(array([ 0.03812904,  0.34756234, -0.06670535, -0.6608303 ], dtype=float32), 1.0, False, False, {})

Change logs:

v1.0.0 - Initially added
v1.1.0 - Add repeat_action_duration argument for dynamic number of sticky actions

Parameters:

env (Env) – The environment to wrap
repeat_action_probability (int | float) – The probability of repeating the previous action
repeat_action_duration (int | tuple[int, int]) – The number of steps for which the action is repeated. Either an int (a deterministic number of repeats) or a tuple[int, int] (a range from which a stochastic number of repeats is drawn).
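
The basic sticky-action idea can be sketched in plain Python: at each step, with probability repeat_action_probability, the previously executed action replaces the requested one. The function below is a simplified illustration that covers only single-step repeats; sticky_actions and its seed argument are hypothetical names, and the sketch does not reproduce the wrapper's random number generator or its repeat_action_duration handling.

```python
import random


def sticky_actions(requested, repeat_action_probability, seed=0):
    """Return the actions actually executed for a sequence of requested actions."""
    rng = random.Random(seed)
    executed = []
    last = None
    for action in requested:
        # With the given probability, ignore the request and reuse the last action.
        if last is not None and rng.random() < repeat_action_probability:
            action = last
        executed.append(action)
        last = action
    return executed


never_sticky = sticky_actions([1, 0, 1, 0], repeat_action_probability=0.0)
always_sticky = sticky_actions([1, 0, 1, 0], repeat_action_probability=1.0)
```

With a probability of 0.0 the requested actions pass through unchanged, and with 1.0 the first action is repeated at every later step.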