Action Wrappers
Base Class
- class gymnasium.ActionWrapper(env: Env[ObsType, ActType])

  Superclass of wrappers that can modify the action before step() is called.

  If you would like to apply a function to the action before passing it to the base environment, you can simply inherit from ActionWrapper and overwrite the method action() to implement that transformation. The transformation defined in that method must take values in the base environment's action space. However, its domain might differ from the original action space. In that case, you need to specify the new action space of the wrapper by setting action_space in the __init__() method of your wrapper; a minimal sketch of such a wrapper follows below.

  Among others, Gymnasium provides the action wrappers gymnasium.wrappers.ClipAction and gymnasium.wrappers.RescaleAction for clipping and rescaling actions.

  - Parameters:
    env – Environment to be wrapped.
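  For concreteness, here is a minimal sketch of such a subclass. The wrapper name DiscreteActions and the index-to-action mapping are illustrative, not part of Gymnasium; the pattern of overriding action() and setting action_space in __init__() is the point.

  import numpy as np
  import gymnasium as gym
  from gymnasium.spaces import Discrete

  class DiscreteActions(gym.ActionWrapper):
      """Illustrative wrapper: exposes a Discrete space whose indices map
      onto continuous actions of the base environment."""

      def __init__(self, env, mapping):
          super().__init__(env)
          self._mapping = mapping
          # The wrapper accepts different actions than the base environment,
          # so the new action space is declared here.
          self.action_space = Discrete(len(mapping))

      def action(self, act):
          # Must return a value inside the *base* environment's action space.
          return self._mapping[int(act)]

  env = DiscreteActions(
      gym.make("MountainCarContinuous-v0"),
      {0: np.array([-1.0], dtype=np.float32), 1: np.array([1.0], dtype=np.float32)},
  )
  obs, info = env.reset(seed=0)
  obs, reward, terminated, truncated, info = env.step(1)  # applies np.array([1.0]) underneath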
Available Action Wrappers
- class gymnasium.wrappers.TransformAction(env: gym.Env[ObsType, ActType], func: Callable[[WrapperActType], ActType], action_space: Space[WrapperActType] | None)

  Applies a function to the action before passing the modified value to the environment's step function.

  A vector version of the wrapper exists: gymnasium.wrappers.vector.TransformAction.

  Example

  >>> import numpy as np
  >>> import gymnasium as gym
  >>> from gymnasium.wrappers import TransformAction
  >>> env = gym.make("MountainCarContinuous-v0")
  >>> _ = env.reset(seed=123)
  >>> obs, *_ = env.step(np.array([0.0, 1.0]))
  >>> obs
  array([-4.6397772e-01, -4.4808415e-04], dtype=float32)
  >>> env = gym.make("MountainCarContinuous-v0")
  >>> env = TransformAction(env, lambda a: 0.5 * a + 0.1, env.action_space)
  >>> _ = env.reset(seed=123)
  >>> obs, *_ = env.step(np.array([0.0, 1.0]))
  >>> obs
  array([-4.6382770e-01, -2.9808417e-04], dtype=float32)
  - Change logs:
    v1.0.0 - Initially added
  - Parameters:
    env – The environment to wrap
    func – Function to apply to step()'s action
    action_space – The updated action space of the wrapper, given the function.
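  When func changes the set of actions the agent should produce, pass the wrapper's new space via the action_space argument. A hedged sketch (the [0, 1] agent-side range and the rescaling lambda are illustrative choices, not defaults):

  import numpy as np
  import gymnasium as gym
  from gymnasium.spaces import Box
  from gymnasium.wrappers import TransformAction

  env = gym.make("MountainCarContinuous-v0")  # base action space: Box(-1.0, 1.0, (1,), float32)
  low, high = env.action_space.low, env.action_space.high

  # The agent acts in [0, 1]; func maps that linearly onto the base env's [-1, 1].
  wrapped = TransformAction(
      env,
      func=lambda a: low + a * (high - low),
      action_space=Box(0.0, 1.0, shape=env.action_space.shape, dtype=np.float32),
  )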
- class gymnasium.wrappers.ClipAction(env: Env[ObsType, ActType])

  Clips the action passed to step() to be within the environment's action_space.

  A vector version of the wrapper exists: gymnasium.wrappers.vector.ClipAction.

  Example

  >>> import gymnasium as gym
  >>> from gymnasium.wrappers import ClipAction
  >>> import numpy as np
  >>> env = gym.make("Hopper-v4", disable_env_checker=True)
  >>> env = ClipAction(env)
  >>> env.action_space
  Box(-inf, inf, (3,), float32)
  >>> _ = env.reset(seed=42)
  >>> _ = env.step(np.array([5.0, -2.0, 0.0], dtype=np.float32))
  ... # Executes the action np.array([1.0, -1.0, 0]) in the base environment
  - Change logs:
    v0.12.6 - Initially added
    v1.0.0 - Action space is updated to infinite bounds, as is technically correct
  - Parameters:
    env – The environment to wrap
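  This wrapper is convenient when a policy (for example, an unsquashed Gaussian) can emit out-of-bounds values. Conceptually the clipping amounts to np.clip against the base space's bounds, as this sketch illustrates (the bounds shown are Hopper-v4's; this is an illustration, not necessarily the wrapper's exact code):

  import numpy as np

  low = np.array([-1.0, -1.0, -1.0], dtype=np.float32)   # base env's action_space.low
  high = np.array([1.0, 1.0, 1.0], dtype=np.float32)     # base env's action_space.high

  action = np.array([5.0, -2.0, 0.0], dtype=np.float32)
  clipped = np.clip(action, low, high)   # -> array([ 1., -1.,  0.], dtype=float32)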
- class gymnasium.wrappers.RescaleAction(env: gym.Env[ObsType, ActType], min_action: np.floating | np.integer | np.ndarray, max_action: np.floating | np.integer | np.ndarray)

  Affinely (linearly) rescales a Box action space of the environment to within the range [min_action, max_action].

  The base environment env must have an action space of type spaces.Box. If min_action or max_action are numpy arrays, the shape must match the shape of the environment's action space.

  A vector version of the wrapper exists: gymnasium.wrappers.vector.RescaleAction.

  Example

  >>> import gymnasium as gym
  >>> from gymnasium.wrappers import RescaleAction
  >>> import numpy as np
  >>> env = gym.make("Hopper-v4", disable_env_checker=True)
  >>> _ = env.reset(seed=42)
  >>> obs, _, _, _, _ = env.step(np.array([1, 1, 1], dtype=np.float32))
  >>> _ = env.reset(seed=42)
  >>> min_action = -0.5
  >>> max_action = np.array([0.0, 0.5, 0.75], dtype=np.float32)
  >>> wrapped_env = RescaleAction(env, min_action=min_action, max_action=max_action)
  >>> wrapped_env_obs, _, _, _, _ = wrapped_env.step(max_action)
  >>> np.all(obs == wrapped_env_obs)
  np.True_
  - Change logs:
    v0.15.4 - Initially added
  - Parameters:
    env (Env) – The environment to wrap
    min_action (float, int or np.ndarray) – The minimum values for each action. This may be a numpy array or a scalar.
    max_action (float, int or np.ndarray) – The maximum values for each action. This may be a numpy array or a scalar.
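  The affine map implied by the description sends [min_action, max_action] onto the base space's [low, high] endpoint-to-endpoint. A sketch of that formula, assuming scalar bounds for brevity (this illustrates the math, not Gymnasium's exact implementation):

  import numpy as np

  low, high = np.float32(-1.0), np.float32(1.0)                # base Box bounds
  min_action, max_action = np.float32(-0.5), np.float32(0.75)  # wrapper bounds

  def rescale(a):
      # Linear map with rescale(min_action) == low and rescale(max_action) == high.
      return low + (high - low) * (a - min_action) / (max_action - min_action)

  assert rescale(min_action) == low
  assert rescale(max_action) == high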
- class gymnasium.wrappers.StickyAction(env: gym.Env[ObsType, ActType], repeat_action_probability: float, repeat_action_duration: int | tuple[int, int] = 1)

  Adds a probability that, on each call to step(), the previous action is repeated instead of the new one.

  This wrapper follows the implementation proposed by Machado et al., 2018 in Section 5.2 on page 12, and adds the possibility to repeat the action for more than one step.

  No vector version of the wrapper exists.

  Example

  >>> import gymnasium as gym
  >>> from gymnasium.wrappers import StickyAction
  >>> env = gym.make("CartPole-v1")
  >>> env = StickyAction(env, repeat_action_probability=0.9)
  >>> env.reset(seed=123)
  (array([ 0.01823519, -0.0446179 , -0.02796401, -0.03156282], dtype=float32), {})
  >>> env.step(1)
  (array([ 0.01734283,  0.15089367, -0.02859527, -0.33293587], dtype=float32), 1.0, False, False, {})
  >>> env.step(0)
  (array([ 0.0203607 ,  0.34641072, -0.03525399, -0.6344974 ], dtype=float32), 1.0, False, False, {})
  >>> env.step(1)
  (array([ 0.02728892,  0.5420062 , -0.04794393, -0.9380709 ], dtype=float32), 1.0, False, False, {})
  >>> env.step(0)
  (array([ 0.03812904,  0.34756234, -0.06670535, -0.6608303 ], dtype=float32), 1.0, False, False, {})
  - Change logs:
    v1.0.0 - Initially added
    v1.1.0 - Added the repeat_action_duration argument for a dynamic number of sticky actions
  - Parameters:
    env (Env) – the environment to wrap,
    repeat_action_probability (int | float) – the probability of repeating the old action,
    repeat_action_duration (int | tuple[int, int]) – the number of steps the action is repeated. It can be either an int (for a deterministic number of repeats) or a tuple[int, int] specifying the range for a stochastic number of repeats.
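  For intuition, here is a minimal sketch of the sticky-action mechanism with repeat_action_duration = 1, following the idea from Machado et al., 2018; the class name StickySketch is hypothetical, and this is not Gymnasium's actual implementation:

  import numpy as np

  class StickySketch:
      """Illustrative only: with probability p, replay the previous action."""

      def __init__(self, repeat_action_probability, seed=None):
          self.p = repeat_action_probability
          self.rng = np.random.default_rng(seed)
          self.last_action = None

      def __call__(self, action):
          # With probability p, discard the new action and repeat the old one.
          if self.last_action is not None and self.rng.random() < self.p:
              action = self.last_action
          self.last_action = action
          return action

  sticky = StickySketch(repeat_action_probability=0.9, seed=123)
  resolved = [sticky(a) for a in [1, 0, 1, 0]]  # with p=0.9, most entries repeat the first action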