Vectorize#
Gymnasium.vector.VectorEnv#
- class gymnasium.vector.VectorEnv[source]#
Base class for vectorized environments to run multiple independent copies of the same environment in parallel.
Vector environments can provide a linear speed-up in the steps taken per second by sampling multiple sub-environments at the same time. To prevent terminated environments from waiting until all sub-environments have terminated or truncated, vector environments automatically reset sub-environments after they terminate or truncate (within the same step call). As a result, the step’s observation and info are overwritten by the reset’s observation and info. To preserve this data, the observation and info for the final step of a sub-environment are stored in the info parameter, under the “final_observation” and “final_info” keys respectively. See step() for more information.

The vector environment batches the observations, rewards, terminations, truncations and infos of the sub-environments. In addition, step() expects to receive a batch of actions, one for each parallel environment.

Gymnasium contains two generalised vector environments: AsyncVectorEnv and SyncVectorEnv, along with several custom vector environment implementations.

Vector environments have the following additional attributes to help users understand the implementation:

- num_envs - the number of sub-environments in the vector environment
- observation_space - the batched observation space of the vector environment
- single_observation_space - the observation space of a single sub-environment
- action_space - the batched action space of the vector environment
- single_action_space - the action space of a single sub-environment
Examples
>>> import gymnasium as gym
>>> envs = gym.make_vec("CartPole-v1", num_envs=3, vectorization_mode="sync", wrappers=(gym.wrappers.TimeAwareObservation,))
>>> envs = gym.wrappers.vector.ClipReward(envs, min_reward=0.2, max_reward=0.8)
>>> envs
<ClipReward, SyncVectorEnv(CartPole-v1, num_envs=3)>
>>> observations, infos = envs.reset(seed=123)
>>> observations
array([[ 0.01823519, -0.0446179 , -0.02796401, -0.03156282,  0.        ],
       [ 0.02852531,  0.02858594,  0.0469136 ,  0.02480598,  0.        ],
       [ 0.03517495, -0.000635  , -0.01098382, -0.03203924,  0.        ]])
>>> infos
{}
>>> _ = envs.action_space.seed(123)
>>> observations, rewards, terminations, truncations, infos = envs.step(envs.action_space.sample())
>>> observations
array([[ 0.01734283,  0.15089367, -0.02859527, -0.33293587,  1.        ],
       [ 0.02909703, -0.16717631,  0.04740972,  0.3319138 ,  1.        ],
       [ 0.03516225, -0.19559774, -0.01162461,  0.25715804,  1.        ]])
>>> rewards
array([0.8, 0.8, 0.8])
>>> terminations
array([False, False, False])
>>> truncations
array([False, False, False])
>>> infos
{}
>>> envs.close()
Note
The info parameter of reset() and step() was originally implemented before v0.25 as a list of dictionaries, one per sub-environment. However, this was modified in v0.25+ to be a dictionary with a NumPy array for each key. To use the old info style, utilise the DictInfoToList wrapper.

Note

All parallel environments should share identical observation and action spaces. In other words, a vector of multiple different environments is not supported.

Note

make_vec() is the equivalent function to make() for vector environments.
Methods#
- VectorEnv.step(actions: ActType) → tuple[ObsType, ArrayType, ArrayType, ArrayType, dict[str, Any]] [source]#
Take an action for each parallel environment.
- Parameters:
actions – Batch of actions with the action_space shape.
- Returns:
Batch of (observations, rewards, terminations, truncations, infos)
Note
As vector environments autoreset terminating and truncating sub-environments, the returned observation and info are not the final step’s observation or info; these are instead stored in info under “final_observation” and “final_info”.
Example
>>> import gymnasium as gym
>>> import numpy as np
>>> envs = gym.make_vec("CartPole-v1", num_envs=3, vectorization_mode="sync")
>>> _ = envs.reset(seed=42)
>>> actions = np.array([1, 0, 1], dtype=np.int32)
>>> observations, rewards, terminations, truncations, infos = envs.step(actions)
>>> observations
array([[ 0.02727336,  0.18847767,  0.03625453, -0.26141977],
       [ 0.01431748, -0.24002443, -0.04731862,  0.3110827 ],
       [-0.03822722,  0.1710671 , -0.00848456, -0.2487226 ]], dtype=float32)
>>> rewards
array([1., 1., 1.])
>>> terminations
array([False, False, False])
>>> truncations
array([False, False, False])
>>> infos
{}
- VectorEnv.reset(*, seed: int | list[int] | None = None, options: dict[str, Any] | None = None) → tuple[ObsType, dict[str, Any]] [source]#
Reset all parallel environments and return a batch of initial observations and info.
- Parameters:
seed – The environment reset seeds
options – Option dictionary forwarded to each sub-environment’s reset()
- Returns:
A batch of observations and info from the vectorized environment.
Example
>>> import gymnasium as gym
>>> envs = gym.make_vec("CartPole-v1", num_envs=3, vectorization_mode="sync")
>>> observations, infos = envs.reset(seed=42)
>>> observations
array([[ 0.0273956 , -0.00611216,  0.03585979,  0.0197368 ],
       [ 0.01522993, -0.04562247, -0.04799704,  0.03392126],
       [-0.03774345, -0.02418869, -0.00942293,  0.0469184 ]], dtype=float32)
>>> infos
{}
- VectorEnv.render() → tuple[RenderFrame, ...] | None [source]#
Returns the rendered frames from the parallel environments.
- Returns:
A tuple of rendered frames from the parallel environments
- VectorEnv.close(**kwargs: Any)[source]#
Close all parallel environments and release resources.
It also closes all the existing image viewers, then calls close_extras() and sets closed to True.

Warning

This function itself does not close the environments; that should be handled in close_extras(). This is generic for both synchronous and asynchronous vectorized environments.

Note
This will be called automatically when the instance is garbage collected or the program exits.
- Parameters:
**kwargs – Keyword arguments passed to close_extras()
Attributes#
- VectorEnv.num_envs: int#
The number of sub-environments in the vector environment.
- VectorEnv.action_space: gym.Space#
The (batched) action space. The input actions of step must be valid elements of action_space.
- VectorEnv.observation_space: gym.Space#
The (batched) observation space. The observations returned by reset and step are valid elements of observation_space.
- VectorEnv.single_action_space: gym.Space#
The action space of a sub-environment.
- VectorEnv.single_observation_space: gym.Space#
The observation space of a sub-environment.
- VectorEnv.spec: EnvSpec | None = None#
The EnvSpec of the environment, normally set during gymnasium.make_vec()
- VectorEnv.render_mode: str | None = None#
The render mode of the environment which should follow similar specifications to Env.render_mode.
- VectorEnv.closed: bool = False#
If the vector environment has been closed already.
Additional Methods#
- property VectorEnv.unwrapped#
Return the base environment.
- property VectorEnv.np_random: Generator#
Returns the environment’s internal _np_random generator; if not set, it will be initialised with a random seed.
- Returns:
Instance of np.random.Generator
Making Vector Environments#
To create vector environments, Gymnasium provides gymnasium.make_vec() as an equivalent function to gymnasium.make().