Vectorize#
Gymnasium.vector.VectorEnv#
- class gymnasium.vector.VectorEnv[source]#
Base class for vectorized environments to run multiple independent copies of the same environment in parallel.
Vector environments can provide a linear speed-up in the steps taken per second by sampling multiple sub-environments at the same time. To prevent terminated environments from waiting until all sub-environments have terminated or truncated, vector environments automatically reset sub-environments after they terminate or truncate (within the same step call). As a result, the step’s observation and info are overwritten by the reset’s observation and info. To preserve this data, the observation and info for the final step of a sub-environment are stored in the info parameter, under the “final_observation” and “final_info” keys respectively. See step() for more information.

The vector environment batches the observations, rewards, terminations, truncations and infos of the sub-environments. In addition, step() expects to receive a batch of actions, one for each parallel environment.

Gymnasium contains two generalised vector environments: AsyncVectorEnv and SyncVectorEnv, along with several custom vector environment implementations.

Vector environments have the following additional attributes to help users understand the implementation:

- num_envs - the number of sub-environments in the vector environment
- observation_space - the batched observation space of the vector environment
- single_observation_space - the observation space of a single sub-environment
- action_space - the batched action space of the vector environment
- single_action_space - the action space of a single sub-environment
Examples
>>> import gymnasium as gym
>>> envs = gym.make_vec("CartPole-v1", num_envs=3, vectorization_mode="sync", wrappers=(gym.wrappers.TimeAwareObservation,))
>>> envs = gym.wrappers.vector.ClipReward(envs, min_reward=0.2, max_reward=0.8)
>>> envs
<ClipReward, SyncVectorEnv(CartPole-v1, num_envs=3)>
>>> observations, infos = envs.reset(seed=123)
>>> observations
array([[ 0.01823519, -0.0446179 , -0.02796401, -0.03156282,  0.        ],
       [ 0.02852531,  0.02858594,  0.0469136 ,  0.02480598,  0.        ],
       [ 0.03517495, -0.000635  , -0.01098382, -0.03203924,  0.        ]])
>>> infos
{}
>>> _ = envs.action_space.seed(123)
>>> observations, rewards, terminations, truncations, infos = envs.step(envs.action_space.sample())
>>> observations
array([[ 0.01734283,  0.15089367, -0.02859527, -0.33293587,  1.        ],
       [ 0.02909703, -0.16717631,  0.04740972,  0.3319138 ,  1.        ],
       [ 0.03516225, -0.19559774, -0.01162461,  0.25715804,  1.        ]])
>>> rewards
array([0.8, 0.8, 0.8])
>>> terminations
array([False, False, False])
>>> truncations
array([False, False, False])
>>> infos
{}
>>> envs.close()
Note
The info parameter of reset() and step() was originally implemented before v0.25 as a list of dictionaries, one per sub-environment. However, this was modified in v0.25+ to be a dictionary with a NumPy array for each key. To use the old info style, utilise the DictInfoToList wrapper.

Note

All parallel environments should share identical observation and action spaces. In other words, a vector of multiple different environments is not supported.

Note

make_vec() is the equivalent function to make() for vector environments.
Methods#
- VectorEnv.step(actions: ActType) → tuple[ObsType, ArrayType, ArrayType, ArrayType, dict[str, Any]] [source]#
Take an action for each parallel environment.
- Parameters:
actions – Batch of actions with the action_space shape.
- Returns:
Batch of (observations, rewards, terminations, truncations, infos)
Note
As vector environments autoreset terminating and truncating sub-environments, the returned observation and info are not the final step’s observation or info; these are instead stored in info under “final_observation” and “final_info”.
Example
>>> import gymnasium as gym
>>> import numpy as np
>>> envs = gym.make_vec("CartPole-v1", num_envs=3, vectorization_mode="sync")
>>> _ = envs.reset(seed=42)
>>> actions = np.array([1, 0, 1], dtype=np.int32)
>>> observations, rewards, terminations, truncations, infos = envs.step(actions)
>>> observations
array([[ 0.02727336,  0.18847767,  0.03625453, -0.26141977],
       [ 0.01431748, -0.24002443, -0.04731862,  0.3110827 ],
       [-0.03822722,  0.1710671 , -0.00848456, -0.2487226 ]], dtype=float32)
>>> rewards
array([1., 1., 1.])
>>> terminations
array([False, False, False])
>>> truncations
array([False, False, False])
>>> infos
{}
- VectorEnv.reset(*, seed: int | list[int] | None = None, options: dict[str, Any] | None = None) → tuple[ObsType, dict[str, Any]] [source]#
Reset all parallel environments and return a batch of initial observations and info.
- Parameters:
seed – The environment reset seeds
options – Option dictionary forwarded to each sub-environment’s reset()
- Returns:
A batch of observations and info from the vectorized environment.
Example
>>> import gymnasium as gym
>>> envs = gym.make_vec("CartPole-v1", num_envs=3, vectorization_mode="sync")
>>> observations, infos = envs.reset(seed=42)
>>> observations
array([[ 0.0273956 , -0.00611216,  0.03585979,  0.0197368 ],
       [ 0.01522993, -0.04562247, -0.04799704,  0.03392126],
       [-0.03774345, -0.02418869, -0.00942293,  0.0469184 ]], dtype=float32)
>>> infos
{}
- VectorEnv.render() → tuple[RenderFrame, ...] | None [source]#
Returns the rendered frames from the parallel environments.
- Returns:
A tuple of rendered frames from the parallel environments
- VectorEnv.close(**kwargs: Any)[source]#
Close all parallel environments and release resources.
It also closes all the existing image viewers, then calls close_extras() and sets closed to True.

Warning

This function itself does not close the environments; that should be handled in close_extras(). This is generic for both synchronous and asynchronous vectorized environments.

Note
This will be called automatically when the instance is garbage collected or the program exits.
- Parameters:
**kwargs – Keyword arguments passed to close_extras()
Attributes#
- VectorEnv.num_envs: int#
The number of sub-environments in the vector environment.
- VectorEnv.action_space: gym.Space#
The (batched) action space. The input actions of step must be valid elements of action_space.
- VectorEnv.observation_space: gym.Space#
The (batched) observation space. The observations returned by reset and step are valid elements of observation_space.
- VectorEnv.single_action_space: gym.Space#
The action space of a sub-environment.
- VectorEnv.single_observation_space: gym.Space#
The observation space of a sub-environment.
- VectorEnv.spec: EnvSpec | None = None#
The EnvSpec of the environment, normally set during gymnasium.make_vec()
- VectorEnv.render_mode: str | None = None#
The render mode of the environment which should follow similar specifications to Env.render_mode.
- VectorEnv.closed: bool = False#
If the vector environment has been closed already.
Additional Methods#
- property VectorEnv.unwrapped#
Return the base environment.
- property VectorEnv.np_random: Generator#
Returns the environment’s internal _np_random generator; if not set, it will be initialised with a random seed.
- Returns:
Instance of np.random.Generator
Making Vector Environments#
To create vector environments, Gymnasium provides gymnasium.make_vec() as an equivalent function to gymnasium.make().