Vectorize

Gymnasium.vector.VectorEnv

class gymnasium.vector.VectorEnv

Base class for vectorized environments to run multiple independent copies of the same environment in parallel.

Vector environments can provide a linear speed-up in steps taken per second by sampling multiple sub-environments at the same time. Gymnasium contains two generalised vector environments, AsyncVectorEnv and SyncVectorEnv, along with several custom vector environment implementations.

reset() and step() return batched observations, rewards, terminations, truncations and info, with one entry per sub-environment; see the example below. Rewards, terminations and truncations are each packaged into a NumPy array of shape (num_envs,). For observations (and actions), the batching process depends on the type of observation (and action) space and is generally optimised for neural network inputs and outputs. Info is kept as a dictionary in which each key gives the data for all sub-environments.
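The batching described above can be pictured with plain NumPy. This is only an illustrative sketch, not Gymnasium's implementation: the per-sub-environment results are made up, and simple stacking is assumed to correspond to the case of array-valued (Box-like) observations.

```python
import numpy as np

num_envs = 3

# Hypothetical results from stepping three independent sub-environments once:
# (observation, reward, terminated, truncated) for each copy.
per_env_results = [
    (np.array([0.01, 0.15, -0.02, -0.33]), 1.0, False, False),
    (np.array([0.02, -0.16, 0.04, 0.33]), 1.0, False, False),
    (np.array([0.03, -0.19, -0.01, 0.25]), 1.0, True, False),
]

# Array-valued observations are stacked along a new leading axis.
observations = np.stack([obs for obs, _, _, _ in per_env_results])

# Scalar results become one-dimensional arrays of shape (num_envs,).
rewards = np.array([r for _, r, _, _ in per_env_results], dtype=np.float64)
terminations = np.array([t for _, _, t, _ in per_env_results], dtype=np.bool_)
truncations = np.array([t for _, _, _, t in per_env_results], dtype=np.bool_)

print(observations.shape)
print(rewards.shape, terminations.shape, truncations.shape)
```

Other space types (for example dictionary or tuple spaces) are batched differently, so the stacking shown here should not be read as a general rule.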

To create environments, make_vec() is the vector environment equivalent of make(). It accepts several additional arguments for modifying environment qualities, the number of environments, the vectorizer type and the vectorizer arguments.

Note

The info returned by reset() and step() was implemented before v0.25 as a list of dictionaries, one per sub-environment. From v0.25 onwards it is a dictionary with a NumPy array for each key. To use the old info style, use the DictInfoToList wrapper.
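To make the difference between the two info styles concrete, the sketch below converts a dictionary of per-key arrays into a list with one dictionary per sub-environment. It is not the source of DictInfoToList: the helper dict_info_to_list and the keys "lives" and "score" are hypothetical, and the sketch assumes every key holds a value for every sub-environment, which the real wrapper may not require.

```python
import numpy as np


def dict_info_to_list(infos, num_envs):
    """Split a dictionary of per-key arrays into one dictionary per sub-environment.

    Simplifying assumption: every value is a sequence of length num_envs.
    """
    return [
        {key: values[i] for key, values in infos.items()}
        for i in range(num_envs)
    ]


# Dictionary style: each key maps to the data of all sub-environments.
infos = {
    "lives": np.array([3, 2, 3]),
    "score": np.array([10.0, 0.0, 5.5]),
}

# List style: one dictionary per sub-environment.
info_list = dict_info_to_list(infos, num_envs=3)
print(info_list[1])
```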

Examples

>>> import gymnasium as gym
>>> envs = gym.make_vec("CartPole-v1", num_envs=3, vectorization_mode="sync", wrappers=(gym.wrappers.TimeAwareObservation,))
>>> envs = gym.wrappers.vector.ClipReward(envs, min_reward=0.2, max_reward=0.8)
>>> envs
<ClipReward, SyncVectorEnv(CartPole-v1, num_envs=3)>
>>> envs.num_envs
3
>>> envs.action_space
MultiDiscrete([2 2 2])
>>> envs.observation_space
Box([[-4.80000019        -inf -0.41887903        -inf  0.        ]
 [-4.80000019        -inf -0.41887903        -inf  0.        ]
 [-4.80000019        -inf -0.41887903        -inf  0.        ]], [[4.80000019e+00            inf 4.18879032e-01            inf
  5.00000000e+02]
 [4.80000019e+00            inf 4.18879032e-01            inf
  5.00000000e+02]
 [4.80000019e+00            inf 4.18879032e-01            inf
  5.00000000e+02]], (3, 5), float64)
>>> observations, infos = envs.reset(seed=123)
>>> observations
array([[ 0.01823519, -0.0446179 , -0.02796401, -0.03156282,  0.        ],
       [ 0.02852531,  0.02858594,  0.0469136 ,  0.02480598,  0.        ],
       [ 0.03517495, -0.000635  , -0.01098382, -0.03203924,  0.        ]])
>>> infos
{}
>>> _ = envs.action_space.seed(123)
>>> actions = envs.action_space.sample()
>>> observations, rewards, terminations, truncations, infos = envs.step(actions)
>>> observations
array([[ 0.01734283,  0.15089367, -0.02859527, -0.33293587,  1.        ],
       [ 0.02909703, -0.16717631,  0.04740972,  0.3319138 ,  1.        ],
       [ 0.03516225, -0.19559774, -0.01162461,  0.25715804,  1.        ]])
>>> rewards
array([0.8, 0.8, 0.8])
>>> terminations
array([False, False, False])
>>> truncations
array([False, False, False])
>>> infos
{}
>>> envs.close()

To avoid having to wait for all sub-environments to terminate before resetting, implementations autoreset sub-environments on episode end (when terminated or truncated is True). As a result, when adding observations to a replay buffer, you need to know when an observation (and info) for a sub-environment is the first observation from an autoreset. We recommend storing this information in an additional variable, such as has_autoreset = np.logical_or(terminated, truncated).
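The recommended has_autoreset variable could be maintained along the following lines. This is a sketch under stated assumptions: the flags are hand-written rather than produced by a real environment, the autoreset is assumed to happen on the step after an episode ends, and skipping the autoreset step when filling a replay buffer is one possible policy rather than a prescribed one.

```python
import numpy as np

num_envs = 3

# Hypothetical (terminations, truncations) pairs as returned by three
# consecutive calls to step(); each array has shape (num_envs,).
step_flags = [
    (np.array([False, True, False]), np.array([False, False, False])),
    (np.array([False, False, False]), np.array([False, False, True])),
    (np.array([False, False, False]), np.array([False, False, False])),
]

# No sub-environment has been autoreset before the first step.
has_autoreset = np.zeros(num_envs, dtype=bool)
stored_masks = []

for terminated, truncated in step_flags:
    # Keep a transition only if this step was not the autoreset of that
    # sub-environment (its observation would start a new episode).
    store = np.logical_not(has_autoreset)
    stored_masks.append(store)

    # Sub-environments whose episode just ended are reset on the next step.
    has_autoreset = np.logical_or(terminated, truncated)

for step_index, store in enumerate(stored_masks):
    print(step_index, store)
```

In a training loop the same mask would be applied to the observations, actions and rewards of each step before they are written to the buffer.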

Vector environments also have additional attributes that help users understand the implementation; these are described under Attributes.

Methods

VectorEnv.step(actions: ActType) -> tuple[ObsType, ArrayType, ArrayType, ArrayType, dict[str, Any]]

Take an action for each parallel environment.

Parameters:

actions – Batch of actions with the action_space shape.

Returns:

Batch of (observations, rewards, terminations, truncations, infos)

Note

Because vector environments autoreset terminated and truncated sub-environments, the reset occurs on the next step after terminated or truncated is True.

Example

>>> import gymnasium as gym
>>> import numpy as np
>>> envs = gym.make_vec("CartPole-v1", num_envs=3, vectorization_mode="sync")
>>> _ = envs.reset(seed=42)
>>> actions = np.array([1, 0, 1], dtype=np.int32)
>>> observations, rewards, terminations, truncations, infos = envs.step(actions)
>>> observations
array([[ 0.02727336,  0.18847767,  0.03625453, -0.26141977],
       [ 0.01431748, -0.24002443, -0.04731862,  0.3110827 ],
       [-0.03822722,  0.1710671 , -0.00848456, -0.2487226 ]],
      dtype=float32)
>>> rewards
array([1., 1., 1.])
>>> terminations
array([False, False, False])
>>> truncations
array([False, False, False])
>>> infos
{}

VectorEnv.reset(*, seed: int | None = None, options: dict[str, Any] | None = None) -> tuple[ObsType, dict[str, Any]]

Reset all parallel environments and return a batch of initial observations and info.

Parameters:
  • seed – The environment reset seed

  • options – Optional dictionary of reset options; how they are used depends on the vector environment implementation

Returns:

A batch of observations and info from the vectorized environment.

Example

>>> import gymnasium as gym
>>> envs = gym.make_vec("CartPole-v1", num_envs=3, vectorization_mode="sync")
>>> observations, infos = envs.reset(seed=42)
>>> observations
array([[ 0.0273956 , -0.00611216,  0.03585979,  0.0197368 ],
       [ 0.01522993, -0.04562247, -0.04799704,  0.03392126],
       [-0.03774345, -0.02418869, -0.00942293,  0.0469184 ]],
      dtype=float32)
>>> infos
{}

VectorEnv.render() -> tuple[RenderFrame, ...] | None

Returns the rendered frames from the parallel environments.

Returns:

A tuple of rendered frames from the parallel environments

VectorEnv.close(**kwargs: Any)

Close all parallel environments and release resources.

It also closes all existing image viewers, then calls close_extras() and sets closed to True.

Warning

This function itself does not close the environments; that should be handled in close_extras(). This is generic for both synchronous and asynchronous vectorized environments.

Note

This is called automatically when the environment is garbage collected or the program exits.

Parameters:

**kwargs – Keyword arguments passed to close_extras()
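The split between close() and close_extras() described above can be sketched with a toy class. ToyVectorEnv and its open_resources list are hypothetical and do not come from Gymnasium; the early return on a second call is a choice made for this sketch. The point is only that close() delegates the actual clean-up to close_extras() and then marks the environment as closed.

```python
class ToyVectorEnv:
    """Hypothetical stand-in that mimics the close()/close_extras() split."""

    def __init__(self, num_envs):
        self.num_envs = num_envs
        self.closed = False
        # Pretend each sub-environment holds a resource that must be released.
        self.open_resources = [f"env-{i}" for i in range(num_envs)]

    def close_extras(self, **kwargs):
        # Hook for subclasses: the real clean-up of sub-environments lives here.
        self.open_resources.clear()

    def close(self, **kwargs):
        # In this sketch, closing an already closed environment does nothing.
        if self.closed:
            return
        # Forward the keyword arguments to the hook, then record the state.
        self.close_extras(**kwargs)
        self.closed = True


envs = ToyVectorEnv(num_envs=3)
envs.close()
print(envs.closed, envs.open_resources)
```

A subclass written in this style would override close_extras() rather than close(), so that the closed flag is handled in one place.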

Attributes

VectorEnv.num_envs: int

The number of sub-environments in the vector environment.

VectorEnv.action_space: gym.Space

The (batched) action space. The input actions of step must be valid elements of action_space.

VectorEnv.observation_space: gym.Space

The (batched) observation space. The observations returned by reset and step are valid elements of observation_space.

VectorEnv.single_action_space: gym.Space

The action space of a sub-environment.

VectorEnv.single_observation_space: gym.Space

The observation space of a sub-environment.

VectorEnv.spec: EnvSpec | None = None

The EnvSpec of the environment, normally set during gymnasium.make_vec().

VectorEnv.metadata: dict[str, Any] = {}

The metadata of the environment, containing rendering modes, rendering fps, etc.

VectorEnv.render_mode: str | None = None

The render mode of the environment which should follow similar specifications to Env.render_mode.

VectorEnv.closed: bool = False

Whether the vector environment has already been closed.

Additional Methods

property VectorEnv.unwrapped

Return the base environment.

property VectorEnv.np_random: Generator

Returns the environment’s internal _np_random; if it is not set, it is initialised with a random seed.

Returns:

An instance of np.random.Generator

property VectorEnv.np_random_seed: int | None

Returns the environment’s internal _np_random_seed; if it is not set, it is first initialised with a random int as the seed.

If np_random was set directly instead of through reset() or set_np_random_through_seed(), the seed will take the value -1.

Returns:

int – the seed of the current np_random or -1, if the seed of the rng is unknown

Making Vector Environments

To create vector environments, Gymnasium provides gymnasium.make_vec() as the equivalent of gymnasium.make().