Vectorizing Environment

gymnasium.experimental.vector.VectorEnv

class gymnasium.experimental.vector.VectorEnv

Base class for vectorized environments to run multiple independent copies of the same environment in parallel.

Vector environments can provide a linear speed-up in steps taken per second by sampling multiple sub-environments at the same time. So that a finished sub-environment does not have to wait until every other sub-environment has terminated or truncated, vector environments automatically reset a sub-environment as soon as it terminates or truncates. As a result, the observation and info returned for that step are those of the reset, not of the final step. The final step’s observation and info are instead stored in the returned info dictionary under the keys “final_observation” and “final_info” respectively. See step() for more information.
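
The autoreset behaviour described above can be sketched in plain Python. This is a minimal illustration, not Gymnasium's implementation: CountdownEnv and autoreset_step are hypothetical names, and plain lists stand in for the batched arrays a real vector environment would return.

```python
class CountdownEnv:
    """Hypothetical toy environment that terminates after `horizon` steps."""

    def __init__(self, horizon):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t, {"t": self.t}

    def step(self, action):
        self.t += 1
        terminated = self.t >= self.horizon
        return self.t, 1.0, terminated, False, {"t": self.t}


def autoreset_step(envs, actions):
    """Step every sub-environment and reset those whose episode just ended.

    The last observation and info of a finished episode are kept under
    "final_observation" and "final_info" (None for unfinished sub-environments).
    """
    n = len(envs)
    observations, rewards, terminations, truncations = [], [], [], []
    infos = {"final_observation": [None] * n, "final_info": [None] * n}
    for i, (env, action) in enumerate(zip(envs, actions)):
        obs, reward, terminated, truncated, info = env.step(action)
        if terminated or truncated:
            # Save the true final step before the reset overwrites it.
            infos["final_observation"][i] = obs
            infos["final_info"][i] = info
            obs, _ = env.reset()
        observations.append(obs)
        rewards.append(reward)
        terminations.append(terminated)
        truncations.append(truncated)
    return observations, rewards, terminations, truncations, infos


envs = [CountdownEnv(horizon=1), CountdownEnv(horizon=3)]
for env in envs:
    env.reset()
observations, rewards, terminations, truncations, infos = autoreset_step(envs, [0, 0])
```

Here the first sub-environment finishes on its first step, so its returned observation is already the reset observation, while the observation that ended the episode is only available through infos["final_observation"].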

The vector environments batch observations, rewards, terminations, truncations and info for each parallel environment. In addition, step() expects to receive a batch of actions for each parallel environment.
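
As a rough illustration of what batching means here, the sketch below transposes per-environment results into one sequence per field. The helper batch_results and the sample data are hypothetical; real vector environments return NumPy arrays rather than Python lists.

```python
def batch_results(per_env_results):
    """Transpose a list of per-environment tuples into one list per field."""
    observations, rewards, terminations, truncations = map(list, zip(*per_env_results))
    return observations, rewards, terminations, truncations


# Hypothetical results from three sub-environments:
# (observation, reward, terminated, truncated)
per_env_results = [
    ([0.1, 0.2], 1.0, False, False),
    ([0.3, 0.4], 0.0, True, False),
    ([0.5, 0.6], 1.0, False, True),
]
observations, rewards, terminations, truncations = batch_results(per_env_results)
```

Index i of every returned sequence refers to the same sub-environment, which is also why step() expects one action per sub-environment in the same order.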

Gymnasium contains two types of Vector environments: AsyncVectorEnv and SyncVectorEnv.

Vector environments expose additional attributes that help users understand the implementation.

Note

The info returned by reset() and step() was originally, before OpenAI Gym v25, a list of dictionaries, one per sub-environment. In OpenAI Gym v25+ and in Gymnasium this was changed to a single dictionary with a NumPy array for each key. To use the old info style, use the VectorListInfo wrapper.
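
The difference between the two info styles can be sketched as a conversion from the dictionary form back to the list form. This is an illustration only, not the code of VectorListInfo: the helper dict_info_to_list is hypothetical, and the convention that a key "_name" holds a boolean mask marking which sub-environments supplied "name" is an assumption stated here for the example.

```python
def dict_info_to_list(infos, num_envs):
    """Convert a dict-of-sequences info into a list of per-environment dicts.

    Assumption: for a data key "name", an optional key "_name" holds a boolean
    mask telling which sub-environments actually reported that key.
    """
    list_info = [{} for _ in range(num_envs)]
    for key, values in infos.items():
        if key.startswith("_"):
            continue  # masks are read together with their data key below
        mask = infos.get("_" + key, [True] * num_envs)
        for i in range(num_envs):
            if mask[i]:
                list_info[i][key] = values[i]
    return list_info


# Hypothetical info from three sub-environments; only the first and third set "t".
infos = {"t": [3, 0, 7], "_t": [True, False, True]}
list_info = dict_info_to_list(infos, num_envs=3)
```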

Note

To render the sub-environments, use call() with “render” as the name argument. Remember to set render_mode for all the sub-environments during initialization.

Note

All parallel environments must share identical observation and action spaces. In other words, a vector of multiple different environments is not supported.

gymnasium.experimental.vector.VectorEnv.reset(self, *, seed: int | list[int] | None = None, options: dict[str, Any] | None = None) -> tuple[ObsType, dict[str, Any]]

Reset all parallel environments and return a batch of initial observations and info.

Parameters:
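  • seed – The seed(s) used to reset the environments: a single integer, a list with one seed per sub-environment, or None.

  • options – Optional dictionary of options for the reset.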

Returns:

A batch of observations and info from the vectorized environment.
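
Because seed accepts an integer, a list or None, a vector environment has to turn it into one seed per sub-environment. The sketch below shows one plausible convention, in which an integer seed becomes seed, seed + 1, and so on; expand_seed is a hypothetical helper and the offset rule is an assumption to verify against the implementation you use.

```python
def expand_seed(seed, num_envs):
    """Return one seed per sub-environment.

    Assumption: an integer seed is offset by the sub-environment index so that
    each sub-environment gets a different, reproducible seed.
    """
    if seed is None:
        return [None] * num_envs
    if isinstance(seed, int):
        return [seed + i for i in range(num_envs)]
    seeds = list(seed)
    if len(seeds) != num_envs:
        raise ValueError(
            f"Expected {num_envs} seeds, one per sub-environment, got {len(seeds)}."
        )
    return seeds


seeds_from_int = expand_seed(42, num_envs=3)
seeds_from_list = expand_seed([7, 8, 9], num_envs=3)
seeds_from_none = expand_seed(None, num_envs=3)
```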

Example

>>> import gymnasium as gym
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> envs.reset(seed=42)
(array([[ 0.0273956 , -0.00611216,  0.03585979,  0.0197368 ],
       [ 0.01522993, -0.04562247, -0.04799704,  0.03392126],
       [-0.03774345, -0.02418869, -0.00942293,  0.0469184 ]],
      dtype=float32), {})

gymnasium.experimental.vector.VectorEnv.step(self, actions: ActType) -> tuple[ObsType, ArrayType, ArrayType, ArrayType, dict]

Take an action for each parallel environment.

Parameters:

actions – Batch of actions, an element of action_space.

Returns:

Batch of (observations, rewards, terminations, truncations, infos)

Note

Because vector environments autoreset sub-environments that terminate or truncate, the observation and info returned for such a sub-environment are not those of the final step. The final step’s observation and info are instead stored in info under “final_observation” and “final_info”.
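
On the consumer side, code that needs the observation which actually followed an action (for example when filling a replay buffer) has to look into info. The following is a minimal sketch: true_next_observations is a hypothetical helper, and it assumes that “final_observation” holds None for sub-environments that did not finish on this step.

```python
def true_next_observations(observations, infos):
    """Return, per sub-environment, the observation that followed the action.

    For sub-environments that were autoreset, this is the stored final
    observation rather than the reset observation returned by step().
    """
    finals = infos.get("final_observation")
    if finals is None:
        # No sub-environment finished on this step.
        return list(observations)
    return [
        final if final is not None else obs
        for obs, final in zip(observations, finals)
    ]


# Hypothetical step result: sub-environment 0 finished and was reset.
observations = ["reset_obs", "obs_1"]
infos = {"final_observation": ["last_obs_0", None]}
next_observations = true_next_observations(observations, infos)
```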

Example

>>> import gymnasium as gym
>>> import numpy as np
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> _ = envs.reset(seed=42)
>>> actions = np.array([1, 0, 1])
>>> observations, rewards, termination, truncation, infos = envs.step(actions)
>>> observations
array([[ 0.02727336,  0.18847767,  0.03625453, -0.26141977],
       [ 0.01431748, -0.24002443, -0.04731862,  0.3110827 ],
       [-0.03822722,  0.1710671 , -0.00848456, -0.2487226 ]],
      dtype=float32)
>>> rewards
array([1., 1., 1.])
>>> termination
array([False, False, False])
>>> truncation
array([False, False, False])
>>> infos
{}

gymnasium.experimental.vector.VectorEnv.close(self, **kwargs)

Close all parallel environments and release resources.

It also closes all existing image viewers, then calls close_extras() and sets closed to True.

Warning

This function itself does not close the environments; that should be handled in close_extras(). This is generic for both synchronous and asynchronous vectorized environments.

Note

This is called automatically when the object is garbage collected or the program exits.

Parameters:

**kwargs – Keyword arguments passed to close_extras()
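
The split between close() and close_extras() can be sketched as follows. ToyVectorEnv and ToySubEnv are hypothetical classes written for this illustration, and the early return for an already closed environment is an assumption, not something stated above.

```python
class ToySubEnv:
    """Hypothetical sub-environment that records whether it was closed."""

    def __init__(self):
        self.is_closed = False

    def close(self):
        self.is_closed = True


class ToyVectorEnv:
    """Hypothetical vector environment showing the close()/close_extras() split."""

    def __init__(self, envs):
        self.envs = envs
        self.closed = False

    def close_extras(self, **kwargs):
        # Subclass-specific cleanup: this is where sub-environments are closed.
        for env in self.envs:
            env.close()

    def close(self, **kwargs):
        if self.closed:
            return  # assumption: closing twice is a no-op
        self.close_extras(**kwargs)
        self.closed = True


vector_env = ToyVectorEnv([ToySubEnv(), ToySubEnv()])
vector_env.close()
```

A subclass only overrides close_extras(); the generic close() stays the same for synchronous and asynchronous variants.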

gymnasium.experimental.vector.AsyncVectorEnv

class gymnasium.experimental.vector.AsyncVectorEnv(env_fns: Sequence[Callable[[], Env]], shared_memory: bool = True, copy: bool = True, context: str | None = None, daemon: bool = True, worker: callable | None = None)

Vectorized environment that runs multiple environments in parallel.

It uses multiprocessing processes, and pipes for communication.

Example

>>> import gymnasium as gym
>>> env = gym.vector.AsyncVectorEnv([
...     lambda: gym.make("Pendulum-v1", g=9.81),
...     lambda: gym.make("Pendulum-v1", g=1.62)
... ])
>>> env.reset(seed=42)
(array([[-0.14995256,  0.9886932 , -0.12224312],
       [ 0.5760367 ,  0.8174238 , -0.91244936]], dtype=float32), {})
Parameters:
  • env_fns – Functions that create the environments.

  • shared_memory – If True, then the observations from the worker processes are communicated back through shared variables. This can improve the efficiency if the observations are large (e.g. images).

  • copy – If True, then the reset() and step() methods return a copy of the observations.

  • context – Context for multiprocessing. If None, then the default context is used.

  • daemon – If True, then subprocesses have the daemon flag turned on; that is, they will quit if the parent process quits. However, daemon=True prevents subprocesses from spawning children, so for some environments you may want to set it to False.

  • worker – If set, then use that worker in a subprocess instead of a default one. Can be useful to override some inner vector env logic, for instance, how resets on termination or truncation are handled.

Warning

worker is an advanced option. It provides a high degree of flexibility and a high chance to shoot yourself in the foot; thus, if you are writing your own worker, it is recommended to start from the code for the _worker (or _worker_shared_memory) method and add changes.

Raises:
  • RuntimeError – If the observation space of some sub-environment does not match observation_space (or, by default, the observation space of the first sub-environment).

  • ValueError – If observation_space is a custom space (i.e. not a default space in Gym, such as gymnasium.spaces.Box, gymnasium.spaces.Discrete, or gymnasium.spaces.Dict) and shared_memory is True.

gymnasium.experimental.vector.AsyncVectorEnv.reset(self, *, seed: int | list[int] | None = None, options: dict | None = None)

Reset all parallel environments and return a batch of initial observations and info.

Parameters:
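  • seed – The seed(s) used to reset the environments: a single integer, a list with one seed per sub-environment, or None.

  • options – Optional dictionary of options for the reset.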

Returns:

A batch of observations and info from the vectorized environment.

gymnasium.experimental.vector.AsyncVectorEnv.step(self, actions)

Take an action for each parallel environment.

Parameters:

actions – Batch of actions, an element of action_space.

Returns:

Batch of (observations, rewards, terminations, truncations, infos)

gymnasium.experimental.vector.AsyncVectorEnv.close(self, **kwargs)

Close all parallel environments and release resources.

It also closes all existing image viewers, then calls close_extras() and sets closed to True.

Warning

This function itself does not close the environments; that should be handled in close_extras(). This is generic for both synchronous and asynchronous vectorized environments.

Note

This is called automatically when the object is garbage collected or the program exits.

Parameters:

**kwargs – Keyword arguments passed to close_extras()

gymnasium.experimental.vector.AsyncVectorEnv.call(self, name: str, *args, **kwargs) -> list[Any]

Call a method, or get a property, from each parallel environment.

Parameters:
  • name (str) – Name of the method or property to call.

  • *args – Arguments to apply to the method call.

  • **kwargs – Keyword arguments to apply to the method call.

Returns:

List of the results of the individual calls to the method or property for each environment.
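
The "method or property" behaviour of call() can be sketched in a few lines: look the name up on each sub-environment, call it if it is callable, and otherwise return the value itself. call_each and ToyEnv are hypothetical names used only for this illustration; the real asynchronous implementation sends the request to worker processes instead of looping locally.

```python
class ToyEnv:
    """Hypothetical sub-environment with one attribute and one method."""

    def __init__(self, gravity):
        self.gravity = gravity

    def scaled_gravity(self, factor=1.0):
        return self.gravity * factor


def call_each(envs, name, *args, **kwargs):
    """Call a method, or read a property, named `name` on every sub-environment."""
    results = []
    for env in envs:
        attr = getattr(env, name)
        if callable(attr):
            results.append(attr(*args, **kwargs))
        else:
            # Plain attributes are returned as they are.
            results.append(attr)
    return results


envs = [ToyEnv(gravity=10.0), ToyEnv(gravity=2.0)]
gravities = call_each(envs, "gravity")
scaled = call_each(envs, "scaled_gravity", factor=0.5)
```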

gymnasium.experimental.vector.AsyncVectorEnv.get_attr(self, name: str)

Get a property from each parallel environment.

Parameters:

name (str) – Name of the property to get from each individual environment.

Returns:

The value of the property name from each sub-environment.

gymnasium.experimental.vector.AsyncVectorEnv.set_attr(self, name: str, values: list[Any] | tuple[Any] | object)

Sets an attribute of the sub-environments.

Parameters:
  • name – Name of the property to be set in each individual environment.

  • values – Values of the property to be set to. If values is a list or tuple, then it corresponds to the values for each individual environment, otherwise a single value is set for all environments.

Raises:
  • ValueError – Values must be a list or tuple with length equal to the number of environments.

  • AlreadyPendingCallError – Calling set_attr while waiting for a pending call to complete.
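
The rule for values can be sketched as follows: a single value is repeated for every sub-environment, while a list or tuple must contain exactly one value per sub-environment. set_attr_each is a hypothetical helper operating on plain objects, not the library's method, and it leaves out the pending-call check of the asynchronous implementation.

```python
from types import SimpleNamespace


def set_attr_each(envs, name, values):
    """Set attribute `name` on each sub-environment.

    A list or tuple supplies one value per sub-environment; any other object
    is used as the value for all of them.
    """
    if not isinstance(values, (list, tuple)):
        values = [values] * len(envs)
    if len(values) != len(envs):
        raise ValueError(
            f"Values must be a list or tuple of length {len(envs)}, got {len(values)}."
        )
    for env, value in zip(envs, values):
        setattr(env, name, value)


# SimpleNamespace objects stand in for sub-environments.
envs = [SimpleNamespace(), SimpleNamespace(), SimpleNamespace()]
set_attr_each(envs, "gravity", 9.81)          # same value everywhere
set_attr_each(envs, "mass", [1.0, 2.0, 3.0])  # one value per sub-environment
```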

gymnasium.experimental.vector.SyncVectorEnv

class gymnasium.experimental.vector.SyncVectorEnv(env_fns: Iterator[Callable[[], Env]], copy: bool = True)

Vectorized environment that serially runs multiple environments.

Example

>>> import gymnasium as gym
>>> env = gym.vector.SyncVectorEnv([
...     lambda: gym.make("Pendulum-v1", g=9.81),
...     lambda: gym.make("Pendulum-v1", g=1.62)
... ])
>>> env.reset(seed=42)
(array([[-0.14995256,  0.9886932 , -0.12224312],
       [ 0.5760367 ,  0.8174238 , -0.91244936]], dtype=float32), {})
Parameters:
  • env_fns – Iterable of callables that create the environments.

  • copy – If True, then the reset() and step() methods return a copy of the observations.

Raises:

RuntimeError – If the observation space of some sub-environment does not match observation_space (or, by default, the observation space of the first sub-environment).

gymnasium.experimental.vector.SyncVectorEnv.reset(self, seed: int | list[int] | None = None, options: dict | None = None)

Waits for the calls triggered by reset_async() to finish and returns the results.

Parameters:
  • seed – The reset environment seed

  • options – Option information for the environment reset

Returns:

The reset observation of the environment and reset information

gymnasium.experimental.vector.SyncVectorEnv.step(self, actions)

Steps through each of the environments returning the batched results.

Returns:

The batched environment step results

gymnasium.experimental.vector.SyncVectorEnv.close(self, **kwargs)

Close all parallel environments and release resources.

It also closes all existing image viewers, then calls close_extras() and sets closed to True.

Warning

This function itself does not close the environments; that should be handled in close_extras(). This is generic for both synchronous and asynchronous vectorized environments.

Note

This is called automatically when the object is garbage collected or the program exits.

Parameters:

**kwargs – Keyword arguments passed to close_extras()

gymnasium.experimental.vector.SyncVectorEnv.call(self, name, *args, **kwargs) -> tuple

Calls the method with name and applies args and kwargs.

Parameters:
  • name – The method name

  • *args – The method args

  • **kwargs – The method kwargs

Returns:

Tuple of results

gymnasium.experimental.vector.SyncVectorEnv.get_attr(self, name: str)

Get a property from each parallel environment.

Parameters:

name (str) – Name of the property to get from each individual environment.

Returns:

The value of the property name from each sub-environment.

gymnasium.experimental.vector.SyncVectorEnv.set_attr(self, name: str, values: list | tuple | Any)

Sets an attribute of the sub-environments.

Parameters:
  • name – The property name to change

  • values – Values of the property to be set to. If values is a list or tuple, then it corresponds to the values for each individual environment, otherwise, a single value is set for all environments.

Raises:

ValueError – Values must be a list or tuple with length equal to the number of environments.