Vectorizing Environment#

gymnasium.experimental.VectorEnv#

class gymnasium.experimental.vector.VectorEnv#

Base class for vectorized environments to run multiple independent copies of the same environment in parallel.

Vector environments can provide a linear speed-up in steps taken per second by sampling multiple sub-environments at the same time. So that terminated environments do not have to wait until every sub-environment has terminated or truncated, the vector environments autoreset each sub-environment after it terminates or truncates. As a result, the final step’s observation and info are overwritten by the reset’s observation and info. The observation and info for the final step of a sub-environment are therefore stored in the info parameter under “final_observation” and “final_info” respectively. See step() for more information.
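
The autoreset behaviour described above can be illustrated with a minimal, library-free sketch. CountdownEnv and vector_step are hypothetical names invented for this illustration and are not part of Gymnasium; plain Python lists stand in for the batched NumPy arrays.

```python
class CountdownEnv:
    """Hypothetical toy environment that terminates after `horizon` steps."""

    def __init__(self, horizon):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t, {}

    def step(self, action):
        self.t += 1
        terminated = self.t >= self.horizon
        return self.t, 1.0, terminated, False, {"t": self.t}


def vector_step(envs, actions):
    """Step every sub-environment, autoresetting those that just finished."""
    num_envs = len(envs)
    observations, rewards, terminations, truncations = [], [], [], []
    infos = {
        "final_observation": [None] * num_envs,
        "final_info": [None] * num_envs,
    }
    for i, (env, action) in enumerate(zip(envs, actions)):
        obs, reward, terminated, truncated, info = env.step(action)
        if terminated or truncated:
            # Save the last observation and info before reset overwrites them.
            infos["final_observation"][i] = obs
            infos["final_info"][i] = info
            obs, _ = env.reset()
        observations.append(obs)
        rewards.append(reward)
        terminations.append(terminated)
        truncations.append(truncated)
    return observations, rewards, terminations, truncations, infos


envs = [CountdownEnv(horizon=2), CountdownEnv(horizon=3)]
for env in envs:
    env.reset()

first = vector_step(envs, [0, 0])
second = vector_step(envs, [0, 0])
print(second[0])                       # observations, env 0 already reset
print(second[4]["final_observation"])  # last pre-reset observation of env 0
```

On the second call the first toy environment finishes, so its returned observation is the reset observation while its true last observation is only available through the info dictionary.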

The vector environments batch observations, rewards, terminations, truncations and info for each parallel environment. In addition, step() expects to receive a batch of actions for each parallel environment.

Gymnasium contains two types of Vector environments: AsyncVectorEnv and SyncVectorEnv.

The vector environments have the following additional attributes to help users understand the implementation:

num_envs - The number of sub-environments in the vector environment
observation_space - The batched observation space of the vector environment
single_observation_space - The observation space of a single sub-environment
action_space - The batched action space of the vector environment
single_action_space - The action space of a single sub-environment

Note

Before OpenAI Gym v25, the info parameter of reset() and step() was a list of dictionaries, one for each sub-environment. In OpenAI Gym v25+ and in Gymnasium, it is instead a single dictionary with a NumPy array for each key. To use the old info style, apply the VectorListInfo wrapper.
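
The difference between the two info layouts can be sketched as follows. This is only an illustration with plain lists in place of NumPy arrays: the function names are hypothetical, and the `_`-prefixed boolean mask that flags which sub-environments supplied a key is an assumption about the dictionary layout, not something stated in this section.

```python
def list_to_dict_info(list_infos):
    """Convert per-environment info dicts into one dict of per-key batches."""
    num_envs = len(list_infos)
    dict_info = {}
    for i, info in enumerate(list_infos):
        for key, value in info.items():
            if key not in dict_info:
                dict_info[key] = [None] * num_envs
                # Assumed convention: "_key" marks which entries are valid.
                dict_info["_" + key] = [False] * num_envs
            dict_info[key][i] = value
            dict_info["_" + key][i] = True
    return dict_info


def dict_to_list_info(dict_info, num_envs):
    """Convert a dict of per-key batches back into per-environment dicts."""
    list_infos = [{} for _ in range(num_envs)]
    for key, values in dict_info.items():
        if key.startswith("_"):
            continue  # mask entries are not user data
        mask = dict_info["_" + key]
        for i in range(num_envs):
            if mask[i]:
                list_infos[i][key] = values[i]
    return list_infos


old_style = [{"lives": 3}, {}, {"lives": 1, "bonus": True}]
new_style = list_to_dict_info(old_style)
round_trip = dict_to_list_info(new_style, num_envs=3)
```

The dictionary layout keeps each key in one batch, which is convenient for array operations, while the list layout is easier to read one sub-environment at a time.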

Note

To render the sub-environments, use call() with “render” as the name argument. Remember to set render_mode for every sub-environment during initialization.

Note

All parallel environments should share the identical observation and action spaces. In other words, a vector of multiple different environments is not supported.

gymnasium.experimental.vector.VectorEnv.reset(self, *, seed: int | list[int] | None = None, options: dict[str, Any] | None = None) → tuple[ObsType, dict[str, Any]]#

Reset all parallel environments and return a batch of initial observations and info.

Parameters:

seed – The environment reset seeds
options – Option information for the environment reset

Returns:

A batch of observations and info from the vectorized environment.

Example

>>> import gymnasium as gym
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> envs.reset(seed=42)
(array([[ 0.0273956 , -0.00611216,  0.03585979,  0.0197368 ],
       [ 0.01522993, -0.04562247, -0.04799704,  0.03392126],
       [-0.03774345, -0.02418869, -0.00942293,  0.0469184 ]],
      dtype=float32), {})
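
Since seed may be a single int, a list of ints, or None, one plausible way to turn it into one seed per sub-environment is sketched below. The helper name expand_seeds is hypothetical, and incrementing a single integer seed by the sub-environment index is an assumption about the behaviour rather than something this section documents.

```python
def expand_seeds(seed, num_envs):
    """Return one seed per sub-environment (assumed expansion rule)."""
    if seed is None:
        return [None] * num_envs
    if isinstance(seed, int):
        # Assumption: offset the base seed so sub-environments differ.
        return [seed + i for i in range(num_envs)]
    seeds = list(seed)
    if len(seeds) != num_envs:
        raise ValueError(
            f"Expected {num_envs} seeds, one per sub-environment, got {len(seeds)}."
        )
    return seeds


per_env_seeds = expand_seeds(42, num_envs=3)
explicit_seeds = expand_seeds([7, 8, 9], num_envs=3)
```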

gymnasium.experimental.vector.VectorEnv.step(self, actions: ActType) → tuple[ObsType, ArrayType, ArrayType, ArrayType, dict]#

Take an action for each parallel environment.

Parameters: actions – Batch of actions (an element of action_space).
Returns: Batch of (observations, rewards, terminations, truncations, infos)

Note

Because the vector environments autoreset terminated and truncated sub-environments, the returned observation and info are not the final step’s observation and info; those are instead stored in info under “final_observation” and “final_info”.
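
When a training loop needs the observation that actually followed each action (for example, to bootstrap a value estimate), it can be recovered from the info dictionary. The helper below is a hypothetical sketch that assumes “final_observation” holds one entry per sub-environment, with None where no episode ended; plain lists stand in for NumPy arrays.

```python
def true_next_observations(observations, terminations, truncations, infos):
    """Return, per sub-environment, the observation that followed the action."""
    finals = infos.get("final_observation")
    result = []
    for i, obs in enumerate(observations):
        finished = terminations[i] or truncations[i]
        if finished and finals is not None and finals[i] is not None:
            # The returned observation is already the reset observation,
            # so use the stored final observation instead.
            result.append(finals[i])
        else:
            result.append(obs)
    return result


observations = [[0.0, 0.1], [0.5, 0.6], [0.0, 0.2]]
terminations = [True, False, False]
truncations = [False, False, True]
infos = {"final_observation": [[2.4, 1.0], None, [1.1, 0.3]]}

next_obs = true_next_observations(observations, terminations, truncations, infos)
```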

Example

>>> import gymnasium as gym
>>> import numpy as np
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> _ = envs.reset(seed=42)
>>> actions = np.array([1, 0, 1])
>>> observations, rewards, termination, truncation, infos = envs.step(actions)
>>> observations
array([[ 0.02727336,  0.18847767,  0.03625453, -0.26141977],
       [ 0.01431748, -0.24002443, -0.04731862,  0.3110827 ],
       [-0.03822722,  0.1710671 , -0.00848456, -0.2487226 ]],
      dtype=float32)
>>> rewards
array([1., 1., 1.])
>>> termination
array([False, False, False])
>>> truncation
array([False, False, False])
>>> infos
{}

gymnasium.experimental.vector.VectorEnv.close(self, **kwargs)#

Close all parallel environments and release resources.

It also closes all the existing image viewers, then calls close_extras() and sets closed to True.

Warning

This function does not itself close the environments; that should be handled in close_extras(). This is generic for both synchronous and asynchronous vectorized environments.

Note

This will be called automatically when the object is garbage collected or the program exits.

Parameters: **kwargs – Keyword arguments passed to close_extras()
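
The split between close() and close_extras() follows a template-method pattern, which can be sketched as below. The class names are hypothetical and the bodies are deliberately minimal; in particular, making a second close() call a no-op is an assumption of this sketch.

```python
class ToyEnv:
    """Hypothetical sub-environment that only records whether it was closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class BaseVectorEnv:
    """Generic close() that delegates the real work to close_extras()."""

    def __init__(self):
        self.closed = False

    def close(self, **kwargs):
        if self.closed:
            return  # assumed: closing twice does nothing
        self.close_extras(**kwargs)
        self.closed = True

    def close_extras(self, **kwargs):
        """Subclasses release their own resources here."""


class ListVectorEnv(BaseVectorEnv):
    """Subclass that owns a list of sub-environments."""

    def __init__(self, envs):
        super().__init__()
        self.envs = envs

    def close_extras(self, **kwargs):
        for env in self.envs:
            env.close()


vec = ListVectorEnv([ToyEnv(), ToyEnv()])
vec.close()
vec.close()  # second call is a no-op in this sketch
```

Keeping the bookkeeping in the base class means synchronous and asynchronous subclasses only have to describe how their own sub-environments are shut down.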

gymnasium.experimental.vector.AsyncVectorEnv#

class gymnasium.experimental.vector.AsyncVectorEnv(env_fns: Sequence[Callable[[], Env]], shared_memory: bool = True, copy: bool = True, context: str | None = None, daemon: bool = True, worker: callable | None = None)#

Vectorized environment that runs multiple environments in parallel.

It uses multiprocessing processes, and pipes for communication.

Example

>>> import gymnasium as gym
>>> env = gym.vector.AsyncVectorEnv([
...     lambda: gym.make("Pendulum-v1", g=9.81),
...     lambda: gym.make("Pendulum-v1", g=1.62)
... ])
>>> env.reset(seed=42)
(array([[-0.14995256,  0.9886932 , -0.12224312],
       [ 0.5760367 ,  0.8174238 , -0.91244936]], dtype=float32), {})

Parameters:

env_fns – Functions that create the environments.
shared_memory – If True, then the observations from the worker processes are communicated back through shared variables. This can improve the efficiency if the observations are large (e.g. images).
copy – If True, then the reset() and step() methods return a copy of the observations.
context – Context for multiprocessing. If None, then the default context is used.
daemon – If True, then subprocesses have the daemon flag turned on; that is, they will quit if the head process quits. However, daemon=True prevents subprocesses from spawning children, so for some environments you may want to set it to False.
worker – If set, then use that worker in a subprocess instead of a default one. Can be useful to override some inner vector env logic, for instance, how resets on termination or truncation are handled.

Warning: worker is an advanced option. It provides a high degree of flexibility and a high chance to shoot yourself in the foot; thus, if you are writing your own worker, it is recommended to start from the code of the _worker (or _worker_shared_memory) function and add your changes.
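
To give a feel for what a worker does, the sketch below shows a command loop that serves requests over a pipe and applies its own reset-on-termination logic. It is not the library's _worker: the function name, the command strings, and the toy environment are all hypothetical, and a thread stands in for the subprocess so that the example runs the same way on every platform.

```python
import multiprocessing
import threading


class CounterEnv:
    """Hypothetical toy environment that terminates after two steps."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t, {}

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 2, False, {}


def toy_worker(env_fn, pipe):
    """Serve commands received on `pipe` until told to close."""
    env = env_fn()
    while True:
        command, data = pipe.recv()
        if command == "reset":
            pipe.send(env.reset())
        elif command == "step":
            obs, reward, terminated, truncated, info = env.step(data)
            if terminated or truncated:
                # Custom workers can change this reset handling.
                info = {"final_observation": obs, "final_info": info}
                obs, _ = env.reset()
            pipe.send((obs, reward, terminated, truncated, info))
        elif command == "close":
            pipe.send(None)
            break


parent_pipe, child_pipe = multiprocessing.Pipe()
# A thread replaces the subprocess here purely to keep the sketch portable.
runner = threading.Thread(target=toy_worker, args=(CounterEnv, child_pipe))
runner.start()

parent_pipe.send(("reset", None))
reset_result = parent_pipe.recv()
parent_pipe.send(("step", 0))
first_step = parent_pipe.recv()
parent_pipe.send(("step", 0))
second_step = parent_pipe.recv()
parent_pipe.send(("close", None))
parent_pipe.recv()
runner.join()
```

In this toy protocol the parent sends a (command, data) pair and waits for one reply per command, which is the part a custom worker would most likely need to preserve.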

Raises:

RuntimeError – If the observation space of some sub-environment does not match observation_space (or, by default, the observation space of the first sub-environment).
ValueError – If observation_space is a custom space (i.e. not a default space in Gymnasium, such as gymnasium.spaces.Box, gymnasium.spaces.Discrete, or gymnasium.spaces.Dict) and shared_memory is True.

gymnasium.experimental.vector.AsyncVectorEnv.reset(self, *, seed: int | list[int] | None = None, options: dict | None = None)#

Reset all parallel environments and return a batch of initial observations and info.

Parameters:

seed – The environment reset seeds
options – Option information for the environment reset

Returns:

A batch of observations and info from the vectorized environment.

gymnasium.experimental.vector.AsyncVectorEnv.step(self, actions)#

Take an action for each parallel environment.

Parameters: actions – Batch of actions (an element of action_space).
Returns: Batch of (observations, rewards, terminations, truncations, infos)

gymnasium.experimental.vector.AsyncVectorEnv.close(self, **kwargs)#

Close all parallel environments and release resources.

It also closes all the existing image viewers, then calls close_extras() and sets closed to True.

Warning

This function does not itself close the environments; that should be handled in close_extras(). This is generic for both synchronous and asynchronous vectorized environments.

Note

This will be called automatically when the object is garbage collected or the program exits.

Parameters: **kwargs – Keyword arguments passed to close_extras()

gymnasium.experimental.vector.AsyncVectorEnv.call(self, name: str, *args, **kwargs) → list[Any]#

Call a method, or get a property, from each parallel environment.

Parameters:

name (str) – Name of the method or property to call.
*args – Arguments to apply to the method call.
**kwargs – Keyword arguments to apply to the method call.

Returns:

List of the results of the individual calls to the method or property for each environment.

gymnasium.experimental.vector.AsyncVectorEnv.get_attr(self, name: str)#

Get a property from each parallel environment.

Parameters: name (str) – Name of the property to get from each individual environment.
Returns: The property with the given name, one value per sub-environment

gymnasium.experimental.vector.AsyncVectorEnv.set_attr(self, name: str, values: list[Any] | tuple[Any] | object)#

Sets an attribute of the sub-environments.

Parameters:

name – Name of the property to be set in each individual environment.
values – Values of the property to be set to. If values is a list or tuple, then it corresponds to the values for each individual environment, otherwise a single value is set for all environments.

Raises:

ValueError – Values must be a list or tuple with length equal to the number of environments.
AlreadyPendingCallError – Calling set_attr while waiting for a pending call to complete.

gymnasium.experimental.vector.SyncVectorEnv#

class gymnasium.experimental.vector.SyncVectorEnv(env_fns: Iterator[Callable[[], Env]], copy: bool = True)#

Vectorized environment that serially runs multiple environments.

Example

>>> import gymnasium as gym
>>> env = gym.vector.SyncVectorEnv([
...     lambda: gym.make("Pendulum-v1", g=9.81),
...     lambda: gym.make("Pendulum-v1", g=1.62)
... ])
>>> env.reset(seed=42)
(array([[-0.14995256,  0.9886932 , -0.12224312],
       [ 0.5760367 ,  0.8174238 , -0.91244936]], dtype=float32), {})

Parameters:

env_fns – iterable of callable functions that create the environments.
copy – If True, then the reset() and step() methods return a copy of the observations.

Raises:

RuntimeError – If the observation space of some sub-environment does not match observation_space (or, by default, the observation space of the first sub-environment).

gymnasium.experimental.vector.SyncVectorEnv.reset(self, seed: int | list[int] | None = None, options: dict | None = None)#

Resets each sub-environment in turn and returns the batched initial observations and info.

Parameters:

seed – The reset environment seed
options – Option information for the environment reset

Returns:

The reset observation of the environment and reset information

gymnasium.experimental.vector.SyncVectorEnv.step(self, actions)#

Steps through each of the environments returning the batched results.

Returns: The batched environment step results

gymnasium.experimental.vector.SyncVectorEnv.close(self, **kwargs)#

Close all parallel environments and release resources.

It also closes all the existing image viewers, then calls close_extras() and sets closed to True.

Warning

This function does not itself close the environments; that should be handled in close_extras(). This is generic for both synchronous and asynchronous vectorized environments.

Note

This will be called automatically when the object is garbage collected or the program exits.

Parameters: **kwargs – Keyword arguments passed to close_extras()

gymnasium.experimental.vector.SyncVectorEnv.call(self, name, *args, **kwargs) → tuple#

Calls the method with name and applies args and kwargs.

Parameters:

name – The method name
*args – The method args
**kwargs – The method kwargs

Returns:

Tuple of results
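
The "method or property" behaviour of call() can be sketched with a small helper. call_each and ToyEnv are hypothetical names for this illustration; the sketch assumes that a callable attribute is invoked with the given arguments and that any other attribute is returned as-is.

```python
class ToyEnv:
    """Hypothetical sub-environment with one method and one property."""

    def __init__(self, gravity):
        self.gravity = gravity

    def render(self):
        return f"frame(g={self.gravity})"


def call_each(envs, name, *args, **kwargs):
    """Call `name` on every sub-environment, or read it if it is not callable."""
    results = []
    for env in envs:
        attribute = getattr(env, name)
        if callable(attribute):
            results.append(attribute(*args, **kwargs))
        else:
            results.append(attribute)
    return tuple(results)


envs = [ToyEnv(9.81), ToyEnv(1.62)]
frames = call_each(envs, "render")     # method: called on each environment
gravities = call_each(envs, "gravity") # property: read from each environment
```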

gymnasium.experimental.vector.SyncVectorEnv.get_attr(self, name: str)#

Get a property from each parallel environment.

Parameters: name (str) – Name of the property to get from each individual environment.
Returns: The property with the given name, one value per sub-environment

gymnasium.experimental.vector.SyncVectorEnv.set_attr(self, name: str, values: list | tuple | Any)#

Sets an attribute of the sub-environments.

Parameters:

name – The property name to change
values – Values of the property to be set to. If values is a list or tuple, then it corresponds to the values for each individual environment, otherwise, a single value is set for all environments.

Raises:

ValueError – Values must be a list or tuple with length equal to the number of environments.
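
The rule for values (a list or tuple gives one value per sub-environment, anything else is applied to all of them) can be sketched as follows. set_attr_each and ToyEnv are hypothetical names used only for this illustration.

```python
class ToyEnv:
    """Hypothetical sub-environment with a single settable attribute."""

    def __init__(self):
        self.gravity = 9.81


def set_attr_each(envs, name, values):
    """Set attribute `name` on every sub-environment."""
    if not isinstance(values, (list, tuple)):
        # A single value is repeated for every sub-environment.
        values = [values] * len(envs)
    if len(values) != len(envs):
        raise ValueError(
            "Values must be a list or tuple with length equal to the number "
            f"of environments. Got {len(values)} values for {len(envs)} environments."
        )
    for env, value in zip(envs, values):
        setattr(env, name, value)


envs = [ToyEnv(), ToyEnv(), ToyEnv()]
set_attr_each(envs, "gravity", 1.62)               # same value everywhere
shared = [env.gravity for env in envs]
set_attr_each(envs, "gravity", [9.81, 3.71, 1.62]) # one value per environment
individual = [env.gravity for env in envs]
```

One consequence of this rule is that an attribute whose value is itself a list cannot be broadcast directly; in this sketch it would have to be wrapped in an outer list with one entry per sub-environment.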