Vectorizing Environment#
gymnasium.experimental.VectorEnv#
- class gymnasium.experimental.vector.VectorEnv#
Base class for vectorized environments to run multiple independent copies of the same environment in parallel.
Vector environments can provide a linear speed-up in the steps taken per second by sampling multiple sub-environments at the same time. To prevent terminated environments from waiting until all sub-environments have terminated or truncated, the vector environments autoreset sub-environments after they terminate or truncate. As a result, the final step's observation and info are overwritten by the reset's observation and info. Therefore, the observation and info for the final step of a sub-environment are stored in the info parameter, under the keys "final_observation" and "final_info" respectively. See step() for more information.
The vector environments batch observations, rewards, terminations, truncations and info for each parallel environment. In addition, step() expects to receive a batch of actions for each parallel environment.
Gymnasium contains two types of vector environments: AsyncVectorEnv and SyncVectorEnv.
The vector environments have the following additional attributes to help users understand the implementation:
num_envs - The number of sub-environments in the vector environment
observation_space - The batched observation space of the vector environment
single_observation_space - The observation space of a single sub-environment
action_space - The batched action space of the vector environment
single_action_space - The action space of a single sub-environment
Note
Before OpenAI Gym v25, the info parameter of reset() and step() was a list containing one dictionary per sub-environment. In OpenAI Gym v25+ and in Gymnasium, this was changed to a single dictionary with a NumPy array for each key. To use the old info style, use the VectorListInfo wrapper.
Note
To render the sub-environments, use call() with "render" as the argument. Remember to set the render_mode for all the sub-environments during initialization.
Note
All parallel environments should share identical observation and action spaces. In other words, a vector of multiple different environments is not supported.
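The two info layouts described in the first note above can be contrasted with a small pure-Python sketch. The helper name `dict_info_to_list` is hypothetical and is not part of Gymnasium. It assumes every key holds exactly one value per sub-environment, which the real batched info may not guarantee.

```python
def dict_info_to_list(info, num_envs):
    """Convert a dict of per-key batches into a list of per-environment dicts.

    Hypothetical helper for illustration only; assumes each key stores
    exactly `num_envs` values, one per sub-environment.
    """
    per_env = [{} for _ in range(num_envs)]
    for key, values in info.items():
        if len(values) != num_envs:
            raise ValueError(
                f"key {key!r} has {len(values)} entries, expected {num_envs}"
            )
        for index, value in enumerate(values):
            per_env[index][key] = value
    return per_env


# New style: one dictionary, one batch of values per key.
batched_info = {"lives": [3, 2, 3], "score": [10.0, 4.5, 0.0]}

# Old style: one dictionary per sub-environment.
list_info = dict_info_to_list(batched_info, num_envs=3)
```

In practice the VectorListInfo wrapper mentioned above is the supported route; the sketch only shows how the two layouts relate.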
- gymnasium.experimental.vector.VectorEnv.reset(self, *, seed: int | list[int] | None = None, options: dict[str, Any] | None = None) tuple[ObsType, dict[str, Any]] #
Reset all parallel environments and return a batch of initial observations and info.
- Parameters:
seed – The seed, or list of per-sub-environment seeds, used to reset the environments
options – Option information for the environment reset
- Returns:
A batch of observations and info from the vectorized environment.
Example
>>> import gymnasium as gym
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> envs.reset(seed=42)
(array([[ 0.0273956 , -0.00611216,  0.03585979,  0.0197368 ],
       [ 0.01522993, -0.04562247, -0.04799704,  0.03392126],
       [-0.03774345, -0.02418869, -0.00942293,  0.0469184 ]],
      dtype=float32), {})
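Because seed accepts an int, a list of ints, or None, a single value has to be turned into one seed per sub-environment somewhere. The sketch below shows one plausible way to do that. The function name `expand_seed` is hypothetical, and the rule of offsetting an int seed by the sub-environment index is an assumption; the text above does not state the exact rule.

```python
def expand_seed(seed, num_envs):
    """Return one seed per sub-environment (hypothetical helper).

    Assumption: an int seed is offset by the sub-environment index so
    that every sub-environment receives a different seed.
    """
    if seed is None:
        return [None] * num_envs
    if isinstance(seed, int):
        return [seed + index for index in range(num_envs)]
    seeds = list(seed)
    if len(seeds) != num_envs:
        raise ValueError(
            f"expected {num_envs} seeds, got {len(seeds)}"
        )
    return seeds


per_env_seeds = expand_seed(42, num_envs=3)
explicit_seeds = expand_seed([7, 8, 9], num_envs=3)
```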
- gymnasium.experimental.vector.VectorEnv.step(self, actions: ActType) tuple[ObsType, ArrayType, ArrayType, ArrayType, dict] #
Take an action for each parallel environment.
- Parameters:
actions – element of action_space. Batch of actions.
- Returns:
Batch of (observations, rewards, terminations, truncations, infos)
Note
Because the vector environments autoreset terminating and truncating sub-environments, the returned observation and info are not those of the final step. The final step's observation and info are instead stored in info under "final_observation" and "final_info".
Example
>>> import gymnasium as gym
>>> import numpy as np
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> _ = envs.reset(seed=42)
>>> actions = np.array([1, 0, 1])
>>> observations, rewards, termination, truncation, infos = envs.step(actions)
>>> observations
array([[ 0.02727336,  0.18847767,  0.03625453, -0.26141977],
       [ 0.01431748, -0.24002443, -0.04731862,  0.3110827 ],
       [-0.03822722,  0.1710671 , -0.00848456, -0.2487226 ]],
      dtype=float32)
>>> rewards
array([1., 1., 1.])
>>> termination
array([False, False, False])
>>> truncation
array([False, False, False])
>>> infos
{}
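The autoreset behaviour described in the note above can be illustrated with a toy, pure-Python model. This is a sketch under stated assumptions and not Gymnasium's implementation: `CountdownEnv` and `autoreset_step` are hypothetical names, and the info layout (a list with None for sub-environments that did not finish) is simplified for illustration.

```python
class CountdownEnv:
    """Toy environment: the observation is a counter that terminates at `limit`."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0

    def reset(self):
        self.count = 0
        return self.count, {}

    def step(self, action):
        self.count += action
        terminated = self.count >= self.limit
        return self.count, 1.0, terminated, False, {"count": self.count}


def autoreset_step(envs, actions):
    """Step every toy environment, resetting the ones that finished."""
    observations, rewards, terminations, truncations = [], [], [], []
    infos = {
        "final_observation": [None] * len(envs),
        "final_info": [None] * len(envs),
    }
    for index, (env, action) in enumerate(zip(envs, actions)):
        obs, reward, terminated, truncated, info = env.step(action)
        if terminated or truncated:
            # Keep the true final step data before it is overwritten.
            infos["final_observation"][index] = obs
            infos["final_info"][index] = info
            obs, info = env.reset()
        observations.append(obs)
        rewards.append(reward)
        terminations.append(terminated)
        truncations.append(truncated)
    return observations, rewards, terminations, truncations, infos


envs = [CountdownEnv(limit=2), CountdownEnv(limit=5)]
for env in envs:
    env.reset()

autoreset_step(envs, [1, 1])           # no sub-environment finishes yet
result = autoreset_step(envs, [1, 1])  # the first sub-environment terminates
```

After the second call, the first entry of the returned observations is the reset observation, while the terminating observation is only available through "final_observation".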
- gymnasium.experimental.vector.VectorEnv.close(self, **kwargs)#
Close all parallel environments and release resources.
It also closes all the existing image viewers, then calls close_extras() and sets closed to True.
Warning
This function itself does not close the environments; that should be handled in close_extras(). This is generic for both synchronous and asynchronous vectorized environments.
Note
This will be automatically called when the object is garbage collected or the program exits.
- Parameters:
**kwargs – Keyword arguments passed to close_extras()
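The split between close() and close_extras() described above follows a template-method pattern, which can be sketched as follows. The class names `ToyVectorEnv` and `ToySyncEnv` are hypothetical, and the early return when the environment is already closed is an assumption rather than something the text above states.

```python
class ToyVectorEnv:
    """Toy base class: close() handles bookkeeping, close_extras() does the work."""

    def __init__(self):
        self.closed = False

    def close(self, **kwargs):
        if self.closed:
            # Assumption: closing twice is a no-op.
            return
        self.close_extras(**kwargs)
        self.closed = True

    def close_extras(self, **kwargs):
        """Hook for subclasses; the base class releases nothing."""


class ToySyncEnv(ToyVectorEnv):
    """Toy subclass that owns some resources and releases them in close_extras()."""

    def __init__(self, resources):
        super().__init__()
        self.resources = resources
        self.close_calls = 0

    def close_extras(self, **kwargs):
        self.close_calls += 1
        for resource in self.resources:
            resource["open"] = False


toy_env = ToySyncEnv([{"open": True}, {"open": True}])
toy_env.close()
toy_env.close()  # second call does nothing because closed is already True
```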
gymnasium.experimental.vector.AsyncVectorEnv#
- class gymnasium.experimental.vector.AsyncVectorEnv(env_fns: Sequence[Callable[[], Env]], shared_memory: bool = True, copy: bool = True, context: str | None = None, daemon: bool = True, worker: callable | None = None)#
Vectorized environment that runs multiple environments in parallel.
It uses multiprocessing processes, and pipes for communication.
Example
>>> import gymnasium as gym
>>> env = gym.vector.AsyncVectorEnv([
...     lambda: gym.make("Pendulum-v1", g=9.81),
...     lambda: gym.make("Pendulum-v1", g=1.62)
... ])
>>> env.reset(seed=42)
(array([[-0.14995256,  0.9886932 , -0.12224312],
       [ 0.5760367 ,  0.8174238 , -0.91244936]], dtype=float32), {})
- Parameters:
env_fns – Functions that create the environments.
shared_memory – If True, then the observations from the worker processes are communicated back through shared variables. This can improve efficiency if the observations are large (e.g. images).
copy – If True, then the reset() and step() methods return a copy of the observations.
context – Context for multiprocessing. If None, then the default context is used.
daemon – If True, then subprocesses have the daemon flag turned on; that is, they will quit if the head process quits. However, daemon=True prevents subprocesses from spawning children, so for some environments you may want to set it to False.
worker – If set, then use that worker in a subprocess instead of the default one. This can be useful to override some inner vector env logic, for instance, how resets on termination or truncation are handled.
- Warnings:
worker is an advanced option. It provides a high degree of flexibility and a high chance of shooting yourself in the foot; thus, if you are writing your own worker, it is recommended to start from the code for the _worker (or _worker_shared_memory) method and modify it.
- Raises:
RuntimeError – If the observation space of some sub-environment does not match observation_space (or, by default, the observation space of the first sub-environment).
ValueError – If observation_space is a custom space (i.e. not a default space in Gym, such as gymnasium.spaces.Box, gymnasium.spaces.Discrete, or gymnasium.spaces.Dict) and shared_memory is True.
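The process-and-pipe design mentioned above can be sketched with the standard library alone. This is a minimal illustration and not Gymnasium's worker: `_toy_worker` and `run_toy_async` are hypothetical names, the "environment" is just an integer that accumulates actions, and the sketch assumes the "fork" start method is available (Unix-like systems).

```python
import multiprocessing as mp


def _toy_worker(pipe, start):
    """Toy worker loop: holds an integer state and answers commands over a pipe."""
    state = start
    while True:
        command, data = pipe.recv()
        if command == "step":
            state += data
            pipe.send(state)
        elif command == "close":
            pipe.send(None)
            break
    pipe.close()


def run_toy_async(starts, actions, context="fork"):
    """Start one worker per state, send one action to each, collect the results."""
    ctx = mp.get_context(context)  # assumption: "fork" exists on this platform
    parent_pipes, processes = [], []
    for start in starts:
        parent_pipe, child_pipe = ctx.Pipe()
        process = ctx.Process(
            target=_toy_worker, args=(child_pipe, start), daemon=True
        )
        process.start()
        child_pipe.close()  # the parent keeps only its own end of the pipe
        parent_pipes.append(parent_pipe)
        processes.append(process)

    # Send every action first, then wait, so the workers can run concurrently.
    for pipe, action in zip(parent_pipes, actions):
        pipe.send(("step", action))
    results = [pipe.recv() for pipe in parent_pipes]

    for pipe in parent_pipes:
        pipe.send(("close", None))
        pipe.recv()
        pipe.close()
    for process in processes:
        process.join()
    return results
```

For example, `run_toy_async([0, 10], [1, 2])` sends one action to each of two workers and gathers their new states in order. Sending all commands before receiving any reply is what lets the workers overlap their work.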
- gymnasium.experimental.vector.AsyncVectorEnv.reset(self, *, seed: int | list[int] | None = None, options: dict | None = None)#
Reset all parallel environments and return a batch of initial observations and info.
- Parameters:
seed – The seed, or list of per-sub-environment seeds, used to reset the environments
options – Option information for the environment reset
- Returns:
A batch of observations and info from the vectorized environment.
- gymnasium.experimental.vector.AsyncVectorEnv.step(self, actions)#
Take an action for each parallel environment.
- Parameters:
actions – element of action_space. Batch of actions.
- Returns:
Batch of (observations, rewards, terminations, truncations, infos)
- gymnasium.experimental.vector.AsyncVectorEnv.close(self, **kwargs)#
Close all parallel environments and release resources.
It also closes all the existing image viewers, then calls close_extras() and sets closed to True.
Warning
This function itself does not close the environments; that should be handled in close_extras(). This is generic for both synchronous and asynchronous vectorized environments.
Note
This will be automatically called when the object is garbage collected or the program exits.
- Parameters:
**kwargs – Keyword arguments passed to close_extras()
- gymnasium.experimental.vector.AsyncVectorEnv.call(self, name: str, *args, **kwargs) list[Any] #
Call a method, or get a property, from each parallel environment.
- Parameters:
name (str) – Name of the method or property to call.
*args – Arguments to apply to the method call.
**kwargs – Keyword arguments to apply to the method call.
- Returns:
List of the results of the individual calls to the method or property for each environment.
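The "method or property" behaviour of call() can be sketched in pure Python. The names `call_each` and `ToyEnv` are hypothetical, and the sketch runs in a single process, whereas AsyncVectorEnv would forward the request to its workers.

```python
def call_each(envs, name, *args, **kwargs):
    """Call `name` on every environment, or read it if it is not callable."""
    results = []
    for env in envs:
        attribute = getattr(env, name)
        if callable(attribute):
            results.append(attribute(*args, **kwargs))
        else:
            # Plain properties are returned as-is; args and kwargs are unused.
            results.append(attribute)
    return results


class ToyEnv:
    """Toy environment with one property and one method."""

    def __init__(self, gravity):
        self.gravity = gravity

    def scaled_gravity(self, factor=1):
        return self.gravity * factor


toy_envs = [ToyEnv(gravity=10), ToyEnv(gravity=2)]
gravities = call_each(toy_envs, "gravity")
scaled = call_each(toy_envs, "scaled_gravity", factor=3)
```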
- gymnasium.experimental.vector.AsyncVectorEnv.get_attr(self, name: str)#
Get a property from each parallel environment.
- Parameters:
name (str) – Name of the property to get from each individual environment.
- Returns:
The property with the given name, from each environment
- gymnasium.experimental.vector.AsyncVectorEnv.set_attr(self, name: str, values: list[Any] | tuple[Any] | object)#
Sets an attribute of the sub-environments.
- Parameters:
name – Name of the property to be set in each individual environment.
values – Values of the property to be set to. If values is a list or tuple, then it corresponds to the values for each individual environment; otherwise, a single value is set for all environments.
- Raises:
ValueError – If values is a list or tuple whose length is not equal to the number of environments.
AlreadyPendingCallError – Calling set_attr while waiting for a pending call to complete.
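The handling of values described above (one value per environment for a list or tuple, otherwise the same value for all) can be sketched as follows. The names `set_attr_each` and `ToyEnv` are hypothetical, and the sketch runs in a single process, so it has no equivalent of AlreadyPendingCallError.

```python
def set_attr_each(envs, name, values):
    """Set attribute `name` on every environment (hypothetical helper)."""
    if not isinstance(values, (list, tuple)):
        # A single value is broadcast to all environments.
        values = [values] * len(envs)
    if len(values) != len(envs):
        raise ValueError(
            f"got {len(values)} values for {len(envs)} environments"
        )
    for env, value in zip(envs, values):
        setattr(env, name, value)


class ToyEnv:
    """Toy environment with a single settable attribute."""

    def __init__(self):
        self.gravity = 0


toy_envs = [ToyEnv(), ToyEnv()]
set_attr_each(toy_envs, "gravity", 5)          # same value everywhere
broadcast = [env.gravity for env in toy_envs]
set_attr_each(toy_envs, "gravity", [10, 2])    # one value per environment
per_env = [env.gravity for env in toy_envs]
```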
gymnasium.experimental.vector.SyncVectorEnv#
- class gymnasium.experimental.vector.SyncVectorEnv(env_fns: Iterator[Callable[[], Env]], copy: bool = True)#
Vectorized environment that serially runs multiple environments.
Example
>>> import gymnasium as gym
>>> env = gym.vector.SyncVectorEnv([
...     lambda: gym.make("Pendulum-v1", g=9.81),
...     lambda: gym.make("Pendulum-v1", g=1.62)
... ])
>>> env.reset(seed=42)
(array([[-0.14995256,  0.9886932 , -0.12224312],
       [ 0.5760367 ,  0.8174238 , -0.91244936]], dtype=float32), {})
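- Parameters:
env_fns – Functions that create the environments.
copy – If True, then the reset() and step() methods return a copy of the observations.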
- Raises:
RuntimeError – If the observation space of some sub-environment does not match observation_space (or, by default, the observation space of the first sub-environment).
- gymnasium.experimental.vector.SyncVectorEnv.reset(self, seed: int | list[int] | None = None, options: dict | None = None)#
Waits for the calls triggered by reset_async() to finish and returns the results.
- Parameters:
seed – The reset environment seed
options – Option information for the environment reset
- Returns:
The reset observation of the environment and reset information
- gymnasium.experimental.vector.SyncVectorEnv.step(self, actions)#
Steps through each of the environments returning the batched results.
- Returns:
The batched environment step results
- gymnasium.experimental.vector.SyncVectorEnv.close(self, **kwargs)#
Close all parallel environments and release resources.
It also closes all the existing image viewers, then calls close_extras() and sets closed to True.
Warning
This function itself does not close the environments; that should be handled in close_extras(). This is generic for both synchronous and asynchronous vectorized environments.
Note
This will be automatically called when the object is garbage collected or the program exits.
- Parameters:
**kwargs – Keyword arguments passed to close_extras()
- gymnasium.experimental.vector.SyncVectorEnv.call(self, name, *args, **kwargs) tuple #
Calls the method with name and applies args and kwargs.
- Parameters:
name – The method name
*args – The method args
**kwargs – The method kwargs
- Returns:
Tuple of results
- gymnasium.experimental.vector.SyncVectorEnv.get_attr(self, name: str)#
Get a property from each parallel environment.
- Parameters:
name (str) – Name of the property to get from each individual environment.
- Returns:
The property with the given name, from each environment
- gymnasium.experimental.vector.SyncVectorEnv.set_attr(self, name: str, values: list | tuple | Any)#
Sets an attribute of the sub-environments.
- Parameters:
name – The property name to change
values – Values of the property to be set to. If values is a list or tuple, then it corresponds to the values for each individual environment; otherwise, a single value is set for all environments.
- Raises:
ValueError – If values is a list or tuple whose length is not equal to the number of environments.