Vector
Gymnasium.vector.VectorEnv
- class gymnasium.vector.VectorEnv(num_envs: int, observation_space: Space, action_space: Space)
Base class for vectorized environments to run multiple independent copies of the same environment in parallel.
Vector environments can provide a linear speed-up in the steps taken per second by sampling multiple sub-environments at the same time. To prevent terminated environments from waiting until all sub-environments have terminated or truncated, the vector environments autoreset sub-environments after they terminate or truncate. As a result, the final step’s observation and info are overwritten by the reset’s observation and info. Therefore, the observation and info for the final step of a sub-environment are stored in the info parameter, under “final_observation” and “final_info” respectively. See step() for more information.
The vector environments batch observations, rewards, terminations, truncations and info for each parallel environment. In addition, step() expects to receive a batch of actions, one for each parallel environment.
Gymnasium contains two types of Vector environments: AsyncVectorEnv and SyncVectorEnv.
The Vector Environments have the following additional attributes to help users understand the implementation:
num_envs - The number of sub-environments in the vector environment
observation_space - The batched observation space of the vector environment
single_observation_space - The observation space of a single sub-environment
action_space - The batched action space of the vector environment
single_action_space - The action space of a single sub-environment
Note
Before OpenAI Gym v25, the info parameter of reset() and step() was a list of dictionaries, one for each sub-environment. This was changed in OpenAI Gym v25+ and in Gymnasium to a single dictionary with a NumPy array for each key. To use the old info style, use the VectorListInfo wrapper.
Note
To render the sub-environments, use call() with “render” arguments. Remember to set the render_mode for all the sub-environments during initialization.
Note
All parallel environments should share identical observation and action spaces. In other words, a vector of multiple different environments is not supported.
- Parameters:
num_envs – Number of environments in the vectorized environment.
observation_space – Observation space of a single environment.
action_space – Action space of a single environment.
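The two info layouts described in the note above can be compared with a small sketch. It assumes, beyond what the text states, that the dictionary layout pairs each key with a boolean mask stored under the same key prefixed by an underscore, marking which sub-environments reported a value. info_dict_to_list is a hypothetical helper written with plain lists instead of NumPy arrays; it is not the VectorListInfo implementation.

```python
def info_dict_to_list(infos, num_envs):
    """Turn {key: per-env values, "_key": per-env mask} into one dict per sub-environment."""
    per_env = [{} for _ in range(num_envs)]
    for key, values in infos.items():
        if key.startswith("_"):
            continue  # mask entries are read together with their data key
        # Assumption: a missing mask means every sub-environment reported the key.
        mask = infos.get(f"_{key}", [True] * num_envs)
        for index in range(num_envs):
            if mask[index]:
                per_env[index][key] = values[index]
    return per_env


# Dictionary layout: sub-environment 1 did not report "lives" on this step.
batched = {"lives": [3, 0, 2], "_lives": [True, False, True]}
listed = info_dict_to_list(batched, num_envs=3)
print(listed)
```

The list layout is convenient for per-environment logging, while the dictionary layout keeps each key in one batched container.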
Methods
- VectorEnv.reset(*, seed: int | List[int] | None = None, options: dict | None = None)
Reset all parallel environments and return a batch of initial observations and info.
- Parameters:
seed – The environment reset seeds, given as a single integer or as a list with one seed per sub-environment
options – Optional dictionary of reset options
- Returns:
A batch of observations and info from the vectorized environment.
Example
>>> import gymnasium as gym
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> envs.reset(seed=42)
(array([[ 0.0273956 , -0.00611216,  0.03585979,  0.0197368 ],
       [ 0.01522993, -0.04562247, -0.04799704,  0.03392126],
       [-0.03774345, -0.02418869, -0.00942293,  0.0469184 ]],
      dtype=float32), {})
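Because seed accepts either a single integer or a list, each sub-environment needs its own seed in the end. The sketch below shows one plausible expansion rule. The increment-by-index rule is an assumption and is not stated above, and expand_seed is a hypothetical helper.

```python
from typing import List, Optional, Union


def expand_seed(seed: Union[int, List[int], None], num_envs: int) -> List[Optional[int]]:
    """Return one seed per sub-environment."""
    if seed is None:
        return [None] * num_envs  # every sub-environment picks its own entropy
    if isinstance(seed, int):
        # Assumed rule: offset the base seed so sub-environments do not share a stream.
        return [seed + index for index in range(num_envs)]
    if len(seed) != num_envs:
        raise ValueError(f"expected {num_envs} seeds, got {len(seed)}")
    return list(seed)


print(expand_seed(42, num_envs=3))
```

Giving each copy a distinct seed keeps the parallel copies from producing identical trajectories.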
- VectorEnv.step(actions) -> Tuple[Any, ndarray[Any, dtype[Any]], ndarray[Any, dtype[Any]], ndarray[Any, dtype[Any]], dict]
Take an action for each parallel environment.
- Parameters:
actions – Batch of actions; an element of action_space.
- Returns:
Batch of (observations, rewards, terminations, truncations, infos)
Note
Because the vector environments autoreset terminated and truncated sub-environments, the returned observation and info for such a sub-environment come from the reset, not from the final step. The final step’s observation and info are instead stored in info under “final_observation” and “final_info”.
Example
>>> import gymnasium as gym
>>> import numpy as np
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> _ = envs.reset(seed=42)
>>> actions = np.array([1, 0, 1])
>>> observations, rewards, termination, truncation, infos = envs.step(actions)
>>> observations
array([[ 0.02727336,  0.18847767,  0.03625453, -0.26141977],
       [ 0.01431748, -0.24002443, -0.04731862,  0.3110827 ],
       [-0.03822722,  0.1710671 , -0.00848456, -0.2487226 ]],
      dtype=float32)
>>> rewards
array([1., 1., 1.])
>>> termination
array([False, False, False])
>>> truncation
array([False, False, False])
>>> infos
{}
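The autoreset behaviour described in the note can be illustrated with a toy model. This is a minimal sketch in plain Python and not the library's implementation: CountdownEnv and step_with_autoreset are hypothetical names, and the per-step infos of non-terminating sub-environments are dropped for brevity.

```python
class CountdownEnv:
    """Hypothetical toy environment whose observation counts down to zero."""

    def __init__(self, start):
        self.start = start
        self.remaining = start

    def reset(self):
        self.remaining = self.start
        return self.remaining, {}

    def step(self, action):
        self.remaining -= 1
        terminated = self.remaining == 0
        return self.remaining, 1.0, terminated, False, {"remaining": self.remaining}


def step_with_autoreset(envs, actions):
    """Step every sub-environment; reset finished ones and keep their last step in infos."""
    num_envs = len(envs)
    observations, rewards, terminations, truncations = [], [], [], []
    infos = {"final_observation": [None] * num_envs, "final_info": [None] * num_envs}
    for index, (env, action) in enumerate(zip(envs, actions)):
        obs, reward, terminated, truncated, info = env.step(action)
        if terminated or truncated:
            # Save the true final step before the reset overwrites it.
            infos["final_observation"][index] = obs
            infos["final_info"][index] = info
            obs, _ = env.reset()
        observations.append(obs)
        rewards.append(reward)
        terminations.append(terminated)
        truncations.append(truncated)
    return observations, rewards, terminations, truncations, infos


envs = [CountdownEnv(1), CountdownEnv(3)]
for env in envs:
    env.reset()
observations, rewards, terminations, truncations, infos = step_with_autoreset(envs, [0, 0])
print(observations, terminations, infos["final_observation"])
```

Here the first sub-environment terminates on its first step, so its returned observation is already the reset observation, and the value seen at termination is only available under “final_observation”.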
- VectorEnv.close(**kwargs)
Close all parallel environments and release resources.
It also closes all the existing image viewers, then calls close_extras() and sets closed to True.
Warning
This function itself does not close the environments; that should be handled in close_extras(). This is generic for both synchronous and asynchronous vectorized environments.
Note
This will be called automatically when the vector environment is garbage collected or the program exits.
- Parameters:
**kwargs – Keyword arguments passed to close_extras()
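The split between close() and close_extras() follows a template-method pattern, which can be sketched as below. SketchVectorEnv and ListBackedVectorEnv are hypothetical classes, and the early return on an already closed environment is an assumption and is not stated above.

```python
class SketchVectorEnv:
    """Hypothetical base class: close() is generic, close_extras() is the hook."""

    def __init__(self):
        self.closed = False

    def close_extras(self, **kwargs):
        """Subclasses release their sub-environments here; the base does nothing."""

    def close(self, **kwargs):
        if self.closed:
            return  # assumption: repeated close() calls are no-ops
        self.close_extras(**kwargs)
        self.closed = True


class ListBackedVectorEnv(SketchVectorEnv):
    """Hypothetical subclass that owns a list of stand-in sub-environments."""

    def __init__(self, resources):
        super().__init__()
        self.resources = resources
        self.close_calls = 0

    def close_extras(self, **kwargs):
        self.close_calls += 1
        self.resources.clear()  # stands in for closing each sub-environment


env = ListBackedVectorEnv(["env-0", "env-1"])
env.close()
env.close()  # second call does nothing in this sketch
print(env.closed, env.close_calls, env.resources)
```

Keeping the generic bookkeeping in close() lets synchronous and asynchronous subclasses differ only in how close_extras() tears down their sub-environments.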
Attributes
- action_space
The (batched) action space. The input actions of step must be valid elements of action_space:
>>> import gymnasium
>>> envs = gymnasium.vector.make("CartPole-v1", num_envs=3)
>>> envs.action_space
MultiDiscrete([2 2 2])
- observation_space
The (batched) observation space. The observations returned by reset and step are valid elements of observation_space:
>>> import gymnasium
>>> envs = gymnasium.vector.make("CartPole-v1", num_envs=3)
>>> envs.observation_space
Box([[-4.8 ...]], [[4.8 ...]], (3, 4), float32)
- single_action_space
The action space of an environment copy:
>>> import gymnasium
>>> envs = gymnasium.vector.make("CartPole-v1", num_envs=3)
>>> envs.single_action_space
Discrete(2)
- single_observation_space
The observation space of an environment copy:
>>> import gymnasium
>>> envs = gymnasium.vector.make("CartPole-v1", num_envs=3)
>>> envs.single_observation_space
Box([-4.8 ...], [4.8 ...], (4,), float32)
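The relationship between the single and batched spaces in the examples above can be summarised with a small sketch: the batched space adds a leading num_envs dimension, and a batched action is valid when each entry is valid for the single space. Both helpers are hypothetical and use plain Python values rather than Gymnasium space objects.

```python
def batched_shape(single_shape, num_envs):
    """Shape of the batched space: one leading axis of length num_envs."""
    return (num_envs, *single_shape)


def is_valid_discrete_batch(actions, n, num_envs):
    """Check a batch of actions against num_envs copies of a Discrete(n)-like space."""
    return len(actions) == num_envs and all(
        isinstance(action, int) and 0 <= action < n for action in actions
    )


obs_shape = batched_shape((4,), num_envs=3)
print(obs_shape)
print(is_valid_discrete_batch([1, 0, 1], n=2, num_envs=3))
```

This mirrors the CartPole examples, where a single observation of shape (4,) becomes a batch of shape (3, 4) and three Discrete(2) action spaces are reported as MultiDiscrete([2 2 2]).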
Making Vector Environments
- gymnasium.vector.make(id: str, num_envs: int = 1, asynchronous: bool = True, wrappers: Callable[[Env], Env] | List[Callable[[Env], Env]] | None = None, disable_env_checker: bool | None = None, **kwargs) -> VectorEnv
Create a vectorized environment from multiple copies of an environment, from its id.
- Parameters:
id – The environment ID. This must be a valid ID from the registry.
num_envs – Number of copies of the environment.
asynchronous – If True, wraps the environments in an AsyncVectorEnv (which uses multiprocessing to run the environments in parallel). If False, wraps the environments in a SyncVectorEnv.
wrappers – If not None, then apply the wrappers to each internal environment during creation.
disable_env_checker – Whether to disable the env checker, which otherwise runs for the first environment only. None defaults to the environment spec’s disable_env_checker parameter (False by default); otherwise the checker runs according to this argument (True = not run, False = run).
**kwargs – Keyword arguments applied during gym.make
- Returns:
The vectorized environment.
Example
>>> import gymnasium as gym
>>> env = gym.vector.make('CartPole-v1', num_envs=3)
>>> env.reset(seed=42)
(array([[ 0.0273956 , -0.00611216,  0.03585979,  0.0197368 ],
       [ 0.01522993, -0.04562247, -0.04799704,  0.03392126],
       [-0.03774345, -0.02418869, -0.00942293,  0.0469184 ]],
      dtype=float32), {})
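The wrappers parameter accepts either a single callable or a list of callables. One way to picture how such a factory could prepare the per-copy constructor functions is sketched below. build_env_fns and make_single are hypothetical names, strings stand in for environments, and the order of application (first wrapper innermost) is an assumption.

```python
def build_env_fns(make_single, num_envs, wrappers=None):
    """Return num_envs zero-argument functions that each build one wrapped copy."""
    if wrappers is None:
        wrapper_list = []
    elif callable(wrappers):
        wrapper_list = [wrappers]  # a single wrapper is treated as a one-element list
    else:
        wrapper_list = list(wrappers)

    def make_env():
        env = make_single()
        for wrapper in wrapper_list:
            env = wrapper(env)  # assumed order: first wrapper is applied first
        return env

    return [make_env for _ in range(num_envs)]


# Strings stand in for environments so the wrapping order is visible.
env_fns = build_env_fns(
    lambda: "env",
    num_envs=3,
    wrappers=[lambda env: f"A({env})", lambda env: f"B({env})"],
)
created = [env_fn() for env_fn in env_fns]
print(created)
```

A list of functions of this form matches the env_fns argument that AsyncVectorEnv and SyncVectorEnv take.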
Async Vector Env
- class gymnasium.vector.AsyncVectorEnv(env_fns: Sequence[Callable[[], Env]], observation_space: Space | None = None, action_space: Space | None = None, shared_memory: bool = True, copy: bool = True, context: str | None = None, daemon: bool = True, worker: Callable | None = None)
Vectorized environment that runs multiple environments in parallel.
It uses multiprocessing processes, and pipes for communication.
Example
>>> import gymnasium as gym
>>> env = gym.vector.AsyncVectorEnv([
...     lambda: gym.make("Pendulum-v1", g=9.81),
...     lambda: gym.make("Pendulum-v1", g=1.62)
... ])
>>> env.reset(seed=42)
(array([[-0.14995256,  0.9886932 , -0.12224312],
       [ 0.5760367 ,  0.8174238 , -0.91244936]], dtype=float32), {})
- Parameters:
env_fns – Functions that create the environments.
observation_space – Observation space of a single environment. If None, then the observation space of the first environment is taken.
action_space – Action space of a single environment. If None, then the action space of the first environment is taken.
shared_memory – If True, then the observations from the worker processes are communicated back through shared variables. This can improve the efficiency if the observations are large (e.g. images).
copy – If True, then the reset() and step() methods return a copy of the observations.
context – Context for multiprocessing. If None, then the default context is used.
daemon – If True, then subprocesses have the daemon flag turned on; that is, they will quit if the head process quits. However, daemon=True prevents subprocesses from spawning children, so for some environments you may want to set it to False.
worker – If set, then use that worker in a subprocess instead of the default one. Can be useful to override some inner vector env logic, for instance, how resets on termination or truncation are handled.
- Warnings: worker is an advanced mode option. It provides a high degree of flexibility and a high chance to shoot yourself in the foot; thus, if you are writing your own worker, it is recommended to start from the code for the _worker (or _worker_shared_memory) method, and add changes.
- Raises:
RuntimeError – If the observation space of some sub-environment does not match observation_space (or, by default, the observation space of the first sub-environment).
ValueError – If observation_space is a custom space (i.e. not a default space in Gymnasium, such as gymnasium.spaces.Box, gymnasium.spaces.Discrete, or gymnasium.spaces.Dict) and shared_memory is True.
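The process-and-pipe design described for AsyncVectorEnv can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the library's _worker: CounterEnv, worker and run_demo are hypothetical names, the command protocol is invented for the example, and the "fork" start method is assumed to be available (POSIX systems only).

```python
import multiprocessing as mp


class CounterEnv:
    """Hypothetical toy environment that accumulates increment * action."""

    def __init__(self, increment):
        self.increment = increment
        self.total = 0

    def reset(self):
        self.total = 0
        return self.total

    def step(self, action):
        self.total += self.increment * action
        return self.total


def worker(pipe, env_fn):
    """Run one environment in a subprocess and answer commands from the pipe."""
    env = env_fn()
    while True:
        command, data = pipe.recv()
        if command == "reset":
            pipe.send(env.reset())
        elif command == "step":
            pipe.send(env.step(data))
        elif command == "close":
            pipe.send(None)
            break


def run_demo():
    ctx = mp.get_context("fork")  # assumption: POSIX platform with fork available
    env_fns = [lambda: CounterEnv(1), lambda: CounterEnv(10)]
    parent_pipes, processes = [], []
    for env_fn in env_fns:
        parent_pipe, child_pipe = ctx.Pipe()
        process = ctx.Process(target=worker, args=(child_pipe, env_fn), daemon=True)
        process.start()
        child_pipe.close()  # the parent keeps only its own end
        parent_pipes.append(parent_pipe)
        processes.append(process)

    # Send every command first, then collect, so the workers run concurrently.
    for pipe in parent_pipes:
        pipe.send(("reset", None))
    initial = [pipe.recv() for pipe in parent_pipes]
    for pipe, action in zip(parent_pipes, [2, 3]):
        pipe.send(("step", action))
    results = [pipe.recv() for pipe in parent_pipes]

    for pipe in parent_pipes:
        pipe.send(("close", None))
    for pipe in parent_pipes:
        pipe.recv()
    for process in processes:
        process.join()
    return initial, results


if __name__ == "__main__":
    print(run_demo())
```

Sending all commands before receiving any reply is what lets the sub-environments step in parallel. A custom worker would replace the loop in worker while keeping the same pipe protocol.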
Sync Vector Env
- class gymnasium.vector.SyncVectorEnv(env_fns: Iterable[Callable[[], Env]], observation_space: Space | None = None, action_space: Space | None = None, copy: bool = True)
Vectorized environment that serially runs multiple environments.
Example
>>> import gymnasium as gym
>>> env = gym.vector.SyncVectorEnv([
...     lambda: gym.make("Pendulum-v1", g=9.81),
...     lambda: gym.make("Pendulum-v1", g=1.62)
... ])
>>> env.reset(seed=42)
(array([[-0.14995256,  0.9886932 , -0.12224312],
       [ 0.5760367 ,  0.8174238 , -0.91244936]], dtype=float32), {})
- Parameters:
env_fns – Iterable of callable functions that create the environments.
observation_space – Observation space of a single environment. If None, then the observation space of the first environment is taken.
action_space – Action space of a single environment. If None, then the action space of the first environment is taken.
copy – If True, then the reset() and step() methods return a copy of the observations.
- Raises:
RuntimeError – If the observation space of some sub-environment does not match observation_space (or, by default, the observation space of the first sub-environment).
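The consistency check behind this RuntimeError can be sketched as follows. ToyEnv and check_observation_spaces are hypothetical, and tuples stand in for Gymnasium space objects, so this only illustrates the rule that every sub-environment must match the reference space.

```python
class ToyEnv:
    """Hypothetical stand-in that only carries an observation space description."""

    def __init__(self, observation_space):
        self.observation_space = observation_space


def check_observation_spaces(envs, observation_space=None):
    """Compare every sub-environment against the given or first observation space."""
    reference = envs[0].observation_space if observation_space is None else observation_space
    for index, env in enumerate(envs):
        if env.observation_space != reference:
            raise RuntimeError(
                f"sub-environment {index} has observation space "
                f"{env.observation_space}, expected {reference}"
            )
    return reference


matching = [ToyEnv(("Box", (3,))), ToyEnv(("Box", (3,)))]
print(check_observation_spaces(matching))

mismatched = [ToyEnv(("Box", (3,))), ToyEnv(("Box", (4,)))]
try:
    check_observation_spaces(mismatched)
except RuntimeError as error:
    print(error)
```

Failing at construction time keeps a mismatched sub-environment from silently producing observations that do not fit the batched space.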