Vector#
Gymnasium.vector.VectorEnv#
- class gymnasium.vector.VectorEnv(num_envs: int, observation_space: Space, action_space: Space)[source]#
Base class for vectorized environments to run multiple independent copies of the same environment in parallel.
Vector environments can provide a linear speed-up in the steps taken per second by sampling multiple sub-environments at the same time. To prevent terminated environments from waiting until all sub-environments have terminated or truncated, the vector environments autoreset sub-environments after they terminate or truncate. As a result, the final step’s observation and info are overwritten by the reset’s observation and info. Therefore, the observation and info for the final step of a sub-environment are stored in the info parameter, using “final_observation” and “final_info” respectively. See step() for more information.
The vector environments batch observations, rewards, terminations, truncations and info for each parallel environment. In addition, step() expects to receive a batch of actions for each parallel environment.
Gymnasium contains two types of Vector environments: AsyncVectorEnv and SyncVectorEnv.
The Vector Environments have the following additional attributes to help users understand the implementation:
num_envs - The number of sub-environments in the vector environment
observation_space - The batched observation space of the vector environment
single_observation_space - The observation space of a single sub-environment
action_space - The batched action space of the vector environment
single_action_space - The action space of a single sub-environment
Note
Before OpenAI Gym v25, the info parameter of reset() and step() was implemented as a list of dictionaries, one for each sub-environment. This was changed in OpenAI Gym v25+ and in Gymnasium to a single dictionary with a NumPy array for each key. To use the old info style, use the VectorListInfo wrapper.
Note
To render the sub-environments, use call() with “render” arguments. Remember to set the render_mode for all the sub-environments during initialization.
Note
All parallel environments should share identical observation and action spaces. In other words, a vector of multiple different environments is not supported.
- Parameters:
num_envs – Number of environments in the vectorized environment.
observation_space – Observation space of a single environment.
action_space – Action space of a single environment.
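The two info styles mentioned in the first note above can be pictured with a small plain-Python sketch. It does not import Gymnasium: dict_info_to_list is a hypothetical helper, plain lists stand in for NumPy arrays, and the boolean "_key" mask marking which sub-environments actually reported a key is an assumption about the dictionary layout, not something stated on this page.

```python
from typing import Any, Dict, List


def dict_info_to_list(info: Dict[str, List[Any]], num_envs: int) -> List[Dict[str, Any]]:
    """Convert a dict-of-arrays info into one dict per sub-environment.

    Hypothetical helper: keys starting with "_" are assumed to be boolean
    masks telling which sub-environments set the matching key.
    """
    per_env: List[Dict[str, Any]] = [{} for _ in range(num_envs)]
    for key, values in info.items():
        if key.startswith("_"):
            continue  # masks are only read below, never copied
        mask = info.get(f"_{key}", [True] * num_envs)
        for i in range(num_envs):
            if mask[i]:
                per_env[i][key] = values[i]
    return per_env


# New style: one entry per key, holding a value for every sub-environment.
batched_info = {
    "episode_length": [12, 0, 7],
    "_episode_length": [True, False, True],
}
# Old style: one dictionary per sub-environment.
list_info = dict_info_to_list(batched_info, num_envs=3)
print(list_info)
```

In real code the VectorListInfo wrapper mentioned above is the supported way to get the list style; the helper here only illustrates the shape of the data.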
Methods#
- VectorEnv.reset(*, seed: int | List[int] | None = None, options: dict | None = None)[source]#
Reset all parallel environments and return a batch of initial observations and info.
- Parameters:
seed – The reset seed(s): a single integer, a list with one seed per sub-environment, or None.
options – Optional dictionary of reset options.
- Returns:
A batch of observations and info from the vectorized environment.
Example
>>> import gymnasium as gym
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> envs.reset(seed=42)
(array([[ 0.0273956 , -0.00611216,  0.03585979,  0.0197368 ],
       [ 0.01522993, -0.04562247, -0.04799704,  0.03392126],
       [-0.03774345, -0.02418869, -0.00942293,  0.0469184 ]],
      dtype=float32), {})
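Because seed accepts either one integer or a list, a vector environment has to turn a single integer into one seed per sub-environment. The sketch below shows one plausible rule, consecutive seeds starting at the given value. The helper expand_seed is hypothetical and the consecutive-seed rule is an assumption; check the implementation of the version you use before relying on it.

```python
from typing import List, Optional, Union


def expand_seed(seed: Union[int, List[int], None], num_envs: int) -> List[Optional[int]]:
    """Return one seed per sub-environment (hypothetical helper)."""
    if seed is None:
        # No seeding requested: every sub-environment picks its own entropy.
        return [None] * num_envs
    if isinstance(seed, int):
        # Assumed rule: spread a single seed into consecutive integers so
        # the copies do not all start from the same state.
        return [seed + i for i in range(num_envs)]
    if len(seed) != num_envs:
        raise ValueError(f"expected {num_envs} seeds, got {len(seed)}")
    return list(seed)


print(expand_seed(42, num_envs=3))
print(expand_seed([7, 8, 9], num_envs=3))
```

Giving each copy a distinct seed is what makes the three rows of the observation batch above differ from one another.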
- VectorEnv.step(actions) → Tuple[Any, ndarray[Any, dtype[Any]], ndarray[Any, dtype[Any]], ndarray[Any, dtype[Any]], dict][source]#
Take an action for each parallel environment.
- Parameters:
actions – Batch of actions; an element of action_space.
- Returns:
Batch of (observations, rewards, terminations, truncations, infos)
Note
Because the vector environments autoreset terminated and truncated sub-environments, the observation and info returned for such a sub-environment come from the reset, not from its final step. The final step’s observation and info are instead stored in info as “final_observation” and “final_info”.
Example
>>> import gymnasium as gym
>>> import numpy as np
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> _ = envs.reset(seed=42)
>>> actions = np.array([1, 0, 1])
>>> observations, rewards, termination, truncation, infos = envs.step(actions)
>>> observations
array([[ 0.02727336,  0.18847767,  0.03625453, -0.26141977],
       [ 0.01431748, -0.24002443, -0.04731862,  0.3110827 ],
       [-0.03822722,  0.1710671 , -0.00848456, -0.2487226 ]],
      dtype=float32)
>>> rewards
array([1., 1., 1.])
>>> termination
array([False, False, False])
>>> truncation
array([False, False, False])
>>> infos
{}
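The autoreset behaviour described in the note can be sketched without Gymnasium. CountdownEnv and vector_step below are hypothetical toy stand-ins, with lists in place of NumPy arrays; they only illustrate how the final step's data is kept under “final_observation” and “final_info” while the returned observation already comes from the reset.

```python
from typing import Any, Dict, List, Tuple


class CountdownEnv:
    """Toy environment (hypothetical): terminates after `length` steps."""

    def __init__(self, length: int) -> None:
        self.length = length
        self.t = 0

    def reset(self) -> Tuple[int, Dict[str, Any]]:
        self.t = 0
        return self.t, {"t": self.t}

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        self.t += 1
        terminated = self.t >= self.length
        return self.t, 1.0, terminated, False, {"t": self.t}


def vector_step(envs: List[CountdownEnv], actions: List[int]):
    """Step every sub-environment, autoresetting the ones that finished."""
    n = len(envs)
    obs, rewards, terms, truncs = [], [], [], []
    infos: Dict[str, List[Any]] = {
        "final_observation": [None] * n,
        "final_info": [None] * n,
    }
    for i, (env, action) in enumerate(zip(envs, actions)):
        o, r, terminated, truncated, info = env.step(action)
        if terminated or truncated:
            # Keep the last step's data before the reset overwrites it.
            infos["final_observation"][i] = o
            infos["final_info"][i] = info
            o, info = env.reset()
        obs.append(o)
        rewards.append(r)
        terms.append(terminated)
        truncs.append(truncated)
    return obs, rewards, terms, truncs, infos


envs = [CountdownEnv(length=1), CountdownEnv(length=3)]
for env in envs:
    env.reset()
obs, rewards, terms, truncs, infos = vector_step(envs, [0, 0])
print(obs, terms, infos["final_observation"])
```

Here the first sub-environment terminates on its first step, so its entry in obs is the reset observation while its last real observation is only reachable through infos. Code that bootstraps value estimates from the last observation of an episode should read it from there.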
- VectorEnv.close(**kwargs)[source]#
Close all parallel environments and release resources.
It also closes all the existing image viewers, then calls close_extras() and sets closed to True.
Warning
This function itself does not close the environments; that should be handled in close_extras(). This is generic for both synchronous and asynchronous vectorized environments.
Note
This will be called automatically when the object is garbage collected or the program exits.
- Parameters:
**kwargs – Keyword arguments passed to close_extras()
Attributes#
- action_space#
The (batched) action space. The input actions of step must be valid elements of action_space:
>>> import gymnasium
>>> envs = gymnasium.vector.make("CartPole-v1", num_envs=3)
>>> envs.action_space
MultiDiscrete([2 2 2])
- observation_space#
The (batched) observation space. The observations returned by reset and step are valid elements of observation_space:
>>> import gymnasium
>>> envs = gymnasium.vector.make("CartPole-v1", num_envs=3)
>>> envs.observation_space
Box([[-4.8 ...]], [[4.8 ...]], (3, 4), float32)
- single_action_space#
The action space of an environment copy:
>>> import gymnasium
>>> envs = gymnasium.vector.make("CartPole-v1", num_envs=3)
>>> envs.single_action_space
Discrete(2)
- single_observation_space#
The observation space of an environment copy:
>>> import gymnasium
>>> envs = gymnasium.vector.make("CartPole-v1", num_envs=3)
>>> envs.single_observation_space
Box([-4.8 ...], [4.8 ...], (4,), float32)
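The relationship between the single and batched spaces in the outputs above (Discrete(2) becoming MultiDiscrete([2 2 2]), and shape (4,) becoming (3, 4)) can be written down as a plain-Python sketch. The three helper names are hypothetical and the code does not use Gymnasium's space classes; it only mirrors the arithmetic.

```python
from typing import List, Sequence, Tuple


def batched_shape(single_shape: Sequence[int], num_envs: int) -> Tuple[int, ...]:
    """Shape of a batched Box: a leading num_envs axis is added."""
    return (num_envs, *single_shape)


def batched_nvec(n: int, num_envs: int) -> List[int]:
    """Discrete(n) repeated num_envs times, as in MultiDiscrete([n, ..., n])."""
    return [n] * num_envs


def is_valid_action_batch(actions: Sequence[int], nvec: Sequence[int]) -> bool:
    """True if there is exactly one in-range action per sub-environment."""
    return len(actions) == len(nvec) and all(0 <= a < n for a, n in zip(actions, nvec))


print(batched_shape((4,), num_envs=3))
print(batched_nvec(2, num_envs=3))
print(is_valid_action_batch([1, 0, 1], batched_nvec(2, 3)))
print(is_valid_action_batch([1, 2, 1], batched_nvec(2, 3)))
```

The last call fails the check because action 2 is outside Discrete(2), which is the kind of input step is documented to reject as not being an element of action_space.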
Making Vector Environments#
- gymnasium.vector.make(id: str, num_envs: int = 1, asynchronous: bool = True, wrappers: Callable[[Env], Env] | List[Callable[[Env], Env]] | None = None, disable_env_checker: bool | None = None, **kwargs) → VectorEnv[source]#
Create a vectorized environment from multiple copies of an environment, from its id.
- Parameters:
id – The environment ID. This must be a valid ID from the registry.
num_envs – Number of copies of the environment.
asynchronous – If True, wraps the environments in an AsyncVectorEnv (which uses multiprocessing to run the environments in parallel). If False, wraps the environments in a SyncVectorEnv.
wrappers – If not None, then apply the wrappers to each internal environment during creation.
disable_env_checker – Whether to disable the environment checker, which otherwise runs for the first environment only. None defaults to the environment spec's disable_env_checker parameter (False by default); otherwise the checker runs according to this argument (True = not run, False = run).
**kwargs – Keyword arguments passed to gym.make
- Returns:
The vectorized environment.
Example
>>> import gymnasium as gym
>>> env = gym.vector.make('CartPole-v1', num_envs=3)
>>> env.reset(seed=42)
(array([[ 0.0273956 , -0.00611216,  0.03585979,  0.0197368 ],
       [ 0.01522993, -0.04562247, -0.04799704,  0.03392126],
       [-0.03774345, -0.02418869, -0.00942293,  0.0469184 ]],
      dtype=float32), {})
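The parameters above suggest that make builds one environment factory per copy, applies the wrappers inside each factory, and runs the environment checker for the first copy only. The following is a rough plain-Python sketch of that idea, not the library's code: ToyEnv, toy_make, build_env_fns and tag are all hypothetical names, and None for disable_env_checker is simplified to False instead of consulting an environment spec.

```python
from typing import Callable, List, Optional, Union


class ToyEnv:
    """Hypothetical stand-in for an environment returned by gym.make."""

    def __init__(self, env_id: str, checked: bool) -> None:
        self.env_id = env_id
        self.checked = checked  # whether the env checker ran for this copy
        self.tags: List[str] = []


def toy_make(env_id: str, disable_env_checker: bool = False) -> ToyEnv:
    """Hypothetical replacement for gym.make."""
    return ToyEnv(env_id, checked=not disable_env_checker)


Wrapper = Callable[[ToyEnv], ToyEnv]


def build_env_fns(
    env_id: str,
    num_envs: int = 1,
    wrappers: Union[Wrapper, List[Wrapper], None] = None,
    disable_env_checker: Optional[bool] = None,
) -> List[Callable[[], ToyEnv]]:
    """Return one zero-argument factory per copy of the environment."""
    if wrappers is None:
        wrapper_list: List[Wrapper] = []
    elif callable(wrappers):
        wrapper_list = [wrappers]
    else:
        wrapper_list = list(wrappers)

    def create(index: int) -> Callable[[], ToyEnv]:
        def thunk() -> ToyEnv:
            # Only the first copy may run the checker. None is treated as
            # False here, a simplification of "use the spec's default".
            skip = True if index > 0 else bool(disable_env_checker)
            env = toy_make(env_id, disable_env_checker=skip)
            for wrapper in wrapper_list:
                env = wrapper(env)
            return env

        return thunk

    return [create(i) for i in range(num_envs)]


def tag(env: ToyEnv) -> ToyEnv:
    """Hypothetical wrapper that just marks the environment."""
    env.tags.append("wrapped")
    return env


env_fns = build_env_fns("CartPole-v1", num_envs=3, wrappers=tag)
envs = [fn() for fn in env_fns]
print([env.checked for env in envs])
```

Returning factories instead of ready-made environments matters for the asynchronous case, where each environment is meant to be constructed inside its own worker process. The inner create function gives every factory its own index and avoids the late-binding pitfall of defining closures directly in a loop.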
Async Vector Env#
- class gymnasium.vector.AsyncVectorEnv(env_fns: Sequence[Callable[[], Env]], observation_space: Space | None = None, action_space: Space | None = None, shared_memory: bool = True, copy: bool = True, context: str | None = None, daemon: bool = True, worker: Callable | None = None)[source]#
Vectorized environment that runs multiple environments in parallel.
It uses multiprocessing processes, and pipes for communication.
Example
>>> import gymnasium as gym
>>> env = gym.vector.AsyncVectorEnv([
...     lambda: gym.make("Pendulum-v1", g=9.81),
...     lambda: gym.make("Pendulum-v1", g=1.62)
... ])
>>> env.reset(seed=42)
(array([[-0.14995256,  0.9886932 , -0.12224312],
       [ 0.5760367 ,  0.8174238 , -0.91244936]], dtype=float32), {})
- Parameters:
env_fns – Functions that create the environments.
observation_space – Observation space of a single environment. If None, then the observation space of the first environment is taken.
action_space – Action space of a single environment. If None, then the action space of the first environment is taken.
shared_memory – If True, then the observations from the worker processes are communicated back through shared variables. This can improve the efficiency if the observations are large (e.g. images).
copy – If True, then the reset() and step() methods return a copy of the observations.
context – Context for multiprocessing. If None, then the default context is used.
daemon – If True, then subprocesses have the daemon flag turned on; that is, they will quit if the head process quits. However, daemon=True prevents subprocesses from spawning children, so for some environments you may want to have it set to False.
worker – If set, then use that worker in a subprocess instead of a default one. Can be useful to override some inner vector env logic, for instance, how resets on termination or truncation are handled.
Warning
worker is an advanced mode option. It provides a high degree of flexibility and a high chance to shoot yourself in the foot; thus, if you are writing your own worker, it is recommended to start from the code for the _worker (or _worker_shared_memory) method, and add changes.
- Raises:
RuntimeError – If the observation space of some sub-environment does not match observation_space (or, by default, the observation space of the first sub-environment).
ValueError – If observation_space is a custom space (i.e. not a default space in Gymnasium, such as gymnasium.spaces.Box, gymnasium.spaces.Discrete, or gymnasium.spaces.Dict) and shared_memory is True.
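The "processes and pipes" design described above can be sketched with the standard library alone. This is a toy illustration, not the library's _worker: toy_worker, run_demo and the ("command", data) message format are hypothetical, a running integer stands in for an environment, and the "fork" start method is assumed (POSIX only) so the example does not depend on how the surrounding script is imported.

```python
import multiprocessing as mp


def toy_worker(conn, start):
    """Hypothetical worker: owns a counter 'environment' and serves commands."""
    state = start
    while True:
        command, data = conn.recv()
        if command == "reset":
            state = start
            conn.send(state)
        elif command == "step":
            state += data
            conn.send(state)
        elif command == "close":
            conn.close()
            break


def run_demo():
    """Start two workers, reset and step them once, then shut them down."""
    ctx = mp.get_context("fork")  # assumption: a POSIX system
    parents, procs = [], []
    for start in (0, 100):
        parent_conn, child_conn = ctx.Pipe()
        proc = ctx.Process(target=toy_worker, args=(child_conn, start), daemon=True)
        proc.start()
        child_conn.close()  # the parent keeps only its own end of the pipe
        parents.append(parent_conn)
        procs.append(proc)

    # Send to every worker first, then collect, so the workers overlap.
    for conn in parents:
        conn.send(("reset", None))
    initial = [conn.recv() for conn in parents]

    for conn, action in zip(parents, (1, 5)):
        conn.send(("step", action))
    stepped = [conn.recv() for conn in parents]

    for conn in parents:
        conn.send(("close", None))
    for proc in procs:
        proc.join()
    for conn in parents:
        conn.close()
    return initial, stepped


if __name__ == "__main__":
    print(run_demo())
```

Sending all commands before reading any result is what lets the sub-environments run concurrently. The daemon=True flag mirrors the daemon parameter above: the workers exit with the parent, at the cost of not being able to start child processes of their own.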
Sync Vector Env#
- class gymnasium.vector.SyncVectorEnv(env_fns: Iterable[Callable[[], Env]], observation_space: Space | None = None, action_space: Space | None = None, copy: bool = True)[source]#
Vectorized environment that serially runs multiple environments.
Example
>>> import gymnasium as gym
>>> env = gym.vector.SyncVectorEnv([
...     lambda: gym.make("Pendulum-v1", g=9.81),
...     lambda: gym.make("Pendulum-v1", g=1.62)
... ])
>>> env.reset(seed=42)
(array([[-0.14995256,  0.9886932 , -0.12224312],
       [ 0.5760367 ,  0.8174238 , -0.91244936]], dtype=float32), {})
- Parameters:
env_fns – Iterable of callable functions that create the environments.
observation_space – Observation space of a single environment. If None, then the observation space of the first environment is taken.
action_space – Action space of a single environment. If None, then the action space of the first environment is taken.
copy – If True, then the reset() and step() methods return a copy of the observations.
- Raises:
RuntimeError – If the observation space of some sub-environment does not match observation_space (or, by default, the observation space of the first sub-environment).
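Why the copy parameter matters can be shown with a toy class. The sketch assumes the vector environment writes observations into one internal buffer that is reused on every step, which is an assumption about the implementation and not something this page states. BufferedObservations and copy_obs are hypothetical names, and a list stands in for a NumPy array.

```python
import copy
from typing import List


class BufferedObservations:
    """Toy stand-in for a vector env that reuses one observation buffer."""

    def __init__(self, num_envs: int, copy_obs: bool = True) -> None:
        self._buffer: List[float] = [0.0] * num_envs
        self.copy_obs = copy_obs

    def step(self) -> List[float]:
        for i in range(len(self._buffer)):
            self._buffer[i] += 1.0  # overwritten in place on every step
        # With copy_obs=False the caller receives the internal buffer itself.
        return copy.deepcopy(self._buffer) if self.copy_obs else self._buffer


safe = BufferedObservations(2, copy_obs=True)
first_safe = safe.step()
safe.step()

shared = BufferedObservations(2, copy_obs=False)
first_shared = shared.step()
shared.step()

print(first_safe, first_shared)
```

With copying enabled the first observation keeps its value after the second step. Without it, the stored reference silently changes, which can corrupt a replay buffer that keeps observations from earlier steps. Turning copying off saves an allocation per step and is only safe when each observation is consumed before the next call.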