Vector#
Gymnasium.vector.VectorEnv#
- class gymnasium.vector.VectorEnv(num_envs: int, observation_space: Space, action_space: Space)#
Base class for vectorized environments to run multiple independent copies of the same environment in parallel.
Vector environments can provide a linear speed-up in the steps taken per second by sampling multiple sub-environments at the same time. To prevent terminated environments from waiting until all sub-environments have terminated or truncated, the vector environments autoreset sub-environments after they terminate or truncate. As a result, the final step’s observation and info are overwritten by the reset’s observation and info. Therefore, the observation and info for the final step of a sub-environment are stored in the info parameter, under “final_observation” and “final_info” respectively. See step() for more information.
The vector environments batch observations, rewards, terminations, truncations and info for each parallel environment. In addition, step() expects to receive a batch of actions for each parallel environment.
Gymnasium contains two types of Vector environments: AsyncVectorEnv and SyncVectorEnv.
The Vector Environments have the following additional attributes for users to understand the implementation:
num_envs - The number of sub-environments in the vector environment
observation_space - The batched observation space of the vector environment
single_observation_space - The observation space of a single sub-environment
action_space - The batched action space of the vector environment
single_action_space - The action space of a single sub-environment
Note
The info parameter of reset() and step() was originally implemented, before OpenAI Gym v25, as a list of dictionaries, one for each sub-environment. This was changed in OpenAI Gym v25+ and in Gymnasium to a dictionary with a NumPy array for each key. To use the old info style, use the VectorListInfo wrapper.
Note
To render the sub-environments, use call() with “render” as the argument. Remember to set the render_mode for all the sub-environments during initialization.
Note
All parallel environments should share identical observation and action spaces. In other words, a vector of multiple different environments is not supported.
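To make the two info layouts from the first note concrete, here is a minimal pure-Python sketch that converts the dictionary-per-key style into the older list-of-dictionaries style. It assumes the convention that each key k is paired with a boolean mask _k marking which sub-environments reported it. Plain lists stand in for NumPy arrays, and dict_info_to_list is a hypothetical helper, not part of the library.

```python
def dict_info_to_list(infos, num_envs):
    """Convert {key: batch, "_key": mask} into one dict per sub-environment."""
    per_env = [{} for _ in range(num_envs)]
    for key, values in infos.items():
        if key.startswith("_"):
            continue  # mask entries are consumed alongside their data key
        # Assumed convention: "_<key>" flags which sub-environments set <key>.
        mask = infos.get("_" + key, [True] * num_envs)
        for i in range(num_envs):
            if mask[i]:
                per_env[i][key] = values[i]
    return per_env


# Sub-environment 1 reported nothing, so its mask entry is False.
batched = {"lives": [3, 0, 2], "_lives": [True, False, True]}
listed = dict_info_to_list(batched, num_envs=3)
```

In real code the VectorListInfo wrapper mentioned above is the supported route; the sketch only shows the shape of the transformation.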
Base class for vectorized environments.
- Parameters:
num_envs – Number of environments in the vectorized environment.
observation_space – Observation space of a single environment.
action_space – Action space of a single environment.
Methods#
- VectorEnv.reset(*, seed: Optional[Union[int, List[int]]] = None, options: Optional[dict] = None)#
Reset all parallel environments and return a batch of initial observations and info.
- Parameters:
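seed – The environment reset seeds: a single integer or a list with one integer per sub-environment.
options – Optional dictionary of reset options passed to the sub-environments.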
- Returns:
A batch of observations and info from the vectorized environment.
An example:
>>> import gymnasium as gym
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> envs.reset()
(array([[-0.02240574, -0.03439831, -0.03904812,  0.02810693],
       [ 0.01586068,  0.01929009,  0.02394426,  0.04016077],
       [-0.01314174,  0.03893502, -0.02400815,  0.0038326 ]],
      dtype=float32), {})
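The seed argument accepts either a single integer or one integer per sub-environment. The sketch below shows one way a single seed can be expanded into per-environment seeds. The increment-by-index rule mirrors what the synchronous and asynchronous implementations are understood to do, but it is an assumption here rather than a documented guarantee, and expand_seed is a hypothetical helper.

```python
from typing import List, Optional, Union


def expand_seed(
    seed: Optional[Union[int, List[int]]], num_envs: int
) -> List[Optional[int]]:
    """Return one seed (or None) per sub-environment."""
    if seed is None:
        return [None] * num_envs
    if isinstance(seed, int):
        # Assumed rule: sub-environment i receives seed + i.
        return [seed + i for i in range(num_envs)]
    if len(seed) != num_envs:
        raise ValueError(f"expected {num_envs} seeds, got {len(seed)}")
    return list(seed)


seeds = expand_seed(42, num_envs=3)
```

Giving each copy a distinct seed keeps the sub-environments from producing identical trajectories while still making the whole batch reproducible.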
- VectorEnv.step(actions)#
Take an action for each parallel environment.
- Parameters:
actions – Batch of actions; an element of action_space.
- Returns:
Batch of (observations, rewards, terminations, truncations, infos)
Note
Because the vector environments autoreset terminating and truncating sub-environments, the returned observation and info are not the final step’s observation or info; those are instead stored in info under “final_observation” and “final_info”.
An example:
>>> import gymnasium as gym
>>> import numpy as np
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> observations, infos = envs.reset()
>>> actions = np.array([1, 0, 1])
>>> observations, rewards, termination, truncation, infos = envs.step(actions)
>>> observations
array([[ 0.00122802,  0.16228443,  0.02521779, -0.23700266],
       [ 0.00788269, -0.17490888,  0.03393489,  0.31735462],
       [ 0.04918966,  0.19421194,  0.02938497, -0.29495203]],
      dtype=float32)
>>> rewards
array([1., 1., 1.])
>>> termination
array([False, False, False])
>>> truncation
array([False, False, False])
>>> infos
{}
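The autoreset behaviour in the note above can be pictured with a toy, pure-Python vector environment. Everything here (CountdownEnv, ToyVectorEnv) is hypothetical and far simpler than the real classes: lists replace NumPy batches, and entries of “final_observation” are None for sub-environments whose episode did not end on this step.

```python
class CountdownEnv:
    """Toy environment: the observation counts down and the episode ends at 0."""

    def __init__(self, start):
        self.start = start
        self.state = start

    def reset(self):
        self.state = self.start
        return self.state, {}

    def step(self, action):
        self.state -= 1
        terminated = self.state == 0
        return self.state, 1.0, terminated, False, {}


class ToyVectorEnv:
    """Steps every sub-environment and autoresets the ones that finished."""

    def __init__(self, envs):
        self.envs = envs
        self.num_envs = len(envs)

    def reset(self):
        return [env.reset()[0] for env in self.envs], {}

    def step(self, actions):
        obs, rewards, terms, truncs = [], [], [], []
        final_obs = [None] * self.num_envs
        for i, (env, action) in enumerate(zip(self.envs, actions)):
            ob, reward, terminated, truncated, _ = env.step(action)
            if terminated or truncated:
                final_obs[i] = ob    # keep the true last observation
                ob, _ = env.reset()  # the returned observation starts a new episode
            obs.append(ob)
            rewards.append(reward)
            terms.append(terminated)
            truncs.append(truncated)
        infos = {}
        if any(o is not None for o in final_obs):
            infos["final_observation"] = final_obs
        return obs, rewards, terms, truncs, infos


envs = ToyVectorEnv([CountdownEnv(1), CountdownEnv(3)])
envs.reset()
obs, rewards, terms, truncs, infos = envs.step([0, 0])
```

Here the first sub-environment terminates on its first step, so its slot in obs already holds the reset observation while its true last observation is only available through infos. A training loop that needs the last observation of an episode would therefore read it from infos rather than from the returned batch.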
- VectorEnv.close(**kwargs)#
Close all parallel environments and release resources.
It also closes all the existing image viewers, then calls close_extras() and sets closed to True.
Warning
This function itself does not close the environments; that should be handled in close_extras(). This is generic for both synchronous and asynchronous vectorized environments.
Note
This will be called automatically when the object is garbage collected or the program exits.
- Parameters:
**kwargs – Keyword arguments passed to close_extras()
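The split between close() and close_extras() is a template-method pattern: the base class owns the bookkeeping and the subclass owns the cleanup. A minimal sketch, assuming nothing beyond what is described above; ClosableVectorEnv and ToySyncVectorEnv are hypothetical names, and the early return on a second call is a choice made for this sketch.

```python
class ClosableVectorEnv:
    """Hypothetical base class mirroring the close()/close_extras() split."""

    def __init__(self):
        self.closed = False

    def close_extras(self, **kwargs):
        """Subclasses release their sub-environments here."""

    def close(self, **kwargs):
        if self.closed:
            return  # closing twice is harmless in this sketch
        self.close_extras(**kwargs)
        self.closed = True


class ToySyncVectorEnv(ClosableVectorEnv):
    def __init__(self, resources):
        super().__init__()
        self.resources = resources

    def close_extras(self, **kwargs):
        # The actual cleanup lives in the subclass hook.
        self.resources.clear()


envs = ToySyncVectorEnv(["env-0", "env-1"])
envs.close()
envs.close()  # second call does nothing
```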
Attributes#
- action_space#
The (batched) action space. The input actions of step must be valid elements of action_space:
>>> import gymnasium
>>> envs = gymnasium.vector.make("CartPole-v1", num_envs=3)
>>> envs.action_space
MultiDiscrete([2 2 2])
- observation_space#
The (batched) observation space. The observations returned by reset and step are valid elements of observation_space:
>>> import gymnasium
>>> envs = gymnasium.vector.make("CartPole-v1", num_envs=3)
>>> envs.observation_space
Box([[-4.8 ...]], [[4.8 ...]], (3, 4), float32)
- single_action_space#
The action space of an environment copy:
>>> import gymnasium
>>> envs = gymnasium.vector.make("CartPole-v1", num_envs=3)
>>> envs.single_action_space
Discrete(2)
- single_observation_space#
The observation space of an environment copy:
>>> import gymnasium
>>> envs = gymnasium.vector.make("CartPole-v1", num_envs=3)
>>> envs.single_observation_space
Box([-4.8 ...], [4.8 ...], (4,), float32)
Making Vector Environments#
- gymnasium.vector.make(id: str, num_envs: int = 1, asynchronous: bool = True, wrappers: Optional[Union[callable, List[callable]]] = None, disable_env_checker: Optional[bool] = None, **kwargs) VectorEnv #
Create a vectorized environment from multiple copies of an environment, from its id.
Example:
>>> import gymnasium as gym
>>> env = gym.vector.make('CartPole-v1', num_envs=3)
>>> observations, infos = env.reset()
>>> observations
array([[-0.04456399,  0.04653909,  0.01326909, -0.02099827],
       [ 0.03073904,  0.00145001, -0.03088818, -0.03131252],
       [ 0.03468829,  0.01500225,  0.01230312,  0.01825218]],
      dtype=float32)
- Parameters:
id – The environment ID. This must be a valid ID from the registry.
num_envs – Number of copies of the environment.
asynchronous – If True, wraps the environments in an AsyncVectorEnv (which uses multiprocessing to run the environments in parallel). If False, wraps the environments in a SyncVectorEnv.
wrappers – If not None, then apply the wrappers to each internal environment during creation.
disable_env_checker – Whether to disable the environment checker, which is only ever run for the first environment. None defaults to the environment spec’s disable_env_checker parameter (False by default); otherwise the checker follows this argument (True = not run, False = run).
**kwargs – Keyword arguments passed to gym.make.
- Returns:
The vectorized environment.
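The role of the wrappers argument can be sketched without Gymnasium. The helper below, build_env_fns, is hypothetical: it builds one zero-argument factory per copy and applies either a single wrapper or a list of wrappers to each new environment. A list of factories of this kind is what AsyncVectorEnv and SyncVectorEnv take as env_fns, and the asynchronous flag then only decides which of the two classes receives it.

```python
from typing import Callable, List, Optional, Union

Wrapper = Callable[[object], object]


def build_env_fns(
    make_env: Callable[[], object],
    num_envs: int,
    wrappers: Optional[Union[Wrapper, List[Wrapper]]] = None,
) -> List[Callable[[], object]]:
    """Return num_envs factories, each producing a freshly wrapped environment."""
    if wrappers is None:
        wrapper_list = []
    elif callable(wrappers):
        wrapper_list = [wrappers]
    else:
        wrapper_list = list(wrappers)

    def thunk():
        env = make_env()
        for wrap in wrapper_list:  # applied in order, innermost first
            env = wrap(env)
        return env

    return [thunk for _ in range(num_envs)]


# Strings stand in for environments so the wrapping order is visible.
env_fns = build_env_fns(lambda: "env", 2, [lambda e: f"A({e})", lambda e: f"B({e})"])
created = [fn() for fn in env_fns]
```

Each factory is only called when the vector environment is constructed, so every copy gets its own environment instance rather than sharing one.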
Async Vector Env#
- class gymnasium.vector.AsyncVectorEnv(env_fns: Sequence[callable], observation_space: Optional[Space] = None, action_space: Optional[Space] = None, shared_memory: bool = True, copy: bool = True, context: Optional[str] = None, daemon: bool = True, worker: Optional[callable] = None)#
Vectorized environment that runs multiple environments in parallel.
It uses multiprocessing processes, and pipes for communication.
Example:
>>> import gymnasium as gym
>>> env = gym.vector.AsyncVectorEnv([
...     lambda: gym.make("Pendulum-v1", g=9.81),
...     lambda: gym.make("Pendulum-v1", g=1.62)
... ])
>>> observations, infos = env.reset()
>>> observations
array([[-0.8286432 ,  0.5597771 ,  0.90249056],
       [-0.85009176,  0.5266346 ,  0.60007906]], dtype=float32)
- Parameters:
env_fns – Functions that create the environments.
observation_space – Observation space of a single environment. If None, then the observation space of the first environment is taken.
action_space – Action space of a single environment. If None, then the action space of the first environment is taken.
shared_memory – If True, then the observations from the worker processes are communicated back through shared variables. This can improve the efficiency if the observations are large (e.g. images).
copy – If True, then the reset() and step() methods return a copy of the observations.
context – Context for multiprocessing. If None, then the default context is used.
daemon – If True, then subprocesses have the daemon flag turned on; that is, they will quit if the head process quits. However, daemon=True prevents subprocesses from spawning children, so for some environments you may want to set it to False.
worker – If set, then use that worker in a subprocess instead of the default one. Can be useful to override some inner vector env logic, for instance, how resets on termination or truncation are handled.
- Warnings:
worker is an advanced mode option. It provides a high degree of flexibility and a high chance to shoot yourself in the foot; thus, if you are writing your own worker, it is recommended to start from the code for the _worker (or _worker_shared_memory) method, and add changes.
- Raises:
RuntimeError – If the observation space of some sub-environment does not match observation_space (or, by default, the observation space of the first sub-environment).
ValueError – If observation_space is a custom space (i.e. not a default space in Gymnasium, such as gymnasium.spaces.Box, gymnasium.spaces.Discrete, or gymnasium.spaces.Dict) and shared_memory is True.
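The pipe-based design and the idea of a custom worker can be illustrated with a toy command loop. This is a sketch under assumptions, not the library's _worker: the command names, the toy_worker function and the EchoEnv class are hypothetical. The loop runs in a thread here so the example stays self-contained, whereas AsyncVectorEnv runs each worker in its own process.

```python
import threading
from multiprocessing import Pipe


class EchoEnv:
    """Toy environment whose observation is the last action it received."""

    def reset(self):
        return 0, {}

    def step(self, action):
        return action, 1.0, False, False, {}


def toy_worker(conn, env_fn):
    """Serve (command, data) requests from the parent until told to close."""
    env = env_fn()
    while True:
        command, data = conn.recv()
        if command == "reset":
            conn.send(env.reset())
        elif command == "step":
            conn.send(env.step(data))
        elif command == "close":
            conn.send(None)
            break
        else:
            raise ValueError(f"unknown command: {command!r}")
    conn.close()


parent_conn, child_conn = Pipe()
worker = threading.Thread(target=toy_worker, args=(child_conn, EchoEnv))
worker.start()

# The parent side sends a command and waits for the matching reply.
parent_conn.send(("reset", None))
first = parent_conn.recv()
parent_conn.send(("step", 5))
second = parent_conn.recv()
parent_conn.send(("close", None))
parent_conn.recv()
worker.join()
```

A custom worker would change what happens inside the "step" branch, for example how a finished episode is reset, while keeping the request and reply protocol intact.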
Sync Vector Env#
- class gymnasium.vector.SyncVectorEnv(env_fns: Iterator[Callable[[], Env]], observation_space: Optional[Space] = None, action_space: Optional[Space] = None, copy: bool = True)#
Vectorized environment that serially runs multiple environments.
Example:
>>> import gymnasium as gym
>>> env = gym.vector.SyncVectorEnv([
...     lambda: gym.make("Pendulum-v1", g=9.81),
...     lambda: gym.make("Pendulum-v1", g=1.62)
... ])
>>> observations, infos = env.reset()
>>> observations
array([[-0.8286432 ,  0.5597771 ,  0.90249056],
       [-0.85009176,  0.5266346 ,  0.60007906]], dtype=float32)
- Parameters:
env_fns – Iterable of callables that create the environments.
observation_space – Observation space of a single environment. If None, then the observation space of the first environment is taken.
action_space – Action space of a single environment. If None, then the action space of the first environment is taken.
copy – If True, then the reset() and step() methods return a copy of the observations.
- Raises:
RuntimeError – If the observation space of some sub-environment does not match observation_space (or, by default, the observation space of the first sub-environment).
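When env_fns is built in a loop rather than written out by hand, Python's late binding of closure variables is a common pitfall: every lambda sees the final loop value. A minimal sketch of the usual fix, binding the value through a default argument, is shown below. make_toy_env is a hypothetical stand-in for a call such as gym.make.

```python
def make_toy_env(g):
    """Hypothetical stand-in for an environment constructor."""
    return {"g": g}


gravities = [9.81, 1.62]

# Pitfall: each lambda looks up `g` when called, after the loop has finished.
late_bound = [lambda: make_toy_env(g) for g in gravities]

# Fix: a default argument captures the current value of `g` per iteration.
env_fns = [lambda g=g: make_toy_env(g) for g in gravities]

late_values = [fn()["g"] for fn in late_bound]
bound_values = [fn()["g"] for fn in env_fns]
```

With the late-bound version both factories would build the same environment, which silently defeats the purpose of varying a parameter across sub-environments.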