Vector#

Gymnasium.vector.VectorEnv#

class gymnasium.vector.VectorEnv(num_envs: int, observation_space: Space, action_space: Space)#

Base class for vectorized environments to run multiple independent copies of the same environment in parallel.

Vector environments can provide a linear speed-up in the steps taken per second through sampling multiple sub-environments at the same time. To prevent terminated environments waiting until all sub-environments have terminated or truncated, the vector environments autoreset sub-environments after they terminate or truncate. As a result, the final step’s observation and info are overwritten by the reset’s observation and info. Therefore, the observation and info for the final step of a sub-environment are stored in the info parameter, under the keys “final_observation” and “final_info” respectively. See step() for more information.
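The autoreset bookkeeping can be sketched without Gymnasium itself. The helper below is hypothetical (true_last_observations is not part of the library) and assumes that info["final_observation"] holds one entry per sub-environment, with None for sub-environments that did not finish on this step.

```python
def true_last_observations(observations, infos):
    """Return, per sub-environment, the observation that really ended the step.

    Hypothetical helper: for sub-environments that were autoreset, the
    returned observation is the first one of the new episode, so the real
    final observation is read from infos["final_observation"] instead.
    """
    finals = infos.get("final_observation")
    result = []
    for index, observation in enumerate(observations):
        final = None if finals is None else finals[index]
        # Use the stored final observation when there is one for this index.
        result.append(observation if final is None else final)
    return result


# Sub-environment 1 finished on this step and was autoreset.
observations = [[0.1, 0.2], [0.0, 0.0], [0.3, 0.4]]
infos = {"final_observation": [None, [0.9, 0.8], None]}
last = true_last_observations(observations, infos)
```

Here last keeps the returned observation for sub-environments 0 and 2 and substitutes the stored final observation for sub-environment 1. This matters when bootstrapping a value estimate from the last observation of an episode.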

The vector environments batch observations, rewards, terminations, truncations and info for each parallel environment. In addition, step() expects to receive a batch of actions for each parallel environment.

Gymnasium contains two types of Vector environments: AsyncVectorEnv and SyncVectorEnv.

Vector environments expose the following additional attributes, which describe the number of sub-environments and their spaces:

num_envs - The number of sub-environments in the vector environment
observation_space - The batched observation space of the vector environment
single_observation_space - The observation space of a single sub-environment
action_space - The batched action space of the vector environment
single_action_space - The action space of a single sub-environment

Note

Before OpenAI Gym v25, the info returned by reset() and step() was a list of dictionaries, one per sub-environment. In OpenAI Gym v25+ and in Gymnasium, it is instead a single dictionary with a NumPy array for each key. To use the old info style, use the VectorListInfo wrapper.
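The relation between the two info styles can be sketched in plain Python. In the library the conversion is done by VectorListInfo; the function below is only an illustration with a hypothetical name. It assumes that every key "k" is accompanied by a boolean mask "_k" marking which sub-environments reported a value, a convention not stated in the text above.

```python
def info_dict_to_list(infos, num_envs):
    """Convert a dictionary-style vector info into one dictionary per sub-environment.

    Assumption: every key "k" is paired with a boolean mask "_k" marking
    which sub-environments actually reported a value for "k".
    """
    list_info = [{} for _ in range(num_envs)]
    for key, values in infos.items():
        if key.startswith("_"):
            continue  # mask entries are consumed below, not copied
        mask = infos.get("_" + key, [True] * num_envs)
        for index in range(num_envs):
            if mask[index]:
                list_info[index][key] = values[index]
    return list_info


infos = {"lives": [3, 0, 2], "_lives": [True, False, True]}
per_env = info_dict_to_list(infos, num_envs=3)
```

In this toy input, sub-environment 1 reported nothing, so its dictionary stays empty while the other two receive their own "lives" entry.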

Note

To render the sub-environments, use call() with “render” as the argument. Remember to set the render_mode for all the sub-environments during initialization.
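The idea behind call() can be sketched with stand-in objects. ToyEnv and call_on_all are hypothetical names, and the sketch assumes that call() looks up the given name on every sub-environment, invokes it if it is callable, and collects one result per sub-environment. It is not the library's implementation.

```python
class ToyEnv:
    """Stand-in for a sub-environment; not a Gymnasium class."""

    def __init__(self, name):
        self.name = name
        self.render_mode = "ansi"

    def render(self):
        return f"frame from {self.name}"


def call_on_all(envs, name, *args, **kwargs):
    """Look up `name` on every sub-environment and collect the results.

    Callable attributes are invoked with the given arguments; plain
    attributes are returned as they are.
    """
    results = []
    for env in envs:
        attribute = getattr(env, name)
        results.append(attribute(*args, **kwargs) if callable(attribute) else attribute)
    return tuple(results)


envs = [ToyEnv("env-0"), ToyEnv("env-1")]
frames = call_on_all(envs, "render")
modes = call_on_all(envs, "render_mode")
```

With a real vector environment the corresponding call would be envs.call("render"), which only produces frames if each sub-environment was created with a suitable render mode.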

Note

All parallel environments should share identical observation and action spaces. In other words, a vector of multiple different environments is not supported.


Parameters:

num_envs – Number of environments in the vectorized environment.
observation_space – Observation space of a single environment.
action_space – Action space of a single environment.

Methods#

VectorEnv.reset(*, seed: Optional[Union[int, List[int]]] = None, options: Optional[dict] = None)#

Reset all parallel environments and return a batch of initial observations and info.

Parameters:

seed – The environment reset seeds
options – Additional options for the reset, if any

Returns:

A batch of observations and info from the vectorized environment.

An example:

>>> import gymnasium as gym
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> envs.reset()
(array([[-0.02240574, -0.03439831, -0.03904812,  0.02810693],
       [ 0.01586068,  0.01929009,  0.02394426,  0.04016077],
       [-0.01314174,  0.03893502, -0.02400815,  0.0038326 ]],
      dtype=float32), {})
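Since seed accepts either a single integer or a list, it helps to see how one value can be turned into a seed per sub-environment. The sketch below uses a hypothetical helper, spread_seed, and assumes that an integer seed is spread as seed + i over sub-environment i. This is an assumption about concrete implementations, not something the base class guarantees.

```python
def spread_seed(seed, num_envs):
    """Expand a reset seed into one seed per sub-environment.

    Assumption: a single integer seeds sub-environment i with seed + i,
    a list is used as given, and None leaves every sub-environment unseeded.
    """
    if seed is None:
        return [None] * num_envs
    if isinstance(seed, int):
        return [seed + index for index in range(num_envs)]
    seeds = list(seed)
    if len(seeds) != num_envs:
        raise ValueError(f"expected {num_envs} seeds, got {len(seeds)}")
    return seeds


seeds = spread_seed(42, 3)
```

Giving each sub-environment a different seed avoids running identical copies of the same episode in parallel.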

VectorEnv.step(actions)#

Take an action for each parallel environment.

Parameters:

actions – Batch of actions; an element of action_space.

Returns:

Batch of (observations, rewards, terminations, truncations, infos)

Note

As the vector environments autoreset terminated and truncated sub-environments, the returned observation and info are not the final step’s observation and info, which are instead stored in info under “final_observation” and “final_info”.

An example:

>>> import gymnasium as gym
>>> import numpy as np
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> observations, infos = envs.reset()
>>> actions = np.array([1, 0, 1])
>>> observations, rewards, termination, truncation, infos = envs.step(actions)

>>> observations
array([[ 0.00122802,  0.16228443,  0.02521779, -0.23700266],
        [ 0.00788269, -0.17490888,  0.03393489,  0.31735462],
        [ 0.04918966,  0.19421194,  0.02938497, -0.29495203]],
        dtype=float32)
>>> rewards
array([1., 1., 1.])
>>> termination
array([False, False, False])
>>> truncation
array([False, False, False])
>>> infos
{}
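Because sub-environments are autoreset independently, per-episode statistics have to be tracked per index. The following sketch, with a hypothetical helper name, accumulates rewards from successive step() calls and reports a return whenever the termination or truncation flag of a sub-environment is set. It works on plain lists and assumes nothing about Gymnasium beyond the batch layout shown above.

```python
def update_episode_returns(running, rewards, terminations, truncations):
    """Add this step's rewards and report the episodes that just ended.

    `running` is a list with one accumulated return per sub-environment and
    is updated in place. Returns {index: episode_return} for finished episodes.
    """
    finished = {}
    for index, reward in enumerate(rewards):
        running[index] += reward
        if terminations[index] or truncations[index]:
            finished[index] = running[index]
            running[index] = 0.0  # the sub-environment was autoreset
    return finished


running = [0.0, 0.0, 0.0]
no_flags = [False, False, False]
update_episode_returns(running, [1.0, 1.0, 1.0], no_flags, no_flags)
finished = update_episode_returns(
    running, [1.0, 1.0, 1.0], [False, True, False], [False, False, True]
)
```

After the second call, sub-environments 1 and 2 each report a return of 2.0 and start again from zero, while sub-environment 0 keeps accumulating.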

VectorEnv.close(**kwargs)#

Close all parallel environments and release resources.

It also closes all the existing image viewers, then calls close_extras() and sets closed to True.

Warning

This function itself does not close the environments, it should be handled in close_extras(). This is generic for both synchronous and asynchronous vectorized environments.

Note

This is called automatically when the object is garbage collected or the program exits.

Parameters:

**kwargs – Keyword arguments passed to close_extras()
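The division of labour between close() and close_extras() is a template-method pattern, sketched here with stand-in classes. ToyVectorEnv and ToySyncVectorEnv are hypothetical, and the early return for an already closed environment is an assumption.

```python
class ToyVectorEnv:
    """Stand-in base class; mirrors only the close() behaviour described above."""

    def __init__(self):
        self.closed = False

    def close(self, **kwargs):
        if self.closed:
            return  # assumption: closing twice is a no-op
        self.close_extras(**kwargs)
        self.closed = True

    def close_extras(self, **kwargs):
        """Hook for subclasses; the base version releases nothing."""


class ToySyncVectorEnv(ToyVectorEnv):
    def __init__(self, resources):
        super().__init__()
        self.resources = resources

    def close_extras(self, **kwargs):
        # The subclass is responsible for actually closing its sub-environments.
        self.resources.clear()


envs = ToySyncVectorEnv(resources=["env-0", "env-1"])
envs.close()
envs.close()  # second call returns early because closed is already True
```

A subclass that forgets to override close_extras() would be marked as closed without releasing anything, which is the situation the warning above describes.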

Attributes#

action_space#

The (batched) action space. The input actions of step must be valid elements of action_space:

>>> envs = gymnasium.vector.make("CartPole-v1", num_envs=3)
>>> envs.action_space
MultiDiscrete([2 2 2])

observation_space#

The (batched) observation space. The observations returned by reset and step are valid elements of observation_space:

>>> envs = gymnasium.vector.make("CartPole-v1", num_envs=3)
>>> envs.observation_space
Box([[-4.8 ...]], [[4.8 ...]], (3, 4), float32)

single_action_space#

The action space of an environment copy:

>>> envs = gymnasium.vector.make("CartPole-v1", num_envs=3)
>>> envs.single_action_space
Discrete(2)

single_observation_space#

The observation space of an environment copy:

>>> envs = gymnasium.vector.make("CartPole-v1", num_envs=3)
>>> envs.single_observation_space
Box([-4.8 ...], [4.8 ...], (4,), float32)

Making Vector Environments#

gymnasium.vector.make(id: str, num_envs: int = 1, asynchronous: bool = True, wrappers: Optional[Union[callable, List[callable]]] = None, disable_env_checker: Optional[bool] = None, **kwargs) → VectorEnv#

Create a vectorized environment from multiple copies of an environment, from its id.

Example:

>>> import gymnasium as gym
>>> env = gym.vector.make('CartPole-v1', num_envs=3)
>>> observations, infos = env.reset()
>>> observations
array([[-0.04456399,  0.04653909,  0.01326909, -0.02099827],
       [ 0.03073904,  0.00145001, -0.03088818, -0.03131252],
       [ 0.03468829,  0.01500225,  0.01230312,  0.01825218]],
      dtype=float32)

Parameters:

id – The environment ID. This must be a valid ID from the registry.
num_envs – Number of copies of the environment.
asynchronous – If True, wraps the environments in an AsyncVectorEnv (which uses multiprocessing to run the environments in parallel). If False, wraps the environments in a SyncVectorEnv.
wrappers – If not None, then apply the wrappers to each internal environment during creation.
disable_env_checker – Whether to disable the environment checker, which is run on the first environment only. If None, the environment spec’s disable_env_checker value is used (False by default); otherwise this argument decides (True = do not run, False = run).
**kwargs – Keyword arguments passed to gym.make

Returns:

The vectorized environment.
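How the wrappers argument could be applied to each copy can be sketched with plain callables. apply_wrappers and make_env_fns are hypothetical helpers, create_env stands in for the call to gym.make, and the order of application (first wrapper innermost) is an assumption.

```python
def apply_wrappers(env, wrappers=None):
    """Apply a single wrapper, or a list of wrappers in order, to `env`."""
    if wrappers is None:
        return env
    if callable(wrappers):
        return wrappers(env)
    for wrapper in wrappers:
        env = wrapper(env)
    return env


def make_env_fns(create_env, num_envs, wrappers=None):
    """Build one zero-argument factory per copy of the environment."""
    return [lambda: apply_wrappers(create_env(), wrappers) for _ in range(num_envs)]


# Strings stand in for environments so that the nesting order is visible.
env_fns = make_env_fns(
    lambda: "env",
    num_envs=2,
    wrappers=[lambda e: f"A({e})", lambda e: f"B({e})"],
)
built = [env_fn() for env_fn in env_fns]
```

Each factory produces an independently wrapped copy, which is the form of input that AsyncVectorEnv and SyncVectorEnv expect as env_fns.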

Async Vector Env#

class gymnasium.vector.AsyncVectorEnv(env_fns: Sequence[callable], observation_space: Optional[Space] = None, action_space: Optional[Space] = None, shared_memory: bool = True, copy: bool = True, context: Optional[str] = None, daemon: bool = True, worker: Optional[callable] = None)#

Vectorized environment that runs multiple environments in parallel.

It uses multiprocessing processes, and pipes for communication.

Example:

>>> import gymnasium as gym
>>> env = gym.vector.AsyncVectorEnv([
...     lambda: gym.make("Pendulum-v1", g=9.81),
...     lambda: gym.make("Pendulum-v1", g=1.62)
... ])
>>> observations, infos = env.reset()
>>> observations
array([[-0.8286432 ,  0.5597771 ,  0.90249056],
       [-0.85009176,  0.5266346 ,  0.60007906]], dtype=float32)


Parameters:

env_fns – Functions that create the environments.
observation_space – Observation space of a single environment. If None, then the observation space of the first environment is taken.
action_space – Action space of a single environment. If None, then the action space of the first environment is taken.
shared_memory – If True, then the observations from the worker processes are communicated back through shared variables. This can improve the efficiency if the observations are large (e.g. images).
copy – If True, then the reset() and step() methods return a copy of the observations.
context – Context for multiprocessing. If None, then the default context is used.
daemon – If True, then subprocesses have the daemon flag turned on; that is, they will quit if the head process quits. However, daemon=True prevents subprocesses from spawning children, so for some environments you may want to set it to False.
worker – If set, then use that worker in a subprocess instead of a default one. Can be useful to override some inner vector env logic, for instance, how resets on termination or truncation are handled.

Warning: worker is an advanced option. It provides a high degree of flexibility and a high chance of shooting yourself in the foot; thus, if you are writing your own worker, it is recommended to start from the code of the _worker (or _worker_shared_memory) function and modify it.

Raises:

RuntimeError – If the observation space of some sub-environment does not match observation_space (or, by default, the observation space of the first sub-environment).
ValueError – If observation_space is a custom space (i.e. not a default space in Gymnasium, such as gymnasium.spaces.Box, gymnasium.spaces.Discrete, or gymnasium.spaces.Dict) and shared_memory is True.
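Because env_fns are called only after the list has been built, factories created in a loop are subject to Python's late-binding closures. The sketch below uses plain dictionaries in place of gym.make, and both function names are hypothetical. It shows the pitfall and one common fix, binding the loop variable as a default argument.

```python
def make_factories_late(values):
    # Pitfall: every lambda looks up `value` when called, after the loop ended.
    return [lambda: {"g": value} for value in values]


def make_factories_bound(values):
    # Binding `value` as a default argument freezes it per factory.
    return [lambda value=value: {"g": value} for value in values]


late = [factory() for factory in make_factories_late([9.81, 1.62])]
bound = [factory() for factory in make_factories_bound([9.81, 1.62])]
```

With the late-binding version every factory would build a sub-environment with the last value of g; the bound version keeps one value per factory. Writing each lambda out by hand with a literal value, as in the example above, avoids the issue as well.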

Sync Vector Env#

class gymnasium.vector.SyncVectorEnv(env_fns: Iterable[Callable[[], Env]], observation_space: Optional[Space] = None, action_space: Optional[Space] = None, copy: bool = True)#

Vectorized environment that serially runs multiple environments.

Example:

>>> import gymnasium as gym
>>> env = gym.vector.SyncVectorEnv([
...     lambda: gym.make("Pendulum-v1", g=9.81),
...     lambda: gym.make("Pendulum-v1", g=1.62)
... ])
>>> observations, infos = env.reset()
>>> observations
array([[-0.8286432 ,  0.5597771 ,  0.90249056],
       [-0.85009176,  0.5266346 ,  0.60007906]], dtype=float32)


Parameters:

env_fns – Iterable of callables that create the environments.
observation_space – Observation space of a single environment. If None, then the observation space of the first environment is taken.
action_space – Action space of a single environment. If None, then the action space of the first environment is taken.
copy – If True, then the reset() and step() methods return a copy of the observations.

Raises:

RuntimeError – If the observation space of some sub-environment does not match observation_space (or, by default, the observation space of the first sub-environment).
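The effect of the copy parameter can be illustrated with a stand-in that reuses a single observation buffer between steps. ToySyncStepper is hypothetical, and the reuse of an internal buffer is an assumption about how a vector environment may store observations.

```python
import copy


class ToySyncStepper:
    """Stand-in that reuses one observation buffer, like a vector env might."""

    def __init__(self, num_envs, copy_observations=True):
        self.copy_observations = copy_observations
        self._buffer = [[0.0] for _ in range(num_envs)]

    def step(self):
        # Sub-environments are advanced one after another, in order.
        for row in self._buffer:
            row[0] += 1.0
        return copy.deepcopy(self._buffer) if self.copy_observations else self._buffer


safe = ToySyncStepper(num_envs=2, copy_observations=True)
first_safe = safe.step()
safe.step()  # does not affect first_safe

shared = ToySyncStepper(num_envs=2, copy_observations=False)
first_shared = shared.step()
shared.step()  # overwrites the data first_shared points to
```

Without copying, an observation kept from an earlier step silently changes on the next step. Copying trades a little speed for safety when observations are stored, for example in a replay buffer.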