Bipedal Walker


This environment is part of the Box2D environments; the Box2D environments page contains general information about them.

Action Space: Box(-1.0, 1.0, (4,), float32)

Observation Space: Box([-3.1415927 -5. -5. -5. -3.1415927 -5. -3.1415927 -5. -0. -3.1415927 -5. -3.1415927 -5. -0. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1.], [3.1415927 5. 5. 5. 3.1415927 5. 3.1415927 5. 5. 3.1415927 5. 3.1415927 5. 5. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.], (24,), float32)




Description

This is a simple 4-joint walker robot environment. There are two versions:

  • Normal, with slightly uneven terrain.

  • Hardcore, with ladders, stumps, pitfalls.

To solve the normal version, you need to get 300 points in 1600 time steps. To solve the hardcore version, you need 300 points in 2000 time steps.
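The solve criteria above can be written as a small check. This is a minimal sketch, not part of Gymnasium: the name `is_solved` and its arguments are hypothetical, and it assumes that "points" means the sum of per-step rewards within a single episode.

```python
# Hypothetical helper, not part of Gymnasium.
SOLVE_SCORE = 300                        # points needed in both versions
STEP_LIMIT = {False: 1600, True: 2000}   # keyed by the hardcore flag


def is_solved(step_rewards, hardcore=False):
    """Return True if the episode reached 300 points within the step limit."""
    if len(step_rewards) > STEP_LIMIT[hardcore]:
        return False
    return sum(step_rewards) >= SOLVE_SCORE
```

Passing the list of rewards collected during one episode is enough; the `hardcore` flag only changes the step budget.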

A heuristic is provided for testing. It is also useful for generating demonstrations to learn from. To run the heuristic:

python gymnasium/envs/box2d/

Action Space

Actions are motor speed values in the [-1, 1] range for each of the 4 joints: the hip and knee of both legs.
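A controller that produces unbounded outputs may need to be clamped before its values are used as actions. The following is a minimal sketch of that step; `clip_action` is a hypothetical helper, not a Gymnasium function.

```python
def clip_action(raw_action):
    """Clamp a 4-element motor command to the documented [-1, 1] range."""
    if len(raw_action) != 4:
        raise ValueError("expected one value per joint (4 in total)")
    # Clamp each component independently.
    return [max(-1.0, min(1.0, float(a))) for a in raw_action]
```

For example, `clip_action([2, -3, 0.5, -0.25])` returns `[1.0, -1.0, 0.5, -0.25]`.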

Observation Space

The state consists of the hull angle, hull angular velocity, horizontal speed, vertical speed, joint positions and joint angular speeds, leg contact with the ground, and 10 lidar rangefinder measurements. There are no coordinates in the state vector.
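To make the 24-element vector easier to work with, it can be split into named parts. The sketch below is not part of Gymnasium. The index layout is an assumption inferred from the Box bounds listed at the top of this page (±π for angles, 0 to 5 for the two leg-contact entries, ±1 for the 10 lidar readings), and the key names are hypothetical.

```python
def split_observation(obs):
    """Split a 24-element observation into named parts (assumed layout)."""
    obs = list(obs)
    if len(obs) != 24:
        raise ValueError("expected a 24-element observation")
    return {
        "hull_angle": obs[0],
        "hull_angular_velocity": obs[1],
        "velocity_x": obs[2],
        "velocity_y": obs[3],
        "leg_1_joints": obs[4:8],    # joint angles and speeds, first leg
        "leg_1_contact": obs[8],     # ground-contact entry, first leg
        "leg_2_joints": obs[9:13],   # joint angles and speeds, second leg
        "leg_2_contact": obs[13],    # ground-contact entry, second leg
        "lidar": obs[14:24],         # 10 rangefinder measurements
    }
```

The order of values within each leg's joint block is deliberately left unnamed here, since the text above does not specify it.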


Rewards

Reward is given for moving forward and totals 300+ points by the time the walker reaches the far end. If the robot falls, it receives -100. Applying motor torque costs a small number of points, so an agent that walks more efficiently gets a better score.
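The shape of this reward can be illustrated with a rough sketch. It is not the environment's actual formula: `sketch_step_reward`, the `forward_progress` input, and the `torque_cost` coefficient are placeholders chosen for illustration. Only the -100 penalty for falling comes from the text above.

```python
def sketch_step_reward(forward_progress, action, fell, torque_cost=0.01):
    """Illustrative per-step reward: progress minus a small motor cost."""
    if fell:
        return -100.0  # falling penalty stated in the text
    # Placeholder effort term: total magnitude of the motor commands.
    effort = sum(min(abs(a), 1.0) for a in action)
    return forward_progress - torque_cost * effort
```

The sketch shows why two agents that cover the same distance can score differently: the one that applies less torque loses fewer points along the way.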

Starting State

The walker starts standing at the left end of the terrain with the hull horizontal, and both legs in the same position with a slight knee angle.

Episode Termination

The episode terminates if the hull comes into contact with the ground or if the walker moves past the right end of the terrain.
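A rollout loop that respects termination might look like the sketch below. It assumes the standard Gymnasium interface, where `reset` returns `(observation, info)` and `step` returns `(observation, reward, terminated, truncated, info)`. The function `run_episode` and its `policy` argument are hypothetical names, not part of the library.

```python
def run_episode(env, policy, max_steps=1600, seed=None):
    """Roll out one episode; return (total_reward, steps, terminated)."""
    obs, info = env.reset(seed=seed)
    total_reward, steps, terminated = 0.0, 0, False
    for steps in range(1, max_steps + 1):
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += float(reward)
        # Stop on termination (e.g. the hull touched the ground)
        # or on truncation by a time limit.
        if terminated or truncated:
            break
    return total_reward, steps, terminated
```

With Gymnasium and its Box2D dependencies installed, this could be called with an environment created by gym.make("BipedalWalker-v3") and a policy such as `lambda obs: env.action_space.sample()`.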


Arguments

To use the hardcore environment, pass hardcore=True to gym.make:

>>> import gymnasium as gym
>>> env = gym.make("BipedalWalker-v3", hardcore=True, render_mode="rgb_array")
>>> env

Version History

  • v3: Returns the closest lidar trace instead of furthest; faster video recording

  • v2: Count energy spent

  • v1: Legs now report contact with ground; motors have higher torque and speed; ground has higher friction; lidar rendered less nervously.

  • v0: Initial version


Created by Oleg Klimov