CartPole is an inverted pendulum task. The pole starts out standing upright on the cart, and you move the cart left and right to keep it from falling.
■Concrete example using the CartPole-v0 environment
The simplest way to run it is shown below, as an example of using this game as a reinforcement learning task. OpenAI gym and numpy must be installed beforehand.
import gym

env = gym.make('CartPole-v0')  # Create the CartPole environment
env.reset()  # Initialize the state
for i in range(100):
    env.render()  # Draw the animation
    # Apply a random action and receive the result
    observation, reward, done, info = env.step(env.action_space.sample())
    print("Step:", i, done, "Reward:", reward, "Obs:", observation)
<Operate CartPole:env.step>
Pass 0 to "env.step()" to push the cart left and 1 to push it right. "env.action_space.sample()" is a function that returns a randomly selected action.
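Instead of a random action, you can pass 0 or 1 yourself. The following is a minimal sketch of a hand-written rule that pushes the cart toward the side the pole is leaning. The name lean_policy is made up for illustration, and the sketch assumes that index 2 of the observation is the pole angle and that a positive angle means the pole leans to the right.

```python
LEFT, RIGHT = 0, 1  # action codes accepted by env.step(), as described above


def lean_policy(observation):
    """Pick an action from the pole angle alone (hypothetical helper).

    Assumes observation[2] is the pole angle and that a positive
    value means the pole is leaning to the right.
    """
    pole_angle = observation[2]
    return RIGHT if pole_angle > 0 else LEFT
```

In the loop above, this would replace the random action, for example "env.step(lean_policy(observation))", provided observation is first taken from the return value of "env.reset()".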
<Cartpole state:observation>
After each step, the resulting state of the cart and pole is returned in observation, an array of four values: cart position, cart velocity, pole angle, and pole angular velocity. The range of each value is the one given in the official definition.
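To make the printed observation easier to read, its values can be given names. This is only a sketch: CartPoleState and describe are hypothetical names, and the field order (cart position, cart velocity, pole angle, pole angular velocity) is assumed from the official definition of the environment.

```python
from collections import namedtuple

# Field order is an assumption based on the official CartPole definition.
CartPoleState = namedtuple(
    "CartPoleState",
    ["cart_position", "cart_velocity", "pole_angle", "pole_angular_velocity"],
)


def describe(observation):
    """Format a 4-element observation as a readable one-line string."""
    state = CartPoleState(*(float(value) for value in observation))
    return "x={:+.3f} v={:+.3f} angle={:+.3f} rad omega={:+.3f}".format(
        state.cart_position,
        state.cart_velocity,
        state.pole_angle,
        state.pole_angular_velocity,
    )
```

Calling "print(describe(observation))" inside the loop would then show each value with its name instead of a bare array.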
<Reward acquisition conditions:reward=1>
When both of the following conditions are met, the step returns "reward = 1".
① Pole angle within ±0.21 rad (about 12°)
② Cart position within ±2.4
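The two conditions can be written as a small check. This is a sketch, not the environment's own code: within_limits is a hypothetical helper, the thresholds are the rounded values quoted above, and the observation is assumed to be ordered as cart position, cart velocity, pole angle, pole angular velocity.

```python
ANGLE_LIMIT = 0.21     # pole angle limit quoted in the text (radians)
POSITION_LIMIT = 2.4   # cart position limit quoted in the text


def within_limits(observation):
    """Return True while both conditions (1) and (2) hold.

    Assumes observation = [cart position, cart velocity,
    pole angle, pole angular velocity].
    """
    cart_position, _, pole_angle, _ = observation
    return abs(pole_angle) <= ANGLE_LIMIT and abs(cart_position) <= POSITION_LIMIT
```

A check like this is handy for logging why an episode stopped, since the environment itself only reports done.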
"env.step()" still moves the cart after these conditions are violated, but for reinforcement learning you must end the episode at that point and call "env.reset()".
(If you keep stepping, the following warning message appears and prompts you to reset.)
You are calling 'step()' even though this environment has already returned done = True. You should always call 'reset()' once you receive 'done = True' -- any further steps are undefined behavior.
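One way to avoid this message is to reset as soon as done is returned. The sketch below assumes the same API as the example above, where "reset()" returns an observation and "step()" returns four values. The names run_with_resets and policy are hypothetical.

```python
def run_with_resets(env, total_steps, policy):
    """Step env for total_steps steps, resetting whenever an episode ends.

    Assumes reset() returns an observation and step() returns
    (observation, reward, done, info). Returns the length of each
    completed episode.
    """
    episode_lengths = []
    steps_in_episode = 0
    observation = env.reset()
    for _ in range(total_steps):
        observation, reward, done, info = env.step(policy(observation))
        steps_in_episode += 1
        if done:
            episode_lengths.append(steps_in_episode)
            steps_in_episode = 0
            observation = env.reset()  # never step a finished episode
    return episode_lengths
```

With gym installed, "env = gym.make('CartPole-v0')" followed by "run_with_resets(env, 100, lambda obs: env.action_space.sample())" should run random actions for 100 steps without stepping past the end of an episode.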
<Game end condition:done=True>
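The game ends (done=True) as soon as either of the two conditions above is violated. CartPole-v0 also ends the episode once it reaches 200 steps. Note that the step on which done first becomes True still returns reward=1; reward=0 appears only if you keep calling "env.step()" after that.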