  • Building an RL environment with gym

    Calling gym

    A gym environment is driven in the following order:

    1. env = gym.make('x')
    2. observation = env.reset()
    3. for i in range(time_steps):
           env.render()
           action = policy(observation)
           observation, reward, done, info = env.step(action)
           if done:
               ……
               break
    4. env.close()

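    Example

    The example uses a simple policy: when the pole leans left, the cart is pushed left; when it leans right, the cart is pushed right.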

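    import gym
    import numpy as np

    # Written against the classic gym API (before gym 0.26), where
    # reset() returns the observation and step() returns four values.
    env = gym.make('CartPole-v0')
    t_all = []       # step index at which each episode ended
    action_bef = 0   # action taken on the previous step
    for i_episode in range(5):
        observation = env.reset()
        for t in range(100):
            env.render()
            # cart position, cart velocity, pole angle, pole angular velocity
            cp, cv, pa, pv = observation
            if abs(pa) <= 0.1:
                # pole nearly upright: alternate between the two actions
                action = 1 - action_bef
            elif pa > 0:
                action = 1   # pole leans right: push the cart right
            else:
                action = 0   # pole leans left: push the cart left
            observation, reward, done, info = env.step(action)
            action_bef = action
            if done:
                # print("Episode finished after {} timesteps".format(t+1))
                t_all.append(t)
                break
            if t == 99:
                # the episode survived all 100 steps; it is recorded as 0,
                # which lowers the mean printed below
                t_all.append(0)
    env.close()
    print(t_all)
    print(np.mean(t_all))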
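
The policy in the example above can be pulled out into a small pure function, which makes the rule easy to check without opening a render window. This is a minimal sketch: the name `choose_action` and the `deadband` parameter are introduced here for illustration only, while the threshold 0.1 and the action coding (0 pushes the cart left, 1 pushes it right) follow the example.

```python
def choose_action(pole_angle, previous_action, deadband=0.1):
    """Return 0 (push left) or 1 (push right) for CartPole.

    Inside the deadband the action alternates, so the cart does not
    keep accelerating in one direction while the pole is near upright.
    """
    if abs(pole_angle) <= deadband:
        return 1 - previous_action
    return 1 if pole_angle > 0 else 0


# Pole leaning right -> push right; leaning left -> push left.
print(choose_action(0.3, previous_action=0))    # 1
print(choose_action(-0.3, previous_action=1))   # 0
# Near upright -> flip the previous action.
print(choose_action(0.05, previous_action=1))   # 0
```

Inside the loop, this would replace the if/elif chain with `action = choose_action(pa, action_bef)`.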
    
    
    

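    Building a gym environment

    Functions that make up a gym environment

    A complete gym environment consists of the class definition and the following methods: initialization, reset, seed, step, render, and close.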

    • class CartPoleEnv(gym.Env):
      • def __init__(self):
      • def reset(self):
      • def seed(self, seed=None): return [seed]
      • def step(self, action): return self.state, reward, done, {}
      • def render(self, mode='human'): return self.viewer.render()
      • def close(self):
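
To show how the methods listed above fit together, here is a toy environment with the same method set. It is only a sketch under stated assumptions: `PointEnv`, its `limit` parameter, and the reward scheme are invented for illustration, and the class is written in plain Python so it runs without gym installed. A real environment would subclass gym.Env and declare action_space and observation_space in `__init__`.

```python
import random


class PointEnv:
    """Toy environment: a point on a line that must stay in [-limit, limit]."""

    def __init__(self, limit=3):
        self.limit = limit
        self.rng = random.Random()
        self.state = 0

    def seed(self, seed=None):
        # Seed the environment's own random generator.
        self.rng.seed(seed)
        return [seed]

    def reset(self):
        # Start each episode at -1, 0 or +1 and return the first observation.
        self.state = self.rng.choice([-1, 0, 1])
        return self.state

    def step(self, action):
        # Actions 0, 1, 2 move the point by -1, 0, +1.
        self.state += action - 1
        done = abs(self.state) > self.limit
        reward = 0.0 if done else 1.0
        return self.state, reward, done, {}

    def render(self, mode='human'):
        # Text rendering instead of a viewer window.
        line = ['.'] * (2 * self.limit + 1)
        if abs(self.state) <= self.limit:
            line[self.state + self.limit] = 'o'
        return ''.join(line)

    def close(self):
        # Nothing to release in this toy example.
        pass


env = PointEnv(limit=3)
env.seed(0)
observation = env.reset()
for t in range(10):
    observation, reward, done, info = env.step(2)  # always move right
    if done:
        break
env.close()
print(t + 1, observation)
```

Because the point always moves right, the episode ends as soon as the state passes `limit`; how many steps that takes depends on the random start position.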

    Utility functions

    • Clamping a parameter to its limits: vel = np.clip(vel, vel_min, vel_max)
    • Validating the action input:
      self.action_space.contains(action)

    • Defining the action and observation spaces
      Discrete(3) allows the actions 0, 1, 2
      self.low = np.array([min_0, min_1], dtype=np.float32)
      self.high = np.array([max_0, max_1], dtype=np.float32)

      self.action_space = spaces.Discrete(3)
      self.observation_space = spaces.Box(
          self.low, self.high, dtype=np.float32)
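
The clamping and validation helpers above usually sit at the top of step(). The sketch below spells out what they do in plain Python so that it runs on its own: `clip` mirrors the scalar behavior of np.clip, and `is_valid_action` is a simplified stand-in for what a Discrete(3) space accepts through contains. The helper names, the velocity bounds, and the 0.5 increment are illustrative assumptions, not gym functions or values from the original.

```python
def clip(value, low, high):
    """Scalar equivalent of np.clip(value, low, high)."""
    return max(low, min(high, value))


def is_valid_action(action, n=3):
    """Simplified stand-in for spaces.Discrete(n).contains(action)."""
    return isinstance(action, int) and 0 <= action < n


def update_velocity(vel, action, vel_min=-2.0, vel_max=2.0):
    """Apply one action (0: slow down, 1: keep, 2: speed up), then clamp."""
    if not is_valid_action(action):
        raise ValueError(f"invalid action: {action!r}")
    vel = vel + 0.5 * (action - 1)
    return clip(vel, vel_min, vel_max)


print(update_velocity(1.8, 2))   # 2.0, clamped at vel_max
print(update_velocity(0.0, 0))   # -0.5
```

Rejecting an invalid action before touching the state keeps a bad call from silently corrupting the episode.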

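    Adding your own environment to gym on macOS

    1. Open the gym.envs directory: /usr/local/lib/python3.7/site-packages/gym/envs
    2. Copy your own myenv.py into a subdirectory aa of envs (the import and entry point below use classic_control as aa)
    3. In envs/aa/__init__.py, add: from gym.envs.classic_control.myenv import MyEnv
    4. In envs/__init__.py, add:
    register(
        id='myenv-v0',
        entry_point='gym.envs.classic_control:MyEnv',
        max_episode_steps=999,
    )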
    
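    • Note: the id argument of register cannot be omitted.
      After registering, the environment can be used:
    env_id = 'myenv-v0'
    env = gym.make(env_id)
    observation = env.reset()
    action = env.action_space.sample()  # a random valid action, as a placeholder
    env.step(action)
    env.close()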
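
The entry_point string passed to register has the form 'module.path:ClassName'. As a rough illustration of how such a string can be turned into a class and looked up by id, the sketch below uses a hypothetical mini-registry. It is a simplified model, not gym's actual implementation: `load_entry_point`, `registry`, and this `make` are names invented here, and a standard-library class stands in for MyEnv so the snippet runs anywhere.

```python
import importlib


def load_entry_point(entry_point):
    """Resolve 'package.module:Name' to the object it names."""
    module_name, attr_name = entry_point.split(':')
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Hypothetical mini-registry mapping ids to entry-point strings.
registry = {'ordered-v0': 'collections:OrderedDict'}


def make(env_id):
    """Look up the id, import the class it points to, and instantiate it."""
    if env_id not in registry:
        raise KeyError(f"no environment registered under id {env_id!r}")
    return load_entry_point(registry[env_id])()


obj = make('ordered-v0')
print(type(obj).__name__)  # OrderedDict
```

In this model, an id that was never registered fails the lookup, which is why the id passed to make has to match the registered one exactly.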
    
  • Original article: https://www.cnblogs.com/tolshao/p/gym-da-jian-rl-huan-jing.html