  • Installing NVIDIA Isaac Gym, NVIDIA's GPU-based robot simulation environment for reinforcement learning training (continued, part 2)

    Continuing from the previous post:

    Installing NVIDIA Isaac Gym, NVIDIA's GPU-based robot simulation environment for reinforcement learning training

     
    This post gives example commands for running the PyTorch PPO examples that ship with NVIDIA Isaac Gym.

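    The examples below train Isaac Gym environments with the reinforcement learning code in the rlgpu directory.

    They all use this file: /home/devil/isaacgym/python/rlgpu/train.py

    Use the help option to list the command-line arguments of the reinforcement learning script provided by NVIDIA: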

    python train.py -h

    RL Policy
    
    optional arguments:
      -h, --help            show this help message and exit
      --sim_device SIM_DEVICE
                            Physics Device in PyTorch-like syntax
      --pipeline PIPELINE   Tensor API pipeline (cpu/gpu)
      --graphics_device_id GRAPHICS_DEVICE_ID
                            Graphics Device ID
      --flex                Use FleX for physics
      --physx               Use PhysX for physics
      --num_threads NUM_THREADS
                            Number of cores used by PhysX
      --subscenes SUBSCENES
                            Number of PhysX subscenes to simulate in parallel
      --slices SLICES       Number of client threads that process env slices
      --test                Run trained policy, no training
      --play                Run trained policy, the same as test, can be used only
                            by rl_games RL library
      --resume RESUME       Resume training or start testing from a checkpoint
      --checkpoint CHECKPOINT
                            Path to the saved weights, only for rl_games RL
                            library
      --headless            Force display off at all times
      --horovod             Use horovod for multi-gpu training, have effect only
                            with rl_games RL library
      --task TASK           Can be BallBalance, Cartpole, CartpoleYUp, Ant,
                            Humanoid, Anymal, FrankaCabinet, Quadcopter,
                            ShadowHand, Ingenuity
      --task_type TASK_TYPE
                            Choose Python or C++
      --rl_device RL_DEVICE
                            Choose CPU or GPU device for inferencing policy
                            network
      --logdir LOGDIR
      --experiment EXPERIMENT
                            Experiment name. If used with --metadata flag an
                            additional information about physics engine, sim
                            device, pipeline and domain randomization will be
                            added to the name
      --metadata            Requires --experiment flag, adds physics engine, sim
                            device, pipeline info and if domain randomization is
                            used to the experiment name provided by user
      --cfg_train CFG_TRAIN
      --cfg_env CFG_ENV
      --num_envs NUM_ENVS   Number of environments to create - override config
                            file
      --episode_length EPISODE_LENGTH
                            Episode length, by default is read from yaml config
      --seed SEED           Random seed
      --max_iterations MAX_ITERATIONS
                            Set a maximum number of training iterations
      --steps_num STEPS_NUM
                            Set number of simulation steps per 1 PPO iteration.
                            Supported only by rl_games. If not -1 overrides the
                            config settings.
      --minibatch_size MINIBATCH_SIZE
                            Set batch size for PPO optimization step. Supported
                            only by rl_games. If not -1 overrides the config
                            settings.
      --randomize           Apply physics domain randomization
      --torch_deterministic
                            Apply additional PyTorch settings for more
                            deterministic behaviour
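The help text describes `--sim_device` and `--rl_device` as device strings in PyTorch-like syntax (`cpu`, `cuda:0`, `cuda:1`). The following is a minimal sketch of how such flags could be parsed and split into a device type and index. It is not the code in train.py: `parse_device` and `build_parser` are hypothetical helpers, and the defaults shown are assumptions chosen only for illustration.

```python
import argparse


def parse_device(spec):
    """Split a PyTorch-style device string ('cpu', 'cuda:1') into (type, index)."""
    kind, _, index = spec.partition(":")
    if kind not in ("cpu", "cuda"):
        raise ValueError(f"unsupported device: {spec!r}")
    # A missing index (plain 'cpu' or 'cuda') is treated as device 0 here.
    return kind, int(index) if index else 0


def build_parser():
    # Hypothetical re-creation of a few flags from the help output above.
    # The defaults are assumptions, not the values train.py actually uses.
    parser = argparse.ArgumentParser(description="RL Policy (illustrative subset)")
    parser.add_argument("--task", type=str, default="Cartpole")
    parser.add_argument("--sim_device", type=str, default="cuda:0")
    parser.add_argument("--rl_device", type=str, default="cuda:0")
    parser.add_argument("--headless", action="store_true")
    parser.add_argument("--num_threads", type=int, default=0)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--task=ShadowHand", "--headless", "--sim_device=cpu", "--rl_device=cuda:0"]
    )
    print(args.task, parse_device(args.sim_device), parse_device(args.rl_device))
```

Keeping the simulation device and the policy device as two separate flags is what makes the CPU/GPU combinations in the rest of this post possible.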

     ============================================

    Example commands:

    1.  Simulation on CPU, training on CPU

    Run the simulation environment on the CPU and train the PPO deep reinforcement learning algorithm on the CPU as well:

    python  train.py --task=ShadowHand --headless --sim_device=cpu --rl_device=cpu --physx --num_threads=24

    2.  Simulation on CPU, training on GPU

    python  train.py --task=ShadowHand --headless --sim_device=cpu --rl_device=cuda:0 --physx --num_threads=24

    3.  Simulation on GPU, training on CPU

    python  train.py --task=ShadowHand --headless --sim_device=cuda:0 --rl_device=cpu  --physx --num_threads=24

    4.  Simulation on GPU, training on GPU

    Simulate on GPU 0 and train on GPU 1:

    python  train.py --task=ShadowHand --headless --sim_device=cuda:0 --rl_device=cuda:1  --physx --num_threads=24

    Simulate on GPU 1 and train on GPU 0:

    python  train.py --task=ShadowHand --headless --sim_device=cuda:1  --rl_device=cuda:0  --physx --num_threads=24
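The five commands above differ only in the values of `--sim_device` and `--rl_device`. As a rough sketch, they can be generated from a list of device pairs instead of being typed by hand. `build_train_command` is a hypothetical helper introduced here for illustration; it uses only flags that appear in the help output and in the commands above.

```python
def build_train_command(sim_device, rl_device, task="ShadowHand",
                        num_threads=24, headless=True):
    """Return the train.py invocation as an argument list."""
    cmd = ["python", "train.py", f"--task={task}"]
    if headless:
        cmd.append("--headless")
    cmd += [
        f"--sim_device={sim_device}",
        f"--rl_device={rl_device}",
        "--physx",
        f"--num_threads={num_threads}",
    ]
    return cmd


if __name__ == "__main__":
    # (simulation device, training device) pairs from the examples above.
    placements = [
        ("cpu", "cpu"),
        ("cpu", "cuda:0"),
        ("cuda:0", "cpu"),
        ("cuda:0", "cuda:1"),
        ("cuda:1", "cuda:0"),
    ]
    for sim, rl in placements:
        print(" ".join(build_train_command(sim, rl)))
```

Each returned list can be passed to `subprocess.run(cmd, cwd=...)` with the rlgpu directory as the working directory. The two-GPU pairs assume the machine has at least two CUDA devices.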

    =============================================

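    This blog is a record of the author's personal study and is not guaranteed to be original. Some posts include the source address of reposted material, and some are compiled from several online sources. There are bound to be omissions where a source was not credited; if anything infringes your rights, please contact the author.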
  • Original URL: https://www.cnblogs.com/devilmaycry812839668/p/15218267.html