
Installing NVIDIA Isaac Gym, NVIDIA's GPU-Based Robot Simulation Environment for Reinforcement Learning Training (Continued, Part 2)

This post continues from the earlier one:

Installing NVIDIA Isaac Gym, NVIDIA's GPU-Based Robot Simulation Environment for Reinforcement Learning Training

This post lists example commands for running the PyTorch PPO examples that ship with NVIDIA Isaac Gym.

The examples below train Isaac Gym environments with the reinforcement learning code in the rlgpu directory:

The examples use the train.py file in the rlgpu directory: /home/devil/isaacgym/python/rlgpu/train.py

Use the -h flag to list the command-line arguments of the reinforcement learning training script provided by NVIDIA:

python train.py -h

RL Policy

optional arguments:
  -h, --help            show this help message and exit
  --sim_device SIM_DEVICE
                        Physics Device in PyTorch-like syntax
  --pipeline PIPELINE   Tensor API pipeline (cpu/gpu)
  --graphics_device_id GRAPHICS_DEVICE_ID
                        Graphics Device ID
  --flex                Use FleX for physics
  --physx               Use PhysX for physics
  --num_threads NUM_THREADS
                        Number of cores used by PhysX
  --subscenes SUBSCENES
                        Number of PhysX subscenes to simulate in parallel
  --slices SLICES       Number of client threads that process env slices
  --test                Run trained policy, no training
  --play                Run trained policy, the same as test, can be used only
                        by rl_games RL library
  --resume RESUME       Resume training or start testing from a checkpoint
  --checkpoint CHECKPOINT
                        Path to the saved weights, only for rl_games RL
                        library
  --headless            Force display off at all times
  --horovod             Use horovod for multi-gpu training, have effect only
                        with rl_games RL library
  --task TASK           Can be BallBalance, Cartpole, CartpoleYUp, Ant,
                        Humanoid, Anymal, FrankaCabinet, Quadcopter,
                        ShadowHand, Ingenuity
  --task_type TASK_TYPE
                        Choose Python or C++
  --rl_device RL_DEVICE
                        Choose CPU or GPU device for inferencing policy
                        network
  --logdir LOGDIR
  --experiment EXPERIMENT
                        Experiment name. If used with --metadata flag an
                        additional information about physics engine, sim
                        device, pipeline and domain randomization will be
                        added to the name
  --metadata            Requires --experiment flag, adds physics engine, sim
                        device, pipeline info and if domain randomization is
                        used to the experiment name provided by user
  --cfg_train CFG_TRAIN
  --cfg_env CFG_ENV
  --num_envs NUM_ENVS   Number of environments to create - override config
                        file
  --episode_length EPISODE_LENGTH
                        Episode length, by default is read from yaml config
  --seed SEED           Random seed
  --max_iterations MAX_ITERATIONS
                        Set a maximum number of training iterations
  --steps_num STEPS_NUM
                        Set number of simulation steps per 1 PPO iteration.
                        Supported only by rl_games. If not -1 overrides the
                        config settings.
  --minibatch_size MINIBATCH_SIZE
                        Set batch size for PPO optimization step. Supported
                        only by rl_games. If not -1 overrides the config
                        settings.
  --randomize           Apply physics domain randomization
  --torch_deterministic
                        Apply additional PyTorch settings for more
                        deterministic behaviour
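
The help text describes --sim_device and --rl_device as taking a device in PyTorch-like syntax, that is, strings such as cpu or cuda:0. As a rough illustration of what that syntax means, here is a minimal sketch that splits such a string into a device kind and an index. The names Device and parse_device are hypothetical helpers written for this post; they are not part of Isaac Gym or rlgpu, and this is not how train.py itself parses the flags.

```python
from typing import NamedTuple


class Device(NamedTuple):
    kind: str   # "cpu" or "cuda"
    index: int  # GPU ordinal; always 0 for the CPU


def parse_device(spec: str) -> Device:
    """Split a PyTorch-style device string such as "cpu" or "cuda:1".

    Hypothetical helper for illustration only, not Isaac Gym's own parser.
    """
    kind, _, index = spec.strip().lower().partition(":")
    if kind == "cpu":
        if index:
            raise ValueError(f"CPU device takes no index: {spec!r}")
        return Device("cpu", 0)
    if kind == "cuda":
        if index == "":
            # A bare "cuda" is treated here as the first GPU (an assumption).
            return Device("cuda", 0)
        if not index.isdigit():
            raise ValueError(f"GPU index must be a non-negative integer: {spec!r}")
        return Device("cuda", int(index))
    raise ValueError(f"Unknown device kind: {spec!r}")


if __name__ == "__main__":
    for text in ("cpu", "cuda:0", "cuda:1"):
        print(text, "->", parse_device(text))
```

Under this reading, cuda:0 and cuda:1 in the commands that follow name the first and second GPU.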

============================================

Example commands:

1. Simulation on CPU, training on CPU

Run the simulation environment on the CPU and train the PPO deep reinforcement learning algorithm on the CPU as well:

python  train.py --task=ShadowHand --headless --sim_device=cpu --rl_device=cpu --physx --num_threads=24

2. Simulation on CPU, training on GPU

python  train.py --task=ShadowHand --headless --sim_device=cpu --rl_device=cuda:0 --physx --num_threads=24

3. Simulation on GPU, training on CPU

python  train.py --task=ShadowHand --headless --sim_device=cuda:0 --rl_device=cpu  --physx --num_threads=24

4. Simulation on GPU, training on GPU

Simulation on GPU 0, training on GPU 1:

python  train.py --task=ShadowHand --headless --sim_device=cuda:0 --rl_device=cuda:1  --physx --num_threads=24

Simulation on GPU 1, training on GPU 0:

python  train.py --task=ShadowHand --headless --sim_device=cuda:1  --rl_device=cuda:0  --physx --num_threads=24
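
The commands above differ only in the values of --sim_device and --rl_device. As a convenience, the following minimal sketch builds the same command lines from those two values. The function name build_command and its defaults are made up for this post; it only assembles flags that appear in the help output above and does not launch anything itself.

```python
import shlex
from itertools import product


def build_command(sim_device, rl_device, task="ShadowHand", num_threads=24):
    """Return the train.py argument list for one simulation/training device pair.

    Hypothetical helper: it only formats the flags used in this post.
    """
    return [
        "python", "train.py",
        f"--task={task}",
        "--headless",
        f"--sim_device={sim_device}",
        f"--rl_device={rl_device}",
        "--physx",
        f"--num_threads={num_threads}",
    ]


if __name__ == "__main__":
    # Print the CPU/GPU combinations for a machine with at least one GPU.
    for sim, rl in product(["cpu", "cuda:0"], repeat=2):
        print(shlex.join(build_command(sim, rl)))
```

The returned list can be passed to subprocess.run from inside the rlgpu directory if you want to launch the runs from a script; the two-GPU cases are obtained by passing cuda:0 and cuda:1 as the two arguments.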

=============================================



Original article: https://www.cnblogs.com/devilmaycry812839668/p/15218267.html
