  • Kubernetes GPU

    https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#deploying-nvidia-gpu-device-plugin

    1. Install nvidia-docker (Ubuntu 14.04)

    https://github.com/NVIDIA/nvidia-docker

    Remove the old version

    docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
    sudo apt-get purge -y nvidia-docker


    # Add the package repositories
    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
    distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    sudo apt-get update

    # Install nvidia-docker2 and reload the Docker daemon configuration
    sudo apt-get install -y nvidia-docker2
    sudo pkill -SIGHUP dockerd
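
To see what the `distribution` variable in the commands above expands to on a given node, here is a small sketch. `repo_list_url` is a hypothetical helper name used only for illustration and is not part of nvidia-docker. It assumes the os-release file defines `ID` and `VERSION_ID`, as it does on Ubuntu 14.04 (`ID=ubuntu`, `VERSION_ID="14.04"`).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of nvidia-docker): print the apt list URL that
# the install commands would fetch for a given os-release file.
repo_list_url() {
    local os_release="${1:-/etc/os-release}"
    local distribution
    # The command substitution runs in a subshell, so ID and VERSION_ID
    # from the sourced file do not leak into the calling shell.
    distribution=$(. "$os_release"; echo "$ID$VERSION_ID")
    echo "https://nvidia.github.io/nvidia-docker/${distribution}/nvidia-docker.list"
}

# Print the URL for this machine when an os-release file is present.
if [ -r /etc/os-release ]; then
    repo_list_url /etc/os-release
fi
```

If the printed URL does not match a distribution listed in the nvidia-docker repository, the `curl` step would most likely write an error page into the sources list, so it is worth checking before running `apt-get update`.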

    2. Configure the Docker runtime

    First you will need to check and/or enable the nvidia runtime as your default runtime on your node. We will be editing the docker daemon config file which is usually present at /etc/docker/daemon.json:

    {
        "default-runtime": "nvidia",
        "runtimes": {
            "nvidia": {
                "path": "/usr/bin/nvidia-container-runtime",
                "runtimeArgs": []
            }
        }
    }
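
A syntax error in daemon.json can keep the Docker daemon from starting, so it may be worth parsing the file before restarting. The sketch below assumes `python3` is available on the node; `default_runtime` is a hypothetical helper name, not a Docker command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: parse a Docker daemon config and print its default runtime.
# Returns non-zero if the file is missing or is not valid JSON.
# Assumption: python3 is installed on the node.
default_runtime() {
    local config="${1:-/etc/docker/daemon.json}"
    python3 -c '
import json, sys
with open(sys.argv[1]) as f:
    cfg = json.load(f)
# Docker falls back to runc when no default-runtime is set.
print(cfg.get("default-runtime", "runc"))
' "$config"
}

# Check the live config only if it exists on this machine.
if [ -r /etc/docker/daemon.json ]; then
    default_runtime /etc/docker/daemon.json
fi
```

With the configuration shown above, the helper should print `nvidia`. Once the daemon is running with the new config, `docker info | grep -i runtime` should also list `nvidia` as the default runtime.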


    Restart Docker, then verify that a container can see the GPUs:

    root@ogs-gpu02:/etc/ssl/certs# docker run --runtime=nvidia --rm registry.bst-1.cns.bstjpc.com:5000/nvidia/cuda nvidia-smi
    Fri Mar 23 05:30:37 2018       
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 384.111                Driver Version: 384.111                   |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  Tesla K20m          Off  | 00000000:04:00.0 Off |                    0 |
    | N/A   27C    P0    48W / 225W |      0MiB /  4742MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+
    |   1  Tesla K20m          Off  | 00000000:43:00.0 Off |                    0 |
    | N/A   27C    P0    48W / 225W |      0MiB /  4742MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+
    |   2  Tesla K20m          Off  | 00000000:84:00.0 Off |                    0 |
    | N/A   31C    P0    47W / 225W |      0MiB /  4742MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+
    |   3  Tesla K20m          Off  | 00000000:C4:00.0 Off |                    0 |
    | N/A   30C    P0    48W / 225W |      0MiB /  4742MiB |     43%      Default |
    +-------------------------------+----------------------+----------------------+
                                                                                   
    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+

    Add --feature-gates="DevicePlugins=true" to the kubelet startup flags.

    Deploy the nvidia-device-plugin through Kubernetes.
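
Once the device plugin is running, pods can request GPUs through the `nvidia.com/gpu` resource. The following is a minimal sketch of a smoke test, not a tested deployment. The pod name `gpu-smoke-test`, the helper `write_gpu_test_pod`, and the `nvidia/cuda` image reference are assumptions for illustration; substitute an image your nodes can actually pull.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a one-shot pod manifest that requests one GPU
# through the nvidia.com/gpu resource exposed by the device plugin.
write_gpu_test_pod() {
    local manifest="$1"
    cat > "$manifest" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test        # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda      # assumption: any CUDA image your nodes can pull
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1   # GPUs are requested under limits
EOF
}

manifest="${TMPDIR:-/tmp}/gpu-smoke-test.yaml"
write_gpu_test_pod "$manifest"
echo "wrote $manifest"

# Next steps on a machine with cluster access (not run by this script):
#   kubectl create -f <device plugin DaemonSet manifest from NVIDIA/k8s-device-plugin>
#   kubectl describe nodes | grep nvidia.com/gpu   # capacity should be non-zero
#   kubectl create -f "$manifest"
#   kubectl logs gpu-smoke-test                    # expect the nvidia-smi table
```

The script only writes the manifest; the `kubectl` commands are left as comments because they depend on your cluster and on the device plugin release that matches your Kubernetes version.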

    //////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

    Alternatively, to use the GPU support built into Kubernetes, start the kubelet with --feature-gates="Accelerators=true".

  • Original post: https://www.cnblogs.com/mhc-fly/p/8608382.html