  • Google Colab V100 + TensorFlow 1.15.2 Performance Test

    To compare against the NVIDIA A100 in DiDi Cloud's closed beta, I ran the TensorFlow benchmarks on a Google Colab V100. The results are recorded here.

    Environment

    Platform: Google Colab

    OS: Ubuntu 18.04

    GPU: V100-SXM2-16GB

    Python version: 3.6

    TensorFlow version: 1.15.2

    GPU details:
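
    The environment values listed above can be collected from a notebook cell. The following is a minimal sketch, not the author's method: the helper name collect_environment is hypothetical, and it assumes nvidia-smi is on the PATH whenever an NVIDIA driver is installed.

```python
import platform
import shutil
import subprocess


def collect_environment():
    """Return a dict describing the Python, TensorFlow, and GPU setup."""
    info = {"python": platform.python_version()}

    # TensorFlow may be missing or broken where this runs, so import it lazily.
    try:
        import tensorflow as tf
        info["tensorflow"] = tf.__version__
    except Exception:
        info["tensorflow"] = "not available"

    # nvidia-smi ships with the NVIDIA driver; skip the query if it is absent.
    if shutil.which("nvidia-smi"):
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total",
             "--format=csv,noheader"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        info["gpu"] = result.stdout.strip()
    else:
        info["gpu"] = "nvidia-smi not found"
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```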

     

    Test method

     

    The tests use the TensorFlow benchmarks repository:

    https://github.com/tensorflow/benchmarks
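
    Every run in this post calls tf_cnn_benchmarks.py with the same four flags (--num_gpus, --batch_size, --model, and optionally --use_fp16). As a rough sketch, the command lines could be generated from a table of runs. The RUNS list and build_command helper are hypothetical names introduced here for illustration, and the script is assumed to be in the current directory.

```python
import shlex

# (model, batch_size, use_fp16) for each run reported in this post.
RUNS = [
    ("resnet50_v1.5", 64, False),
    ("resnet50", 64, False),
    ("resnet50", 64, True),
    ("alexnet", 512, False),
    ("inception3", 64, False),
    ("vgg16", 64, False),
    ("googlenet", 128, False),
    ("resnet152", 32, False),
]


def build_command(model, batch_size, use_fp16=False, num_gpus=1):
    """Build the argument list for one tf_cnn_benchmarks.py run."""
    args = [
        "python",
        "tf_cnn_benchmarks.py",
        f"--num_gpus={num_gpus}",
        f"--batch_size={batch_size}",
        f"--model={model}",
    ]
    if use_fp16:
        args.append("--use_fp16")
    return args


if __name__ == "__main__":
    # Print the commands; in Colab each one would be prefixed with "!".
    for model, batch_size, use_fp16 in RUNS:
        command = build_command(model, batch_size, use_fp16)
        print(" ".join(shlex.quote(arg) for arg in command))
```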

    ResNet50_v1.5 BS64

    !python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=resnet50_v1.5
    Step	Img/sec	total_loss
    1 images/sec: 349.6 +/- 0.0 (jitter = 0.0) 7.848
    10 images/sec: 349.9 +/- 0.2 (jitter = 0.4) 8.053
    20 images/sec: 349.9 +/- 0.1 (jitter = 0.6) 8.103
    30 images/sec: 350.2 +/- 0.1 (jitter = 0.6) 8.118
    40 images/sec: 350.2 +/- 0.1 (jitter = 0.8) 7.894
    50 images/sec: 350.3 +/- 0.1 (jitter = 0.8) 7.918
    60 images/sec: 350.1 +/- 0.1 (jitter = 0.7) 8.103
    70 images/sec: 350.0 +/- 0.1 (jitter = 0.8) 7.986
    80 images/sec: 350.0 +/- 0.1 (jitter = 0.8) 7.808
    90 images/sec: 350.0 +/- 0.1 (jitter = 0.8) 7.972
    100 images/sec: 350.0 +/- 0.1 (jitter = 0.9) 7.649
    ----------------------------------------------------------------
    total images/sec: 349.78
    ----------------------------------------------------------------
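
    The log format above is regular enough to parse. The following sketch, with the hypothetical helper parse_benchmark_log, extracts the per-step rows and the final total. It assumes the exact line layout shown in this post, and the field name plus_minus is simply a label for the value printed after "+/-".

```python
import re

STEP_RE = re.compile(
    r"^\s*(\d+)\s+images/sec:\s+([\d.]+)\s+\+/-\s+([\d.]+)"
    r"\s+\(jitter = ([\d.]+)\)\s+(\S+)"
)
TOTAL_RE = re.compile(r"total images/sec:\s+([\d.]+)")

# A few lines copied from the ResNet50_v1.5 run above.
SAMPLE = """\
1 images/sec: 349.6 +/- 0.0 (jitter = 0.0) 7.848
10 images/sec: 349.9 +/- 0.2 (jitter = 0.4) 8.053
100 images/sec: 350.0 +/- 0.1 (jitter = 0.9) 7.649
----------------------------------------------------------------
total images/sec: 349.78
"""


def parse_benchmark_log(text):
    """Return (list of per-step dicts, total images/sec or None)."""
    steps = []
    total = None
    for line in text.splitlines():
        match = STEP_RE.match(line)
        if match:
            steps.append({
                "step": int(match.group(1)),
                "images_per_sec": float(match.group(2)),
                "plus_minus": float(match.group(3)),
                "jitter": float(match.group(4)),
                "total_loss": float(match.group(5)),  # float("nan") is valid
            })
            continue
        match = TOTAL_RE.search(line)
        if match:
            total = float(match.group(1))
    return steps, total


if __name__ == "__main__":
    parsed_steps, parsed_total = parse_benchmark_log(SAMPLE)
    print(len(parsed_steps), "step rows, total images/sec:", parsed_total)
```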

    ResNet50 BS64

    !python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=resnet50
    Step	Img/sec	total_loss
    1 images/sec: 386.2 +/- 0.0 (jitter = 0.0) 8.220
    10 images/sec: 384.8 +/- 0.4 (jitter = 0.7) 7.880
    20 images/sec: 385.5 +/- 0.5 (jitter = 2.2) 7.910
    30 images/sec: 385.7 +/- 0.4 (jitter = 2.6) 7.821
    40 images/sec: 386.0 +/- 0.4 (jitter = 2.3) 8.004
    50 images/sec: 386.2 +/- 0.3 (jitter = 2.4) 7.768
    60 images/sec: 386.3 +/- 0.3 (jitter = 2.4) 8.118
    70 images/sec: 386.1 +/- 0.3 (jitter = 2.5) 7.816
    80 images/sec: 386.3 +/- 0.2 (jitter = 2.4) 7.977
    90 images/sec: 386.2 +/- 0.2 (jitter = 2.5) 8.098
    100 images/sec: 386.3 +/- 0.2 (jitter = 2.4) 8.045
    ----------------------------------------------------------------
    total images/sec: 386.06
    ----------------------------------------------------------------

    ResNet50 BS64 with --use_fp16

    !python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=resnet50 --use_fp16
    Step	Img/sec	total_loss
    1 images/sec: 911.0 +/- 0.0 (jitter = 0.0) 8.103
    10 images/sec: 918.1 +/- 1.2 (jitter = 3.1) 7.756
    20 images/sec: 914.3 +/- 2.3 (jitter = 4.3) 7.915
    30 images/sec: 914.2 +/- 2.2 (jitter = 4.2) 7.769
    40 images/sec: 912.8 +/- 1.7 (jitter = 6.5) 7.915
    50 images/sec: 911.7 +/- 1.5 (jitter = 7.3) 7.888
    60 images/sec: 912.9 +/- 1.3 (jitter = 7.0) 7.707
    70 images/sec: 911.8 +/- 1.2 (jitter = 7.6) 8.011
    80 images/sec: 912.3 +/- 1.1 (jitter = 7.3) 7.779
    90 images/sec: 912.9 +/- 1.0 (jitter = 6.9) 7.805
    100 images/sec: 913.1 +/- 0.9 (jitter = 6.8) 8.034
    ----------------------------------------------------------------
    total images/sec: 912.08
    ----------------------------------------------------------------
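
    The effect of --use_fp16 on ResNet50 can be read off the two totals above (386.06 without the flag, 912.08 with it). A short sketch of the arithmetic, with the constant and function names chosen here for illustration:

```python
# Totals copied from the two resnet50 runs at batch size 64.
BASELINE_IMAGES_PER_SEC = 386.06  # without --use_fp16
FP16_IMAGES_PER_SEC = 912.08      # with --use_fp16


def speedup(new, baseline):
    """Ratio of the new throughput to the baseline throughput."""
    return new / baseline


if __name__ == "__main__":
    ratio = speedup(FP16_IMAGES_PER_SEC, BASELINE_IMAGES_PER_SEC)
    print(f"fp16 speedup: {ratio:.2f}x")  # prints "fp16 speedup: 2.36x"
```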

    AlexNet BS512

    !python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=512 --model=alexnet
    Step	Img/sec	total_loss
    1 images/sec: 4824.0 +/- 0.0 (jitter = 0.0) nan
    10 images/sec: 4804.0 +/- 5.9 (jitter = 23.3) nan
    20 images/sec: 4802.3 +/- 4.3 (jitter = 24.4) nan
    30 images/sec: 4801.7 +/- 4.4 (jitter = 24.0) nan
    40 images/sec: 4804.5 +/- 3.9 (jitter = 23.0) nan
    50 images/sec: 4805.4 +/- 4.0 (jitter = 24.4) nan
    60 images/sec: 4806.7 +/- 3.5 (jitter = 24.8) nan
    70 images/sec: 4810.1 +/- 3.4 (jitter = 24.4) nan
    80 images/sec: 4810.0 +/- 3.1 (jitter = 25.7) nan
    90 images/sec: 4810.9 +/- 2.8 (jitter = 23.4) nan
    100 images/sec: 4811.5 +/- 2.7 (jitter = 23.4) nan
    ----------------------------------------------------------------
    total images/sec: 4808.18
    ----------------------------------------------------------------
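
    Every step of the AlexNet run above reports a total_loss of nan, so only the throughput figures of that run carry information. A check like the following sketch can flag such rows in a log. The helper nan_loss_steps is a hypothetical name, and it assumes the loss is the last whitespace-separated field on each step line, as in the output shown in this post.

```python
import math


def nan_loss_steps(log_text):
    """Return the step numbers whose total_loss column is NaN."""
    flagged = []
    for line in log_text.splitlines():
        parts = line.split()
        # Step rows look like: "<step> images/sec: ... <total_loss>".
        if len(parts) < 3 or not parts[0].isdigit() or parts[1] != "images/sec:":
            continue
        try:
            loss = float(parts[-1])
        except ValueError:
            continue
        if math.isnan(loss):
            flagged.append(int(parts[0]))
    return flagged


if __name__ == "__main__":
    sample = (
        "1 images/sec: 4824.0 +/- 0.0 (jitter = 0.0) nan\n"
        "10 images/sec: 4804.0 +/- 5.9 (jitter = 23.3) nan\n"
        "total images/sec: 4808.18\n"
    )
    print("steps with NaN loss:", nan_loss_steps(sample))
```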

    Inception v3 BS64

    !python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=inception3
    Step	Img/sec	total_loss
    1 images/sec: 255.3 +/- 0.0 (jitter = 0.0) 7.277
    10 images/sec: 254.3 +/- 0.5 (jitter = 2.2) 7.304
    20 images/sec: 254.4 +/- 0.3 (jitter = 2.4) 7.292
    30 images/sec: 254.3 +/- 0.3 (jitter = 2.3) 7.402
    40 images/sec: 254.2 +/- 0.3 (jitter = 2.3) 7.314
    50 images/sec: 254.3 +/- 0.2 (jitter = 2.3) 7.283
    60 images/sec: 254.3 +/- 0.2 (jitter = 2.2) 7.363
    70 images/sec: 254.3 +/- 0.2 (jitter = 2.1) 7.350
    80 images/sec: 254.3 +/- 0.2 (jitter = 2.2) 7.384
    90 images/sec: 254.3 +/- 0.2 (jitter = 1.9) 7.318
    100 images/sec: 254.3 +/- 0.1 (jitter = 1.9) 7.376
    ----------------------------------------------------------------
    total images/sec: 254.19
    ----------------------------------------------------------------

    VGG16 BS64

    !python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=vgg16
    Step	Img/sec	total_loss
    1 images/sec: 250.0 +/- 0.0 (jitter = 0.0) 7.319
    10 images/sec: 250.2 +/- 0.2 (jitter = 0.2) 7.297
    20 images/sec: 250.4 +/- 0.1 (jitter = 0.5) 7.284
    30 images/sec: 250.4 +/- 0.1 (jitter = 0.6) 7.274
    40 images/sec: 250.4 +/- 0.1 (jitter = 0.6) 7.288
    50 images/sec: 250.4 +/- 0.1 (jitter = 0.6) 7.278
    60 images/sec: 250.3 +/- 0.1 (jitter = 0.6) 7.278
    70 images/sec: 250.3 +/- 0.1 (jitter = 0.6) 7.266
    80 images/sec: 250.3 +/- 0.1 (jitter = 0.6) 7.288
    90 images/sec: 250.2 +/- 0.1 (jitter = 0.6) 7.269
    100 images/sec: 250.3 +/- 0.1 (jitter = 0.6) 7.270
    ----------------------------------------------------------------
    total images/sec: 250.19
    ----------------------------------------------------------------

    GoogLeNet BS128

    !python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=128 --model=googlenet
    Step	Img/sec	total_loss
    1 images/sec: 1034.6 +/- 0.0 (jitter = 0.0) 7.105
    10 images/sec: 1034.2 +/- 0.9 (jitter = 1.8) 7.105
    20 images/sec: 1030.9 +/- 1.8 (jitter = 2.9) 7.094
    30 images/sec: 1031.0 +/- 1.3 (jitter = 4.2) 7.086
    40 images/sec: 1031.6 +/- 1.0 (jitter = 3.9) 7.067
    50 images/sec: 1030.6 +/- 0.9 (jitter = 5.4) 7.093
    60 images/sec: 1030.4 +/- 0.8 (jitter = 5.4) 7.050
    70 images/sec: 1030.6 +/- 0.8 (jitter = 5.7) 7.073
    80 images/sec: 1030.3 +/- 0.7 (jitter = 5.9) 7.078
    90 images/sec: 1030.3 +/- 0.6 (jitter = 5.6) 7.078
    100 images/sec: 1030.0 +/- 0.6 (jitter = 5.5) 7.069
    ----------------------------------------------------------------
    total images/sec: 1029.42
    ----------------------------------------------------------------

    ResNet152 BS32

    !python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=32 --model=resnet152
    Step	Img/sec	total_loss
    1 images/sec: 137.0 +/- 0.0 (jitter = 0.0) 9.023
    10 images/sec: 138.0 +/- 0.4 (jitter = 1.4) 8.574
    20 images/sec: 138.5 +/- 0.3 (jitter = 1.6) 8.600
    30 images/sec: 138.5 +/- 0.2 (jitter = 1.6) 8.755
    40 images/sec: 138.6 +/- 0.2 (jitter = 1.6) 8.624
    50 images/sec: 138.5 +/- 0.2 (jitter = 1.6) 8.801
    60 images/sec: 138.4 +/- 0.1 (jitter = 1.7) 8.679
    70 images/sec: 138.4 +/- 0.1 (jitter = 1.8) 9.112
    80 images/sec: 138.4 +/- 0.1 (jitter = 1.7) 8.872
    90 images/sec: 138.4 +/- 0.1 (jitter = 1.7) 9.025
    100 images/sec: 138.4 +/- 0.1 (jitter = 1.7) 8.847
    ----------------------------------------------------------------
    total images/sec: 138.39
    ----------------------------------------------------------------
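
    To see all eight totals side by side, they can be gathered into one table. The sketch below copies the "total images/sec" figures from the runs above and also derives an approximate time per training step as batch_size / images_per_sec. The names RESULTS, step_time_ms, and format_summary are illustrative, and "default" simply marks runs without --use_fp16.

```python
# (model, batch_size, precision, total images/sec), copied from the runs above.
RESULTS = [
    ("resnet50_v1.5", 64, "default", 349.78),
    ("resnet50", 64, "default", 386.06),
    ("resnet50", 64, "fp16", 912.08),
    ("alexnet", 512, "default", 4808.18),
    ("inception3", 64, "default", 254.19),
    ("vgg16", 64, "default", 250.19),
    ("googlenet", 128, "default", 1029.42),
    ("resnet152", 32, "default", 138.39),
]


def step_time_ms(batch_size, images_per_sec):
    """Approximate wall time of one training step in milliseconds."""
    return batch_size / images_per_sec * 1000.0


def format_summary(results):
    """Return a text table sorted by throughput, highest first."""
    header = f"{'model':<15}{'batch':>6}{'prec':>9}{'img/s':>10}{'ms/step':>10}"
    lines = [header]
    for model, batch, prec, ips in sorted(results, key=lambda r: r[3], reverse=True):
        ms = step_time_ms(batch, ips)
        lines.append(f"{model:<15}{batch:>6}{prec:>9}{ips:>10.2f}{ms:>10.1f}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_summary(RESULTS))
```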

    Performance comparison

    A100 vs. V100 vs. 2080 Ti performance comparison:

    https://www.tonyisstark.com/383.html

  • Original article: https://www.cnblogs.com/wangpg/p/13689583.html