  • Deep Learning Host Environment Setup: Ubuntu 16.04 + GeForce GTX 1080 + TensorFlow

    This post continues from the previous one, "Deep Learning Host Environment Setup: Ubuntu 16.04 + Nvidia GTX 1080 + CUDA 8.0". Here we install TensorFlow with support for the GeForce GTX 1080.

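    1 Download and install cuDNN

    cuDNN, short for CUDA Deep Neural Network library, is a GPU acceleration library that NVIDIA designed specifically for deep neural networks. It is widely used by deep learning frameworks such as Caffe, TensorFlow, Theano, Torch, and CNTK. NVIDIA describes it as follows: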

    The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. cuDNN is part of the NVIDIA Deep Learning SDK.

    Deep learning researchers and framework developers worldwide rely on cuDNN for high-performance GPU acceleration. It allows them to focus on training neural networks and developing software applications rather than spending time on low-level GPU performance tuning. cuDNN accelerates widely used deep learning frameworks, including Caffe, TensorFlow, Theano, Torch, and CNTK. See supported frameworks for more details.

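    First download cuDNN by choosing a version from NVIDIA's official download page. As with CUDA, you must log in and may have to fill out a short survey first: https://developer.nvidia.com/rdp/cudnn-download. The version chosen here is cuDNN v5 for CUDA 8.0. Version 5.1 for CUDA 8 is also listed among the download options, but only with the notice: cuDNN 5.1 RC for CUDA 8RC will be available soon - please check back again.

    Extract the downloaded archive:

    tar -zxvf cudnn-8.0-linux-x64-v5.0-ga.tgz

    Then copy the header and the libraries into the CUDA installation directory and make them readable by all users:
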
    sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
    sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
    sudo chmod a+r /usr/local/cuda/include/cudnn.h
    sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
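
    To confirm which cuDNN version ended up under /usr/local/cuda, one option is to read the version macros from the copied header. The sketch below assumes that cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain integer macros; the function name parse_cudnn_version is made up for this illustration and is not part of cuDNN.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) read from the text of cudnn.h.

    Returns None if any of the three macros is missing.
    """
    values = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        # Match lines such as "#define CUDNN_MAJOR      5"
        match = re.search(r"^\s*#define\s+" + macro + r"\s+(\d+)",
                          header_text, re.MULTILINE)
        if match is None:
            return None
        values.append(int(match.group(1)))
    return tuple(values)


if __name__ == "__main__":
    # Destination used by the cp commands above
    header_path = "/usr/local/cuda/include/cudnn.h"
    try:
        with open(header_path) as header:
            print(parse_cudnn_version(header.read()))
    except IOError:
        print("cudnn.h not found at " + header_path)
```

    If the function returns None, the header layout differs from what this sketch assumes and the file should be inspected by hand.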

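    2 Build and install the TensorFlow GPU version from source

    The CPU version of TensorFlow is easy to install: on Ubuntu, pip is enough (see the official TensorFlow installation documentation for details). Here we build TensorFlow 0.9 from source so that it supports the GPU in question, the GTX 1080.

    1) Prepare the Python environment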

    sudo apt-get install python-pip
    sudo apt-get install python-numpy swig python-dev python-wheel




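    2) Install Bazel

    TensorFlow is built with Bazel. Download the latest Linux release from the Bazel GitHub releases page (0.3.0 at the time of writing):

    wget https://github.com/bazelbuild/bazel/releases/download/0.3.0/bazel-0.3.0-installer-linux-x86_64.sh

    Make the installer executable and run it:

    chmod +x bazel-0.3.0-installer-linux-x86_64.sh
    ./bazel-0.3.0-installer-linux-x86_64.sh --user

    The first run stops with:

    Java not found, please install the corresponding package
    See http://bazel.io/docs/install.html for more information on

    The cause is a missing Java environment. Bazel requires Java JDK 8, which on Ubuntu 16.04 can be installed directly with apt-get: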

    sudo apt-get update
    sudo apt-get install default-jre
    sudo apt-get install default-jdk


    Then rerun the installer, which now succeeds:

    ./bazel-0.3.0-installer-linux-x86_64.sh --user

    Bazel installer

    # Release 0.3.0 (2016-06-10)

    Baseline: a9301fa

    Cherry picks:
    + ff30a73: Turn --legacy_external_runfiles back on by default
    + aeee3b8: Fix delete[] warning on fsevents.cc

    Incompatible changes:

    - The --cwarn command line option is not supported anymore. Use
    --copt instead.

    New features:

    - On OSX, --watchfs now uses FsEvents to be notified of changes
    from the filesystem (previously, this flag had no effect on OS X).
    - add support for the '-=', '*=', '/=', and'%=' operators to
    skylark. Notably, we do not support '|=' because the semantics
    of skylark sets are sufficiently different from python sets.

    Important changes:

    - Use singular form when appropriate in blaze's test result summary
    - Added supported for Android NDK revision 11
    - --objc_generate_debug_symbols is now deprecated.
    - swift_library now generates an Objective-C header for its @objc
    - new_objc_provider can now set the USES_SWIFT flag.
    - objc_framework now supports dynamic frameworks.
    - Symlinks in zip files are now unzipped correctly by http_archive,
    download_and_extract, etc.
    - swift_library is now able to import framework rules such as
    - Adds "jre_deps" attribute to j2objc_library.
    - Release apple_binary rule, for creating multi-architecture
    ("fat") objc/cc binaries and libraries, targeting ios platforms.
    - Aspects documentation added.
    - The --ues_isystem_for_includes command line option is not
    supported anymore.
    - global function 'provider' is removed from .bzl files. Providers
    can only be accessed through fields in a 'target' object.

    ## Build informations
    - [Build log](http://ci.bazel.io/job/Bazel/JAVA_VERSION=1.8,PLATFORM_NAME=linux-x86_64/595/)
    - [Commit](https://github.com/bazelbuild/bazel/commit/e671d29)
    Uncompressing......Extracting Bazel installation...

    Bazel is now installed!

    Make sure you have "/home/textminer/bin" in your path. You can also activate bash
    completion by adding the following line to your ~/.bashrc:
    source /home/textminer/.bazel/bin/bazel-complete.bash

    See http://bazel.io/docs/getting-started.html to start a new project!

    Then append the following to ~/.bashrc:

    source /home/textminer/.bazel/bin/bazel-complete.bash
    export PATH=$PATH:/home/textminer/.bazel/bin


    Bazel comes with a bash completion script. To install it:

    Build it with Bazel: bazel build //scripts:bazel-complete.bash.
    Copy the script bazel-bin/scripts/bazel-complete.bash to your completion folder (/etc/bash_completion.d directory under Ubuntu). If you don't have a completion folder, you can copy it wherever suits you and simply insert source /path/to/bazel-complete.bash in your ~/.bashrc file (under OS X, put it in your ~/.bash_profile file).


    source ~/.bashrc
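
    A quick way to confirm that the PATH change took effect is to look for the Bazel directory among the PATH entries. The sketch below is only an illustration: dir_on_path is a hypothetical helper, and ~/.bazel/bin is assumed to be the directory appended in ~/.bashrc above.

```python
import os


def dir_on_path(directory, path_value):
    """Return True if directory is one of the entries of a PATH-style string."""
    wanted = os.path.normpath(os.path.expanduser(directory))
    entries = [os.path.normpath(os.path.expanduser(entry))
               for entry in path_value.split(os.pathsep) if entry]
    return wanted in entries


if __name__ == "__main__":
    # Assumed install location of the --user Bazel installation
    bazel_bin = "~/.bazel/bin"
    print(dir_on_path(bazel_bin, os.environ.get("PATH", "")))
```

    If this prints False in a new shell, the export line is probably missing from ~/.bashrc or the file has not been sourced yet.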


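    3) Build and install TensorFlow:

    The prebuilt CPU-only 0.8.0 wheel can be installed with pip as shown below, but it does not use the GPU, so the remaining steps build from source instead:

    $ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl

    In the TensorFlow source directory, run the ./configure script. With Google Cloud Platform support enabled, the first attempt stops with a libcurl error: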



    Please specify the location of python. [Default is /usr/bin/python]:
    Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] y
    Google Cloud Platform support will be enabled for TensorFlow

    ERROR: It appears that the development version of libcurl is not available. Please install the libcurl3-dev package.


    Install the missing libcurl packages:

    sudo apt-get install libcurl3 libcurl3-dev



    Then rerun ./configure. Apart from the two yes/no prompts, press Enter at every other prompt to accept the default:

    Please specify the location of python. [Default is /usr/bin/python]:
    Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] y
    Google Cloud Platform support will be enabled for TensorFlow
    Do you wish to build TensorFlow with GPU support? [y/N] y
    GPU support will be enabled for TensorFlow
    Please specify which gcc nvcc should use as the host compiler. [Default is /usr/bin/gcc]:
    Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]:
    Please specify the location where CUDA toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
    Please specify the Cudnn version you want to use. [Leave empty to use system default]:
    Please specify the location where cuDNN library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
    Please specify a list of comma-separated Cuda compute capabilities you want to build with.
    You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
    Please note that each additional compute capability significantly increases your build time and binary size.
    [Default is: "3.5,5.2"]:
    Setting up Cuda include
    Setting up Cuda lib64
    Setting up Cuda bin
    Setting up Cuda nvvm
    Setting up CUPTI include
    Setting up CUPTI lib64
    Configuration finished
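
    One detail in the prompts above is worth a second look. The default compute capability list is "3.5,5.2", while the device log further down in this post reports the GTX 1080 as major: 6 minor: 1, that is 6.1. This post accepts the default and the example still runs on the GPU. Entering 6.1 at the prompt would presumably be an alternative, but it is not tried here. The sketch below only shows how to compare a device against such a list; parse_capabilities and is_listed are hypothetical helpers, not part of TensorFlow.

```python
def parse_capabilities(text):
    """Parse a comma-separated list such as "3.5,5.2" into (major, minor) tuples."""
    capabilities = []
    for item in text.split(","):
        item = item.strip()
        if item:
            major, minor = item.split(".")
            capabilities.append((int(major), int(minor)))
    return capabilities


def is_listed(device_capability, configured):
    """Return True if the device's (major, minor) pair is in the configured list."""
    return device_capability in parse_capabilities(configured)


if __name__ == "__main__":
    gtx_1080 = (6, 1)  # from "major: 6 minor: 1" in the device log
    print(is_listed(gtx_1080, "3.5,5.2"))
    print(is_listed(gtx_1080, "3.5,5.2,6.1"))
```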


    Next, build the example trainer with GPU support:

    bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer

    During this step, Bazel clones and builds Google protobuf and boringssl via git:

    INFO: Cloning https://github.com/google/protobuf: Receiving objects
    INFO: Cloning https://github.com/google/boringssl.git: Receiving objects


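    The first attempt fails because zlib is missing:

    configure: error: zlib not installed
    Target //tensorflow/cc:tutorials_example_trainer failed to build

    Install the zlib development package and run the build again:

    sudo apt-get install zlib1g-dev

    This time the build completes: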


    Target //tensorflow/cc:tutorials_example_trainer up-to-date:
    INFO: Elapsed time: 897.845s, Critical Path: 533.72s
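
    Three missing dependencies have come up so far: Java for the Bazel installer, libcurl for ./configure, and zlib for the build. As a small recap, the lookup below maps a fragment of each error message to the fix used in this post. It is only a sketch for this particular setup (Ubuntu 16.04), and suggest_fix is a hypothetical helper.

```python
# Error fragments seen in this walkthrough and the commands that resolved them
KNOWN_FIXES = [
    ("Java not found",
     "sudo apt-get install default-jre default-jdk"),
    ("development version of libcurl is not available",
     "sudo apt-get install libcurl3 libcurl3-dev"),
    ("zlib not installed",
     "sudo apt-get install zlib1g-dev"),
]


def suggest_fix(error_line):
    """Return the apt-get command used in this post for a known error, else None."""
    for fragment, command in KNOWN_FIXES:
        if fragment in error_line:
            return command
    return None


if __name__ == "__main__":
    print(suggest_fix("configure: error: zlib not installed"))
```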

    Run the example from the official TensorFlow documentation to check that the GTX 1080 is actually used:

    bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu

    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
    name: GeForce GTX 1080
    major: 6 minor: 1 memoryClockRate (GHz) 1.835
    pciBusID 0000:01:00.0
    Total memory: 7.92GiB
    Free memory: 7.65GiB
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
    000003/000006 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
    000006/000007 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
    000009/000006 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
    000009/000004 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
    000000/000005 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
    000000/000004 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
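
    The lines to look for in this output are the "Found device" block with the device name and its major/minor version. When the log is long, a small parser can pull them out. The sketch below assumes the log layout shown above (a "name:" line followed directly by a "major: ... minor: ..." line); parse_gpu_devices is a hypothetical helper, not a TensorFlow API.

```python
import re


def parse_gpu_devices(log_text):
    """Return a list of (name, (major, minor)) pairs found in the startup log."""
    pattern = re.compile(
        r"^\s*name:\s*(.+?)\s*\n\s*major:\s*(\d+)\s+minor:\s*(\d+)",
        re.MULTILINE)
    return [(match.group(1), (int(match.group(2)), int(match.group(3))))
            for match in pattern.finditer(log_text)]


if __name__ == "__main__":
    sample_log = (
        "I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] "
        "Found device 0 with properties:\n"
        "name: GeForce GTX 1080\n"
        "major: 6 minor: 1 memoryClockRate (GHz) 1.835\n"
        "pciBusID 0000:01:00.0\n")
    print(parse_gpu_devices(sample_log))
```

    An empty list means no "Found device" block was logged, which would suggest that the binary is not seeing the GPU.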


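    Importing TensorFlow from Python does not work yet, however:

    import tensorflow as tf

    ImportError: cannot import name pywrap_tensorflow

    The pip package still has to be built from the source tree and installed: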


    bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
    bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
    sudo pip install /tmp/tensorflow_pkg/tensorflow-0.9.0-py2-none-any.whl

    Requirement already satisfied (use --upgrade to upgrade): setuptools in /usr/lib/python2.7/dist-packages (from protobuf==3.0.0b2->tensorflow==0.9.0)
    Installing collected packages: six, funcsigs, pbr, mock, protobuf, tensorflow
    Successfully installed funcsigs-1.0.2 mock-2.0.0 pbr-1.10.0 protobuf-3.0.0b2 six-1.10.0 tensorflow-0.9.0
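
    The wheel file name itself encodes what pip will accept. As a rough sketch, assuming the standard wheel naming scheme (distribution, version, optional build tag, Python tag, ABI tag, platform tag), the hypothetical helper parse_wheel_filename below splits the two wheel names that appear in this post.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its tagged components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    # Five parts, or six when an optional build tag is present
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel file name: " + filename)
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python_tag": parts[-3],
        "abi_tag": parts[-2],
        "platform_tag": parts[-1],
    }


if __name__ == "__main__":
    print(parse_wheel_filename("tensorflow-0.9.0-py2-none-any.whl"))
    print(parse_wheel_filename("tensorflow-0.8.0-cp27-none-linux_x86_64.whl"))
```

    For the wheel built here the Python tag is py2, which matches the Python 2.7 interpreter used in the rest of this post.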


    Test again in IPython, this time with a small linear regression:

    Python 2.7.12 (default, Jul  1 2016, 15:12:24)
    Type "copyright", "credits" or "license" for more information.

    IPython 2.4.1 -- An enhanced Interactive Python.
    ?         -> Introduction and overview of IPython's features.
    %quickref -> Quick reference.
    help      -> Python's own help system.
    object?   -> Details about 'object', use 'object??' for extra details.

    In [1]: import tensorflow as tf
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally

    In [2]: import numpy as np

    In [3]: x_data = np.random.rand(100).astype(np.float32)

    In [4]: y_data = x_data * 0.1 + 0.3

    In [5]: W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))

    In [6]: b = tf.Variable(tf.zeros([1]))

    In [7]: y = W * x_data + b

    In [8]: loss = tf.reduce_mean(tf.square(y - y_data))

    In [9]: optimizer = tf.train.GradientDescentOptimizer(0.5)

    In [10]: train = optimizer.minimize(loss)

    In [11]: init = tf.initialize_all_variables()

    In [12]: sess = tf.Session()
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties: 
    name: GeForce GTX 1080
    major: 6 minor: 1 memoryClockRate (GHz) 1.835
    pciBusID 0000:01:00.0
    Total memory: 7.92GiB
    Free memory: 7.65GiB
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y 
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)

    In [13]: sess.run(init)

    In [14]: for step in range(201):
       ....:     sess.run(train)
       ....:     if step % 20 == 0:
       ....:         print(step, sess.run(W), sess.run(b))
    (0, array([-0.10331395], dtype=float32), array([ 0.62236434], dtype=float32))
    (20, array([ 0.03067014], dtype=float32), array([ 0.3403711], dtype=float32))
    (40, array([ 0.08353967], dtype=float32), array([ 0.30958495], dtype=float32))
    (60, array([ 0.09609199], dtype=float32), array([ 0.30227566], dtype=float32))
    (80, array([ 0.09907217], dtype=float32), array([ 0.3005403], dtype=float32))
    (100, array([ 0.09977971], dtype=float32), array([ 0.30012828], dtype=float32))
    (120, array([ 0.0999477], dtype=float32), array([ 0.30003047], dtype=float32))
    (140, array([ 0.0999876], dtype=float32), array([ 0.30000722], dtype=float32))
    (160, array([ 0.09999706], dtype=float32), array([ 0.30000171], dtype=float32))
    (180, array([ 0.09999929], dtype=float32), array([ 0.30000043], dtype=float32))
    (200, array([ 0.09999985], dtype=float32), array([ 0.3000001], dtype=float32))
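
    The session above fits W and b so that W * x + b matches y = 0.1 * x + 0.3, and the printed values converge to roughly 0.1 and 0.3. To make clear what the optimizer is doing, here is the same gradient descent written out in plain Python without TensorFlow. It is a sketch rather than a reproduction: it starts from W = 0 instead of a random value, uses Python floats instead of float32, and fit_line is a made-up name.

```python
import random


def fit_line(xs, ys, learning_rate=0.5, steps=201):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = float(len(xs))
    for _ in range(steps):
        # Prediction error for every sample
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of mean((w * x + b - y) ** 2) with respect to w and b
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


if __name__ == "__main__":
    random.seed(0)
    xs = [random.random() for _ in range(100)]
    ys = [0.1 * x + 0.3 for x in xs]
    print(fit_line(xs, ys))
```

    The learning rate of 0.5 and the 201 steps mirror the values used in the IPython session.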

    Everything finally works, and from here on you can enjoy the GPU build of TensorFlow on the GTX 1080.

