  • Setting up a TensorFlow GPU environment: Win10 + Python + Anaconda + CUDA

    References: http://blog.csdn.net/sb19931201/article/details/53648615

    https://segmentfault.com/a/1190000009803319

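    The Python TensorFlow package comes in a CPU build and a GPU build. NVIDIA GPUs are well suited to training machine learning models.

    Installing Python and TensorFlow is fairly simple; see the links above. Both are managed mainly through Anaconda.

    To use an NVIDIA GPU, you need to install CUDA and cuDNN.

    Two things to check:

    1. Whether the graphics card supports GPU acceleration

    2. The software versions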

    The combination used here: Windows 10 -- Python 3.5 -- tensorflow-gpu 1.4.0 -- CUDA cuda_8.0.61_win10 -- cuDNN cudnn-8.0-windows10-x64-v6.0
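
    Mismatched versions can keep the GPU build from loading, so it may help to compare what is installed against the combination above. The sketch below is only an illustration: `EXPECTED` restates the versions listed in this post (it is not an official compatibility table), and `find_mismatches`, `version_matches` and the `installed` dict are hypothetical names you would fill in by hand or from your own tooling.

```python
# Versions reported to work together in this post; not an official compatibility matrix.
EXPECTED = {
    "python": "3.5",
    "tensorflow-gpu": "1.4.0",
    "cuda": "8.0",
    "cudnn": "6.0",
}


def version_matches(installed, expected):
    """True if `installed` begins with the same dot-separated parts as `expected`."""
    want = expected.split(".")
    return installed.split(".")[:len(want)] == want


def find_mismatches(installed, expected=EXPECTED):
    """Return {component: (installed_version, expected_version)} for every mismatch."""
    problems = {}
    for name, want in expected.items():
        have = installed.get(name)
        if have is None or not version_matches(have, want):
            problems[name] = (have, want)
    return problems


# Hypothetical example: everything matches except CUDA.
installed = {
    "python": "3.5.4",
    "tensorflow-gpu": "1.4.0",
    "cuda": "9.0",
    "cudnn": "6.0.21",
}
print(find_mismatches(installed))  # {'cuda': ('9.0', '8.0')}
```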

    CUDA

    The NVIDIA® CUDA® Toolkit provides a development environment for creating high performance GPU-accelerated applications. With the CUDA Toolkit, you can develop, optimize and deploy your applications on GPU-accelerated embedded systems, desktop workstations, enterprise data centers, cloud-based platforms and HPC supercomputers. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler and a runtime library to deploy your application.

    Overview and latest-version download: https://developer.nvidia.com/cuda-toolkit

    Downloads for every CUDA version: https://developer.nvidia.com/cuda-toolkit-archive (just follow the installer prompts)

    cuDNN

    The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. cuDNN is part of the NVIDIA Deep Learning SDK.

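    cuDNN is distributed as a DLL file, which needs to be copied into the bin folder of the CUDA installation directory.

    Test code, taken from the TensorFlow website: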
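
    The copy step can also be scripted. This is a minimal sketch, not an official installer: `copy_cudnn_dlls` is a hypothetical helper, and the paths in the commented usage lines are assumptions (the unpacked cuDNN archive location is made up, and the CUDA path is only the usual default for CUDA 8.0 on Windows).

```python
import glob
import os
import shutil


def copy_cudnn_dlls(cudnn_root, cuda_root):
    """Copy every cudnn*.dll found anywhere under cudnn_root into cuda_root/bin.

    Returns the sorted list of copied file names.
    """
    bin_dir = os.path.join(cuda_root, "bin")
    if not os.path.isdir(bin_dir):
        raise FileNotFoundError("no bin folder under " + cuda_root)
    copied = []
    # '**' with recursive=True also matches files directly inside cudnn_root.
    pattern = os.path.join(cudnn_root, "**", "cudnn*.dll")
    for dll in glob.glob(pattern, recursive=True):
        shutil.copy2(dll, bin_dir)
        copied.append(os.path.basename(dll))
    return sorted(copied)


# Example usage (hypothetical paths; adjust to your machine):
# cuda_root = os.environ.get(
#     "CUDA_PATH", r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0")
# print(copy_cudnn_dlls(r"D:\Downloads\cudnn-8.0-windows10-x64-v6.0", cuda_root))
```

    Writing into the CUDA directory usually requires an elevated (administrator) prompt.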

    import tensorflow as tf
    import numpy as np
    
    # Generate phony data with NumPy: 100 points in total.
    x_data = np.float32(np.random.rand(2, 100))  # random input
    y_data = np.dot([0.100, 0.200], x_data) + 0.300
    
    # Build a linear model
    b = tf.Variable(tf.zeros([1]))
    W = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0))
    y = tf.matmul(W, x_data) + b
    
    # Minimize the mean squared error
    loss = tf.reduce_mean(tf.square(y - y_data))
    optimizer = tf.train.GradientDescentOptimizer(0.5)
    train = optimizer.minimize(loss)
    
    # Initialize the variables.
    # initialize_all_variables is deprecated (see the warning in the output);
    # tf.global_variables_initializer() is its replacement.
    init = tf.initialize_all_variables()
    
    # Launch the graph
    sess = tf.Session()
    sess.run(init)
    
    # Fit the plane
    for step in range(0, 201):
        sess.run(train)
        if step % 20 == 0:
            print(step, sess.run(W), sess.run(b))
    
    # The best fit is W: [[0.100  0.200]], b: [0.300]
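
    To see what the graph above is computing, here is a NumPy-only sketch of the same gradient-descent fit. The gradients are written out by hand rather than derived by TensorFlow, and the fixed random seed is my own addition for repeatability, so the printed numbers will not match the TensorFlow run digit for digit.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed: an assumption, for repeatability

# Same phony data as the TensorFlow example: 100 points on a plane.
x_data = rng.rand(2, 100).astype(np.float32)
y_data = np.dot([0.100, 0.200], x_data) + 0.300

W = rng.uniform(-1.0, 1.0, size=(1, 2))
b = np.zeros(1)
learning_rate = 0.5
n = x_data.shape[1]

for step in range(201):
    y = np.dot(W, x_data) + b                # predictions, shape (1, 100)
    err = y - y_data                         # residuals, shape (1, 100)
    # loss = mean(err ** 2); its gradients with respect to W and b:
    grad_W = 2.0 * np.dot(err, x_data.T) / n
    grad_b = 2.0 * err.mean()
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b
    if step % 20 == 0:
        print(step, W, b)

# W should approach [[0.1, 0.2]] and b should approach [0.3].
```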

    Output:

    The log shows that the graphics card's compute capability is 6.1.

    D:\Tools\Anaconda35\python.exe D:/PythonProj/tensorFlow/tensor8.py
    WARNING:tensorflow:From D:\Tools\Anaconda35\lib\site-packages\tensorflow\python\util\tf_should_use.py:107: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
    Instructions for updating:
    Use `tf.global_variables_initializer` instead.
    2017-11-19 17:08:40.225423: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
    2017-11-19 17:08:40.882335: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Found device 0 with properties: 
    name: GeForce GTX 1060 3GB major: 6 minor: 1 memoryClockRate(GHz): 1.7085
    pciBusID: 0000:01:00.0
    totalMemory: 3.00GiB freeMemory: 254.16MiB
    2017-11-19 17:08:40.883414: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1060 3GB, pci bus id: 0000:01:00.0, compute capability: 6.1)
    0 [[ 0.29419887 -0.23337287]] [ 1.0515306]
    20 [[ 0.00030054  0.03563837]] [ 0.44433528]
    40 [[ 0.04815638  0.14494912]] [ 0.35854429]
    60 [[ 0.07746208  0.17898612]] [ 0.32386735]
    80 [[ 0.09062619  0.19159497]] [ 0.30974501]
    100 [[ 0.09614999  0.19658807]] [ 0.30398068]
    120 [[ 0.09842454  0.1986087 ]] [ 0.30162627]
    140 [[ 0.09935603  0.1994319 ]] [ 0.3006644]
    160 [[ 0.09973686  0.19976793]] [ 0.30027145]
    180 [[ 0.09989249  0.1999052 ]] [ 0.30011091]
    200 [[ 0.09995609  0.19996127]] [ 0.30004531]
    
    Process finished with exit code 0
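
    The device details in the log can also be read programmatically. The sketch below parses a description string in the format of the "Creating TensorFlow device" line above. `parse_device_desc` and `local_gpus` are hypothetical helper names, and `local_gpus` assumes that `physical_device_desc` from `device_lib.list_local_devices()` uses this same format, which may vary between TensorFlow versions.

```python
import re

# Matches the device description format seen in the log above.
_DESC = re.compile(
    r"name: (?P<name>.+?), pci bus id: (?P<bus>[^,]+), "
    r"compute capability: (?P<cc>\d+\.\d+)"
)


def parse_device_desc(desc):
    """Extract name, PCI bus id and compute capability, or None if no match."""
    m = _DESC.search(desc)
    if m is None:
        return None
    return {
        "name": m.group("name"),
        "pci_bus_id": m.group("bus"),
        "compute_capability": m.group("cc"),
    }


def local_gpus():
    """Describe the GPUs TensorFlow can see; None if TensorFlow cannot be imported."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [
        parse_device_desc(d.physical_device_desc)
        for d in device_lib.list_local_devices()
        if d.device_type == "GPU"
    ]


# The description taken from the log in this post:
line = ("(device: 0, name: GeForce GTX 1060 3GB, "
        "pci bus id: 0000:01:00.0, compute capability: 6.1)")
print(parse_device_desc(line))
```

    An empty list from `local_gpus()` would suggest that TensorFlow is running on the CPU only.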

    MNIST tutorial: training ran roughly a hundred times faster than with the CPU version.

    from tensorflow.examples.tutorials.mnist import input_data
    import tensorflow as tf
    
    # Load the training data
    MNIST_data_folder = r"D:\WorkSpace\tensorFlow\data"
    mnist = input_data.read_data_sets(MNIST_data_folder, one_hot=True)
    print(mnist.train.next_batch(1))
    
    # Build the abstract model
    # (W, b and y below are not used by the CNN that follows)
    x = tf.placeholder("float", [None, 784])
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    y = tf.nn.softmax(tf.matmul(x, W) + b)
    y_ = tf.placeholder("float", [None, 10])
    
    # Weight initialization
    def weight_variable(shape):
      initial = tf.truncated_normal(shape, stddev=0.1)
      return tf.Variable(initial)
    
    def bias_variable(shape):
      initial = tf.constant(0.1, shape=shape)
      return tf.Variable(initial)
    
    # Convolution and pooling
    def conv2d(x, W):
      return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
    
    def max_pool_2x2(x):
      return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                            strides=[1, 2, 2, 1], padding='SAME')
    
    # First convolutional layer
    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])
    x_image = tf.reshape(x, [-1, 28, 28, 1])
    h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)
    
    # Second convolutional layer
    W_conv2 = weight_variable([5, 5, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)
    
    # Densely connected layer
    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])
    
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
    
    # Dropout
    keep_prob = tf.placeholder("float")
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
    
    # Output layer
    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])
    
    y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
    
    # Train and evaluate the model
    cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))
    train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    
    sess = tf.InteractiveSession()
    init = tf.global_variables_initializer()
    sess.run(init)
    
    for i in range(20000):
      batch = mnist.train.next_batch(50)
      if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={
            x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
      train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    
    print("test accuracy %g" % accuracy.eval(feed_dict={
        x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
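
    The `7 * 7 * 64` in the densely connected layer follows from the layer shapes. With `padding='SAME'`, each convolution or pooling step produces ceil(input / stride) outputs along a spatial dimension. The short sketch below walks the 28x28 input through the four layers; `same_padding_output` is a hypothetical helper name used only for this illustration.

```python
import math


def same_padding_output(size, stride):
    """Output length along one spatial dimension when padding='SAME'."""
    return math.ceil(size / stride)


size = 28  # MNIST images are 28x28
# (layer name, stride) pairs taken from the network above
layers = [("conv1", 1), ("pool1", 2), ("conv2", 1), ("pool2", 2)]
for name, stride in layers:
    size = same_padding_output(size, stride)
    print(name, size)  # conv1 28, pool1 14, conv2 14, pool2 7

channels = 64  # output channels of the second convolutional layer
flat = size * size * channels
print(flat)  # 3136, i.e. 7 * 7 * 64
```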
  • Original article: https://www.cnblogs.com/learnMoreEveryday/p/7860342.html