
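TensorFlow CPU optimization and NumPy optimization

1. TensorFlow CPU optimization

When you use the CPU build of TensorFlow (for example, one installed via pip), you may see a warning that your CPU supports the AVX/AVX2 instruction sets but the TensorFlow binary was not compiled to use them. In that case, download a wheel built for your instruction set from the following repository.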

https://github.com/fo40225/tensorflow-windows-wheel

Usage instructions are given in the GitHub repository.

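In my tests, after installing a build compiled with AVX support, the corresponding math operations (matrix multiplication, decompositions, and so on) ran at roughly 3 times their original speed.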
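To check a figure like this on your own machine, you can time the same operation before and after swapping wheels. The following is only a minimal sketch, not the test behind the number above: the helper name `time_tf_matmul` is hypothetical, and it assumes TensorFlow 2.x with eager execution (it returns None if TensorFlow cannot be imported).

```python
import time


def time_tf_matmul(size=2048, repeats=5):
    """Average seconds per size x size float32 matmul, or None without TensorFlow."""
    try:
        import tensorflow as tf  # assumption: TensorFlow 2.x, eager mode
    except ImportError:
        return None
    a = tf.random.uniform((size, size))
    b = tf.random.uniform((size, size))
    tf.linalg.matmul(a, b).numpy()  # warm-up run, not timed
    start = time.perf_counter()
    for _ in range(repeats):
        # .numpy() forces the result to be computed before the clock stops
        tf.linalg.matmul(a, b).numpy()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    seconds = time_tf_matmul()
    if seconds is None:
        print("TensorFlow is not installed")
    else:
        print("2048x2048 matmul: %.3f s per call" % seconds)
```

Run it once with the stock pip wheel and once with the AVX/AVX2 wheel, on the same machine and with nothing else running, and compare the two numbers.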

2. NumPy optimization

Most NumPy builds today are linked against OpenBLAS by default, but I found that builds linked against MKL are faster. Download:

https://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy

To see which optimized libraries your NumPy build is linked against: np.show_config() (equivalently, np.__config__.show())
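If you want to compare two installs side by side, it can help to capture that report as text rather than only printing it. A small sketch, assuming `np.show_config()` writes its report to standard output; the helper name `numpy_build_info` is made up for illustration:

```python
import contextlib
import io

import numpy as np


def numpy_build_info():
    """Return the text printed by np.show_config() as a string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return buffer.getvalue()


if __name__ == "__main__":
    print("NumPy version:", np.__version__)
    # Look for the BLAS/LAPACK entries (for example openblas or mkl) in this report.
    print(numpy_build_info())
```

The exact layout of the report differs between NumPy versions, so read the BLAS and LAPACK sections by eye rather than relying on a fixed format.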

The test code and results are attached below; you can run the test on your own machine.

'''
default numpy(openblas):
---------
Dotted two 4096x4096 matrices in 1.99 s.
Dotted two vectors of length 524288 in 0.40 ms.
SVD of a 2048x1024 matrix in 1.75 s.
Cholesky decomposition of a 2048x2048 matrix in 0.21 s.
Eigendecomposition of a 2048x2048 matrix in 10.31 s.
------------------------------------------------------
numpy+mkl:
----------
Dotted two 4096x4096 matrices in 1.56 s.
Dotted two vectors of length 524288 in 0.33 ms.
SVD of a 2048x1024 matrix in 1.07 s.
Cholesky decomposition of a 2048x2048 matrix in 0.24 s.
Eigendecomposition of a 2048x2048 matrix in 6.94 s.

'''
import numpy as np
from time import time

# Let's take the randomness out of random numbers (for reproducibility)
np.random.seed(0)

size = 4096
A, B = np.random.random((size, size)), np.random.random((size, size))
C, D = np.random.random((size * 128, )), np.random.random((size * 128, ))
E = np.random.random((size // 2, size // 4))
F = np.random.random((size // 2, size // 2))
F = np.dot(F, F.T)  # make F symmetric positive definite so Cholesky succeeds
G = np.random.random((size // 2, size // 2))

# Matrix multiplication
N = 20
t = time()
for i in range(N):
    np.dot(A, B)
delta = time() - t
print('Dotted two %dx%d matrices in %0.2f s.' % (size, size, delta / N))
del A, B

# Vector multiplication
N = 5000
t = time()
for i in range(N):
    np.dot(C, D)
delta = time() - t
print('Dotted two vectors of length %d in %0.2f ms.' %
      (size * 128, 1e3 * delta / N))
del C, D

# Singular Value Decomposition (SVD)
N = 3
t = time()
for i in range(N):
    np.linalg.svd(E, full_matrices=False)
delta = time() - t
print("SVD of a %dx%d matrix in %0.2f s." % (size / 2, size / 4, delta / N))
del E

# Cholesky Decomposition
N = 3
t = time()
for i in range(N):
    np.linalg.cholesky(F)
delta = time() - t
print("Cholesky decomposition of a %dx%d matrix in %0.2f s." %
      (size / 2, size / 2, delta / N))

# Eigendecomposition (reuses N = 3 from the Cholesky benchmark above)
t = time()
for i in range(N):
    np.linalg.eig(G)
delta = time() - t
print("Eigendecomposition of a %dx%d matrix in %0.2f s." %
      (size / 2, size / 2, delta / N))
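
The script above repeats the same time/loop/print pattern for each operation. One possible way to factor it out is sketched below; `bench` is a hypothetical helper, not part of the original script, and it uses `perf_counter` plus an untimed warm-up call, which the original does not.

```python
from time import perf_counter


def bench(func, repeats=3, warmup=1):
    """Return the mean wall-clock seconds per call of func()."""
    for _ in range(warmup):
        func()  # warm-up calls are not timed
    start = perf_counter()
    for _ in range(repeats):
        func()
    return (perf_counter() - start) / repeats


if __name__ == "__main__":
    import numpy as np

    rng = np.random.default_rng(0)
    a = rng.random((512, 512))
    print("Dotted two 512x512 matrices in %0.4f s." %
          bench(lambda: np.dot(a, a), repeats=10))
```

The warm-up keeps one-time costs, such as the BLAS library starting its thread pool, out of the average.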

Original article: https://www.cnblogs.com/lunge-blog/p/11904824.html
