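A simple timer for comparing the speed of different ways of building iterables
Background: if you are not familiar with iterators and generators, read these two posts first.
Initial version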
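import time

# time.clock was removed in Python 3.8; fall back to time.perf_counter when it is missing
clock = getattr(time, 'clock', time.perf_counter)

reps = 1000
repslist = range(reps)

def timer(func, *pargs, **kargs):
    # Call func reps times; return the total elapsed time and the last result
    start = clock()
    for i in repslist:
        ret = func(*pargs, **kargs)
    elapsed = clock() - start
    return (elapsed, ret)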
This is the initial version of the timer.
Let's run a first test with it:
from timer import timer
import sys

reps = 100000
repslist = range(reps)

def forloop():
    res = []
    for x in repslist:
        res.append(abs(x))
    return res

def listComp():
    return [abs(x) for x in repslist]

def mapCall():
    return list(map(abs, repslist))

def genExpr():
    return list(abs(x) for x in repslist)

def genFunc():
    def gen():
        for x in repslist:
            yield abs(x)
    return list(gen())

print(sys.version)
for test in (forloop, listComp, mapCall, genExpr, genFunc):
    elapsed, result = timer(test)
    print('-' * 33)
    print('%-9s:%.5f => [%s...%s]' % (test.__name__, elapsed, result[0], result[-1]))
The results are as follows:
3.7.4 (default, Aug 9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
---------------------------------
forloop :11.40492 => [0...99999]
---------------------------------
listComp :7.58494 => [0...99999]
---------------------------------
mapCall :4.28971 => [0...99999]
---------------------------------
genExpr :10.49181 => [0...99999]
---------------------------------
genFunc :10.76498 => [0...99999]
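The results show:
map is faster than the list comprehension, and both are much faster than the for loop. The generator expression and the generator function land between the list comprehension and the for loop.
If we replace the built-in abs with an operation of our own (x+10, which map has to wrap in a lambda), the results get more interesting: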
from timer import timer
import sys

reps = 100000
repslist = range(reps)

def forloop():
    res = []
    for x in repslist:
        res.append(x + 10)
    return res

def listComp():
    return [x + 10 for x in repslist]

def mapCall():
    return list(map(lambda x: x + 10, repslist))

def genExpr():
    return list(x + 10 for x in repslist)

def genFunc():
    def gen():
        for x in repslist:
            yield x + 10
    return list(gen())

print(sys.version)
for test in (forloop, listComp, mapCall, genExpr, genFunc):
    elapsed, result = timer(test)
    print('-' * 33)
    print('%-9s:%.5f => [%s...%s]' % (test.__name__, elapsed, result[0], result[-1]))
The results we get:
3.7.4 (default, Aug 9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
---------------------------------
forloop :26.69562 => [10...100009]
---------------------------------
listComp :16.46341 => [10...100009]
---------------------------------
mapCall :19.51527 => [10...100009]
---------------------------------
genExpr :10.53358 => [10...100009]
---------------------------------
genFunc :10.85899 => [10...100009]
Process finished with exit code 0
Honestly, this result is hard to explain... it seems to contradict what we just saw...
So I ran it again, and got the following:
3.7.4 (default, Aug 9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
---------------------------------
forloop :11.92378 => [10...100009]
---------------------------------
listComp :7.27866 => [10...100009]
---------------------------------
mapCall :12.92113 => [10...100009]
---------------------------------
genExpr :10.50988 => [10...100009]
---------------------------------
genFunc :10.56482 => [10...100009]
Process finished with exit code 0
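This result is closer to what we expected.
With a user-defined function, map is slower than the for loop, and the list comprehension is the fastest. The generator expression is slower than the list comprehension but about on par with the generator function.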
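One likely contributor to the map result, offered as an illustration rather than a full explanation: map has to call the lambda once per element, and every such call enters a new Python-level function, whereas the list comprehension evaluates x+10 inline. The sketch below counts Python-level calls with sys.setprofile. The helper count_python_calls, the names list_comp and map_call, and the small n are assumptions introduced for this example and are not part of the original scripts.

```python
import sys

def count_python_calls(func):
    """Run func once and count 'call' events (Python-level function entries)."""
    calls = 0

    def profiler(frame, event, arg):
        nonlocal calls
        if event == 'call':  # C functions report 'c_call' and are not counted
            calls += 1

    sys.setprofile(profiler)
    try:
        func()
    finally:
        sys.setprofile(None)  # always remove the profiler again
    return calls

n = 1000  # illustrative size, much smaller than the benchmark above
data = range(n)

def list_comp():
    return [x + 10 for x in data]

def map_call():
    return list(map(lambda x: x + 10, data))

comp_calls = count_python_calls(list_comp)
map_calls = count_python_calls(map_call)
print('list_comp:', comp_calls, 'map_call:', map_calls)
```

The exact counts depend on the interpreter version, but the map variant should report about n more Python-level calls than the comprehension.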
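Improved version
These results mostly come down to how the Python interpreter is implemented.
They also expose a problem: our timer is not rigorous enough. The two runs above differ by more than a factor of two for some of the tests.
So let's improve the timer.
For platform compatibility, we pick the clock by platform: time.clock gives the better resolution on Windows, while time.time is the better choice on Unix-like systems. (time.clock was removed in Python 3.8, where time.perf_counter is the replacement.)
Because random system load can make timings fluctuate, the shortest time over several runs is a more reliable measure than the total running time.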
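The standard library's timeit module supports the same best-of-several idea. As a minimal sketch (list_comp, repeat=5 and number=20 are illustrative choices, not values from the original test), timeit.repeat returns one total per round, and min picks the round least disturbed by system load:

```python
import timeit

def list_comp(n=10_000):
    return [x + 10 for x in range(n)]

# Each entry in rounds is the total time of `number` calls in one round.
rounds = timeit.repeat(list_comp, repeat=5, number=20)
best_round = min(rounds)
print('best of %d rounds: %.5f s for 20 calls' % (len(rounds), best_round))
```

Taking the minimum rather than the mean treats slower rounds as noise from other processes, which matches the reasoning above.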
The revised timer
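import time
import sys

if sys.platform[:3] == 'win':
    # time.clock was removed in Python 3.8; fall back to time.perf_counter there
    timefunc = getattr(time, 'clock', time.perf_counter)
else:
    timefunc = time.time

def trace(*args):
    """
    used for debugging
    :param args:
    :return:
    """
    pass

def timer(func, *pargs, **kargs):
    # Total time of _reps calls (default 1000), plus the last return value
    _reps = kargs.pop('_reps', 1000)
    trace(func, pargs, kargs, _reps)
    repslist = range(_reps)
    start = timefunc()
    for i in repslist:
        ret = func(*pargs, **kargs)
    elapsed = timefunc() - start
    return (elapsed, ret)

def best(func, *pargs, **kargs):
    # Fastest single call out of _reps tries (default 50), plus the last return value
    _reps = kargs.pop('_reps', 50)
    fastest = 2 ** 32
    for i in range(_reps):
        (elapsed, ret) = timer(func, *pargs, _reps=1, **kargs)
        if elapsed < fastest:
            fastest = elapsed
    return (fastest, ret)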
The revised test code
from timer import timer
from timer import best
import sys

reps = 100000
repslist = range(reps)

def forloop():
    res = []
    for x in repslist:
        res.append(x + 10)
    return res

def listComp():
    return [x + 10 for x in repslist]

def mapCall():
    return list(map(lambda x: x + 10, repslist))

def genExpr():
    return list(x + 10 for x in repslist)

def genFunc():
    def gen():
        for x in repslist:
            yield x + 10
    return list(gen())

print(sys.version)
for tester in (timer, best):
    print(f'<{tester.__name__}>')
    for test in (forloop, listComp, mapCall, genExpr, genFunc):
        elapsed, result = tester(test)
        print('-' * 35)
        print('%-9s:%.5f => [%s...%s]' % (test.__name__, elapsed, result[0], result[-1]))
Let's look at the results:
3.7.4 (default, Aug 9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
<timer>
-----------------------------------
forloop :11.18427 => [10...100009]
-----------------------------------
listComp :7.33068 => [10...100009]
-----------------------------------
mapCall :13.33474 => [10...100009]
-----------------------------------
genExpr :11.25375 => [10...100009]
-----------------------------------
genFunc :11.03975 => [10...100009]
<best>
-----------------------------------
forloop :0.00904 => [10...100009]
-----------------------------------
listComp :0.00525 => [10...100009]
-----------------------------------
mapCall :0.01133 => [10...100009]
-----------------------------------
genExpr :0.00845 => [10...100009]
-----------------------------------
genFunc :0.00785 => [10...100009]
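Judged by the fastest run, the results fully match our conclusions above.
The list comprehension is the fastest. map is slower than a plain for loop. The generator expression is faster than the for loop and about on par with the generator function.
Conclusion:
Honestly, this post was written purely for fun. Once you have chosen Python, don't agonize over execution speed; after all, Python's only job is to look pretty...
When optimizing Python code, think about readability and simplicity first, and tune performance only afterwards, when you truly have nothing better to do.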