  • Python Code Optimization: An Overview

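    Python is both a procedural and an object-oriented language, and more often than not it plays the role of a scripting language. Scripts raise code-optimization questions too. Optimization makes a program run faster: it improves efficiency without changing the program's results. By the 80/20 rule, refactoring, optimizing, extending, and documenting a program typically consume 80% of the effort.

    Optimization usually covers two aspects:

    1. Reducing code size and improving readability and maintainability.

    2. Improving the algorithm: lowering code complexity and raising execution efficiency.

    Choosing a suitable data structure and a good algorithm can be decisive for performance, so the first place to look for performance gains is the algorithm.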

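    In order of increasing growth (that is, from most to least efficient), the common time complexities are:

    O(1) < O(lg n) < O(n) < O(n lg n) < O(n^2) < O(n^3) < O(n^k) < O(k^n) < O(n!)


    For example, a dictionary is a hash table: looking up a key is O(1) on average, whereas searching a list is O(n), so finding an object in a dictionary is faster than finding it in a list.

    Below are some code-optimization techniques, summarized in outline form. Because of time constraints only some of them are covered here; the list will be extended over time.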

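    Notes on the tests

    Test tools: the time module, the timeit module, and the profile or cProfile module

    Ways of running the tests: the Python shell, IPython, and Python scripts

    Test environment: Python 2.7.6 and IPython 2.3.1

    NOTE:

    1. Modules whose names start with c are generally implemented in C and are faster. For example, cProfile is faster than profile, and cPickle is faster than pickle.

    2. Newer Python versions generally bring substantial speed improvements, so results differ from one test environment to another.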

    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

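    += is faster than +

    Python 2.0 introduced augmented assignment. For example,

    X += Y

    is equivalent to X = X + Y

    1. In terms of optimization, the left-hand side needs to be evaluated only once. In X += Y, X can be a complex object expression, and in the augmented form it is evaluated just once.

    In the full form X = X + Y, however, X appears twice and must be evaluated twice, so the augmented assignment is usually faster.

    from timeit import Timer   # remember to import the timeit module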

    In [4]: Timer('S = S + "eggs"','S = "SPAM"').timeit()
    Out[4]: 2.8523161220051065
    
    In [5]: Timer('S += "eggs"','S = "SPAM"').timeit()
    Out[5]: 2.602857082653941
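
    2. The optimization is applied automatically: for objects that support in-place modification, the augmented form performs the change in place.

    Ordinary concatenation (M is rebound to a new list, L still refers to the old one):

    >>> M = [1,2,3]
    >>> L = M
    >>> M = M + [4]
    >>> M;L
    [1, 2, 3, 4]
    [1, 2, 3]

    In-place modification (M and L still share one list):
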
    >>> M  = [1,2,3]
    >>> L  = M
    >>> M += [4]
    >>> M;L
    [1, 2, 3, 4]
    [1, 2, 3, 4]
    
    >>> Timer('L = L + [4,5,6]','L = [1,2,3]').timeit(20000)
    4.324376213615835
    >>> Timer('L += [4,5,6]','L = [1,2,3]').timeit(20000)
    0.005897484107492801
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    Built-in methods of mutable objects are faster than concatenation

    Method 1: plain concatenation

    >>> L = [1,2,3]
    >>> L = L + [4]
    >>> L
    [1, 2, 3, 4]

    Method 2: a built-in method

    >>> L = [1,2,3]
    >>> L.append(4)
    >>> L
    [1, 2, 3, 4]
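
    The time taken differs by a factor of several hundred:

    >>> Timer('L = L + [4]','L = [1,2,3]').timeit(50000)
    8.118179033256638
    >>> Timer('L.append(4)','L = [1,2,3]').timeit(50000)    # built-in append() method
    0.01078882192950914
    >>> Timer('L.extend([4])','L = [1,2,3]').timeit(50000)  # built-in extend() method
    0.020846637858539907

    Unlike the equivalent in-place change, ordinary concatenation has no side effects through shared references, but it is much slower. Concatenation must build a new object, copying the list on the left and then the list on the right, whereas the in-place methods simply add elements at the end of the existing memory block.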

    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    A Boolean test is faster than a chained range test

    >>> Timer('X < Y and Y < Z','X=1;Y=2;Z=3').timeit(100000000)  # Boolean test
    7.142944090197389
    >>> Timer('X < Y < Z','X=1;Y=2;Z=3').timeit(100000000)        # range test: checks that Y lies between X and Z
    11.501173499654769
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

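    A short-circuited test is faster than one that evaluates both operands

    In Python, if is used for conditional tests, which come in the following forms:

    X and Y:  true only when both X and Y are true. If X is false, Y is not evaluated.

    X  or Y:  true when either X or Y is true. This is short-circuit evaluation: if the first operand is true, the second is not evaluated.

    not X:    true only when X is false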

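    In [28]: Timer('2 or 3').timeit(100000000)  # short circuit: the first operand is true, the second is skipped, so it is faster
    Out[28]: 3.780060393088206
    
    In [29]: Timer('2 and 3').timeit(100000000) # and: the first operand is true, so the second must be evaluated too; somewhat slower
    Out[29]: 4.313562268420355
    
    In [30]: Timer('0 or 1').timeit(100000000)  # or, but the first operand is false, so the second is evaluated; about the same as the previous case
    Out[30]: 4.251177957004984
    
    In [31]: Timer('not 0').timeit(100000000)   # not: only one operand to test, so it is faster
    Out[31]: 3.6270803685183637

    Of the four expressions above, the short-circuited or and the not are clearly faster, while the and and the or whose first operand is false are slower.

    Ordering the operands so that the test can stop early can therefore improve a program's efficiency.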
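
    To make the evaluation order visible, here is a minimal sketch (written for Python 3, unlike the Python 2.7 timings above). The helper check and the list calls are hypothetical names introduced only for this illustration; the helper records which operands were actually evaluated.

```python
calls = []  # labels of the operands that were actually evaluated


def check(label, value):
    """Record that an operand was evaluated, then return its value."""
    calls.append(label)
    return value


# `or` stops at the first truthy operand, so "b" is never evaluated.
result_or = check("a", 2) or check("b", 3)

# `and` stops at the first falsy operand, so "d" is never evaluated.
result_and = check("c", 0) and check("d", 1)

# When the first operand does not decide the outcome, both are evaluated.
result_both = check("e", 2) and check("f", 3)

print(calls)
```

    Both and and or short-circuit; which one saves work depends on the truth value of the first operand.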

    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

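    append is faster than insert

    A list's append method is much faster than insert, because insert has to shift every subsequent reference to make room for the new element.

    Complexity: append adds at the end and is O(1) (amortized), whereas insert is O(n).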

    >>> Timer('L.append(4)','L=[1,2,3,5,6]').timeit(200000)
    0.03233202260122425
    >>> Timer('L.insert(3,4)','L=[1,2,3,5,6]').timeit(200000)
    18.31223843438289
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

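    Membership tests: dictionaries and sets are faster than lists and tuples

    The in operator tests membership, for example 'a' in 'abcd'.

    Checking whether a list or tuple contains a value is much slower than checking a dictionary or set.

    Python scans lists and tuples linearly, whereas dictionaries and sets are built on hash tables and answer in roughly constant time.

    The larger the data, the more obvious the difference.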

    In [44]: Timer('4 in L','L=(1,2,3,4,5,6,7,8,9)').timeit(100000000)
    Out[44]: 12.941504527043435    # tuple membership test (L is bound to a tuple here)
    
    
    In [45]: Timer('4 in T','T=[1,2,3,4,5,6,7,8,9]').timeit(100000000)
    Out[45]: 12.883945908790338    # list membership test (T is bound to a list), about the same as the tuple
    
    
    In [46]: Timer('4 in S','S=set([1,2,3,4,5,6,7,8,9])').timeit(100000000)
    Out[46]: 6.254324848690885     # set membership test, about the same as the dictionary
    
    
    In [47]: Timer('4 in D','D={1:"a",2:"b",3:"c",4:"d",5:"e",6:"f",7:"g",8:"h",9:"i"}').timeit(100000000)
    Out[47]: 6.3508488422085065    # dictionary membership test
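
    The nine-element containers above only hint at the gap. A minimal sketch of the same comparison on a larger container follows; it assumes Python 3, and the size N and the repeat count are arbitrary choices for illustration. No timings are quoted because they depend on the machine.

```python
from timeit import timeit

N = 100_000
data_list = list(range(N))
data_set = set(data_list)
target = N - 1  # last element: the worst case for a linear scan of the list

# The list is scanned element by element; the set does a hash lookup.
list_time = timeit(lambda: target in data_list, number=100)
set_time = timeit(lambda: target in data_set, number=100)

print("list: %.4fs  set: %.4fs" % (list_time, set_time))
```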

    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

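    For merging lists, extend is faster than +

    List concatenation (+) is a fairly expensive operation, because it must create a new list and copy all the objects into it.

    extend, by contrast, appends the elements to the existing list, so it is much faster, especially when building a large list.

    Result with +:

    import profile            # cProfile would be faster
    
    def func_add():           # test list concatenation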
        lst = []
        for i in range(5000): 
            for item in [[0],[1],[2],[3],[4],[5],[6],[7],[8],[9],[10]]:
                lst = lst + item
                
    if __name__=='__main__':
        profile.run('func_add()')
    ##### Test results: #####
    >>> 
             5 function calls in 9.243 seconds
    
       Ordered by: standard name
    
       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000    0.000    0.000 :0(range)
            1    0.006    0.006    0.006    0.006 :0(setprofile)
            1    0.000    0.000    9.237    9.237 <string>:1(<module>)
            1    9.236    9.236    9.236    9.236 Learn.py:3(func_add)
            1    0.000    0.000    9.243    9.243 profile:0(func_add())
            0    0.000             0.000          profile:0(profiler)

    Result with extend:

    import profile
    
    def func_extend():
        lst = []
        for i in range(5000):
            for item in [[0],[1],[2],[3],[4],[5],[6],[7],[8],[9],[10]]:
                lst.extend(item)
        
    
    if __name__=='__main__':
        profile.run('func_extend()')
    
    ##### Output: #####
    >>> 
             55005 function calls in 0.279 seconds
    
       Ordered by: standard name
    
       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        55000    0.124    0.000    0.124    0.000 :0(extend)
            1    0.000    0.000    0.000    0.000 :0(range)
            1    0.005    0.005    0.005    0.005 :0(setprofile)
            1    0.000    0.000    0.274    0.274 <string>:1(<module>)
            1    0.149    0.149    0.273    0.273 Learn.py:3(func_extend)
            1    0.000    0.000    0.279    0.279 profile:0(func_extend())
            0    0.000             0.000          profile:0(profiler)

    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    xrange is faster than range

    In [9]: Timer('for i in range(1000): pass').timeit()
    Out[9]: 30.839959527228757
    
    In [10]: Timer('for i in xrange(1000): pass').timeit()
    Out[10]: 19.644791055468943
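
    Both are built-ins; the difference is that xrange returns a lazy object instead of a list, which makes its memory use much more economical.

    xrange: produces one value per iteration

    range: builds the whole list up front, which is then scanned element by element

    NOTE: Python 3.0 removed the xrange function and kept only range, but that range is in fact the old xrange under a new name.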
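
    The difference between a lazy range and a materialized list can be seen from object sizes. This is a small sketch for Python 3, where range plays the role of the old xrange and list(range(...)) plays the role of the old range; the variable names are illustrative only.

```python
import sys

lazy = range(1_000_000)   # Python 3 range: a lazy sequence, like Python 2 xrange
eager = list(lazy)        # materializes all one million integers, like Python 2 range

lazy_size = sys.getsizeof(lazy)    # size of the range object itself
eager_size = sys.getsizeof(eager)  # size of the list's pointer array only

print(lazy_size, eager_size)
```

    The lazy object stays small regardless of its length, while the list grows with the number of elements.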

    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    Built-in functions > list comprehensions > for loops > while loops

    http://blog.csdn.net/jerry_1126/article/details/41773277
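
    The linked post has the details. As a rough illustration of what the four styles look like, the sketch below builds the same list in each of them and times them with timeit. It assumes Python 3, the function names are hypothetical, and the ranking can vary with the workload and interpreter version, so no numbers are quoted.

```python
from timeit import timeit

values = list(range(-500, 500))


def with_builtin():
    # map() drives the loop inside the interpreter's C code.
    return list(map(abs, values))


def with_comprehension():
    return [abs(v) for v in values]


def with_for():
    out = []
    for v in values:
        out.append(abs(v))
    return out


def with_while():
    out = []
    i = 0
    while i < len(values):
        out.append(abs(values[i]))
        i += 1
    return out


variants = (with_builtin, with_comprehension, with_for, with_while)
timings = {f.__name__: timeit(f, number=200) for f in variants}

for name, seconds in timings.items():
    print("%-20s %.4fs" % (name, seconds))
```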

    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

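    Local variables > global variables

    import profile
    
    A = 5                      # global variable
    
    def param_test():
        B = 5                  # local variable
        res = 0
        for i in range(100000000):
            res = B + i        # local-variable run; the global-variable run presumably read A here
        return res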
            
    if __name__=='__main__':
        profile.run('param_test()')
    >>> ===================================== RESTART =====================================
    >>> 
             5 function calls in 37.012 seconds  # global-variable run: 37 s
    
    
       Ordered by: standard name
    
    
       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1   19.586   19.586   19.586   19.586 :0(range)
            1    1.358    1.358    1.358    1.358 :0(setprofile)
            1    0.004    0.004   35.448   35.448 <string>:1(<module>)
            1   15.857   15.857   35.443   35.443 Learn.py:5(param_test)
            1    0.206    0.206   37.012   37.012 profile:0(param_test())
            0    0.000             0.000          profile:0(profiler)
    
    
    
    
    >>> ===================================== RESTART =====================================
    >>> 
             5 function calls in 11.504 seconds    # local-variable run: 11 s
    
    
       Ordered by: standard name
    
    
       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    3.135    3.135    3.135    3.135 :0(range)
            1    0.006    0.006    0.006    0.006 :0(setprofile)
            1    0.000    0.000   11.497   11.497 <string>:1(<module>)
            1    8.362    8.362   11.497   11.497 Learn.py:5(param_test)
            1    0.000    0.000   11.504   11.504 profile:0(param_test())
            0    0.000             0.000          profile:0(profiler)
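
    The listing above shows only one of the two variants. A side-by-side sketch of both is given here, assuming Python 3 and a much smaller loop count than the original test; use_global and use_local are hypothetical names. In CPython a local name is read from a slot in the function's frame, while a global name requires a dictionary lookup on every access.

```python
from timeit import timeit

A = 5  # module-level (global) name


def use_global(n=100_000):
    res = 0
    for i in range(n):
        res = A + i   # A is looked up in the module namespace on every pass
    return res


def use_local(n=100_000):
    B = 5             # local name, read directly from the function's frame
    res = 0
    for i in range(n):
        res = B + i
    return res


global_time = timeit(use_global, number=20)
local_time = timeit(use_local, number=20)

print("global: %.4fs  local: %.4fs" % (global_time, local_time))
```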
    
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    while 1 > while True

    Result with while 1:

    import cProfile
    
    def while_1():
        tag = 0
        while 1:
            tag += 1
            if tag > 100000000:
                break
    
           
    if __name__=='__main__':
        cProfile.run('while_1()')
    
    >>> ===================================== RESTART =====================================
    >>> 
             4 function calls in 5.366 seconds
    
       Ordered by: standard name
    
       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.006    0.006    0.006    0.006 :0(setprofile)
            1    0.000    0.000    5.360    5.360 <string>:1(<module>)
            1    5.360    5.360    5.360    5.360 Learn.py:3(while_1)
            0    0.000             0.000          profile:0(profiler)
            1    0.000    0.000    5.366    5.366 profile:0(while_1())

    Result with while True:

    import cProfile
    
    def while_true():
        tag = 0
        while True:
            tag += 1
            if tag > 100000000:
                break
           
    if __name__=='__main__':
        cProfile.run('while_true()')
    
    >>> ===================================== RESTART =====================================
    >>> 
             4 function calls in 8.236 seconds
    
       Ordered by: standard name
    
       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.012    0.012    0.012    0.012 :0(setprofile)
            1    0.000    0.000    8.224    8.224 <string>:1(<module>)
            1    8.224    8.224    8.224    8.224 Learn.py:10(while_true)
            0    0.000             0.000          profile:0(profiler)
            1    0.000    0.000    8.236    8.236 profile:0(while_true())
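
    NOTE: while 1 runs faster than while True because in Python 2.x True is effectively a global variable rather than a keyword, so it has to be looked up on every test.

    Although the former is faster, the latter is undoubtedly more readable.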

    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    For intersections, sets are faster than lists

    List test:

    from time import time
    
    t1 = time()
    list_1 = [32,78,65,99,19,43,18,22,7,1,9,2,4,8,56]
    list_2 = [3,4,8,56,99,100]
    temp   = []
    for x in range(1000000):
        for i in list_2:
            for j in list_1:
                if i == j:
                    temp.append(i)
    t2 = time()
    print "Total time:", t2 - t1
    
    # Test result:
    >>> 
    Total time: 13.6879999638

    Set test:

    from time import time
    
    t1 = time()
    set_1 = set([32,78,65,99,19,43,18,22,7,1,9,2,4,8,56])
    set_2 = set([3,4,8,56,99,100])
    for x in range(1000000):
        set_same = set_1 & set_2
        
    t2 = time()
    print "Total time:", t2 - t1
    
    # Test result:
    >>> 
    Total time: 0.611000061035
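
    NOTE: Taking the intersection with sets is much faster.

    Commonly used set operations:

    >>> set1 = set([2,3,4,8,9])  # set 1
    >>> set2 = set([1,3,4,5,6])  # set 2
    >>> set1 & set2              # intersection
    set([3, 4])
    >>> set1 | set2              # union
    set([1, 2, 3, 4, 5, 6, 8, 9])
    >>> set1 - set2              # difference
    set([8, 9, 2])
    >>> set1 ^ set2              # symmetric difference: everything except the common elements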
    set([1, 2, 5, 6, 8, 9])
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    Direct swap > swap through a temporary variable

    There are two ways to swap the values of X and Y:

    1. Direct swap: X, Y = Y, X

    >>> X,Y = 1,2
    >>> X,Y
    (1, 2)
    >>> X, Y = Y, X
    >>> X,Y
    (2, 1)

    2. Through a temporary variable: T = X; X = Y; Y = T

    >>> X,Y = 1,2
    >>> X,Y
    (1, 2)
    >>> T = X; X = Y; Y = T
    >>> X,Y
    (2, 1)

    Test results:
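
    One way to compare the two forms yourself is with timeit. The sketch below is an illustration, not the author's original test; it assumes Python 3, and the function names are hypothetical. The difference between the two forms is usually small, so expect the numbers to vary between runs.

```python
from timeit import timeit


def swap_direct(x, y):
    x, y = y, x          # tuple packing and unpacking in one statement
    return x, y


def swap_with_temp(x, y):
    t = x                # three separate assignments through a temporary
    x = y
    y = t
    return x, y


direct_time = timeit("x, y = y, x", setup="x, y = 1, 2", number=1_000_000)
temp_time = timeit("t = x; x = y; y = t", setup="x, y = 1, 2", number=1_000_000)

print("direct: %.4fs  temp: %.4fs" % (direct_time, temp_time))
```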


    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    is not is faster than !=

    In an if test you can write either if a is not None: or if a != None:. The former runs faster than the latter.

    Test results:
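
    Part of the reason is that the two operators do different work: is not compares object identity, whereas != dispatches to the object's comparison methods. The sketch below illustrates this with a hypothetical class, AlwaysEqual, invented for this example; it also shows that the two tests are not always interchangeable.

```python
class AlwaysEqual:
    """Hypothetical class whose comparison operators ignore the other operand."""

    def __eq__(self, other):
        return True

    def __ne__(self, other):
        return False


obj = AlwaysEqual()

identity_check = obj is not None  # identity test: cannot be overridden by the class
equality_check = obj != None      # calls AlwaysEqual.__ne__, which returns False

print(identity_check, equality_check)
```

    For None checks, is not None is therefore both the cheaper and the safer test.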


    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    ''.join(list) is faster than + or +=

    Result with +:


    Result with ''.join(list):
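
    A minimal sketch of how the two approaches can be compared, assuming Python 3; the function names and the number of fragments are arbitrary choices for illustration. Repeated + may copy the accumulated string on each step, while join computes the total size once and builds the result in a single pass.

```python
from timeit import timeit

parts = [str(i) for i in range(1000)]


def concat_plus():
    s = ""
    for p in parts:
        s += p           # may create a new, longer string on every step
    return s


def concat_join():
    return "".join(parts)  # one pass over the fragments


plus_time = timeit(concat_plus, number=200)
join_time = timeit(concat_join, number=200)

print("+=: %.4fs  join: %.4fs" % (plus_time, join_time))
```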


    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    Calling a function outside the loop is faster than calling it inside the loop

    So reduce the number of function calls.
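
    As an illustration, the sketch below moves a loop-invariant call out of the loop. The functions in_loop and hoisted are hypothetical names; both return the same result, but the first recomputes math.sqrt(len(values)) on every pass.

```python
import math


def in_loop(values):
    out = []
    for v in values:
        # sqrt() and len() are called again on every iteration.
        out.append(v * math.sqrt(len(values)))
    return out


def hoisted(values):
    scale = math.sqrt(len(values))  # computed once, before the loop
    out = []
    for v in values:
        out.append(v * scale)
    return out


print(hoisted([1, 2, 3, 4]))
```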





    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

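    ** is faster than pow()

    Test results:


    In the author's tests, ** ran tens of times faster than the pow() function, and the gap widened as the numbers grew.

    For powers of two, ** corresponds to Python's shift operators, right shift (>>) and left shift (<<): for example, 2**2 = 4 is the same as 2 << 1, and the two run at comparable speed.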
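
    A sketch for checking this on your own interpreter, assuming Python 3. The operands are held in variables on purpose: with literal operands such as 2 ** 10, CPython can fold the expression into a constant at compile time, which would make ** look far faster than it is.

```python
from timeit import timeit

power_op = 2 ** 10
power_fn = pow(2, 10)
shifted = 1 << 10      # equals 2 ** 10 only because the base is 2

# Variables prevent compile-time constant folding of the ** expression.
op_time = timeit("x ** y", setup="x, y = 3, 20", number=1_000_000)
fn_time = timeit("pow(x, y)", setup="x, y = 3, 20", number=1_000_000)

print("**: %.4fs  pow(): %.4fs" % (op_time, fn_time))
```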


    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

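    Generators are faster than lists


    The former is a list-comprehension object and the latter a generator object. The memory a generator needs does not depend on the size of the list, which is why it is faster.

    In practice, for example when building a set, a generator object is noticeably faster than a list object.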
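
    The memory argument can be checked directly. This is a minimal sketch assuming Python 3, with names and sizes chosen only for illustration: the list comprehension stores every value, while the generator expression produces values on demand.

```python
import sys

n = 100_000
squares_list = [i * i for i in range(n)]  # list comprehension: all values stored
squares_gen = (i * i for i in range(n))   # generator expression: values on demand

list_bytes = sys.getsizeof(squares_list)
gen_bytes = sys.getsizeof(squares_gen)

# A generator can feed set() directly, without building a temporary list first.
unique_last_digits = set(i % 10 for i in range(n))

print(list_bytes, gen_bytes, len(unique_last_digits))
```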


    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    Shallow copy is faster than deep copy
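
    The speed difference follows from how much each one copies. In the sketch below, which uses the standard copy module (the variable names are illustrative), the shallow copy creates only a new outer list, while the deep copy recursively duplicates the nested lists as well.

```python
import copy

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)    # new outer list, same inner lists
deep = copy.deepcopy(original)   # new outer list and new inner lists

original[0].append(99)           # mutate an inner list after copying

print(shallow[0], deep[0])
```

    The shallow copy sees the later change because it shares the inner lists; the deep copy does not.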


    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

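    cPickle is faster than pickle

    Both the pickle module and the cPickle module can store Python objects persistently in a file.

    But cPickle is implemented in C and is therefore faster.

    Compare:


    When storing a list with about a million elements, cPickle is almost 10 times as fast as pickle.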
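
    A minimal round-trip sketch that runs on either version. On Python 2 it picks up cPickle; on Python 3 there is no separate cPickle module, and the standard pickle module uses its C accelerator automatically when one is available. The data size here is an arbitrary choice for illustration.

```python
try:
    import cPickle as pickle  # Python 2: the C implementation
except ImportError:
    import pickle             # Python 3: C accelerator is used when available

data = list(range(100000))

# Serialize to bytes and restore; dump()/load() do the same with a file object.
blob = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(blob)

print(len(blob), restored == data)
```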

    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

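    int() is faster than int(math.floor())

    Take converting a float such as 32.9 to an integer: if it represents an age, the result can only be 32.

    There are two ways to do it:

    First: int(math.floor(32.9))   # floor first gives 32.0, which is then converted to an integer

    Second: int(32.9)              # truncates directly (toward zero, which equals rounding down for non-negative values), so math.floor() is redundant here


    As the results show, the second way is much faster than the first.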
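
    A sketch of both forms, assuming Python 3 (where math.floor already returns an int; in Python 2 it returns a float, hence the int() wrapper in the text). It also shows the one case where the two are not interchangeable: negative values.

```python
import math
from timeit import timeit

age = 32.9
truncated = int(age)            # drops the fractional part
floored = int(math.floor(age))  # rounds down first, then converts

# int() truncates toward zero, floor() rounds toward negative infinity.
neg_truncated = int(-32.9)
neg_floored = int(math.floor(-32.9))

int_time = timeit("int(x)", setup="x = 32.9", number=1_000_000)
floor_time = timeit("int(math.floor(x))", setup="import math; x = 32.9",
                    number=1_000_000)

print(truncated, floored, neg_truncated, neg_floored)
```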

    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    For reading files, a for loop is more efficient than a while loop

    Test subject: test.txt, a 3 MB file


    Test result: 245.83s


    Test result: 182.62s

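    for line in open('filename'):
        process(line)

    The form above should be the best way to read a file, for three reasons:

    1. It is the simplest to write

    2. It runs the fastest

    3. It is also the best in terms of memory use

    NOTE:

    1. readlines() loads all lines at once, whereas xreadlines() loads them on demand, which avoids exhausting memory on large files.

    2. readline() reads one line at a time, so for large files it is more memory-efficient than readlines().

    3. With very large files, readlines() may exhaust memory and crash the program.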
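
    The two loop styles can be sketched as follows. This is an illustration for Python 3, not the author's original test: it writes a small temporary file so that it is self-contained, and read_with_for and read_with_while are hypothetical names.

```python
import os
import tempfile

# Create a small throwaway file so the example does not depend on test.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    path = tmp.name


def read_with_for(path):
    lines = []
    with open(path) as f:
        for line in f:           # the file object yields one line at a time
            lines.append(line.rstrip("\n"))
    return lines


def read_with_while(path):
    lines = []
    with open(path) as f:
        while True:
            line = f.readline()  # returns "" at end of file
            if not line:
                break
            lines.append(line.rstrip("\n"))
    return lines


for_lines = read_with_for(path)
while_lines = read_with_while(path)
os.remove(path)

print(for_lines)
```

    Both versions keep only one line in memory at a time; the for form simply leaves the iteration to the file object.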

    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    Iterating over a dictionary with the iter* methods is more efficient than with the list-returning methods

    >>> Timer('for v in d.values(): pass',"d={'a':1,'b':2,'c':3,'d':4,'e':5}").timeit(100000000)
    43.297132594847085
    >>> Timer('for v in d.itervalues(): pass',"d={'a':1,'b':2,'c':3,'d':4,'e':5}").timeit(100000000)
    36.16957129734047

    In other words:

    # d.itervalues() is faster than d.values()
    # d.iteritems()  is faster than d.items()
    # d.iterkeys()   is faster than d.keys()
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

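    Creating a dictionary with a literal is faster than with the dict() factory function

    Literal:
    d={'x':1, 'y':2, 'z':3}
    Factory function, in forms such as the following: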
    1. dict((['x',1], ['y',2], ['z',3]))

    2. dict(zip(('x', 'y', 'z'), (1, 2, 3)))

    3. dict(x=1, y=2, z=3)

    >>> Timer("d={'x':1, 'y':2, 'z':3}").timeit()
    0.19084989859678103
    >>> Timer("dict((['x',1], ['y',2], ['z',3]))").timeit()
    1.5503493690009975

    Likewise, creating a new dictionary with the factory function is slower than with copy():

    >>> Timer("D2=copy.copy({'x':1, 'y':2, 'z':3})","import copy").timeit()
    1.074277245445046
    >>> Timer("D3=dict((['x',1], ['y',2], ['z',3]))").timeit()
    1.5830155758635556


  • Original article: https://www.cnblogs.com/zhchoutai/p/6798337.html