Let's start with two examples.
Summing a very large list consumes an enormous amount of memory and eventually fails with a MemoryError.
Example 1.
>>> sum([i for i in xrange(100000000)])
Traceback (most recent call last):
  File "<input>", line 1, in <module>
MemoryError
Switching to a generator expression avoids that memory cost.
Example 2.
>>> sum(i for i in xrange(100000000))
4999999950000000L
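The main reason is the lazy nature of iterators: unlike a list, which loads all of its elements into memory at once, an iterator computes and returns its elements on demand.
An iterator returns the next element only when its next method is called. Because sum() consumes its argument through the iterator protocol, it can keep pulling values from the generator object until the generator is exhausted.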
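The difference can be measured directly. The sketch below is written for Python 3 (where range plays the role of xrange) and uses a smaller n, chosen here only so that the list version still fits in memory. Note that sys.getsizeof reports the size of the container object alone, not of the integers it refers to, so the figures are a rough indication rather than a full memory profile.

```python
import sys

n = 1000000  # smaller than in the examples above so the list fits in memory

as_list = [i for i in range(n)]  # every element is materialized up front
as_gen = (i for i in range(n))   # nothing has been computed yet

list_size = sys.getsizeof(as_list)  # grows with n
gen_size = sys.getsizeof(as_gen)    # small and independent of n

total_from_list = sum(as_list)
total_from_gen = sum(as_gen)        # sum() pulls one value at a time

print(list_size, gen_size)
print(total_from_list == total_from_gen)
```

Both sums are identical; only the amount of memory held at any one moment differs.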
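Python iterators
An iterator is a way of accessing the elements of a collection. An iterator object (everything in Python is an object) starts at the first element of the collection and proceeds until the end. Unlike a list, an iterator cannot be accessed by position, and it can be traversed only once, as an example later on shows.
Something like list[0] is not supported.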
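Iterator protocol: informally, the elements are visited from the first to the last. Elements are fetched in order with the next() method,
and when no elements remain a StopIteration exception is raised. A for loop works with any iterable object.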
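Iterator object: a Python object that follows the iterator protocol. Iterators provide a uniform interface for accessing collections: any object that implements the __iter__() or __getitem__() method can be
accessed through an iterator.
Sequences: str, list, tuple
Non-sequences: dict, file
User-defined classes: objects of a user-defined class that implements the __iter__() or __getitem__() method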
>>> dir(list)
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__delslice__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getslice__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__setslice__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
>>> dir(str)
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__getslice__', '__gt__', '__hash__', '__init__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '_formatter_field_name_split', '_formatter_parser', 'capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
>>> dir(tuple)
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__getslice__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'count', 'index']
>>> dir(dict)
['__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'clear', 'copy', 'fromkeys', 'get', 'has_key', 'items', 'iteritems', 'iterkeys', 'itervalues', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values', 'viewitems', 'viewkeys', 'viewvalues']
>>> dir(file)
['__class__', '__delattr__', '__doc__', '__enter__', '__exit__', '__format__', '__getattribute__', '__hash__', '__init__', '__iter__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'close', 'closed', 'encoding', 'errors', 'fileno', 'flush', 'isatty', 'mode', 'name', 'newlines', 'next', 'read', 'readinto', 'readline', 'readlines', 'seek', 'softspace', 'tell', 'truncate', 'write', 'writelines', 'xreadlines']
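The built-in types listed above expose __iter__ or __getitem__, and a user-defined class can do the same. The following is a minimal sketch with two hypothetical classes, Countdown and Squares, which are not part of the original post. It is written for Python 3, where the method is named __next__; the alias line keeps it usable under Python 2 as well.

```python
class Countdown(object):
    """An iterator: implements both __iter__ and __next__."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self  # an iterator returns itself

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals that no elements remain
        value = self.current
        self.current -= 1
        return value

    next = __next__  # Python 2 spelling of the same method


class Squares(object):
    """No __iter__: iteration falls back to __getitem__ with 0, 1, 2, ..."""

    def __init__(self, count):
        self.count = count

    def __getitem__(self, index):
        if not 0 <= index < self.count:
            raise IndexError(index)  # IndexError ends the iteration
        return index * index


countdown_values = list(Countdown(3))
square_values = list(Squares(4))
print(countdown_values)  # [3, 2, 1]
print(square_values)     # [0, 1, 4, 9]
```

Squares never mentions iterators at all, yet iter() can still wrap it, because the fallback calls __getitem__ with increasing integers until IndexError is raised.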
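Creating an iterator object: the built-in factory function iter(iterable) returns an iterator object; alternatively, calling the object's own __iter__() method also produces an iterator.
Syntax: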
iter(collection) -> iterator
iter(callable,sentinel) -> iterator
Description:
Get an iterator from an object.
In the first form, the argument must supply its own iterator, or be a sequence.
In the second form, the callable is called until it returns the sentinel.
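The second form is less well known, so here is a small sketch of it. It reads fixed-size chunks from an in-memory stream until read() returns the empty bytes object, which serves as the sentinel. The stream contents and the chunk size of 4 are arbitrary choices for illustration, and the code is written for Python 3.

```python
import io
from functools import partial

stream = io.BytesIO(b"abcdefghij")  # stand-in for a real file object

# stream.read(4) is called repeatedly; iteration stops once it returns b"".
chunks = list(iter(partial(stream.read, 4), b""))
print(chunks)  # [b'abcd', b'efgh', b'ij']
```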
>>> l1 = [2,4,6,8,10]
>>> i1 = iter(l1)
>>> i1
<listiterator object at 0x02B82D90>
>>> i1.next()
2
>>> i1.next()
4
>>> i1.next()
6
>>> i1.next()
8
>>> i1.next()
10
>>> i1.next()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
StopIteration
>>> i2 = l1.__iter__()
>>> i2
<listiterator object at 0x02BB0390>
>>> i2.next()
2
>>> i2.next()
4
>>> i2.next()
6
>>> i2.next()
8
>>> i2.next()
10
>>> i2.next()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
StopIteration
Once an iterator object has been exhausted, iterating over it again produces nothing at all. An iterator can be traversed only once. Only once!
>>> i2.next()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
StopIteration
>>> for i in i2:
...     print i
...
>>>
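It is worth separating the iterator from the iterable it came from: the list itself can be traversed as often as needed, because every traversal asks it for a fresh iterator. A short sketch in Python 3 syntax, with variable names chosen only for this illustration:

```python
numbers = [2, 4, 6]

# Each list() call (like each for loop) asks the list for a brand-new iterator.
first_pass = list(numbers)
second_pass = list(numbers)

# An iterator keeps its position, so a second pass finds nothing left.
it = iter(numbers)
first_drain = list(it)
second_drain = list(it)

print(first_pass, second_pass)    # [2, 4, 6] [2, 4, 6]
print(first_drain, second_drain)  # [2, 4, 6] []
print(iter(it) is it)             # True: an iterator is its own iterator
```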
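PS:
A for loop works with any iterable object.
When a for loop starts, it passes the iterable to the built-in iter() function to obtain an iterator from it; the returned object has the required next() method.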
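The steps a for loop performs can be spelled out by hand. The helper manual_for below is hypothetical and only approximates the real loop (it ignores break and else, for example); it uses the built-in next(iterator), which works in Python 2.6+ and Python 3 alike.

```python
def manual_for(iterable, body):
    """Roughly what `for item in iterable: body(item)` does."""
    iterator = iter(iterable)      # step 1: obtain an iterator
    while True:
        try:
            item = next(iterator)  # step 2: fetch the next element
        except StopIteration:      # step 3: exhaustion ends the loop
            break
        body(item)


collected = []
manual_for("abc", collected.append)     # a str is iterable
manual_for({"x": 1}, collected.append)  # a dict iterates over its keys
print(collected)  # ['a', 'b', 'c', 'x']
```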
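Python generators
A generator can be understood as a data type that implements the iterator protocol: every generator is an iterator object (although not every iterator is a generator).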
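Generator function: defined like an ordinary Python function, except that it returns results with the yield keyword. Each yield statement returns one result; between results the function is suspended
and its execution context is saved, and the next call restores the suspended context and continues from where it left off.
What yield does:
1) It turns the function's result into an iterator, automatically providing __iter__ and __next__ (the latter is named next in Python 2).
2) It saves the context of the paused function and restores that context when the function runs again.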
>>> def fib(max):
...     n, a, b = 0, 0, 1
...     while n < max:
...         yield b
...         a, b = b, a + b
...         n = n + 1
...
>>> fib(6)
<generator object fib at 0x02BA6C88>
>>> fib(6).next()
1
>>> fib(6).next()
1
>>> fib(6).next()
1
>>> fib(6).next()
1
>>> fib(6).next()
1
>>> f = fib(6)
>>> f.next()
1
>>> f.next()
1
>>> f.next()
2
>>> f.next()
3
>>> f.next()
5
>>> f.next()
8
>>> f.next()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
StopIteration
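In the session above, fib(6).next() keeps returning 1 because every call to fib(6) creates a fresh generator that starts from the beginning; only the generator bound to f advances. To make the suspend-and-resume behavior visible, the sketch below records when each part of a generator's body actually runs. The steps function and the events list are made up for this illustration, and the code uses Python 3's next(gen).

```python
events = []

def steps():
    events.append("started")
    yield 1
    events.append("resumed after 1")
    yield 2
    events.append("finished")

gen = steps()              # calling the function runs none of its body
events.append("created")
first = next(gen)          # runs the body up to the first yield
events.append("got 1")
second = next(gen)         # resumes right after the first yield
events.append("got 2")
leftover = list(gen)       # runs the rest; list() absorbs StopIteration

print(events)
```

The recorded order is created, started, got 1, resumed after 1, got 2, finished: the body only advances when the caller asks for the next value.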
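Generator expression: a generator expression does not actually build the sequence of numbers; instead it returns a generator object that produces one value at a time.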
>>> nums = [x for x in range(5)]  # named nums/gen to avoid shadowing the built-ins list and iter
>>> nums
[0, 1, 2, 3, 4]
>>> gen = (x for x in range(5))
>>> gen
<generator object <genexpr> at 0x02BA6C88>
>>> gen.next()
0
>>> gen.next()
1
>>> gen.next()
2
>>> gen.next()
3
>>> gen.next()
4
>>> gen.next()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
StopIteration
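Because nothing is computed until a value is requested, a generator expression can even describe an unbounded sequence. The sketch below (Python 3 syntax, using itertools from the standard library) takes a few values from an endless stream of squares; the equivalent list comprehension would never finish.

```python
from itertools import count, islice

# count() never ends, so only a lazy expression can wrap it.
squares = (n * n for n in count())

first_five = list(islice(squares, 5))
next_two = list(islice(squares, 2))  # the generator continues where it stopped

print(first_five)  # [0, 1, 4, 9, 16]
print(next_two)    # [25, 36]
```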
With thanks to the following posts, which served as references:
https://www.zhihu.com/question/20829330
http://www.cnblogs.com/spiritman/
https://www.cnblogs.com/fanison/p/7109655.html
https://www.cnblogs.com/liuxiaowei/p/7226531.html?utm_source=itdadao&utm_medium=referral