Reading large files in Python

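There are three methods we commonly use to read a file:

.read(), .readline(), and .readlines(). Called without a size argument, .read() and .readlines() load the entire file into memory, so careless use on a large file can end in an out-of-memory error.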
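
To make the difference concrete, here is a small sketch of what each of the three calls returns. The temporary file and its three-line content are made up purely for illustration.

```python
import os
import tempfile

# Hypothetical sample file, created only for this demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a\nb\nc\n")
    sample_path = tmp.name

with open(sample_path) as f:
    whole_text = f.read()        # the entire file as one string

with open(sample_path) as f:
    first_line = f.readline()    # a single line, including its newline

with open(sample_path) as f:
    all_lines = f.readlines()    # every line, collected into one list

os.remove(sample_path)
```

Only .readline() keeps memory use bounded by the length of one line; the other two hold the full content at once.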

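In Python, the with statement opens the file and closes it again, even when the inner block raises an exception. In addition, for line in f treats the file object f as an iterator, which uses buffered I/O and manages memory automatically, so only one line needs to be held at a time and a large line-based file is not a concern. Leaving the work to the interpreter is the simplest approach.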

# If the file is line-based
with open('...') as f:
    for line in f:
        process(line)  # process is a placeholder for your own handling
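
The two claims above (the file object is its own iterator, and with closes the file even if the block raises) can be checked with a short sketch. The sample file and the simulated ValueError are assumptions made for the demonstration.

```python
import os
import tempfile

# Hypothetical sample file, created only for this demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\n")
    sample_path = tmp.name

seen = []
try:
    with open(sample_path) as f:
        # A file object returns itself from iter(), so it is its own iterator.
        is_own_iterator = iter(f) is f
        for line in f:
            seen.append(line)
            raise ValueError("simulated failure while processing")
except ValueError:
    pass

# The with block closed the file even though the loop raised.
closed_after_error = f.closed

os.remove(sample_path)
```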


Two other approaches are described below.

1. Read a batch of lines at a time with readlines(hint)

with open("sample.txt") as file:
    while True:
        # readlines(hint) returns whole lines totalling roughly `hint` characters
        lines = file.readlines(1000)
        if not lines:
            break
        for line in lines:
            pass  # do something
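
The same loop can be wrapped in a generator so that callers receive one batch of lines at a time. This is only a sketch: the name iter_line_batches, the hint value, and the 500-line sample file are assumptions for illustration.

```python
import os
import tempfile


def iter_line_batches(path, hint=1000):
    """Yield lists of whole lines; each list totals roughly `hint` characters."""
    with open(path) as f:
        while True:
            lines = f.readlines(hint)
            if not lines:
                break
            yield lines


# Hypothetical sample file, created only for this demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.writelines(f"line {i}\n" for i in range(500))
    sample_path = tmp.name

batch_sizes = [len(batch) for batch in iter_line_batches(sample_path, hint=1000)]
total_lines = sum(batch_sizes)

os.remove(sample_path)
```

Each batch contains only complete lines, so no line is ever split across two batches.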






2. Read fixed-size chunks with a generator function
def read_in_chunks(filePath, chunk_size=1024*1024):
    """
    Lazy function (generator) to read a file piece by piece.
    Default chunk size: 1M (counted in characters, since the file is opened in text mode)
    You can set your own chunk size
    """
    # The with statement closes the file once the generator is exhausted or closed
    with open(filePath) as file_object:
        while True:
            chunk_data = file_object.read(chunk_size)
            if not chunk_data:
                break
            yield chunk_data

if __name__ == "__main__":
    filePath = './path/filename'
    for chunk in read_in_chunks(filePath):
        process(chunk)  # process is a placeholder for your own handling
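
For files that are not text, a similar generator can open the file in binary mode so that chunk_size is counted in bytes. The sketch below is one possible variant, not part of the original code; the name read_binary_chunks and the 2500-byte sample file are assumptions for illustration.

```python
import os
import tempfile
from functools import partial


def read_binary_chunks(path, chunk_size=1024 * 1024):
    """Yield successive bytes objects of at most chunk_size bytes."""
    with open(path, "rb") as f:
        # iter(callable, sentinel) keeps calling f.read(chunk_size)
        # until it returns b"", which signals end of file.
        for chunk in iter(partial(f.read, chunk_size), b""):
            yield chunk


# Hypothetical sample file, created only for this demonstration.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"x" * 2500)
    sample_path = tmp.name

chunk_lengths = [len(c) for c in read_binary_chunks(sample_path, chunk_size=1024)]

os.remove(sample_path)
```

Only the last chunk is shorter than chunk_size, so memory use stays bounded by the chunk size no matter how large the file is.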


Original article: https://www.cnblogs.com/honglingjin/p/12705150.html