
How to compute the MD5 of a large file in Python

Python 3 ships with the hashlib module, which can compute MD5 digests. Here is a simple first example:

import hashlib

sstr="i love hanyu"
print(hashlib.md5(sstr).hexdigest())

Unfortunately, this fails with the following error:

C:\Python35\python.exe C:/pylearn/bottlelearn/3.py
Traceback (most recent call last):
  File "C:/pylearn/bottlelearn/3.py", line 4, in <module>
    print(hashlib.md5(sstr).hexdigest())
TypeError: Unicode-objects must be encoded before hashing

Process finished with exit code 1

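The reason is that the same text becomes different bytes under different encodings, and different bytes give different MD5 digests. hashlib therefore requires the string to be explicitly encoded to bytes before it is hashed. Encoding before hashing looks like this: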

import hashlib
import random

print(hashlib.sha256(str(random.getrandbits(256)).encode('utf-8')).hexdigest())

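import hashlib

# hash_file and wordlist are file paths that must be defined beforehand.
with open(hash_file) as file:
    control_hash = file.readline().rstrip("\n")

# Open the word list in binary mode so each line is already bytes.
wordlistfile = open(wordlist, "rb")
# ...
for line in wordlistfile:
    if hashlib.md5(line.rstrip(b'\r\n')).hexdigest() == control_hash:
        pass  # this line hashes to control_hash; handle the match here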
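
Applied to the first example, the fix is to encode the string before passing it to `hashlib.md5`. The sketch below assumes UTF-8 is the intended encoding. It also hashes the same text as UTF-16 (chosen only for illustration) to show that the encoding changes the digest, which is why Python will not pick one silently.

```python
import hashlib

sstr = "i love hanyu"

# md5() accepts bytes, not str, so encode explicitly first.
utf8_digest = hashlib.md5(sstr.encode("utf-8")).hexdigest()
print(utf8_digest)

# The same text under another encoding is a different byte sequence,
# so its digest is different as well.
utf16_digest = hashlib.md5(sstr.encode("utf-16")).hexdigest()
print(utf8_digest == utf16_digest)  # False
```

Whichever encoding is chosen, it has to be used consistently on both sides when digests are compared.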

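Now for the MD5 of a large file. Simply reading the whole file into memory causes serious problems when the file is large, so the code below reads the file and feeds it to the hasher one block at a time: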

import hashlib

def hash_bytestr_iter(bytesiter, hasher, ashexstr=False):
    for block in bytesiter:
        hasher.update(block)
    return (hasher.hexdigest() if ashexstr else hasher.digest())

def file_as_blockiter(afile, blocksize=65536):
    with afile:
        block = afile.read(blocksize)
        while len(block) > 0:
            yield block
            block = afile.read(blocksize)


# fnamelst is assumed to be a list of file paths defined elsewhere.
[(fname, hash_bytestr_iter(file_as_blockiter(open(fname, 'rb')), hashlib.md5()))
    for fname in fnamelst]
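
For a single file, the same block-by-block idea can be written more compactly. This is a minimal sketch, not the original author's code: the helper name `md5_of_file` and the 64 KiB default block size are illustrative assumptions. It relies on the two-argument form of the built-in `iter()`, which keeps calling `read` until it returns empty bytes.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, blocksize=65536):
    """Return the hex MD5 of the file at `path`, reading it in blocks."""
    hasher = hashlib.md5()
    with open(path, "rb") as f:
        # iter(callable, sentinel) calls f.read(blocksize) repeatedly
        # and stops when it returns b"" (end of file).
        for block in iter(lambda: f.read(blocksize), b""):
            hasher.update(block)
    return hasher.hexdigest()


# Usage: write a throwaway file larger than one block and hash it.
data = os.urandom(200_000)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
chunked = md5_of_file(tmp.name)
os.remove(tmp.name)

# The chunked digest matches hashing all the bytes at once.
print(chunked == hashlib.md5(data).hexdigest())  # True
```

Only one block is held in memory at a time, so memory use stays roughly constant no matter how large the file is.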


Original article: https://www.cnblogs.com/aomi/p/7047214.html