Processing images with multiple processes in Python to make full use of the CPU

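By default, a Python program runs as a single process on one CPU. But a computer bought in the last few years usually has a quad-core processor, which with hyper-threading shows up as 8 logical CPUs. That means that while you sit waiting for your Python script to finish its data processing, nearly 90% of your machine's computing capacity is idle!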
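To check how many logical CPUs Python can see on a particular machine, something along these lines should work. It is only a quick sketch: `logical_cpus` is a name picked here for illustration, and `os.cpu_count()` may return None on some platforms, hence the fallback to 1.

```python
import os

# os.cpu_count() reports logical CPUs (hardware threads), not physical cores.
# It can return None when the count cannot be determined, so fall back to 1.
logical_cpus = os.cpu_count() or 1

print(f"Logical CPUs visible to Python: {logical_cpus}")
```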

Thanks to Python's concurrent.futures module, about 3 lines of code are enough to turn an ordinary data-processing script into one that processes data in parallel!

The ordinary way to process data in Python

Say we have a folder containing 2000 color images, and we want to use Python to convert each one to grayscale.
import glob
import time

import cv2


def process_image(filename):
    # Read the image from disk and convert it from BGR color to grayscale
    img = cv2.imread(filename)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return img


if __name__ == "__main__":
    start = time.time()
    filenames = glob.glob("train/*.jpg")
    # Process the images one at a time, numbering the output files from 1
    for i, filename in enumerate(filenames, start=1):
        img = process_image(filename)
        cv2.imwrite("result/" + str(i) + ".jpg", img)
    print(time.time() - start)
This approach takes about 220 seconds!

Trying multiple processes

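When you use the threading and multiprocessing modules directly, the start() and join() calls cannot be left out, and sometimes you also need a Queue. As requirements get more complex, unless this plumbing is abstracted away by good design, the amount of code keeps growing and debugging gets harder and harder. The concurrent.futures module abstracts these steps so that we do not have to deal with the details.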
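To make that bookkeeping concrete, here is a minimal sketch of the manual style using threading and queue.Queue. The names `square_into_queue` and `run_manually` are made up for this illustration, and squaring a number stands in for real work such as converting an image.

```python
import queue
import threading


def square_into_queue(n, results):
    # Each worker pushes (input, output) onto the shared queue.
    results.put((n, n * n))


def run_manually(values):
    results = queue.Queue()
    threads = [
        threading.Thread(target=square_into_queue, args=(v, results))
        for v in values
    ]
    for t in threads:
        t.start()  # without start() the work never runs
    for t in threads:
        t.join()   # without join() we might read the results too early
    # Threads finish in arbitrary order, so restore the input order by hand.
    collected = {}
    while not results.empty():
        n, squared = results.get()
        collected[n] = squared
    return [collected[v] for v in values]


if __name__ == "__main__":
    print(run_manually([1, 2, 3, 4]))
```

Starting, joining, queueing and re-ordering are all handled by hand here, and that is the part an executor takes over.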
concurrent.futures mainly provides two classes: ThreadPoolExecutor for multithreading and ProcessPoolExecutor for multiprocessing.

The Executor class is abstract and should not be used directly; call its methods through its subclasses ThreadPoolExecutor and ProcessPoolExecutor.
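Because both subclasses share the Executor interface, the same code can be pointed at either one. The following is a small sketch of that idea; `square` and `run_with` are hypothetical helper names, and `max_workers=2` is an arbitrary choice for the demo.

```python
import concurrent.futures


def square(n):
    # Defined at module level so ProcessPoolExecutor can pickle it.
    return n * n


def run_with(executor_cls, values):
    # submit() and map() come from the shared Executor base class.
    with executor_cls(max_workers=2) as executor:
        future = executor.submit(square, values[0])    # schedule a single call
        first = future.result()                        # block until it is done
        rest = list(executor.map(square, values[1:]))  # results keep input order
    return [first] + rest


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(run_with(concurrent.futures.ThreadPoolExecutor, data))
    print(run_with(concurrent.futures.ProcessPoolExecutor, data))
```

For CPU-bound work such as image conversion, the process pool is usually the one that helps; the thread pool is shown only to illustrate that the calls are the same.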
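Note: create the process pool under if __name__ == "__main__":. On platforms that start worker processes by spawning a fresh interpreter (Windows, and macOS by default), each child re-imports the main module, so creating the pool outside this guard raises an error.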
import concurrent.futures
import glob
import time

import cv2


def process_image(filename):
    img = cv2.imread(filename)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return img


if __name__ == "__main__":
    start = time.time()
    filenames = glob.glob("train/*.jpg")
    # Leaving the with block shuts the pool down and waits for the workers
    with concurrent.futures.ProcessPoolExecutor() as p:
        # p.map() takes the helper function and the list of data to process.
        # It does all the tedious work for us: splitting the work across the
        # child processes, running them, and merging the results in input order.
        result = p.map(process_image, filenames)
        for i, processed_img in enumerate(result, start=1):
            cv2.imwrite("result/" + str(i) + ".jpg", processed_img)
    print(time.time() - start)
With this approach, processing the 2000 images takes about 108 seconds, roughly half the time of the serial version.
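One possible refinement, which the original script does not do: map() sends every converted image back to the parent process, and each array has to be pickled on the way. Letting each worker write its own output file and return only the path avoids that transfer, and a chunksize above 1 can reduce per-task overhead. The sketch below sticks to the standard library so it runs without OpenCV; `convert_and_save`, `process_all`, the byte-level "conversion" and `chunksize=4` are all stand-ins chosen for illustration, not measured recommendations.

```python
import concurrent.futures
import os
import tempfile


def convert_and_save(job):
    # Stand-in for the real work. In the image script this would be
    # cv2.imread -> cv2.cvtColor -> cv2.imwrite; here we just transform bytes.
    src_path, dst_path = job
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(data.upper())
    return dst_path  # only a short string travels back to the parent


def process_all(jobs, executor_cls=concurrent.futures.ProcessPoolExecutor,
                chunksize=4):
    # chunksize batches several jobs per message to a worker process;
    # ThreadPoolExecutor accepts the argument but ignores it.
    with executor_cls() as executor:
        return list(executor.map(convert_and_save, jobs, chunksize=chunksize))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        jobs = []
        for i in range(8):
            src = os.path.join(tmp, f"in_{i}.txt")
            with open(src, "wb") as f:
                f.write(b"pixel data")
            jobs.append((src, os.path.join(tmp, f"out_{i}.txt")))
        written = process_all(jobs)
        print(len(written), "files written")
```

Whether this actually beats returning the images depends on image size and disk speed, so it is worth timing both variants on your own data.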

Original article: https://www.cnblogs.com/lzq116/p/12011685.html
