zoukankan      html  css  js  c++  java
  • Python标准模块--concurrent.futures(进程池,线程池)

    python为我们提供的标准模块concurrent.futures里面有ThreadPoolExecutor(线程池)和ProcessPoolExecutor(进程池)两个模块. 在这个模块里他们俩在用法上是一样的.

    concurrent.futures官方文档: https://docs.python.org/dev/library/concurrent.futures.html

    #1 介绍
    concurrent.futures模块提供了高度封装的异步调用接口
    ThreadPoolExecutor:线程池,提供异步调用
    ProcessPoolExecutor: 进程池,提供异步调用
    Both implement the same interface, which is defined 
    by the abstract Executor class. #2 基本方法 #submit(fn, *args, **kwargs) 异步提交任务 #map(func, *iterables, timeout=None, chunksize=1) 取代for循环submit的操作 #shutdown(wait=True) 相当于进程池的pool.close()+pool.join()操作 wait=True,等待池内所有任务执行完毕回收完资源后才继续 wait=False,立即返回,并不会等待池内的任务执行完毕 但不管wait参数为何值,整个程序都会等到所有任务执行完毕 submit和map必须在shutdown之前 #result(timeout=None) 取得结果 #add_done_callback(fn) 回调函数
    #介绍
    The ProcessPoolExecutor class is an Executor subclass that uses a pool of processes to execute calls asynchronously. ProcessPoolExecutor uses the multiprocessing module, which allows it to side-step the Global Interpreter Lock but also means that only picklable objects can be executed and returned.
    
    class concurrent.futures.ProcessPoolExecutor(max_workers=None, mp_context=None)
    An Executor subclass that executes calls asynchronously using a pool of at most max_workers processes. If max_workers is None or not given, it will default to the number of processors on the machine. If max_workers is lower or equal to 0, then a ValueError will be raised.
    
    
    # 用法示例
    from concurrent.futures import ThreadPoolExecutor
    import time
    
    def func(n):
        time.sleep(1)
        print(">>>", n)
        return n*n
    
    if __name__ == '__main__':
        t_pool = ThreadPoolExecutor(max_workers=5) # 线程池中最多不要超过cup个数*5
        t_list = []
        for i in range(20):
            res = t_pool.submit(func, i)
            t_list.append(res)
        t_pool.shutdown()   # 等待子线程结束, 再执行父进程 相当于相当于进程池的pool.close()+pool.join()操作
        for resl in t_list:
            print(resl.result()) # 结果是有序的, 这是因为t_list中的元素就是
                                # 有序的,所以循环迭代从结果对象中取出的值也是有序的
    ThreadPoolExecutor
    #介绍
    ThreadPoolExecutor is an Executor subclass that uses a pool of threads to execute calls asynchronously.
    class concurrent.futures.ThreadPoolExecutor(max_workers=None, thread_name_prefix='')
    An Executor subclass that uses a pool of at most max_workers threads to execute calls asynchronously.
    
    Changed in version 3.5: If max_workers is None or not given, it will default to the number of processors on the machine, multiplied by 5, assuming that ThreadPoolExecutor is often used to overlap I/O instead of CPU work and the number of workers should be higher than the number of workers for ProcessPoolExecutor.
    
    New in version 3.6: The thread_name_prefix argument was added to allow users to control the threading.Thread names for worker threads created by the pool for easier debugging.
    
    #用法
    与ThreadPoolExecutor相同, 将ThreadPoolExecutor换成Process就可以了
    ProcessPoolExecutor
    from concurrent.futures import ThreadPoolExecutor
    import time
    
    def func(n):
        time.sleep(1)
        print(">>>", n)
        return n*n
    
    if __name__ == '__main__':
        t_pool = ThreadPoolExecutor(max_workers=5)
        res_g = t_pool.map(func,range(20))# 取代了for + submit 得到的结果是一个生成器对象
        t_pool.shutdown()
        print("主线程")
        for ress in res_g:
            print(ress)
    map用法示例
    from concurrent.futures import ThreadPoolExecutor,ProcessPoolExecutor
    from multiprocessing import Pool
    import requests
    import json
    import os
    
    def get_page(url):
        print('<进程%s> get %s' %(os.getpid(),url))
        respone=requests.get(url)
        if respone.status_code == 200:
            return {'url':url,'text':respone.text}
    
    def parse_page(res):
        res=res.result()
        print('<进程%s> parse %s' %(os.getpid(),res['url']))
        parse_res='url:<%s> size:[%s]
    ' %(res['url'],len(res['text']))
        with open('db.txt','a') as f:
            f.write(parse_res)
    
    
    if __name__ == '__main__':
        urls=[
            'https://www.baidu.com',
            'https://www.python.org',
            'https://www.openstack.org',
            'https://help.github.com/',
            'http://www.sina.com.cn/'
        ]
    
        # p=Pool(3)
        # for url in urls:
        #     p.apply_async(get_page,args=(url,),callback=pasrse_page)
        # p.close()
        # p.join()
    
        p=ProcessPoolExecutor(3)
        for url in urls:
            p.submit(get_page,url).add_done_callback(parse_page) #parse_page拿到的是一个future对象obj,需要用obj.result()拿到结果
    回调函数
  • 相关阅读:
    【转载】openCV轮廓操作
    求两个已排序数组的中位数
    朴素贝叶斯分类
    Different Ways to Add Parentheses
    QSerialPort
    opencv鼠标绘制直线 C++版
    Word Break
    C++中 指针 与 引用 的区别
    敲入url到浏览器后会发生什么
    Sort List 分类: leetcode 算法 2015-07-10 15:35 1人阅读 评论(0) 收藏
  • 原文地址:https://www.cnblogs.com/af1y/p/10060552.html
Copyright © 2011-2022 走看看