  • Synchronous submission, asynchronous submission

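    Two ways to submit a task:
      Synchronous call: after submitting a task, wait in place until the task has run to completion and returned its result, and only then execute the next line of code. As a result, the tasks run serially.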
    from concurrent.futures import ThreadPoolExecutor,ProcessPoolExecutor
    import time,random,os
    
    def task(name,n):
        print('%s%s is running' %(name,os.getpid()))
        time.sleep(random.randint(1,3))
        return n**2
    
    if __name__ == '__main__':
        # print(os.cpu_count())  # check the number of CPUs
        p=ProcessPoolExecutor(4)
    
        for i in range(10):
            # Synchronous submission: result() blocks until this task has finished
            res=p.submit(task,'process pid: ',i).result()
            print(res)
        print("")
    
    Result:
    process pid: 10720 is running
    0
    process pid: 10724 is running
    1
    process pid: 5948 is running
    4
    process pid: 2068 is running
    9
    process pid: 10720 is running
    16
    process pid: 10724 is running
    25
    process pid: 5948 is running
    36
    process pid: 2068 is running
    49
    process pid: 10720 is running
    64
    process pid: 10724 is running
    81
    Synchronous submission
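The waiting in the loop above comes from `result()`, not from `submit()`. A minimal sketch of that distinction follows. It uses a thread pool and a made-up `slow_square` helper (both are assumptions, chosen so the sketch runs quickly anywhere); the `Future` interface is the same for `ProcessPoolExecutor`.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def slow_square(n):
    # Hypothetical stand-in for task() above, with a short fixed delay.
    time.sleep(0.2)
    return n ** 2


def inspect_future():
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_square, 3)      # returns a Future right away
        done_right_after_submit = future.done()   # task is normally still running
        value = future.result()                   # blocks until the task finishes
        done_after_result = future.done()
    return done_right_after_submit, value, done_after_result


if __name__ == '__main__':
    print(inspect_future())
```

Chaining `.result()` directly onto `submit()` therefore turns every submission into a wait, which is why the pids in the result above take turns one at a time.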
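      Asynchronous call: after submitting a task, do not wait in place; execution continues straight to the next line of code, so the tasks run concurrently.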
    from concurrent.futures import ThreadPoolExecutor,ProcessPoolExecutor
    import time,random,os
    
    def task(name,n):
        print('%s%s is running' %(name,os.getpid()))
        time.sleep(random.randint(1,3))
        return n**2
    
    if __name__ == '__main__':
        p=ProcessPoolExecutor(4)
        l = []
        for i in range(10):
            # Asynchronous submission: keep the future and move on without waiting
            future = p.submit(task, 'process pid: ', i)
            l.append(future)
        p.shutdown(wait=True) # shutdown closes the pool's entrance (no new tasks can be submitted) and waits in place until every task in the pool has finished
    
        for future in l:
            print(future.result())
        print("")
    
    Result:
    process pid: 10956 is running
    process pid: 11040 is running
    process pid: 10552 is running
    process pid: 11332 is running
    process pid: 10552 is running
    process pid: 11040 is running
    process pid: 10552 is running
    process pid: 10956 is running
    process pid: 11332 is running
    process pid: 10956 is running
    0
    1
    4
    9
    16
    25
    36
    49
    64
    81
    Asynchronous submission
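To make the difference between the two submission styles measurable, here is a rough timing sketch. It is only an illustration under assumptions: `work`, `run_sync`, `run_async` and `timed` are made-up names, the delay is a fixed 0.1 s rather than a random one, and a thread pool is used so the sketch stays lightweight.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def work(n):
    # Hypothetical task: a fixed 0.1 s delay instead of random.randint(1, 3).
    time.sleep(0.1)
    return n ** 2


def run_sync(pool, items):
    # Calling result() right after submit() waits for each task in turn.
    return [pool.submit(work, n).result() for n in items]


def run_async(pool, items):
    # Submit everything first, then collect the results in submission order.
    futures = [pool.submit(work, n) for n in items]
    return [f.result() for f in futures]


def timed(fn, *args):
    # Return the function's result together with the elapsed wall-clock time.
    start = time.perf_counter()
    out = fn(*args)
    return out, time.perf_counter() - start


if __name__ == '__main__':
    with ThreadPoolExecutor(max_workers=4) as pool:
        sync_out, sync_secs = timed(run_sync, pool, range(8))
        async_out, async_secs = timed(run_async, pool, range(8))
    print(sync_out == async_out)
    print('sync %.2fs, async %.2fs' % (sync_secs, async_secs))
```

Both functions return the same list; only the elapsed time differs. Leaving the `with` block calls `shutdown(wait=True)`, the same call that is written out explicitly in the listing above.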

     Examples:

    # Crawl websites concurrently with a process pool
    from concurrent.futures import ProcessPoolExecutor
    import time,os
    import requests
    
    
    def get(url):
        print('%s GET %s' %(os.getpid(),url))
        time.sleep(3)
        response=requests.get(url)
        if response.status_code == 200:
            res=response.text
        else:
            res='download failed'
        # parse runs inside the same worker process, right after the download
        parse(res)
    
    def parse(res):
        time.sleep(1)
        print('%s parse result is %s' %(os.getpid(),len(res)))
    
    if __name__ == '__main__':
        urls=[
            'https://www.baidu.com',
            'https://www.sina.com.cn',
            'https://www.tmall.com',
            'https://www.jd.com',
            'https://www.python.org',
            'https://www.openstack.org',
            'https://www.baidu.com',
            'https://www.baidu.com',
            'https://www.baidu.com',
    
        ]
    
        p=ProcessPoolExecutor(9)
        l=[]
        start=time.time()
        for url in urls:
            future=p.submit(get,url)
            l.append(future)
        p.shutdown(wait=True)
    
        print('main',time.time()-start)
    Result:
    11952 GET https://www.baidu.com
    11992 GET https://www.sina.com.cn
    7136 GET https://www.tmall.com
    11984 GET https://www.jd.com
    11948 GET https://www.python.org
    5952 GET https://www.openstack.org
    12056 GET https://www.baidu.com
    11128 GET https://www.baidu.com
    11728 GET https://www.baidu.com
    11952 parse result is 2443
    11992 parse result is 578360
    7136 parse result is 217570
    11984 parse result is 90905
    12056 parse result is 2443
    11128 parse result is 2443
    11728 parse result is 2443
    11948 parse result is 48413
    5952 parse result is 66284
    main 10.874621868133545
    Crawling websites concurrently with a process pool
    from concurrent.futures import ProcessPoolExecutor
    import time,os
    import requests
    
    # Concurrent download, asynchronous submission
    def get(url):
        print('%s GET %s' %(os.getpid(),url))
        time.sleep(3)
        response=requests.get(url)
        if response.status_code == 200:
            res=response.text
        else:
            res='download failed'
        return res
    
    def parse(future):
        time.sleep(1)
        res=future.result()
        print('%s parse result is %s' %(os.getpid(),len(res)))
    
    if __name__ == '__main__':
        urls=[
            'https://www.baidu.com',
            'https://www.sina.com.cn',
            'https://www.tmall.com',
            'https://www.jd.com',
            'https://www.python.org',
            'https://www.openstack.org',
            'https://www.baidu.com',
            'https://www.baidu.com',
            'https://www.baidu.com',
    
        ]
    
        p=ProcessPoolExecutor(9)
    
        start=time.time()
        for url in urls:
            future=p.submit(get,url)
            # Asynchronous call: after submitting a task, do not wait in place but go straight to the next line, so the tasks run concurrently; the resulting future object is passed to the callback automatically once the task has finished
            future.add_done_callback(parse)  # parse is triggered automatically when the task finishes and receives one argument, the future object
            # add_done_callback registers a callback function; with a process pool the callback runs in the main process, which is why every parse line in the result shows the same pid
        p.shutdown(wait=True)
    
        print('main',time.time()-start)
        print('main',os.getpid())
    
    Result:
    10056 GET https://www.baidu.com
    11584 GET https://www.sina.com.cn
    3624 GET https://www.tmall.com
    12116 GET https://www.jd.com
    10072 GET https://www.python.org
    9784 GET https://www.openstack.org
    11924 GET https://www.baidu.com
    10272 GET https://www.baidu.com
    3084 GET https://www.baidu.com
    11744 parse result is 2443
    11744 parse result is 217570
    11744 parse result is 90981
    11744 parse result is 2443
    11744 parse result is 2443
    11744 parse result is 2443
    11744 parse result is 66304
    11744 parse result is 578616
    11744 parse result is 48181
    main 24.2570583820343
    main 11744
    Crawling websites asynchronously with a process pool, version 2
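In the callback version, `future.result()` re-raises inside `parse` any exception the task raised (for example a connection error from `requests.get`), and an exception escaping a callback is only logged. A hedged sketch of a more defensive callback follows. The names `fetch`, `make_callback`, `on_done` and `crawl` and the `.example` URLs are all hypothetical, no network access is made, and a thread pool stands in for the process pool.

```python
from concurrent.futures import ThreadPoolExecutor

results = {}
errors = {}


def fetch(url):
    # Hypothetical downloader: fails for one URL instead of doing real I/O.
    if 'bad' in url:
        raise ConnectionError('cannot reach %s' % url)
    return 'page body of %s' % url


def make_callback(url):
    def on_done(future):
        # exception() returns the exception raised by the task, or None.
        exc = future.exception()
        if exc is not None:
            errors[url] = exc
        else:
            results[url] = len(future.result())
    return on_done


def crawl(urls):
    with ThreadPoolExecutor(max_workers=2) as pool:
        for url in urls:
            pool.submit(fetch, url).add_done_callback(make_callback(url))
    # Leaving the with block waits for every task to finish.


if __name__ == '__main__':
    crawl(['http://good.example', 'http://bad.example'])
    print(results, errors)
```

Checking `future.exception()` first keeps one failed download from hiding silently in the log while the other results are still parsed.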
    from concurrent.futures import ThreadPoolExecutor
    from threading import current_thread
    import time,requests
    
    
    
    def get(url):
        print('%s GET %s' %(current_thread().name,url))
        time.sleep(3)
        response=requests.get(url)
        if response.status_code == 200:
            res=response.text
        else:
            res='download failed'
        return res
    
    def parse(future):
        time.sleep(1)
        res=future.result()
        print('%s parse result is %s' %(current_thread().name,len(res)))
    
    if __name__ == '__main__':
        urls=[
            'https://www.baidu.com',
            'https://www.sina.com.cn',
            'https://www.tmall.com',
            'https://www.jd.com',
            'https://www.python.org',
            'https://www.openstack.org',
            'https://www.baidu.com',
            'https://www.baidu.com',
            'https://www.baidu.com',
    
        ]
    
        p=ThreadPoolExecutor(4)
    
        for url in urls:
            future=p.submit(get,url)
            future.add_done_callback(parse)
    
        p.shutdown(wait=True)
    
        print('main',current_thread().name)
    
    # Blocking: when a task hits an I/O operation it enters the blocked state and waits until the I/O completes.
    
    Result:
    ThreadPoolExecutor-0_0 GET https://www.baidu.com
    ThreadPoolExecutor-0_1 GET https://www.sina.com.cn
    ThreadPoolExecutor-0_2 GET https://www.tmall.com
    ThreadPoolExecutor-0_3 GET https://www.jd.com
    
    ThreadPoolExecutor-0_3 parse result is 90936
    ThreadPoolExecutor-0_3 GET https://www.python.org
    ThreadPoolExecutor-0_0 parse result is 2443
    ThreadPoolExecutor-0_0 GET https://www.openstack.org
    ThreadPoolExecutor-0_2 parse result is 217570
    ThreadPoolExecutor-0_2 GET https://www.baidu.com
    ThreadPoolExecutor-0_1 parse result is 578616
    ThreadPoolExecutor-0_1 GET https://www.baidu.com
    
    ThreadPoolExecutor-0_2 parse result is 2443
    ThreadPoolExecutor-0_2 GET https://www.baidu.com
    ThreadPoolExecutor-0_1 parse result is 2443
    ThreadPoolExecutor-0_0 parse result is 66304
    ThreadPoolExecutor-0_3 parse result is 48181
    
    ThreadPoolExecutor-0_2 parse result is 2443
    main MainThread
    Crawling websites with a thread pool
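In the thread pool result above, each parse line appears on the same pool thread that just finished the download. A small sketch that records this behaviour follows. The `download`, `record` and `run` names, the `crawler` thread name prefix and the one-letter "URLs" are all assumptions made for illustration, and the network call is replaced by a short sleep.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import current_thread
import time

task_threads = {}
callback_threads = {}


def download(url):
    time.sleep(0.1)  # stand-in for network I/O
    task_threads[url] = current_thread().name
    return url


def record(future):
    # Runs on the worker thread that completed the task. If the future had
    # already finished when the callback was added, it would instead run
    # immediately in the thread that called add_done_callback().
    callback_threads[future.result()] = current_thread().name


def run(urls):
    with ThreadPoolExecutor(max_workers=2, thread_name_prefix='crawler') as pool:
        for url in urls:
            pool.submit(download, url).add_done_callback(record)


if __name__ == '__main__':
    run(['a', 'b', 'c'])
    print(task_threads == callback_threads)
```

Because the callback occupies a worker thread, a slow callback such as the one-second `parse` above delays the next download that thread would otherwise pick up.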
  • Original article: https://www.cnblogs.com/zhouhao123/p/11154493.html