  • Implementing concurrency with the gevent and twisted modules

    The drawback of multithreading and multiprocessing is that threads and processes are wasted while they sit blocked on IO, so asynchronous IO is usually the better choice. The common options are:

    I. Asynchronous IO

    1. asyncio + aiohttp + requests (a minimal asyncio + aiohttp sketch follows this list)

    2. gevent + requests + grequests

    3. twisted

    4. tornado

    5. asyncio

    6. gevent + requests

    7. grequests
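
    Of these, only the gevent, grequests, and twisted approaches are demonstrated in this post. For options 1 and 5, a minimal asyncio + aiohttp sketch might look roughly like this (my own illustration, assuming Python 3.7+ and that aiohttp is installed; it is not part of the original examples):

    import asyncio
    import aiohttp

    async def fetch_async(url):
        # each coroutine opens its own ClientSession here, for simplicity
        async with aiohttp.ClientSession() as session:
            async with session.get(url) as response:
                body = await response.read()
                print(response.status, url, len(body))

    async def main():
        urls = ["https://www.python.org/", "https://www.yahoo.com/", "https://github.com/"]
        # run all fetches concurrently on the asyncio event loop
        await asyncio.gather(*(fetch_async(u) for u in urls))

    asyncio.run(main())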

    gevent + requests

    import gevent
    from gevent import monkey
    # Replace the blocking parts of the standard library (socket, ssl, ...) with
    # gevent's cooperative, non-blocking versions; patch before importing requests
    monkey.patch_all()
    import requests

    def fetch_async(method, url, req_kwargs):
        print(method, url, req_kwargs)
        response = requests.request(method=method, url=url, **req_kwargs)
        print(response.url, response.content)

    # Send the requests; each spawn below runs as its own greenlet (coroutine)
    gevent.joinall([
        gevent.spawn(fetch_async, method="get", url="https://www.python.org/", req_kwargs={}),
        gevent.spawn(fetch_async, method="get", url="https://www.yahoo.com/", req_kwargs={}),
        gevent.spawn(fetch_async, method="get", url="https://github.com/", req_kwargs={}),
    ])
    

    Crawling sites with gevent + urllib:

    import gevent
    from gevent import monkey
    # Patch the standard library so urllib's sockets become non-blocking
    monkey.patch_all()
    import urllib.request

    def run_task(url):
        print("Visit --> %s" % url)
        try:
            response = urllib.request.urlopen(url)
            data = response.read()
            print("%d bytes received from %s." % (len(data), url))
        except Exception as e:
            print(e)

    if __name__ == '__main__':
        urls = ['https://www.baidu.com',
                'https://docs.python.org/3/library/urllib.html',
                'https://www.cnblogs.com/wangmo/p/7784867.html']
        greenlets = [gevent.spawn(run_task, url) for url in urls]
        gevent.joinall(greenlets)
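
    If run_task returns the byte count instead of just printing it, the result of each greenlet can be read from its .value attribute after joinall — a small follow-up sketch (the return value is my own addition, not part of the original example):

    def run_task(url):
        try:
            return len(urllib.request.urlopen(url).read())
        except Exception as e:
            return e

    greenlets = [gevent.spawn(run_task, url) for url in urls]
    gevent.joinall(greenlets)
    # Greenlet.value holds whatever the spawned function returned
    print([g.value for g in greenlets])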
    

    Using a gevent Pool to cap the number of concurrent greenlets

    import gevent
    from gevent import monkey
    # Patch the standard library so requests' sockets become non-blocking
    monkey.patch_all()
    import requests
    from gevent.pool import Pool

    def fetch_async(method, url, req_kwargs):
        print(method, url, req_kwargs)
        response = requests.request(method=method, url=url, **req_kwargs)
        print(response.url, response.content)

    # Send the requests through a pool that allows at most 3 greenlets at a time
    pool = Pool(3)
    gevent.joinall([
        pool.spawn(fetch_async, method="get", url="https://www.python.org/", req_kwargs={}),
        pool.spawn(fetch_async, method="get", url="https://www.yahoo.com/", req_kwargs={}),
        pool.spawn(fetch_async, method="get", url="https://github.com/", req_kwargs={}),
    ])
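
    With only three URLs the limit of 3 is not really exercised; for a longer list, Pool.map is a more compact way to drive the same function while still capping concurrency — a small sketch reusing fetch_async from above (the url_list here is just an illustration):

    url_list = ["https://www.python.org/", "https://www.yahoo.com/", "https://github.com/"]
    pool = Pool(3)
    # Pool.map blocks until every URL has been fetched and never runs
    # more than 3 greenlets at the same time
    pool.map(lambda u: fetch_async("get", u, {}), url_list)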
    

    grequests already calls gevent.joinall internally:

    import grequests

    request_list = [
        grequests.get('http://httpbin.org/delay/1', timeout=0.001),
        grequests.get('http://fakedomain/'),
        grequests.get('http://httpbin.org/status/500')
    ]
    # Send all requests concurrently and collect the list of responses
    response_list = grequests.map(request_list)
    print(response_list)
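
    The three requests above are deliberately problematic: the tiny timeout and the bogus domain raise exceptions and show up as None in response_list, while the 500 still comes back as a normal Response object. grequests.map also accepts an exception_handler callback to observe the failures — a brief sketch:

    def exception_handler(request, exception):
        # Called once for each request that raised instead of returning a response
        print("request failed:", exception)

    response_list = grequests.map(request_list, exception_handler=exception_handler)
    print(response_list)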
    

    twisted

    1. The event loop keeps looping, waiting for the content of the outstanding requests to come back.

    2. The event loop keeps running even after every request has returned a result, so you have to check whether the number of responses received equals the number of requests sent and, when it does, stop the event loop with reactor.stop():

    # getPage sends an HTTP request and returns a Deferred
    # (it is deprecated in newer Twisted releases in favour of twisted.web.client.Agent)
    from twisted.web.client import getPage
    # the reactor is the event loop
    from twisted.internet import reactor

    REV_COUNTER = 0
    REQ_COUNTER = 0

    def callback(contents):
        print(contents)
        global REV_COUNTER
        REV_COUNTER += 1
        if REV_COUNTER == REQ_COUNTER:
            # all requests have been answered, so stop the event loop
            reactor.stop()

    url_list = ['http://www.bing.com', 'http://www.baidu.com', ]
    REQ_COUNTER = len(url_list)
    for url in url_list:
        deferred = getPage(bytes(url, encoding="utf8"))
        deferred.addCallback(callback)

    # run the event loop and wait for the responses
    reactor.run()
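
    The manual REQ_COUNTER / REV_COUNTER bookkeeping can also be replaced with twisted.internet.defer.DeferredList, which fires once every wrapped Deferred has a result — a minimal sketch along the same lines (still using getPage, as above):

    from twisted.web.client import getPage
    from twisted.internet import reactor, defer

    def callback(contents):
        print(contents)

    url_list = ['http://www.bing.com', 'http://www.baidu.com']
    deferred_list = []
    for url in url_list:
        d = getPage(bytes(url, encoding="utf8"))
        d.addCallback(callback)
        deferred_list.append(d)

    # DeferredList fires after every request has either succeeded or failed,
    # so the reactor can be stopped without counting responses by hand
    defer.DeferredList(deferred_list).addBoth(lambda _: reactor.stop())
    reactor.run()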
    
    
  • Original post: https://www.cnblogs.com/venvive/p/11657228.html