zoukankan      html  css  js  c++  java
  • scrapy代理-加请求头

    代理,需要在环境变量中设置
        from scrapy.contrib.downloadermiddleware.httpproxy import HttpProxyMiddleware
        
        方式一:使用默认
            os.environ
            {
                http_proxy:http://root:wowowo@192.168.11.11:9999/
                https_proxy:http://192.168.11.11:9999/
            }
        方式二:使用自定义下载中间件
        
        def to_bytes(text, encoding=None, errors='strict'):
            if isinstance(text, bytes):
                return text
            if not isinstance(text, six.string_types):
                raise TypeError('to_bytes must receive a unicode, str or bytes '
                                'object, got %s' % type(text).__name__)
            if encoding is None:
                encoding = 'utf-8'
            return text.encode(encoding, errors)
            
        class ProxyMiddleware(object):
            def process_request(self, request, spider):
                PROXIES = [
                    {'ip_port': '111.11.228.75:80', 'user_pass': ''},
                    {'ip_port': '120.198.243.22:80', 'user_pass': ''},
                    {'ip_port': '111.8.60.9:8123', 'user_pass': ''},
                    {'ip_port': '101.71.27.120:80', 'user_pass': ''},
                    {'ip_port': '122.96.59.104:80', 'user_pass': ''},
                    {'ip_port': '122.224.249.122:8088', 'user_pass': ''},
                ]
                proxy = random.choice(PROXIES)
                if proxy['user_pass'] is not None:
                    request.meta['proxy'] = to_bytes("http://%s" % proxy['ip_port'])
                    encoded_user_pass = base64.encodestring(to_bytes(proxy['user_pass']))
                    request.headers['Proxy-Authorization'] = to_bytes('Basic ' + encoded_user_pass)
                    print "**************ProxyMiddleware have pass************" + proxy['ip_port']
                else:
                    print "**************ProxyMiddleware no pass************" + proxy['ip_port']
                    request.meta['proxy'] = to_bytes("http://%s" % proxy['ip_port'])
        
        DOWNLOADER_MIDDLEWARES = {
           'step8_king.middlewares.ProxyMiddleware': 500,
        }
        
    """
    

      

  • 相关阅读:
    AcWing 1018. 最低通行费
    蓝桥杯赛第10届省赛
    P5745 【深基附B例】区间最大和
    P3383 【模板】线性筛素数
    第12届蓝桥杯赛国赛 小蓝买瓜子
    P4715 【深基16.例1】淘汰赛
    AcWing 1015. 摘花生
    第12届蓝桥杯赛省赛 种菜的最大价值
    linq to sql初步
    汇编语言学习笔记接收鼠标消息
  • 原文地址:https://www.cnblogs.com/catherine007/p/8649038.html
Copyright © 2011-2022 走看看