zoukankan      html  css  js  c++  java
  • requests代理爬取

    import requests
    import random
    if __name__ == "__main__":
        #不同浏览器的UA
        header_list = [
            # 遨游
            {"user-agent": "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Maxthon 2.0)"},
            # 火狐
            {"user-agent": "Mozilla/5.0 (Windows NT 6.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"},
            # 谷歌
            {
                "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_0) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11"}
        ]
        #不同的代理IP
        proxy_list = [
            {"http": "112.115.57.20:3128"},
            {'http': '121.41.171.223:3128'}
        ]
        #随机获取UA和代理IP
        header = random.choice(header_list)
        proxy = random.choice(proxy_list)
        url = 'http://www.baidu.com/s?ie=UTF-8&wd=ip'
        #参数3:设置代理
        response = requests.get(url=url,headers=header,proxies=proxy)
        response.encoding = 'utf-8'
        with open('daili.html', 'wb') as fp:
            fp.write(response.content)
    
    
  • 相关阅读:
    16-面向对象之语法(1)
    4-编辑器IDE_for_python
    3-python入门学习路线
    2-学习方法心得
    1-语法基础
    NSMutableArray基本概念
    NSArray 与字符串
    NSArray文件读写
    NSArray排序
    NSArray 遍历
  • 原文地址:https://www.cnblogs.com/gerenboke/p/13389072.html
Copyright © 2011-2022 走看看