  • Advanced usage of requests for Python web scraping

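    1. requests can upload files

    # Import the requests module
    import requests
    # Open the file in binary mode; the with block closes it after the upload
    with open('D:/360Downloads/1.txt', 'rb') as f:
        # Map the form field name to the open file object
        files = {'file': f}
        # Send the POST request
        response = requests.post("http://httpbin.org/post", files=files)
    # Print the response body as a string
    print(response.text)

    Output: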

    {
    "args": {},
    "data": "",
    "files": {
    "file": "data:application/octet-stream;base64,ZGVtbzAxxOO6ww=="
    },
    "form": {},
    "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Content-Length": "151",
    "Content-Type": "multipart/form-data; boundary=9502063320dadabde8e0197a299a933c",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.23.0",
    "X-Amzn-Trace-Id": "Root=1-5e71d1bc-221f2f9c5a23aa1c11d21b3c"
    },
    "json": null,
    "origin": "111.77.5.100",
    "url": "http://httpbin.org/post"
    }


    Process finished with exit code 0
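
    The dict passed to files can also map the field name to a (filename, file object, content type) tuple, which controls the filename and MIME type the server sees. The sketch below builds such a request without sending it, so the multipart body can be inspected offline. The helper name build_upload_request and the sample payload are illustrative assumptions, not part of the original post.

```python
import io

import requests


def build_upload_request(url, field_name, filename, payload, content_type):
    """Prepare (but do not send) a multipart POST carrying one file."""
    # The tuple form sets the filename and content type explicitly
    files = {field_name: (filename, io.BytesIO(payload), content_type)}
    return requests.Request("POST", url, files=files).prepare()


prepared = build_upload_request(
    "http://httpbin.org/post", "file", "1.txt", b"demo01", "text/plain"
)

# requests generates the multipart boundary and the encoded body
print(prepared.headers["Content-Type"])
print(len(prepared.body), "bytes in the multipart body")
```

    To actually send the prepared request, pass it to requests.Session().send(prepared).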

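    2. requests can read cookies (identifiers a site uses to recognize a client)

    # Import the requests module
    import requests
    # Send a GET request
    response = requests.get("https://fanyi.baidu.com")
    # Print the cookie jar
    print(response.cookies)
    # Print the individual Cookie objects as a tuple
    print(tuple(response.cookies))
    # items() yields each cookie's name and value
    for key, value in response.cookies.items():
        print(key + "=" + value)

    Output: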

    <RequestsCookieJar[<Cookie BAIDUID=72BE4EB04DB39349C036BA1BDF4D2895:FG=1 for .baidu.com/>]>
    (Cookie(version=0, name='BAIDUID', value='72BE4EB04DB39349C036BA1BDF4D2895:FG=1', port=None, port_specified=False, domain='.baidu.com', domain_specified=True, domain_initial_dot=True, path='/', path_specified=True, secure=False, expires=1616058282, discard=False, comment=None, comment_url=None, rest={}, rfc2109=True),)

    BAIDUID=405DCB00DFE182D6581CBFAA3297C6BA:FG=1

    Process finished with exit code 0
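
    Each element of response.cookies is a standard http.cookiejar.Cookie, so the fields shown in the tuple output above can be read as attributes. A minimal sketch, building a cookie locally with requests.cookies.create_cookie instead of fetching one. The value is a placeholder, and the expires timestamp is copied from the output above.

```python
from datetime import datetime, timezone

from requests.cookies import create_cookie

# Build a cookie locally; the value is a placeholder, not a real BAIDUID
cookie = create_cookie(
    name="BAIDUID",
    value="PLACEHOLDER:FG=1",
    domain=".baidu.com",
    path="/",
    expires=1616058282,
)

# expires is a Unix timestamp; convert it to a readable UTC datetime
expires_at = datetime.fromtimestamp(cookie.expires, tz=timezone.utc)

print(cookie.name, cookie.domain, cookie.path)
print("expires:", expires_at.isoformat())
print("already expired:", cookie.is_expired())
```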

    If you know a cookie's name, you can look up its value directly:

    # Import the requests module
    import requests
    # Send a GET request
    response = requests.get('http://fanyi.baidu.com')
    # Look up the cookie's value by name
    print(response.cookies['BAIDUID'])
    # Print the Cookie objects as a tuple
    print(tuple(response.cookies))

    Output:

    B5A1A6A7B622F295DF802DA4D10F92CB:FG=1
    (Cookie(version=0, name='BAIDUID', value='B5A1A6A7B622F295DF802DA4D10F92CB:FG=1', port=None, port_specified=False, domain='.baidu.com', domain_specified=True, domain_initial_dot=True, path='/', path_specified=True, secure=False, expires=1616068429, discard=False, comment=None, comment_url=None, rest={}, rfc2109=True),)

    Process finished with exit code 0
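
    Indexing with response.cookies['BAIDUID'] raises KeyError when the server did not set that cookie. The sketch below shows get() with a default and get_dict() for a plain-dict copy, using a jar built locally with placeholder values instead of a live response.

```python
from requests.cookies import RequestsCookieJar

# A locally built jar standing in for response.cookies
jar = RequestsCookieJar()
jar.set("BAIDUID", "PLACEHOLDER:FG=1", domain=".baidu.com", path="/")

# get() returns the default instead of raising when the name is absent
present = jar.get("BAIDUID")
missing = jar.get("NO_SUCH_COOKIE", default="<not set>")

# get_dict() copies the name/value pairs into an ordinary dict
as_dict = jar.get_dict()

try:
    jar["NO_SUCH_COOKIE"]
except KeyError:
    print("indexing a missing cookie raises KeyError")

print(present, missing, as_dict)
```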

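    3. Maintaining a session

    One use of cookies is to simulate a login and keep a session alive. To send your own cookies to the server:

    # Import the requests module
    import requests
    # Define the cookies as a dict
    cookies = {"number": "1234567"}
    # Send a GET request with the cookies attached
    response = requests.get("http://httpbin.org/cookies", cookies=cookies)
    # Print the response body as a string
    print(response.text)

    Output: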

    {
    "cookies": {
    "number": "1234567"
    }
    }


    Process finished with exit code 0
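
    The cookies argument ends up as an ordinary Cookie request header. A minimal sketch that prepares the same request without sending it, so the header can be inspected offline:

```python
import requests

# Build the request from the example above, but do not send it
request = requests.Request(
    "GET", "http://httpbin.org/cookies", cookies={"number": "1234567"}
)
prepared = request.prepare()

# The dict has been serialized into the Cookie header
print(prepared.headers["Cookie"])
```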

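    Alternatively, a requests.session object stores the cookies the server sets and sends them back on later requests. Here the /cookies/set URL sets the cookie and redirects to /cookies, which echoes it:

    # Import the requests module
    import requests
    # Create a session object
    session = requests.session()
    # Send a GET request; the session keeps the cookie across the redirect
    response = session.get('http://httpbin.org/cookies/set/number/1234567')
    # Print the response body as a string
    print(response.text)

    Output: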

    {
    "cookies": {
    "number": "1234567"
    }
    }


    Process finished with exit code 0
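
    The difference from a bare requests.get call is that the session's cookie jar is merged into every request the session makes. The sketch below shows this offline by preparing requests instead of sending them. Setting the cookie by hand stands in for a server's Set-Cookie response, and the /get URL is only an illustrative second endpoint.

```python
import requests

with requests.Session() as session:
    # Stand-in for a cookie the server would have set on an earlier response
    session.cookies.set("number", "1234567")

    # Every request prepared through the session carries the stored cookie
    first = session.prepare_request(
        requests.Request("GET", "http://httpbin.org/cookies")
    )
    second = session.prepare_request(
        requests.Request("GET", "http://httpbin.org/get")
    )

# A request prepared outside the session has no Cookie header
bare = requests.Request("GET", "http://httpbin.org/cookies").prepare()

print(first.headers.get("Cookie"))
print(second.headers.get("Cookie"))
print(bare.headers.get("Cookie"))
```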

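    4. Certificate verification

    # Import the requests module
    import requests
    # Send a GET request
    response = requests.get('https://www.12306.cn')
    # For HTTPS requests, requests verifies the certificate and raises an exception if verification fails
    print(response.status_code)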

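    If certificate verification fails, requests raises an exception (requests.exceptions.SSLError). If it succeeds, the status code 200 is printed.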
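
    A failed verification can be handled with an ordinary try/except. A minimal sketch follows. The function name fetch_status and its get parameter are hypothetical, and get exists only so the function can be exercised without network access.

```python
import requests


def fetch_status(url, get=requests.get, timeout=10):
    """Return the HTTP status code, or None if certificate verification fails."""
    try:
        response = get(url, timeout=timeout)
    except requests.exceptions.SSLError as exc:
        # Raised when the server certificate cannot be verified
        print(f"certificate verification failed for {url}: {exc}")
        return None
    return response.status_code
```

    Calling fetch_status('https://www.12306.cn') would then return either the status code or None instead of crashing the script.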

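    To turn off certificate verification:

    # Import the requests module
    import requests
    # Send a GET request with certificate verification turned off
    response = requests.get('https://www.12306.cn', verify=False)
    # With verify=False the certificate is not checked, so a bad certificate no longer raises an exception
    print(response.status_code)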

    Output: the status code is printed, along with an InsecureRequestWarning.

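    To suppress the warning after turning verification off:

    # Import the urllib3 module through requests
    from requests.packages import urllib3
    # Import the requests module
    import requests
    # Suppress urllib3's warnings
    urllib3.disable_warnings()
    # Send a GET request with certificate verification turned off
    response = requests.get('https://www.12306.cn', verify=False)
    # Print the status code
    print(response.status_code)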

    Output: 200
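
    urllib3.disable_warnings() silences the warning for the whole process. A narrower option, sketched here with the standard library warnings module, is to ignore InsecureRequestWarning only inside a specific block. The helper name insecure_warnings_silenced is hypothetical, and the demonstration raises the warning by hand so it runs without network access.

```python
import contextlib
import warnings

from urllib3.exceptions import InsecureRequestWarning


@contextlib.contextmanager
def insecure_warnings_silenced():
    """Ignore InsecureRequestWarning only while the with block is active."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", InsecureRequestWarning)
        yield


# Demonstration: record warnings and emit the same category by hand
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    with insecure_warnings_silenced():
        warnings.warn("inside the block", InsecureRequestWarning)
    warnings.warn("outside the block", InsecureRequestWarning)

# Only the warning emitted outside the block was recorded
print([str(w.message) for w in caught])
```

    In real use, the requests.get(url, verify=False) call would go inside the with insecure_warnings_silenced(): block.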

     

  • Original article: https://www.cnblogs.com/Mr-choa/p/12518117.html