  • A Detailed Guide to Python's requests Module

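    Module description: requests is an HTTP library written in Python and released under the Apache 2.0 license. Its API is more concise than that of the urllib2 module.
    Requests supports HTTP keep-alive and connection pooling, cookie-based sessions, file uploads, automatic decoding of response content, internationalized URLs, and automatic encoding of POST data. It is a high-level wrapper (built on urllib3) that makes network requests in Python much friendlier to write, and it covers most of what a browser does at the HTTP level. It is modern, internationalized, and friendly. Persistent connections (keep-alive) are handled automatically for requests made through a Session.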
    1) Importing the module

    import requests


    This one needs no explanation.

    2) Sending requests concisely

    Example code: fetching a web page (a personal GitHub page)

    import requests
     
    r = requests.get('https://github.com/Ranxf')       # the most basic GET request, without parameters
    r1 = requests.get(url='http://dict.baidu.com/s', params={'wd': 'python'})      # GET request with query parameters



    The other HTTP methods are available in the same way:

    requests.get('https://github.com/timeline.json')    # GET request
    requests.post('http://httpbin.org/post')            # POST request
    requests.put('http://httpbin.org/put')              # PUT request
    requests.delete('http://httpbin.org/delete')        # DELETE request
    requests.head('http://httpbin.org/get')             # HEAD request
    requests.options('http://httpbin.org/get')          # OPTIONS request



    3) Passing parameters in the URL

    >>> url_params = {'key':'value'}       # pass parameters as a dict; keys whose value is None are not added to the URL
    >>> r = requests.get('your url',params = url_params)
    >>> print(r.url)
      your url?key=value
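
    To see how a params dict ends up in the URL without touching the network, the request can be built and prepared locally. This is a minimal sketch: the httpbin.org URL and the parameter names are placeholders chosen for illustration, not part of the original example.

```python
import requests

# Build the request object without sending it, so no network access is needed.
req = requests.Request(
    "GET",
    "http://httpbin.org/get",
    params={"key": "value", "skipped": None, "tag": ["a", "b"]},
)
prepared = req.prepare()

# Keys whose value is None are dropped; a list value repeats the key.
print(prepared.url)
```

    Preparing a request this way is also a convenient check when a server rejects a query string and you want to see exactly what would be sent.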



    4) Response content

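    r.encoding                       # the encoding currently used to decode r.text
    r.encoding = 'utf-8'             # override the encoding
    r.text                           # response body as a string, decoded with r.encoding (taken from the response headers by default)
    r.content                        # response body as bytes; gzip and deflate compression is decoded automatically
     
    r.headers                        # response headers as a dict-like object with case-insensitive keys; r.headers.get(key) returns None for a missing key
     
    r.status_code                    # HTTP status code of the response
    r.raw                            # raw urllib3 response object; pass stream=True to the request, then call r.raw.read()
    r.ok                             # True when the status code is below 400, i.e. the request did not fail with a 4xx/5xx error
    # special methods
    r.json()                         # built-in JSON decoder; parses the body as JSON and raises an exception if the body is not valid JSON
    r.raise_for_status()             # raises requests.exceptions.HTTPError for a failed request (4xx or 5xx status)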
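
    The attributes above can be tried out offline on hand-built Response objects. This is only a sketch for illustration: the make_response helper is hypothetical, it sets the private _content attribute directly, and the URL is a placeholder. Real responses come from requests.get and friends.

```python
import requests


def make_response(status_code, body=b"", headers=None):
    """Build a Response by hand, for illustration only (no request is sent)."""
    r = requests.Response()
    r.status_code = status_code
    r._content = body            # private attribute, set directly only for this demo
    r.encoding = "utf-8"
    r.headers.update(headers or {})
    r.url = "http://example.invalid/demo"
    return r


ok = make_response(200, b'{"some": "data"}', {"Content-Type": "application/json"})
redirect = make_response(302)
missing = make_response(404)

print(ok.ok, redirect.ok, missing.ok)     # r.ok is False only for 4xx/5xx statuses
print(ok.json())                          # body parsed as JSON
print(ok.headers["content-type"])         # header lookup ignores case
print(ok.headers.get("x-not-there"))      # .get() returns None for a missing header

try:
    missing.raise_for_status()
    raised = False
except requests.exceptions.HTTPError:
    raised = True
print(raised)
```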



    Sending JSON with POST:

    import requests
    import json
     
    r = requests.post('https://api.github.com/some/endpoint', data=json.dumps({'some': 'data'}))
    print(r.json())
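
    The difference between passing a dict to data=, passing a JSON string to data=, and using the json= parameter shows up in the prepared request. The sketch below builds the three variants without sending them; the endpoint URL is the same placeholder used above.

```python
import json
import requests

url = "https://api.github.com/some/endpoint"
payload = {"some": "data"}

as_form = requests.Request("POST", url, data=payload).prepare()
as_string = requests.Request("POST", url, data=json.dumps(payload)).prepare()
as_json = requests.Request("POST", url, json=payload).prepare()

# A dict passed to data= is form-encoded.
print(as_form.headers.get("Content-Type"), as_form.body)
# A string passed to data= is sent as-is, and no Content-Type is set.
print(as_string.headers.get("Content-Type"), as_string.body)
# json= serializes the dict and sets Content-Type: application/json.
print(as_json.headers.get("Content-Type"), as_json.body)
```

    With data=json.dumps(...) the Content-Type header has to be set by hand if the server requires it, which is why json= is usually the more convenient choice.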



    5) Custom headers and cookies

    header = {'user-agent': 'my-app/0.0.1'}
    cookie = {'key':'value'}
    r = requests.get('your url', headers=header, cookies=cookie)   # requests.post accepts the same keyword arguments
    import json
     
    data = {'some': 'data'}
    headers = {'content-type': 'application/json',
               'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:22.0) Gecko/20100101 Firefox/22.0'}
      
    # serialize the dict so that the body matches the declared JSON content type
    r = requests.post('https://api.github.com/some/endpoint', data=json.dumps(data), headers=headers)
    print(r.text)
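
    To check which headers a call would actually send, the request can be prepared locally instead of being sent. A minimal sketch, assuming the httpbin.org URL as a placeholder:

```python
import requests

prepared = requests.Request(
    "GET",
    "http://httpbin.org/cookies",
    headers={"user-agent": "my-app/0.0.1"},
    cookies={"key": "value"},
).prepare()

# Header names are case-insensitive on the prepared request.
print(prepared.headers["User-Agent"])
# The cookies dict is turned into a Cookie header.
print(prepared.headers["Cookie"])
```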



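    6) Response status codes
    Every requests call returns a Response object that holds the server's reply, such as the r.text and r.status_code attributes shown in the examples above.
    Getting the response body as text: when you access r.text, the body is decoded with the response's text encoding. You can change r.encoding so that r.text is decoded with an encoding of your choice.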

    r = requests.get('http://www.itwhy.org')
    print(r.text, ' {} '.format('*'*79), r.encoding)
    r.encoding = 'GBK'
    print(r.text, ' {} '.format('*'*79), r.encoding)
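
    The effect of changing r.encoding can be reproduced offline. In this sketch the Response is built by hand and its private _content attribute is filled with GBK-encoded bytes, which is an assumption made purely for illustration; a real response would come from requests.get.

```python
import requests

r = requests.Response()
r._content = "中文网页".encode("gbk")   # private attribute, set by hand for this demo

r.encoding = "ISO-8859-1"      # a wrong guess produces garbled text
wrong = r.text

r.encoding = "GBK"             # r.text is decoded again with the new encoding
right = r.text

print(repr(wrong))
print(right)
```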



    Example code:

    import requests
     
    r = requests.get('https://github.com/Ranxf')       # the most basic GET request, without parameters
    print(r.status_code)                               # get the response status code
    r1 = requests.get(url='http://dict.baidu.com/s', params={'wd': 'python'})      # GET request with query parameters
    print(r1.url)
    print(r1.text)        # print the decoded response body



    Running it produces:

    /usr/bin/python3.5 /home/rxf/python3_1000/1000/python3_server/python3_requests/demo1.py
    200
    http://dict.baidu.com/s?wd=python
    …………
     
    Process finished with exit code 0


    r.status_code      # for a 4xx or 5xx status, r.raise_for_status() raises an exception

    7) The response

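    r.headers                                  # response headers, as a dict-like object
    r.request.headers                          # headers that were sent to the server
    r.cookies                                  # cookies returned by the server
    r.history                                  # list of redirect responses; pass allow_redirects=False with the request to disable redirects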



    8) Timeouts

    r = requests.get('url',timeout=1)           # timeout in seconds; a single value applies to both the connect and the read phase
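
    The connect and read phases can also be given separate limits with a (connect, read) tuple. The sketch below stands in for an unresponsive server with a local socket that accepts connections but never answers; that socket, the port selection, and the chosen timeout values are assumptions made only so the example runs without an external host.

```python
import socket
import requests

# A local socket that takes TCP connections into its backlog but never
# answers, so the connect phase succeeds and the read phase stalls.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

session = requests.Session()
session.trust_env = False          # ignore proxy settings from the environment

try:
    # timeout=(connect timeout, read timeout), both in seconds
    session.get(f"http://127.0.0.1:{port}/", timeout=(2, 0.5))
    outcome = "response received"
except requests.exceptions.ConnectTimeout:
    outcome = "connect timeout"
except requests.exceptions.ReadTimeout:
    outcome = "read timeout"
finally:
    session.close()
    server.close()

print(outcome)
```

    Both exception classes derive from requests.exceptions.Timeout, so a single except clause on that base class is enough when the phase does not matter.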



    9) Session objects, which persist certain parameters across requests

    s = requests.Session()
    s.auth = ('auth','passwd')
    s.headers.update({'key':'value'})    # update() keeps the session's default headers; assigning a new dict would replace them
    r = s.get('url')
    r1 = s.get('url1')
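
    How session-level settings are merged into each request can be inspected with Session.prepare_request, which builds the outgoing request without sending it. A minimal sketch; the header names X-Token and X-Extra and the httpbin.org URL are made up for illustration.

```python
import requests

s = requests.Session()
s.auth = ("auth", "passwd")
s.headers.update({"X-Token": "abc"})      # hypothetical session-wide header

# A per-request header is merged with the session headers.
req = requests.Request("GET", "http://httpbin.org/get", headers={"X-Extra": "1"})
prepared = s.prepare_request(req)

for name in ("X-Token", "X-Extra", "User-Agent", "Authorization"):
    print(name, "=", prepared.headers[name])
```

    The session's auth tuple becomes a Basic Authorization header, and the default User-Agent is kept because the session headers were updated rather than replaced.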



    10) Proxies

    proxies = {'http':'ip1','https':'ip2' }
    requests.get('url',proxies=proxies)





    Summary:

    import requests
     
    # HTTP request types
    # GET
    r = requests.get('https://github.com/timeline.json')
    # POST
    r = requests.post("http://m.ctrip.com/post")
    # PUT
    r = requests.put("http://m.ctrip.com/put")
    # DELETE
    r = requests.delete("http://m.ctrip.com/delete")
    # HEAD
    r = requests.head("http://m.ctrip.com/head")
    # OPTIONS
    r = requests.options("http://m.ctrip.com/get")
     
    # reading the response body
    print(r.content) # shown as bytes; non-ASCII text such as Chinese appears as escaped byte sequences
    print(r.text) # shown as decoded text
     
    # passing URL parameters
    payload = {'keyword': '香港', 'salecityid': '2'}
    r = requests.get("http://m.ctrip.com/webapp/tourvisa/visa_list", params=payload)
    print(r.url) # e.g. http://m.ctrip.com/webapp/tourvisa/visa_list?keyword=%E9%A6%99%E6%B8%AF&salecityid=2 (non-ASCII values are percent-encoded)
     
    # get / change the page encoding
    r = requests.get('https://github.com/timeline.json')
    print(r.encoding)
     
     
    # JSON handling
    r = requests.get('https://github.com/timeline.json')
    print(r.json()) # no need to import json; Response has a built-in JSON decoder
     
    # custom request headers
    url = 'http://m.ctrip.com'
    headers = {'User-Agent' : 'Mozilla/5.0 (Linux; Android 4.2.1; en-us; Nexus 4 Build/JOP40D) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.166 Mobile Safari/535.19'}
    r = requests.post(url, headers=headers)
    print(r.request.headers)
     
    # more complex POST request
    import json
    url = 'http://m.ctrip.com'
    payload = {'some': 'data'}
    r = requests.post(url, data=json.dumps(payload)) # a dict passed to data= is form-encoded; to send JSON, serialize it to a string with json.dumps first
     
    # POST a multipart-encoded file
    url = 'http://m.ctrip.com'
    files = {'file': open('report.xls', 'rb')}
    r = requests.post(url, files=files)
     
    # response status code
    r = requests.get('http://m.ctrip.com')
    print(r.status_code)
         
    # response headers
    r = requests.get('http://m.ctrip.com')
    print(r.headers)
    print(r.headers['Content-Type'])
    print(r.headers.get('content-type')) # two ways to read a single response header
         
    # cookies
    url = 'http://example.com/some/cookie/setting/url'
    r = requests.get(url)
    r.cookies['example_cookie_name']    # read a cookie
         
    url = 'http://m.ctrip.com/cookies'
    cookies = dict(cookies_are='working')
    r = requests.get(url, cookies=cookies) # send cookies
     
    # set a timeout in seconds (a value this small will almost always raise a timeout error)
    r = requests.get('http://m.ctrip.com', timeout=0.001)
     
    # configure proxies
    proxies = {
               "http": "http://10.10.1.10:3128",
               "https": "http://10.10.1.100:4444",
              }
    r = requests.get('http://m.ctrip.com', proxies=proxies)
     
     
    # if the proxy requires a username and password, use this form:
    proxies = {
        "http": "http://user:pass@10.10.1.10:3128/",
    }




    11) GET request examples

    # 1. Without parameters
     
    import requests
     
    ret = requests.get('https://github.com/timeline.json')
     
    print(ret.url)
    print(ret.text)
     
     
     
    # 2. With parameters
     
    import requests
     
    payload = {'key1': 'value1', 'key2': 'value2'}
    ret = requests.get("http://httpbin.org/get", params=payload)
     
    print(ret.url)
    print(ret.text)



    12) POST request examples

    # 1. Basic POST
       
    import requests
       
    payload = {'key1': 'value1', 'key2': 'value2'}
    ret = requests.post("http://httpbin.org/post", data=payload)
       
    print(ret.text)
       
       
    # 2. Sending headers and data
       
    import requests
    import json
       
    url = 'https://api.github.com/some/endpoint'
    payload = {'some': 'data'}
    headers = {'content-type': 'application/json'}
       
    ret = requests.post(url, data=json.dumps(payload), headers=headers)
       
    print(ret.text)
    print(ret.cookies)



    13) Request parameters

    def request(method, url, **kwargs):
        """Constructs and sends a :class:`Request <Request>`.
     
        :param method: method for the new :class:`Request` object.
        :param url: URL for the new :class:`Request` object.
        :param params: (optional) Dictionary or bytes to be sent in the query string for the :class:`Request`.
        :param data: (optional) Dictionary, bytes, or file-like object to send in the body of the :class:`Request`.
        :param json: (optional) json data to send in the body of the :class:`Request`.
        :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
        :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
        :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.
            ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``
            or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string
            defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers
            to add for the file.
        :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
        :param timeout: (optional) How long to wait for the server to send data
            before giving up, as a float, or a :ref:`(connect timeout, read
            timeout) <timeouts>` tuple.
        :type timeout: float or tuple
        :param allow_redirects: (optional) Boolean. Set to True if POST/PUT/DELETE redirect following is allowed.
        :type allow_redirects: bool
        :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
        :param verify: (optional) whether the SSL cert will be verified. A CA_BUNDLE path can also be provided. Defaults to ``True``.
        :param stream: (optional) if ``False``, the response content will be immediately downloaded.
        :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
        :return: :class:`Response <Response>` object
        :rtype: requests.Response
     
        Usage::
     
          >>> import requests
          >>> req = requests.request('GET', 'http://httpbin.org/get')
          <Response [200]>
        """


    Example code for the parameters

    import requests
     
     
    def param_method_url():
        # requests.request(method='get', url='http://127.0.0.1:8000/test/')
        # requests.request(method='post', url='http://127.0.0.1:8000/test/')
        pass
     
     
    def param_param():
        # - can be a dict
        # - can be a string
        # - can be bytes (ASCII characters only)
     
        # requests.request(method='get',
        # url='http://127.0.0.1:8000/test/',
        # params={'k1': 'v1', 'k2': '水电费'})
     
        # requests.request(method='get',
        # url='http://127.0.0.1:8000/test/',
        # params="k1=v1&k2=水电费&k3=v3&k3=vv3")
     
        # requests.request(method='get',
        # url='http://127.0.0.1:8000/test/',
        # params=bytes("k1=v1&k2=k2&k3=v3&k3=vv3", encoding='utf8'))
     
        # wrong: bytes params must not contain non-ASCII characters
        # requests.request(method='get',
        # url='http://127.0.0.1:8000/test/',
        # params=bytes("k1=v1&k2=水电费&k3=v3&k3=vv3", encoding='utf8'))
        pass
     
     
    def param_data():
        # can be a dict
        # can be a string
        # can be bytes
        # can be a file object
     
        # requests.request(method='POST',
        # url='http://127.0.0.1:8000/test/',
        # data={'k1': 'v1', 'k2': '水电费'})
     
        # requests.request(method='POST',
        # url='http://127.0.0.1:8000/test/',
        # data="k1=v1; k2=v2; k3=v3; k3=v4"
        # )
     
        # requests.request(method='POST',
        # url='http://127.0.0.1:8000/test/',
        # data="k1=v1;k2=v2;k3=v3;k3=v4",
        # headers={'Content-Type': 'application/x-www-form-urlencoded'}
        # )
     
        # requests.request(method='POST',
        # url='http://127.0.0.1:8000/test/',
        # data=open('data_file.py', mode='r', encoding='utf-8'), # file content: k1=v1;k2=v2;k3=v3;k3=v4
        # headers={'Content-Type': 'application/x-www-form-urlencoded'}
        # )
        pass
     
     
    def param_json():
        # the value passed as json= is serialized to a string with json.dumps(...)
        # and sent in the request body, with the header Content-Type: application/json
        requests.request(method='POST',
                         url='http://127.0.0.1:8000/test/',
                         json={'k1': 'v1', 'k2': '水电费'})
     
     
    def param_headers():
        # send custom request headers to the server
        requests.request(method='POST',
                         url='http://127.0.0.1:8000/test/',
                         json={'k1': 'v1', 'k2': '水电费'},
                         headers={'Content-Type': 'application/json'}   # keep the header consistent with the JSON body sent via json=
                         )
     
     
    def param_cookies():
        # send cookies to the server
        requests.request(method='POST',
                         url='http://127.0.0.1:8000/test/',
                         data={'k1': 'v1', 'k2': 'v2'},
                         cookies={'cook1': 'value1'},
                         )
        # a CookieJar also works (the dict form is wrapped into one internally)
        from http.cookiejar import CookieJar
        from http.cookiejar import Cookie
     
        obj = CookieJar()
        obj.set_cookie(Cookie(version=0, name='c1', value='v1', port=None, domain='', path='/', secure=False, expires=None,
                              discard=True, comment=None, comment_url=None, rest={'HttpOnly': None}, rfc2109=False,
                              port_specified=False, domain_specified=False, domain_initial_dot=False, path_specified=False)
                       )
        requests.request(method='POST',
                         url='http://127.0.0.1:8000/test/',
                         data={'k1': 'v1', 'k2': 'v2'},
                         cookies=obj)
     
     
    def param_files():
        # send a file
        # file_dict = {
        # 'f1': open('readme', 'rb')
        # }
        # requests.request(method='POST',
        # url='http://127.0.0.1:8000/test/',
        # files=file_dict)
     
        # send a file with a custom file name
        # file_dict = {
        # 'f1': ('test.txt', open('readme', 'rb'))
        # }
        # requests.request(method='POST',
        # url='http://127.0.0.1:8000/test/',
        # files=file_dict)
     
        # send a string as the file content, with a custom file name
        # file_dict = {
        # 'f1': ('test.txt', "hahsfaksfa9kasdjflaksdjf")
        # }
        # requests.request(method='POST',
        # url='http://127.0.0.1:8000/test/',
        # files=file_dict)
     
        # send file content with a custom file name, content type, and extra headers
        # file_dict = {
        #     'f1': ('test.txt', "hahsfaksfa9kasdjflaksdjf", 'application/text', {'k1': '0'})
        # }
        # requests.request(method='POST',
        #                  url='http://127.0.0.1:8000/test/',
        #                  files=file_dict)
     
        pass
     
     
    def param_auth():
        from requests.auth import HTTPBasicAuth, HTTPDigestAuth
     
        ret = requests.get('https://api.github.com/user', auth=HTTPBasicAuth('wupeiqi', 'sdfasdfasdf'))
        print(ret.text)
     
        # ret = requests.get('http://192.168.1.1',
        # auth=HTTPBasicAuth('admin', 'admin'))
        # ret.encoding = 'gbk'
        # print(ret.text)
     
        # ret = requests.get('http://httpbin.org/digest-auth/auth/user/pass', auth=HTTPDigestAuth('user', 'pass'))
        # print(ret)
        #
     
     
    def param_timeout():
        # ret = requests.get('http://google.com/', timeout=1)
        # print(ret)
     
        # ret = requests.get('http://google.com/', timeout=(5, 1))
        # print(ret)
        pass
     
     
    def param_allow_redirects():
        ret = requests.get('http://127.0.0.1:8000/test/', allow_redirects=False)
        print(ret.text)
     
     
    def param_proxies():
        # proxies = {
        # "http": "61.172.249.96:80",
        # "https": "http://61.185.219.126:3128",
        # }
     
        # proxies = {'http://10.20.1.128': 'http://10.10.1.10:5323'}
     
        # ret = requests.get("http://www.proxy360.cn/Proxy", proxies=proxies)
        # print(ret.headers)
     
     
        # from requests.auth import HTTPProxyAuth
        #
        # proxyDict = {
        # 'http': '77.75.105.165',
        # 'https': '77.75.105.165'
        # }
        # auth = HTTPProxyAuth('username', 'mypassword')
        #
        # r = requests.get("http://www.google.com", proxies=proxyDict, auth=auth)
        # print(r.text)
     
        pass
     
     
    def param_stream():
        ret = requests.get('http://127.0.0.1:8000/test/', stream=True)
        print(ret.content)
        ret.close()
     
        # from contextlib import closing
        # with closing(requests.get('http://httpbin.org/get', stream=True)) as r:
        #     # process the response here
        #     for i in r.iter_content():
        #         print(i)
     
     
    def requests_session():
        import requests
     
        session = requests.Session()
     
        ### 1. Visit any page first to obtain a cookie
     
        i1 = session.get(url="http://dig.chouti.com/help/service")
     
        ### 2. Log in, carrying the cookie from the previous step; the backend authorizes the gpsd value in the cookie
        i2 = session.post(
            url="http://dig.chouti.com/login",
            data={
                'phone': "8615131255089",
                'password': "xxxxxx",
                'oneMonth': ""
            }
        )
     
        i3 = session.post(
            url="http://dig.chouti.com/link/vote?linksId=8589623",
        )
        print(i3.text)
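
    The file-upload variants from param_files can be examined without a running server by preparing the request and looking at the multipart body. This is a sketch under stated assumptions: the file content is an inline string and the text/plain content type is chosen for illustration.

```python
import requests

# (file name, file content, content type): the 3-tuple form of a files entry
file_dict = {"f1": ("test.txt", "file content as a string", "text/plain")}

prepared = requests.Request(
    "POST", "http://127.0.0.1:8000/test/", files=file_dict
).prepare()

# The Content-Type header carries the multipart boundary.
print(prepared.headers["Content-Type"])
# The body holds one part per file, with its own headers.
print(prepared.body.decode("utf-8"))
```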
  • Original article: https://www.cnblogs.com/fisherpau/p/14279220.html