  • Using the python3 requests library

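    The synchronous HTTP library requests is very handy for testing and for simple crawlers. This post walks through it, because I paid dearly for using it before I really knew it. Read the documentation carefully: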

    http://docs.python-requests.org/zh_CN/latest/user/quickstart.html

    1. The simplest, most common usage

    GET request

    import requests
    
    response = requests.get('http://httpbin.org/get')
    print(response.text)
    
    # Output
    {
      "args": {}, 
      "headers": {
        "Accept": "*/*", 
        "Accept-Encoding": "gzip, deflate", 
        "Connection": "close", 
        "Host": "httpbin.org", 
        "User-Agent": "python-requests/2.21.0"
      }, 
      "origin": "xx.xx.xx.xx", 
      "url": "http://httpbin.org/get"
    }

    POST request

    form = {'name': 'happy_codes'}
    response = requests.post('http://httpbin.org/post', data=form)
    print(response.text)
    
    # Form data
    {
      "args": {}, 
      "data": "", 
      "files": {}, 
      "form": {
        "name": "happy_codes"
      }, 
      "headers": {
        "Accept": "*/*", 
        "Accept-Encoding": "gzip, deflate", 
        "Connection": "close", 
        "Content-Length": "16", 
        "Content-Type": "application/x-www-form-urlencoded", 
        "Host": "httpbin.org", 
        "User-Agent": "python-requests/2.21.0"
      }, 
      "json": null, 
      "origin": "xx.xx.xx.xx", 
      "url": "http://httpbin.org/post"
    }
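
    The Content-Length of 16 in the output above is the length of the URL-encoded body name=happy_codes. Here is a minimal offline sketch, assuming only that requests is installed: it prepares the request without sending it and compares data= (form encoding) with json= (a JSON body).

```python
import json
import requests

form = {'name': 'happy_codes'}
url = 'http://httpbin.org/post'

# Request(...).prepare() builds the outgoing request without any network I/O.
as_form = requests.Request('POST', url, data=form).prepare()
as_json = requests.Request('POST', url, json=form).prepare()

# data= is URL-encoded into the body and marked as a form submission.
print(as_form.headers['Content-Type'])
print(as_form.body)

# json= is serialized to JSON and marked as application/json.
print(as_json.headers['Content-Type'])
print(json.loads(as_json.body))
```

    With json=, httpbin would echo the payload under "json" instead of "form".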

    2. Adding a User-Agent, cookies, and proxies

    Besides a dict, cookies can be a CookieJar object. A raw cookie string can be sent as well, through the Cookie header.
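
    The three ways of supplying cookies can be sketched offline as follows. This is only an illustration: the requests are prepared but never sent, and the raw string goes in through the Cookie header because the cookies argument itself is documented as a dict or CookieJar.

```python
import requests
from requests.cookies import RequestsCookieJar

url = 'http://httpbin.org/get'

# 1) A plain dict.
from_dict = requests.Request('GET', url, cookies={'STM': '1545720205'}).prepare()

# 2) A CookieJar object (RequestsCookieJar is the jar class shipped with requests).
jar = RequestsCookieJar()
jar.set('STM', '1545720205')
from_jar = requests.Request('GET', url, cookies=jar).prepare()

# 3) A raw cookie string, passed as the Cookie header.
from_header = requests.Request('GET', url, headers={'Cookie': 'STM=1545720205'}).prepare()

for prepared in (from_dict, from_jar, from_header):
    print(prepared.headers['Cookie'])
```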

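    proxies={'http': 'http://127.0.0.1', 'https': 'http://127.0.0.1'}

    The keys name the URL scheme (http or https) and the values name the proxy to use for that scheme. If the mapping is wrong, for example because of a misspelled key, requests will not use the proxy, so check it carefully.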

    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                             "(KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"}
    
    cookies = {"STM": "1545720205", 'haha': '123'}
    
    response = requests.get('http://httpbin.org/get', headers=headers, cookies=cookies,
                            proxies={'http': 'http://125.123.122.10:42207', 'https': 'http://125.123.122.10:42207'})
    
    print(response.text)
    
    # Output
    {
      "args": {}, 
      "headers": {
        "Accept": "*/*", 
        "Accept-Encoding": "gzip, deflate", 
        "Connection": "close", 
        "Cookie": "STM=1545720205; haha=123", 
        "Host": "httpbin.org", 
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
      }, 
      "origin": "125.123.122.10:42207", 
      "url": "http://httpbin.org/get"
    }
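
    To see why a misconfigured proxies mapping is ignored rather than rejected, the lookup can be inspected with requests.utils.select_proxy, a helper that lives inside requests. This is a sketch for inspection only: the proxy address is a placeholder, and nothing is sent over the network.

```python
from requests.utils import select_proxy

url = 'http://httpbin.org/get'
proxy_url = 'http://127.0.0.1:8080'  # placeholder address, not a real proxy

misspelled = {'http:': proxy_url}                   # stray colon in the key
correct = {'http': proxy_url, 'https': proxy_url}   # keys are bare scheme names

# The key is matched against the scheme of the target URL.
print(select_proxy(url, misspelled))  # no key matches, so no proxy is chosen
print(select_proxy(url, correct))
```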

    Everything that can be passed is listed in the docstring, and GET and POST accept the same arguments:

    def request(method, url, **kwargs):
        """Constructs and sends a :class:`Request <Request>`.
    
        :param method: method for the new :class:`Request` object.
        :param url: URL for the new :class:`Request` object.
        :param params: (optional) Dictionary, list of tuples or bytes to send
            in the query string for the :class:`Request`.
        :param data: (optional) Dictionary, list of tuples, bytes, or file-like
            object to send in the body of the :class:`Request`.
        :param json: (optional) A JSON serializable Python object to send in the body of the :class:`Request`.
        :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
        :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
        :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.
            ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``
            or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string
            defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers
            to add for the file.
        :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
        :param timeout: (optional) How many seconds to wait for the server to send data
            before giving up, as a float, or a :ref:`(connect timeout, read
            timeout) <timeouts>` tuple.
        :type timeout: float or tuple
        :param allow_redirects: (optional) Boolean. Enable/disable GET/OPTIONS/POST/PUT/PATCH/DELETE/HEAD redirection. Defaults to ``True``.
        :type allow_redirects: bool
        :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
        :param verify: (optional) Either a boolean, in which case it controls whether we verify
                the server's TLS certificate, or a string, in which case it must be a path
                to a CA bundle to use. Defaults to ``True``.
        :param stream: (optional) if ``False``, the response content will be immediately downloaded.
        :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
        :return: :class:`Response <Response>` object
        :rtype: requests.Response
    
        Usage::
    
          >>> import requests
          >>> req = requests.request('GET', 'https://httpbin.org/get')
          >>> req
          <Response [200]>
        """
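
    Of the arguments listed above, params and data are the easiest to mix up. A small offline sketch, assuming only that requests is installed: params ends up in the URL, while data ends up in the body. The page value is made up for the example.

```python
import requests

query = {'name': 'happy_codes', 'page': 2}  # 'page' is a made-up example value

# params= is appended to the URL as a query string; a GET has no body.
as_params = requests.Request('GET', 'http://httpbin.org/get', params=query).prepare()
print(as_params.url)
print(as_params.body)

# data= leaves the URL untouched and goes into the request body.
as_data = requests.Request('POST', 'http://httpbin.org/post', data=query).prepare()
print(as_data.url)
print(as_data.body)
```

    When actually sending, timeout can be passed the same way, for example requests.get(url, params=query, timeout=5).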

    3. Using the Session class

    A Session keeps state across requests, so that several requests can share the same cookies, headers, and proxies.

    headers -> a dict; update it with session.headers.update(headers)

    cookies -> a CookieJar; update it with session.cookies.set(key, value)

    proxies -> a dict; update it by direct assignment, session.proxies = proxies

    Send requests with session.get()

    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                             "(KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"}
    
    proxy = '59.61.38.48:34719'
    
    # Create a requests.Session instance
    session = requests.Session()
    session.headers.update(headers)
    session.cookies.set('STM', '1231214')
    session.cookies.set('S', '123123')
    proxies = {
        'http': 'http://%s' % proxy,
        'https': 'http://%s' % proxy
    }
    session.proxies = proxies
    print(session.get('http://httpbin.org/get').text)

    # Output

    {
    "args": {},
    "headers": {
      "Accept": "*/*",
      "Accept-Encoding": "gzip, deflate",
      "Cache-Control": "max-age=259200",
      "Connection": "close",
      "Cookie": "S=123123; STM=1231214",
      "Host": "httpbin.org",
      "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
    },
    "origin": "59.61.38.48",
    "url": "http://httpbin.org/get"
    }
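
    How the session state is shared can be checked without a proxy or a network connection. The sketch below uses session.prepare_request to build two requests and inspect what the session merged into them. The User-Agent string and the X-Demo header are hypothetical values chosen for the example.

```python
import requests

with requests.Session() as session:
    session.headers.update({'User-Agent': 'demo-agent/1.0'})  # hypothetical UA
    session.cookies.set('STM', '1231214')

    # prepare_request merges session headers and cookies into each request.
    first = session.prepare_request(
        requests.Request('GET', 'http://httpbin.org/get'))
    second = session.prepare_request(
        requests.Request('GET', 'http://httpbin.org/headers',
                         headers={'X-Demo': '1'}))  # per-request extra header

# Both requests carry the session-level User-Agent and cookie.
print(first.headers['User-Agent'], first.headers['Cookie'])
print(second.headers['User-Agent'], second.headers['X-Demo'])
```

    Headers passed on a single request are merged with the session headers, so they only need to carry what differs for that one call.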

  • Original article: https://www.cnblogs.com/haoabcd2010/p/10336964.html