zoukankan      html  css  js  c++  java
  • 爬虫相关

    requests模块

    import requests
    response=requests.get('https://www.baidu.com')
    print(response.text)
    import requests
    # requests.get()
    # requests.post()
    # requests.head()
    # 像get,post,head里面都返回了request()
    response=requests.request(
        url='https://www.zhihu.com',
        method='get',
        # params={},    #url上传递的参数
        # data={}  #post默认的content-type
        # json={}     #content-type为application/json形式传递
        headers={
            # 表示何种方式请求的,这里模拟浏览器
            'user-agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 Safari/537.36',
        }
    )
    print(response.text)
    比较重要的headers参数
    import requests
    response=requests.request(
        url='https://i.cnblogs.com/Configure.aspx',
        method='get',
        cookies={'.CNBlogsCookie':'xxx'}
    )
    print(response.text)
    比较重要的cookie参数
    def request(method, url, **kwargs):
        """Constructs and sends a :class:`Request <Request>`.
    
        :param method: method for the new :class:`Request` object.
        :param url: URL for the new :class:`Request` object.
        :param params: (optional) Dictionary or bytes to be sent in the query string for the :class:`Request`.
        :param data: (optional) Dictionary, bytes, or file-like object to send in the body of the :class:`Request`.
        :param json: (optional) json data to send in the body of the :class:`Request`.
        :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
        :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
        :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': ('filename', fileobj)}``) for multipart encoding upload.
        :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
        :param timeout: (optional) How long to wait for the server to send data
            before giving up, as a float, or a :ref:`(connect timeout, read
            timeout) <timeouts>` tuple.
        :type timeout: float or tuple
        :param allow_redirects: (optional) Boolean. Set to True if POST/PUT/DELETE redirect following is allowed.
        :type allow_redirects: bool
        :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
        :param verify: (optional) whether the SSL cert will be verified. A CA_BUNDLE path can also be provided. Defaults to ``True``.
        :param stream: (optional) if ``False``, the response content will be immediately downloaded.
        :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
        :return: :class:`Response <Response>` object
        :rtype: requests.Response
    
        Usage::
    
          >>> import requests
          >>> req = requests.request('GET', 'http://httpbin.org/get')
          <Response [200]>
        """
    
        # By using the 'with' statement we are sure the session is closed, thus we
        # avoid leaving sockets open which can trigger a ResourceWarning in some
        # cases, and look like a memory leak in others.
        with sessions.Session() as session:
            return session.request(method=method, url=url, **kwargs)
    
    更多参数
    其他参数

    beautifulsoup模块

    来自https://www.cnblogs.com/yuanchenqi/articles/7617280.html

    写出漂亮的博客就是为了以后看着更方便的。
  • 相关阅读:
    怎样才有资格被称为开源软件
    [翻译]开发Silverlight 2.0的自定义控件
    网上Silverlight项目收集
    Google 分析的基准化测试
    IIS 承载的WCF服务失败
    Lang.NET 2008 相关Session
    Silverlight 2.0 beta1 堆栈
    asp.net 性能调较
    SQL Server 2005 的nvarchar(max),varchar(max)来救火
    LINQPad
  • 原文地址:https://www.cnblogs.com/zhaowei5/p/10498300.html
Copyright © 2011-2022 走看看