  • urllib2 document

    18.6.22 Examples


    This example gets the python.org main page and displays the first 100 bytes of it:

    >>> import urllib2
    >>> f = urllib2.urlopen('http://www.python.org/')
    >>> print f.read(100)
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
    <?xml-stylesheet href="./css/ht2html
    

    Here we are sending a data-stream to the stdin of a CGI and reading the data it returns to us. Note that this example will only work when the Python installation supports SSL.

    >>> import urllib2
    >>> req = urllib2.Request(url='https://localhost/cgi-bin/test.cgi',
    ...                       data='This data is passed to stdin of the CGI')
    >>> f = urllib2.urlopen(req)
    >>> print f.read()
    Got Data: "This data is passed to stdin of the CGI"
    

    The code for the sample CGI used in the above example is:

    #!/usr/bin/env python
    import sys
    data = sys.stdin.read()
    print 'Content-type: text/plain\n\nGot Data: "%s"' % data
    

    Use of Basic HTTP Authentication:

    import urllib2
    # Create an OpenerDirector with support for Basic HTTP Authentication...
    auth_handler = urllib2.HTTPBasicAuthHandler()
    auth_handler.add_password(realm='PDQ Application',
                              uri='https://mahler:8092/site-updates.py',
                              user='klem',
                              passwd='kadidd!ehopper')
    opener = urllib2.build_opener(auth_handler)
    # ...and install it globally so it can be used with urlopen.
    urllib2.install_opener(opener)
    urllib2.urlopen('http://www.example.com/login.html')
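    Credentials are looked up by realm and URI, so they are only offered for URLs that match what was registered with add_password(). The sketch below, which makes no network calls, registers the same credentials on an explicit HTTPPasswordMgr and queries it directly. The explicit password manager and the try/except import (a Python 3 fallback) are assumptions added for illustration.

```python
try:
    import urllib2                      # Python 2, as in the examples above
except ImportError:
    import urllib.request as urllib2    # assumption: Python 3 fallback, same names

# Keep a reference to the password manager so it can be queried afterwards.
password_mgr = urllib2.HTTPPasswordMgr()
password_mgr.add_password(realm='PDQ Application',
                          uri='https://mahler:8092/site-updates.py',
                          user='klem',
                          passwd='kadidd!ehopper')
auth_handler = urllib2.HTTPBasicAuthHandler(password_mgr)
opener = urllib2.build_opener(auth_handler)

# A URL that matches the registered realm and URI yields the credentials.
matching = password_mgr.find_user_password(
    'PDQ Application', 'https://mahler:8092/site-updates.py')

# A URL on a different host yields (None, None), so nothing would be sent.
other = password_mgr.find_user_password(
    'PDQ Application', 'http://www.example.com/login.html')

print(matching)
print(other)
```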
    

    build_opener() provides many handlers by default, including a ProxyHandler. By default, ProxyHandler uses the environment variables named <scheme>_proxy, where <scheme> is the URL scheme involved. For example, the http_proxy environment variable is read to obtain the HTTP proxy's URL.
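    To see which proxies the environment variables would produce, the getproxies() helper can be called directly. In this minimal sketch the proxy URL is hypothetical, and the try/except import is an assumption added so the snippet also runs on Python 3, where the helper lives in urllib.request.

```python
import os

try:
    from urllib import getproxies            # Python 2
except ImportError:
    from urllib.request import getproxies    # assumption: Python 3 location

# Hypothetical proxy URL, set only for this process.
os.environ['http_proxy'] = 'http://proxy.example.com:3128/'

# Returns a dict mapping URL schemes to proxy URLs, e.g. {'http': ...}.
proxies = getproxies()
print(proxies.get('http'))
```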

    This example replaces the default ProxyHandler with one that uses programmatically supplied proxy URLs, and adds proxy authorization support with ProxyBasicAuthHandler.

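    import urllib2
    # Route HTTP requests through an explicitly specified proxy.
    proxy_handler = urllib2.ProxyHandler({'http': 'http://www.example.com:3128/'})
    # Handle authentication challenges issued by the proxy itself.
    proxy_auth_handler = urllib2.ProxyBasicAuthHandler()
    proxy_auth_handler.add_password('realm', 'host', 'username', 'password')
    
    opener = urllib2.build_opener(proxy_handler, proxy_auth_handler)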
    # This time, rather than install the OpenerDirector, we use it directly:
    opener.open('http://www.example.com/login.html')
    

    Adding HTTP headers:

    Use the headers argument to the Request constructor, or:

    import urllib2
    req = urllib2.Request('http://www.example.com/')
    req.add_header('Referer', 'http://www.python.org/')
    r = urllib2.urlopen(req)
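    The first alternative, passing a headers dictionary to the Request constructor, might look like the sketch below. It only builds the Request and reads the header back, so nothing is sent. The try/except import is an assumption added so the snippet also runs on Python 3.

```python
try:
    import urllib2                      # Python 2, as in the examples above
except ImportError:
    import urllib.request as urllib2    # assumption: Python 3 fallback, same names

# Pass the extra headers as a dictionary when constructing the Request.
req = urllib2.Request('http://www.example.com/',
                      headers={'Referer': 'http://www.python.org/'})

# The header is stored on the Request until it is opened.
referer = req.get_header('Referer')
print(referer)
```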
    

    OpenerDirector automatically adds a User-Agent: header to every Request. To change this:

    import urllib2
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    opener.open('http://www.example.com/')
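    To see what is being replaced, the default value of addheaders can be inspected before overriding it. This sketch makes no network calls. The try/except import is an assumption added so the snippet also runs on Python 3.

```python
try:
    import urllib2                      # Python 2, as in the examples above
except ImportError:
    import urllib.request as urllib2    # assumption: Python 3 fallback, same names

opener = urllib2.build_opener()

# A fresh opener carries one default header: a Python-urllib User-agent.
default_headers = list(opener.addheaders)
print(default_headers)

# Replacing the list changes what is sent with every request from this opener.
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
print(opener.addheaders)
```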
    

    Also, remember that a few standard headers (Content-Length:, Content-Type: and Host:) are added when the Request is passed to urlopen() (or OpenerDirector.open()).

  • Original article: https://www.cnblogs.com/lexus/p/2821281.html