  • Brief usage examples for the urllib and urllib2 modules

    1. Simplest usage

    import urllib,urllib2
    
    response = urllib2.urlopen("https://www.baidu.com")
    print response.read()
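
The object returned by urlopen behaves like a file, so the call can be tried without any network access. The sketch below is only a stand-in under stated assumptions: it points urlopen at a temporary local file through a file:// URL (an absolute POSIX-style path is assumed) instead of the Baidu address, and the try/except import is there so the same code also runs on Python 3, where urlopen lives in urllib.request.

```python
import os
import tempfile

try:
    from urllib.request import urlopen   # Python 3
except ImportError:
    from urllib2 import urlopen          # Python 2, as used in this article

# Create a throwaway local file to stand in for a web page (made-up content).
fd, path = tempfile.mkstemp()
os.write(fd, b"hello from a local file")
os.close(fd)

try:
    response = urlopen("file://" + path)  # assumes an absolute POSIX path
    body = response.read()                # the whole "page" as a byte string
    response.close()
finally:
    os.remove(path)

print(body)
```

Swapping the file:// URL for an http or https address gives the networked version shown above.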

    2. Building a Request object

    request = urllib2.Request("https://www.baidu.com")
    response = urllib2.urlopen(request)
    print response.read()

    3. Sending requests with POST and GET

      POST

    values = {'username':'test','password':'123'}
    data = urllib.urlencode(values)
    print data    # username=test&password=123 (key order may vary)
    request = urllib2.Request("https://www.baidu.com",data=data)
    response = urllib2.urlopen(request)
    print response.read()

      GET

    value = {}
    value['username']='test'
    value['password']='123'
    data = urllib.urlencode(value)
    url = "https://www.baidu.com"+"?"+data
    print url    #   https://www.baidu.com?username=test&password=123
    request = urllib2.Request(url=url)
    response = urllib2.urlopen(request)
    print response.read()
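
What separates the two cases is whether a data argument is passed: without it the request is a GET, with it the request becomes a POST. A minimal sketch that checks this without sending anything; the try/except import is an assumption added so the code also runs on Python 3 (urllib.request and urllib.parse).

```python
try:
    from urllib.request import Request    # Python 3
    from urllib.parse import urlencode
except ImportError:
    from urllib2 import Request           # Python 2
    from urllib import urlencode

base_url = "https://www.baidu.com"
query = urlencode({"username": "test"})   # 'username=test'

# No data argument: the query string travels in the URL and the method is GET.
get_request = Request(base_url + "?" + query)

# With a data argument: the same string travels in the body and the method is POST.
# Python 3 requires bytes here; on Python 2 the encode call leaves the string as it is.
post_request = Request(base_url, data=query.encode("ascii"))

print("%s %s" % (get_request.get_method(), get_request.get_full_url()))
print("%s %s" % (post_request.get_method(), post_request.get_full_url()))
```

Nothing is sent until one of these Request objects is passed to urlopen.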

    4. Encoding with quote

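    a = '哈哈'               # a byte string; assumes the source file is saved as UTF-8
    A = urllib.quote(a)      # percent-encode each byte
    print A                  # %E5%93%88%E5%93%88
    B = urllib.unquote(A)    # decode back to the original bytes
    print B                  # 哈哈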

      urlencode is already demonstrated in the GET part of section 3.
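
A small round-trip sketch of quote and unquote, plus quote_plus for comparison (quote_plus is not used in the article but sits next to quote in the same module). The sample string "a b&c=/d" is made up, the Chinese text is written with escapes so the snippet does not depend on the file encoding, and the try/except import is an assumption so it also runs on Python 3 (urllib.parse).

```python
try:
    from urllib.parse import quote, quote_plus, unquote   # Python 3
except ImportError:
    from urllib import quote, quote_plus, unquote         # Python 2

text = u"\u54c8\u54c8"                    # the two characters used above
encoded = quote(text.encode("utf-8"))     # percent-encode the UTF-8 bytes
decoded = unquote(encoded)
if isinstance(decoded, bytes):            # Python 2 hands back a byte string
    decoded = decoded.decode("utf-8")

print(encoded)                   # %E5%93%88%E5%93%88
print(quote("a b&c=/d"))         # a%20b%26c%3D/d   ('/' is left alone by default)
print(quote_plus("a b&c=/d"))    # a+b%26c%3D%2Fd   (space becomes '+', '/' is encoded)
```

urlencode applies quote_plus to every key and value, which is why spaces in form data show up as '+'.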

    5. Setting request headers

    url = "https://www.baidu.com"
    value = {"username":"test","password":"123"}
    data = urllib.urlencode(value)
    header = {
            "User-Agent":"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:50.0) Gecko/20100101 Firefox/50.0",
            "Referer":"http://tieba.baidu.com/f?kw=%E4%BF%9D%E5%AE%9A&ie=utf-8&pn=50"
        }
    request = urllib2.Request(url=url,data=data,headers=header)
    response = urllib2.urlopen(request)
    print response.read()
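
One detail worth knowing when reading headers back: Request stores each header name through str.capitalize(), so "User-Agent" is kept as "User-agent". A minimal sketch that inspects a Request without sending it; the shortened Referer value is made up for illustration, and the try/except import is an assumption so the code also runs on Python 3.

```python
try:
    from urllib.request import Request    # Python 3
except ImportError:
    from urllib2 import Request           # Python 2

user_agent = "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:50.0) Gecko/20100101 Firefox/50.0"
request = Request("https://www.baidu.com", headers={"User-Agent": user_agent})

# add_header attaches further headers after construction.
request.add_header("Referer", "http://tieba.baidu.com/")

# Header names are stored capitalized, so look them up as "User-agent".
print(request.has_header("User-agent"))   # True: the stored spelling
print(request.has_header("User-Agent"))   # False: the original spelling is not a key
print(request.get_header("Referer"))
```

The name is normalized again when the request is actually sent, so this only matters when querying the Request object itself.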

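    urlopen is a thin wrapper around a module-level default opener, which is an instance of urllib2.OpenerDirector. That default opener cannot always meet our needs,
    and when it falls short we have to build our own opener.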

    Source excerpt (abridged from urllib2)

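    _opener = None
    def install_opener(opener):
        global _opener
        _opener = opener
    # ------------------------------------------------
    def urlopen(url, data=None, timeout=socket._GLOBAL_DEFAULT_TIMEOUT):
        """..."""
        global _opener
        if _opener is None:
            _opener = build_opener()    # the default opener is created lazily
        return _opener.open(url, data, timeout)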
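
The delegation in the excerpt above can be observed without a network connection. The sketch below registers a hypothetical handler for a made-up "demo" scheme (DemoHandler and the demo:// URL are illustrations only, not part of urllib2), then shows that urlopen reaches it once the opener is installed. The try/except import is an assumption so the code also runs on Python 3.

```python
import io

try:
    from urllib.request import BaseHandler, build_opener, install_opener, urlopen  # Python 3
except ImportError:
    from urllib2 import BaseHandler, build_opener, install_opener, urlopen         # Python 2


class DemoHandler(BaseHandler):
    """Hypothetical handler for a made-up "demo" scheme; it never touches the network."""

    def demo_open(self, req):
        # An OpenerDirector routes a URL to the handler method named <scheme>_open.
        return io.BytesIO(("handled " + req.get_full_url()).encode("ascii"))


opener = build_opener(DemoHandler)    # the default handlers plus ours

# Using the opener directly works without installing it.
direct = opener.open("demo://example/path").read()

# After install_opener, the module-level urlopen delegates to the same opener.
install_opener(opener)
via_urlopen = urlopen("demo://example/path").read()

print(direct)
print(via_urlopen)
```

install_opener changes module-wide state, so every later urlopen call in the process goes through the installed opener.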

    6. Setting a proxy

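    enable_proxy = True
    # Only the schemes listed in the dict are proxied; an https URL would also need an "https" entry.
    proxy_handler = urllib2.ProxyHandler({"http" : 'http://some-proxy.com:8080'})
    null_proxy = urllib2.ProxyHandler({})    # an empty dict disables proxies
    if enable_proxy:
        opener = urllib2.build_opener(proxy_handler)    # create an opener object
    else:
        opener = urllib2.build_opener(null_proxy)
    
    urllib2.install_opener(opener)    # apply this opener globally
    
    request = urllib2.Request("https://www.baidu.com")
    response = opener.open(request)        # either call the opener directly ...
    response = urllib2.urlopen(request)    # ... or go through urlopen, which now uses it
    
    print response.read()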
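
To check which proxies an opener would actually use, its handlers can be inspected without sending a request. This is a rough sketch: proxies_in_use is a hypothetical helper, and it relies on the handlers and proxies attributes as they exist in CPython's implementation rather than on documented API. The try/except import is an assumption so the code also runs on Python 3.

```python
try:
    from urllib.request import ProxyHandler, build_opener   # Python 3
except ImportError:
    from urllib2 import ProxyHandler, build_opener          # Python 2


def proxies_in_use(opener):
    """Collect the scheme -> proxy mapping from every ProxyHandler in an opener."""
    found = {}
    for handler in opener.handlers:
        if isinstance(handler, ProxyHandler):
            found.update(handler.proxies)
    return found


# Passing our own ProxyHandler replaces the default one that reads the environment.
with_proxy = build_opener(ProxyHandler({"http": "http://some-proxy.com:8080"}))
without_proxy = build_opener(ProxyHandler({}))

print(proxies_in_use(with_proxy))
print(proxies_in_use(without_proxy))
```

The first opener reports only the "http" entry, which is a reminder that https URLs bypass this proxy unless an "https" key is added.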

    7. Working with cookies

    import cookielib
    
    # create a CookieJar instance to hold the cookies
    cookie = cookielib.CookieJar()
    
    # create the cookie processor
    handler = urllib2.HTTPCookieProcessor(cookie)
    
    # build an opener
    opener = urllib2.build_opener(handler)
    
    # request the URL with the cookie-aware opener
    response = opener.open("https://www.baidu.com")
    
    for item in cookie:
        print item     #<Cookie BIDUPSID=25441729620BF793C1BE08CA0B43C8D4 for .baidu.com/>
        print 'Name = '+item.name    #Name = BIDUPSID
        print 'Value = '+item.value    #Value = 25441729620BF793C1BE08CA0B43C8D4
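
On the way out, HTTPCookieProcessor asks the jar to add a Cookie header to each request. The sketch below does that step by hand so it can be seen without a server; the cookie name, value and the example.com domain are made up, and the try/except import is an assumption so the code also runs on Python 3 (http.cookiejar).

```python
try:
    from http.cookiejar import Cookie, CookieJar   # Python 3
    from urllib.request import Request
except ImportError:
    from cookielib import Cookie, CookieJar        # Python 2
    from urllib2 import Request

# A hand-built session cookie with a made-up name and value.
cookie = Cookie(
    version=0, name="sessionid", value="abc123",
    port=None, port_specified=False,
    domain=".example.com", domain_specified=True, domain_initial_dot=True,
    path="/", path_specified=True,
    secure=False, expires=None, discard=True,
    comment=None, comment_url=None, rest={},
)

jar = CookieJar()
jar.set_cookie(cookie)

# HTTPCookieProcessor calls add_cookie_header on each outgoing request;
# calling it by hand shows the header a matching request would carry.
request = Request("http://www.example.com/")
jar.add_cookie_header(request)

print(request.get_header("Cookie"))
```

A request to a host outside .example.com would get no Cookie header from this jar.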

    8. Saving cookies to a file

    import cookielib
    
    filename = "/home/an/savecookie.test"
    # create a MozillaCookieJar object to hold the cookies; they are written to the file later
    cookie = cookielib.MozillaCookieJar(filename)
    # create the cookie processor
    handle = urllib2.HTTPCookieProcessor(cookie)
    # build the opener
    opener = urllib2.build_opener(handle)
    
    response  = opener.open("http://www.baidu.com")
    # save the cookies to the file
    cookie.save(ignore_discard=True,ignore_expires=True)
    # ignore_discard: save cookies even if they are marked to be discarded (session cookies).
    # ignore_expires: save cookies even if they have already expired.

    9. Loading cookies from a file and using them

    import cookielib
    
    cookie = cookielib.MozillaCookieJar()
    cookie.load("/home/an/savecookie.test",ignore_expires=True,ignore_discard=True)
    
    handler = urllib2.HTTPCookieProcessor(cookie)
    opener = urllib2.build_opener(handler)
    
    request = urllib2.Request("http://www.baidu.com")
    response = opener.open(request)
    print response.read()
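
The effect of ignore_discard on the save and load pair can be checked offline with a round trip through a temporary file. This is a sketch under stated assumptions: the session cookie is hand-built with made-up values, saved_names is a hypothetical helper, and the try/except import is there so the code also runs on Python 3 (http.cookiejar).

```python
import os
import tempfile

try:
    from http.cookiejar import Cookie, MozillaCookieJar   # Python 3
except ImportError:
    from cookielib import Cookie, MozillaCookieJar        # Python 2

# A session cookie: no expiry time and discard=True (made-up name and value).
session_cookie = Cookie(
    version=0, name="sessionid", value="abc123",
    port=None, port_specified=False,
    domain=".example.com", domain_specified=True, domain_initial_dot=True,
    path="/", path_specified=True,
    secure=False, expires=None, discard=True,
    comment=None, comment_url=None, rest={},
)

fd, filename = tempfile.mkstemp()
os.close(fd)

jar = MozillaCookieJar(filename)
jar.set_cookie(session_cookie)


def saved_names(**save_flags):
    """Save the jar with the given flags, then reload the file into a fresh jar."""
    jar.save(**save_flags)
    fresh = MozillaCookieJar()
    fresh.load(filename, ignore_discard=True, ignore_expires=True)
    return [c.name for c in fresh]


default_names = saved_names()                     # session cookie is skipped
kept_names = saved_names(ignore_discard=True)     # session cookie is written
os.remove(filename)

print("%s %s" % (default_names, kept_names))
```

With the default flags the file contains only the header, which is why both sections above pass ignore_discard=True.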
  • Original article: https://www.cnblogs.com/jijizhazha/p/6370065.html