  • Web page downloader: urllib2 examples

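    1. The simplest approach

    response = urllib2.urlopen(url): request the URL directly

    response.getcode(): get the HTTP status code

    response.read(): read the response body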

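    # coding: utf8
    import urllib2

    url = "http://www.baidu.com"

    print 'Method 1'
    response1 = urllib2.urlopen(url)   # send a GET request with default settings
    print response1.getcode()          # HTTP status code, 200 on success
    print len(response1.read())        # length of the response body in bytes

    Output:

    Method 1
    200
    118765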
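
    urlopen raises an exception when the server answers with an error status or cannot be reached, so getcode() in the first method only runs on success. A minimal sketch of handling both cases follows. The helper name fetch and the 10-second timeout are assumptions, not part of the original, and the import falls back to urllib.request so the sketch also runs on Python 3, where urllib2 was renamed.

```python
try:
    import urllib2                      # Python 2, as used in the article
except ImportError:
    import urllib.request as urllib2    # assumption: Python 3 fallback with the same names


def fetch(url, timeout=10):
    """Hypothetical helper: return (status_code, body).

    Returns (code, None) when the server replies with an HTTP error,
    and (None, None) when no HTTP response was obtained at all.
    """
    try:
        response = urllib2.urlopen(url, timeout=timeout)
    except urllib2.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        # HTTPError is a subclass of URLError, so it must be caught first.
        return err.code, None
    except urllib2.URLError:
        # No usable response: DNS failure, refused connection, unknown scheme.
        return None, None
    try:
        return response.getcode(), response.read()
    finally:
        response.close()
```

    Calling fetch("http://www.baidu.com") would return the same status code and body that the first method prints, while a failed request yields None instead of a traceback.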

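    2. Adding data and HTTP headers

    request = urllib2.Request(url): create a Request object

    request.add_data(data): attach data to the request (a request that carries data is sent as a POST)

    request.add_header(key, value): add an HTTP header

    response = urllib2.urlopen(request): send the request and get the result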

    print 'Method 2'
    request = urllib2.Request(url)
    request.add_header("user-agent", "Mozilla/5.0")   # present a browser-like User-Agent
    response2 = urllib2.urlopen(request)               # urlopen also accepts a Request object
    print response2.getcode()
    print len(response2.read())

    Output:

    Method 2
    200
    118649
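
    The second method lists add_data but its code only sets a header. A minimal sketch of a request that carries both data and a header follows, built without being sent. The helper name build_post_request and the form field q are hypothetical. The data is passed to the Request constructor, which on Python 2 has the same effect as calling request.add_data(body); the import fallback for Python 3 is likewise an assumption.

```python
try:
    import urllib2                          # Python 2, as used in the article
    from urllib import urlencode
except ImportError:
    import urllib.request as urllib2        # assumption: Python 3 fallback
    from urllib.parse import urlencode


def build_post_request(url, fields, user_agent="Mozilla/5.0"):
    """Hypothetical helper: build (but do not send) a form POST request."""
    # Form fields are percent-encoded into a byte string such as b"q=urllib2".
    body = urlencode(fields).encode("ascii")
    # Passing data here is equivalent to request.add_data(body) on Python 2.
    request = urllib2.Request(url, data=body)
    request.add_header("User-Agent", user_agent)
    return request


request = build_post_request("http://www.baidu.com", {"q": "urllib2"})
print(request.get_method())               # POST, because the request carries data
print(request.get_header("User-agent"))   # Mozilla/5.0 (header keys are stored capitalized)
```

    Passing the returned object to urllib2.urlopen(request) would send it exactly as in the second method.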

    3. Adding handlers for special scenarios

    cj = cookielib.CookieJar(): create a cookie container

    opener = ...: create an opener

    urllib2.install_opener(opener): install the opener as urllib2's default

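    import cookielib

    print 'Method 3'
    cj = cookielib.CookieJar()                                       # container for received cookies
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))   # opener that stores cookies in cj
    urllib2.install_opener(opener)                                   # make urlopen use this opener
    response3 = urllib2.urlopen(url)
    print response3.getcode()
    print cj
    print response3.read()

    Output. If install_opener is omitted, urlopen uses the default opener and cj prints as an empty <CookieJar[]>, as in the run recorded here (page body truncated):

    Method 3
    200
    <CookieJar[]>
    <!DOCTYPE html>
    <!--STATUS OK-->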
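
    install_opener changes the opener used by every later urlopen call. An opener can also be used directly through its open method, which leaves the global default untouched. A minimal sketch under that approach follows. The helper names make_cookie_opener and cookies_as_dict are hypothetical, and the import fallback for Python 3 is an assumption.

```python
try:
    import urllib2                          # Python 2, as used in the article
    import cookielib
except ImportError:
    import urllib.request as urllib2        # assumption: Python 3 fallback
    import http.cookiejar as cookielib


def make_cookie_opener():
    """Hypothetical helper: return (opener, jar) with cookie handling enabled."""
    jar = cookielib.CookieJar()
    # HTTPCookieProcessor stores cookies from responses in jar and sends
    # them back on later requests made through this opener.
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
    return opener, jar


def cookies_as_dict(jar):
    """Hypothetical helper: flatten a CookieJar into {name: value}."""
    return dict((cookie.name, cookie.value) for cookie in jar)


# Usage (performs a real request, so it is left commented out):
# opener, jar = make_cookie_opener()
# response = opener.open("http://www.baidu.com")
# print(response.getcode())
# print(cookies_as_dict(jar))
```

    Which cookies end up in the jar depends on what the server sends, so none are listed here.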

  • Original article: https://www.cnblogs.com/strawqqhat/p/10602395.html