  • Three Beginner Methods for Python Web Scraping

    Python 2.x

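    import urllib2
    import cookielib

    url = "http://www.baidu.com"

    # Method 1: fetch the URL directly
    print 'Method 1'
    response1 = urllib2.urlopen(url)
    print response1.getcode()       # HTTP status code
    print len(response1.read())     # length of the response body

    # Method 2: build a Request object so a header can be added
    print 'Method 2'
    request = urllib2.Request(url)
    request.add_header("user-agent", "Mozilla/5.0")
    response2 = urllib2.urlopen(request)
    print response2.getcode()
    print len(response2.read())

    # Method 3: install a global opener that stores cookies
    print 'Method 3'
    cj = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    urllib2.install_opener(opener)
    response3 = urllib2.urlopen(url)
    print response3.getcode()
    print cj                        # cookies collected by the opener
    print response3.read()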
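
The Python 2 modules used above were renamed in Python 3: `urllib2` was split into `urllib.request` and `urllib.error`, and `cookielib` became `http.cookiejar`. Below is a minimal sketch of a version-tolerant import. The aliases `request_mod` and `cookiejar_mod` are hypothetical names chosen for illustration and do not come from the original article.

```python
try:
    # Python 2 module names
    import urllib2 as request_mod
    import cookielib as cookiejar_mod
except ImportError:
    # Python 3 equivalents
    import urllib.request as request_mod
    import http.cookiejar as cookiejar_mod

# The same calls then work under either interpreter.
# Nothing here touches the network; it only builds the objects.
cj = cookiejar_mod.CookieJar()
opener = request_mod.build_opener(request_mod.HTTPCookieProcessor(cj))
request = request_mod.Request("http://www.baidu.com")
request.add_header("user-agent", "Mozilla/5.0")
```

Passing `request` to `opener.open(...)` would then send it with the header attached and any returned cookies stored in `cj`.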

    Python 3.x

    import urllib.request
    import http.cookiejar

    url = "http://www.baidu.com"

    # Method 1: fetch the URL directly
    print('Method 1')
    response1 = urllib.request.urlopen(url)
    print(response1.getcode())      # HTTP status code
    print(len(response1.read()))    # length of the response body

    # Method 2: build a Request object so a header can be added
    print('Method 2')
    request = urllib.request.Request(url)
    request.add_header('user-agent', 'Mozilla/5.0')
    response2 = urllib.request.urlopen(request)
    print(response2.getcode())
    print(len(response2.read()))

    # Method 3: install a global opener that stores cookies
    print('Method 3')
    cj = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
    urllib.request.install_opener(opener)
    response3 = urllib.request.urlopen(url)
    print(response3.getcode())
    print(cj)                       # cookies collected by the opener
    print(response3.read())
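
The three methods above can also be combined on a single opener that is not installed globally, so the custom header and the cookie jar travel together. This is a minimal sketch, assuming Python 3. The helper names `make_opener` and `fetch` are hypothetical and not from the original article, and the 10-second timeout is an arbitrary choice.

```python
import http.cookiejar
import urllib.error
import urllib.request


def make_opener(user_agent="Mozilla/5.0"):
    """Return (opener, cookie_jar). Hypothetical helper, not from the article."""
    cj = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
    # addheaders is applied to every request made through this opener
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, cj


def fetch(opener, url, timeout=10):
    """Return (status_code, body_bytes), or (None, b"") if the request fails."""
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.getcode(), response.read()
    except urllib.error.URLError as exc:  # HTTPError is a subclass of URLError
        print("request failed:", exc)
        return None, b""
```

A typical call would be `opener, cj = make_opener()` followed by `status, body = fetch(opener, "http://www.baidu.com")`. Since `body` is a `bytes` object, it needs to be decoded before being treated as text.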

    Reference: http://www.imooc.com/article/16363

  • Original article: https://www.cnblogs.com/youmingkuang/p/7458488.html