  • Downloading a web page with Python

    Runs under Python 2.7.

    import urllib2


    def getHtml(url):
        """Fetch url and return the raw response body, or None on failure."""
        response = None
        # Send a browser-like User-Agent; some sites reject the urllib2 default.
        headers = {'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-us; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'}
        try:
            request = urllib2.Request(url, headers=headers)
            response = urllib2.urlopen(request)
            return response.read()
        except urllib2.URLError as e:
            # HTTPError carries a status code; a plain URLError only has a reason.
            if hasattr(e, 'code'):
                print 'Error code:', e.code
            elif hasattr(e, 'reason'):
                print 'Reason:', e.reason
            return None
        finally:
            if response:
                response.close()


    def saveHtml(file_name, file_content):
        # Replace '/' so the name cannot point into another directory.
        with open(file_name.replace('/', '_') + ".html", "wb") as f:
            f.write(file_content)


    html = getHtml("https://www.baidu.com/")
    if html is not None:
        saveHtml("xxx", html)
        # Show the downloaded page.
        print html
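
The script above relies on urllib2 and the print statement, both of which exist only in Python 2. As a rough sketch, assuming Python 3, the same flow could be written with urllib.request. The names get_html, save_html and USER_AGENT, the User-Agent value, and the timeout default are assumptions made for illustration and are not part of the original post.

```python
import urllib.error
import urllib.request

# Hypothetical User-Agent string, chosen only for illustration.
USER_AGENT = "Mozilla/5.0 (compatible; example-downloader)"


def get_html(url, timeout=10):
    """Return the response body as bytes, or None if the request fails."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    try:
        # The with-block closes the response, replacing the finally clause.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as e:
        # HTTPError is a subclass of URLError, so it must be caught first.
        print("Error code:", e.code)
    except urllib.error.URLError as e:
        print("Reason:", e.reason)
    return None


def save_html(file_name, file_content):
    """Write bytes to <file_name>.html, replacing '/' in the name with '_'."""
    path = file_name.replace("/", "_") + ".html"
    with open(path, "wb") as f:
        f.write(file_content)
    return path
```

To mirror the original script, one would call get_html("https://www.baidu.com/"), check that the result is not None, and pass it to save_html("xxx", ...). The body is returned as bytes, so it would need decoding before being printed as text.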
  • Original article: https://www.cnblogs.com/studyskill/p/8213777.html