  • python gzip get url

    import urllib2, gzip, StringIO
    
    __author__ = "Mark Pilgrim (mark@diveintomark.org)"
    __license__ = "Python"
    
    def get(uri):
        # Tell the server we can accept a gzip-compressed response.
        request = urllib2.Request(uri)
        request.add_header("Accept-encoding", "gzip")
        usock = urllib2.urlopen(request)
        data = usock.read()
        # Decompress only if the server actually sent gzip data.
        if usock.headers.get('content-encoding', None) == 'gzip':
            data = gzip.GzipFile(fileobj=StringIO.StringIO(data)).read()
        return data
    
    if __name__ == '__main__':
        import sys
        uri = sys.argv[1:] and sys.argv[1] or 'http://leknor.com/'
        print get(uri)
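
The snippet above targets Python 2 (`urllib2`, `StringIO`). As a rough sketch of the same idea on Python 3, assuming `urllib.request` and `io.BytesIO` as the replacements, the header check can be pulled into a small helper. The name `decode_body` is hypothetical and not part of the original script.

```python
import gzip
import io
import urllib.request


def decode_body(data, content_encoding):
    """Return the decompressed body if the server labelled it as gzip."""
    # Hypothetical helper: compare case-insensitively, pass other data through.
    if content_encoding and content_encoding.strip().lower() == "gzip":
        return gzip.GzipFile(fileobj=io.BytesIO(data)).read()
    return data


def get(uri):
    """Fetch uri while advertising gzip support; return the body as bytes."""
    request = urllib.request.Request(uri)
    request.add_header("Accept-Encoding", "gzip")
    with urllib.request.urlopen(request) as response:
        data = response.read()
        encoding = response.headers.get("Content-Encoding")
    return decode_body(data, encoding)
```

Keeping the decompression step separate from the network call makes it possible to check the gzip handling without contacting a server.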
    
    Example 11.12. Using the redirect handler to detect permanent redirects
    http://diveintopython.org/http_web_services/redirects.html

    11.8. Handling compressed data
    http://diveintopython.org/http_web_services/gzip_compression.html

    http://rationalpie.wordpress.com/2010/06/02/python-streaming-gzip-decompression/

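The last link above is about streaming gzip decompression. A minimal Python 3 sketch of that idea, not taken from the linked article, is to feed compressed chunks to `zlib.decompressobj` as they arrive instead of reading the whole body first. The function name `iter_gunzip` and the 64-byte chunk size are assumptions for illustration.

```python
import gzip
import zlib


def iter_gunzip(chunks):
    """Yield decompressed bytes for an iterable of gzip-compressed chunks."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        piece = decompressor.decompress(chunk)
        if piece:
            yield piece
    # Emit whatever is still buffered once the input is exhausted.
    tail = decompressor.flush()
    if tail:
        yield tail


# Usage: split a compressed payload into small pieces, as a socket might deliver it.
original = b"streamed gzip data " * 100
compressed = gzip.compress(original)
pieces = [compressed[i:i + 64] for i in range(0, len(compressed), 64)]
restored = b"".join(iter_gunzip(pieces))
```

This keeps memory use bounded by the chunk size rather than the full response, which matters for large downloads.
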
    Python mechanize gzip response handling

    Mechanize is awesome. The documentation is shit. The gzip support is non-existent. Some sites like Yahoo! require gzip support.

    from mechanize import Browser

    def ungzipResponse(r, b):
        # Decompress a gzip-encoded response and hand it back to the browser.
        headers = r.info()
        # .get() avoids a KeyError when the Content-Encoding header is absent.
        if headers.get('Content-Encoding') == 'gzip':
            import gzip
            gz = gzip.GzipFile(fileobj=r, mode='rb')
            html = gz.read()
            gz.close()
            headers["Content-type"] = "text/html; charset=utf-8"
            r.set_data(html)
            b.set_response(r)

    b = Browser()
    b.addheaders.append(['Accept-Encoding', 'gzip'])
    r = b.open('http://some-gzipped-site.com')
    ungzipResponse(r, b)
    print r.read()
    

    http://unformatt.com/news/python-mechanize-gzip-response-handling/

    http://news.ycombinator.com/item?id=1424488
    good article
    http://betterexplained.com/articles/how-to-optimize-your-site-with-gzip-compression/
  • Original article: https://www.cnblogs.com/lexus/p/1806736.html