python urllib2

Method 1 (the URL may also use other schemes, such as ftp or file):

>>> import urllib2
>>> response = urllib2.urlopen('http://python.org/')
>>> print response.read(),response.geturl(),response.getcode()
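
As a rough sketch of the non-HTTP case mentioned above, the example below opens a local file through a file:// URL. It assumes Python 3, where urllib2's urlopen lives in urllib.request; the temporary file and its contents are made up purely for illustration.

```python
import os
import pathlib
import tempfile
from urllib.request import urlopen  # Python 3 counterpart of urllib2.urlopen

# Create a throwaway file so the example has something to fetch (illustrative only).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hello from a file URL")
    path = tmp.name

file_url = pathlib.Path(path).as_uri()  # turns the path into a file:// URL

response = urlopen(file_url)
body = response.read()         # bytes in Python 3
final_url = response.geturl()  # the URL that was actually opened
status = response.getcode()    # None here: file URLs carry no HTTP status code
response.close()
os.remove(path)

print(body.decode("utf-8"))
print(final_url)
print(status)
```

The same read(), geturl() and getcode() calls are used as in the HTTP example; only the URL scheme changes.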

Method 2 (build a Request object first, then pass it to urlopen):

>>> import urllib2
>>> req=urllib2.Request("http://python.org")
>>> response=urllib2.urlopen(req)
>>> print response.read(),response.geturl(),response.getcode()
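
A Request object only describes what will be fetched; creating it sends nothing. The short sketch below inspects one without touching the network. It assumes Python 3, where urllib2.Request is available as urllib.request.Request.

```python
from urllib.request import Request  # Python 3 counterpart of urllib2.Request

req = Request("http://python.org")

# Nothing has been sent yet; the object only records what will be requested.
full_url = req.get_full_url()
method = req.get_method()  # GET, because no data is attached
host = req.host

print(full_url)
print(method)
print(host)
```

Because the request is a separate object, data and headers can be attached to it before it is handed to urlopen.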

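To POST data or set request headers, pass them to Request as additional arguments. The headers must be given as a dictionary:

import urllib
import urllib2

url = 'http://python.org/'  # placeholder URL
user_agent = 'Mozilla/5.0'  # placeholder User-Agent string
values = {}  # form fields as a dictionary; contents omitted here
data = urllib.urlencode(values)  # encode the dictionary as a form body
headers = {'User-Agent': user_agent}  # request headers as a dictionary
req = urllib2.Request(url, data, headers)  # supplying data makes this a POST
response = urllib2.urlopen(req)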
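
The effect of attaching data and headers can be checked on the Request object itself, without sending anything. This is a minimal sketch assuming Python 3 (urlencode moved to urllib.parse, Request to urllib.request); the endpoint, form fields and User-Agent value are hypothetical.

```python
from urllib.parse import urlencode  # Python 2: urllib.urlencode
from urllib.request import Request  # Python 2: urllib2.Request

# Hypothetical endpoint, form fields, and User-Agent, chosen only for illustration.
url = "http://example.com/submit"
values = {"name": "alice", "language": "python"}
headers = {"User-Agent": "example-client/0.1"}

# Python 3 expects the request body as bytes, so encode the form string.
data = urlencode(values).encode("ascii")
req = Request(url, data, headers)

method = req.get_method()                  # POST, because data was supplied
user_agent = req.get_header("User-agent")  # Request stores header names capitalized

print(method)
print(req.data)
print(user_agent)
```

Note that the header is looked up as "User-agent": Request normalizes header names with str.capitalize() when it stores them.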

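Exception handling:

When urlopen cannot handle a response, it raises URLError. HTTPError is a subclass of URLError that is raised in the specific case of HTTP URLs.

URLError:
URLError is usually raised because there is no network connection (no route to the specified server) or because the specified server does not exist. In this case the raised exception has a reason attribute, which is a tuple containing an error code and a text error message.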

from urllib2 import Request, urlopen, URLError
someurl = 'http://python.org/'  # placeholder URL
req = Request(someurl)
try:
    response = urlopen(req)
except URLError as e:
    # Check 'code' first: only HTTPError has it, while newer Python
    # versions give HTTPError a reason attribute as well.
    if hasattr(e, 'code'):
        print 'The server couldn\'t fulfill the request.'
        print 'Error code: ', e.code
    elif hasattr(e, 'reason'):
        print 'We failed to reach a server.'
        print 'Reason: ', e.reason

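Alternatively, catch the two exceptions separately. HTTPError must come first, because it is a subclass of URLError and would otherwise be caught by the URLError clause:

from urllib2 import Request, urlopen, URLError, HTTPError
someurl = 'http://python.org/'  # placeholder URL
req = Request(someurl)
try:
    response = urlopen(req)
except HTTPError as e:
    print 'The server couldn\'t fulfill the request.'
    print 'Error code: ', e.code
except URLError as e:
    print 'We failed to reach a server.'
    print 'Reason: ', e.reason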
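
The ordering of the except clauses can be tried out offline. The sketch below assumes Python 3, where both exception classes live in urllib.error. The describe helper, the hand-built HTTPError and the missing file are all illustrative stand-ins, not part of the library's normal usage.

```python
import io
import pathlib
import tempfile
from urllib.error import HTTPError, URLError  # Python 2: both live in urllib2
from urllib.request import urlopen            # Python 2: urllib2.urlopen


def describe(exc):
    """Re-raise exc and report which except clause catches it."""
    try:
        raise exc
    except HTTPError as e:  # listed first, because HTTPError is also a URLError
        return "HTTP error, code %d" % e.code
    except URLError as e:
        return "URL error, reason: %s" % (e.reason,)


# A hand-built HTTPError (hypothetical URL) stands in for a server's 404 reply.
fake_404 = HTTPError("http://example.com/missing", 404, "Not Found", {}, io.BytesIO())
http_result = describe(fake_404)

# A file URL pointing at a nonexistent file raises a plain URLError, no network needed.
with tempfile.TemporaryDirectory() as tmpdir:
    missing_url = pathlib.Path(tmpdir, "missing.txt").as_uri()
    try:
        urlopen(missing_url)
        url_result = "no error"
    except URLError as e:
        url_result = describe(e)

print(issubclass(HTTPError, URLError))
print(http_result)
print(url_result)
```

If the two except clauses in describe were swapped, the 404 would be reported as a generic URL error, since every HTTPError is also a URLError.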

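Some other methods:

geturl(): returns the real URL of the page that was fetched. This is useful because urlopen (or the opener object in use) may have followed a redirect, so the URL of the fetched page may differ from the URL that was requested.

info(): returns a dictionary-like object that describes the fetched page, in particular the headers sent by the server. It is currently an httplib.HTTPMessage instance. Typical headers include 'Content-length', 'Content-type', and so on. See the Quick Reference to HTTP Headers for a list of HTTP headers with brief explanations of their meaning and use.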

import urllib2
req=urllib2.Request("http://python.org")
response = urllib2.urlopen(req)
print "response.geturl() ",response.geturl()
print "response.info():\n",response.info()

Output:

response.geturl()  http://python.org
response.info():
Date: Mon, 13 May 2013 13:11:23 GMT
Server: Apache/2.2.16 (Debian)
Last-Modified: Sun, 12 May 2013 03:41:45 GMT
ETag: "105800d-52ab-4dc7d2dd81040"
Accept-Ranges: bytes
Content-Length: 21163
Vary: Accept-Encoding
Connection: close
Content-Type: text/html
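
To see geturl() report a different URL from the one requested, a redirect is needed. The sketch below sets one up against a toy local server so that no outside site is involved. It assumes Python 3 (urllib.request and http.server); the handler class, the /old and /new paths and the response body are invented for this example.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener


class RedirectingHandler(BaseHTTPRequestHandler):
    """Toy handler: /old redirects to /new, which serves a short text body."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            payload = b"moved here"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the example's output quiet


server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_address[1]

opener = build_opener(ProxyHandler({}))  # ignore proxy settings for this local test
response = opener.open(base + "/old", timeout=5)
final_url = response.geturl()                   # URL after following the redirect
content_type = response.info()["Content-Type"]  # header lookup is case-insensitive
body = response.read()
response.close()
server.shutdown()
server.server_close()

print(final_url)
print(content_type)
```

The request goes to /old, but geturl() returns the /new address, and info() exposes the headers of that final response.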

Original article: https://www.cnblogs.com/TianMG/p/3076363.html