  • urllib2

    import urllib2
    response = urllib2.urlopen("http://www.baidu.com")
    print response.read()

    urlopen(url, data, timeout): url is required; data (the request body) and timeout (in seconds) are optional.

    Constructing a Request

    import urllib2

    request = urllib2.Request("http://www.baidu.com")
    response = urllib2.urlopen(request)
    print response.read()

    POST method:
    import urllib
    import urllib2

    values = {"username":"1016903103@qq.com","password":"XXXX"}
    data = urllib.urlencode(values)
    url = "https://passport.csdn.net/account/login?from=http://my.csdn.net/my/mycsdn"
    request = urllib2.Request(url,data)
    response = urllib2.urlopen(request)
    print response.read()
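
As an offline check of the POST example above, the following sketch builds the same kind of request without sending it. It assumes the code may run on Python 2 or on Python 3 (where urllib2 became urllib.request and urlencode moved to urllib.parse). The URL and credentials are placeholders, not real endpoints.

```python
# Compatibility imports: urllib2/urllib on Python 2, urllib.request/urllib.parse on Python 3.
try:
    import urllib2
    from urllib import urlencode
except ImportError:
    import urllib.request as urllib2
    from urllib.parse import urlencode

# Placeholder form fields (hypothetical values, for illustration only).
values = {"username": "user@example.com", "password": "XXXX"}

# urlencode turns the dict into a form-encoded string; Python 3 needs the body as bytes.
post_data = urlencode(values).encode("ascii")

# Supplying a body as the second argument is what makes the request a POST.
post_request = urllib2.Request("http://www.example.com/login", post_data)

print(post_request.get_method())  # the HTTP method urlopen would use
print(post_data)                  # the encoded request body
```

Nothing is sent until urlopen(post_request) is called, so the method and body can be inspected safely first.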

    GET method:
    import urllib
    import urllib2
    values={}
    values['username'] = "1016903103@qq.com"
    values['password']="XXXX"
    data = urllib.urlencode(values)
    url = "http://passport.csdn.net/account/login"
    geturl = url + "?"+data
    request = urllib2.Request(geturl)
    response = urllib2.urlopen(request)
    print response.read()
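
The GET variant differs only in where the encoded parameters go: they are appended to the URL and no body is passed. A minimal sketch, again runnable offline on Python 2 or 3, with a placeholder URL and credentials that are assumptions for illustration:

```python
# Compatibility imports for Python 2 and Python 3.
try:
    import urllib2
    from urllib import urlencode
    from urlparse import urlparse, parse_qs
except ImportError:
    import urllib.request as urllib2
    from urllib.parse import urlencode, urlparse, parse_qs

base_url = "http://www.example.com/login"  # placeholder endpoint
params = {"username": "user@example.com", "password": "XXXX"}

# The query string is joined to the URL with "?"; no data argument is given.
get_url = base_url + "?" + urlencode(params)
get_request = urllib2.Request(get_url)

print(get_request.get_method())    # no body was supplied, so the method is GET
print(get_request.get_full_url())

# Parsing the query back shows that urlencode escaped the values reversibly.
decoded = parse_qs(urlparse(get_url).query)
print(decoded)
```

Because the parameters travel in the URL, a GET request like this exposes them in logs and browser history, which is one reason login forms normally use POST.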

    Setting Headers

    import urllib
    import urllib2

    url = 'http://www.server.com/login'
    user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
    values = {'username' : 'cqc',  'password' : 'XXXX' }
    headers = { 'User-Agent' : user_agent }
    data = urllib.urlencode(values)
    request = urllib2.Request(url, data, headers)
    response = urllib2.urlopen(request)
    page = response.read()

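    Dealing with anti-hotlinking: the server checks whether the Referer in the request headers points to its own site, and some servers will not respond if it does not. So we can also add a Referer to the headers: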

    headers = { 'User-Agent' : 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'  ,'Referer':'http://www.zhihu.com/articles' }
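
To see what the Request object actually stores for these headers, the sketch below builds a request with both headers and reads them back without contacting any server. The target URL is a placeholder, and the import fallback assumes the code may also run on Python 3.

```python
# urllib2 on Python 2, urllib.request on Python 3.
try:
    import urllib2
except ImportError:
    import urllib.request as urllib2

headers = {
    "User-Agent": "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)",
    "Referer": "http://www.zhihu.com/articles",
}
header_request = urllib2.Request("http://www.example.com/login", headers=headers)

# Request normalizes names with str.capitalize(), so "User-Agent" is stored as "User-agent".
print(header_request.get_header("Referer"))
print(header_request.get_header("User-agent"))

# add_header is an alternative to passing the dict up front.
header_request.add_header("Accept", "text/html")
print(header_request.has_header("Accept"))
```

The capitalization detail matters only when reading headers back from the Request; HTTP header names are case-insensitive on the wire.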

    Proxy settings

    import urllib2
    enable_proxy = True
    proxy_handler = urllib2.ProxyHandler({"http" : 'http://some-proxy.com:8080'})
    null_proxy_handler = urllib2.ProxyHandler({})
    if enable_proxy:
        opener = urllib2.build_opener(proxy_handler)
    else:
        opener = urllib2.build_opener(null_proxy_handler)
    urllib2.install_opener(opener)
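
The same choice between a proxied and a direct opener can be wrapped in a small helper. This is only a sketch: make_opener and configured_proxies are hypothetical names introduced here, the proxy address is the placeholder from the example above, and nothing is sent over the network.

```python
# urllib2 on Python 2, urllib.request on Python 3.
try:
    import urllib2
except ImportError:
    import urllib.request as urllib2

def make_opener(proxy_url=None):
    """Return an opener that uses proxy_url for HTTP, or no proxy when it is None."""
    proxies = {"http": proxy_url} if proxy_url else {}
    # Passing a ProxyHandler instance replaces the default, environment-based one.
    return urllib2.build_opener(urllib2.ProxyHandler(proxies))

def configured_proxies(opener):
    """Collect the proxy mappings of every ProxyHandler registered on the opener."""
    found = {}
    for handler in opener.handlers:
        if isinstance(handler, urllib2.ProxyHandler):
            found.update(handler.proxies)
    return found

proxied = make_opener("http://some-proxy.com:8080")
direct = make_opener()

print(configured_proxies(proxied))
print(configured_proxies(direct))
```

Calling proxied.open(url) uses the proxy for that call only, whereas urllib2.install_opener(proxied) makes it the default for every later urlopen call in the process.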

    Timeout settings

    import urllib2
    response = urllib2.urlopen('http://www.baidu.com', timeout=10)

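    import urllib2
    data = None  # placeholder; pass urlencoded form data here to send a POST
    response = urllib2.urlopen('http://www.baidu.com', data, 10)  # timeout given positionally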

    Using the HTTP PUT and DELETE methods
    import urllib2
    # uri and data are assumed to be defined by the caller
    request = urllib2.Request(uri, data=data)
    request.get_method = lambda: 'PUT' # or 'DELETE'
    response = urllib2.urlopen(request)
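
The get_method override can be checked without sending anything. In the sketch below, request_with_method is a hypothetical helper, and the URL and JSON body are placeholders chosen for illustration.

```python
# urllib2 on Python 2, urllib.request on Python 3.
try:
    import urllib2
except ImportError:
    import urllib.request as urllib2

def request_with_method(uri, method, data=None):
    """Build a Request whose get_method() reports the given HTTP verb."""
    request = urllib2.Request(uri, data=data)
    # urlopen asks the request for its verb through get_method(), so overriding it is enough.
    request.get_method = lambda: method
    return request

put_request = request_with_method("http://www.example.com/items/1", "PUT", b'{"name": "demo"}')
delete_request = request_with_method("http://www.example.com/items/1", "DELETE")

print(put_request.get_method())
print(delete_request.get_method())
```

On Python 3.3 and later, urllib.request.Request also accepts a method argument directly, which avoids the lambda override.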

    Python Crawler Basics, Part 5: URLError Exception Handling

    http://blog.csdn.net/cqcre

  • Original article: https://www.cnblogs.com/lly-lly/p/5390949.html