  • Using urllib, the most basic HTTP library

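    The urllib library


    Python's built-in HTTP request library

    1. urllib.request: the request module

    2. urllib.error: the exception-handling module (used with try/except)

    3. urllib.parse: the URL-parsing module

    4. urllib.robotparser: the robots.txt-parsing module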
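
    The sketch below touches three of the four modules without making any network request. The example.com URLs and the robots.txt rules are made up for illustration; they are not from the original article.

```python
import urllib.error
import urllib.parse
import urllib.robotparser

# urllib.parse: split a URL into components and build a query string.
parts = urllib.parse.urlparse("http://httpbin.org/get?name=test#top")
query = urllib.parse.urlencode({"name": "test", "page": 2})

# urllib.robotparser: parse robots.txt rules supplied as lines (no download).
# The rules and the example.com URLs are hypothetical.
robots = urllib.robotparser.RobotFileParser()
robots.parse([
    "User-agent: *",
    "Disallow: /private/",
])
allowed_public = robots.can_fetch("*", "http://example.com/index.html")
allowed_private = robots.can_fetch("*", "http://example.com/private/data.html")

# urllib.error: HTTPError is a subclass of URLError.
http_error_is_url_error = issubclass(urllib.error.HTTPError, urllib.error.URLError)

print(parts.netloc, parts.path, parts.query, parts.fragment)
print(query)
print(allowed_public, allowed_private, http_error_is_url_error)
```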


    urlopen

    GET request

    import urllib.request
    response = urllib.request.urlopen("http://www.baidu.com")
    print(response.read().decode('utf-8'))  # .read() returns the response body as bytes
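
    The GET example hard-codes UTF-8 and never closes the response. A possible variant, shown here as a sketch, wraps the call in a with-statement and reads the charset from the Content-Type header. The helper name fetch_text and its default values are assumptions, not part of urllib.

```python
import urllib.request


def fetch_text(url, timeout=10, default_charset="utf-8"):
    """Fetch a URL and decode the body with the charset the server declares.

    fetch_text is a hypothetical helper; timeout and default_charset are
    illustrative defaults.
    """
    # The with-statement closes the response even if read() raises.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # response.headers is an email.message.Message; get_content_charset()
        # returns None when Content-Type does not name a charset.
        charset = response.headers.get_content_charset() or default_charset
        return response.read().decode(charset)


# Example call (needs network access):
#     print(fetch_text("http://www.baidu.com")[:200])
```

    urlopen also accepts data: URLs, which makes it easy to try the helper offline, for example fetch_text("data:text/plain;charset=utf-8,hello").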

    POST request

    import urllib.parse  # urllib.request already imports it internally, but an explicit import is clearer
    import urllib.request
    data=bytes(urllib.parse.urlencode({'name':'汪国强'}),encoding='utf-8')
    response=urllib.request.urlopen('http://httpbin.org/post',data=data)
    print(response.read().decode('utf-8'))
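
    To see what the POST example actually sends, the sketch below builds the same kind of form body and inspects it without contacting the server. The extra "page" field is an assumption added for illustration.

```python
import urllib.parse
import urllib.request

# The "page" field is a made-up extra field; "name" comes from the example above.
form = {"name": "汪国强", "page": 1}

# urlencode() percent-encodes non-ASCII values as UTF-8 by default.
encoded = urllib.parse.urlencode(form)
# The data argument must be bytes; the encoded string is plain ASCII.
body = encoded.encode("ascii")

# Passing data switches the default method from GET to POST.
with_data = urllib.request.Request("http://httpbin.org/post", data=body)
without_data = urllib.request.Request("http://httpbin.org/get")

print(encoded)
print(with_data.get_method())
print(without_data.get_method())
# parse_qs() reverses the encoding; values come back as lists of strings.
print(urllib.parse.parse_qs(encoded))
```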

    urllib.error

    import urllib.request
    import socket
    import urllib.error
    try:
        response=urllib.request.urlopen('http://httpbin.org/get',timeout=0.01)
    except urllib.error.URLError as e:  # a connection timeout is reported as URLError
        if isinstance(e.reason,socket.timeout):
            print('timeout')
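
    The example above only handles a timeout wrapped in URLError. The sketch below covers a few more cases: HTTPError (a subclass of URLError, so it is checked first) and a timeout that arrives unwrapped, which can happen while the response is being read. The helper describe_error is hypothetical, and the exceptions are built by hand so the code runs without a network.

```python
import email.message
import io
import socket
import urllib.error


def describe_error(exc):
    """Return a short label for an exception raised by urlopen() or read().

    describe_error is a hypothetical helper, not part of urllib.
    """
    # HTTPError subclasses URLError, so it must be tested first.
    if isinstance(exc, urllib.error.HTTPError):
        return "http error %d" % exc.code
    if isinstance(exc, urllib.error.URLError):
        if isinstance(exc.reason, socket.timeout):
            return "timeout"
        return "url error: %s" % exc.reason
    # A timeout while reading the response may not be wrapped in URLError.
    if isinstance(exc, socket.timeout):
        return "timeout"
    return "other: %r" % exc


# Hand-built exceptions standing in for real failures.
samples = [
    urllib.error.URLError(socket.timeout("timed out")),
    socket.timeout("timed out"),
    urllib.error.HTTPError(
        "http://httpbin.org/status/404", 404, "Not Found",
        email.message.Message(), io.BytesIO(),
    ),
]
for exc in samples:
    print(describe_error(exc))
```

    In real code the helper would be called from an except block around urlopen() and response.read().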

    Working with the response

    Status code and response headers

    import urllib.request
    response=urllib.request.urlopen('http://www.baidu.com')
    print(response.status)
    print('-----------------')
    print(response.getheaders())
    print('-----------------')
    print(response.getheader('Server'))

    Output:

    200        (status code)
    -----------------

    (response headers)
    [('Date', 'Mon, 25 Dec 2017 09:59:01 GMT'), ('Content-Type', 'text/html; charset=utf-8'), ('Transfer-Encoding', 'chunked'), ('Connection', 'Close'), ('Vary', 'Accept-Encoding'), ('Set-Cookie', 'BAIDUID=C941C9CEBE13F4D6264663E5A10D4603:FG=1; expires=Thu, 31-Dec-37 23:55:55 GMT; max-age=2147483647; path=/; domain=.baidu.com'), ('Set-Cookie', 'BIDUPSID=C941C9CEBE13F4D6264663E5A10D4603; expires=Thu, 31-Dec-37 23:55:55 GMT; max-age=2147483647; path=/; domain=.baidu.com'), ('Set-Cookie', 'PSTM=1514195941; expires=Thu, 31-Dec-37 23:55:55 GMT; max-age=2147483647; path=/; domain=.baidu.com'), ('Set-Cookie', 'BDSVRTM=0; path=/'), ('Set-Cookie', 'BD_HOME=0; path=/'), ('Set-Cookie', 'H_PS_PSSID=25394_1453_21119_25178_22157; path=/; domain=.baidu.com'), ('P3P', 'CP=" OTI DSP COR IVA OUR IND COM "'), ('Cache-Control', 'private'), ('Cxy_all', 'baidu+e8e6fa769a31bd4f787c267655da18e6'), ('Expires', 'Mon, 25 Dec 2017 09:58:11 GMT'), ('X-Powered-By', 'HPHP'), ('Server', 'BWS/1.1'), ('X-UA-Compatible', 'IE=Edge,chrome=1'), ('BDPAGETYPE', '1'), ('BDQID', '0xc958493100031c89'), ('BDUSERID', '0')]

    -----------------

    (value of the requested header)

    BWS/1.1
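
    getheaders() returns a list of (name, value) tuples, and a header such as Set-Cookie can appear several times, as in the output above. One possible way to handle this is to group the values by lower-cased header name. The helper group_headers is hypothetical, and the sample list is a shortened copy of the output above.

```python
from collections import defaultdict


def group_headers(pairs):
    """Group (name, value) header tuples by lower-cased name.

    group_headers is a hypothetical helper; repeated headers keep every value.
    """
    grouped = defaultdict(list)
    for name, value in pairs:
        grouped[name.lower()].append(value)
    return dict(grouped)


# Shortened copy of the header list printed above.
sample = [
    ("Content-Type", "text/html; charset=utf-8"),
    ("Set-Cookie", "BDSVRTM=0; path=/"),
    ("Set-Cookie", "BD_HOME=0; path=/"),
    ("Server", "BWS/1.1"),
]
headers = group_headers(sample)
print(headers["set-cookie"])
print(headers["server"])
```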


    How do you add request headers to a request? Pass a headers dict to Request:

    import urllib.request
    import urllib.parse
    head={
        "Host": "httpbin.org",
        "Upgrade-Insecure-Requests": "1",
    }
    dic={'name':'165498489'}
    data=bytes(urllib.parse.urlencode(dic),encoding='utf-8')
    request=urllib.request.Request('http://httpbin.org/post',data=data,headers=head,method='POST')
    response=urllib.request.urlopen(request)
    print(response.read().decode('utf-8'))

    Alternatively, use request.add_header():

    import urllib.request
    import urllib.parse
    
    dic={'name':'165498489'}
    data=bytes(urllib.parse.urlencode(dic),encoding='utf-8')
    request=urllib.request.Request('http://httpbin.org/post',data=data,method='POST')
    request.add_header(
        "Upgrade-Insecure-Requests", "1"
    )
    response=urllib.request.urlopen(request)
    print(response.read().decode('utf-8'))
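
    Both ways of setting headers can be checked before anything is sent. The sketch below builds a Request and inspects it offline. One detail worth knowing is that Request stores header names via str.capitalize(), so only the first letter stays upper-case. The User-Agent value is a made-up placeholder.

```python
import urllib.parse
import urllib.request

data = urllib.parse.urlencode({"name": "165498489"}).encode("utf-8")
request = urllib.request.Request(
    "http://httpbin.org/post",
    data=data,
    headers={"Upgrade-Insecure-Requests": "1"},
    method="POST",
)
# Placeholder User-Agent string, chosen only for illustration.
request.add_header("User-Agent", "Mozilla/5.0 (example)")

# Header names are stored capitalized ("User-agent"), so lookups with
# has_header() and get_header() must use that form.
print(request.header_items())
print(request.has_header("User-agent"))
print(request.has_header("User-Agent"))
print(request.get_header("Upgrade-insecure-requests"))
print(request.get_method())
```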

    Handler

    Proxies

    Using a proxy IP

    import urllib.request
    proxy_handler=urllib.request.ProxyHandler({
        'http':'http://116.199.115.78:80/'
    })
    opener=urllib.request.build_opener(proxy_handler)
    response=opener.open('http://httpbin.org/ip')
    print(response.read().decode('utf-8'))
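
    Public proxy addresses like the one above tend to stop working quickly, so it can help to wrap the setup in a small function and confirm how the opener is configured before sending anything. This is a sketch: make_proxy_opener and the 127.0.0.1:8080 address are assumptions, and no request is actually sent.

```python
import urllib.request

# Hypothetical proxy address; replace it with a proxy you are allowed to use.
PROXY_URL = "http://127.0.0.1:8080"


def make_proxy_opener(proxy_url):
    """Build an opener that routes http and https requests through proxy_url.

    make_proxy_opener is a hypothetical helper, not part of urllib.
    """
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


opener = make_proxy_opener(PROXY_URL)

# The explicit ProxyHandler replaces the default one that build_opener() adds.
proxy_handlers = [
    h for h in opener.handlers if isinstance(h, urllib.request.ProxyHandler)
]
print(len(proxy_handlers), proxy_handlers[0].proxies)

# With a reachable proxy, either of these would use it:
#     opener.open("http://httpbin.org/ip", timeout=10)
#     urllib.request.install_opener(opener)  # makes urlopen() use the proxy too
```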
  • Original article: https://www.cnblogs.com/wang666/p/8110801.html