  • Python scraping notes

    Requests

    Building a URL with a query string

    from urllib import parse
    url = "http://www.baidu.com/s?"
    info = {"wd":"kidd"}
    url = url + parse.urlencode(info)
    print(url) #http://www.baidu.com/s?wd=kidd
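The reverse direction, taking a URL apart again, is handled by the same module. The following is a minimal sketch that reuses the URL built above; the variable names `parts` and `query` are only illustrative.

```python
from urllib import parse

url = "http://www.baidu.com/s?" + parse.urlencode({"wd": "kidd"})

# urlsplit breaks the URL into scheme, host, path, query and fragment.
parts = parse.urlsplit(url)
print(parts.netloc)  # www.baidu.com
print(parts.path)    # /s

# parse_qs turns the query string back into a dict of lists,
# because a key may legally appear more than once.
query = parse.parse_qs(parts.query)
print(query)  # {'wd': ['kidd']}
```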

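    URL encoding and decoding

    Why is this needed?

    Characters such as ?, =, / and + have a special meaning inside a URL, so a request whose parameter value contains them can be misread. If you request http://www.baidu.com/s?wd=/a+b=?/ directly, the search that actually runs will differ from the text you meant to search for.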

    from urllib import parse
    # Encode
    url = "http://www.baidu.com/s?wd="
    info = parse.quote("/a+b=?/")
    url += info
    print(url) # http://www.baidu.com/s?wd=/a%2Bb%3D%3F/
    
    # Decode
    parse_url = parse.unquote(url)
    print(parse_url) # http://www.baidu.com/s?wd=/a+b=?/
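Note that the encoded URL above still contains literal "/" characters, because `parse.quote` treats "/" as safe by default. The sketch below shows how the `safe` argument and `parse.quote_plus` change the result; which variant a given site expects is an assumption you would need to check.

```python
from urllib import parse

text = "/a+b=?/"

# Default: "/" is treated as safe and left alone.
default_quoted = parse.quote(text)
print(default_quoted)  # /a%2Bb%3D%3F/

# safe="" also encodes "/", which is usually what a query value needs.
strict_quoted = parse.quote(text, safe="")
print(strict_quoted)  # %2Fa%2Bb%3D%3F%2F

# quote_plus additionally writes spaces as "+", matching HTML form encoding.
plus_quoted = parse.quote_plus("a b")
print(plus_quoted)  # a+b

# Either quoted form decodes back to the original text.
round_trip = parse.unquote(strict_quoted)
print(round_trip)  # /a+b=?/
```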

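    requests can do this as well: values passed through its params argument are percent-encoded automatically when the final URL is built, so the query string does not have to be quoted by hand.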
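A minimal sketch of the requests side, assuming the search term is passed through the `params` argument instead of being pasted into the URL. The request is only prepared, not sent, so nothing here touches the network.

```python
from urllib import parse

import requests

# Build the request without sending it, so no network access is needed.
req = requests.Request("GET", "http://www.baidu.com/s", params={"wd": "/a+b=?/"})
prepared = req.prepare()

# The query value is percent-encoded in the final URL.
print(prepared.url)

# Decoding the query string recovers the original value.
decoded = parse.parse_qs(parse.urlsplit(prepared.url).query)
print(decoded)  # {'wd': ['/a+b=?/']}
```

In normal use the same encoding happens inside `requests.get(url, params=...)`; preparing the request by hand is just a convenient way to inspect the URL first.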

    POST requests with requests

    When data is not a dict

    import requests

    # A plain string is sent as the raw request body.
    data = "name=kidd"
    response = requests.post("http://httpbin.org/post",data=data)
    print(response.text)

    Response: the payload ends up in the data field

    {
      "args": {},
      "data": "name=kidd",
      "files": {},
      "form": {},
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Content-Length": "9",
        "Host": "httpbin.org",
        "User-Agent": "python-requests/2.23.0",
        "X-Amzn-Trace-Id": "Root=1-5edeee36-d00dd8b083c14254ec60605a"
      },
      "json": null,
      "origin": "39.77.220.193",
      "url": "http://httpbin.org/post"
    }

    When data is a dict

    import requests

    # A dict is form-encoded before it is sent.
    data = {"name":"kidd"}
    response = requests.post("http://httpbin.org/post",data=data)
    print(response.text)

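    Response: the payload ends up in the form field. The post only counts as a successful form submission when the data shows up in form.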

    {
      "args": {}, 
      "data": "", 
      "files": {}, 
      "form": {
        "name": "kidd"
      }, 
      "headers": {
        "Accept": "*/*", 
        "Accept-Encoding": "gzip, deflate", 
        "Content-Length": "9", 
        "Content-Type": "application/x-www-form-urlencoded", 
        "Host": "httpbin.org", 
        "User-Agent": "python-requests/2.23.0", 
        "X-Amzn-Trace-Id": "Root=1-5edeeee5-f0544530bbb1b22824acd930"
      }, 
      "json": null, 
      "origin": "39.77.220.193", 
      "url": "http://httpbin.org/post"
    }
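Comparing the two responses, the visible difference on the wire is the Content-Type header: it is present only when data is a dict. The sketch below inspects the prepared requests locally, without sending anything. `prepare_post` is a hypothetical helper, and the third variant (a string body with the header set by hand) rests on the assumption that the server chooses between data and form based on that header.

```python
import requests

URL = "http://httpbin.org/post"


def prepare_post(**kwargs):
    """Build a POST request without sending it (no network access)."""
    return requests.Request("POST", URL, **kwargs).prepare()


as_string = prepare_post(data="name=kidd")
as_dict = prepare_post(data={"name": "kidd"})
# Assumption: declaring the form content type by hand for a string body.
as_string_with_header = prepare_post(
    data="name=kidd",
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

cases = [
    ("string", as_string),
    ("dict", as_dict),
    ("string + header", as_string_with_header),
]
for label, req in cases:
    # The body is identical in all three cases; only the header differs.
    print(label, req.body, req.headers.get("Content-Type"))
```

If the assumption holds, sending the third request should place name=kidd under form as well, but that is worth confirming against the real endpoint.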
  • Original article: https://www.cnblogs.com/py-peng/p/13070837.html