  • March 25, 2021

    Time: about 2 hours

    Code: about 100 lines

    Blog posts: 1

    Content: used the Flask framework to scrape epidemic data

    import requests
    import json
    
    # https://view.inews.qq.com/g2/getOnsInfo?name=disease_h5
    # https://view.inews.qq.com/g2/getOnsInfo?name=disease_foreign
    
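    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36"
    }
    # Historical data (this URL requests the disease_foreign endpoint)
    url_history = "https://view.inews.qq.com/g2/getOnsInfo?name=disease_foreign"
    # Pass headers by keyword; the second positional argument of requests.get is params
    resp = requests.get(url_history, headers=headers)
    # Get the page's JSON string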
    json_data = resp.text
    print(type(json_data))
    print(json_data)
    # Convert the JSON string to a dict
    d_data = json.loads(json_data)
    print(type(d_data))
    print(d_data)
    
    print(d_data['data'])
    # The 'data' field is itself a JSON-encoded string, so decode it a second time
    data_history = json.loads(d_data['data'])
    for item in data_history.keys():
        print(item)
        print(data_history[item])
    
    # Detailed data
    url = "https://view.inews.qq.com/g2/getOnsInfo?name=disease_h5"
    
    # Pass headers by keyword here as well
    resp = requests.get(url, headers=headers)
    
    json_data = resp.text
    print(json_data)
    # Convert to a dict
    dict_data = json.loads(json_data)
    print(dict_data)
    
    data = json.loads(dict_data['data'])
    for item in data.keys():
        print(item)
        print(data[item])
    
    print(data['chinaTotal'])
    print(data['chinaAdd'])
    
    print(data['areaTree'])
    
    print("*******")
    
    for item in data['areaTree'][0]:
        print(item)
    
    for item in data['areaTree'][0]['children']:
        print(item)
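
The script above decodes each response twice, because the `data` field is itself a JSON string, and then walks `areaTree`. A minimal sketch of how those two steps might be wrapped in helpers is given below. The function names `decode_payload` and `child_names` are hypothetical, the `name` key on each child is an assumption about the payload, and the sample data is made up so that the example runs without network access.

```python
import json


def decode_payload(text):
    """Decode a response body whose 'data' field is itself a JSON string."""
    outer = json.loads(text)
    inner = outer["data"]
    # Tolerate an already-decoded object as well as a JSON string.
    return json.loads(inner) if isinstance(inner, str) else inner


def child_names(payload):
    """List the 'name' of each child under the first areaTree entry."""
    children = payload["areaTree"][0].get("children", [])
    # 'name' is an assumed key; entries without it are skipped.
    return [child["name"] for child in children if "name" in child]


# Offline demonstration with made-up data in the same nested shape.
sample = json.dumps({
    "data": json.dumps({
        "chinaTotal": {"confirm": 1},
        "areaTree": [{"name": "A", "children": [{"name": "B"}, {"name": "C"}]}],
    }),
})
payload = decode_payload(sample)
print(payload["chinaTotal"])
print(child_names(payload))
```

With the script above, `decode_payload(resp.text)` would stand in for the pair of `json.loads` calls made on each response.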

  • Original article: https://www.cnblogs.com/j-y-s/p/14903195.html