  • June 1

    Scraping Epidemic Data

    First, open the page and take a look: it is really just a JSON string, plain text with no markup. So we start by importing the third-party requests library and the standard-library json module for parsing JSON strings:

    import requests
    import json

    Then fetch the page with requests and read its text (for this page, the text is the JSON string):

    r = requests.get(url, headers=headers)  # url and headers are defined beforehand
    res = json.loads(r.text)                # first decode: the response envelope
    data_all = json.loads(res["data"])      # the "data" field is itself a JSON string
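As a self-contained illustration of why json.loads is called twice, here is a hypothetical miniature of the response: the outer JSON envelope carries a "data" field that is itself a JSON-encoded string (the field names here mirror the real feed; the values are invented):

```python
import json

# Build a fake response text whose "data" field is a JSON-encoded string,
# mimicking the structure the scraper above expects.
raw_text = json.dumps({
    "ret": 0,
    "data": json.dumps({"lastUpdateTime": "2021-06-01 10:00:00", "areaTree": []}),
})

res = json.loads(raw_text)          # first decode: the outer envelope
data_all = json.loads(res["data"])  # second decode: the embedded string

print(data_all["lastUpdateTime"])   # → 2021-06-01 10:00:00
```

A single json.loads would leave res["data"] as a plain string, which is why the second decode is needed before indexing into it.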

    Next, build the records from the parsed data:

    details = []
    
    update_time = data_all["lastUpdateTime"]
    data_country = data_all["areaTree"]
    data_province = data_country[0]["children"]
    
    for pro_infos in data_province:
        province = pro_infos["name"]
        # print(province)
        for city_infos in pro_infos["children"]:
            city = city_infos["name"]
            confirm = city_infos["total"]["confirm"]
            confirm_add = city_infos["today"]["confirm"]
            heal = city_infos["total"]["heal"]
            dead = city_infos["total"]["dead"]
            suspect = city_infos["total"]["suspect"]
            details.append([update_time, province, city, confirm, confirm_add, heal, dead, suspect])
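To see what the loop produces, here is the same extraction run against a minimal, made-up sample of the areaTree structure (the field names match the feed above; the numbers are invented for illustration):

```python
# Made-up miniature of the feed: one country, one province, one city.
data_all = {
    "lastUpdateTime": "2021-06-01 10:00:00",
    "areaTree": [{
        "name": "China",
        "children": [{
            "name": "Hubei",
            "children": [{
                "name": "Wuhan",
                "total": {"confirm": 50340, "heal": 46471, "dead": 3869, "suspect": 0},
                "today": {"confirm": 0},
            }],
        }],
    }],
}

details = []
update_time = data_all["lastUpdateTime"]
for pro_infos in data_all["areaTree"][0]["children"]:   # provinces
    province = pro_infos["name"]
    for city_infos in pro_infos["children"]:            # cities in the province
        details.append([
            update_time,
            province,
            city_infos["name"],
            city_infos["total"]["confirm"],
            city_infos["today"]["confirm"],
            city_infos["total"]["heal"],
            city_infos["total"]["dead"],
            city_infos["total"]["suspect"],
        ])

print(details[0])
# → ['2021-06-01 10:00:00', 'Hubei', 'Wuhan', 50340, 0, 46471, 3869, 0]
```

Each entry of details is one row in the column order the insert statement below expects: time, province, city, confirmed, newly confirmed, healed, dead, suspected.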

    Next, import the database driver pymysql:

    import pymysql

    Establish the connection and a cursor:

    conn = pymysql.connect(host="127.0.0.1", port=3306, user="root", password="260702266", database="virus", charset="utf8")
    cursor = conn.cursor()

    Then insert the rows of the details list into the database with a parameterized insert statement:

    sql = ("insert into newebd(time,province,city,nowConfirm,Confirm,heal,dead,suspect) "
           "values(%s,%s,%s,%s,%s,%s,%s,%s)")
    try:
        cursor.executemany(sql, details)  # runs the insert once per row in details
        conn.commit()
    except pymysql.MySQLError:
        conn.rollback()
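Since the snippet above needs a running MySQL server, here is a stand-in sketch of the same executemany pattern against an in-memory SQLite database (sqlite3 and pymysql both implement DB-API 2.0, so the pattern carries over; SQLite uses "?" placeholders where pymysql uses "%s"). The table layout mirrors the insert above; the sample row is invented:

```python
import sqlite3

# In-memory database standing in for the MySQL "virus" database.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "create table newebd(time text, province text, city text, "
    "nowConfirm integer, Confirm integer, heal integer, dead integer, suspect integer)"
)

details = [
    ["2021-06-01 10:00:00", "Hubei", "Wuhan", 50340, 0, 46471, 3869, 0],
]
sql = ("insert into newebd(time,province,city,nowConfirm,Confirm,heal,dead,suspect) "
       "values(?,?,?,?,?,?,?,?)")  # pymysql would use %s here instead of ?
try:
    cursor.executemany(sql, details)  # one call inserts every row of details
    conn.commit()
except sqlite3.Error:
    conn.rollback()

cursor.execute("select count(*) from newebd")
print(cursor.fetchone()[0])  # → 1
```

Note that executemany already iterates over details internally, so no explicit Python loop around it is needed; wrapping it in one would insert every row multiple times.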

    That completes scraping the data into the database.

  • Original post: https://www.cnblogs.com/ldy2396/p/14911942.html