zoukankan      html  css  js  c++  java
  • timesacledb 测试demo数据运行

    timesacledb 的安装还是使用docker,对于测试数据需要提前下载

    启动timesacledb

    使用支持gis 的镜像,后边需要使用

    docker run -d --name timescaledb -p 5432:5432 timescale/timescaledb-postgis

    预备环境

    • 下载测试数据
    https://timescaledata.blob.core.windows.net/datasets/nyc_data.tar.gz
    • 创建数据库&&扩展加载timesacledb
    CREATE DATABASE nyc_data;
    c nyc_data
    CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;

    加载数据

    • 导入schema
    psql -U postgres -d nyc_data -h localhost < nyc_data.sql
    • 导入数据
    psql -U postgres -d nyc_data -h localhost -c "COPY rides FROM nyc_data_rides.csv CSV"

    运行查询

    • 基本查询
    SELECT date_trunc('day', pickup_datetime) as day, avg(fare_amount) FROM rides WHERE passenger_count > 1 AND pickup_datetime < '2016-01-08' GROUP BY day ORDER BY day;
    SELECT date_trunc('day', pickup_datetime) as day, COUNT(*) FROM rides GROUP BY day ORDER BY day LIMIT 5;
    • 时序查询
    SELECT time_bucket('5 minute', pickup_datetime) AS five_min, count(*) FROM rides WHERE pickup_datetime < '2016-01-01 02:00' GROUP BY five_min ORDER BY five_min;

    系统数据


    查询细节

    • 通过explain 查看timesacledb 的工作
    EXPLAIN SELECT * FROM rides;

    gis 查询

    • 让nyc_data 支持gis
    CREATE EXTENSION postgis;
    ALTER TABLE rides ADD COLUMN pickup_geom geometry(POINT,2163);
    ALTER TABLE rides ADD COLUMN dropoff_geom geometry(POINT,2163);
    • 生成geo 数据(有点慢,需要花点时间)
    UPDATE rides SET pickup_geom = ST_Transform(ST_SetSRID(ST_MakePoint(pickup_longitude,pickup_latitude),4326),2163);
    UPDATE rides SET dropoff_geom = ST_Transform(ST_SetSRID(ST_MakePoint(dropoff_longitude,dropoff_latitude),4326),2163);
    • gis 查询
    SELECT time_bucket('30 minutes', pickup_datetime) AS thirty_min, COUNT(*) AS near_times_sq
      FROM rides
      WHERE ST_Distance(pickup_geom, ST_Transform(ST_SetSRID(ST_MakePoint(-73.9851,40.7589),4326),2163)) < 400
        AND pickup_datetime < '2016-01-01 14:00'
      GROUP BY thirty_min ORDER BY thirty_min;
    • 系统生成数据

    说明

    总的来说简单,同时具有时序数据库的特点,对于我们来说不需要关注数据多的时候性能的问题,还是很不错的,对于ha 以及集群功能还有待
    研究

    参考资料

    https://docs.timescale.com/v0.9/tutorials/tutorial-hello-nyc
    https://docs.timescale.com/v0.9/getting-started/installation/mac/installation-homebrew

  • 相关阅读:
    pycharm 代码块缩进、左移
    os.popen(cmd) .read() 获取执行后的结果且带有换行符
    数据分析基础路线了解
    Jupyter中python3之numpy练习
    Mysql数据库操作命令行小结
    Mysql配置主从同步的基本步骤
    百度翻译爬虫-Web版(自动生成sign)
    windows中的常用Dos命令
    Cookie安全隐患DOM演示
    bash漏洞技术层面分析(cgi-bin)
  • 原文地址:https://www.cnblogs.com/rongfengliang/p/9230899.html
Copyright © 2011-2022 走看看