列式存储在数据分析(olap)上有天然的优势,以下是使用cstore fdw + hasura graphql-engine 加速
数据的分析
环境准备
- docker-compose 文件
version: '3'
services:
graphql-engine:
image: hasura/graphql-engine:v1.1.0-beta.3
ports:
- "8080:8080"
environment:
HASURA_GRAPHQL_DATABASE_URL: postgres://postgres:dalong@pgspider-cstore:5432/postgres
HASURA_GRAPHQL_ENABLE_CONSOLE: "true" # set to "false" to disable console
HASURA_GRAPHQL_ENABLED_LOG_TYPES: startup, http-log, webhook-log, websocket-log, query-log
pgspider-pg_cron:
image: dalongrong/pgspider:pg_cron
ports:
- "5432:5432"
environment:
- "POSTGRES_PASSWORD=dalong"
pgspider-cstore:
image: dalongrong/pgspider:cstore
ports:
- "5433:5432"
environment:
- "POSTGRES_PASSWORD=dalong"
- 启动&&数据准备
启动docker-compose 服务
docker-compose up -d
下载demo 数据
wget http://examples.citusdata.com/customer_reviews_1998.csv.gz
wget http://examples.citusdata.com/customer_reviews_1999.csv.gz
加压文件:
gzip -d customer_reviews_1998.csv.gz
gzip -d customer_reviews_1999.csv.gz
copy 数据到容器,使用docker cp 命令
使用cstore fdw
- 启用扩展
CREATE EXTENSION cstore_fdw;
- 创建server
CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;
- 创建外部表
CREATE FOREIGN TABLE customer_reviews
(
customer_id TEXT,
review_date DATE,
review_rating INTEGER,
review_votes INTEGER,
review_helpful_votes INTEGER,
product_id CHAR(10),
product_title TEXT,
product_sales_rank BIGINT,
product_group TEXT,
product_category TEXT,
product_subcategory TEXT,
similar_product_ids CHAR(10)[]
)
SERVER cstore_server
OPTIONS(compression 'pglz');
- 导入数据
容器内部
导入数据:
COPY customer_reviews FROM 'customer_reviews_1998.csv' WITH CSV;
COPY customer_reviews FROM 'customer_reviews_1999.csv' WITH CSV;
hasura graphql-engine 集成
- 添加track
- 数据查询
此查询是通过graphql 的,实际和以前demo 的sql是一样的
参考资料
https://github.com/citusdata/cstore_fdw
https://github.com/citusdata/pg_cron
https://github.com/rongfengliang/pgspider-docker