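  • Saiku journey: speed optimization (part 3)

    After the first two rounds of optimization, Saiku went from unusable to usable, but it still stutters when analyzing large volumes of log data. After looking further at the SQL it runs behind the scenes, I decided to focus on indexes.

    The main use case for the logs is analysis over a fixed date dimension: the WHERE clause always contains a condition that the date equals a particular day. The open question is whether to build a separate index on each column, or composite indexes that pair each column with the date. It comes down to comparing the efficiency of single-column indexes with that of a composite index.

    PostgreSQL table: saiku_search_detail

    Table structure: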

    CREATE TABLE test.saiku_search_detail
    (
      rpt_date date,
      from_area_id bigint,
      from_value_id bigint,
      in_track_id bigint,
      gid character varying,
      current_city_id bigint,
      dist_city_id bigint,
      category_name_id bigint,
      page_id bigint,
      utmr_page_id bigint,
      num bigint,
      id bigint,
      partner smallint
    )

    Row count: 8,510,490 (about 8.51 million).

    Test steps:

    1. Bare table

    Query a single date:

    1.1 One condition

    select
      count(1)
    from test.saiku_search_detail
    where rpt_date = '2016-05-13'

    Result: 1110 ms

    "Aggregate  (cost=160934.85..160934.86 rows=1 width=0)"
    "  ->  Seq Scan on saiku_search_detail  (cost=0.00..160816.78 rows=47230 width=0)"
    "        Filter: (rpt_date = '2016-05-13'::date)"

    1.2 Two conditions

    select
      count(1)
    from test.saiku_search_detail
    where rpt_date = '2016-05-13'
    and from_area_id = 135

    Result: 1782 ms

    "Aggregate  (cost=184432.32..184432.33 rows=1 width=0)"
    "  ->  Seq Scan on saiku_search_detail  (cost=0.00..184431.73 rows=236 width=0)"
    "        Filter: ((rpt_date = '2016-05-13'::date) AND (from_area_id = 135))"

    No surprises: with zero indexes, both queries fall back to a sequential scan.
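    The plans shown in this post contain only cost estimates, with the elapsed time reported separately. As a sketch, EXPLAIN with the ANALYZE option executes the statement and prints actual row counts and timings next to the estimates, and BUFFERS (PostgreSQL 9.0 or later) adds cache hit and read counts. The table and column names are the ones used in this post; the combination of options is a suggestion, not what the original test used.

    ```sql
    -- Executes the query and prints the plan with actual timings and row counts.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(1)
    FROM test.saiku_search_detail
    WHERE rpt_date = '2016-05-13'
      AND from_area_id = 135;
    ```

    Running it twice helps separate cold-cache timings from warm-cache timings.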

    2. Add a separate index on each of the two columns:

    -- btree index
    CREATE INDEX saiku_search_detail_from_area_id_idx
      ON saiku_search_detail
      USING btree
      (from_area_id);
    -- hash index
    CREATE INDEX saiku_search_detail_rpt_date_idx
      ON saiku_search_detail
      USING hash
      (rpt_date);

    2.1 One condition

    select
      count(1)
    from saiku_search_detail
    where rpt_date = '2016-05-13'

    Result: 83 ms

    "Aggregate  (cost=8.02..8.03 rows=1 width=0)"
    "  ->  Index Scan using saiku_search_detail_rpt_date_idx on saiku_search_detail  (cost=0.00..8.02 rows=1 width=0)"
    "        Index Cond: (rpt_date = '2016-05-13'::date)"

    The index on rpt_date is used.

    2.2 Two conditions

    select
      count(1)
    from saiku_search_detail
    where rpt_date = '2016-05-13'
    and from_area_id = 135

    Result: 149 ms

    "Aggregate  (cost=8.02..8.03 rows=1 width=0)"
    "  ->  Index Scan using saiku_search_detail_rpt_date_idx on saiku_search_detail  (cost=0.00..8.02 rows=1 width=0)"
    "        Index Cond: (rpt_date = '2016-05-13'::date)"
    "        Filter: (from_area_id = 135)"

    Only one index is used; the index on from_area_id does not take effect. Try changing the order of the conditions in the SQL:

    select
      count(1)
    from saiku_search_detail
    where from_area_id = 135
    and rpt_date = '2016-05-13'

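    Same result: the order of the conditions in the WHERE clause does not change the plan. In this test only one of the two single-column indexes was used. That is the planner's choice for this query rather than a hard limit, since PostgreSQL can combine several indexes through bitmap scans (BitmapAnd) when it estimates that to be cheaper. Here the plan estimates rows=1 for the scan on the date index, so a second index looks pointless to the planner. That estimate is far below the ones in the other plans in this post, which suggests the table statistics may have been out of date.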
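    One way to check this, as a sketch that was not part of the original test: refresh the planner statistics and look at the plan again. PostgreSQL is able to combine several indexes for one query with a BitmapAnd node when it estimates that to be cheaper, so whether the second index gets used depends on the row estimates. Whether the plan actually changes on this data set is an open question.

    ```sql
    -- Refresh table statistics so row estimates reflect the loaded data.
    ANALYZE test.saiku_search_detail;

    -- Re-check the plan. A BitmapAnd node would mean that both
    -- single-column indexes are being combined.
    EXPLAIN
    SELECT count(1)
    FROM test.saiku_search_detail
    WHERE rpt_date = '2016-05-13'
      AND from_area_id = 135;
    ```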

    3. Build a composite index

    -- composite index covering both columns
    CREATE INDEX saiku_search_detail_rpt_date_from_area_idx
      ON test.saiku_search_detail
      USING btree
      (rpt_date, from_area_id);
      

    3.1 One condition, on the first column of the index

    select
      count(1)
    from test.saiku_search_detail
    where rpt_date = '2016-05-13'

    Result: 66 ms

    "Aggregate  (cost=47843.00..47843.01 rows=1 width=0)"
    "  ->  Bitmap Heap Scan on saiku_search_detail  (cost=2220.63..47362.94 rows=192025 width=0)"
    "        Recheck Cond: (rpt_date = '2016-05-13'::date)"
    "        ->  Bitmap Index Scan on saiku_search_detail_rpt_date_from_area_idx  (cost=0.00..2172.62 rows=192025 width=0)"

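    The composite index is used through its leading column, that is, only part of the index key is matched (not to be confused with a PostgreSQL partial index, which is an index defined with a WHERE clause).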
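    To illustrate why the leading column matters, here is a sketch of the opposite case, which the original test did not cover: a filter on from_area_id alone puts no condition on rpt_date, the first column of the composite index, so a btree index of this shape is generally much less useful for it. The value 135 is taken from the earlier queries. If the single-column index from step 2 still exists, the planner may pick that one instead.

    ```sql
    -- No condition on rpt_date, the leading column of
    -- saiku_search_detail_rpt_date_from_area_idx.
    EXPLAIN
    SELECT count(1)
    FROM test.saiku_search_detail
    WHERE from_area_id = 135;
    ```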

    3.2 Two conditions

    select
      count(1)
    from test.saiku_search_detail
    where rpt_date = '2016-05-13'
    and from_area_id = 135

    Result: 65 ms

    "Aggregate  (cost=46124.99..46125.00 rows=1 width=0)"
    "  ->  Bitmap Heap Scan on saiku_search_detail  (cost=1509.67..45857.37 rows=107047 width=0)"
    "        Recheck Cond: ((rpt_date = '2016-05-13'::date) AND (from_area_id = 135))"
    "        ->  Bitmap Index Scan on saiku_search_detail_rpt_date_from_area_idx  (cost=0.00..1482.90 rows=107047 width=0)"

    The composite index is used, with both columns in the index condition.

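    Summary

    • The obvious: when two columns are both used as filter conditions, the composite index was the fastest option in these tests.
    • The gain: in this log-analysis workload, apart from the single-column index on the date, the other single-column indexes were not used by the planner in these tests and are candidates for removal.
    • The dilemma: build only a single-column index on the date, or several composite indexes that each include the date? That depends on your own usage pattern.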
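    Before dropping anything, it can help to confirm which indexes the workload really uses. A minimal sketch, assuming the indexes live in schema test next to the table: the pg_stat_user_indexes view reports how many scans have used each index since the statistics were last reset.

    ```sql
    -- idx_scan counts the index scans started on each index.
    SELECT indexrelname, idx_scan
    FROM pg_stat_user_indexes
    WHERE schemaname = 'test'
      AND relname = 'saiku_search_detail'
    ORDER BY idx_scan;

    -- Example only: drop the single-column index if it turns out to be unused.
    -- Assumes the index is in schema test; adjust to where it was created.
    DROP INDEX IF EXISTS test.saiku_search_detail_from_area_id_idx;
    ```

    The counters are only meaningful after the system has run a representative workload for a while.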
  • Original article: https://www.cnblogs.com/liqiu/p/5494967.html