zoukankan      html  css  js  c++  java
  • Hive之执行计划分析(explain)

    • Hive是通过把sql转换成对应mapreduce程序,然后提交到Hadoop上执行,查看具体的执行计划可以通过执行explain sql知晓
    • 一条sql会被转化成由多个阶段组成的步骤,每个步骤有执行顺序和依赖关系,可以称之为有向无环图(DAG:Directed Acyclic Graph)
    • 这些步骤可能包含:元数据的操作,文件系统的操作,map/reduce计算等
    • 语法格式:
    EXPLAIN [EXTENDED|DEPENDENCY|AUTHORIZATION|LOCKS|VECTORIZATION] query
    
    • explain输出内容包括:
      • 抽象语法树
      • 执行计划不同阶段的依赖关系
      • 各个阶段的描述
    • extended输出更加详细的信息
    • denpendency输出依赖的数据源
    • authorization输出执行sql授权信息
    • locks 输出锁情况
    • vectorization相关
      • Adds detail to the EXPLAIN output showing why Map and Reduce work is not vectorized.
      • Syntax: EXPLAIN VECTORIZATION [ONLY] [SUMMARY|OPERATOR|EXPRESSION|DETAIL]
      • ONLY option suppresses most non-vectorization elements.
      • SUMMARY (default) shows vectorization information for the PLAN (is vectorization enabled) and a summary of Map and Reduce work.
      • OPERATOR shows vectorization information for operators. E.g. Filter Vectorization. Includes all information of SUMMARY.
      • EXPRESSION shows vectorization information for expressions. E.g. predicateExpression. Includes all information of SUMMARY and OPERATOR.
      • DETAIL shows detail-level vectorization information. It includes all information of SUMMARY, OPERATOR, and EXPRESSION.
    • 带上FORMATTED 关键子,可以json格式输出
    • sort order: +表示升序 -表示降序
    • 大概了解一下相关的执行情况
    # explain默认
    0: jdbc:hive2://> explain select * from sort_test sort by id desc limit 10;
    +--------------------------------------------------------------------------------------------------+--+
    |                                             Explain                                              |
    +--------------------------------------------------------------------------------------------------+--+
    | STAGE DEPENDENCIES:                                                                              |
    |   Stage-1 is a root stage                                                                        |
    |   Stage-2 depends on stages: Stage-1                                                             |
    |   Stage-0 depends on stages: Stage-2                                                             |
    |                                                                                                  |
    | STAGE PLANS:                                                                                     |
    |   Stage: Stage-1                                                                                 |
    |     Map Reduce                                                                                   |
    |       Map Operator Tree:                                                                         |
    |           TableScan                                                                              |
    |             alias: sort_test                                                                     |
    |             Statistics: Num rows: 8 Data size: 890 Basic stats: COMPLETE Column stats: NONE      |
    |             Select Operator                                                                      |
    |               expressions: id (type: int), name (type: string)                                   |
    |               outputColumnNames: _col0, _col1                                                    |
    |               Statistics: Num rows: 8 Data size: 890 Basic stats: COMPLETE Column stats: NONE    |
    |               Reduce Output Operator                                                             |
    |                 key expressions: _col0 (type: int)                                               |
    |                 sort order: -                                                                    |
    |                 Statistics: Num rows: 8 Data size: 890 Basic stats: COMPLETE Column stats: NONE  |
    |                 value expressions: _col1 (type: string)                                          |
    |       Reduce Operator Tree:                                                                      |
    |         Select Operator                                                                          |
    |           expressions: KEY.reducesinkkey0 (type: int), VALUE._col0 (type: string)                |
    |           outputColumnNames: _col0, _col1                                                        |
    |           Statistics: Num rows: 8 Data size: 890 Basic stats: COMPLETE Column stats: NONE        |
    |           Limit                                                                                  |
    |             Number of rows: 10                                                                   |
    |             Statistics: Num rows: 8 Data size: 890 Basic stats: COMPLETE Column stats: NONE      |
    |             File Output Operator                                                                 |
    |               compressed: false                                                                  |
    |               table:                                                                             |
    |                   input format: org.apache.hadoop.mapred.SequenceFileInputFormat                 |
    |                   output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat       |
    |                   serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe                |
    |                                                                                                  |
    |   Stage: Stage-2                                                                                 |
    |     Map Reduce                                                                                   |
    |       Map Operator Tree:                                                                         |
    |           TableScan                                                                              |
    |             Reduce Output Operator                                                               |
    |               key expressions: _col0 (type: int)                                                 |
    |               sort order: -                                                                      |
    |               Statistics: Num rows: 8 Data size: 890 Basic stats: COMPLETE Column stats: NONE    |
    |               value expressions: _col1 (type: string)                                            |
    |       Reduce Operator Tree:                                                                      |
    |         Select Operator                                                                          |
    |           expressions: KEY.reducesinkkey0 (type: int), VALUE._col0 (type: string)                |
    |           outputColumnNames: _col0, _col1                                                        |
    |           Statistics: Num rows: 8 Data size: 890 Basic stats: COMPLETE Column stats: NONE        |
    |           Limit                                                                                  |
    |             Number of rows: 10                                                                   |
    |             Statistics: Num rows: 8 Data size: 890 Basic stats: COMPLETE Column stats: NONE      |
    |             File Output Operator                                                                 |
    |               compressed: false                                                                  |
    |               Statistics: Num rows: 8 Data size: 890 Basic stats: COMPLETE Column stats: NONE    |
    |               table:                                                                             |
    |                   input format: org.apache.hadoop.mapred.TextInputFormat                         |
    |                   output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat      |
    |                   serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe                      |
    |                                                                                                  |
    |   Stage: Stage-0                                                                                 |
    |     Fetch Operator                                                                               |
    |       limit: 10                                                                                  |
    |       Processor Tree:                                                                            |
    |         ListSink                                                                                 |
    |                                                                                                  |
    +--------------------------------------------------------------------------------------------------+--+
    
    # authorization
    0: jdbc:hive2://> explain formatted authorization  select * from sort_test sort by id desc limit 10;
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
    |                                                                                                               Explain                                                                                                               |
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
    | {"CURRENT_USER":"root","OPERATION":"SWITCHDATABASE","INPUTS":["badou@sort_test"],"OUTPUTS":["hdfs://master:9000/tmp/hive/root/fac1e10c-babb-4927-886e-411b3e9190fb/hive_2018-10-18_11-04-47_534_1155924552647075339-1/-mr-10000"]}  |
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
    

    参考资料

    【0】Hive wiki - LanguageManual Explain

    【1】hive入门学习:explain执行计划的理解

  • 相关阅读:
    Redis安装过程
    TDDL-剖析淘宝TDDL
    jQuery 遍历
    util-C# 复杂条件查询(sql 复杂条件查询)查询解决方案
    util-判断当前年份所处的季度,并返回当前季度开始的月份
    省市区(县)三级联动代码(js 数据源)
    JS:实用功能
    C#常用格式输出
    算术运算符
    C# 中的结构类型(struct type)
  • 原文地址:https://www.cnblogs.com/wadeyu/p/9809581.html
Copyright © 2011-2022 走看看