zoukankan      html  css  js  c++  java
  • 【原创】大叔经验分享(33)hive select count为0

    hive建表后直接将数据文件拷贝到table目录下,select * 可以查到数据,但是select count(1) 一直返回0,这个是因为hive中有个配置

    hive.stats.autogather=true

    Enables automated gathering of table-level statistics for newly created tables and table partitions, such as tables created with the INSERT OVERWRITE statement. The parameter does not produce column-level statistics, such as those generated by CBO. If disabled, administrators must manually generate the table-level statistics for newly generated tables and table partitions with the ANALYZE TABLE statement.

    可以通过describe来查看table的统计信息

    DESCRIBE EXTENDED $table_name;

    有个配置控制是否使用talbe的统计信息

    hive.compute.query.using.stats=true

    Instructs Hive to use statistics when generating query plans

    很多人建议的处理方法是

    set hive.compute.query.using.stats=false;

    正解应该是

    ANALYZE TABLE $table_name COMPUTE STATISTICS;

    ANALYZE TABLE $table_name partition(p=$1) COMPUTE STATISTICS;

    即重新计算统计信息


    参考:https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_hive-performance-tuning/content/ch_cost-based-optimizer.html

  • 相关阅读:
    java并发之CopyOnWriteArraySet
    MySQL 锁
    MySQL 汇总
    MySQL 删除重复数据
    插入排序-PHP
    选择排序-PHP
    快速排序-PHP
    一文轻松读懂微服务、集群、分布式的概念和区别一文轻松读懂微服务、集群、分布式的概念和区别
    k8s与docker简介
    冒泡排序-PHP
  • 原文地址:https://www.cnblogs.com/barneywill/p/10431323.html
Copyright © 2011-2022 走看看