Hive tuning tips

1. LIMIT
Hive has a configuration property that enables sampling of the source data for LIMIT queries:
hive.limit.optimize.enable. Set this parameter to true to optimize LIMIT operations.
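A minimal sketch of how this might look in a session. The table name page_views is hypothetical, and the two extra hive.limit.* properties are related knobs shown with the defaults they are commonly documented with; check the values for your Hive version. Because the optimization samples the input, queries that go through a reduce step (JOIN, GROUP BY) may never see some of the data, so it is probably safest to enable it per session rather than globally.

```sql
-- Enable input sampling for LIMIT queries (off by default).
set hive.limit.optimize.enable=true;

-- Related knobs (assumed defaults, verify for your version):
-- maximum row size Hive assumes when sizing the sample
set hive.limit.row.max.size=100000;
-- maximum number of files Hive samples
set hive.limit.optimize.limit.file=10;

-- page_views is a hypothetical table used only for illustration.
SELECT * FROM page_views LIMIT 10;
```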
2. PARALLEL
If a query is compiled into several stages and some of those stages do not depend on each other, Hive can run them in parallel when you set:
set hive.exec.parallel=true;
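As a rough illustration, the two branches of a UNION ALL are a typical case of independent stages. The tables orders and refunds below are hypothetical, and whether the branches actually overlap depends on the plan Hive generates and on free cluster capacity.

```sql
-- Allow independent stages of one query to run concurrently.
set hive.exec.parallel=true;
-- Upper bound on how many stages run at once (8 is the usual default).
set hive.exec.parallel.thread.number=8;

-- orders and refunds are hypothetical tables. Each branch is its own
-- aggregation job and does not depend on the other.
SELECT u.src, u.n
FROM (
  SELECT 'orders' AS src, count(*) AS n FROM orders
  UNION ALL
  SELECT 'refunds' AS src, count(*) AS n FROM refunds
) u;
```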
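3. Adjust the mapper and reducer task count
The default value of hive.exec.reducers.bytes.per.reducer is 1 GB in older Hive releases (Hive 0.14 and later default to 256 MB).
Hive estimates the number of reducers by dividing the job's input size by this value, so lowering it raises the estimate. For the job in question, changing the value to 750 MB causes Hive to estimate four reducers: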
hive> set hive.exec.reducers.bytes.per.reducer=750000000;
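The arithmetic behind that estimate can be sketched as follows. The input size of 2,800,000,000 bytes is an assumption chosen for illustration, since the tip does not state the size of the job's input.

```sql
-- Estimate used by Hive (rounded up, capped by hive.exec.reducers.max):
--   reducers = min(hive.exec.reducers.max,
--                  ceil(input bytes / hive.exec.reducers.bytes.per.reducer))
--
-- Assumed input: 2,800,000,000 bytes
--   1,000,000,000 bytes per reducer -> ceil(2.80) = 3 reducers
--     750,000,000 bytes per reducer -> ceil(3.73) = 4 reducers
set hive.exec.reducers.bytes.per.reducer=750000000;

-- To skip the estimate and force a fixed count instead:
set mapred.reduce.tasks=4;
```

Forcing mapred.reduce.tasks ties the setting to one data volume, so tuning the bytes-per-reducer value usually adapts better as inputs grow.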

-- Print the current input format. CombineHiveInputFormat merges small input files into fewer map splits.
set hive.input.format;
set mapred.child.java.opts = -Xmx524m;
set hive.exec.reducers.bytes.per.reducer=100000000;
set hive.merge.size.per.task=10010001000;

Original article: https://www.cnblogs.com/huaxiaoyao/p/4364610.html