Hive Installation and Configuration
References
http://blog.yidooo.net/archives/apache-hive-installation.html
http://www.cnblogs.com/linjiqin/archive/2013/03/04/2942402.html
HBase in Action and HBase: The Definitive Guide both contain plenty of introductory code; check out whichever parts interest you. The repositories are at https://github.com/HBaseinaction and https://github.com/larsgeorge/hbase-book .
Extract
$ tar -xzvf hive-x.y.z.tar.gz
$ tar -xzvf hive-0.13.1.tar.gz
Hive configuration
Copy the templates
cd conf
cp hive-default.xml.template hive-site.xml
cp hive-env.sh.template hive-env.sh
cp hive-log4j.properties.template hive-log4j.properties
cp hive-exec-log4j.properties.template hive-exec-log4j.properties
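The copy step above can be wrapped so that it is safe to rerun. This is only a sketch: copy_templates is a hypothetical helper, and the directory in the example comment is an assumption based on the /usr/local/hive paths used elsewhere in these notes. It skips any file that already exists, so an edited hive-site.xml is not overwritten.

```shell
#!/bin/sh
# Copy Hive's configuration templates into place, leaving existing files alone.
# $1 is the Hive conf directory (hypothetical argument).
copy_templates() (
    conf_dir=$1
    cd "$conf_dir" || exit 1
    # hive-site.xml is seeded from hive-default.xml.template
    [ -e hive-site.xml ] || cp hive-default.xml.template hive-site.xml
    # the remaining templates only drop the .template suffix
    for name in hive-env.sh hive-log4j.properties hive-exec-log4j.properties; do
        [ -e "$name" ] || cp "$name.template" "$name"
    done
)

# Example (assumed path): copy_templates /usr/local/hive/conf
```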
Install the MySQL JDBC Connector
The metastore is kept in a third-party MySQL database, so the driver package mysql-connector-java-5.1.26-bin.jar must be downloaded and placed in Hive's lib directory.
cp mysql-connector-java-5.1.26-bin.jar hive/lib
Edit the configuration
hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!--<property>
    <name>hive.metastore.warehouse.dir</name>
    <value>hdfs://localhost:9000/hive/warehousedir</value>
  </property>-->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>
  <!--<property>
    <name>hive.exec.scratchdir</name>
    <value>hdfs://localhost:9000/hive/scratchdir</value>
  </property>-->
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive-${user.name}</value>
    <description>Scratch space for Hive jobs</description>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>/local/usr/hive/logs</value>
  </property>
  <!--<property>
    <name>hive.querylog.location</name>
    <value>/tmp/${user.name}</value>
    <description> Location of Hive run time structured log file </description>
  </property>-->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.0.177:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>yunho201311</value>
  </property>
  <property>
    <name>hive.aux.jars.path</name>
    <value>file:///usr/local/hive/lib/hive-hbase-handler-0.13.1.jar,file:///usr/local/hive/lib/protobuf-java-2.5.0.jar,file:///usr/local/hive/lib/hbase-client-0.98.6.1-hadoop2.jar,file:///usr/local/hive/lib/hbase-common-0.98.6.1-hadoop2.jar,file:///usr/local/hive/lib/zookeeper-3.4.5.jar,file:///usr/local/hive/lib/guava-11.0.2.jar</value>
  </property>
  <!--<property>
    <name>hive.metastore.uris</name>
    <value>thrift://192.168.0.177:9083</value>
  </property>-->
  <property>
    <name>hive.zookeeper.quorum</name>
    <value>192.168.0.177</value>
    <description>The list of ZooKeeper servers to talk to. This is only needed for read/write locks.</description>
  </property>
</configuration>
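The value of hive.aux.jars.path must not contain spaces, carriage returns, or line breaks; write the whole list on a single line, otherwise all sorts of errors occur. The listed jars must also be copied into Hive's lib directory.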
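Because a single stray space or line break in that value causes trouble, it can help to generate the list rather than type it. The following is a minimal sketch; build_aux_jars_path and has_no_whitespace are hypothetical helper names, not part of Hive.

```shell
#!/bin/sh
# Join jar file names into a single-line, comma-separated list of file:// URLs.
# $1 is the lib directory (absolute path); remaining arguments are jar names.
build_aux_jars_path() (
    lib_dir=$1; shift
    value=""
    for jar in "$@"; do
        # append with a comma only when value is already non-empty
        value="${value:+$value,}file://$lib_dir/$jar"
    done
    printf '%s\n' "$value"
)

# Succeed only if the given string contains no whitespace at all.
has_no_whitespace() {
    [ "$(printf '%s' "$1" | tr -d '[:space:]')" = "$1" ]
}

# Example:
#   build_aux_jars_path /usr/local/hive/lib hive-hbase-handler-0.13.1.jar zookeeper-3.4.5.jar
```

The printed string can be pasted into the <value> element as-is; running has_no_whitespace on a hand-edited value is a quick check before starting Hive.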
hive-env.sh
# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/usr/local/hadoop
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/usr/local/hive/conf
Start Hive
./hive -hiveconf hive.root.logger=DEBUG,console
Errors encountered and how to fix them
Error:
ERROR exec.DDLTask: java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/mapreduce/TableInputFormatBase
Missing hbase-server-0.98.6.1-hadoop2.jar
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingInterface
Missing hbase-protocol-0.98.6.1-hadoop2.jar
14/11/04 13:34:57 [main]: ERROR exec.DDLTask: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:MetaException(message:java.io.IOException: java.lang.reflect.InvocationTargetException
Caused by: java.lang.ClassNotFoundException: org.cloudera.htrace.Trace
Missing htrace-core-2.04.jar
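Each of these errors names a class, and the fix is to find the jar that ships it. One rough way to do that, sketched below, relies on the fact that entry names are stored uncompressed inside a jar (zip) file, so a fixed-string grep can find them without unpacking. find_jar_for_class is a hypothetical helper and the HBase lib path in the example comment is an assumption; treat hits as candidates to verify rather than a definitive answer.

```shell
#!/bin/sh
# List the jars under a directory whose contents mention the given class.
# $1 is the directory to search, $2 is a dotted or slashed class name.
find_jar_for_class() (
    lib_dir=$1
    # entry names inside a jar use slashes, so convert dots
    class_path=$(printf '%s' "$2" | tr '.' '/')
    # -F: fixed string (class names may contain '$'); -l: print file names only
    grep -lF -- "$class_path" "$lib_dir"/*.jar 2>/dev/null
)

# Example (assumed path):
#   find_jar_for_class /usr/local/hbase/lib org.cloudera.htrace.Trace
```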
Specified key was too long; max key length is 767 bytes
For this problem, after reading some material online, the fix is:
alter database hive character set latin1;
Drop the hive database and set the character set to latin1 when recreating it.
Database character set settings:
latin1 -- cp1252 West European
latin1_swedish_ci
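The 767 in the error message is a limit in bytes, so the number of characters that fit in an indexed column depends on the character set. The arithmetic below is a sketch under the assumption that the hive database had originally been created with MySQL's utf8, which needs up to 3 bytes per character; the notes above only record that switching to latin1 (1 byte per character) fixed it. max_indexed_chars is a hypothetical helper name.

```shell
#!/bin/sh
# Largest number of characters that fits in an index key.
# $1 is the key limit in bytes, $2 is the maximum bytes per character.
max_indexed_chars() {
    echo $(( $1 / $2 ))
}

max_indexed_chars 767 3   # utf8 (assumed original charset): prints 255
max_indexed_chars 767 1   # latin1: prints 767
```

Under that assumption, any indexed column wider than 255 characters would exceed the limit with utf8 but fit with latin1.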
Start
[root@iiot-test-server1 bin]# sh hive
Logging initialized using configuration in file:/usr/local/hive/conf/hive-log4j.properties
hive> show tables;
OK
dev_opt
dev_opt1
Time taken: 0.489 seconds, Fetched: 2 row(s)
Integration
The goal is only to query HBase data from Hive, not to modify it, so an external table is sufficient.
The dev_opt table in HBase
'dev_opt', {NAME => 'opt', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '1', TTL => 'FOREVER', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
ENABLED: true
dev_opt has a single column family, opt, with two columns under it: opt:dvid and opt:value.
Create the external tables
CREATE EXTERNAL TABLE dev_opt(key string, opt map<string,string>)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = "opt:")
TBLPROPERTIES("hbase.table.name" = "dev_opt");
Query in Hive:
hive> select * from dev_opt;
Result:
…
bd98db741dfa471f8fbf413841da4b7e-test_yz-2014-10-29 15:37:11 {"dvid":"7","value":"1"}
c95808d07d83430d919b3766cafc3ff3-username-2014-10-22 09:51:13 {"dvid":"5","value":"commandvaluestr"}
Time taken: 0.138 seconds, Fetched: 38 row(s)
CREATE EXTERNAL TABLE dev_opt1(key string, dvid int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = "opt:dvid,opt:value")
TBLPROPERTIES("hbase.table.name" = "dev_opt");
Query in Hive:
hive> select * from dev_opt1;
…
bd98db741dfa471f8fbf413841da4b7e-test_yz-2014-10-29 15:36:34 3 1
bd98db741dfa471f8fbf413841da4b7e-test_yz-2014-10-29 15:37:11 7 1
c95808d07d83430d919b3766cafc3ff3-username-2014-10-22 09:51:13 5 commandvaluestr
Time taken: 0.986 seconds, Fetched: 38 row(s)
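The two tables above differ only in hbase.columns.mapping: "opt:" maps the whole column family to a Hive map, while "opt:dvid,opt:value" maps individual qualifiers to typed columns. Both statements leave the row key entry implicit, which evidently worked here; the sketch below builds the second kind of mapping with an explicit :key entry, which is my understanding of the form the Hive HBase integration documentation recommends. build_columns_mapping is a hypothetical helper.

```shell
#!/bin/sh
# Build an hbase.columns.mapping value for one column family.
# $1 is the column family; remaining arguments are column qualifiers,
# listed in the same order as the Hive columns after the key.
build_columns_mapping() (
    family=$1; shift
    mapping=":key"
    for qualifier in "$@"; do
        mapping="$mapping,$family:$qualifier"
    done
    printf '%s\n' "$mapping"
)

build_columns_mapping opt dvid value   # prints :key,opt:dvid,opt:value
```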
hive> select * from dev_opt1 where dvid = 5;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/CompatibilityFactory
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addHBaseDependencyJars(TableMapReduceUtil.java:707)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:752)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.configureJobConf(HBaseStorageHandler.java:392)
at org.apache.hadoop.hive.ql.plan.PlanUtils.configureJobConf(PlanUtils.java:849)
at org.apache.hadoop.hive.ql.plan.MapWork.configureJobConf(MapWork.java:503)
at org.apache.hadoop.hive.ql.plan.MapredWork.configureJobConf(MapredWork.java:68)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:368)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1503)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1270)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.CompatibilityFactory
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 26 more
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. org/apache/hadoop/hbase/CompatibilityFactory
Missing hbase-hadoop2-compat-0.98.6.1-hadoop2.jar
Missing hbase-hadoop-compat-0.98.6.1-hadoop2.jar
java.lang.NoClassDefFoundError: org/cliffc/high_scale_lib/Counter
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addHBaseDependencyJars(TableMapReduceUtil.java:707)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:752)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.configureJobConf(HBaseStorageHandler.java:392)
at org.apache.hadoop.hive.ql.plan.PlanUtils.configureJobConf(PlanUtils.java:849)
at org.apache.hadoop.hive.ql.plan.MapWork.configureJobConf(MapWork.java:503)
at org.apache.hadoop.hive.ql.plan.MapredWork.configureJobConf(MapredWork.java:68)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:368)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1503)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1270)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.ClassNotFoundException: org.cliffc.high_scale_lib.Counter
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 26 more
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. org/cliffc/high_scale_lib/Counter
Missing high-scale-lib-1.1.1.jar
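Rather than fixing one NoClassDefFoundError per attempt, the jars named in the errors so far can be copied in one pass. This is a sketch, not a complete dependency list: copy_hbase_jars is a hypothetical helper, the jar names are exactly the ones recorded in these notes, and the directories in the example comment are assumptions.

```shell
#!/bin/sh
# Copy the HBase-side jars named in these notes into Hive's lib directory.
# $1 is the HBase lib directory, $2 is the Hive lib directory.
# Returns non-zero and reports on stderr if any jar is not found.
copy_hbase_jars() (
    hbase_lib=$1
    hive_lib=$2
    status=0
    for jar in \
        hbase-server-0.98.6.1-hadoop2.jar \
        hbase-protocol-0.98.6.1-hadoop2.jar \
        hbase-hadoop-compat-0.98.6.1-hadoop2.jar \
        hbase-hadoop2-compat-0.98.6.1-hadoop2.jar \
        htrace-core-2.04.jar \
        high-scale-lib-1.1.1.jar
    do
        if [ -f "$hbase_lib/$jar" ]; then
            cp "$hbase_lib/$jar" "$hive_lib/"
        else
            echo "not found: $jar" >&2
            status=1
        fi
    done
    exit "$status"
)

# Example (assumed paths):
#   copy_hbase_jars /usr/local/hbase/lib /usr/local/hive/lib
```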
hive> select * from dev_opt1 where dvid = 5;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1413857279729_0001, Tracking URL = http://iiot-test-server1:8088/proxy/application_1413857279729_0001/
Kill Command = /usr/local/hadoop/bin/hadoop job -kill job_1413857279729_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2014-11-06 13:42:31,721 Stage-1 map = 0%, reduce = 0%
2014-11-06 13:42:40,043 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.91 sec
MapReduce Total cumulative CPU time: 2 seconds 910 msec
Ended Job = job_1413857279729_0001
MapReduce Jobs Launched:
Job 0: Map: 1 Cumulative CPU: 2.91 sec HDFS Read: 259 HDFS Write: 134 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 910 msec
OK
100WJL001M11000000000001-lmm-2014-11-01 20:00:31 5 12
c95808d07d83430d919b3766cafc3ff3-username-2014-10-22 09:51:13 5 commandvaluestr
Time taken: 34.5 seconds, Fetched: 2 row(s)