  • HBase 0.96 and Hive 0.12 integration: a reliable walkthrough and problem summary

    Original thread: http://www.aboutyun.com/thread-7881-1-1.html

    Questions this post answers:
    1. Does installing Hive require installing MySQL?
    2. Is Hive split into a client side and a server side?
    3. Which two kinds of database can hold Hive's metadata?
    4. What is the key to integrating Hive with HBase?
    5. Does installing Hive require Hadoop?
    6. What preparation does Hive/HBase integration need?






    Plenty of material exists online, and most of it is identical; it turns out much of it was translated from one foreign article and never actually tested. This post records the process as actually practiced.
    It continues from:
    hadoop2.2 fully distributed reliable installation guide
    http://www.aboutyun.com/thread-7684-1-1.html
    hbase 0.96 on hadoop2.2, three-node fully distributed reliable installation guide
    http://www.aboutyun.com/thread-7746-1-1.html
    Because of the limitations of the Derby database, we use MySQL as the metastore database.
    What is wrong with Derby?
    1. Derby does not allow multiple clients to log in at the same time.
    2. A Derby session must be started from the same directory every time, or the tables you created may not be found.
    For example, if you start hive from /hive, the tables you create are stored under /hive; start it from /home and they are stored under /home. This leaves beginners thoroughly confused. If this is still unclear, see:
    -----------
    Why tables created with Derby as the Hive metastore cannot be found
    http://www.aboutyun.com/thread-7803-1-1.html
    -----------
    Let's begin the installation:
    1. Download hive
    Link: http://pan.baidu.com/s/1eQw0o50  password: mgy6
    2. Install:
    tar zxvf hive-0.12.0.tar.gz
    Rename the extracted directory to hive (the original post showed a screenshot of the resulting layout here).




    3. Replace jars to match hbase0.96 and hadoop2.2

    The hive we downloaded was built against hadoop 1.x and hbase 0.94, so the jars must be replaced: our hbase0.96 is built on hadoop2.2, so we first have to resolve hive's hadoop version. Every hive build on the official site at the moment is compiled against a 1.x hadoop, so strictly speaking you would rebuild hive from source against hadoop 2.x; in practice, swapping the jars as follows is enough:

    (1) Go into /usr/hive/lib
    (The original post showed a partial listing of this directory here.)

    (2) Sync the hbase version
    cd into hive-0.12.0/lib and delete the two jars there whose names start with hbase-0.94, then copy over every jar starting with hbase from /home/hadoop/hbase-0.96.0-hadoop2/lib:

    find /usr/hbase/lib -name "hbase*.jar" | xargs -i cp {} ./
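    Putting step (2) together, a minimal sketch, assuming Hive is installed under /usr/hive and HBase under /usr/hbase:

    cd /usr/hive/lib
    # remove the hbase 0.94 jars bundled with hive 0.12
    rm -f hbase-0.94*.jar
    # copy every hbase jar from the hbase 0.96 installation
    find /usr/hbase/lib -name "hbase*.jar" | xargs -i cp {} ./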


    (3) The basic sync is done
    Now double-check that the zookeeper and protobuf jars match the ones hbase uses; if they differ,
    copy protobuf-*.jar and zookeeper-3.4.5.jar from hbase's lib into hive/lib.
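    A quick way to compare the two sides, assuming the install paths used in this guide:

    # list the zookeeper/protobuf jars on each side; the versions should match
    ls /usr/hive/lib  | grep -E "zookeeper|protobuf"
    ls /usr/hbase/lib | grep -E "zookeeper|protobuf"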


    (4) Use MySQL as the metastore database
    Find a MySQL JDBC jar, mysql-connector-java-5.1.10-bin.jar, and copy it into hive-0.12.0/lib as well.

    You can check whether it is already there with a find command.

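    (The original screenshot is lost; a likely equivalent check, with the path assumed from this guide:)

    find /usr/hive/lib -name "mysql-connector*"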

    If it is not there, download it:
    Link: http://pan.baidu.com/s/1gdCDoGj  password: 80yl
    --------------------------------------------------------------------------
    Note: give mysql-connector-java-5.1.10-bin.jar
    permission 777 (chmod 777 mysql-connector-java-5.1.10-bin.jar)
    --------------------------------------------------------------------------
    Also check that the hbase-hive bridge jar is present.

    You can check with the following command:

    aboutyun@master:/usr/hive/lib$ find -name hive-hbase-handler*
    ./hive-hbase-handler-0.13.0-SNAPSHOT.jar
    If it is missing, download it:

    Link: http://pan.baidu.com/s/1gd9p0Fh  password: 94g1

    4. Install mysql
    • On Ubuntu, install via apt-get
    • sudo apt-get install mysql-server
    • Create the database for hive's metadata
    • create database hivemeta
    • Create a hive user and grant it privileges (a consolidated sketch follows this list)
    • grant all on hivemeta.* to hive@'%' identified by 'hive';
    • flush privileges;
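    Putting the MySQL setup together, a minimal sketch (the user name, password, and database name are the ones used in this guide; adjust them to your environment):

    sudo apt-get install mysql-server
    # run the metastore DDL as the mysql root user
    mysql -u root -p -e "CREATE DATABASE hivemeta; \
      GRANT ALL ON hivemeta.* TO 'hive'@'%' IDENTIFIED BY 'hive'; \
      FLUSH PRIVILEGES;"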
    If you are not familiar with installing mysql, see:

    Uninstalling and installing mysql on Ubuntu
    http://www.aboutyun.com/thread-7788-1-1.html
    -------------------------------------------
    A few words on the commands above:
    • sudo apt-get install mysql-server installs the database server; if you want to try connecting remotely from another client, also install mysql-client

    • create database hivemeta
    creates the database that will hold hive's metadata

    • grant all on hivemeta.* to hive@'%' identified by 'hive'; grants the privileges, and it matters: without it, remote hive clients will fail to connect.
    Do not copy this verbatim; adapt it to your own setup. Here both the user name and the password are hive.

    If the connection still fails, try the root user:

    grant all on hivemeta.* to 'root'@'%' identified by '123';
    flush privileges;



    ------------------------------------------

    5. Configure the hive-site file:



    Things to watch in the configuration below:
    (1) It uses mysql's root user with password 123; if you use the hive user, change the user name and password to hive.



    (2) Create the HDFS directories and grant permissions

    Note for the above (on hadoop 2.x, mkdir needs -p to create parent directories):

    bin/hadoop fs -mkdir -p  /hive/warehouse
    bin/hadoop fs -mkdir -p  /hive/scratchdir
    bin/hadoop fs -chmod g+w /hive/warehouse
    bin/hadoop fs -chmod g+w /hive/scratchdir
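    A quick check that both directories exist with the expected permissions:

    bin/hadoop fs -ls /hive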

    (3) Be sure hive.aux.jars.path is configured correctly:
    the value must contain no line breaks or spaces. Line breaks especially: many articles split the list across lines, and that is an easy trap for newcomers to fall into.

    <property>
      <name>hive.aux.jars.path</name>
      <value>file:///usr/hive/lib/hive-hbase-handler-0.13.0-SNAPSHOT.jar,file:///usr/hive/lib/protobuf-java-2.5.0.jar,file:///usr/hive/lib/hbase-client-0.96.0-hadoop2.jar,file:///usr/hive/lib/hbase-common-0.96.0-hadoop2.jar,file:///usr/hive/lib/zookeeper-3.4.5.jar,file:///usr/hive/lib/guava-11.0.2.jar</value>
    </property>
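    One way to sidestep the line-break trap is to build the value on a single line from the shell; a sketch, assuming the jars sit in /usr/hive/lib:

    # print the six jars as one comma-separated file:// list, ready to paste into <value>
    ls /usr/hive/lib/hive-hbase-handler*.jar /usr/hive/lib/protobuf-java*.jar \
       /usr/hive/lib/hbase-client*.jar /usr/hive/lib/hbase-common*.jar \
       /usr/hive/lib/zookeeper*.jar /usr/hive/lib/guava*.jar \
      | sed 's|^|file://|' | paste -sd, -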






    With the points above settled, put the following into the hive-site file.
    --------------------------------
    Two configurations are shown here, remote and local. Prefer the remote configuration.

    Remote configuration

    <configuration>
    <property>
      <name>hive.metastore.warehouse.dir</name>
      <value>hdfs://master:8020/hive/warehouse</value>
    </property>
    <property>
      <name>hive.exec.scratchdir</name>
      <value>hdfs://master:8020/hive/scratchdir</value>
    </property>
    <property>
      <name>hive.querylog.location</name>
      <value>/usr/hive/logs</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://172.16.77.15:3306/hivemeta?createDatabaseIfNotExist=true</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>hive</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>hive</value>
    </property>
    <property>
      <name>hive.aux.jars.path</name>
      <value>file:///usr/hive/lib/hive-hbase-handler-0.13.0-SNAPSHOT.jar,file:///usr/hive/lib/protobuf-java-2.5.0.jar,file:///usr/hive/lib/hbase-client-0.96.0-hadoop2.jar,file:///usr/hive/lib/hbase-common-0.96.0-hadoop2.jar,file:///usr/hive/lib/zookeeper-3.4.5.jar,file:///usr/hive/lib/guava-11.0.2.jar</value>
    </property>
    <property>
      <name>hive.metastore.uris</name>
      <value>thrift://172.16.77.15:9083</value>
    </property>
    </configuration>
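    With the remote configuration, the metastore must be running on the host named in hive.metastore.uris before any client connects; a sketch:

    # on the metastore host (172.16.77.15 in this guide)
    nohup hive --service metastore > metastore.log 2>&1 &
    # then, from any machine carrying the same hive-site file
    hive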




    Local configuration:

    <configuration>
    <property>
      <name>hive.metastore.warehouse.dir</name>
      <value>/user/hive_remote/warehouse</value>
    </property>
    <property>
      <name>hive.metastore.local</name>
      <value>true</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://localhost/hive_remote?createDatabaseIfNotExist=true</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>root</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>123</value>
    </property>
    </configuration>




    -------------------------------------------------------------------------------------


    6. Other configuration changes:


    1. Edit hadoop's hadoop-env.sh (otherwise starting hive reports class-not-found errors)
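    The screenshot of the exact change is lost; a common form of this edit (an assumption, not the post's original lines) is to extend the hadoop classpath with the hive jars:

    # appended to hadoop-env.sh (assumed; point it at your own hive install)
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/hive/lib/*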

    2. Edit hive-config.sh under $HIVE_HOME/bin, adding three lines (see the sketch below)
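    This screenshot is lost too; the three lines conventionally added to hive-config.sh are the environment exports below (the paths are assumptions; use your own):

    # assumed paths; adjust to your installations
    export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
    export HADOOP_HOME=/usr/hadoop
    export HIVE_HOME=/usr/hive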







    First, the various problems encountered
    1. Problems encountered

    Problem 1: the metastore is not started
    To summarize up front: the metastore must be started first, with one of the commands below:
    (1) hive --service metastore
    (2) hive --service metastore -hiveconf hive.root.logger=DEBUG,console

    Note:
    -hiveconf hive.root.logger=DEBUG,console turns on debug mode, which makes errors easier to track down


    If you skip the metastore and instead just run

    hive




    you will hit the following error:



    Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
            at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:295)
            at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:679)
            at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
    Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
            at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1345)
            at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
            at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
            at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2420)
            at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2432)
            at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:289)
            ... 7 more
    Caused by: java.lang.reflect.InvocationTargetException
            at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
            at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
            at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
            at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
            at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1343)
            ... 12 more
    Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: 
    java.net.ConnectException: Connection refused
            at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
            at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:288)
            at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:169)
            at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
            at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
            at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
            at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
            at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1343)
            at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
            at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
            at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2420)
            at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2432)
            at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:289)
            at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:679)
            at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
    Caused by: java.net.ConnectException: Connection refused
            at java.net.PlainSocketImpl.socketConnect(Native Method)
            at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
            at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
            at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
            at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
            at java.net.Socket.connect(Socket.java:579)
            at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
            ... 19 more
    )
            at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:334)
            at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:169)
            ... 17 more



    Problem 2: what a started metastore looks like

    hive --service metastore
    Starting Hive Metastore Server
    14/05/27 20:14:51 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
    14/05/27 20:14:51 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
    14/05/27 20:14:51 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
    14/05/27 20:14:51 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
    14/05/27 20:14:51 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
    14/05/27 20:14:51 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
    14/05/27 20:14:51 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative



    When I first ran into this I knew something was probably misconfigured, and it ate a lot of time without turning up the right fix. When I ran

    hive --service metastore

    again, it reported that port 9083 was already in use, as in the error below. Port 9083 being occupied means a metastore is already bound there and clients can already talk to the host; typing hive then brought up the CLI as expected.

    Could not create ServerSocket on address 0.0.0.0/0.0.0.0:9083.
            at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:93)
            at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:75)
            at org.apache.hadoop.hive.metastore.TServerSocketKeepAlive.<init>(TServerSocketKeepAlive.java:34)
            at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:4291)
            at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:4248)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
    Exception in thread "main" org.apache.thrift.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:9083.
            at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:93)
            at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:75)
            at org.apache.hadoop.hive.metastore.TServerSocketKeepAlive.<init>(TServerSocketKeepAlive.java:34)
            at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:4291)
            at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:4248)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

    If the port is occupied, find the offending process with:

    netstat -ap | grep 9083

    This prints the id of the process holding the port; then kill it with:

    kill -9 <pid>
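    If lsof is available, the two steps collapse into one line (use with care: this kills whatever owns the port):

    kill -9 $(lsof -t -i :9083)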

    For more on this, see:
    Common Linux commands used when configuring hadoop

    Problem 3: hive.aux.jars.path contains a line break or a space; errors like the following appear


    Symptom 1: the path arrives as /usr/hive/lib/hbase-client-0.96.0-
    hadoop2.jar
    The whole path is broken apart so the system cannot recognize it, and the break is exactly the line break in the config.

    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
    java.io.FileNotFoundException: File does not exist: hdfs://hydra0001/opt/module/hive-0.10.0-cdh4.3.0/lib/hive-builtins-0.10.0-cdh4.3.0.jar
    2014-05-24 19:32:06,563 ERROR exec.Task (SessionState.java:printError(440)) - Job Submission failed with exception 'java.io.FileNotFoundException(File file:/usr/hive/lib/hbase-client-0.96.0-
    hadoop2.jar does not exist)'
    java.io.FileNotFoundException: File file:/usr/hive/lib/hbase-client-0.96.0-
    hadoop2.jar does not exist
            at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:520)
            at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:398)
            at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:337)
            at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
            at org.apache.hadoop.mapreduce.JobSubmitter.copyRemoteFiles(JobSubmitter.java:139)
            at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:212)
            at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:300)
            at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:387)
            at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
            at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:415)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
            at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
            at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
            at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:415)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
            at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
            at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
            at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:424)
            at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
            at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:152)
            at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
            at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1481)
            at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1258)
            at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1092)
            at org.apache.hadoop.hive.ql.Driver.run(Driver.java:932)
            at org.apache.hadoop.hive.ql.Driver.run(Driver.java:922)
            at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
            at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
            at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
            at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790)
            at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
            at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
    2014-05-24 19:32:06,571 ERROR ql.Driver (SessionState.java:printError(440)) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask



    Symptom 2:

    <property>
      <name>hive.aux.jars.path</name>
      <value>
    file:///usr/hive/lib/hive-hbase-handler-0.13.0-SNAPSHOT.jar,
    file:///usr/hive/lib/protobuf-java-2.5.0.jar,
    file:///usr/hive/lib/hbase-client-0.96.0-hadoop2.jar,
    file:///usr/hive/lib/hbase-common-0.96.0-hadoop2.jar,
    file:///usr/hive/lib/zookeeper-3.4.5.jar,
    file:///usr/hive/lib/guava-11.0.2.jar</value>
    </property>




    The above looks tidy, but copy it verbatim into the config file and you get the following error:

    Caused by: java.net.URISyntaxException: Illegal character in scheme name at index 0: 
    file:///usr/hive/lib/protobuf-java-2.5.0.jar
            at java.net.URI$Parser.fail(URI.java:2829)
            at java.net.URI$Parser.checkChars(URI.java:3002)
            at java.net.URI$Parser.checkChar(URI.java:3012)
            at java.net.URI$Parser.parse(URI.java:3028)
            at java.net.URI.<init>(URI.java:753)
            at org.apache.hadoop.fs.Path.initialize(Path.java:203)
            ... 37 more
    Job Submission failed with exception 'java.lang.IllegalArgumentException(java.net.URISyntaxException: Illegal character in scheme name at index 0: 
    file:///usr/hive/lib/protobuf-java-2.5.0.jar)'
    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask








    Verifying the hive/hbase integration:
    1. Start hbase and hive
    Start hbase:

    hbase shell





    Start hive:
    (1) Start the metastore (hive --service metastore)

    2. Create the mapped table


    CREATE TABLE hbase_table_1(key int, value string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val") TBLPROPERTIES ("hbase.table.name" = "xyz");

    This creates the table hbase_table_1 in hive and, through the org.apache.hadoop.hive.hbase.HBaseStorageHandler mapping class, a corresponding table xyz in hbase. In the mapping ":key,cf1:val", :key binds hive's key column to the hbase row key, and cf1:val binds value to the val column in column family cf1.
    (1) Before running the statement:
    first look at both hbase and hive.
    hbase is empty (no user tables yet).



    hive is empty too (no tables yet).



    (2) Run:

    CREATE TABLE hbase_table_1(key int, value string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val") TBLPROPERTIES ("hbase.table.name" = "xyz");




    (3) Compare the changes:
    hbase now shows the new table xyz.


    hive shows the new table hbase_table_1.



    3. Verify the integration: insert data

    (1) Add data through hbase
    Insert one record in hbase:

    put 'xyz','10001','cf1:val','www.aboutyun.com'




    Then look at how the hbase and hive tables changed:
    (1) hbase now contains the new row.
    (2) hive sees the same row through the mapped table.
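    With the screenshots lost, equivalent command-line checks, using the table names from this guide:

    # hbase side: the inserted row should appear
    echo "scan 'xyz'" | hbase shell
    # hive side: the same row should come back through the mapping
    hive -e "SELECT * FROM hbase_table_1;"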

    (2) Add data through hive
    The approach that circulates online, inserting from a pokes table, did not succeed here; from searching around, this looks like a bug in hive 0.12 (details at the JIRA link at the end):

    INSERT OVERWRITE TABLE hbase_table_1 SELECT * FROM pokes;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    java.lang.IllegalArgumentException: Property value must not be null
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:810)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:792)
    at org.apache.hadoop.hive.ql.exec.Utilities.copyTableJobPropertiesToConf(Utilities.java:1996)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.checkOutputSpecs(FileSinkOperator.java:864)
    at org.apache.hadoop.hive.ql.io.HiveOutputFormatImpl.checkOutputSpecs(HiveOutputFormatImpl.java:67)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:458)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:342)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:424)
    at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:152)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1481)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1258)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1092)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:932)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:922)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
    Job Submission failed with exception 'java.lang.IllegalArgumentException(Property value must not be null)'
    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask




    From what I found online, this is a bug that has been fixed in hive 0.13.0.
    Details:
    https://issues.apache.org/jira/browse/HIVE-5515
