zoukankan      html  css  js  c++  java
  • presto-cli通过hive查询hdfs

    1.  启动hive metastore

    2. 启动hive thrift接口

    参考:http://www.cnblogs.com/kisf/p/7497261.html

    3. 下载presto server

    wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.186/presto-server-0.186.tar.gz
    

     

    4. 配置

    config.properties

    coordinator=true
    node-scheduler.include-coordinator=true
    http-server.http.port=20000
    query.max-memory=40GB
    query.max-memory-per-node=4GB
    exchange.http-client.request-timeout=20s
    http-server.threads.max=400
    discovery-server.enabled=true
    discovery.uri=http://10.183.225.158:20000
    

      该配置将coordinate与worker配置再一台机器上。集群部署建议分开:

    etc/config.properties包含 Presto Server 相关的配置,每一个 Presto Server 可以同时作为 coordinator 和 worker 使用。你可以将他们配置在一个节点上,但是,在一个大的集群上建议分开配置以提高性能。
    
    coordinator 的最小配置:
    coordinator=true
    node-scheduler.include-coordinator=false
    http-server.http.port=9090
    task.max-memory=1GB
    discovery-server.enabled=true
    discovery.uri= http://xx1:9090
    
    worker 的最小配置:
    coordinator=false
    http-server.http.port=9090
    task.max-memory=1GB
    discovery.uri= http://xx2:9090
    
    可选的,作为测试,你可以在一个节点上同时配置两者:
    coordinator=true
    node-scheduler.include-coordinator=true
    http-server.http.port=9090
    task.max-memory=1GB
    discovery-server.enabled=true
    discovery.uri=http://xx3:9090

    node.properties

    node.environment=xxxoipresto01
    node.data-dir=/data/slot0/presto/presto-01-coordinator
    node.id=presto-01-coord
    

      

    jvm.properties

    -server
    -Xmx10G
    -XX:+UseConcMarkSweepGC
    -XX:+ExplicitGCInvokesConcurrent
    -XX:+CMSClassUnloadingEnabled
    -XX:+AggressiveOpts
    -XX:+HeapDumpOnOutOfMemoryError
    -XX:ReservedCodeCacheSize=600M
    -XX:NativeMemoryTracking=summary
    -Dcom.sun.management.jmxremote
    -Dcom.sun.management.jmxremote.authenticate=false
    -Dcom.sun.management.jmxremote.ssl=false
    -Dcom.sun.management.jmxremote.port=20001
    -Dcom.sun.management.jmxremote.rmi.port=20001
    

      

    log.properties

    com.facebook.presto=INFO
    

      

    etc/catalog/hive.properties

    connector.name=hive-hadoop2
    hive.metastore.uri=thrift://hadoop1:9083
    hive.config.resources=/xxx/soft/hadoop-2.7.3/etc/hadoop/core-site.xml,/xxx/soft/hadoop-2.7.3/etc/hadoop/hdfs-site.xml
    
    hive.metastore.authentication.type=KERBEROS
    hive.metastore.service.principal=hive/hadoop1@JENKIN.COM
    hive.metastore.client.principal=presto@JENKIN.COM
    hive.metastore.client.keytab=/etc/presto/presto.keytab
    
    hive.hdfs.authentication.type=KERBEROS
    hive.hdfs.impersonation.enabled=false
    hive.hdfs.presto.principal=presto@JENKIN.COM
    hive.hdfs.presto.keytab=/etc/presto/presto.keytab
    

      

     

    4. 启动presto coordinate

    /usr/bin/nohup /xxx/soft/lesports-presto-server-0.176/bin/launcher run --node-config /etc/presto/01/node.properties --jvm-config /etc/presto/01/jvm.config --config /etc/presto/01/config.properties --log-levels-file /etc/presto/01/log.properties >/xxx/soft/lesports-presto-server-0.176/stdout.log 2>&1 &
    

      这里为了做集群隔离把配置文件放到了另外的目录。

    启动是否成功看日志,或者http://10.183.225.158:20000/看管理界面,或者:

    netstat -nlp | grep 20000
    

      

    5. 下载presto-cli

    wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.186/presto-cli-0.186-executable.jar
    

      

    6. 改名为presto-cli

    mv presto-cli-0.186-executable.jar presto-cli
    

      

    7. 启动

    ./presto-cli --server hadoop1:20000 --catalog hive --schema jenkindb
    

      10.183.225.158的hostname为hadoop1

    8. 查询

    presto:jenkindb> show tables;
      Table   
    ----------
     jenkintb 
     test1    
    (2 rows)
    
    Query 20171020_023712_00002_a78ua, FINISHED, 1 node
    Splits: 18 total, 18 done (100.00%)
    0:01 [2 rows, 49B] [2 rows/s, 51B/s]
    
    presto:jenkindb> select * from jenkintb;
     id |   name   
    ----+----------
      1 | jenkin   
      2 | jenkin.k 
      3 | anne     
    (3 rows)
    
    Query 20171020_023724_00003_a78ua, FINISHED, 1 node
    Splits: 17 total, 17 done (100.00%)
    0:00 [3 rows, 27B] [8 rows/s, 72B/s]
    

      

    hive导入数据参考:http://www.cnblogs.com/kisf/p/7497261.html

    问题解决:

    问题:failed: No worker nodes available 

    解决:presto coordinate config.properties增加 node-scheduler.include-coordinator=true

    问题:org.apache.hadoop.ipc.RemoteException: User: presto@JENKIN.COM is not allowed to impersonate root

    解决:hive.properties增加 hive.hdfs.impersonation.enabled=false

  • 相关阅读:
    C++ 将对象写入文件 并读取
    IronPython fail to add reference to WebDriver.dll
    How to Capture and Decrypt Lync Server 2010 TLS Traffic Using Microsoft Tools
    .net code injection
    数学系学生应该知道的十个学术网站
    Difference Between Currency Swap and FX Swap
    Swift开源parser
    谈谈我对证券公司一些部门的理解(前、中、后台)[z]
    JDK8记FullGC时候Metaspace内存不会被垃圾回收
    JVM源码分析之JDK8下的僵尸(无法回收)类加载器[z]
  • 原文地址:https://www.cnblogs.com/kisf/p/7698304.html
Copyright © 2011-2022 走看看