zoukankan      html  css  js  c++  java
  • presto-cli通过hive查询hdfs

    1.  启动hive metastore

    2. 启动hive thrift接口

    参考:http://www.cnblogs.com/kisf/p/7497261.html

    3. 下载presto server

    wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.186/presto-server-0.186.tar.gz
    

     

    4. 配置

    config.properties

    coordinator=true
    node-scheduler.include-coordinator=true
    http-server.http.port=20000
    query.max-memory=40GB
    query.max-memory-per-node=4GB
    exchange.http-client.request-timeout=20s
    http-server.threads.max=400
    discovery-server.enabled=true
    discovery.uri=http://10.183.225.158:20000
    

      该配置将coordinate与worker配置再一台机器上。集群部署建议分开:

    etc/config.properties包含 Presto Server 相关的配置,每一个 Presto Server 可以同时作为 coordinator 和 worker 使用。你可以将他们配置在一个节点上,但是,在一个大的集群上建议分开配置以提高性能。
    
    coordinator 的最小配置:
    coordinator=true
    node-scheduler.include-coordinator=false
    http-server.http.port=9090
    task.max-memory=1GB
    discovery-server.enabled=true
    discovery.uri= http://xx1:9090
    
    worker 的最小配置:
    coordinator=false
    http-server.http.port=9090
    task.max-memory=1GB
    discovery.uri= http://xx2:9090
    
    可选的,作为测试,你可以在一个节点上同时配置两者:
    coordinator=true
    node-scheduler.include-coordinator=true
    http-server.http.port=9090
    task.max-memory=1GB
    discovery-server.enabled=true
    discovery.uri=http://xx3:9090

    node.properties

    node.environment=xxxoipresto01
    node.data-dir=/data/slot0/presto/presto-01-coordinator
    node.id=presto-01-coord
    

      

    jvm.properties

    -server
    -Xmx10G
    -XX:+UseConcMarkSweepGC
    -XX:+ExplicitGCInvokesConcurrent
    -XX:+CMSClassUnloadingEnabled
    -XX:+AggressiveOpts
    -XX:+HeapDumpOnOutOfMemoryError
    -XX:ReservedCodeCacheSize=600M
    -XX:NativeMemoryTracking=summary
    -Dcom.sun.management.jmxremote
    -Dcom.sun.management.jmxremote.authenticate=false
    -Dcom.sun.management.jmxremote.ssl=false
    -Dcom.sun.management.jmxremote.port=20001
    -Dcom.sun.management.jmxremote.rmi.port=20001
    

      

    log.properties

    com.facebook.presto=INFO
    

      

    etc/catalog/hive.properties

    connector.name=hive-hadoop2
    hive.metastore.uri=thrift://hadoop1:9083
    hive.config.resources=/xxx/soft/hadoop-2.7.3/etc/hadoop/core-site.xml,/xxx/soft/hadoop-2.7.3/etc/hadoop/hdfs-site.xml
    
    hive.metastore.authentication.type=KERBEROS
    hive.metastore.service.principal=hive/hadoop1@JENKIN.COM
    hive.metastore.client.principal=presto@JENKIN.COM
    hive.metastore.client.keytab=/etc/presto/presto.keytab
    
    hive.hdfs.authentication.type=KERBEROS
    hive.hdfs.impersonation.enabled=false
    hive.hdfs.presto.principal=presto@JENKIN.COM
    hive.hdfs.presto.keytab=/etc/presto/presto.keytab
    

      

     

    4. 启动presto coordinate

    /usr/bin/nohup /xxx/soft/lesports-presto-server-0.176/bin/launcher run --node-config /etc/presto/01/node.properties --jvm-config /etc/presto/01/jvm.config --config /etc/presto/01/config.properties --log-levels-file /etc/presto/01/log.properties >/xxx/soft/lesports-presto-server-0.176/stdout.log 2>&1 &
    

      这里为了做集群隔离把配置文件放到了另外的目录。

    启动是否成功看日志,或者http://10.183.225.158:20000/看管理界面,或者:

    netstat -nlp | grep 20000
    

      

    5. 下载presto-cli

    wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.186/presto-cli-0.186-executable.jar
    

      

    6. 改名为presto-cli

    mv presto-cli-0.186-executable.jar presto-cli
    

      

    7. 启动

    ./presto-cli --server hadoop1:20000 --catalog hive --schema jenkindb
    

      10.183.225.158的hostname为hadoop1

    8. 查询

    presto:jenkindb> show tables;
      Table   
    ----------
     jenkintb 
     test1    
    (2 rows)
    
    Query 20171020_023712_00002_a78ua, FINISHED, 1 node
    Splits: 18 total, 18 done (100.00%)
    0:01 [2 rows, 49B] [2 rows/s, 51B/s]
    
    presto:jenkindb> select * from jenkintb;
     id |   name   
    ----+----------
      1 | jenkin   
      2 | jenkin.k 
      3 | anne     
    (3 rows)
    
    Query 20171020_023724_00003_a78ua, FINISHED, 1 node
    Splits: 17 total, 17 done (100.00%)
    0:00 [3 rows, 27B] [8 rows/s, 72B/s]
    

      

    hive导入数据参考:http://www.cnblogs.com/kisf/p/7497261.html

    问题解决:

    问题:failed: No worker nodes available 

    解决:presto coordinate config.properties增加 node-scheduler.include-coordinator=true

    问题:org.apache.hadoop.ipc.RemoteException: User: presto@JENKIN.COM is not allowed to impersonate root

    解决:hive.properties增加 hive.hdfs.impersonation.enabled=false

  • 相关阅读:
    signal(SIGCHLD, SIG_IGN)和signal(SIGPIPE, SIG_IGN);
    关于pthread_cond_wait使用while循环判断的理解
    linux的sleep()和usleep()的使用和区别
    C中结构体的存储分配
    扯扯python的多线程的同步锁 Lock RLock Semaphore Event Condition
    线程属性的初始化以及销毁
    Mysql数据库一个表字段中存了id,并以逗号分隔,id对应的详细信息在另一个表中
    sqlyog 注册码
    Oracle 12C卸载图文教程
    Oracle12c Release1安装图解(详解)
  • 原文地址:https://www.cnblogs.com/kisf/p/7698304.html
Copyright © 2011-2022 走看看