  • Postgres by BigSQL and Hadoop_fdw

    Testing join operations between PostgreSQL and a remote Hive.

    Test environment

    CentOS 6.8

    HDP 2.4 cluster, with HiveServer2 running on the host named hdp

    Postgres by BigSQL(pg96)

    Installation Steps

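    Postgres by BigSQL ships a precompiled hadoop_fdw, so it can be installed directly with its pgc command. The alternative is to compile hadoop_fdw from source; I gave up on that because too many dependencies were missing during the build (see the hadoop_fdw build instructions if you want to try).

    Download the package: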

    $ wget http://oscg-downloads.s3.amazonaws.com/packages/postgresql-9.6.2-2-x64-bigsql.rpm
    

    Install the RPM package with sudo:

    $ sudo yum localinstall postgresql-9.6.2-2-x64-bigsql.rpm
    

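    PostgreSQL is installed to /opt/postgresql/pg96, and all libraries it uses are located in /opt/postgresql/pg96/lib, which reduces the chance of conflicts and other incompatibilities. You can add --prefix to install the package to a location of your choice.

    You can also combine the two previous steps: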

    $ sudo yum install http://oscg-downloads.s3.amazonaws.com/packages/postgresql-9.6.2-2-x64-bigsql.rpm
    

    Configure and Initialize the PostgreSQL Server

    Run the following command with sudo:

    $ sudo /opt/postgresql/pgc start pg96
    

    Using the Database

    Load the Postgres environment variables:

    $ . /opt/postgresql/pg96/pg96.env
    

    Check the status of pg96:

    $ sudo /opt/postgresql/pgc status
    

    Connect to the database:

    $ /opt/postgresql/pg96/bin/psql -U postgres -d postgres
    

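    Prerequisites for installing Hadoop FDW

    • A Hadoop cluster whose default Hive port, 10000, is reachable from other machines (HDP is used here)
    • Copy the following two jar files from the Hadoop cluster to the PostgreSQL server machine. I put them in /opt/hadoop/hive-client-lib (create the directory if it does not exist)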
    /usr/hdp/2.4.0.0-169/
        |
        `--- hadoop/
             |
             `--- hadoop-common-2.7.1.2.4.0.0-169.jar
        |
        `--- hive/
             |
             `--- lib
                  |
                  `--- hive-jdbc-1.2.1000.2.4.0.0-169-standalone.jar
    

    List the jar files on the PostgreSQL server:

    $ ls /opt/hadoop/hive-client-lib/
    hadoop-common-2.7.1.2.4.0.0-169.jar  hive-jdbc-1.2.1000.2.4.0.0-169-standalone.jar
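
    The check above can also be scripted. The following is a minimal sketch, not part of the original setup: check_hive_jars is a hypothetical helper name, and the two filename patterns are assumptions generalized from the jar names listed above.

```shell
# Report whether the two client jars this setup relies on are present in DIR.
# Usage: check_hive_jars DIR
check_hive_jars() (
    dir=$1
    status=0
    # Patterns are assumptions derived from the HDP 2.4 jar names in this post.
    for pattern in 'hadoop-common-*.jar' 'hive-jdbc-*-standalone.jar'; do
        found=no
        # $pattern is left unquoted on purpose so the shell expands the glob.
        for f in "$dir"/$pattern; do
            [ -f "$f" ] && found=yes
        done
        if [ "$found" = no ]; then
            echo "missing: $dir/$pattern" >&2
            status=1
        fi
    done
    exit "$status"
)
```

    For example, check_hive_jars /opt/hadoop/hive-client-lib prints nothing and returns 0 when both jars are in place.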
    
    • Test the JDBC connection to Hive
      On the PostgreSQL host, create a small JDBC program, HiveJdbcClient.java, with the following content:
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;
    
    public class HiveJdbcClient {
    
        private static final String url      = "jdbc:hive2://hdp:10000";
        private static final String user     = "hive";
        private static final String password = "123456";
        private static final String query    = "SHOW DATABASES";
    
        private static final String driverName = "org.apache.hive.jdbc.HiveDriver";
    
        public static void main(String[] args) throws SQLException {
    
            try {
                Class.forName(driverName);
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
                System.exit(1);
            }
    
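            // try-with-resources closes the connection, statement, and result set
            try (Connection con = DriverManager.getConnection(url, user, password);
                 Statement stmt = con.createStatement()) {
    
                System.out.println("Running: " + query);
                try (ResultSet res = stmt.executeQuery(query)) {
                    // Print the first column of every row (the database names)
                    while (res.next()) {
                        System.out.println(res.getString(1));
                    }
                }
            }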
        }
    }
    

    Note: the hostname hdp and its IP address must be mapped in /etc/hosts.
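
    To confirm the mapping before running the program, something like the following may help. It is only a sketch: lookup_host is a hypothetical helper that scans a hosts-format file (by default /etc/hosts) and prints the address mapped to a name.

```shell
# Print the IP address mapped to NAME in a hosts-format file.
# Usage: lookup_host NAME [HOSTS_FILE]   (HOSTS_FILE defaults to /etc/hosts)
lookup_host() (
    name=$1
    file=${2:-/etc/hosts}
    # Strip comments, then search the names that follow the address on each line.
    sed 's/#.*//' "$file" | awk -v name="$name" '
        {
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; found = 1; exit }
        }
        END { exit found ? 0 : 1 }'
)
```

    Running lookup_host hdp should print the IP address of the HiveServer2 host and return 0; a non-zero return means the entry is missing. On most Linux systems, getent hosts hdp performs a similar lookup through the system resolver.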

    Compile:

    javac HiveJdbcClient.java
    

    Run the program with the following command:

    java -cp .:$(echo /opt/hadoop/hive-client-lib/*.jar | tr ' ' :) HiveJdbcClient
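
    The $(echo ... | tr ' ' :) part of this command expands the *.jar glob and replaces the spaces between the resulting paths with colons, which yields a classpath. That breaks if a path itself contains a space. A sketch of an alternative that avoids the problem, using a hypothetical helper named jar_classpath:

```shell
# Join every .jar file in DIR into a colon-separated classpath string.
# Usage: jar_classpath DIR
jar_classpath() (
    dir=$1
    cp=
    for jar in "$dir"/*.jar; do
        # When DIR holds no jars the glob stays unexpanded; skip that literal.
        [ -e "$jar" ] || continue
        # Add a ':' separator only when cp already has an entry.
        cp=${cp:+$cp:}$jar
    done
    printf '%s\n' "$cp"
)
```

    It could then be used as: java -cp ".:$(jar_classpath /opt/hadoop/hive-client-lib)" HiveJdbcClient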
    

    The last two lines of output:

    Running: SHOW DATABASES
    default
    
    • Assuming the JDK is installed in /opt/jdk1.8.0_111, run the following command:
    ln -s /opt/jdk1.8.0_111/jre/lib/amd64/server/libjvm.so /opt/postgresql/pg96/lib/libjvm.so
    
    • Add the following two lines to /etc/profile, then source it:
    export LD_LIBRARY_PATH=/opt/jdk1.8.0_111/jre/lib/amd64/server:$LD_LIBRARY_PATH
    export HADOOP_FDW_CLASSPATH=/opt/postgresql/pg96/lib/postgresql/Hadoop_FDW.jar:$(echo /opt/hadoop/hive-client-lib/*.jar | tr ' ' :)
    

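    Here LD_LIBRARY_PATH is set to the parent directory of libjvm.so. Hadoop_FDW.jar does not exist yet; it is created in /opt/postgresql/pg96/lib/postgresql when hadoop_fdw is installed later.

    Once all of the above is configured, restart the pg96 service with the following commands: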
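
    A quick sanity check for the first variable, as a sketch: find_in_dirs is a hypothetical helper that walks a colon-separated directory list and prints the first directory that contains the given file.

```shell
# Print the first directory in a colon-separated LIST that contains FILE.
# Usage: find_in_dirs FILE LIST
find_in_dirs() (
    file=$1
    list=$2
    IFS=:      # split LIST on colons (local to this subshell)
    set -f     # disable globbing while the list is split
    for dir in $list; do
        [ -n "$dir" ] && [ -e "$dir/$file" ] && { printf '%s\n' "$dir"; exit 0; }
    done
    exit 1
)
```

    After /etc/profile has been sourced, find_in_dirs libjvm.so "$LD_LIBRARY_PATH" should print /opt/jdk1.8.0_111/jre/lib/amd64/server; a non-zero return means no directory in the list contains libjvm.so.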

    cd /opt/postgresql
    
    ./pgc restart pg96
    

    Install and Enable Hadoop-FDW

    ./pgc install hadoop_fdw2-pg96
    

    Create the test table on the machine running Hive

    hive> show databases;
    OK
    default
    
    hive> create table test_fdw(id int, height float);
    
    hive> insert into test_fdw values(1, 1.68);
    
    hive> select * from test_fdw;
    OK
    1	1.68
    

    Using hadoop_fdw from pg96

    /opt/postgresql/pg96/bin/psql -U postgres
    
    CREATE EXTENSION hadoop_fdw;
    
    CREATE SERVER hadoop_server FOREIGN DATA WRAPPER hadoop_fdw
      OPTIONS (HOST 'hdp', PORT '10000');
      
    CREATE USER MAPPING FOR PUBLIC SERVER hadoop_server;
    
    create foreign table foreign_hive(
         id int,
         height float)
         server hadoop_server OPTIONS (TABLE 'test_fdw');
    
    select * from foreign_hive;
     id |      height      
    ----+------------------
      1 | 1.67999994754791
    (1 row)

    Note: the value prints as 1.67999994754791 rather than 1.68 because Hive's float is a 4-byte single-precision type, while float without a precision in PostgreSQL means double precision. The single-precision value nearest to 1.68 is being displayed with double-precision digits.
    

    Testing a join between Hive and local PostgreSQL

    Create a table in PostgreSQL:

    create table local_postgresql (id int, name text);
    
    insert into local_postgresql values(1, 'li'),(2, 'wang');
    

    Test the join query:

    select * from foreign_hive join local_postgresql on foreign_hive.id= local_postgresql.id;
     id |      height      | id | name 
    ----+------------------+----+------
      1 | 1.67999994754791 |  1 | li
    (1 row)
    

  • Original article: https://www.cnblogs.com/zeppelin/p/7252861.html