  • Deploying a client outside a Kerberos-enabled big data platform cluster

    Please credit the source when reposting: http://www.cnblogs.com/xiaodf/

    This document describes how to deploy a big data platform client on a node outside a cluster that has Kerberos authentication enabled. With the client in place, users outside the cluster can use in-cluster services, for example querying HDFS data or submitting Spark jobs to run in the cluster.
    The deployment steps are as follows:

    1. Copy the Hadoop component packages from the cluster to the client

    Create the directory /opt/cloudera/parcels on the client:

    mkdir -p /opt/cloudera/parcels
    

    Copy the parcel CDH-5.7.2-1.cdh5.7.2.p0.18 into /opt/cloudera/parcels (a copy sketch is shown after the symlink commands below), then enter the directory and create a symlink:

    cd /opt/cloudera/parcels
    ln -s CDH-5.7.2-1.cdh5.7.2.p0.18 CDH
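
    The copy step itself is not shown above; a minimal sketch, assuming node1 keeps its parcels in the default /opt/cloudera/parcels location (run this before creating the symlink):

    # assumption: node1 stores its parcels under /opt/cloudera/parcels
    scp -r node1:/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18 /opt/cloudera/parcels/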
    

    2. Copy the Hadoop-related configuration files from the cluster to the client

    Create the directory /etc/hadoop and copy the /etc/hadoop/conf folder from the cluster into it (node1 is a node inside the cluster):

    mkdir /etc/hadoop
    scp -r node1:/etc/hadoop/conf /etc/hadoop
    

    Create the directory /etc/hive and copy the /etc/hive/conf folder from the cluster into it:

    mkdir /etc/hive
    scp -r node1:/etc/hive/conf /etc/hive
    

    Create the directory /etc/spark and copy the /etc/spark/conf folder from the cluster into it:

    mkdir /etc/spark
    scp -r node1:/etc/spark/conf /etc/spark
    

    3. Copy the Kerberos configuration file krb5.conf from the cluster to the client

    scp node1:/etc/krb5.conf  /etc
    

    4. Run the client script client.sh, whose contents are shown below:

    export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
    export HADOOP_CONF=/etc/hadoop/conf
    export HADOOP_CONF_DIR=/etc/hadoop/conf
    export YARN_CONF_DIR=/etc/hadoop/conf
    export SPARK_CONF_DIR=/etc/spark/conf
    #export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
    CDH_HOME="/opt/cloudera/parcels/CDH"
    export PATH=$CDH_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin/:$PATH
    ## use beeline to connect to Hive and run SQL queries
    cd /opt/cloudera/parcels/CDH/bin
    ./beeline -u "jdbc:hive2://node7:10000/;principal=hive/node7@HADOOP.COM" --config /etc/hive/conf
    ## run HDFS commands
    #./hdfs --config /etc/hadoop/conf dfs -ls /
    ## submit Spark jobs
    #cd /opt/cloudera/parcels/CDH/lib/spark/bin
    #./spark-shell
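
    A trivial way to run the script on the client node (authenticate with kinit first, as described in the notes below):

    # obtain a Kerberos ticket first (see the kinit example in the notes), then run the script
    sh client.sh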
    

    Notes:
    1. The client's clock must be synchronized with the cluster, otherwise Kerberos authentication will fail (a sketch follows the kinit example below);
    2. The client's /etc/hosts must include the cluster's host entries, which can be obtained from any node in the cluster (see the same sketch below);
    3. Since the cluster has Kerberos authentication enabled, authenticate with kinit before running the shell commands, for example:

    # kinit authentication
    [root@node5 client]# kinit -kt /home/user01.keytab user01
    # check the current principal and ticket
    [root@node5 client]# klist
    Ticket cache: FILE:/tmp/krb5cc_0
    Default principal: user01@HADOOP.COM
    
    Valid starting       Expires              Service principal
    12/01/2016 20:48:50  12/02/2016 20:48:50  krbtgt/HADOOP.COM@HADOOP.COM
    	renew until 12/08/2016 20:48:50
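
    For notes 1 and 2, a minimal sketch, assuming node1 is reachable from the client and runs an NTP service (otherwise sync against the same NTP servers the cluster uses):

    # sync the client clock with the cluster
    ntpdate node1
    # fetch the cluster's hosts file and merge the cluster entries into the client's /etc/hosts by hand
    scp node1:/etc/hosts /tmp/cluster_hosts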
    

    4. Spark JDBC programs also need to perform Kerberos authentication, as shown below; for the complete project, see Security under the 【spark jdbc 示例】 directory.

    package kerberos.spark;
    
    
    import org.apache.hadoop.security.UserGroupInformation;
    
    import java.io.IOException;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import java.util.Timer;
    import java.util.TimerTask;
    
     /*
      * When Kerberos is enabled, pass in the user's principal and keytab for authentication.
      */
    public class sparkjdbc {
       public static void main(String args[]) {
          final String principal = args[0]; // the user's principal, e.g. user01
          final String keytab = args[1];    // the user's keytab file, e.g. /home/user01/user01.keytab
          String sql = args[2];             // the SQL statement to run
          try {
             // 1. Kerberos login: log in right away, then re-authenticate every 12 hours
             long start = 0; // delay before the first login
             Timer timer = new Timer();
             timer.schedule(new TimerTask(){
                public void run() {
                   org.apache.hadoop.conf.Configuration conf = new org.apache.hadoop.conf.Configuration();
                   conf.set("hadoop.security.authentication", "Kerberos");
                   UserGroupInformation.setConfiguration(conf);
                   try {
                      UserGroupInformation.loginUserFromKeytab(principal,keytab);
                      System.out.println("getting connection");
                      System.out.println("current user: "+UserGroupInformation.getCurrentUser());
                      System.out.println("login user: "+UserGroupInformation.getLoginUser());
                   } catch (IOException e) {
                      e.printStackTrace();
                   }
                   System.out.println("execute task!"+ this.scheduledExecutionTime());
                }
             }, start, 12 * 60 * 60 * 1000); // re-login every 12 hours (the timer runs on a background thread)
    
             // 2. Business logic: connect to Hive over JDBC and run the SQL statement
             Class.forName("org.apache.hive.jdbc.HiveDriver");
             Connection con = DriverManager
                   .getConnection("jdbc:hive2://node7:10000/;principal=hive/node7@HADOOP.COM");
             System.out.println("got connection");
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery(sql); // executeQuery returns a ResultSet for the query
             System.out.println("Query results:");
             while (rs.next()) {
                System.out.println(rs.getString(1)); // use getInt() etc. if the column is not a string
             }
             
             con.close();
          } catch (Exception e) {
             e.printStackTrace();
          }
       }
    }
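
    One hypothetical way to launch the example from the client node (the jar name spark-jdbc-example.jar is made up here; the Hive JDBC and Hadoop jars come from the CDH client installed above):

    # requires the PATH exports from client.sh so that `hadoop classpath` resolves
    # arguments are principal, keytab and the SQL to run, matching main() above
    java -cp "spark-jdbc-example.jar:$(hadoop classpath):/opt/cloudera/parcels/CDH/lib/hive/lib/*" \
        kerberos.spark.sparkjdbc user01 /home/user01/user01.keytab "select count(*) from test"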
    
    
  • Original post: https://www.cnblogs.com/xiaodf/p/6279095.html