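Cloudera's Distribution Including Apache Hadoop (CDH) provides a web-based user interface and supports most Hadoop components, including HDFS, MapReduce, Hive, Spark, HBase, ZooKeeper, and Sqoop, which makes a big data platform much easier to install and use. This document walks through deploying CDH 6.0.1, the latest release at the time of writing. The versions of the Hadoop ecosystem components in this release are listed in the manifest.json file. The relevant files can be downloaded from: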
- https://archive.cloudera.com/cdh6/6.0.1/parcels/
- https://archive.cloudera.com/cm6/6.0.1/redhat7/yum/RPMS/x86_64/
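- Environment overview
- Base environment configuration
- Setting up a local yum repository
- Installing the MariaDB database
- Installing Cloudera Manager Server
- Installing and configuring Cloudera Manager Agent
- Deploying the cluster through the Web UI wizard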
Environment overview
Hostname | IP | Specs | Operating system | Roles |
c1.heboan.com | 9.110.187.120 | 2 cores / 8 GB | CentOS Linux release 7.2.1511 | cm-server, cm-agent, mariadb5.5 |
c2.heboan.com | 9.110.187.121 | 2 cores / 8 GB | CentOS Linux release 7.2.1511 | cm-agent |
c3.heboan.com | 9.110.187.122 | 2 cores / 8 GB | CentOS Linux release 7.2.1511 | cm-agent |
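Environment notes:
This is not a production configuration; these are my own virtual machines. For reference, typical resource sizing for an enterprise big data platform looks like this:
Test cluster:
Number of machines: 5-10
Machine specs: disk (4 TB), memory (24-32 GB), CPU (6 cores), NIC (10 GbE)
Production cluster:
Small cluster: fewer than 20 machines
Medium cluster: fewer than 50 machines
Large cluster: more than 50 machines
Place the prepared software packages in the /root/tools/ directory. The later steps use the cm and cdh subdirectories there.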
Base environment configuration (run on all machines)
Set the hostname
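No command is shown for this step. On CentOS 7 a common way to set the hostname is `hostnamectl`. The sketch below wraps it in a helper whose name, `set_node_hostname`, is an illustration and not part of the original post. Defining the function changes nothing until it is called as root.

```shell
# Hypothetical helper, not part of the original post.
# hostnamectl is the systemd tool shipped with CentOS 7; it updates /etc/hostname.
set_node_hostname() {
    hostnamectl set-hostname "$1" && hostname   # set the name, then print it back
}

# Usage, as root on each node with that node's own name:
#   set_node_hostname c1.heboan.com
```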
Map hostnames to IP addresses
# vim /etc/hosts
...
9.110.187.120 c1.heboan.com
9.110.187.121 c2.heboan.com
9.110.187.122 c3.heboan.com
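If you would rather script this than edit the file by hand, the following is a minimal sketch. The function name `add_host_entry` is hypothetical. It appends an entry only when the hostname is not already present, so running it twice does not create duplicates.

```shell
# add_host_entry FILE IP NAME: append "IP NAME" to FILE unless NAME already appears in it.
# Hypothetical helper for illustration, not part of the original post.
add_host_entry() {
    file=$1 ip=$2 name=$3
    # -F: fixed string, -w: whole-word match, -q: quiet
    if ! grep -qwF "$name" "$file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Usage (as root):
#   add_host_entry /etc/hosts 9.110.187.120 c1.heboan.com
#   add_host_entry /etc/hosts 9.110.187.121 c2.heboan.com
#   add_host_entry /etc/hosts 9.110.187.122 c3.heboan.com
```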
Configure NTP time synchronization
Set up passwordless SSH
Press Enter at every prompt:
# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
33:ca:61:3a:1c:15:5e:01:69:63:5c:e8:01:08:a2:b1 root@c3.heboan.com
The key's randomart image is:
+--[ RSA 2048]----+
|+. ..oo=+. |
|oo. .Oo |
|E +oo |
| .. |
| . o S |
| . = o o |
| + o |
| . |
| |
+-----------------+
Then copy the public key to every node:
# ssh-copy-id -i ~/.ssh/id_rsa.pub c1.heboan.com
# ssh-copy-id -i ~/.ssh/id_rsa.pub c2.heboan.com
# ssh-copy-id -i ~/.ssh/id_rsa.pub c3.heboan.com
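To confirm that key-based login really works, one option is to attempt a non-interactive connection to each node. This is a sketch under assumptions: `check_ssh` and the `SSH_CMD` override are hypothetical names introduced here, and `BatchMode=yes` is the standard OpenSSH option that makes `ssh` fail instead of prompting for a password.

```shell
# check_ssh HOST...: report whether each host accepts key-based login.
# Hypothetical helper; SSH_CMD is an illustrative hook to swap in another command.
check_ssh() {
    rc=0
    for host in "$@"; do
        # BatchMode=yes: never prompt; ConnectTimeout=5: give up after 5 seconds
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "ok   $host"
        else
            echo "FAIL $host"
            rc=1
        fi
    done
    return $rc
}

# Usage:
#   check_ssh c1.heboan.com c2.heboan.com c3.heboan.com
```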
Configure open-file limits
# vim /etc/security/limits.conf
Append the following at the end of the file:
* soft nofile 32768
* hard nofile 1048576
* soft nproc 65536
* hard nproc unlimited
* hard memlock unlimited
* soft memlock unlimited
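The limits in limits.conf are applied by PAM at login, so they only show up in a new session. A quick check of the open-file limits in the current shell might look like this (only `nofile` is shown here):

```shell
# Print the soft and hard open-file limits of the current shell session.
# Run this in a NEW login session after editing limits.conf.
echo "nofile soft: $(ulimit -Sn)"
echo "nofile hard: $(ulimit -Hn)"
```

With the settings above, the expectation is a soft limit of 32768 and a hard limit of 1048576.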
Disable the firewall
systemctl stop firewalld.service
systemctl disable firewalld.service
Disable SELinux
Disable it temporarily (this is lost on reboot):
# setenforce 0
Disable it permanently in the configuration file:
# vim /etc/sysconfig/selinux
...
SELINUX=disabled
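For a non-interactive edit, a `sed` one-liner is a common alternative to vim. The sketch below assumes GNU sed, and `set_selinux_disabled` is a hypothetical name. On CentOS 7, /etc/sysconfig/selinux is a symlink to /etc/selinux/config, and `sed -i` replaces a symlink with a regular file, so the usage line points at the real file.

```shell
# set_selinux_disabled FILE: rewrite the SELINUX= line in FILE to SELINUX=disabled.
# Hypothetical helper; assumes GNU sed (-i edits the file in place).
set_selinux_disabled() {
    # The pattern requires "=" right after SELINUX, so SELINUXTYPE= is left untouched.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

# Usage (as root), targeting the real file rather than the symlink:
#   set_selinux_disabled /etc/selinux/config
```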
Configure swap (swappiness)
# echo vm.swappiness = 10 >> /etc/sysctl.conf
# sysctl -p
Disable transparent huge pages
# echo never > /sys/kernel/mm/transparent_hugepage/defrag
# echo never > /sys/kernel/mm/transparent_hugepage/enabled
Add the following script to /etc/rc.d/rc.local so the setting is reapplied at boot. On CentOS 7, rc.local only runs if it is executable (chmod +x /etc/rc.d/rc.local).
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
    echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
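The kernel marks the active setting with square brackets, for example `always madvise [never]`. A small sketch for reading the active value back follows; `thp_mode` is a hypothetical helper name.

```shell
# thp_mode FILE: print the active value, which the kernel encloses in [brackets].
# Hypothetical helper for illustration.
thp_mode() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$1"
}

# Usage:
#   thp_mode /sys/kernel/mm/transparent_hugepage/enabled
#   thp_mode /sys/kernel/mm/transparent_hugepage/defrag
```

Both calls should print `never` once the setting is in effect.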
Install the JDBC driver
# mkdir -p /usr/share/java/
# cp mysql-connector-java-5.1.22.jar /usr/share/java/
# cd /usr/share/java/
# chmod 777 mysql-connector-java-5.1.22.jar
# ln -s mysql-connector-java-5.1.22.jar mysql-connector-java.jar
Install the JDK
# yum install -y oracle-j2sdk1.8-1.8.0+update141-1.x86_64.rpm
Configure environment variables
# cat /etc/profile.d/java.sh
export JAVA_HOME=/usr/java/jdk1.8.0_141-cloudera
export JAVA_BIN=/usr/java/jdk1.8.0_141-cloudera/bin
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JAVA_HOME JAVA_BIN PATH CLASSPATH
Setting up a local yum repository (c1.heboan.com)
Install the httpd service
# yum install -y httpd
# systemctl start httpd
Build the repository metadata
# yum install createrepo
# cd /root/tools/cm/
# createrepo .
# cd /root/tools/cdh
# createrepo .
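Move the cm directory into /var/www/html so that the RPM packages can be fetched over HTTP. The cdh directory is moved as well, because a later step copies the parcel from /var/www/html/cdh/.
# cd /root/tools
# mv cm/ /var/www/html/
# mv cdh/ /var/www/html/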
Configure the repo (run on all machines)
# vim /etc/yum.repos.d/cm.repo
[cmrepo]
name = cm_repo
baseurl = http://c1.heboan.com/cm
enabled = true
gpgcheck = false
# yum repolist
Configure the database (c1.heboan.com)
Install MariaDB
# yum install mariadb mariadb-server mariadb-devel
# systemctl start mariadb
# systemctl enable mariadb
Initialize the database and set the root password
mysql_secure_installation
Log in to the database as root, then create the databases and users and grant privileges
create database cm default character set utf8;
CREATE USER 'cm'@'%' IDENTIFIED BY '123456';
GRANT ALL PRIVILEGES ON cm.* TO 'cm'@'%';
create database hive default character set utf8;
CREATE USER 'hive'@'%' IDENTIFIED BY '123456';
GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'%';
create database am default character set utf8;
CREATE USER 'am'@'%' IDENTIFIED BY '123456';
GRANT ALL PRIVILEGES ON am.* TO 'am'@'%';
create database hue default character set utf8;
CREATE USER 'hue'@'%' IDENTIFIED BY '123456';
GRANT ALL PRIVILEGES ON hue.* TO 'hue'@'%';
create database oozie default character set utf8;
CREATE USER 'oozie'@'%' IDENTIFIED BY '123456';
GRANT ALL PRIVILEGES ON oozie.* TO 'oozie'@'%';
FLUSH PRIVILEGES;
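The five database/user pairs follow the same pattern, so the statements can also be generated in a loop. The following is a minimal sketch: `gen_sql` is a hypothetical helper that only prints SQL and touches nothing until its output is piped into the `mysql` client.

```shell
# gen_sql PASSWORD DB...: print CREATE DATABASE / CREATE USER / GRANT statements
# for each name, using the same name for the database and its user.
# Hypothetical helper for illustration.
gen_sql() {
    pass=$1; shift
    for db in "$@"; do
        printf "CREATE DATABASE %s DEFAULT CHARACTER SET utf8;\n" "$db"
        printf "CREATE USER '%s'@'%%' IDENTIFIED BY '%s';\n" "$db" "$pass"
        printf "GRANT ALL PRIVILEGES ON %s.* TO '%s'@'%%';\n" "$db" "$db"
    done
    echo "FLUSH PRIVILEGES;"
}

# Usage: review the output first, then pipe it to the client:
#   gen_sql 123456 cm hive am hue oozie
#   gen_sql 123456 cm hive am hue oozie | mysql -uroot -p
```

Outside a throwaway test VM, a stronger password than 123456 and a narrower host pattern than '%' would be advisable.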
Installing Cloudera Manager Server (c1.heboan.com)
Install via yum
yum -y install cloudera-manager-server
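Initialize the cm database (arguments: database type, database name, user, and the password set for that user above)
/opt/cloudera/cm/schema/scm_prepare_database.sh mysql cm cm 123456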
Start Cloudera Manager Server
systemctl start cloudera-scm-server
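Tips:
1. On low-spec machines this service takes a long time to start.
2. You can follow the startup in the log: /var/log/cloudera-scm-server/cloudera-scm-server.log
3. The service listens on two ports: 7180 (the web UI we operate through) and 7182 (communication with the agents).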
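Because startup can be slow, it may help to poll the web port instead of refreshing the browser. The sketch below assumes `curl` is installed; `wait_for_url` and its default values (60 attempts, 5 seconds apart) are illustrative choices and not from the original post.

```shell
# wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]: poll URL until it answers or attempts run out.
# Hypothetical helper; requires curl.
wait_for_url() {
    url=$1 attempts=${2:-60} delay=${3:-5}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -s: silent, -o /dev/null: discard the body, --max-time: bound each try
        if curl -s -o /dev/null --max-time 5 "$url"; then
            echo "up: $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "timed out: $url" >&2
    return 1
}

# Usage:
#   wait_for_url http://c1.heboan.com:7180/
```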
Copy the CDH parcel to /opt/cloudera/parcel-repo/
cp /var/www/html/cdh/CDH-6.0.1-1.cdh6.0.1.p0.590678-el7.parcel* /opt/cloudera/parcel-repo/
cd /opt/cloudera/parcel-repo/
mv CDH-6.0.1-1.cdh6.0.1.p0.590678-el7.parcel.sha256 CDH-6.0.1-1.cdh6.0.1.p0.590678-el7.parcel.sha
Restart Cloudera Manager Server
systemctl restart cloudera-scm-server
Installing Cloudera Manager Agent (all machines)
Install via yum
# yum -y install cloudera-manager-agent
Edit the agent configuration file
# vim /etc/cloudera-scm-agent/config.ini
...
server_host=c1.heboan.com   # point the agent at c1.heboan.com
Start the agent
systemctl start cloudera-scm-agent
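Open http://c1.heboan.com:7180 in a browser to continue in the web UI
Username/password: admin/admin
Accept the license terms
Choose the free edition
Select the cluster hosts
Select the parcel: click "More Options" and delete all remote parcel repository URLs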
sha1sum CDH-6.0.1-1.cdh6.0.1.p0.590678-el7.parcel   # compute the hash
Then create CDH-6.0.1-1.cdh6.0.1.p0.590678-el7.parcel.sha containing the hash computed above
cd /opt/cloudera/parcel-repo/
chown -R cloudera-scm:cloudera-scm ./*
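Computing the hash and writing the .sha file can be done in one step. In this sketch, `write_parcel_sha` is a hypothetical helper; it stores only the hex digest, without the file name that `sha1sum` prints after it.

```shell
# write_parcel_sha PARCEL: store the SHA-1 hex digest of PARCEL in PARCEL.sha.
# Hypothetical helper for illustration.
write_parcel_sha() {
    # sha1sum prints "<digest>  <filename>"; awk keeps only the digest
    sha1sum "$1" | awk '{print $1}' > "$1.sha"
}

# Usage (inside /opt/cloudera/parcel-repo/):
#   write_parcel_sha CDH-6.0.1-1.cdh6.0.1.p0.590678-el7.parcel
```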
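Select CDH 6.0.1
At this step an error occurred: the wizard kept rolling back and could not proceed
Check the log
Change the owner of the parcel files
# cd /opt/cloudera/parcel-repo
# chown -R cloudera-scm:cloudera-scm ./*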
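To double-check the ownership change, `find` can list anything that is still owned by someone else. `not_owned_by` is a hypothetical helper name; no output means every entry, including the directory itself, belongs to the given user.

```shell
# not_owned_by USER DIR: list entries under DIR (DIR included) whose owner is not USER.
# Hypothetical helper; USER may be a name or a numeric ID.
not_owned_by() {
    find "$2" ! -user "$1"
}

# Usage:
#   not_owned_by cloudera-scm /opt/cloudera/parcel-repo
```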
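After that everything went through smoothly ^_^
Inspect the hosts
Because my virtual machines are low-spec, I chose Essentials here; in real work, choose according to your needs
Assign roles to the hosts; spreading them evenly is enough
Test the database connections
Cluster settings: adjust to your situation, for example the data directory locations
Finish the cluster setup
Exciting, isn't it?
Installation complete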
Fixing static files that fail to load when the CDH web UI is proxied through nginx
vim /opt/cm-5.13.0/share/cmf/webapp/WEB-INF/spring/mvc-config.xml
....
Comment out this line:
<bean class="com.cloudera.server.web.cmf.csrf.CsrfRefererInterceptor" />
Restart cloudera-scm-server