  • Cloudera CDH 5 Cluster Setup (yum method)

    1      Cluster Environment

    Master nodes

    master001 ~~ master006

    Slave nodes

    slave001 ~~ slave064

    2      Install the CDH5 YUM Repository

    rpm -Uvh http://archive.cloudera.com/cdh5/one-click-install/redhat/6/x86_64/cloudera-cdh-5-0.x86_64.rpm

    wget http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/cloudera-cdh5.repo

    mv cloudera-cdh5.repo /etc/yum.repos.d/
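
    Before installing packages, it can help to import Cloudera's signing key and confirm the repository is visible. A quick, hedged check (the GPG key path below is the standard location for this CDH5 repo; adjust it if your mirror differs):

    rpm --import http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera

    yum clean all

    # the cloudera-cdh5 repository should appear in the list

    yum repolist | grep -i cloudera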

    3      ZooKeeper

    3.1    Node Assignment

    ZooKeeper Server:

    master002, master003, master004, master005, master006

    ZooKeeper Client:

    master001, master002, master003, master004, master005, master006

    3.2    Installation

    ZooKeeper Client nodes:

    yum install -y zookeeper

    ZooKeeper Server nodes:

    yum install -y zookeeper-server

    3.3    Configuration

    1. On the ZooKeeper nodes, edit the ZooKeeper configuration file

    /etc/zookeeper/conf/zoo.cfg

    maxClientCnxns=50

    # The number of milliseconds of each tick

    tickTime=2000

    # The number of ticks that the initial

    # synchronization phase can take

    initLimit=10

    # The number of ticks that can pass between

    # sending a request and getting an acknowledgement

    syncLimit=5

    # the directory where the snapshot is stored.

    dataDir=/data/disk01/zookeeper/zk_data

    dataLogDir=/data/disk01/zookeeper/zk_log

    # the port at which the clients will connect

    clientPort=2181

    server.2=master002:2888:3888

    server.3=master003:2888:3888

    server.4=master004:2888:3888

    server.5=master005:2888:3888

    server.6=master006:2888:3888

    2. Initialize each ZooKeeper server

    master002:

    service zookeeper-server init --myid=2

    master003:

    service zookeeper-server init --myid=3

    master004:

    service zookeeper-server init --myid=4

    master005:

    service zookeeper-server init --myid=5

    master006:

    service zookeeper-server init --myid=6

    3. Start ZooKeeper

    service zookeeper-server start
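
    As a quick sanity check (a sketch assuming nc is available on the host), each ZooKeeper server should answer the four-letter-word commands once started:

    # "imok" means the server is alive; "stat" shows its leader/follower role

    echo ruok | nc master002 2181

    echo stat | nc master002 2181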

    3.4    Installation Paths

    Program path

    /usr/lib/zookeeper/

    Configuration file path

    /etc/zookeeper/conf

    Log path

    /var/log/zookeeper

    3.5    Start | Stop | Status

    ZooKeeper

    service zookeeper-server start|stop|status

    3.6    Common Commands

    Check the status of a ZooKeeper node

    zookeeper-server status

    Manually clean up logs

    /usr/lib/zookeeper/bin/zkCleanup.sh dataLogDir [snapDir] -n count

    Automatic log cleanup

    autopurge.purgeInterval sets the cleanup frequency in hours. It must be an integer of 1 or greater; the default is 0, which disables automatic cleanup.

    autopurge.snapRetainCount is used together with the parameter above and sets the number of snapshot files to retain. The default is 3.
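
    For example, to purge old snapshots and transaction logs every 24 hours while keeping the 3 most recent snapshots, the following illustrative lines could be added to /etc/zookeeper/conf/zoo.cfg on every server:

    autopurge.purgeInterval=24

    autopurge.snapRetainCount=3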

    3.7    Testing

    https://github.com/phunt/zk-smoketest

    3.8    References

    ZooKeeper parameter configuration

    http://my.oschina.net/u/128568/blog/194820

    Common ZooKeeper administration and operations

    http://nileader.blog.51cto.com/1381108/1032157

    4      HDFS

    4.1    Node Assignment (NameNode HA)

    namenode, zkfc:

    master002, master003

    datanode:

    slave001-slave064

    journalnode:

    master002, master003, master004

    4.2    Installation

    namenode:

    yum install hadoop-hdfs-namenode

    yum install hadoop-hdfs-zkfc

    (yum install -y hadoop-hdfs-namenode hadoop-hdfs-zkfc hadoop-client)

    datanode:

    yum install hadoop-hdfs-datanode

    (yum install -y hadoop-hdfs-datanode hadoop-client)

    journalnode:

    yum install hadoop-hdfs-journalnode

    (yum install -y hadoop-hdfs-journalnode)

    All nodes:

    yum install hadoop-client

    4.3    Configuration

    1. Configuration files

    /etc/hadoop/conf/core-site.xml

    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

    <configuration>

      <property>

        <name>fs.defaultFS</name>

        <value>hdfs://bdcluster</value>

      </property>

      <property>

        <name>fs.trash.interval</name>

        <value>1440</value>

      </property>

      <property>

       <name>hadoop.proxyuser.httpfs.hosts</name>

        <value>*</value>

      </property>

      <property>

        <name>hadoop.proxyuser.httpfs.groups</name>

        <value>*</value>

      </property>

    </configuration>

    /etc/hadoop/conf/hdfs-site.xml

    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

    <configuration>

      <property>

        <name>dfs.nameservices</name>

        <value>bdcluster</value>

      </property>

      <property>

       <name>dfs.ha.namenodes.bdcluster</name>

        <value>nn002,nn003</value>

      </property>

      <property>

       <name>dfs.namenode.rpc-address.bdcluster.nn002</name>

        <value>master002:8020</value>

      </property>

      <property>

       <name>dfs.namenode.rpc-address.bdcluster.nn003</name>

        <value>master003:8020</value>

      </property>

      <property>

        <name>dfs.namenode.http-address.bdcluster.nn002</name>

        <value>master002:50070</value>

      </property>

      <property>

       <name>dfs.namenode.http-address.bdcluster.nn003</name>

        <value>master003:50070</value>

      </property>

      <property>

       <name>dfs.namenode.shared.edits.dir</name>

       <value>qjournal://master002:8485;master003:8485;master004:8485/bdcluster</value>

      </property>

      <property>

       <name>dfs.journalnode.edits.dir</name>

       <value>/data/disk01/hadoop/hdfs/journalnode</value>

      </property>

      <property>

       <name>dfs.client.failover.proxy.provider.bdcluster</name>

       <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>

      </property>

      <property>

       <name>dfs.ha.fencing.methods</name>

        <value>sshfence</value>

      </property>

      <property>

       <name>dfs.ha.fencing.ssh.private-key-files</name>

       <value>/var/lib/hadoop-hdfs/.ssh/id_dsa</value>

      </property>

      <property>

       <name>dfs.ha.automatic-failover.enabled</name>

        <value>true</value>

      </property>

      <property>

       <name>ha.zookeeper.quorum</name>

       <value>master002:2181,master003:2181,master004:2181,master005:2181,master006:2181</value>

      </property>

      <property>

       <name>dfs.permissions.superusergroup</name>

        <value>hadoop</value>

      </property>

      <property>

       <name>dfs.namenode.name.dir</name>

       <value>/data/disk01/hadoop/hdfs/namenode</value>

      </property>

      <property>

       <name>dfs.datanode.data.dir</name>

       <value>/data/disk01/hadoop/hdfs/datanode,/data/disk02/hadoop/hdfs/datanode,/data/disk03/hadoop/hdfs/datanode,/data/disk04/hadoop/hdfs/datanode,/data/disk05/hadoop/hdfs/datanode,/data/disk06/hadoop/hdfs/datanode,/data/disk07/hadoop/hdfs/datanode</value>

      </property>

      <property> 

        <name>dfs.datanode.failed.volumes.tolerated</name> 

        <value>3</value> 

      </property>

      <property>

       <name>dfs.datanode.max.xcievers</name>

        <value>4096</value>

      </property>

      <property>

       <name>dfs.webhdfs.enabled</name>

        <value>true</value>

      </property>

    </configuration>

    /etc/hadoop/conf/slaves

    slave001

    slave002

    slave064

    2. Configure passwordless SSH login for the hdfs user
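
    The sshfence method configured in hdfs-site.xml requires the hdfs user on each namenode to reach the other namenode over SSH without a password, using the key file named in dfs.ha.fencing.ssh.private-key-files. A minimal sketch (run on master002 and repeated in the other direction on master003; hostnames follow this cluster's naming):

    # create the key pair for the hdfs user (its home is /var/lib/hadoop-hdfs in CDH packaging)

    sudo -u hdfs mkdir -p /var/lib/hadoop-hdfs/.ssh

    sudo -u hdfs chmod 700 /var/lib/hadoop-hdfs/.ssh

    sudo -u hdfs ssh-keygen -t dsa -P '' -f /var/lib/hadoop-hdfs/.ssh/id_dsa

    # append the public key to the peer namenode's authorized_keys (copy it by hand if the hdfs account cannot use ssh-copy-id)

    sudo -u hdfs ssh-copy-id -i /var/lib/hadoop-hdfs/.ssh/id_dsa.pub hdfs@master003

    # verify that a passwordless login works

    sudo -u hdfs ssh hdfs@master003 hostname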

    3. Create the data directories

    namenode

    mkdir -p /data/disk01/hadoop/hdfs/namenode

    chown -R hdfs:hdfs /data/disk01/hadoop/hdfs/

    chown -R hdfs:hdfs /data/disk01/hadoop/hdfs/namenode

    chmod 700 /data/disk01/hadoop/hdfs/namenode

    datanode

    mkdir -p /data/disk01/hadoop/hdfs/datanode

    chmod 700 /data/disk01/hadoop/hdfs/datanode

    chown -R hdfs:hdfs /data/disk01/hadoop/hdfs/

    mkdir -p /data/disk02/hadoop/hdfs/datanode

    chmod 700 /data/disk02/hadoop/hdfs/datanode

    chown -R hdfs:hdfs /data/disk02/hadoop/hdfs/

    mkdir -p /data/disk03/hadoop/hdfs/datanode

    chmod 700 /data/disk03/hadoop/hdfs/datanode

    chown -R hdfs:hdfs /data/disk03/hadoop/hdfs/

    mkdir -p /data/disk04/hadoop/hdfs/datanode

    chmod 700 /data/disk04/hadoop/hdfs/datanode

    chown -R hdfs:hdfs /data/disk04/hadoop/hdfs/

    mkdir -p /data/disk05/hadoop/hdfs/datanode

    chmod 700 /data/disk05/hadoop/hdfs/datanode

    chown -R hdfs:hdfs /data/disk05/hadoop/hdfs/

    mkdir -p /data/disk06/hadoop/hdfs/datanode

    chmod 700 /data/disk06/hadoop/hdfs/datanode

    chown -R hdfs:hdfs /data/disk06/hadoop/hdfs/

    mkdir -p /data/disk07/hadoop/hdfs/datanode

    chmod 700 /data/disk07/hadoop/hdfs/datanode

    chown -R hdfs:hdfs /data/disk07/hadoop/hdfs/

    journalnode

    mkdir -p /data/disk01/hadoop/hdfs/journalnode

    chown -R hdfs:hdfs /data/disk01/hadoop/hdfs/journalnode

    4. Start the journalnodes

    service hadoop-hdfs-journalnode start

    5. Format the namenode (on master002)

    sudo -u hdfs hadoop namenode -format

    6. Initialize the HA state in ZooKeeper (on namenode master002)

    hdfs zkfc -formatZK

    7. Initialize the shared edits directory (on master002)

    hdfs namenode -initializeSharedEdits

    8. Start the namenodes

    On the formatted namenode (master002):

    service hadoop-hdfs-namenode start

    On the standby namenode (master003):

    sudo -u hdfs hdfs namenode -bootstrapStandby

    service hadoop-hdfs-namenode start

    9. Start the datanodes

    service hadoop-hdfs-datanode start

    10. Start zkfc (on the namenodes)

    service hadoop-hdfs-zkfc start
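
    With zkfc running on both namenodes, the HA state can be checked against the namenode IDs defined in hdfs-site.xml; one should report active and the other standby:

    sudo -u hdfs hdfs haadmin -getServiceState nn002

    sudo -u hdfs hdfs haadmin -getServiceState nn003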

    11. Initialize the HDFS directories

    /usr/lib/hadoop/libexec/init-hdfs.sh

    4.4    Installation Paths

    Program path

    /usr/lib/hadoop-hdfs

    Configuration file path

    /etc/hadoop/conf

    Log path

    /var/log/hadoop-hdfs

    4.5    Start | Stop | Status

    NameNode

    service hadoop-hdfs-namenode start|stop|status

    DataNode

    service hadoop-hdfs-datanode start|stop|status

    JournalNode

    service hadoop-hdfs-journalnode start|stop|status

    zkfc

    service hadoop-hdfs-zkfc start|stop|status

    4.6    Common Commands

    Check cluster status

    sudo -u hdfs hdfs dfsadmin -report

    Check a file and its replicas

    sudo -u hdfs hdfs fsck [filename] -files -blocks -locations -racks

    5      YARN

    5.1    Node Assignment

    resourcemanager:

    master004

    nodemanager, mapreduce:

    slave001-slave064

    mapreduce-historyserver:

    master006

    5.2    Installation

    resourcemanager:

    yum -y install hadoop-yarn-resourcemanager

    nodemanager:

    yum -y install hadoop-yarn-nodemanager hadoop-mapreduce

    mapreduce-historyserver:

    yum -y install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver

    All nodes:

    yum -y install hadoop-client

    5.3    Configuration

    1. Configuration files

    /etc/hadoop/conf/mapred-site.xml

    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

    <configuration>

      <property>

       <name>mapreduce.framework.name</name>

        <value>yarn</value>

      </property>

      <property>

       <name>mapreduce.task.io.sort.mb</name>

        <value>1024</value>

      </property>

      <property>

       <name>mapred.child.java.opts</name>

        <value>-XX:-UseGCOverheadLimit -Xms1024m -Xmx2048m</value>

      </property>

      <property>

       <name>yarn.app.mapreduce.am.command-opts</name>

        <value>-Xmx2048m</value>

      </property>

      <property>

       <name>mapreduce.jobhistory.address</name>

        <value>master006:10020</value>

        <description>MapReduce JobHistory Server IPC host:port</description>

      </property>

      <property>

       <name>mapreduce.jobhistory.webapp.address</name>

        <value>master006:19888</value>

        <description>MapReduce JobHistory Server Web UI host:port</description>

      </property>

      <property>

        <name>mapreduce.map.memory.mb</name>

        <value>2048</value>

      </property>

      <property>

       <name>mapreduce.reduce.memory.mb</name>

        <value>4096</value>

      </property>

      <property>

          <name>mapreduce.jobhistory.intermediate-done-dir</name>

          <value>/user/history/done_intermediate</value>

      </property>

      <property>

       <name>mapreduce.jobhistory.done-dir</name>

       <value>/user/history/done</value>

      </property>

    </configuration>

    /etc/hadoop/conf/yarn-site.xml

    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

    <configuration>

      <property>

       <name>yarn.resourcemanager.resource-tracker.address</name>

        <value>master004:8031</value>

      </property>

      <property>

       <name>yarn.resourcemanager.address</name>

        <value>master004:8032</value>

      </property>

      <property>

       <name>yarn.resourcemanager.scheduler.address</name>

        <value>master004:8030</value>

      </property>

      <property>

       <name>yarn.resourcemanager.admin.address</name>

        <value>master004:8033</value>

      </property>

      <property>

       <name>yarn.resourcemanager.webapp.address</name>

        <value>master004:8088</value>

      </property>

      <property>

       <name>yarn.nodemanager.aux-services</name>

       <value>mapreduce_shuffle</value>

      </property>

      <property>

        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>

       <value>org.apache.hadoop.mapred.ShuffleHandler</value>

      </property>

      <property>

       <name>yarn.log-aggregation-enable</name>

        <value>true</value>

      </property>

      <property>

        <description>List of directories to store localized files in.</description>

       <name>yarn.nodemanager.local-dirs</name>

       <value>/data/disk01/hadoop/yarn/local,/data/disk02/hadoop/yarn/local, /data/disk03/hadoop/yarn/local,/data/disk04/hadoop/yarn/local, /data/disk05/hadoop/yarn/local</value>

      </property>

      <property>

        <description>Where to store container logs.</description>

       <name>yarn.nodemanager.log-dirs</name>

        <value>/data/disk01/hadoop/yarn/logs,/data/disk02/hadoop/yarn/logs, /data/disk03/hadoop/yarn/logs,/data/disk04/hadoop/yarn/logs, /data/disk05/hadoop/yarn/logs</value>

      </property>

      <!--property>

        <description>Where to aggregate logs to.</description>

       <name>yarn.nodemanager.remote-app-log-dir</name>

       <value>/var/log/hadoop-yarn/apps</value>

      </property-->

      <property>

        <description>Classpath for typical applications.</description>

        <name>yarn.application.classpath</name>

         <value>

            $HADOOP_CONF_DIR,

           $HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,

           $HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,

           $HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,

           $HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*

         </value>

      </property>

      <property>

       <name>yarn.app.mapreduce.am.staging-dir</name>

        <value>/user</value>

      </property>

      <property>

        <description>The minimum allocation for every container request at the RM,

        in MBs. Memory requests lower than this won't take effect,

        and the specified value will get allocated at minimum.</description>

       <name>yarn.scheduler.minimum-allocation-mb</name>

        <value>1024</value>

      </property>

      <property>

        <description>The maximum allocation for every container request at the RM,

        in MBs. Memory requests higher than this won't take effect,

        and will get capped to this value.</description>

       <name>yarn.scheduler.maximum-allocation-mb</name>

        <value>16384</value>

      </property>

      <property>

        <description>The minimum allocation for every container request at the RM,

        in terms of virtual CPU cores. Requests lower than this won't take effect,

        and the specified value will get allocated the minimum.</description>

       <name>yarn.scheduler.minimum-allocation-vcores</name>

        <value>1</value>

      </property>

      <property>

        <description>The maximum allocation for every container request at the RM,

        in terms of virtual CPU cores. Requests higher than this won't take effect,

        and will get capped to this value.</description>

       <name>yarn.scheduler.maximum-allocation-vcores</name>

        <value>32</value>

      </property>

      <property>

        <description>Number of CPU cores that can be allocated

        for containers.</description>

       <name>yarn.nodemanager.resource.cpu-vcores</name>

        <value>48</value>

      </property>

      <property>

        <description>Amount of physical memory, in MB, that can be allocated

        for containers.</description>

        <name>yarn.nodemanager.resource.memory-mb</name>

        <value>120000</value>

      </property>

      <property>

        <description>Ratio between virtual memory to physical memory when

        setting memory limits for containers. Container allocations are

        expressed in terms of physical memory, and virtual memory usage

        is allowed to exceed this allocation by this ratio.

        </description>

       <name>yarn.nodemanager.vmem-pmem-ratio</name>

        <value>6</value>

      </property>

    </configuration>

    2. Create local directories on the nodemanagers

    mkdir -p /data/disk01/hadoop/yarn/local /data/disk02/hadoop/yarn/local /data/disk03/hadoop/yarn/local /data/disk04/hadoop/yarn/local /data/disk05/hadoop/yarn/local

    mkdir -p /data/disk01/hadoop/yarn/logs /data/disk02/hadoop/yarn/logs /data/disk03/hadoop/yarn/logs /data/disk04/hadoop/yarn/logs /data/disk05/hadoop/yarn/logs

    chown -R yarn:yarn /data/disk01/hadoop/yarn /data/disk02/hadoop/yarn /data/disk03/hadoop/yarn /data/disk04/hadoop/yarn /data/disk05/hadoop/yarn

    chown -R yarn:yarn /data/disk01/hadoop/yarn/local /data/disk02/hadoop/yarn/local /data/disk03/hadoop/yarn/local /data/disk04/hadoop/yarn/local /data/disk05/hadoop/yarn/local

    chown -R yarn:yarn /data/disk01/hadoop/yarn/logs /data/disk02/hadoop/yarn/logs /data/disk03/hadoop/yarn/logs /data/disk04/hadoop/yarn/logs /data/disk05/hadoop/yarn/logs

    3. Create the history directories in HDFS

    sudo -u hdfs hadoop fs -mkdir /user/history

    sudo -u hdfs hadoop fs -chmod -R 1777 /user/history

    sudo -u hdfs hadoop fs -chown yarn /user/history

    4. Start the services

    resourcemanager:

    sudo service hadoop-yarn-resourcemanager start

    nodemanager:

    sudo service hadoop-yarn-nodemanager start

    mapreduce-historyserver:

    sudo service hadoop-mapreduce-historyserver start
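
    To confirm YARN and the history server work end to end, a bundled example job can be submitted from any node with hadoop-client installed (a sketch; the examples jar path follows the CDH package layout and should be verified locally):

    sudo -u hdfs hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 4 1000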

    5.4    Installation Paths

    Program path

    /usr/lib/hadoop-yarn

    Configuration file path

    /etc/hadoop/conf

    Log path

    /var/log/hadoop-yarn

    5.5    Start | Stop | Status

    resourcemanager:

    service hadoop-yarn-resourcemanager start|stop|status

    nodemanager:

    service hadoop-yarn-nodemanager start|stop|status

    mapreduce-historyserver:

    service hadoop-mapreduce-historyserver start|stop|status


    5.6    Common Commands

    Check node status

    yarn node -list -all

    ResourceManager administration

    yarn rmadmin ...
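
    For example, after changing scheduler settings or node include/exclude files, the running ResourceManager can be refreshed without a restart:

    yarn rmadmin -refreshQueues

    yarn rmadmin -refreshNodes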

    6      HBase

    6.1    Node Assignment

    hbase-master

    master004, master005, master006

    hbase-regionserver

    slave001 ~~ slave064

    hbase-thrift

    master004, master005, master006

    hbase-rest

    master004, master005, master006

    6.2    Installation

    hbase-master

    yum install -y hbase hbase-master

    hbase-regionserver

    yum install -y hbase hbase-regionserver

    hbase-thrift

    yum install -y hbase-thrift

    hbase-rest

    yum install -y hbase-rest

    6.3    Configuration

    1. Configuration files

    /etc/security/limits.conf

    hdfs - nofile 32768

    hbase - nofile 32768
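
    The raised limit can be spot-checked from a fresh login shell for each account (a quick sanity check assuming pam_limits applies to su sessions, as on a stock RHEL 6 install; 32768 should be reported):

    su - hdfs -s /bin/bash -c 'ulimit -n'

    su - hbase -s /bin/bash -c 'ulimit -n'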

    /etc/hbase/conf/hbase-site.xml

    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

    <configuration>

      <property>

       <name>hbase.rest.port</name>

        <value>60050</value>

      </property>

      <property>

       <name>hbase.zookeeper.quorum</name>

        <value>master002,master003,master004,master005,master006</value>

      </property>

      <property>

       <name>hbase.cluster.distributed</name>

        <value>true</value>

      </property>

      <property>

        <name>hbase.tmp.dir</name>

       <value>/tmp/hadoop/hbase</value>

      </property>

      <property>

        <name>hbase.rootdir</name>

        <value>hdfs://bdcluster/hbase/</value>

      </property>

    </configuration>

    /etc/hbase/conf/hbase-env.sh

    # Set environment variables here.

    # This script sets variables multiple times over the course of starting an hbase process,

    # so try to keep things idempotent unless you want to take an even deeper look

    # into the startup scripts (bin/hbase, etc.)

    # The java implementation to use.  Java 1.6 required.

    # export JAVA_HOME=/usr/java/default/

    # Extra Java CLASSPATH elements.  Optional.

    # export HBASE_CLASSPATH=

    # The maximum amount of heap to use, in MB. Default is 1000.

    # export HBASE_HEAPSIZE=1000

    # Extra Java runtime options.

    # Below are what we set by default.  May only work with SUN JVM.

    # For more on why as well as other possible settings,

    # see http://wiki.apache.org/hadoop/PerformanceTuning

    export HBASE_OPTS="-XX:+UseConcMarkSweepGC"

    # Uncomment one of the below three options to enable java garbage collection logging for the server-side processes.

    # This enables basic gc logging to the .out file.

    # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"

    export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps $HBASE_GC_OPTS"

    export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M $HBASE_GC_OPTS"

    # This enables basic gc logging to its own file.

    # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR.

    # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

    # This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.

    # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR.

    # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

    # Uncomment one of the below three options to enable java garbage collection logging for the client processes.

    # This enables basic gc logging to the .out file.

    # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"

    export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps $HBASE_GC_OPTS"

    # This enables basic gc logging to its own file.

    # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR.

    # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

    # This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.

    # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR.

    # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

    # Uncomment below if you intend to use the EXPERIMENTAL off heap cache.

    # export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize="

    # Set hbase.offheapcache.percentage in hbase-site.xml to a nonzero value.

    export HBASE_USE_GC_LOGFILE=true

    # Uncomment and adjust to enable JMX exporting

    # See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.

    # More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html

    #

    # export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"

    # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"

    # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"

    # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"

    # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"

    # export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10105"

    # File naming hosts on which HRegionServers will run.  $HBASE_HOME/conf/regionservers by default.

    # export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers

    # Uncomment and adjust to keep all the Region Server pages mapped to be memory resident

    #HBASE_REGIONSERVER_MLOCK=true

    #HBASE_REGIONSERVER_UID="hbase"

    # File naming hosts on which backup HMaster will run.  $HBASE_HOME/conf/backup-masters by default.

    # export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters

    # Extra ssh options.  Empty by default.

    # export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"

    # Where log files are stored.  $HBASE_HOME/logs by default.

    # export HBASE_LOG_DIR=${HBASE_HOME}/logs

    # Enable remote JDWP debugging of major HBase processes. Meant for Core Developers

    # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"

    # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"

    # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"

    # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"

    # A string representing this instance of hbase. $USER by default.

    # export HBASE_IDENT_STRING=$USER

    # The scheduling priority for daemon processes. See 'man nice'.

    # export HBASE_NICENESS=10

    # The directory where pid files are stored. /tmp by default.

    # export HBASE_PID_DIR=/var/hadoop/pids

    # Seconds to sleep between slave commands.  Unset by default.  This

    # can be useful in large clusters, where, e.g., slave rsyncs can

    # otherwise arrive faster than the master can service them.

    # export HBASE_SLAVE_SLEEP=0.1

    # Tell HBase whether it should manage it's own instance of Zookeeper or not.

    export HBASE_MANAGES_ZK=false

    # The default log rolling policy is RFA, where the log file is rolled as per the size defined for the

    # RFA appender. Please refer to the log4j.properties file to see more details on this appender.

    # In case one needs to do log rolling on a date change, one should set the environment property

    # HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA".

    # For example:

    # HBASE_ROOT_LOGGER=INFO,DRFA

    # The reason for changing default to RFA is to avoid the boundary case of filling out disk space as

    # DRFA doesn't put any cap on the log size. Please refer to HBase-5655 for more context.

    2. Start the services

    hbase-master

    service hbase-master start

    hbase-regionserver

    service hbase-regionserver start

    hbase-thrift

    service hbase-thrift start

    hbase-rest

    service hbase-rest start

    6.4    Installation Paths

    Program path

    /usr/lib/hbase

    Configuration file path

    /etc/hbase/conf

    Log path

    /var/log/hbase

    6.5    Start | Stop | Status

    hbase-master:

    service hbase-master start|stop|status

    hbase-regionserver:

    service hbase-regionserver start|stop|status

    hbase-thrift:

    service hbase-thrift start|stop|status

    hbase-rest:

    service hbase-rest start|stop|status

    6.6    Common Commands

    hbase shell
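
    Typical interactive checks from the hbase shell (the table and column-family names below are illustrative):

    status 'simple'

    create 'smoke_test', 'cf'

    put 'smoke_test', 'row1', 'cf:a', 'value1'

    scan 'smoke_test'

    disable 'smoke_test'

    drop 'smoke_test'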

    7      Spark

    7.1    Node Assignment

    master002 ~~ master006

    7.2    Installation

    yum install spark-core spark-master spark-worker spark-python

    7.3    Configuration

    1. /etc/spark/conf/spark-env.sh

    export SPARK_HOME=/usr/lib/spark

    2. Deploy Spark to HDFS

    source /etc/spark/conf/spark-env.sh

    hdfs dfs -mkdir -p /user/spark/share/lib

    sudo -u hdfs hdfs dfs -put /usr/lib/spark/assembly/lib/spark-assembly_2.10-0.9.0-cdh5.0.0-hadoop2.3.0-cdh5.0.0.jar /user/spark/share/lib/spark-assembly.jar

    7.4    Installation Paths

    Program path

    /usr/lib/spark

    Configuration file path

    /etc/spark/conf

    Log path

    /var/log/spark

    Path of the Spark assembly in HDFS

    /user/spark/share/lib/spark-assembly.jar

    7.5    Example Program

    source /etc/spark/conf/spark-env.sh

    SPARK_JAR=hdfs://bdcluster/user/spark/share/lib/spark-assembly.jar APP_JAR=$SPARK_HOME/examples/lib/spark-examples_2.10-0.9.0-cdh5.0.0.jar $SPARK_HOME/bin/spark-class org.apache.spark.deploy.yarn.Client --jar $APP_JAR --class org.apache.spark.examples.SparkPi --args yarn-standalone --args 10
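
    SparkPi writes its result to the application master's stdout rather than to the submitting console. Since yarn.log-aggregation-enable is set to true above, the output can be pulled back after the job finishes (the application ID comes from the client output or from yarn application -list):

    yarn logs -applicationId <application_id> | grep "Pi is roughly"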

  • Original post: https://www.cnblogs.com/mengfanrong/p/4228752.html