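  • Hadoop 2.2.0 Deployment and Installation (Notes, Single-Node Install)

    Passwordless SSH Setup and Configuration

    Steps:

    ◎ Create the .ssh directory in root's home directory (you must be logged in as root)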

    cd /root && mkdir .ssh

    chmod 700 .ssh && cd .ssh

    ◎ Create an RSA key pair with an empty passphrase:

    ssh-keygen -t rsa -P ""

    ◎ When prompted for the key file name, enter id_rsa. Then append the public key to authorized_keys:

    cat id_rsa.pub >> authorized_keys

    chmod 644 authorized_keys # important

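    ◎ Edit the sshd configuration file /etc/ssh/sshd_config and uncomment the line #AuthorizedKeysFile  .ssh/authorized_keys.

    ◎ Restart the sshd service:

    service sshd restart

    ◎ Test the SSH connection. On the first connection you are asked whether to continue; after you confirm with yes, the host key is added to known_hosts:

    ssh localhost  # enter the username and password if prompted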
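If the login still asks for a password, file permissions are a common cause. The following is a minimal sketch that checks the modes set in the steps above. It assumes GNU `stat` (as found on most Linux systems), and `check_ssh_perms` is a hypothetical helper name, not part of OpenSSH.

```shell
# Sketch: confirm that .ssh and authorized_keys carry the modes used above.
# check_ssh_perms is a hypothetical helper; it assumes GNU stat (-c '%a').
check_ssh_perms() {
    local dir="$1"
    local dir_mode file_mode
    dir_mode=$(stat -c '%a' "$dir") || return 1
    file_mode=$(stat -c '%a' "$dir/authorized_keys") || return 1
    # The steps above set the .ssh directory to 700.
    [ "$dir_mode" = "700" ] || { echo "unexpected mode on $dir: $dir_mode"; return 1; }
    # authorized_keys must not be writable by group or others.
    case "$file_mode" in
        600|644) ;;
        *) echo "unexpected mode on authorized_keys: $file_mode"; return 1 ;;
    esac
    echo "ok"
}
```

For the setup in this post it would be called as `check_ssh_perms /root/.ssh`.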

     

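     Hadoop 2.2.0 Deployment and Installation

    Steps:

    ◎ Download the release archive.

    ◎ Unpack Hadoop and configure the environment.

    #Create a hadoop folder in root's home directory (/root)

    mkdir  hadoop;

    cd hadoop; 

    #Place the Hadoop 2.2.0 archive in the hadoop folder

    #Unpack the Hadoop 2.2.0 archive

    tar -zxvf hadoop-2.2.0.tar.gz

    #Enter the hadoop-2.2.0 folder

    cd hadoop-2.2.0

    #Enter the Hadoop configuration folder

    cd  etc/hadoop

    #Edit core-site.xml

    vi core-site.xml and add the following properties (hadoop.tmp.dir, fs.default.name). Note that fs.default.name still works in Hadoop 2.x but is deprecated in favor of fs.defaultFS: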

    <?xml version="1.0" encoding="UTF-8"?>
    
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <!--
    
      Licensed under the Apache License, Version 2.0 (the "License");
    
      you may not use this file except in compliance with the License.
    
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
    
      distributed under the License is distributed on an "AS IS" BASIS,
    
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    
      See the License for the specific language governing permissions and
    
      limitations under the License. See accompanying LICENSE file.
    
    -->
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
    
    <property>
    
      <name>hadoop.tmp.dir</name>
    
      <value>/root/hadoop/hadoop-${user.name}</value>
    
      <description>A base for other temporary directories</description>
    
    </property>
    
    <property>
    
      <name>fs.default.name</name>
    
      <value>hdfs://localhost:8010</value>
    
      <description></description>
    
    </property>
    
    </configuration>
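The core-site.xml above can also be written from the shell instead of edited by hand. This is a minimal sketch under a few assumptions: `write_core_site` is a hypothetical helper, the license header is left out for brevity, and the values are passed in single quotes so that the shell does not try to expand `${user.name}` (which Hadoop itself resolves).

```shell
# Sketch: generate a minimal core-site.xml with the two properties used above.
# write_core_site is a hypothetical helper; arguments are the config directory,
# the value for hadoop.tmp.dir, and the value for fs.default.name.
write_core_site() {
    local conf_dir="$1" tmp_dir="$2" fs_uri="$3"
    # The heredoc expands only our own variables; their contents are not re-expanded.
    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>${tmp_dir}</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>${fs_uri}</value>
  </property>
</configuration>
EOF
}
```

With the paths from this post the call would look like `write_core_site /root/hadoop/hadoop-2.2.0/etc/hadoop '/root/hadoop/hadoop-${user.name}' 'hdfs://localhost:8010'`.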
     
    
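    #Edit hdfs-site.xml to set the NameNode and DataNode storage paths (dfs.namenode.name.dir, dfs.datanode.data.dir), along with dfs.replication and dfs.permissions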
    
    <?xml version="1.0" encoding="UTF-8"?>
    
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <!--
    
      Licensed under the Apache License, Version 2.0 (the "License");
    
      you may not use this file except in compliance with the License.
    
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
    
      distributed under the License is distributed on an "AS IS" BASIS,
    
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    
      See the License for the specific language governing permissions and
    
      limitations under the License. See accompanying LICENSE file.
    
    -->
    
    <!-- Put site-specific property overrides in this file. -->
    <configuration>
       <property>
                       <name>dfs.namenode.name.dir</name>
                       <value>/root/hadoop/hdfs/namenode</value>
                       <description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy. </description>
                       <final>true</final>
             </property>
             <property>
                       <name>dfs.datanode.data.dir</name>
                       <value>/root/hadoop/hdfs/datanode</value>
                       <description>Determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.
                       </description>
                       <final>true</final>
             </property>
             <property>
                <!-- number of replicas -->
                       <name>dfs.replication</name>
                       <value>1</value>
             </property>
             <property>
               <name>dfs.permissions</name>
               <value>false</value>
             </property>
    </configuration>
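Since the dfs.datanode.data.dir description says that missing directories are ignored, it may be worth creating both storage directories up front. The following is a minimal sketch; `prepare_hdfs_dirs` is a hypothetical helper and the owner-only mode is an assumption on my part, not something the original post prescribes.

```shell
# Sketch: create the NameNode and DataNode storage directories ahead of time.
# prepare_hdfs_dirs is a hypothetical helper; pass the base directory that
# hdfs-site.xml points at (here that would be /root/hadoop).
prepare_hdfs_dirs() {
    local base="$1"
    mkdir -p "$base/hdfs/namenode" "$base/hdfs/datanode" || return 1
    # Assumption: restrict both directories to the owning user.
    chmod 700 "$base/hdfs/namenode" "$base/hdfs/datanode"
}
```

For this post's layout the call would be `prepare_hdfs_dirs /root/hadoop`.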

    #Edit mapred-site.xml (in Hadoop 2.2.0 this file is typically created by copying mapred-site.xml.template)

    Add the mapred.job.tracker, mapred.map.tasks, and mapred.reduce.tasks parameters

     

    <?xml version="1.0"?>
    
    <!--
    
      Licensed under the Apache License, Version 2.0 (the "License");
    
      you may not use this file except in compliance with the License.
    
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
    
      distributed under the License is distributed on an "AS IS" BASIS,
    
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    
      See the License for the specific language governing permissions and
    
      limitations under the License. See accompanying LICENSE file.
    
    -->
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
    <property>
     <name>mapred.job.tracker</name>
     <value>localhost:54311</value>
     <description>The host and port that the MapReduce job tracker runs
     at.  If "local", then jobs are run in-process as a single map
     and reduce task.
    </description>
    </property>
    <property>
     <name>mapred.map.tasks</name>
     <value>10</value>
     <description>As a rule of thumb, use 10x the number of slaves (i.e., number of tasktrackers).</description>
    </property>
    <property>
     <name>mapred.reduce.tasks</name>
     <value>2</value>
     <description>As a rule of thumb, use 2x the number of slave processors (i.e., number of tasktrackers).</description>
    </property>
    </configuration>
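After editing several site files it helps to read a value back and confirm what was saved. The following is a minimal sketch of a line-oriented lookup, assuming each `<name>` and `<value>` element sits on a single line as in the files above. `get_prop` is a hypothetical helper and not a real XML parser.

```shell
# Sketch: print the <value> that follows a given <name> in a Hadoop site file.
# get_prop is a hypothetical helper; it assumes <name> and <value> elements
# each fit on one line, as in the configuration files shown above.
get_prop() {
    local file="$1" name="$2"
    awk -v want="$name" '
        /<name>/ {
            line = $0
            sub(/.*<name>/, "", line); sub(/<\/name>.*/, "", line)
            found = (line == want)
        }
        found && /<value>/ {
            line = $0
            sub(/.*<value>/, "", line); sub(/<\/value>.*/, "", line)
            print line
            exit
        }
    ' "$file"
}
```

For example, `get_prop mapred-site.xml mapred.job.tracker` run in etc/hadoop would print the configured host and port.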

    ◎ Set up the Java environment (continuing from the steps above)

    #Edit hadoop-env.sh and set the Java path: export JAVA_HOME=/usr/local/jdk1.7

    # Copyright 2011 The Apache Software Foundation
    
    # 
    
    # Licensed to the Apache Software Foundation (ASF) under one
    
    # or more contributor license agreements.  See the NOTICE file
    
    # distributed with this work for additional information
    
    # regarding copyright ownership.  The ASF licenses this file
    
    # to you under the Apache License, Version 2.0 (the
    
    # "License"); you may not use this file except in compliance
    
    # with the License.  You may obtain a copy of the License at
    
    #
    
    #     http://www.apache.org/licenses/LICENSE-2.0
    
    #
    
    # Unless required by applicable law or agreed to in writing, software
    
    # distributed under the License is distributed on an "AS IS" BASIS,
    
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    
    # See the License for the specific language governing permissions and
    
    # limitations under the License.
    
    # Set Hadoop-specific environment variables here.
    
    # The only required environment variable is JAVA_HOME.  All others are
    
    # optional.  When running a distributed configuration it is best to
    
    # set JAVA_HOME in this file, so that it is correctly defined on
    
    # remote nodes.
    
    # The java implementation to use.
    
    export JAVA_HOME=/usr/local/jdk1.7
    
    # The jsvc implementation to use. Jsvc is required to run secure datanodes.
    
    #export JSVC_HOME=${JSVC_HOME}
    
    export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}
    
    # Extra Java CLASSPATH elements.  Automatically insert capacity-scheduler.
    
    for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
    
      if [ "$HADOOP_CLASSPATH" ]; then
    
        export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
    
      else
    
        export HADOOP_CLASSPATH=$f
    
      fi
    
    done
    
    # The maximum amount of heap to use, in MB. Default is 1000.
    
    #export HADOOP_HEAPSIZE=
    
    #export HADOOP_NAMENODE_INIT_HEAPSIZE=""
    
    # Extra Java runtime options.  Empty by default.
    
    export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
    
    # Command specific options appended to HADOOP_OPTS when specified
    
    export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
    
    export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"
    
    export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"
    
    # The following applies to multiple commands (fs, dfs, fsck, distcp etc)
    
    export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
    
    #HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData $HADOOP_JAVA_PLATFORM_OPTS"
    
    # On secure datanodes, user to run the datanode as after dropping privileges
    
    export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}
    
    # Where log files are stored.  $HADOOP_HOME/logs by default.
    
    #export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER 
    
    # Where log files are stored in the secure data environment.
    
    export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}
    
    # The directory where pid files are stored. /tmp by default.
    
    # NOTE: this should be set to a directory that can only be written to by 
    
    #       the user that will run the hadoop daemons.  Otherwise there is the
    
    #       potential for a symlink attack.
    
    export HADOOP_PID_DIR=${HADOOP_PID_DIR}
    
    export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}
    
    # A string representing this instance of hadoop. $USER by default.
    
    export HADOOP_IDENT_STRING=$USER
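A wrong JAVA_HOME usually surfaces only when the daemons fail to start, so a quick check beforehand can save time. The following is a minimal sketch; `check_java_home` is a hypothetical helper, and the path /usr/local/jdk1.7 is simply the one used in this post.

```shell
# Sketch: confirm that a JAVA_HOME candidate actually contains bin/java.
# check_java_home is a hypothetical helper, not part of Hadoop.
check_java_home() {
    local home="$1"
    if [ -x "$home/bin/java" ]; then
        echo "JAVA_HOME ok: $home"
    else
        echo "no executable java under $home/bin" >&2
        return 1
    fi
}
```

For this post's setup the call would be `check_java_home /usr/local/jdk1.7`.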

     Set the Hadoop environment variable [HADOOP_HOME]

    vi /etc/profile and add: export HADOOP_HOME=/root/hadoop/hadoop-2.2.0

    source /etc/profile  to apply the environment variable.

    Check that the Hadoop environment variable took effect:

    echo $HADOOP_HOME 

    /root/hadoop/hadoop-2.2.0 
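Instead of editing /etc/profile in vi, the export line can be appended from the shell. The following is a minimal sketch that avoids adding the line twice when it is run again; `add_hadoop_home` is a hypothetical helper name.

```shell
# Sketch: append the HADOOP_HOME export to a profile file only if it is missing.
# add_hadoop_home is a hypothetical helper; arguments are the profile file
# and the Hadoop installation directory.
add_hadoop_home() {
    local profile="$1" home="$2"
    local line="export HADOOP_HOME=$home"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF "$line" "$profile" 2>/dev/null || echo "$line" >> "$profile"
}
```

For this post's setup the call would be `add_hadoop_home /etc/profile /root/hadoop/hadoop-2.2.0`, followed by `source /etc/profile`.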

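    ◎ Go to the bin directory under the Hadoop installation directory and format HDFS:

    ./hadoop namenode -format 

    ◎  Start Hadoop: go to the sbin directory under the Hadoop installation directory.

    ./start-all.sh 

     Verify the installation by opening http://localhost:50070/ .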
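Besides the web page, the output of the JDK's `jps` tool shows which daemons are running. The following is a minimal sketch that reports missing daemons from that output. `missing_daemons` is a hypothetical helper, and the daemon list assumes that start-all.sh brings up both HDFS and YARN, as it usually does in Hadoop 2.x.

```shell
# Sketch: list expected Hadoop daemons that do not appear in jps output.
# missing_daemons is a hypothetical helper; the daemon list is an assumption
# about what start-all.sh launches on a single-node Hadoop 2.x install.
missing_daemons() {
    local jps_output="$1" missing="" d
    for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <MainClass>"; compare the class name as a whole field.
        echo "$jps_output" | awk -v d="$d" '$2 == d { ok = 1 } END { exit !ok }' \
            || missing="$missing $d"
    done
    # Strip the leading space; an empty result means nothing is missing.
    echo "${missing# }"
}
```

It would be called as `missing_daemons "$(jps)"`; an empty line means every expected daemon was found.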

     When reposting this article, please credit the source: http://www.cnblogs.com/likehua/p/3825810.html 

    Related posts:

    Sqoop installation guide: http://www.cnblogs.com/likehua/p/3825489.html
    Hive installation guide: http://www.cnblogs.com/likehua/p/3825479.html

     

  • Original article: https://www.cnblogs.com/likehua/p/3825810.html