  • Pseudo-distributed deployment of Hadoop 2

    Through the steps in the previous posts we can already compile and package a Hadoop build suited to this machine; it lives in /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0.

    Log in as the root user.

    The configuration files are located under /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/etc/hadoop.

    Edit hadoop-env.sh and change the JAVA_HOME line to export JAVA_HOME=/usr/local/jdk1.7.0_71.
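
    A minimal sketch of that edit, assuming the JDK really is installed at /usr/local/jdk1.7.0_71 (adjust the path to match your own JDK):

    cd /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/etc/hadoop
    # replace whatever JAVA_HOME line is present with an absolute path
    sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/local/jdk1.7.0_71|' hadoop-env.sh
    grep '^export JAVA_HOME' hadoop-env.sh   # verify the change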

    (1) Edit core-site.xml with the following content:

    <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-${user.name}</value>
    </property>
    <property>
    <name>fs.default.name</name>
    <value>hdfs://admin:9000</value>
    </property>
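
    Note that the <property> elements above (and in the files that follow) must be nested inside the file's root <configuration> element; the resulting core-site.xml looks roughly like this:

    <?xml version="1.0"?>
    <configuration>
    <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-${user.name}</value>
    </property>
    <property>
    <name>fs.default.name</name>
    <value>hdfs://admin:9000</value>
    </property>
    </configuration>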

    Create the directory /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp.

    Rename the template file by running mv mapred-site.xml.template mapred-site.xml.
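
    As shell commands, run from the Hadoop home directory (a small sketch of the two steps above):

    cd /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0
    mkdir -p tmp
    mv etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml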

    (2) Edit mapred-site.xml with the following content:

    <property>
    <name>mapred.job.tracker</name>
    <value>admin:9001</value>
    </property>

    (3) Edit hdfs-site.xml with the following content:

    <property>
    <name>dfs.replication</name>
    <value>1</value>
    </property>
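
    Optionally (this is an illustration, not part of the original setup), the NameNode and DataNode storage directories can be pinned explicitly instead of being derived from hadoop.tmp.dir; the values below simply spell out the defaults that the format step reports later:

    <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-root/dfs/name</value>
    </property>
    <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-root/dfs/data</value>
    </property>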

    (4) Edit yarn-site.xml with the following content:

    <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    </property>

    <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>

    (5) Run the format command to format the NameNode; it prints output like the following:

    [root@admin hadoop-2.2.0]# pwd
    /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0
    [root@admin hadoop-2.2.0]# bin/hdfs namenode -format
    14/12/23 15:04:06 INFO namenode.NameNode: STARTUP_MSG:
    /************************************************************
    STARTUP_MSG: Starting NameNode
    STARTUP_MSG: host = admin.lan/192.168.199.118
    STARTUP_MSG: args = [-format]
    STARTUP_MSG: version = 2.2.0
    STARTUP_MSG: classpath = /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/etc/hadoop:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/guava-11.0.2.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/commons-codec-1.4.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/jsch-0.1.42.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/log4j-1.2.17.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/jackson-core-asl-1.8.8.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/commons-el-1.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/xmlenc-0.52.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/commons-httpclient-3.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/jsp-api-2.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/activation-1.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/jetty-util-6.1.26.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/commons-lang-2.5.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/mockito-all-1.8.5.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/slf4j-api-1.7.5.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/jettison-1.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/jersey-core-1.9.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/asm-3.2.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/hadoop-auth-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/commons-digester-1.8.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/commons-configuration-1.6.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/jackson-mapper-asl-1.8.8.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/avro-1.7.4.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/netty-3.6.2.Final.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/zookeeper-3.4.5.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/hadoop-annotations-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/jsr305-1.3.9.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/li
b/jasper-compiler-5.5.23.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/jets3t-0.6.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/jersey-json-1.9.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/jackson-xc-1.8.8.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/junit-4.8.2.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/commons-collections-3.2.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/xz-1.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/jetty-6.1.26.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/commons-compress-1.4.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/jackson-jaxrs-1.8.8.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/jasper-runtime-5.5.23.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/commons-logging-1.1.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/jersey-server-1.9.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/commons-net-3.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/servlet-api-2.5.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/commons-cli-1.2.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/paranamer-2.3.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/stax-api-1.0.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/commons-io-2.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/lib/commons-math-2.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0-tests.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/hadoop-nfs-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/guava-11.0.2.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/jackson-core-asl-1.8.8.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-el-1.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/jsp-api-2.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-lang-2.5.jar:/usr/local
/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/asm-3.2.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/jackson-mapper-asl-1.8.8.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/jsr305-1.3.9.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/jasper-runtime-5.5.23.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-logging-1.1.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-io-2.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/hadoop-hdfs-2.2.0-tests.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/hadoop-hdfs-nfs-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/hdfs/hadoop-hdfs-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/aopalliance-1.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/javax.inject-1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/log4j-1.2.17.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/jackson-core-asl-1.8.8.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/guice-3.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/hamcrest-core-1.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/jersey-core-1.9.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/asm-3.2.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/jackson-mapper-asl-1.8.8.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/avro-1.7.4.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/netty-3.6.2.Final.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/hadoop-annotations-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/junit-4.10.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/snappy-java-1.0.4.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/xz-1.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/y
arn/lib/commons-compress-1.4.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/jersey-server-1.9.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/paranamer-2.3.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/lib/commons-io-2.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-site-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-api-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-client-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-tests-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-common-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-common-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/javax.inject-1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/jackson-core-asl-1.8.8.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/guice-3.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/hamcrest-core-1.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/asm-3.2.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.8.8.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/avro-1.7.4.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/hadoop-annotations-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/junit-4.10.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/snappy-java-1.0.4.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/xz-1.0.jar:/usr/local/h
adoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-server-1.9.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/lib/commons-io-2.1.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.2.0.jar:/contrib/capacity-scheduler/*.jar
    STARTUP_MSG: build = Unknown -r Unknown; compiled by 'root' on 2014-12-18T09:20Z
    STARTUP_MSG: java = 1.7.0_71
    ************************************************************/
    14/12/23 15:04:06 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
    Formatting using clusterid: CID-c7f77023-a884-4886-a313-bc9a671aaeb5
    14/12/23 15:04:08 INFO namenode.HostFileManager: read includes:
    HostSet(
    )
    14/12/23 15:04:08 INFO namenode.HostFileManager: read excludes:
    HostSet(
    )
    14/12/23 15:04:08 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
    14/12/23 15:04:08 INFO util.GSet: Computing capacity for map BlocksMap
    14/12/23 15:04:08 INFO util.GSet: VM type = 32-bit
    14/12/23 15:04:08 INFO util.GSet: 2.0% max memory = 966.7 MB
    14/12/23 15:04:08 INFO util.GSet: capacity = 2^22 = 4194304 entries
    14/12/23 15:04:08 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
    14/12/23 15:04:08 INFO blockmanagement.BlockManager: defaultReplication = 1
    14/12/23 15:04:08 INFO blockmanagement.BlockManager: maxReplication = 512
    14/12/23 15:04:08 INFO blockmanagement.BlockManager: minReplication = 1
    14/12/23 15:04:08 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
    14/12/23 15:04:08 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks = false
    14/12/23 15:04:08 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
    14/12/23 15:04:08 INFO blockmanagement.BlockManager: encryptDataTransfer = false
    14/12/23 15:04:08 INFO namenode.FSNamesystem: fsOwner = root (auth:SIMPLE)
    14/12/23 15:04:08 INFO namenode.FSNamesystem: supergroup = supergroup
    14/12/23 15:04:08 INFO namenode.FSNamesystem: isPermissionEnabled = true
    14/12/23 15:04:08 INFO namenode.FSNamesystem: HA Enabled: false
    14/12/23 15:04:08 INFO namenode.FSNamesystem: Append Enabled: true
    14/12/23 15:04:09 INFO util.GSet: Computing capacity for map INodeMap
    14/12/23 15:04:09 INFO util.GSet: VM type = 32-bit
    14/12/23 15:04:09 INFO util.GSet: 1.0% max memory = 966.7 MB
    14/12/23 15:04:09 INFO util.GSet: capacity = 2^21 = 2097152 entries
    14/12/23 15:04:09 INFO namenode.NameNode: Caching file names occuring more than 10 times
    14/12/23 15:04:09 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
    14/12/23 15:04:09 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
    14/12/23 15:04:09 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
    14/12/23 15:04:09 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
    14/12/23 15:04:09 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
    14/12/23 15:04:09 INFO util.GSet: Computing capacity for map Namenode Retry Cache
    14/12/23 15:04:09 INFO util.GSet: VM type = 32-bit
    14/12/23 15:04:09 INFO util.GSet: 0.029999999329447746% max memory = 966.7 MB
    14/12/23 15:04:09 INFO util.GSet: capacity = 2^16 = 65536 entries
    14/12/23 15:04:09 INFO common.Storage: Storage directory /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-root/dfs/name has been successfully formatted.
    14/12/23 15:04:09 INFO namenode.FSImage: Saving image file /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-root/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
    14/12/23 15:04:09 INFO namenode.FSImage: Image file /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-root/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 196 bytes saved in 0 seconds.
    14/12/23 15:04:09 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
    14/12/23 15:04:09 INFO util.ExitUtil: Exiting with status 0
    14/12/23 15:04:09 INFO namenode.NameNode: SHUTDOWN_MSG:
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at admin.lan/192.168.199.118


    ************************************************************/
    [root@hadoop10 hadoop-2.2.0]#

    (6) Start HDFS; the command and its output are shown below:

    [root@hadoop10 hadoop-2.2.0]# sbin/start-dfs.sh
    Starting namenodes on [hadoop10]
    hadoop10: starting namenode, logging to /usr/local/hadoop-dist/target/hadoop-2.2.0/logs/hadoop-root-namenode-hadoop10.out
    localhost: starting datanode, logging to /usr/local/hadoop-dist/target/hadoop-2.2.0/logs/hadoop-root-datanode-hadoop10.out
    Starting secondary namenodes [0.0.0.0]
    The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
    RSA key fingerprint is 3d:56:ae:31:73:66:9c:21:02:02:bc:5a:6b:bd:bf:75.
    Are you sure you want to continue connecting (yes/no)? yes
    0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.
    0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop-dist/target/hadoop-2.2.0/logs/hadoop-root-secondarynamenode-hadoop10.out
    [root@hadoop10 hadoop-2.2.0]# jps
    5256 SecondaryNameNode
    5015 NameNode
    5123 DataNode
    5352 Jps
    [root@hadoop10 hadoop-2.2.0]#

    (7) Start YARN; the command and its output are shown below:

    [root@hadoop10 hadoop-2.2.0]# sbin/start-yarn.sh
    starting yarn daemons
    starting resourcemanager, logging to /usr/local/hadoop-dist/target/hadoop-2.2.0/logs/yarn-root-resourcemanager-hadoop10.out
    localhost: starting nodemanager, logging to /usr/local/hadoop-dist/target/hadoop-2.2.0/logs/yarn-root-nodemanager-hadoop10.out

    With formatting complete, we can start the Hadoop programs.
    There are three ways to start Hadoop:

    Method 1: start everything at once.

    Run start-all.sh to start Hadoop and watch the console output: it launches the daemons one after another (in Hadoop 1 these were the namenode, datanode, secondarynamenode, jobtracker and tasktracker; in Hadoop 2 they are the NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager), five processes in all. That output only means the system is launching the processes; it does not guarantee that all five started successfully. Use the JDK's jps command to check: if jps shows all five processes, Hadoop really did start. If one or more are missing, turn to the "Common Hadoop startup errors" chapter to find the cause. The command to shut Hadoop down is stop-all.sh.
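
    In Hadoop 2.2 these wrapper scripts live under sbin/; start-all.sh still works but is deprecated in favour of start-dfs.sh plus start-yarn.sh. A small sketch:

    cd /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0
    sbin/start-all.sh    # deprecated wrapper: runs start-dfs.sh and then start-yarn.sh
    jps                  # expect NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager
    sbin/stop-all.sh     # shuts everything down again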

    [root@hadoop10 hadoop-2.2.0]# jps
    5496 NodeManager
    5524 Jps
    5256 SecondaryNameNode
    5015 NameNode
    5123 DataNode
    5410 ResourceManager
    [root@hadoop10 hadoop-2.2.0]#

    The commands above are the simplest way: they start and stop every daemon in one go. Beyond them, there are other commands that start the components separately.

    Method 2: start HDFS and the compute layer separately.

        Running start-dfs.sh starts HDFS on its own. Once it finishes, jps shows the NameNode, DataNode and SecondaryNameNode processes. This command suits scenarios that only use HDFS for storage and run no MapReduce computation. The matching shutdown command is stop-dfs.sh.

        In Hadoop 1, start-mapred.sh started the two MapReduce daemons and stop-mapred.sh stopped them; in Hadoop 2 the equivalents are start-yarn.sh and stop-yarn.sh, which manage the ResourceManager and NodeManager. You can also start the compute layer before HDFS: the HDFS and MapReduce/YARN daemons are independent of each other and have no start-up ordering dependency.

    Method 3: start each daemon individually, which is also how you add or remove a single node. The commands below use the Hadoop 1 daemon names; the Hadoop 2 equivalents follow after the list.

    hadoop-daemon.sh start namenode
    hadoop-daemon.sh start datanode
    hadoop-daemon.sh start secondarynamenode
    hadoop-daemon.sh start jobtracker
    hadoop-daemon.sh start tasktracker
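
    For Hadoop 2.2 the per-daemon equivalents would be roughly the following (the jobtracker and tasktracker no longer exist; the ResourceManager and NodeManager take their place):

    sbin/hadoop-daemon.sh start namenode
    sbin/hadoop-daemon.sh start datanode
    sbin/hadoop-daemon.sh start secondarynamenode
    sbin/yarn-daemon.sh start resourcemanager
    sbin/yarn-daemon.sh start nodemanager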

    (8) Seeing these five Java processes means startup succeeded. Take a look through a browser:

    NameNode web UI: http://192.168.199.118:50070

    Port 8042 serves the NodeManager's resource status (for example http://192.168.199.118:8042).

    View cluster status: ./bin/hdfs dfsadmin -report

    View file and block composition: ./bin/hdfs fsck / -files -blocks

    Deployment is done: we now have a pseudo-distributed, single-machine Hadoop development environment. Later posts will move on to the Hadoop Distributed File System and get a feel for the Google File System (GFS) design that HDFS closely follows.

    The three Hadoop run modes:

    Standalone mode: this is Hadoop's default mode. When the Hadoop package is first unpacked, Hadoop knows nothing about the hardware environment and conservatively chooses a minimal configuration; in this default mode all three XML configuration files are empty. With empty configuration files Hadoop runs entirely locally: it has no other nodes to talk to, so standalone mode uses neither HDFS nor any of the Hadoop daemons. This mode is mainly used to develop and debug the application logic of MapReduce programs.
    Pseudo-distributed mode: Hadoop runs on a "single-node cluster" where all of the daemons run on the same machine. On top of standalone mode it adds the ability to debug against real daemons, letting you inspect memory usage, HDFS input and output, and the interactions between daemons.
    Fully distributed mode: Hadoop runs on a real cluster of multiple machines.

    (9) Add the HADOOP_HOME environment variable:

    export HADOOP_HOME=/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/
    export PATH=.:/usr/local/protoc/bin:$FINDBUGS_HOME/bin:$MAVEN_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/bin:$PATH
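
    These two lines would typically be appended to /etc/profile or ~/.bashrc (the FINDBUGS_HOME and MAVEN_HOME variables come from the earlier build posts); a quick check after reloading the environment might look like this:

    source /etc/profile
    hadoop version    # should report Hadoop 2.2.0 if HADOOP_HOME and PATH are set correctly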

    (10) Verifying the cluster:

    We verify the cluster with the WordCount example that ships with Hadoop; it counts how many times each word appears in the input files. First create a few data directories in HDFS:

    hadoop fs -mkdir -p /data/wordcount
    hadoop fs -mkdir -p /output/

    The /data/wordcount directory holds the input files for the bundled WordCount example, and the output of the MapReduce job goes to the /output/wordcount directory.
    Upload the local files into HDFS:

     hadoop fs -put /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/etc/hadoop/*.xml /data/wordcount

    To check the uploaded files, run the following command:

    hadoop fs -ls /data/wordcount

    You can now see the files stored in HDFS.
    Next, run the WordCount example with the following command:

     hadoop jar /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /data/wordcount /output/wordcount

    The console prints the job's runtime information:

    [root@admin hadoop-2.2.0]# hadoop jar /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /data/wordcount /output/wordcount
    14/12/23 16:59:26 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
    14/12/23 16:59:26 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
    14/12/23 16:59:27 INFO input.FileInputFormat: Total input paths to process : 7
    14/12/23 16:59:27 INFO mapreduce.JobSubmitter: number of splits:7
    14/12/23 16:59:27 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
    14/12/23 16:59:27 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
    14/12/23 16:59:27 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
    14/12/23 16:59:27 INFO Configuration.deprecation: mapreduce.combine.class is deprecated. Instead, use mapreduce.job.combine.class
    14/12/23 16:59:27 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
    14/12/23 16:59:27 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
    14/12/23 16:59:27 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
    14/12/23 16:59:27 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
    14/12/23 16:59:27 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
    14/12/23 16:59:27 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
    14/12/23 16:59:27 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
    14/12/23 16:59:27 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
    14/12/23 16:59:27 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1742380566_0001
    14/12/23 16:59:27 WARN conf.Configuration: file:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-root/mapred/staging/root1742380566/.staging/job_local1742380566_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
    14/12/23 16:59:27 WARN conf.Configuration: file:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-root/mapred/staging/root1742380566/.staging/job_local1742380566_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
    14/12/23 16:59:27 WARN conf.Configuration: file:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-root/mapred/local/localRunner/root/job_local1742380566_0001/job_local1742380566_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
    14/12/23 16:59:27 WARN conf.Configuration: file:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-root/mapred/local/localRunner/root/job_local1742380566_0001/job_local1742380566_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
    14/12/23 16:59:28 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
    14/12/23 16:59:28 INFO mapreduce.Job: Running job: job_local1742380566_0001
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: OutputCommitter set in config null
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: Waiting for map tasks
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: Starting task: attempt_local1742380566_0001_m_000000_0
    14/12/23 16:59:28 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
    14/12/23 16:59:28 INFO mapred.MapTask: Processing split: hdfs://admin:9000/data/wordcount/hadoop-policy.xml:0+9257
    14/12/23 16:59:28 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
    14/12/23 16:59:28 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
    14/12/23 16:59:28 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
    14/12/23 16:59:28 INFO mapred.MapTask: soft limit at 83886080
    14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
    14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
    14/12/23 16:59:28 INFO mapred.LocalJobRunner:
    14/12/23 16:59:28 INFO mapred.MapTask: Starting flush of map output
    14/12/23 16:59:28 INFO mapred.MapTask: Spilling map output
    14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufend = 12916; bufvoid = 104857600
    14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26210084(104840336); length = 4313/6553600
    14/12/23 16:59:28 INFO mapred.MapTask: Finished spill 0
    14/12/23 16:59:28 INFO mapred.Task: Task:attempt_local1742380566_0001_m_000000_0 is done. And is in the process of committing
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: map
    14/12/23 16:59:28 INFO mapred.Task: Task 'attempt_local1742380566_0001_m_000000_0' done.
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local1742380566_0001_m_000000_0
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: Starting task: attempt_local1742380566_0001_m_000001_0
    14/12/23 16:59:28 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
    14/12/23 16:59:28 INFO mapred.MapTask: Processing split: hdfs://admin:9000/data/wordcount/capacity-scheduler.xml:0+3560
    14/12/23 16:59:28 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
    14/12/23 16:59:28 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
    14/12/23 16:59:28 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
    14/12/23 16:59:28 INFO mapred.MapTask: soft limit at 83886080
    14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
    14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
    14/12/23 16:59:28 INFO mapred.LocalJobRunner:
    14/12/23 16:59:28 INFO mapred.MapTask: Starting flush of map output
    14/12/23 16:59:28 INFO mapred.MapTask: Spilling map output
    14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufend = 4457; bufvoid = 104857600
    14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26213132(104852528); length = 1265/6553600
    14/12/23 16:59:28 INFO mapred.MapTask: Finished spill 0
    14/12/23 16:59:28 INFO mapred.Task: Task:attempt_local1742380566_0001_m_000001_0 is done. And is in the process of committing
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: map
    14/12/23 16:59:28 INFO mapred.Task: Task 'attempt_local1742380566_0001_m_000001_0' done.
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local1742380566_0001_m_000001_0
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: Starting task: attempt_local1742380566_0001_m_000002_0
    14/12/23 16:59:28 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
    14/12/23 16:59:28 INFO mapred.MapTask: Processing split: hdfs://admin:9000/data/wordcount/yarn-site.xml:0+1000
    14/12/23 16:59:28 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
    14/12/23 16:59:28 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
    14/12/23 16:59:28 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
    14/12/23 16:59:28 INFO mapred.MapTask: soft limit at 83886080
    14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
    14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
    14/12/23 16:59:28 INFO mapred.LocalJobRunner:
    14/12/23 16:59:28 INFO mapred.MapTask: Starting flush of map output
    14/12/23 16:59:28 INFO mapred.MapTask: Spilling map output
    14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufend = 1322; bufvoid = 104857600
    14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26213988(104855952); length = 409/6553600
    14/12/23 16:59:28 INFO mapred.MapTask: Finished spill 0
    14/12/23 16:59:28 INFO mapred.Task: Task:attempt_local1742380566_0001_m_000002_0 is done. And is in the process of committing
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: map
    14/12/23 16:59:28 INFO mapred.Task: Task 'attempt_local1742380566_0001_m_000002_0' done.
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local1742380566_0001_m_000002_0
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: Starting task: attempt_local1742380566_0001_m_000003_0
    14/12/23 16:59:28 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
    14/12/23 16:59:28 INFO mapred.MapTask: Processing split: hdfs://admin:9000/data/wordcount/core-site.xml:0+910
    14/12/23 16:59:28 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
    14/12/23 16:59:28 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
    14/12/23 16:59:28 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
    14/12/23 16:59:28 INFO mapred.MapTask: soft limit at 83886080
    14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
    14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
    14/12/23 16:59:28 INFO mapred.LocalJobRunner:
    14/12/23 16:59:28 INFO mapred.MapTask: Starting flush of map output
    14/12/23 16:59:28 INFO mapred.MapTask: Spilling map output
    14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufend = 1298; bufvoid = 104857600
    14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26213988(104855952); length = 409/6553600
    14/12/23 16:59:28 INFO mapred.MapTask: Finished spill 0
    14/12/23 16:59:28 INFO mapred.Task: Task:attempt_local1742380566_0001_m_000003_0 is done. And is in the process of committing
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: map
    14/12/23 16:59:28 INFO mapred.Task: Task 'attempt_local1742380566_0001_m_000003_0' done.
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local1742380566_0001_m_000003_0
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: Starting task: attempt_local1742380566_0001_m_000004_0
    14/12/23 16:59:28 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
    14/12/23 16:59:28 INFO mapred.MapTask: Processing split: hdfs://admin:9000/data/wordcount/hdfs-site.xml:0+843
    14/12/23 16:59:28 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
    14/12/23 16:59:28 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
    14/12/23 16:59:28 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
    14/12/23 16:59:28 INFO mapred.MapTask: soft limit at 83886080
    14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
    14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
    14/12/23 16:59:28 INFO mapred.LocalJobRunner:
    14/12/23 16:59:28 INFO mapred.MapTask: Starting flush of map output
    14/12/23 16:59:28 INFO mapred.MapTask: Spilling map output
    14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufend = 1239; bufvoid = 104857600
    14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26213980(104855920); length = 417/6553600
    14/12/23 16:59:28 INFO mapred.MapTask: Finished spill 0
    14/12/23 16:59:28 INFO mapred.Task: Task:attempt_local1742380566_0001_m_000004_0 is done. And is in the process of committing
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: map
    14/12/23 16:59:28 INFO mapred.Task: Task 'attempt_local1742380566_0001_m_000004_0' done.
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local1742380566_0001_m_000004_0
    14/12/23 16:59:28 INFO mapred.LocalJobRunner: Starting task: attempt_local1742380566_0001_m_000005_0
    14/12/23 16:59:28 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
    14/12/23 16:59:28 INFO mapred.MapTask: Processing split: hdfs://admin:9000/data/wordcount/mapred-site.xml:0+838
    14/12/23 16:59:28 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
    14/12/23 16:59:29 INFO mapreduce.Job: Job job_local1742380566_0001 running in uber mode : false
    14/12/23 16:59:29 INFO mapreduce.Job: map 100% reduce 0%
    14/12/23 16:59:29 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
    14/12/23 16:59:29 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
    14/12/23 16:59:29 INFO mapred.MapTask: soft limit at 83886080
    14/12/23 16:59:29 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
    14/12/23 16:59:29 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
    14/12/23 16:59:29 INFO mapred.LocalJobRunner:
    14/12/23 16:59:29 INFO mapred.MapTask: Starting flush of map output
    14/12/23 16:59:29 INFO mapred.MapTask: Spilling map output
    14/12/23 16:59:29 INFO mapred.MapTask: bufstart = 0; bufend = 1230; bufvoid = 104857600
    14/12/23 16:59:29 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26213984(104855936); length = 413/6553600
    14/12/23 16:59:29 INFO mapred.MapTask: Finished spill 0
    14/12/23 16:59:29 INFO mapred.Task: Task:attempt_local1742380566_0001_m_000005_0 is done. And is in the process of committing
    14/12/23 16:59:29 INFO mapred.LocalJobRunner: map
    14/12/23 16:59:29 INFO mapred.Task: Task 'attempt_local1742380566_0001_m_000005_0' done.
    14/12/23 16:59:29 INFO mapred.LocalJobRunner: Finishing task: attempt_local1742380566_0001_m_000005_0
    14/12/23 16:59:29 INFO mapred.LocalJobRunner: Starting task: attempt_local1742380566_0001_m_000006_0
    14/12/23 16:59:29 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
    14/12/23 16:59:29 INFO mapred.MapTask: Processing split: hdfs://admin:9000/data/wordcount/httpfs-site.xml:0+620
    14/12/23 16:59:29 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
    14/12/23 16:59:29 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
    14/12/23 16:59:29 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
    14/12/23 16:59:29 INFO mapred.MapTask: soft limit at 83886080
    14/12/23 16:59:29 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
    14/12/23 16:59:29 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
    14/12/23 16:59:29 INFO mapred.LocalJobRunner:
    14/12/23 16:59:29 INFO mapred.MapTask: Starting flush of map output
    14/12/23 16:59:29 INFO mapred.MapTask: Spilling map output
    14/12/23 16:59:29 INFO mapred.MapTask: bufstart = 0; bufend = 939; bufvoid = 104857600
    14/12/23 16:59:29 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214060(104856240); length = 337/6553600
    14/12/23 16:59:29 INFO mapred.MapTask: Finished spill 0
    14/12/23 16:59:29 INFO mapred.Task: Task:attempt_local1742380566_0001_m_000006_0 is done. And is in the process of committing
    14/12/23 16:59:29 INFO mapred.LocalJobRunner: map
    14/12/23 16:59:29 INFO mapred.Task: Task 'attempt_local1742380566_0001_m_000006_0' done.
    14/12/23 16:59:29 INFO mapred.LocalJobRunner: Finishing task: attempt_local1742380566_0001_m_000006_0
    14/12/23 16:59:29 INFO mapred.LocalJobRunner: Map task executor complete.
    14/12/23 16:59:29 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
    14/12/23 16:59:29 INFO mapred.Merger: Merging 7 sorted segments
    14/12/23 16:59:29 INFO mapred.Merger: Down to the last merge-pass, with 7 segments left of total size: 13662 bytes
    14/12/23 16:59:29 INFO mapred.LocalJobRunner:
    14/12/23 16:59:29 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
    14/12/23 16:59:29 INFO mapred.Task: Task:attempt_local1742380566_0001_r_000000_0 is done. And is in the process of committing
    14/12/23 16:59:29 INFO mapred.LocalJobRunner:
    14/12/23 16:59:29 INFO mapred.Task: Task attempt_local1742380566_0001_r_000000_0 is allowed to commit now
    14/12/23 16:59:29 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1742380566_0001_r_000000_0' to hdfs://admin:9000/output/wordcount/_temporary/0/task_local1742380566_0001_r_000000
    14/12/23 16:59:29 INFO mapred.LocalJobRunner: reduce > reduce
    14/12/23 16:59:29 INFO mapred.Task: Task 'attempt_local1742380566_0001_r_000000_0' done.
    14/12/23 16:59:30 INFO mapreduce.Job: map 100% reduce 100%
    14/12/23 16:59:30 INFO mapreduce.Job: Job job_local1742380566_0001 completed successfully
    14/12/23 16:59:30 INFO mapreduce.Job: Counters: 32
    File System Counters
    FILE: Number of bytes read=2203023
    FILE: Number of bytes written=4000234
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=116652
    HDFS: Number of bytes written=6042
    HDFS: Number of read operations=105
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=10
    Map-Reduce Framework
    Map input records=448
    Map output records=1896
    Map output bytes=23401
    Map output materialized bytes=13732
    Input split bytes=794
    Combine input records=1896
    Combine output records=815
    Reduce input groups=352
    Reduce shuffle bytes=0
    Reduce input records=815
    Reduce output records=352
    Spilled Records=1630
    Shuffled Maps =0
    Failed Shuffles=0
    Merged Map outputs=0
    GC time elapsed (ms)=237
    CPU time spent (ms)=0
    Physical memory (bytes) snapshot=0
    Virtual memory (bytes) snapshot=0
    Total committed heap usage (bytes)=1231712256
    File Input Format Counters
    Bytes Read=17028
    File Output Format Counters
    Bytes Written=6042

    To view the result, run the following command:

    hadoop fs -cat /output/wordcount/part-r-00000 | head

    [root@admin hadoop-2.2.0]# hadoop fs -cat /output/wordcount/part-r-00000 | head

    [root@admin hadoop-2.2.0]# hadoop fs -text /output/wordcount/part-r-00000 

    "*" 17
    "AS 7
    "License"); 7
    "alice,bob 17
    (ASF) 1
    (root 1
    (the 7
    --> 13
    -1. 1
    0.0 1
    cat: Unable to write to output stream.

    The "cat: Unable to write to output stream." line is harmless; it only means that head closed the pipe after printing ten lines. Open the web console at http://admin:8088/ to see the cluster and application status.
    Note that the job ID job_local1742380566_0001 and the LocalJobRunner messages show that this particular run used the local job runner rather than YARN; in Hadoop 2 you would set mapreduce.framework.name to yarn in mapred-site.xml (the mapred.job.tracker property configured above is a Hadoop 1 setting) to have jobs submitted to the YARN cluster. Either way, HDFS is storing data and the MapReduce job runs to completion.
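
    If you want to rerun the example, delete the output directory first, because MapReduce refuses to write into a path that already exists; a small cleanup sketch:

    hadoop fs -rm -r /output/wordcount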

    (11) Running a simple MapReduce computation

    Under /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce there is a jar named hadoop-mapreduce-examples-2.2.0.jar, which contains many example programs provided by the framework. Let's see how to run them.

    Run the command:

    [root@admin mapreduce]# hadoop jar ./hadoop-mapreduce-examples-2.2.0.jar 

    Running the jar with no arguments lists 18 entries in the output, all of them built-in example programs.
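
    To run one of them, for example the pi estimator (the two numeric arguments, the number of maps and the samples per map, are just illustrative values):

    hadoop jar ./hadoop-mapreduce-examples-2.2.0.jar pi 2 10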

    (12) Issues and summary

    • Default configuration values worth knowing

    Hadoop 2.2.0's YARN framework ships with many default parameter values; on a machine with limited resources you need to adjust these defaults so that your jobs can still run.
    The NodeManager and ResourceManager are configured in yarn-site.xml, while the settings used when running MapReduce jobs go in mapred-site.xml.
    The relevant parameters and their defaults are listed below:

    Parameter name | Default | Process | Config file | Meaning
    yarn.nodemanager.resource.memory-mb | 8192 | NodeManager | yarn-site.xml | total physical memory (MB) available to containers on the node's host
    yarn.nodemanager.resource.cpu-vcores | 8 | NodeManager | yarn-site.xml | total number of virtual CPU cores available to containers on the node's host
    yarn.nodemanager.vmem-pmem-ratio | 2.1 | NodeManager | yarn-site.xml | maximum amount of virtual memory usable per 1 MB of physical memory
    yarn.scheduler.minimum-allocation-mb | 1024 | ResourceManager | yarn-site.xml | minimum amount of memory (MB) granted per allocation request
    yarn.scheduler.maximum-allocation-mb | 8192 | ResourceManager | yarn-site.xml | maximum amount of memory (MB) granted per allocation request
    yarn.scheduler.minimum-allocation-vcores | 1 | ResourceManager | yarn-site.xml | minimum number of virtual CPU cores granted per allocation request
    yarn.scheduler.maximum-allocation-vcores | 8 | ResourceManager | yarn-site.xml | maximum number of virtual CPU cores granted per allocation request
    mapreduce.framework.name | local | MapReduce | mapred-site.xml | one of local, classic or yarn; unless set to yarn, the YARN cluster is not used for resource allocation
    mapreduce.map.memory.mb | 1024 | MapReduce | mapred-site.xml | memory (MB) each map task of a MapReduce job may request
    mapreduce.map.cpu.vcores | 1 | MapReduce | mapred-site.xml | number of virtual CPU cores each map task of a MapReduce job may request
    mapreduce.reduce.memory.mb | 1024 | MapReduce | mapred-site.xml | memory (MB) each reduce task of a MapReduce job may request
    mapreduce.reduce.cpu.vcores | 1 | MapReduce | mapred-site.xml | number of virtual CPU cores each reduce task of a MapReduce job may request
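
    For example, on a host with little RAM one might override a few of these in yarn-site.xml; the values below are purely illustrative, not recommendations:

    <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
    </property>
    <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>256</value>
    </property>
    <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
    </property>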
