  • Hadoop Series (2): Configuring and Deploying Hadoop, Starting HDFS, and Running MapReduce in Local Mode

      Continuing from the previous article, we carry on with our introductory Hadoop example.

      

    •   1. Edit core-site.xml
      
    [bamboo@hadoop-senior hadoop-2.5.0]$ vim etc/hadoop/core-site.xml
    Add the following configuration:
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop-senior.bamboo.com:8020</value>
      </property>
      <!-- Override the default temporary directory -->
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/modules/hadoop-2.5.0/data/tmp</value>
      </property>
    </configuration>
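      Create the data/tmp directory under the Hadoop root directory; the hadoop.tmp.dir property points to it.
      
      >> Note
      hadoop-senior.bamboo.com is this machine's hostname.
      Run hostname in a terminal to check it.
      To change it, edit the hostname setting in /etc/sysconfig/network.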
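
      A quick way to double-check the edited file is to parse it and print the properties Hadoop will see. The following is only a minimal sketch: the helper name read_hadoop_properties and the CORE_SITE constant are made up for illustration, and the XML is embedded as a string so the snippet runs without a Hadoop installation.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text):
    """Return a dict of <name> -> <value> pairs from a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


# Same content as the core-site.xml shown in this step (hypothetical constant name).
CORE_SITE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-senior.bamboo.com:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/modules/hadoop-2.5.0/data/tmp</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for key, val in read_hadoop_properties(CORE_SITE).items():
        print(key, "=", val)
```

      To check the real file instead, read etc/hadoop/core-site.xml from disk and pass its text to the same function.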
     
    •   2. Edit hdfs-site.xml
    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>
     
    •   3. Startup order
      namenode (master node): manages metadata
      datanode (worker node): stores data
     
    [bamboo@hadoop-senior hadoop-2.5.0]$ sbin/hadoop-daemon.sh start namenode
      starting namenode, logging to /opt/modules/hadoop-2.5.0/logs/hadoop-bamboo-namenode-hadoop-senior.bamboo.com.out
    
    [bamboo@hadoop-senior hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode
      starting datanode, logging to /opt/modules/hadoop-2.5.0/logs/hadoop-bamboo-datanode-hadoop-senior.bamboo.com.out

      After starting the datanode, I checked the processes with jps and found no DataNode process, so I went looking for the error.

      << jps shows that the datanode did not start >>
      [bamboo@hadoop-senior hadoop-2.5.0]$ jps
      10408 Jps
      10131 NameNode
     
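      Cause:
      The datanode's clusterID does not match the namenode's clusterID.
      How this happens: after DFS was formatted for the first time, Hadoop was started and used; later the format command (hdfs namenode -format) was run again. Reformatting generates a new clusterID for the namenode, while the datanode keeps its old one.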
     
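      Fix:
      Following the path shown in the log, cd /opt/modules/hadoop-2.5.0/data/tmp/dfs
      It contains two folders, data and name.
      Copy the clusterID from the VERSION file under name/current into the VERSION file under data/current, overwriting the old clusterID,
      so that the two match.
      
      Then start the datanode again and run jps to check the processes: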
      [bamboo@hadoop-senior dfs]$ jps
      10614 Jps
      10131 NameNode
      10467 DataNode
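
      The manual copy described above can also be scripted. This is a rough sketch rather than an official Hadoop tool: the function names read_cluster_id and sync_cluster_id are invented here, and it assumes the VERSION files are plain key=value text containing one clusterID= line.

```python
import re
from pathlib import Path

# Matches the "clusterID=..." line of a VERSION file.
CLUSTER_ID_LINE = re.compile(r"^clusterID=(.*)$", re.MULTILINE)


def read_cluster_id(version_file):
    """Return the clusterID stored in a VERSION file."""
    match = CLUSTER_ID_LINE.search(Path(version_file).read_text())
    if match is None:
        raise ValueError(f"no clusterID line in {version_file}")
    return match.group(1).strip()


def sync_cluster_id(name_version, data_version):
    """Copy the namenode's clusterID into the datanode's VERSION file.

    Returns True if the datanode file was rewritten, False if the IDs already matched.
    """
    name_id = read_cluster_id(name_version)
    if read_cluster_id(data_version) == name_id:
        return False
    data_path = Path(data_version)
    new_text = CLUSTER_ID_LINE.sub(
        lambda m: "clusterID=" + name_id, data_path.read_text(), count=1
    )
    data_path.write_text(new_text)
    return True
```

      With the layout from this article, the two arguments would be name/current/VERSION and data/current/VERSION under /opt/modules/hadoop-2.5.0/data/tmp/dfs. Run it while the datanode is stopped, then start the datanode again.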
     
    •   4. The HDFS startup steps from the official Hadoop site:
      The following instructions are to run a MapReduce job locally. If you want to execute a job on YARN, see YARN on Single Node.
      1.Format the filesystem:
      $ bin/hdfs namenode -format
      
      2.Start NameNode daemon and DataNode daemon:
      $ sbin/start-dfs.sh
      The hadoop daemon log output is written to the $HADOOP_LOG_DIR directory (defaults to $HADOOP_HOME/logs).
      
      3.Browse the web interface for the NameNode; by default it is available at:
     
      4.Make the HDFS directories required to execute MapReduce jobs:
     
      $ bin/hdfs dfs -mkdir -p /user/<username>
     
      5.Copy the input files into the distributed filesystem:
      $ bin/hdfs dfs -put etc/hadoop input
     
      6.Run some of the examples provided:
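      $ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.2.jar grep input output 'dfs[a-z.]+'
      (The jar file name must match the installed release; the hadoop-2.5.0 installation used in this article ships hadoop-mapreduce-examples-2.5.0.jar.)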
     
      7.Examine the output files:
      Copy the output files from the distributed filesystem to the local filesystem and examine them:
      $ bin/hdfs dfs -get output output
      $ cat output/*
      or
      View the output files on the distributed filesystem:
      $ bin/hdfs dfs -cat output/*
     
      8.When you're done, stop the daemons with:
      $ sbin/stop-dfs.sh
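
      To get a feel for what the grep example in step 6 produces, here is a small local approximation. It is a sketch under assumptions, not the Hadoop implementation: it counts every substring that matches the pattern and lists the most frequent first, and breaking ties alphabetically is a choice made here for determinism. The function name grep_count and the SAMPLE lines are hypothetical.

```python
import re
from collections import Counter


def grep_count(lines, pattern):
    """Count each substring matching `pattern`, most frequent first."""
    regex = re.compile(pattern)
    counts = Counter()
    for line in lines:
        counts.update(m.group(0) for m in regex.finditer(line))
    # Sort by descending count; ties are broken alphabetically (an assumption).
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up input lines standing in for the files copied from etc/hadoop.
SAMPLE = [
    "<name>dfs.replication</name>",
    "<name>dfs.replication</name>",
    "# dfsadmin settings",
]

if __name__ == "__main__":
    for text, n in grep_count(SAMPLE, r"dfs[a-z.]+"):
        print(n, text)
```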
     
     
    •   5. Using hdfs commands
     
      5.1 List files
    [bamboo@hadoop-senior hadoop-2.5.0]$ bin/hdfs dfs -ls -R /
    17/12/31 18:34:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    drwxr-xr-x - bamboo supergroup 0 2017-12-31 05:08 /user
    drwxr-xr-x - bamboo supergroup 0 2017-12-31 05:09 /user/bamboo
    drwxr-xr-x - bamboo supergroup 0 2017-12-31 05:09 /user/bamboo/input
     
      5.2 Upload a file and view it
      
    1) Create the upload directory
    [bamboo@hadoop-senior hadoop-2.5.0]$ bin/hdfs dfs -mkdir -p /user/bamboo/mapreduce/wordcount/input/
    17/12/31 18:38:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
     
    2) Upload the file
    [bamboo@hadoop-senior hadoop-2.5.0]$ bin/hdfs dfs -put wcinput/wc.txt /user/bamboo/mapreduce/wordcount/input
    17/12/31 18:40:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
     
    3) List the uploaded file
    [bamboo@hadoop-senior hadoop-2.5.0]$ bin/hdfs dfs -ls /user/bamboo/mapreduce/wordcount/input
    17/12/31 18:41:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Found 1 items
    -rw-r--r-- 1 bamboo supergroup 81 2017-12-31 18:40 /user/bamboo/mapreduce/wordcount/input/wc.txt
     
    4) View the file contents
    [bamboo@hadoop-senior hadoop-2.5.0]$ bin/hdfs dfs -cat /user/bamboo/mapreduce/wordcount/input/wc.txt
    17/12/31 18:43:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    hadoop yarn
    hadoop mapreduce
    hadoop hdfs
    yarn nodemanager
    hadoop resourcemanager
     
    •   6. Run a job on HDFS and store the result in HDFS
    [bamboo@hadoop-senior hadoop-2.5.0]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar wordcount /user/bamboo/mapreduce/wordcount/input /user/bamboo/mapreduce/wordcount/output
     
     
    View the result:
    [bamboo@hadoop-senior hadoop-2.5.0]$ bin/hdfs dfs -cat /user/bamboo/mapreduce/wordcount/output/*
    17/12/31 18:50:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    hadoop 4
    hdfs 1
    mapreduce 1
    nodemanager 1
    resourcemanager 1
    yarn 2
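
    As a sanity check, the same counts can be reproduced locally without Hadoop. This is just an illustrative sketch: word_count and WC_TXT are names chosen here, the text is the wc.txt content shown in section 5.2, and words are split on whitespace and printed in sorted order.

```python
from collections import Counter


def word_count(lines):
    """Count whitespace-separated words and return (word, count) pairs sorted by word."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return sorted(counts.items())


# Contents of wc.txt as printed by hdfs dfs -cat in section 5.2.
WC_TXT = """hadoop yarn
hadoop mapreduce
hadoop hdfs
yarn nodemanager
hadoop resourcemanager
"""

if __name__ == "__main__":
    for word, n in word_count(WC_TXT.splitlines()):
        print(word, n)
```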

    OK, that is roughly it for HDFS. The next chapter continues with running things the YARN way.

  • Original article: https://www.cnblogs.com/zhuzi91/p/8167223.html