  • Configuring a Hadoop environment on Windows 10

    Installing Hadoop and MapReduce: a detailed walkthrough and a guide to the pitfalls


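    Setting up the Windows 10 development environment (my install paths are in parentheses; adjust as needed)
    1. Download Hadoop (D:\安装包\Download\hadoop)

      https://mirror.bit.edu.cn/apache/hadoop/common/hadoop-3.2.1/

      Hadoop 3.2.1 has known problems and I do not recommend installing it; for the pitfalls, skip to the end of this post.

    2. Download the Windows binaries and winutils for Hadoop 3.2.1 (the version does not have to match the one above; D:\安装包\Download\hadoop)

      https://github.com/selfgrowth/apache-hadoop-3.1.1-winutils

      Extract the archive and overwrite the bin directory of the Hadoop installation with its contents.

    3. Copy hadoop.dll from bin to C:\Windows\System32

    4. Add environment variables

      HADOOP_HOME=D:\hadoop\hadoop-3.2.1

      Add %HADOOP_HOME%\bin to Path

    5. Error: Hadoop Error: JAVA_HOME is incorrectly set.

      Check whether the JAVA_HOME path contains a space, as in Program Files. If it does, wrap the part that contains the space in straight (ASCII) double quotes.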
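
    The checks from steps 2 to 5 can be scripted. The following is a minimal sketch, assuming Python 3 is installed; the function name check_hadoop_env and the wording of its messages are hypothetical, and the quoting test is only a rough heuristic for the JAVA_HOME problem described in step 5.

```python
import os


def check_hadoop_env(env, exists=os.path.exists):
    """Return a list of problems found in the given environment mapping."""
    problems = []

    hadoop_home = env.get("HADOOP_HOME", "")
    if not hadoop_home:
        problems.append("HADOOP_HOME is not set")
    else:
        # winutils.exe and hadoop.dll come from the winutils download in step 2.
        for name in ("winutils.exe", "hadoop.dll"):
            candidate = os.path.join(hadoop_home, "bin", name)
            if not exists(candidate):
                problems.append(f"missing {candidate}")

    java_home = env.get("JAVA_HOME", "")
    if not java_home:
        problems.append("JAVA_HOME is not set")
    elif " " in java_home and '"' not in java_home:
        # Heuristic only: a space with no double quote anywhere in the value.
        problems.append("JAVA_HOME contains an unquoted space")

    return problems


if __name__ == "__main__":
    for problem in check_hadoop_env(os.environ) or ["no problems found"]:
        print(problem)
```

    The exists parameter is there only so the function can be exercised without a real Hadoop installation; in normal use it defaults to os.path.exists.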


    Configuring the dependencies in Maven's pom.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
    
        <groupId>com.bjfu.jichuang</groupId>
        <artifactId>my-wordcount</artifactId>
        <version>1.0-SNAPSHOT</version>
        <packaging>jar</packaging>
        <description></description>
    
        <properties>
            <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
            <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
            <java.version>1.8</java.version>
            <hadoop.version>3.2.1</hadoop.version>
            <log4j.version>1.2.17</log4j.version>
            <mockito.version>1.8.5</mockito.version>
            <junit.version>4.10</junit.version>
        </properties>
    
        <dependencies>
            <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-client</artifactId>
                <version>${hadoop.version}</version>
            </dependency>
            <dependency>
                <groupId>log4j</groupId>
                <artifactId>log4j</artifactId>
                <version>${log4j.version}</version>
            </dependency>
            <dependency>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-log4j12</artifactId>
                <version>1.7.5</version>
            </dependency>
            <dependency>
                <groupId>org.mockito</groupId>
                <artifactId>mockito-all</artifactId>
                <version>${mockito.version}</version>
                <scope>test</scope>
            </dependency>
            <dependency>
                <groupId>junit</groupId>
                <artifactId>junit</artifactId>
                <version>${junit.version}</version>
                <scope>test</scope>
            </dependency>
        </dependencies>
        <build>
            <plugins>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-compiler-plugin</artifactId>
                    <version>2.3.2</version>
                    <configuration>
                        <source>1.8</source>
                        <target>1.8</target>
                    </configuration>
                </plugin>
                <plugin>
                    <artifactId>maven-assembly-plugin</artifactId>
                    <configuration>
                        <descriptorRefs>
                            <descriptorRef>jar-with-dependencies</descriptorRef>
                        </descriptorRefs>
                    </configuration>
                    <executions>
                        <execution>
                            <id>make-assembly</id>
                            <phase>package</phase>
                            <goals>
                                <goal>single</goal>
                            </goals>
                        </execution>
                    </executions>
                </plugin>
            </plugins>
        </build>
    </project>
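
    The artifactId above suggests the project is a word-count job. As a rough illustration of what such a job computes, here is a plain-Python sketch of the map, shuffle and reduce phases. It does not use the Hadoop API, and all function names are hypothetical; a real job would implement a Mapper and a Reducer in Java against the hadoop-client dependency.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every whitespace-separated token."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reducer: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())
```

    For example, word_count(["a b a", "b c"]) counts a twice, b twice and c once.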
    

    Starting Hadoop

    1. Edit core-site.xml (D:\hadoop\hadoop-3.2.1\etc\hadoop)

      Create the tmp and name folders first.

      <configuration>
          <property>
              <name>hadoop.tmp.dir</name>
              <value>/D:/hadoop/hadoop-3.2.1/workplace/tmp</value>
          </property>
          <property>
              <name>dfs.name.dir</name>
              <value>/D:/hadoop/hadoop-3.2.1/workplace/name</value>
          </property>
          <property>
              <name>fs.default.name</name>
              <value>hdfs://localhost:9000</value>
          </property>
      </configuration>
      
    2. Edit hdfs-site.xml

      Create the folders for the datanode and namenode, then update the corresponding values.

      <configuration>
          <!-- Set this to 1 because this is a single-node Hadoop installation -->
          <property>
              <name>dfs.replication</name>
              <value>1</value>
          </property>
          <property>
              <name>dfs.data.dir</name>
              <value>/D:/hadoop/hadoop-3.2.1/workplace/data</value>
          </property>
      </configuration>
      
    3. Edit mapred-site.xml

      <configuration>
          <property>
             <name>mapreduce.framework.name</name>
             <value>yarn</value>
          </property>
          <property>
             <name>mapred.job.tracker</name>
             <value>hdfs://localhost:9001</value>
          </property>
      </configuration>
      
    4. Edit yarn-site.xml

      <configuration>
          <property>
             <name>yarn.nodemanager.aux-services</name>
             <value>mapreduce_shuffle</value>
          </property>
          <property>
             <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
             <value>org.apache.hadoop.mapred.ShuffleHandler</value>
          </property>
      </configuration>
      
    5. Edit hadoop-env.cmd in the "hadoop" directory (etc\hadoop)

      @rem set JAVA_HOME=%JAVA_HOME%
      
      @rem Set JAVA_HOME to your JDK installation path
      set JAVA_HOME=D:\java\jdk
      
    6. Format the namenode

      D:\安装包\Download\hadoop\hadoop-3.2.1\hadoop-3.2.1\bin>hadoop namenode -format
      
    7. Start Hadoop

      D:\安装包\Download\hadoop\hadoop-3.2.1\hadoop-3.2.1\sbin>start-all.cmd
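
    Several property names used in steps 1 to 3 (fs.default.name, dfs.name.dir, dfs.data.dir, mapred.job.tracker) are older names that Hadoop still accepts but lists as deprecated. The sketch below reads a *-site.xml document and reports such keys together with their current names. The helper names are hypothetical, and the mapping covers only the four keys that appear in this post.

```python
import xml.etree.ElementTree as ET

# Old key -> current key, limited to the keys used in this post.
DEPRECATED_KEYS = {
    "fs.default.name": "fs.defaultFS",
    "dfs.name.dir": "dfs.namenode.name.dir",
    "dfs.data.dir": "dfs.datanode.data.dir",
    "mapred.job.tracker": "mapreduce.jobtracker.address",
}


def read_properties(xml_text):
    """Parse a *-site.xml document and return {name: value}."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.findtext("value", default="").strip()
        if name:
            props[name] = value
    return props


def deprecated_in(props):
    """Return {old_key: new_key} for every deprecated key present in props."""
    return {k: DEPRECATED_KEYS[k] for k in props if k in DEPRECATED_KEYS}
```

    Passing the contents of core-site.xml from step 1 to read_properties and then to deprecated_in would report fs.default.name and dfs.name.dir. The old names keep working, so this is a tidy-up rather than a required fix.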
      

    Once YARN has started successfully, open http://localhost:8088/cluster/apps
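
    Before opening the browser, it can help to confirm that the daemons are actually listening. This is a small sketch using only the Python standard library; the function names are hypothetical, and the port list in the example is an assumption taken from this post (8088 for the YARN UI, 9000 for the HDFS address set in core-site.xml).

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(ports, host="localhost"):
    """Map each port number to whether something is listening on it."""
    return {port: is_port_open(host, port) for port in ports}


if __name__ == "__main__":
    # Ports taken from this post; adjust them to match your own configuration.
    for port, is_open in check_ports([8088, 9000]).items():
        print(port, "open" if is_open else "closed")
```

    A closed port usually means the corresponding daemon did not start, in which case the console windows opened by start-all.cmd are the place to look for the error.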



    Pitfall 1: formatting the namenode fails (a common problem with 3.2.1)

    https://kontext.tech/column/hadoop/377/latest-hadoop-321-installation-on-windows-10-step-by-step-guide

    https://www.cnblogs.com/yifengjianbai/p/8258898.html

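    Pitfall 2: http://localhost:50070/ cannot be reached

    In Hadoop 3.x the default port of the NameNode web UI changed from 50070 to 9870, so try http://localhost:9870/ instead.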

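    Pitfall 3: the nodemanager fails to start when starting YARN

    Failed to setup local dir D:/hadoop/tmp/nm-local-dir, which was marked as good.

    This is a permissions problem; running start-yarn.cmd as administrator fixes it.

    Pitfall 4: the web UI on port 8088 does not show the jobs run by YARN

    Add the following to $HADOOP_HOME/etc/hadoop/mapred-site.xml: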

    <property>
         <name>mapreduce.framework.name</name>
         <value>yarn</value>
    </property>
    <property>
         <name>mapreduce.jobhistory.address</name>
         <value>master:10020</value>
    </property>
    <property>
         <name>mapreduce.jobhistory.webapp.address</name>
         <value>master:19888</value>
    </property>
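
    If you prefer not to edit the file by hand, properties like these can be added or updated with a short script. This is a sketch only: upsert_property is a hypothetical helper, and because it re-serializes the document with ElementTree's default parser, any comments in the original file are dropped.

```python
import xml.etree.ElementTree as ET


def upsert_property(xml_text, name, value):
    """Return xml_text with property `name` set to `value`, adding it if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name", default="").strip() == name:
            # Property already present: overwrite (or create) its <value>.
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # Property not found: append a new <property> element.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")
```

    Calling it once per property, for example with mapreduce.jobhistory.address and master:10020, and writing the result back to mapred-site.xml gives the same outcome as the manual edit. Keep a backup of the file first.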
    

    Pitfall 5: a Hadoop project fails with a No such file or directory error

    Running the IDE as administrator fixes it.

  • Original article: https://www.cnblogs.com/hang-shao/p/12860000.html