  • Setting up Hadoop and Spark on macOS

    Instructions

    1. Prepare the environment
      If Homebrew is not installed yet, install it first (see https://brew.sh).
      Then remove outdated versions of Hadoop:

      brew cleanup hadoop

      Next, update the Homebrew formulae:

      brew update
      brew upgrade
      brew cleanup

      Check the version information:

      brew info hadoop
      brew info apache-spark
      brew info sbt
      brew info scala

      If any of these packages is not installed, install it with brew install <formula>.
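
The check above can be scripted. The following is a minimal sketch, assuming that `brew list --versions <formula>` exits with a non-zero status when a formula is not installed. The function name `missing_formulae` and the `BREW` override variable are hypothetical helpers introduced here for illustration, not part of Homebrew.

```shell
# Print every formula given as an argument that Homebrew does not report as installed.
# BREW is a hypothetical override for the brew command (useful for dry runs).
missing_formulae() {
  local brew_cmd="${BREW:-brew}" formula
  for formula in "$@"; do
    if ! "$brew_cmd" list --versions "$formula" >/dev/null 2>&1; then
      echo "$formula"
    fi
  done
}

# Report (but do not install) anything missing from the four formulae used in this guide.
if command -v "${BREW:-brew}" >/dev/null 2>&1; then
  missing_formulae hadoop apache-spark scala sbt
fi
```

Each name it prints can then be passed to brew install by hand.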

    2. Install the packages
      Install Hadoop:

      brew install hadoop

      Install Spark, Scala, and sbt:

      brew install apache-spark scala sbt

    3. Set the environment variables
      Edit ~/.bash_profile (for example with vim) and append the following:

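       # set environment variables (adjust the version numbers to match the output of brew info)
       export JAVA_HOME="$(/usr/libexec/java_home)"
       export HADOOP_HOME=/usr/local/Cellar/hadoop/2.5.1
       export HADOOP_CONF_DIR=$HADOOP_HOME/libexec/etc/hadoop
       # this directory holds Spark, so the variable is named SPARK_HOME rather than SCALA_HOME
       export SPARK_HOME=/usr/local/Cellar/apache-spark/1.1.0

       # set path variables
       export PATH=$PATH:$HADOOP_HOME/bin:$SPARK_HOME/bin

       # set aliases for the start & stop scripts
       # (quoted so that both commands belong to the alias instead of the second one running at login)
       alias hstart="$HADOOP_HOME/sbin/start-dfs.sh;$HADOOP_HOME/sbin/start-yarn.sh"
       alias hstop="$HADOOP_HOME/sbin/stop-dfs.sh;$HADOOP_HOME/sbin/stop-yarn.sh"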
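
Because the paths above hard-code version numbers, a quick sanity check after reloading the profile can catch a mismatch early. This is a minimal sketch; `check_dir_vars` is a hypothetical helper name, and it only tests that each variable is set and points to an existing directory.

```shell
# Report every named variable that is unset or does not point to an existing directory.
# Returns 0 when all variables are fine, 1 otherwise.
check_dir_vars() {
  local name value status=0
  for name in "$@"; do
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set"
      status=1
    elif [ ! -d "$value" ]; then
      echo "$name points to a missing directory: $value"
      status=1
    fi
  done
  return "$status"
}

# Example call for variables defined in ~/.bash_profile.
check_dir_vars JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR || echo "check the paths in ~/.bash_profile"
```

Run it in a new terminal (or after source ~/.bash_profile) so that the exported values are visible.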
      
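    4. Hadoop requires a working SSH service, so set up SSH

      • Configuration file path (older OS X releases use /etc/sshd_config):

        /etc/ssh/sshd_config

      • Generate a key pair: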

        sh-3.2# sudo ssh-keygen -t rsa

          Generating public/private rsa key pair.
          Enter file in which to save the key (/var/root/.ssh/id_rsa):  [enter /var/root/.ssh/id_rsa]
          Enter passphrase (empty for no passphrase): [press Enter]
          Enter same passphrase again: [press Enter]
          Your identification has been saved in /var/root/.ssh/id_rsa.
          Your public key has been saved in /var/root/.ssh/id_rsa.pub.
          The key fingerprint is:
          97:e9:5a:5e:91:52:30:63:9e:34:1a:6f:24:64:75:af root@cuican.local
          The key's randomart image is:
          +--[ RSA 2048]----+
          |       .=.X .    |
          |       . X B .   |
          |        . = . .  |
          |         . + o   |
          |        S = E    |
          |         o . .   |
          |          o .    |
          |         + .     |
          |        . .      |
          +-----------------+
        
      • Edit the configuration file:

        sudo vim /etc/ssh/sshd_config

          Port 22
          #AddressFamily any
          #ListenAddress 0.0.0.0
          #ListenAddress ::
          # The default requires explicit activation of protocol 1
          Protocol 2
          # HostKey for protocol version 1
          #HostKey /etc/ssh/ssh_host_key
          # HostKeys for protocol version 2
          #HostKey /etc/ssh/ssh_host_rsa_key
          #HostKey /etc/ssh/ssh_host_dsa_key
          #HostKey /etc/ssh/ssh_host_ecdsa_key
          HostKey /var/root/.ssh/id_rsa
        
          # Lifetime and size of ephemeral version 1 server key
          KeyRegenerationInterval 1h
          ServerKeyBits 1024
        
          # Logging
          # obsoletes QuietMode and FascistLogging
          SyslogFacility AUTHPRIV
          #LogLevel INFO
        
          # Authentication:
          LoginGraceTime 2m
          PermitRootLogin yes
          StrictModes yes
          #MaxAuthTries 6
          #MaxSessions 10
        
          RSAAuthentication yes
        
          PubkeyAuthentication yes
        
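      • Start the SSH service

        which sshd   # locate the sshd binary

        On a Mac, sshd is located at /usr/sbin/sshd.

        Run sudo /usr/sbin/sshd in a terminal to start the sshd service.

        Then generate a key pair for the current user and add its public key to authorized_keys, so that ssh localhost works without a password: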

        ssh-keygen -t rsa
        cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
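
Running the cat command above more than once appends the same key again. A minimal sketch of an idempotent variant follows; `authorize_key` is a hypothetical helper name, and the chmod reflects the fact that sshd with StrictModes enabled refuses key files with loose permissions.

```shell
# Append a public key to an authorized_keys file unless it is already listed,
# then restrict the file to its owner.
authorize_key() {
  local pub_key="$1" auth_file="$2"
  mkdir -p "$(dirname "$auth_file")"
  touch "$auth_file"
  # -x: match whole lines, -F: treat the key as a fixed string, not a regex.
  if ! grep -qxF -- "$(cat "$pub_key")" "$auth_file"; then
    cat "$pub_key" >> "$auth_file"
  fi
  chmod 600 "$auth_file"
}

# Usage with the paths from the step above:
#   authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
#   ssh -o BatchMode=yes localhost true && echo "passwordless SSH works"
```

The BatchMode option makes ssh fail instead of prompting, which is the behavior Hadoop's start scripts depend on.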

    5. Configure Hadoop
      Go to the Hadoop installation directory:

      cd /usr/local/Cellar/hadoop/2.5.1/libexec/

      Edit etc/hadoop/hadoop-env.sh:

       # this fixes the "scdynamicstore" warning   
       export HADOOP_OPTS="$HADOOP_OPTS -Djava.security.krb5.realm= -Djava.security.krb5.kdc=" 
      

      Edit etc/hadoop/core-site.xml:

       <configuration>
           <property>
               <name>fs.defaultFS</name>
               <value>hdfs://localhost:9000</value>
           </property>
       </configuration>
      

      Edit etc/hadoop/hdfs-site.xml:

       <configuration> 
           <property> 
               <name>dfs.replication</name> 
               <value>1</value> 
           </property> 
       </configuration>
      

      Edit etc/hadoop/mapred-site.xml (if it does not exist, copy it from etc/hadoop/mapred-site.xml.template):

       <configuration>
           <property>
               <name>mapreduce.framework.name</name>
               <value>yarn</value>
           </property>
       </configuration>
      

      Edit etc/hadoop/yarn-site.xml:

       <configuration> 
           <property> 
               <name>yarn.nodemanager.aux-services</name> 
               <value>mapreduce_shuffle</value> 
           </property> 
       </configuration>
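
The four site files above all have the same shape: one property inside a configuration element. As a sketch of that structure, the hypothetical helper `write_site_xml` below generates such a file. It overwrites the target file and does not escape XML special characters, so it is only suitable for simple values like the ones in this guide.

```shell
# Write a Hadoop site file that contains a single property.
# Arguments: target file, property name, property value.
write_site_xml() {
  local file="$1" name="$2" value="$3"
  cat > "$file" <<EOF
<configuration>
    <property>
        <name>$name</name>
        <value>$value</value>
    </property>
</configuration>
EOF
}

# Usage from the libexec directory (overwrites the existing files):
#   write_site_xml etc/hadoop/core-site.xml   fs.defaultFS hdfs://localhost:9000
#   write_site_xml etc/hadoop/hdfs-site.xml   dfs.replication 1
#   write_site_xml etc/hadoop/mapred-site.xml mapreduce.framework.name yarn
#   write_site_xml etc/hadoop/yarn-site.xml   yarn.nodemanager.aux-services mapreduce_shuffle
```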
      
    6. Start Hadoop
      Change to the Hadoop root directory:

      cd /usr/local/Cellar/hadoop/2.5.1

      Format the HDFS filesystem:

      ./bin/hdfs namenode -format

      Start the NameNode and DataNode daemons:

      ./sbin/start-dfs.sh

      Check the NameNode in the web UI:

      http://localhost:50070/

      Start the ResourceManager and NodeManager daemons:

      ./sbin/start-yarn.sh

      Check that all daemon processes are running:

      jps

      Check the ResourceManager in the web UI:

      http://localhost:8088/

      Create your HDFS home directory (replace {username} with your user name):

      ./bin/hdfs dfs -mkdir -p /user/{username}

      Run an example MapReduce job:

       # calculate pi
       ./bin/hadoop jar libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar pi 10 100
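
The jps check in this step can be automated. The sketch below reads jps output and prints any expected daemon that is absent; `missing_daemons` is a hypothetical helper name, and the daemon list in the usage line is an assumption about a default single-node setup (start-dfs.sh normally also starts a SecondaryNameNode).

```shell
# Read `jps` output on stdin and print each expected daemon (given as arguments)
# that does not appear in it. jps prints one "<pid> <name>" pair per line.
missing_daemons() {
  local listing daemon
  listing=$(cat)
  for daemon in "$@"; do
    # Compare against the second column exactly, so NameNode does not match SecondaryNameNode.
    if ! printf '%s\n' "$listing" | awk '{print $2}' | grep -qx "$daemon"; then
      echo "$daemon"
    fi
  done
}

# Usage:
#   jps | missing_daemons NameNode DataNode SecondaryNameNode ResourceManager NodeManager
```

An empty result means every listed daemon is up; otherwise the logs of the printed daemons are the place to look.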
      
    7. Run Spark

      Go to the Spark installation directory:

      cd /usr/local/Cellar/apache-spark/1.1.0

      Run a Spark example:

      ./bin/run-example SparkPi

      View the Spark job in the web UI while it is running:

      http://localhost:4040/

      You can also submit jobs with spark-submit:

       # pattern to launch an application in yarn-cluster mode
       ./bin/spark-submit --class <path.to.class> --master yarn-cluster [options] <app.jar> [options]
      
       # run example application (calculate pi)
       ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster libexec/lib/spark-examples-*.jar
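
The spark-examples-*.jar glob above is passed to spark-submit unexpanded if no jar matches, and expands to several arguments if more than one does. A minimal sketch of a guard follows; `find_examples_jar` is a hypothetical helper, and the libexec/lib location is taken from the command above.

```shell
# Print the single spark-examples jar under a Spark home directory,
# or fail when there is not exactly one match.
find_examples_jar() {
  local spark_home="$1" count=0 jar found=""
  for jar in "$spark_home"/libexec/lib/spark-examples-*.jar; do
    # Skip the literal pattern that the shell leaves in place when nothing matches.
    [ -e "$jar" ] || continue
    found="$jar"
    count=$((count + 1))
  done
  if [ "$count" -ne 1 ]; then
    echo "expected exactly one examples jar, found $count" >&2
    return 1
  fi
  echo "$found"
}

# Usage from the Spark installation directory:
#   jar=$(find_examples_jar "$PWD") &&
#     ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster "$jar"
```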
      
    8. Done

  • Original post: https://www.cnblogs.com/wangjiyong/p/5173090.html