  • Learn ZYNQ(10) – zybo cluster word count

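    1. Environment setup

    spark: 5 ZYBO boards; 192.168.1.1 is the master and the other 4 are slaves

    hadoop: 192.168.1.1 (with an external SanDisk drive attached)

    2. Single-node Hadoop test:

    If you run out of memory, as shown below: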

    image

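    Check the current virtual memory (swap) size, then create and enable a swap file:

    free -m
    cd /mnt
    mkdir swap
    cd swap/
    # Create the file that will become swap (1024 bytes x 1000000 blocks, just under 1 GiB)
    dd if=/dev/zero of=swapfile bs=1024 count=1000000
    # Format the generated file as swap space
    mkswap swapfile
    # Activate the swap file
    swapon swapfile
    # Confirm that the swap total has increased
    free -m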
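
    The dd arguments above fix the swap size: bs=1024 times count=1000000 is 1,024,000,000 bytes. As a minimal sketch that is not part of the original post, the Python below pulls SwapTotal out of text in the /proc/meminfo format, which is one way to confirm the swap file is active. The helper name swap_total_kb and the sample text are illustrative assumptions, not output captured from the boards.

```python
def swap_total_kb(meminfo_text):
    """Return the SwapTotal value (in kB) from /proc/meminfo-style text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # Line format: "SwapTotal:      <number> kB"
            return int(line.split()[1])
    return None


# Size of the file written by: dd if=/dev/zero of=swapfile bs=1024 count=1000000
SWAPFILE_BYTES = 1024 * 1000000

# Illustrative sample only (made-up values in the /proc/meminfo layout)
SAMPLE_MEMINFO = """MemTotal:         505520 kB
SwapTotal:        999996 kB
SwapFree:         999996 kB
"""

print("swapfile bytes:", SWAPFILE_BYTES)
print("SwapTotal in sample (kB):", swap_total_kb(SAMPLE_MEMINFO))
```

    On a board, the same function could be applied to the contents of /proc/meminfo read with open("/proc/meminfo").read().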

    The test passes:

    image

    image

    3. Spark + Hadoop test

    SPARK_MASTER_IP=192.168.1.1 ./sbin/start-all.sh

    image

    MASTER=spark://192.168.1.1:7077 ./bin/pyspark

    file = sc.textFile("hdfs://192.168.1.1:9000/in/file")
    counts = (file.flatMap(lambda line: line.split(" "))
                  .map(lambda word: (word, 1))
                  .reduceByKey(lambda a, b: a + b))
    counts.saveAsTextFile("hdfs://192.168.1.1:9000/out/mycount")
    counts.saveAsTextFile("/mnt/mycount")
    counts.collect()
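
    To make the three stages of the job above concrete, here is a minimal pure-Python sketch of the same computation on an in-memory list, with no Spark involved. The function name word_count and the sample lines are assumptions made for illustration; the real job runs distributed over the HDFS file.

```python
from collections import defaultdict


def word_count(lines):
    """Mimic flatMap -> map -> reduceByKey on an in-memory list of lines."""
    # flatMap: split every line on single spaces and flatten the pieces
    words = [word for line in lines for word in line.split(" ")]
    # map: pair each word with the count 1
    pairs = [(word, 1) for word in words]
    # reduceByKey: add up the counts that share the same key
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


sample_lines = ["hello zybo", "hello  spark"]  # second line has a double space
result = word_count(sample_lines)
print(result)
```

    Because the split is on a single space, consecutive spaces produce empty-string "words" that are counted like any other key. Using line.split() with no argument would avoid that, if it matters for your input.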

    image

    counts.collect()

    image

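    Error 1:

    java.net.ConnectException: Call From zynq/192.168.1.1 to spark1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

    This happens because we start Hadoop as root, while Spark operates on HDFS remotely and does not have permission. (A "Connection refused" on port 8020 can also mean that nothing is listening there: 8020 is the default NameNode RPC port, whereas this setup uses 9000, so also check that the HDFS URL includes :9000.)

    Fix: in a test environment you can turn off HDFS user permission checking. Open etc/hadoop/hdfs-site.xml, find the dfs.permissions property and change it to false (the default is true). In Hadoop 2.x this name is a deprecated alias of dfs.permissions.enabled.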

    <property>
            <name>dfs.permissions</name>
            <value>false</value>
    </property>
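
    As a quick way to confirm which value a configuration file actually carries, the following sketch looks up a property in Hadoop-style XML using Python's standard xml.etree.ElementTree. The helper name get_property is hypothetical, and the sample string simply repeats the fragment above. This is not a Hadoop API.

```python
import xml.etree.ElementTree as ET


def get_property(xml_text, name):
    """Return the <value> of the named Hadoop property, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Same fragment as above, wrapped in a <configuration> root element
SAMPLE_HDFS_SITE = """<configuration>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
</configuration>"""

print(get_property(SAMPLE_HDFS_SITE, "dfs.permissions"))
```

    To check a real file, pass in the text of etc/hadoop/hdfs-site.xml instead of the sample string.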

    4. Appendix: my configuration files

    go.sh:

    #! /bin/sh -
    
    mount /dev/sda1 /mnt/
    cd /mnt/swap/
    swapon swapfile
    free -m
    
    cd /root/hadoop-2.4.0/
    sbin/hadoop-daemon.sh start namenode
    sbin/hadoop-daemon.sh start datanode
    sbin/hadoop-daemon.sh start secondarynamenode
    sbin/yarn-daemon.sh start resourcemanager
    sbin/yarn-daemon.sh start nodemanager
    sbin/mr-jobhistory-daemon.sh start historyserver
    
    jps
    # Wait until something is listening on port 9000 (the NameNode RPC port)
    while ! netstat -ntlp | grep -q ':9000 '
    do
    sleep 1
    done
    netstat -ntlp
    echo hadoop start successfully
    
    cd /root/spark-0.9.1-bin-hadoop2
    SPARK_MASTER_IP=192.168.1.1 ./sbin/start-all.sh
    jps
    # Wait until something is listening on port 7077 (the Spark master port)
    while ! netstat -ntlp | grep -q ':7077 '
    do
    sleep 1
    done 
    netstat -ntlp
    echo spark start successfully
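
    The two wait loops in go.sh poll until a port is open. A minimal Python sketch of the same idea follows, with two differences: it tests by attempting a TCP connection rather than reading netstat output, and it gives up after a timeout instead of looping forever. The function name wait_for_port and its default values are assumptions for illustration.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means some process is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

    For example, wait_for_port("192.168.1.1", 9000) would stand in for the first loop and wait_for_port("192.168.1.1", 7077) for the second.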

    /etc/hosts

    #127.0.0.1      localhost       zynq
    192.168.1.1     spark1          localhost       zynq
    #192.168.1.1    spark1
    192.168.1.2     spark2
    192.168.1.3     spark3
    192.168.1.4     spark4
    192.168.1.5     spark5
    192.168.1.100   sparkMaster
    #::1            localhost ip6-localhost ip6-loopback
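
    It is worth noting what this hosts file resolves to: localhost is mapped to 192.168.1.1 rather than 127.0.0.1. The sketch below parses the file's text into a name-to-address table so that this is easy to see. The function name parse_hosts is hypothetical, and keeping the first address seen for a name is a simplifying assumption about lookup order.

```python
def parse_hosts(text):
    """Map each hostname to its IP address, ignoring comments and blank lines."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Simplification: keep the first address seen for each name
            mapping.setdefault(name, ip)
    return mapping


# The /etc/hosts content listed above
HOSTS_TEXT = """#127.0.0.1      localhost       zynq
192.168.1.1     spark1          localhost       zynq
#192.168.1.1    spark1
192.168.1.2     spark2
192.168.1.3     spark3
192.168.1.4     spark4
192.168.1.5     spark5
192.168.1.100   sparkMaster
#::1            localhost ip6-localhost ip6-loopback
"""

hosts = parse_hosts(HOSTS_TEXT)
for name in ("localhost", "zynq", "spark1", "sparkMaster"):
    print(name, "->", hosts[name])
```

    With this mapping, hdfs://localhost:9000 and 192.168.1.1:9000 point at the same address on the master.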

    /etc/profile

    export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:$PATH
    export JAVA_HOME=/usr/lib/jdk1.7.0_55
    export JRE_HOME=${JAVA_HOME}/jre
    export CLASSPATH=.:$JAVA_HOME/lib/tools.jar
    export PATH=$JAVA_HOME/bin:$PATH
    export HADOOP_HOME=/root/hadoop-2.4.0
    
    export PATH=$PATH:$HADOOP_HOME/bin
    export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
    export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
    export HADOOP_MAPRED_HOME=$HADOOP_HOME
    export HADOOP_COMMON_HOME=$HADOOP_HOME
    export HADOOP_HDFS_HOME=$HADOOP_HOME
    export YARN_HOME=$HADOOP_HOME
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
    
    ifconfig eth2 hw ether 00:0a:35:00:01:01
    ifconfig eth2 192.168.1.1/24 up

    HADOOP_HOME/etc/hadoop/yarn-site.xml

    <configuration>
    <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    </property>
    
    <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    
    </configuration>

    HADOOP_HOME/etc/hadoop/core-site.xml

    <configuration>
        <property>
            <name>fs.default.name</name>
            <value>hdfs://localhost:9000</value>
        </property>
    
    
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/mnt/hadoop/tmp</value>
        </property>
    </configuration>

    HADOOP_HOME/etc/hadoop/hdfs-site.xml

    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
        <property>
            <name>dfs.permissions</name>
            <value>false</value>
        </property>
    
        <property>
            <name>dfs.namenode.rpc-address</name>
            <value>192.168.1.1:9000</value>
        </property>
    
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>/mnt/datanode</value>
        </property>
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>/mnt/namenode</value>
        </property>
    </configuration>
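
    The NameNode port appears in two places in these files: fs.default.name in core-site.xml and dfs.namenode.rpc-address in hdfs-site.xml. Here is a small sketch, assuming a hypothetical helper namenode_ports, that extracts both ports so they can be compared. It also shows that an HDFS URL written without a port carries none, in which case the client falls back to the default (8020, the port in the error message in section 3).

```python
from urllib.parse import urlparse


def namenode_ports(fs_default_name, rpc_address):
    """Return (port in the fs.default.name URI, port in dfs.namenode.rpc-address)."""
    uri_port = urlparse(fs_default_name).port
    rpc_port = int(rpc_address.rsplit(":", 1)[1])
    return uri_port, rpc_port


# Values taken from core-site.xml and hdfs-site.xml above
uri_port, rpc_port = namenode_ports("hdfs://localhost:9000", "192.168.1.1:9000")
print("ports match:", uri_port == rpc_port)

# A URL without an explicit port has no port component at all
print("port in portless URL:", urlparse("hdfs://192.168.1.1/in/file").port)
```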

    done

  • Original article: https://www.cnblogs.com/shenerguang/p/3834006.html