  • Installing Spark on CentOS 7

    Environment

    CentOS 7

    Hadoop 2.7.3

    Java 1.8

    Download

    http://spark.apache.org

    Extract to the installation directory

    Any location will do; I installed it alongside Hadoop, under /home/hadoop.

    Configuration

    (cd <Spark installation directory>/conf)

    cp log4j.properties.template log4j.properties
    cp spark-env.sh.template spark-env.sh
    cp slaves.template slaves
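
The three copies above can be wrapped in a small helper so that re-running the setup does not overwrite a config file that has already been edited. This is only a sketch: the function name copy_spark_templates is made up for illustration, and the conf path in the usage line assumes the /home/hadoop/spark location used later in this post.

```shell
# Copy each Spark *.template config to its active name.
# Argument: the conf directory of the Spark installation.
copy_spark_templates() {
    local conf_dir="$1"
    local name
    for name in log4j.properties spark-env.sh slaves; do
        if [ ! -f "$conf_dir/$name.template" ]; then
            echo "missing template: $conf_dir/$name.template" >&2
            return 1
        fi
        # Keep an existing file so earlier edits are not overwritten.
        if [ ! -e "$conf_dir/$name" ]; then
            cp "$conf_dir/$name.template" "$conf_dir/$name"
        fi
    done
}
```

Usage would look like: copy_spark_templates /home/hadoop/spark/conf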

    Append the following lines to the end of spark-env.sh to tell Spark where the Hadoop classpath and the Spark home directory are:

    export SPARK_DIST_CLASSPATH=$(/home/hadoop/hadoop-2.7.3/bin/hadoop classpath)
    export SPARK_HOME=/home/hadoop/spark
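
If the setup is scripted, appending these two lines blindly adds duplicates on every run. A minimal sketch of an idempotent append follows. The helper names append_once and configure_spark_env are hypothetical, and the paths are simply the ones used in this post.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
    local line="$1"
    local file="$2"
    # -x matches whole lines, -F treats the line as a fixed string (no regex).
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Add the two exports from this post to the given spark-env.sh.
configure_spark_env() {
    local env_file="$1"
    # Single quotes keep $(...) literal, so it runs when Spark sources the file.
    append_once 'export SPARK_DIST_CLASSPATH=$(/home/hadoop/hadoop-2.7.3/bin/hadoop classpath)' "$env_file"
    append_once 'export SPARK_HOME=/home/hadoop/spark' "$env_file"
}
```

Usage would look like: configure_spark_env /home/hadoop/spark/conf/spark-env.sh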
    

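    Append the hostnames of the slave machines to the end of the slaves file, one per line.

    Copy the Spark directory to the slaves

    For example: scp -r spark hadoop@slave1:/home/hadoop/ ; scp -r spark hadoop@slave2:/home/hadoop/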
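
With more than a couple of slaves, the scp commands can be generated from the slaves file itself. The sketch below is an assumption-laden helper, not part of Spark: distribute_spark and the DRY_RUN variable are names invented here. It skips comments, blank lines and localhost (copying the directory onto itself would be pointless), and by default it only prints the commands.

```shell
# Print (default) or run one scp per host listed in a slaves file.
# Arguments: slaves file, local Spark directory, remote parent directory, remote user.
# Set DRY_RUN=0 to actually copy.
distribute_spark() {
    local slaves_file="$1"
    local spark_dir="$2"
    local dest="$3"
    local user="$4"
    local host
    # Drop comment lines, blank lines and the localhost entry.
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' \
            -e '^[[:space:]]*localhost[[:space:]]*$' "$slaves_file" |
    while read -r host; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "scp -r $spark_dir $user@$host:$dest"
        else
            # Redirect stdin so scp cannot swallow the remaining host list.
            scp -r "$spark_dir" "$user@$host:$dest" < /dev/null
        fi
    done
}
```

Run from /home/hadoop, usage would look like: DRY_RUN=0 distribute_spark spark/conf/slaves spark /home/hadoop/ hadoop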

    Start

    On the master machine, from the Spark directory, run sbin/start-master.sh followed by sbin/start-slaves.sh, or simply sbin/start-all.sh.

    Check that Spark is running:

    http://yourIp:8080
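
The same check can be done from a terminal, which is handy on a machine without a browser. The sketch below assumes curl is installed. wait_for_master_ui is a hypothetical helper name, and the retry counts are arbitrary defaults.

```shell
# Poll a URL until it answers or the attempts run out.
# Arguments: URL, number of attempts (default 10), seconds between attempts (default 2).
wait_for_master_ui() {
    local url="$1"
    local attempts="${2:-10}"
    local delay="${3:-2}"
    local i=1
    while [ "$i" -le "$attempts" ]; do
        # -s silent, -f fail on HTTP errors, -o discard the page body.
        if curl -sf -o /dev/null --max-time 5 "$url"; then
            echo "master UI is up: $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "master UI not reachable: $url" >&2
    return 1
}
```

Usage would look like: wait_for_master_ui http://yourIp:8080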

    Running an application

    (The master URL is shown at http://yourIp:8080.)

    bin/spark-shell --master spark://master:7077

    [hadoop@master spark]$ bin/spark-shell --master spark://master:7077
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/home/hadoop/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    17/06/06 04:01:17 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    17/06/06 04:01:29 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
    Spark context Web UI available at http://10.12.1.102:4040
    Spark context available as 'sc' (master = spark://master:7077, app id = app-20170606040119-0002).
    Spark session available as 'spark'.
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 2.1.1
          /_/
    
    Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112)
    Type in expressions to have them evaluated.
    Type :help for more information.
    
    scala>
    

    Official examples: http://spark.apache.org/docs/latest/quick-start.html

    scala> var textfile=sc.textFile("hdfs://master:9000/user/lihb/in/*.log")
    textfile: org.apache.spark.rdd.RDD[String] = hdfs://master:9000/user/lihb/in/*.log MapPartitionsRDD[1] at textFile at <console>:24
    
    scala> textfile.first()
    res5: String = #Software: IIS Advanced Logging Module
    
    scala> textfile.count()
    res7: Long = 32583
    
    scala> val wordCounts=textfile.flatMap(line=>line.split(" ")).map(word=>(word,1)).reduceByKey((a,b)=>a+b)
    wordCounts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[4] at reduceByKey at <console>:26
    
    scala> wordCounts.collect()
    res8: Array[(String, Int)] = Array((/space/attentionto/99335/,1), (01:41:27.777,1),  (01:45:...
    scala>
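
To sanity-check a word count like the one above on a small sample, roughly the same computation can be done with standard Unix tools. This is a local cross-check, not a Spark job, and word_count is a name made up for this sketch. One known difference: empty tokens produced by consecutive spaces are dropped here, whereas split(" ") in the Scala version can keep them.

```shell
# Count space-separated tokens read from stdin.
# Prints "count token" lines, most frequent first.
word_count() {
    tr ' ' '\n' |       # one token per line
    grep -v '^$' |      # drop empty tokens
    sort |              # group identical tokens together
    uniq -c |           # prefix each distinct token with its count
    sort -rn |          # highest count first
    sed 's/^ *//'       # strip the padding uniq adds
}
```

Usage would look like: hdfs dfs -cat 'hdfs://master:9000/user/lihb/in/*.log' | head -n 1000 | word_count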
    

    Hadoop installation: see "Installing and using Hadoop on CentOS 7"

  • Original article: https://www.cnblogs.com/hobinly/p/6952013.html