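Running the Spark 2.1.1 example code

    Run Eclipse with administrator privileges.

    This walkthrough uses JavaSparkHiveExample as the example.

    Package: org.apache.spark.examples.sql

    Setting up the code environment

    Figure 1: Create a new Maven project named spark2.1.1example

    Change the JDK version: uncheck "Enable project specific settings".

    Change the JDK library to 1.8: select JRE System Library [J2SE-1.5], click Remove, then click Add Library > JRE System Library.

    Result after the change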
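
    To confirm that the run configuration really picks up a 1.8 JRE after this change, a small check like the following can be run as a Java application in the same project. It is only a sketch: the class name JavaVersionCheck and the helper majorVersion are made up for illustration and are not part of the Spark examples.

```java
final class JavaVersionCheck {

    /**
     * Extracts the major Java version from a specification version string.
     * Releases up to Java 8 report "1.x" (for example "1.8"); later releases
     * report only the major number (for example "11").
     */
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.trim().split("\\.");
        int first = Integer.parseInt(parts[0]);
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        System.out.println("java.specification.version = " + spec);
        System.out.println("Running on Java 8: " + (majorVersion(spec) == 8));
    }
}
```

    If the output still reports 1.5 or another version, the JRE System Library entry of the project is the first thing to recheck.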

    Replace pom.xml with:

     

    <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
      <modelVersion>4.0.0</modelVersion>
      <groupId>spark2.1.1example</groupId>
      <artifactId>spark2.1.1example</artifactId>
      <version>0.0.1-SNAPSHOT</version>
      <properties>
        <!-- Matches the JDK 1.8 library configured in Eclipse -->
        <java.version>1.8</java.version>
      </properties>
      <dependencies>

        <!-- https://mvnrepository.com/artifact/org.apache.kafka/kafka_2.10 -->
        <dependency>
          <groupId>org.apache.kafka</groupId>
          <artifactId>kafka_2.10</artifactId>
          <version>0.10.2.1</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-flume_2.10 -->
        <dependency>
          <groupId>org.apache.spark</groupId>
          <artifactId>spark-streaming-flume_2.10</artifactId>
          <version>2.1.1</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kafka_2.11 -->
        <dependency>
          <groupId>org.apache.spark</groupId>
          <artifactId>spark-streaming-kafka_2.11</artifactId>
          <version>1.6.3</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kafka-0-10_2.11 -->
        <!--<dependency>
          <groupId>org.apache.spark</groupId>
          <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
          <version>2.1.1</version>
        </dependency> -->

        <!-- https://mvnrepository.com/artifact/mysql/mysql-connector-java -->
        <dependency>
          <groupId>mysql</groupId>
          <artifactId>mysql-connector-java</artifactId>
          <version>5.1.35</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper -->
        <dependency>
          <groupId>org.apache.zookeeper</groupId>
          <artifactId>zookeeper</artifactId>
          <version>3.4.8</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql-kafka-0-10_2.10 -->
        <dependency>
          <groupId>org.apache.spark</groupId>
          <artifactId>spark-sql-kafka-0-10_2.10</artifactId>
          <version>2.1.1</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.flume/flume-ng-embedded-agent -->
        <dependency>
          <groupId>org.apache.flume</groupId>
          <artifactId>flume-ng-embedded-agent</artifactId>
          <version>1.6.0</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.derby/derby -->
        <dependency>
          <groupId>org.apache.derby</groupId>
          <artifactId>derby</artifactId>
          <version>10.13.1.1</version>
        </dependency>

      </dependencies>
      <build>
        <sourceDirectory>src</sourceDirectory>
        <plugins>
          <plugin>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.1</version>
            <configuration>
              <!-- Compile for the Java level declared in the properties above -->
              <source>${java.version}</source>
              <target>${java.version}</target>
            </configuration>
          </plugin>
        </plugins>
      </build>
    </project>

    Download spark-2.1.1-bin-hadoop2.7.tgz

    http://spark.apache.org/downloads.html

    Extract spark-2.1.1-bin-hadoop2.7.tgz.

    Figure 2: Create a new user library

    Select spark2.1.1jars and click Add External JARs.

    Open the directory you just extracted.

    Figure 3: Add all files under jars

    Figure 4: Add all files under examples/jars

    Figure 5: Libraries now contains three entries: JDK, Maven, and spark2.1.1jars
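
    Before adding the JARs in Eclipse, it can help to confirm that the archive was extracted completely. The sketch below lists the .jar files in the two directories used in Figures 3 and 4. The class name JarLister, the helper jarNames, and the default directory name are illustrative assumptions; pass the real extraction directory as the first argument.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class JarLister {

    /** Returns the sorted names of the regular *.jar files directly inside dir. */
    static List<String> jarNames(Path dir) {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .map(p -> p.getFileName().toString())
                    .filter(name -> name.endsWith(".jar"))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed default: the archive was extracted into the working directory.
        Path sparkHome = Paths.get(args.length > 0 ? args[0] : "spark-2.1.1-bin-hadoop2.7");
        for (String sub : new String[] {"jars", "examples/jars"}) {
            Path dir = sparkHome.resolve(sub);
            if (Files.isDirectory(dir)) {
                System.out.println(dir + ": " + jarNames(dir).size() + " jar files");
            } else {
                System.out.println(dir + ": directory not found");
            }
        }
    }
}
```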

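    Download winutils

    https://github.com/steveloughran/winutils

    Only the hadoop-2.7.1 part is needed.

    Figure 6: After extraction

    Figure 7: Right-click and choose Run As > Java Application

    Ignore the error. This step only creates the run configuration; the error goes away once the run configuration is modified in the next step.

    Figure 8: Right-click and choose Run As > Run Configurations

    Figure 9: Switch to the Environment tab

    Figure 10: Create a new variable HADOOP_HOME pointing to yourdir\winutils-master\hadoop-2.7.1

    Figure 11: Select "Replace native environment"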
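
    On Windows, Hadoop looks for bin\winutils.exe under the directory that HADOOP_HOME points to, so a wrong value here is a common reason the error from Figure 7 persists. The following is a minimal sketch of a check that can be run with the same run configuration. The class HadoopHomeCheck and its methods are hypothetical helpers, not part of Spark or Hadoop.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class HadoopHomeCheck {

    /** Location where winutils.exe is expected under the given Hadoop home. */
    static Path winutilsPath(String hadoopHome) {
        return Paths.get(hadoopHome, "bin", "winutils.exe");
    }

    /** Describes what is wrong with the setting, or returns null when it looks usable. */
    static String problem(String hadoopHome) {
        if (hadoopHome == null || hadoopHome.trim().isEmpty()) {
            return "HADOOP_HOME is not set";
        }
        Path exe = winutilsPath(hadoopHome);
        if (!Files.isRegularFile(exe)) {
            return "missing " + exe;
        }
        return null;
    }

    public static void main(String[] args) {
        // Reads the variable defined on the Environment tab of the run configuration.
        String problem = problem(System.getenv("HADOOP_HOME"));
        System.out.println(problem == null ? "HADOOP_HOME looks usable" : problem);
    }
}
```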

    Create a three-level directory under the project:

    examples/src/main/resources

    Figure 12: Copy the files in this directory into the newly created directory
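
    The directory creation and copy can also be scripted. The sketch below assumes that the source is the examples/src/main/resources directory inside the extracted Spark distribution (the original figure is not available, so this is a guess), and that the first argument is the extraction directory and the second the Eclipse project directory. CopyExampleResources and copyTree are illustrative names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

final class CopyExampleResources {

    /** Recursively copies source into target, creating target and subdirectories as needed. */
    static void copyTree(Path source, Path target) {
        // Files.walk visits a directory before its contents, so parents are created first.
        try (Stream<Path> paths = Files.walk(source)) {
            paths.forEach(src -> {
                Path dst = target.resolve(source.relativize(src).toString());
                try {
                    if (Files.isDirectory(src)) {
                        Files.createDirectories(dst);
                    } else {
                        Path parent = dst.getParent();
                        if (parent != null) {
                            Files.createDirectories(parent);
                        }
                        Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: CopyExampleResources <spark-dir> <project-dir>");
            return;
        }
        Path source = Paths.get(args[0], "examples", "src", "main", "resources");
        Path target = Paths.get(args[1], "examples", "src", "main", "resources");
        copyTree(source, target);
        System.out.println("Copied " + source + " to " + target);
    }
}
```

    The relative layout matters because the example loads its data through paths that start with examples/src/main/resources, resolved against the working directory of the run configuration.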

    Figure 13: To run inside Eclipse, the lines marked //HERE were modified

    Figure 14: View the run output

    Original article: https://www.cnblogs.com/wifi0/p/6950156.html