  • Setting up a local Spark development environment and packaging configuration

    Create a new project in IDEA.


    Delete the src directory of the new project, then create a module.


    Add the Spark and Scala dependencies to the parent pom. In this project we develop our models in Scala, which is recommended for a better development experience (Java and Python also work). Note that the Scala version (2.11.x here) must match the Scala binary version of the spark-core_2.11 artifact.

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
    
        <groupId>com.shaozhiqi.bigdata</groupId>
        <artifactId>spark-demo01</artifactId>
        <packaging>pom</packaging>
        <version>1.0-SNAPSHOT</version>
        <modules>
            <module>spark-core</module>
        </modules>
    
        <properties>
            <maven.compiler.source>1.8</maven.compiler.source>
            <maven.compiler.target>1.8</maven.compiler.target>
            <scala.version>2.11.7</scala.version>
            <spark.version>2.4.3</spark.version>
            <encoding>UTF-8</encoding>
        </properties>
        <dependencies>
    
            <dependency>
                <groupId>org.scala-lang</groupId>
                <artifactId>scala-library</artifactId>
                <version>${scala.version}</version>
            </dependency>
    
            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-core_2.11</artifactId>
                <version>${spark.version}</version>
            </dependency>
        </dependencies>
    
    </project>

    Configure the build and packaging plugins in the module's pom:

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <parent>
            <artifactId>spark-demo01</artifactId>
            <groupId>com.shaozhiqi.bigdata</groupId>
            <version>1.0-SNAPSHOT</version>
        </parent>
        <modelVersion>4.0.0</modelVersion>
    
        <artifactId>spark-core</artifactId>
    
        <build>
            <pluginManagement>
                <plugins>
                    <!-- Plugin for compiling Scala sources -->
                    <plugin>
                        <groupId>net.alchim31.maven</groupId>
                        <artifactId>scala-maven-plugin</artifactId>
                        <version>3.2.2</version>
                    </plugin>
                </plugins>
            </pluginManagement>
            <plugins>
                <plugin>
                    <groupId>net.alchim31.maven</groupId>
                    <artifactId>scala-maven-plugin</artifactId>
                    <executions>
                        <execution>
                            <id>scala-compile-first</id>
                            <phase>process-resources</phase>
                            <goals>
                                <goal>add-source</goal>
                                <goal>compile</goal>
                            </goals>
                        </execution>
                        <execution>
                            <id>scala-test-compile</id>
                            <phase>process-test-resources</phase>
                            <goals>
                                <goal>testCompile</goal>
                            </goals>
                        </execution>
                    </executions>
                </plugin>
    
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-compiler-plugin</artifactId>
                    <executions>
                        <execution>
                            <phase>compile</phase>
                            <goals>
                                <goal>compile</goal>
                            </goals>
                        </execution>
                    </executions>
                </plugin>
                <!-- Packaging plugin (builds the uber/fat jar) -->
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-shade-plugin</artifactId>
                    <version>3.2.1</version>
                    <configuration>
                        <transformers>
                            <!-- add Main-Class to manifest file -->
                            <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                            <!-- Set the main class to be written into the jar manifest -->
                                <mainClass>com.shaozhiqi.bigdata.spark.WordCount</mainClass>
                            </transformer>
                        </transformers>
                        <createDependencyReducedPom>false</createDependencyReducedPom>
                    </configuration>
                    <executions>
                        <execution>
                            <phase>package</phase>
                            <goals>
                                <goal>shade</goal>
                            </goals>
                            <configuration>
                                <filters>
                                    <filter>
                                        <artifact>*:*</artifact>
                                        <excludes>
                                            <exclude>META-INF/*.SF</exclude>
                                            <exclude>META-INF/*.DSA</exclude>
                                            <exclude>META-INF/*.RSA</exclude>
                                        </excludes>
                                    </filter>
                                </filters>
                            </configuration>
                        </execution>
                    </executions>
                </plugin>
            </plugins>
        </build>
    </project>

    Install the Scala development plugin in IDEA.


    Restart IDEA after installing the plugin.

    Set the Scala SDK and apply it to the module we just created.


    Create a new Scala object.


    Write the code:

    import org.apache.spark.{SparkConf, SparkContext}

    object WordCount {
      def main(args: Array[String]): Unit = {
        // 1. Create the configuration
        val conf = new SparkConf().setAppName("wordcount").setMaster("local[*]")
        // 2. Create the SparkContext
        val sc = new SparkContext(conf)
        // 3. Business logic: count the occurrences of each word.
        //    To try this on the cluster later, the textFile argument can be parameterized;
        //    for a local run, use a local absolute path (backslashes must be escaped)
        val lines = sc.textFile("G:\\temp\\input.txt")
        val words = lines.flatMap(_.split(" "))
        val keyMap = words.map((_, 1))
        val result = keyMap.reduceByKey(_ + _)
        result.foreach(println)
        // 4. Stop the SparkContext
        sc.stop()
      }
    }
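
    The version above, with the hard-coded local path, is what produces the local test output below. For a cluster run, the comment suggests parameterizing the textFile argument; a minimal sketch of that variant, assuming the input path is passed as the first program argument (the object name WordCountCluster is illustrative, since the shade plugin's mainClass points at com.shaozhiqi.bigdata.spark.WordCount, you would normally adapt that object directly):

    import org.apache.spark.{SparkConf, SparkContext}

    object WordCountCluster {
      def main(args: Array[String]): Unit = {
        // The master is supplied by spark-submit, so it is not set here
        val conf = new SparkConf().setAppName("wordcount")
        val sc = new SparkContext(conf)
        // args(0): input path, e.g. an HDFS directory (illustrative assumption)
        val result = sc.textFile(args(0))
          .flatMap(_.split(" "))
          .map((_, 1))
          .reduceByKey(_ + _)
        // Collect to the driver before printing, otherwise output lands in executor logs
        result.collect().foreach(println)
        sc.stop()
      }
    }

    After mvn package, the shaded jar can then be submitted with spark-submit, passing the main class configured in the shade plugin (--class com.shaozhiqi.bigdata.spark.WordCount), the master URL of your cluster, the jar, and the input path as the first program argument.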

    Local test run output:

    (1233,1)
    (llll,1)
    (hhh,1)
    (ddd,2)
    (55,2)
    (,1)
    (kkkk,1)
    (jjj,1)
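
    The (,1) entry in the output comes from an empty token, produced when split(" ") hits consecutive spaces or an empty line in the input. If empty tokens should not be counted, a small filter after the split removes them; a minimal tweak, not part of the original code:

    // Drop empty tokens so they are not counted as a "word"
    val words = lines.flatMap(_.split(" ")).filter(_.nonEmpty)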
  • Original article: https://www.cnblogs.com/shaozhiqi/p/11535269.html