zoukankan      html  css  js  c++  java
  • Spark学习之idea scala环境配置

    idea scala环境配置、运行第一个Scala程序

    1、环境

    • jdk推荐1.8版本

    2、下载Scala

    • 推荐安装版本,不用自己手动配置环境变量

    • scala版本要与虚拟机上提示相一致

    3、创建 IDEA 工程

    4、增加 Scala 支持

    • 右击项目Add Framework Support
    • 前提是安装了scala

    5、安装scala插件,在idea中安装或者离线都可以

    6、编写pom文件

    • 复制代码,记得刷新一下maven

    • 如果里面有之前使用过的,可以选择之前的一些版本

      <?xml version="1.0" encoding="UTF-8"?>
      <project xmlns="http://maven.apache.org/POM/4.0.0"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
          <modelVersion>4.0.0</modelVersion>
      
          <groupId>cn.itcast</groupId>
          <artifactId>spark</artifactId>
          <version>0.1.0</version>
      
          <properties>
              <scala.version>2.11.8</scala.version>
              <spark.version>2.2.0</spark.version>
              <slf4j.version>1.7.16</slf4j.version>
              <log4j.version>1.2.17</log4j.version>
          </properties>
      
          <dependencies>
              <dependency>
                  <groupId>org.scala-lang</groupId>
                  <artifactId>scala-library</artifactId>
                  <version>${scala.version}</version>
              </dependency>
              <dependency>
                  <groupId>org.apache.spark</groupId>
                  <artifactId>spark-core_2.11</artifactId>
                  <version>${spark.version}</version>
              </dependency>
              <dependency>
                  <groupId>org.apache.hadoop</groupId>
                  <artifactId>hadoop-client</artifactId>
                  <version>2.6.0</version>
              </dependency>
      
              <dependency>
                  <groupId>org.slf4j</groupId>
                  <artifactId>jcl-over-slf4j</artifactId>
                  <version>${slf4j.version}</version>
              </dependency>
              <dependency>
                  <groupId>org.slf4j</groupId>
                  <artifactId>slf4j-api</artifactId>
                  <version>${slf4j.version}</version>
              </dependency>
              <dependency>
                  <groupId>org.slf4j</groupId>
                  <artifactId>slf4j-log4j12</artifactId>
                  <version>${slf4j.version}</version>
              </dependency>
              <dependency>
                  <groupId>log4j</groupId>
                  <artifactId>log4j</artifactId>
                  <version>${log4j.version}</version>
              </dependency>
              <dependency>
                  <groupId>junit</groupId>
                  <artifactId>junit</artifactId>
                  <version>4.10</version>
                  <scope>provided</scope>
              </dependency>
          </dependencies>
      
          <build>
              <sourceDirectory>src/main/scala</sourceDirectory>
              <testSourceDirectory>src/test/scala</testSourceDirectory>
              <plugins>
                  <plugin>
                      <groupId>org.apache.maven.plugins</groupId>
                      <artifactId>maven-compiler-plugin</artifactId>
                      <version>3.0</version>
                      <configuration>
                          <source>1.8</source>
                          <target>1.8</target>
                          <encoding>UTF-8</encoding>
                      </configuration>
                  </plugin>
      
                  <plugin>
                      <groupId>net.alchim31.maven</groupId>
                      <artifactId>scala-maven-plugin</artifactId>
                      <version>3.2.0</version>
                      <executions>
                          <execution>
                              <goals>
                                  <goal>compile</goal>
                                  <goal>testCompile</goal>
                              </goals>
                              <configuration>
                                  <args>
                                      <arg>-dependencyfile</arg>
                                      <arg>${project.build.directory}/.scala_dependencies</arg>
                                  </args>
                              </configuration>
                          </execution>
                      </executions>
                  </plugin>
      
                  <plugin>
                      <groupId>org.apache.maven.plugins</groupId>
                      <artifactId>maven-shade-plugin</artifactId>
                      <version>3.1.1</version>
                      <executions>
                          <execution>
                              <phase>package</phase>
                              <goals>
                                  <goal>shade</goal>
                              </goals>
                              <configuration>
                                  <filters>
                                      <filter>
                                          <artifact>*:*</artifact>
                                          <excludes>
                                              <exclude>META-INF/*.SF</exclude>
                                              <exclude>META-INF/*.DSA</exclude>
                                              <exclude>META-INF/*.RSA</exclude>
                                          </excludes>
                                      </filter>
                                  </filters>
                                  <transformers>
                                      <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                          <mainClass></mainClass>
                                      </transformer>
                                  </transformers>
                              </configuration>
                          </execution>
                      </executions>
                  </plugin>
              </plugins>
          </build>
      </project>
      
    • 因为在 pom.xml 中指定了 Scala 的代码目录, 所以创建目录 src/main/scala 和目录 src/test/scala

    7、创建object,编写代码

    项目下创建dataset文件夹,并编写wordcount.txt文件

    object WordCounts {
    
      def main(args: Array[String]): Unit = {
        // 1. 创建 Spark Context
        val conf = new SparkConf().setMaster("local[2]").setAppName("word_count")
        val sc: SparkContext = new SparkContext(conf)
    
        // 2. 读取文件并计算词频
        val source =  sc.textFile("dataset/wordcount.txt")
        val words  = source.flatMap { item => item.split(" ") }
        val wordsTuple = words.map { word => (word, 1) }
        val wordsCount = wordsTuple.reduceByKey { (x, y) => x + y }
        // 3. 查看执行结果
        println(wordsCount.collect)
      }
    }
    

    8、运行

  • 相关阅读:
    nexus 安装与启动(windows本版)
    linux 安装 mysql8
    02、linux 常用指令
    linux 安装tomcat8
    CentOS7使用firewalld打开关闭防火墙与端口
    03、linux 安装jdk
    rabbit mq的使用
    跨域与同源策略
    JDK1.8新特性04--Optional处理空指针问题
    HttpAsyncClient异步调用
  • 原文地址:https://www.cnblogs.com/xp-thebest/p/14273848.html
Copyright © 2011-2022 走看看