
Troubleshooting: problems running a Scala-based Spark program in IDEA on Ubuntu 18.04.2

I. Preface

I have recently been doing some small experiments involving Scala and Spark, so I did the work on a Linux machine. The most basic "getting started" program ended up taking one or two days to get running. Beyond the embarrassment, it left me with a deep appreciation of how hard these tools are to use; it is no surprise that several of the Hadoop-centric companies have been fading.

II. Troubleshooting

Readers of my earlier posts will know that I have used Hadoop and Spark before and ran into plenty of trouble then as well. These products iterate quickly, and each release keeps only minimal compatibility with the previous one, so choosing matching versions is critical. Apart from the compatibility notes on the official sites, there is very little material online about which versions actually work together, even though that is exactly the information you need. Installation is also fiddly: the downloads are nominally "unpack and run", but you still have to edit several configuration files, and that is the easy part. The real trouble starts when you run something on the command line and hit inexplicable errors, most of which come down to the underlying Java, Hadoop, or Scala versions. The products give you no useful hints, and what you can find online is sparse, often off the mark, and frequently self-contradictory, so it easily takes a day or two to reach a real fix. If these products do not improve, they will surely lose out to the likes of MySQL and MongoDB. In this little experiment I lost two days to exactly this kind of version-dependency problem: running a Scala-based Spark program in IDEA on Ubuntu 18.04.2.

Let me first describe, step by step, how the project is built. There are plenty of write-ups online, but most skim the surface, are vague on specifics, and omit the details that matter most, whether out of inability or unwillingness to explain. In IDEA there are two ways to get a project with Scala support. The first is to create a Scala project directly: you install the Scala plugin, and after creating the project you configure the run environment yourself; those settings are tangled, usually take many attempts, and waste a lot of effort. The second is to still install the Scala plugin but create a Maven project instead, declare what you need in pom.xml, let Maven download everything through its dependency and inheritance mechanism, and simply add the Scala SDK to the module. The second way is clearly simpler, so that is what we use here: a Maven project.

Create a new folder and mark it as the module's source root in the Project Structure settings.

Finally we need to add the Scala SDK we downloaded, and this is where versions start to matter: Scala 2.10 and earlier targeted Java 7, while from 2.11 onward Java 8 is expected. Since we are on Java 8, we need at least 2.11, and 2.11 itself has many point releases to choose from. In the SDK dialog we pick the Scala version we want; if it is not listed, the Download button lets us fetch it. This, frankly, is one of IDEA's weaker points: the download can take half an hour, there is no progress bar, and cancelling it is awkward. Once the download finishes, select that version.
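If you want Maven and the IDE to agree on the Scala version, one option (not used in the pom below, which simply relies on the scala-library pulled in transitively by spark-mllib) is to pin the library explicitly. A minimal sketch, assuming the 2.11.0 SDK chosen above:

  <properties>
    <!-- keep in sync with the Scala SDK configured in IDEA -->
    <scala.version>2.11.0</scala.version>
  </properties>
  <dependencies>
    <!-- explicit Scala standard library, pinned to the same version -->
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>
  </dependencies>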

At this point it is time to configure pom.xml. Since the project uses Spark's machine-learning module, we only need to declare that one dependency; Maven resolves the transitive dependencies for us, which deserves some credit.

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.compiler.source>1.7</maven.compiler.source>
    <maven.compiler.target>1.7</maven.compiler.target>
<!--    <scala.version>2.11.0</scala.version>-->
    <spark.artifactID.suffix>2.11</spark.artifactID.suffix>
    <spark.version>2.4.3</spark.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-mllib_${spark.artifactID.suffix}</artifactId>
      <version>${spark.version}</version>
    </dependency>
  </dependencies>
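Note that this pom declares no Scala build plugin, so compilation here is handled by the IDEA Scala plugin rather than by Maven itself. If you also want mvn package to compile the Scala sources outside the IDE, a commonly used addition (not part of the original project) is the scala-maven-plugin; a rough sketch, with an illustrative version number:

  <build>
    <plugins>
      <!-- compiles src/main/scala during the Maven build (not in the original pom) -->
      <plugin>
        <groupId>net.alchim31.maven</groupId>
        <artifactId>scala-maven-plugin</artifactId>
        <version>3.4.6</version>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>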

Here the Scala version is set to 2.11, matching the SDK downloaded above. With 2.12, for some reason, the dependencies were apparently imported yet the program would not even compile: it complained that the packages could not be found, even though the jars were clearly there among the downloaded dependencies, which was astonishing. When that was finally sorted out, running the program then failed with bizarre exceptions; even a Scala "hello world" became this hard, and the errors caused by version mismatches were truly strange. For Spark I went by what is in the Maven repository and picked the latest release, 2.4.3. This is a configuration I can actually run right now, so for the moment it works. Even stranger, sometimes the first run would succeed, a second run of another program would fail, and then re-running the first program would fail the same way; clearing IDEA's caches and restarting many times did not help, and the same thing happened on another machine, which is enough to drive you to despair. The combination that finally worked for me is Java 8 + Scala 2.11.0 + spark-mllib_2.11 + Spark 2.4.3, and with that the problem was solved.
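When chasing this kind of mismatch, it can help to print which scala-library actually ended up on the runtime classpath and compare its binary version with the _2.11 suffix of the Spark artifacts. A small standalone sketch (not one of the programs below):

  object ScalaVersionCheck {
    def main(args: Array[String]): Unit = {
      // Prints something like "version 2.11.12"; the 2.11 part must match
      // the _2.11 suffix of spark-core / spark-mllib on the classpath.
      println(scala.util.Properties.versionString)
    }
  }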

The first program:

package com.kmeans

import org.apache.spark.{SparkConf, SparkContext}


object MyTest {
  def main(args: Array[String]): Unit = {
    val logFile = "file:///home/zyr/file.txt"
    val conf = new SparkConf().setAppName("Simple Application").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val num = logData.flatMap(x => x.split(" ")).filter(_.contains("a")).count()
    println("Words with a : %s".format(num))
    sc.stop()
  }
}

The input file:

    xyr  a b c d f g a d f g
    a a a a a a a a a
    w e r t y yuu 

Run output:

     1 /usr/lib/jvm/java-8-openjdk-amd64/bin/java -javaagent:/usr/local/idea/lib/idea_rt.jar=44451:/usr/local/idea/bin -Dfile.encoding=UTF-8 -classpath /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/charsets.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/cldrdata.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/dnsns.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/icedtea-sound.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/jaccess.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/java-atk-wrapper.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/localedata.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/nashorn.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/sunec.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/sunjce_provider.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/sunpkcs11.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/zipfs.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/jce.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/jsse.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/management-agent.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/resources.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar:/home/zyr/IdeaProjects/myspark/target/classes:/home/zyr/.m2/repository/org/scala-lang/scala-reflect/2.11.0/scala-reflect-2.11.0.jar:/home/zyr/.m2/repository/org/scala-lang/scala-library/2.11.0/scala-library-2.11.0.jar:/home/zyr/.m2/repository/org/apache/spark/spark-mllib_2.11/2.4.3/spark-mllib_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/scala-lang/modules/scala-parser-combinators_2.11/1.1.0/scala-parser-combinators_2.11-1.1.0.jar:/home/zyr/.m2/repository/org/scala-lang/scala-library/2.11.12/scala-library-2.11.12.jar:/home/zyr/.m2/repository/org/apache/spark/spark-core_2.11/2.4.3/spark-core_2.11-2.4.3.jar:/home/zyr/.m2/repository/com/thoughtworks/paranamer/paranamer/2.8/paranamer-2.8.jar:/home/zyr/.m2/repository/org/apache/avro/avro/1.8.2/avro-1.8.2.jar:/home/zyr/.m2/repository/org/codehaus/jackson/jackson-core-asl/1.9.13/jackson-core-asl-1.9.13.jar:/home/zyr/.m2/repository/org/codehaus/jackson/jackson-mapper-asl/1.9.13/jackson-mapper-asl-1.9.13.jar:/home/zyr/.m2/repository/org/apache/commons/commons-compress/1.8.1/commons-compress-1.8.1.jar:/home/zyr/.m2/repository/org/tukaani/xz/1.5/xz-1.5.jar:/home/zyr/.m2/repository/org/apache/avro/avro-mapred/1.8.2/avro-mapred-1.8.2-hadoop2.jar:/home/zyr/.m2/repository/org/apache/avro/avro-ipc/1.8.2/avro-ipc-1.8.2.jar:/home/zyr/.m2/repository/commons-codec/commons-codec/1.9/commons-codec-1.9.jar:/home/zyr/.m2/repository/com/twitter/chill_2.11/0.9.3/chill_2.11-0.9.3.jar:/home/zyr/.m2/repository/com/esotericsoftware/kryo-shaded/4.0.2/kryo-shaded-4.0.2.jar:/home/zyr/.m2/repository/com/esotericsoftware/minlog/1.3.0/minlog-1.3.0.jar:/home/zyr/.m2/repository/org/objenesis/objenesis/2.5.1/objenesis-2.5.1.jar:/home/zyr/.m2/repository/com/twitter/chill-java/0.9.3/chill-java-0.9.3.jar:/home/zyr/.m2/repository/org/apache/xbean/xbean-asm6-shaded/4.8/xbean-asm6-shaded-4.8.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-client/2.6.5/hadoop-client-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-common/2.6.5/hadoop-common-2.6.5.jar:/home/zyr/.m2/repository/commons-cli/commons-cli/1.2/commons-cli-1.2.jar:/home/zyr/.m2/repository/xmlenc/xmlenc/0.52/xmlenc-0.52.jar:/home/zyr/.m2/repository/commons-httpclient/commons-httpclient/3.1/commons-httpclient-3.1.jar:/home/zyr/.m2/repository/commons-io/commons-io/2.4/commons-io-2.4.jar:/home/zyr/.m2/repository/commons-collections/commons-collections/3.2.2/commons-collections-3.2.2.jar:/home/zyr/.m2/
repository/commons-configuration/commons-configuration/1.6/commons-configuration-1.6.jar:/home/zyr/.m2/repository/commons-digester/commons-digester/1.8/commons-digester-1.8.jar:/home/zyr/.m2/repository/commons-beanutils/commons-beanutils/1.7.0/commons-beanutils-1.7.0.jar:/home/zyr/.m2/repository/com/google/code/gson/gson/2.2.4/gson-2.2.4.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-auth/2.6.5/hadoop-auth-2.6.5.jar:/home/zyr/.m2/repository/org/apache/httpcomponents/httpclient/4.2.5/httpclient-4.2.5.jar:/home/zyr/.m2/repository/org/apache/httpcomponents/httpcore/4.2.4/httpcore-4.2.4.jar:/home/zyr/.m2/repository/org/apache/directory/server/apacheds-kerberos-codec/2.0.0-M15/apacheds-kerberos-codec-2.0.0-M15.jar:/home/zyr/.m2/repository/org/apache/directory/server/apacheds-i18n/2.0.0-M15/apacheds-i18n-2.0.0-M15.jar:/home/zyr/.m2/repository/org/apache/directory/api/api-asn1-api/1.0.0-M20/api-asn1-api-1.0.0-M20.jar:/home/zyr/.m2/repository/org/apache/directory/api/api-util/1.0.0-M20/api-util-1.0.0-M20.jar:/home/zyr/.m2/repository/org/apache/curator/curator-client/2.6.0/curator-client-2.6.0.jar:/home/zyr/.m2/repository/org/htrace/htrace-core/3.0.4/htrace-core-3.0.4.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-hdfs/2.6.5/hadoop-hdfs-2.6.5.jar:/home/zyr/.m2/repository/org/mortbay/jetty/jetty-util/6.1.26/jetty-util-6.1.26.jar:/home/zyr/.m2/repository/xerces/xercesImpl/2.9.1/xercesImpl-2.9.1.jar:/home/zyr/.m2/repository/xml-apis/xml-apis/1.3.04/xml-apis-1.3.04.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-app/2.6.5/hadoop-mapreduce-client-app-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-common/2.6.5/hadoop-mapreduce-client-common-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-yarn-client/2.6.5/hadoop-yarn-client-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-yarn-server-common/2.6.5/hadoop-yarn-server-common-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-shuffle/2.6.5/hadoop-mapreduce-client-shuffle-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-yarn-api/2.6.5/hadoop-yarn-api-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-core/2.6.5/hadoop-mapreduce-client-core-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-yarn-common/2.6.5/hadoop-yarn-common-2.6.5.jar:/home/zyr/.m2/repository/javax/xml/bind/jaxb-api/2.2.2/jaxb-api-2.2.2.jar:/home/zyr/.m2/repository/javax/xml/stream/stax-api/1.0-2/stax-api-1.0-2.jar:/home/zyr/.m2/repository/org/codehaus/jackson/jackson-jaxrs/1.9.13/jackson-jaxrs-1.9.13.jar:/home/zyr/.m2/repository/org/codehaus/jackson/jackson-xc/1.9.13/jackson-xc-1.9.13.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-jobclient/2.6.5/hadoop-mapreduce-client-jobclient-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-annotations/2.6.5/hadoop-annotations-2.6.5.jar:/home/zyr/.m2/repository/org/apache/spark/spark-launcher_2.11/2.4.3/spark-launcher_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/apache/spark/spark-kvstore_2.11/2.4.3/spark-kvstore_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/fusesource/leveldbjni/leveldbjni-all/1.8/leveldbjni-all-1.8.jar:/home/zyr/.m2/repository/com/fasterxml/jackson/core/jackson-core/2.6.7/jackson-core-2.6.7.jar:/home/zyr/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.6.7/jackson-annotations-2.6.7.jar:/home/zyr/.m2/repository/org/apache/spark/spark-network-common_2.11/2.4.3/spark-network-common_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/apache/s
park/spark-network-shuffle_2.11/2.4.3/spark-network-shuffle_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/apache/spark/spark-unsafe_2.11/2.4.3/spark-unsafe_2.11-2.4.3.jar:/home/zyr/.m2/repository/javax/activation/activation/1.1.1/activation-1.1.1.jar:/home/zyr/.m2/repository/org/apache/curator/curator-recipes/2.6.0/curator-recipes-2.6.0.jar:/home/zyr/.m2/repository/org/apache/curator/curator-framework/2.6.0/curator-framework-2.6.0.jar:/home/zyr/.m2/repository/com/google/guava/guava/16.0.1/guava-16.0.1.jar:/home/zyr/.m2/repository/org/apache/zookeeper/zookeeper/3.4.6/zookeeper-3.4.6.jar:/home/zyr/.m2/repository/javax/servlet/javax.servlet-api/3.1.0/javax.servlet-api-3.1.0.jar:/home/zyr/.m2/repository/org/apache/commons/commons-lang3/3.5/commons-lang3-3.5.jar:/home/zyr/.m2/repository/com/google/code/findbugs/jsr305/1.3.9/jsr305-1.3.9.jar:/home/zyr/.m2/repository/org/slf4j/slf4j-api/1.7.16/slf4j-api-1.7.16.jar:/home/zyr/.m2/repository/org/slf4j/jul-to-slf4j/1.7.16/jul-to-slf4j-1.7.16.jar:/home/zyr/.m2/repository/org/slf4j/jcl-over-slf4j/1.7.16/jcl-over-slf4j-1.7.16.jar:/home/zyr/.m2/repository/log4j/log4j/1.2.17/log4j-1.2.17.jar:/home/zyr/.m2/repository/org/slf4j/slf4j-log4j12/1.7.16/slf4j-log4j12-1.7.16.jar:/home/zyr/.m2/repository/com/ning/compress-lzf/1.0.3/compress-lzf-1.0.3.jar:/home/zyr/.m2/repository/org/xerial/snappy/snappy-java/1.1.7.3/snappy-java-1.1.7.3.jar:/home/zyr/.m2/repository/org/lz4/lz4-java/1.4.0/lz4-java-1.4.0.jar:/home/zyr/.m2/repository/com/github/luben/zstd-jni/1.3.2-2/zstd-jni-1.3.2-2.jar:/home/zyr/.m2/repository/org/roaringbitmap/RoaringBitmap/0.7.45/RoaringBitmap-0.7.45.jar:/home/zyr/.m2/repository/org/roaringbitmap/shims/0.7.45/shims-0.7.45.jar:/home/zyr/.m2/repository/commons-net/commons-net/3.1/commons-net-3.1.jar:/home/zyr/.m2/repository/org/json4s/json4s-jackson_2.11/3.5.3/json4s-jackson_2.11-3.5.3.jar:/home/zyr/.m2/repository/org/json4s/json4s-core_2.11/3.5.3/json4s-core_2.11-3.5.3.jar:/home/zyr/.m2/repository/org/json4s/json4s-ast_2.11/3.5.3/json4s-ast_2.11-3.5.3.jar:/home/zyr/.m2/repository/org/json4s/json4s-scalap_2.11/3.5.3/json4s-scalap_2.11-3.5.3.jar:/home/zyr/.m2/repository/org/scala-lang/modules/scala-xml_2.11/1.0.6/scala-xml_2.11-1.0.6.jar:/home/zyr/.m2/repository/org/glassfish/jersey/core/jersey-client/2.22.2/jersey-client-2.22.2.jar:/home/zyr/.m2/repository/javax/ws/rs/javax.ws.rs-api/2.0.1/javax.ws.rs-api-2.0.1.jar:/home/zyr/.m2/repository/org/glassfish/hk2/hk2-api/2.4.0-b34/hk2-api-2.4.0-b34.jar:/home/zyr/.m2/repository/org/glassfish/hk2/hk2-utils/2.4.0-b34/hk2-utils-2.4.0-b34.jar:/home/zyr/.m2/repository/org/glassfish/hk2/external/aopalliance-repackaged/2.4.0-b34/aopalliance-repackaged-2.4.0-b34.jar:/home/zyr/.m2/repository/org/glassfish/hk2/external/javax.inject/2.4.0-b34/javax.inject-2.4.0-b34.jar:/home/zyr/.m2/repository/org/glassfish/hk2/hk2-locator/2.4.0-b34/hk2-locator-2.4.0-b34.jar:/home/zyr/.m2/repository/org/javassist/javassist/3.18.1-GA/javassist-3.18.1-GA.jar:/home/zyr/.m2/repository/org/glassfish/jersey/core/jersey-common/2.22.2/jersey-common-2.22.2.jar:/home/zyr/.m2/repository/javax/annotation/javax.annotation-api/1.2/javax.annotation-api-1.2.jar:/home/zyr/.m2/repository/org/glassfish/jersey/bundles/repackaged/jersey-guava/2.22.2/jersey-guava-2.22.2.jar:/home/zyr/.m2/repository/org/glassfish/hk2/osgi-resource-locator/1.0.1/osgi-resource-locator-1.0.1.jar:/home/zyr/.m2/repository/org/glassfish/jersey/core/jersey-server/2.22.2/jersey-server-2.22.2.jar:/home/zyr/.m2/repository/org/glassfish/jersey/media/jersey-media-jaxb/2.22.2/jersey-media
-jaxb-2.22.2.jar:/home/zyr/.m2/repository/javax/validation/validation-api/1.1.0.Final/validation-api-1.1.0.Final.jar:/home/zyr/.m2/repository/org/glassfish/jersey/containers/jersey-container-servlet/2.22.2/jersey-container-servlet-2.22.2.jar:/home/zyr/.m2/repository/org/glassfish/jersey/containers/jersey-container-servlet-core/2.22.2/jersey-container-servlet-core-2.22.2.jar:/home/zyr/.m2/repository/io/netty/netty-all/4.1.17.Final/netty-all-4.1.17.Final.jar:/home/zyr/.m2/repository/io/netty/netty/3.9.9.Final/netty-3.9.9.Final.jar:/home/zyr/.m2/repository/com/clearspring/analytics/stream/2.7.0/stream-2.7.0.jar:/home/zyr/.m2/repository/io/dropwizard/metrics/metrics-core/3.1.5/metrics-core-3.1.5.jar:/home/zyr/.m2/repository/io/dropwizard/metrics/metrics-jvm/3.1.5/metrics-jvm-3.1.5.jar:/home/zyr/.m2/repository/io/dropwizard/metrics/metrics-json/3.1.5/metrics-json-3.1.5.jar:/home/zyr/.m2/repository/io/dropwizard/metrics/metrics-graphite/3.1.5/metrics-graphite-3.1.5.jar:/home/zyr/.m2/repository/com/fasterxml/jackson/core/jackson-databind/2.6.7.1/jackson-databind-2.6.7.1.jar:/home/zyr/.m2/repository/com/fasterxml/jackson/module/jackson-module-scala_2.11/2.6.7.1/jackson-module-scala_2.11-2.6.7.1.jar:/home/zyr/.m2/repository/org/scala-lang/scala-reflect/2.11.8/scala-reflect-2.11.8.jar:/home/zyr/.m2/repository/com/fasterxml/jackson/module/jackson-module-paranamer/2.7.9/jackson-module-paranamer-2.7.9.jar:/home/zyr/.m2/repository/org/apache/ivy/ivy/2.4.0/ivy-2.4.0.jar:/home/zyr/.m2/repository/oro/oro/2.0.8/oro-2.0.8.jar:/home/zyr/.m2/repository/net/razorvine/pyrolite/4.13/pyrolite-4.13.jar:/home/zyr/.m2/repository/net/sf/py4j/py4j/0.10.7/py4j-0.10.7.jar:/home/zyr/.m2/repository/org/apache/commons/commons-crypto/1.0.0/commons-crypto-1.0.0.jar:/home/zyr/.m2/repository/org/apache/spark/spark-streaming_2.11/2.4.3/spark-streaming_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/apache/spark/spark-sql_2.11/2.4.3/spark-sql_2.11-2.4.3.jar:/home/zyr/.m2/repository/com/univocity/univocity-parsers/2.7.3/univocity-parsers-2.7.3.jar:/home/zyr/.m2/repository/org/apache/spark/spark-sketch_2.11/2.4.3/spark-sketch_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/apache/spark/spark-catalyst_2.11/2.4.3/spark-catalyst_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/codehaus/janino/janino/3.0.9/janino-3.0.9.jar:/home/zyr/.m2/repository/org/codehaus/janino/commons-compiler/3.0.9/commons-compiler-3.0.9.jar:/home/zyr/.m2/repository/org/antlr/antlr4-runtime/4.7/antlr4-runtime-4.7.jar:/home/zyr/.m2/repository/org/apache/orc/orc-core/1.5.5/orc-core-1.5.5-nohive.jar:/home/zyr/.m2/repository/org/apache/orc/orc-shims/1.5.5/orc-shims-1.5.5.jar:/home/zyr/.m2/repository/com/google/protobuf/protobuf-java/2.5.0/protobuf-java-2.5.0.jar:/home/zyr/.m2/repository/commons-lang/commons-lang/2.6/commons-lang-2.6.jar:/home/zyr/.m2/repository/io/airlift/aircompressor/0.10/aircompressor-0.10.jar:/home/zyr/.m2/repository/org/apache/orc/orc-mapreduce/1.5.5/orc-mapreduce-1.5.5-nohive.jar:/home/zyr/.m2/repository/org/apache/parquet/parquet-column/1.10.1/parquet-column-1.10.1.jar:/home/zyr/.m2/repository/org/apache/parquet/parquet-common/1.10.1/parquet-common-1.10.1.jar:/home/zyr/.m2/repository/org/apache/parquet/parquet-encoding/1.10.1/parquet-encoding-1.10.1.jar:/home/zyr/.m2/repository/org/apache/parquet/parquet-hadoop/1.10.1/parquet-hadoop-1.10.1.jar:/home/zyr/.m2/repository/org/apache/parquet/parquet-format/2.4.0/parquet-format-2.4.0.jar:/home/zyr/.m2/repository/org/apache/parquet/parquet-jackson/1.10.1/parquet-jackson-1.10.1.jar:/home/zyr/.m2/repository/org/ap
ache/arrow/arrow-vector/0.10.0/arrow-vector-0.10.0.jar:/home/zyr/.m2/repository/org/apache/arrow/arrow-format/0.10.0/arrow-format-0.10.0.jar:/home/zyr/.m2/repository/org/apache/arrow/arrow-memory/0.10.0/arrow-memory-0.10.0.jar:/home/zyr/.m2/repository/joda-time/joda-time/2.9.9/joda-time-2.9.9.jar:/home/zyr/.m2/repository/com/carrotsearch/hppc/0.7.2/hppc-0.7.2.jar:/home/zyr/.m2/repository/com/vlkan/flatbuffers/1.2.0-3f79e055/flatbuffers-1.2.0-3f79e055.jar:/home/zyr/.m2/repository/org/apache/spark/spark-graphx_2.11/2.4.3/spark-graphx_2.11-2.4.3.jar:/home/zyr/.m2/repository/com/github/fommil/netlib/core/1.1.2/core-1.1.2.jar:/home/zyr/.m2/repository/net/sourceforge/f2j/arpack_combined_all/0.1/arpack_combined_all-0.1.jar:/home/zyr/.m2/repository/org/apache/spark/spark-mllib-local_2.11/2.4.3/spark-mllib-local_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/scalanlp/breeze_2.11/0.13.2/breeze_2.11-0.13.2.jar:/home/zyr/.m2/repository/org/scalanlp/breeze-macros_2.11/0.13.2/breeze-macros_2.11-0.13.2.jar:/home/zyr/.m2/repository/net/sf/opencsv/opencsv/2.3/opencsv-2.3.jar:/home/zyr/.m2/repository/com/github/rwl/jtransforms/2.4.0/jtransforms-2.4.0.jar:/home/zyr/.m2/repository/org/spire-math/spire_2.11/0.13.0/spire_2.11-0.13.0.jar:/home/zyr/.m2/repository/org/spire-math/spire-macros_2.11/0.13.0/spire-macros_2.11-0.13.0.jar:/home/zyr/.m2/repository/org/typelevel/machinist_2.11/0.6.1/machinist_2.11-0.6.1.jar:/home/zyr/.m2/repository/com/chuusai/shapeless_2.11/2.3.2/shapeless_2.11-2.3.2.jar:/home/zyr/.m2/repository/org/typelevel/macro-compat_2.11/1.1.1/macro-compat_2.11-1.1.1.jar:/home/zyr/.m2/repository/org/apache/commons/commons-math3/3.4.1/commons-math3-3.4.1.jar:/home/zyr/.m2/repository/org/apache/spark/spark-tags_2.11/2.4.3/spark-tags_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/spark-project/spark/unused/1.0.0/unused-1.0.0.jar com.kmeans.MyTest
     2 Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
     3 19/07/10 11:36:47 WARN Utils: Your hostname, zyrpc resolves to a loopback address: 127.0.1.1; using 192.168.31.160 instead (on interface ens33)
     4 19/07/10 11:36:47 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
     5 19/07/10 11:36:47 INFO SparkContext: Running Spark version 2.4.3
     6 19/07/10 11:36:49 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
     7 19/07/10 11:36:50 INFO SparkContext: Submitted application: Simple Application
     8 19/07/10 11:36:50 INFO SecurityManager: Changing view acls to: zyr
     9 19/07/10 11:36:50 INFO SecurityManager: Changing modify acls to: zyr
    10 19/07/10 11:36:50 INFO SecurityManager: Changing view acls groups to: 
    11 19/07/10 11:36:50 INFO SecurityManager: Changing modify acls groups to: 
    12 19/07/10 11:36:50 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(zyr); groups with view permissions: Set(); users  with modify permissions: Set(zyr); groups with modify permissions: Set()
    13 19/07/10 11:36:52 INFO Utils: Successfully started service 'sparkDriver' on port 41147.
    14 19/07/10 11:36:52 INFO SparkEnv: Registering MapOutputTracker
    15 19/07/10 11:36:52 INFO SparkEnv: Registering BlockManagerMaster
    16 19/07/10 11:36:52 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
    17 19/07/10 11:36:52 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
    18 19/07/10 11:36:52 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-63b48034-1ffc-40fa-bb45-6c117cb0451b
    19 19/07/10 11:36:52 INFO MemoryStore: MemoryStore started with capacity 345.0 MB
    20 19/07/10 11:36:52 INFO SparkEnv: Registering OutputCommitCoordinator
    21 19/07/10 11:36:53 INFO Utils: Successfully started service 'SparkUI' on port 4040.
    22 19/07/10 11:36:53 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.31.160:4040
    23 19/07/10 11:36:54 INFO Executor: Starting executor ID driver on host localhost
    24 19/07/10 11:36:54 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34263.
    25 19/07/10 11:36:54 INFO NettyBlockTransferService: Server created on 192.168.31.160:34263
    26 19/07/10 11:36:54 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
    27 19/07/10 11:36:54 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.31.160, 34263, None)
    28 19/07/10 11:36:54 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.31.160:34263 with 345.0 MB RAM, BlockManagerId(driver, 192.168.31.160, 34263, None)
    29 19/07/10 11:36:54 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.31.160, 34263, None)
    30 19/07/10 11:36:54 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.31.160, 34263, None)
    31 19/07/10 11:36:57 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 214.6 KB, free 344.8 MB)
    32 19/07/10 11:36:57 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 20.4 KB, free 344.8 MB)
    33 19/07/10 11:36:57 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.31.160:34263 (size: 20.4 KB, free: 345.0 MB)
    34 19/07/10 11:36:57 INFO SparkContext: Created broadcast 0 from textFile at MyTest.scala:11
    35 19/07/10 11:36:58 INFO FileInputFormat: Total input paths to process : 1
    36 19/07/10 11:36:58 INFO SparkContext: Starting job: count at MyTest.scala:12
    37 19/07/10 11:36:58 INFO DAGScheduler: Got job 0 (count at MyTest.scala:12) with 2 output partitions
    38 19/07/10 11:36:58 INFO DAGScheduler: Final stage: ResultStage 0 (count at MyTest.scala:12)
    39 19/07/10 11:36:58 INFO DAGScheduler: Parents of final stage: List()
    40 19/07/10 11:36:58 INFO DAGScheduler: Missing parents: List()
    41 19/07/10 11:36:58 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[3] at filter at MyTest.scala:12), which has no missing parents
    42 19/07/10 11:36:58 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.7 KB, free 344.8 MB)
    43 19/07/10 11:36:58 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.1 KB, free 344.8 MB)
    44 19/07/10 11:36:58 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.31.160:34263 (size: 2.1 KB, free: 345.0 MB)
    45 19/07/10 11:36:58 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1161
    46 19/07/10 11:36:58 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[3] at filter at MyTest.scala:12) (first 15 tasks are for partitions Vector(0, 1))
    47 19/07/10 11:36:58 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
    48 19/07/10 11:36:58 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 7883 bytes)
    49 19/07/10 11:36:58 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 7883 bytes)
    50 19/07/10 11:36:58 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
    51 19/07/10 11:36:58 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
    52 19/07/10 11:36:59 INFO HadoopRDD: Input split: file:/home/zyr/file.txt:0+29
    53 19/07/10 11:36:59 INFO HadoopRDD: Input split: file:/home/zyr/file.txt:29+29
    54 19/07/10 11:36:59 INFO MemoryStore: Block rdd_1_0 stored as values in memory (estimated size 192.0 B, free 344.8 MB)
    55 19/07/10 11:36:59 INFO MemoryStore: Block rdd_1_1 stored as values in memory (estimated size 96.0 B, free 344.8 MB)
    56 19/07/10 11:36:59 INFO BlockManagerInfo: Added rdd_1_1 in memory on 192.168.31.160:34263 (size: 96.0 B, free: 345.0 MB)
    57 19/07/10 11:36:59 INFO BlockManagerInfo: Added rdd_1_0 in memory on 192.168.31.160:34263 (size: 192.0 B, free: 345.0 MB)
    58 19/07/10 11:36:59 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 875 bytes result sent to driver
    59 19/07/10 11:36:59 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 875 bytes result sent to driver
    60 19/07/10 11:36:59 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 635 ms on localhost (executor driver) (1/2)
    61 19/07/10 11:36:59 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 593 ms on localhost (executor driver) (2/2)
    62 19/07/10 11:36:59 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
    63 19/07/10 11:36:59 INFO DAGScheduler: ResultStage 0 (count at MyTest.scala:12) finished in 1.203 s
    64 19/07/10 11:36:59 INFO DAGScheduler: Job 0 finished: count at MyTest.scala:12, took 1.438576 s
    65 Words with a : 11
    66 19/07/10 11:36:59 INFO SparkUI: Stopped Spark web UI at http://192.168.31.160:4040
    67 19/07/10 11:36:59 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
    68 19/07/10 11:36:59 INFO MemoryStore: MemoryStore cleared
    69 19/07/10 11:36:59 INFO BlockManager: BlockManager stopped
    70 19/07/10 11:36:59 INFO BlockManagerMaster: BlockManagerMaster stopped
    71 19/07/10 11:36:59 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
    72 19/07/10 11:36:59 INFO SparkContext: Successfully stopped SparkContext
    73 19/07/10 11:36:59 INFO ShutdownHookManager: Shutdown hook called
    74 19/07/10 11:36:59 INFO ShutdownHookManager: Deleting directory /tmp/spark-e07b1de0-0ac0-4abe-952a-504c2c7282fd
    75 
    76 Process finished with exit code 0

The second program:

package com.kmeans

import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.{SparkConf, SparkContext}


/**
  * K-means clustering in Scala: assign three-dimensional points to clusters.
  * ********************
  * Test data (x, y, z) *
  * ********************
  * 0.0 0.0 0.0
  * 0.1 0.1 0.1
  * 0.2 0.2 0.2
  * 9.0 9.0 9.0
  * 9.1 9.1 9.1
  * 9.2 9.2 9.2
  */
object Kmeans {
  def main(args: Array[String]): Unit = {

    val conf = new SparkConf().setAppName("Simple Application").setMaster("local[2]")
    val context = new SparkContext(conf)
    val dataSourceRDD = context.textFile("file:///home/zyr/kmeanstest.txt").cache()
    val trainRDD = dataSourceRDD.map(lines => Vectors.dense(lines.split(" ").map(_.toDouble)))
    // trainRDD.foreach(trainRow => println(trainRow))
    // trainRDD.foreach(println)
    // Train a model on the data
    // Argument 1: training data (an RDD of Vectors)
    // Argument 2: number of clusters
    // Argument 3: number of iterations
    val model = KMeans.train(trainRDD, 3, 30)

    // Get the model's cluster centers
    val clustercenters = model.clusterCenters

    // Print the cluster centers
    clustercenters.foreach(println)

    // Compute the cost (error)
    val cross = model.computeCost(trainRDD)
    println("Cost: " + cross)

    // Use the model to predict the cluster of some test points
    val res1 = model.predict(Vectors.dense("0.2 0.2 0.2".split(' ').map(_.toDouble)))
    val res2 = model.predict(Vectors.dense("0.25 0.25 0.25".split(' ').map(_.toDouble)))
    val res3 = model.predict(Vectors.dense("0.1 0.1 0.1".split(' ').map(_.toDouble)))
    val res4 = model.predict(Vectors.dense("9 9 9".split(' ').map(_.toDouble)))
    val res5 = model.predict(Vectors.dense("9.1 9.1 9.1".split(' ').map(_.toDouble)))
    val res6 = model.predict(Vectors.dense("9.06 9.06 9.06".split(' ').map(_.toDouble)))
    // println("Predictions:\n" + res1 + "\n" + res2 + "\n" + res3 + "\n" + res4 + "\n" + res5 + "\n" + res6)
    /**
      * These are the three cluster centers:
      * [9.1,9.1,9.1]
      * [0.05,0.05,0.05]
      * [0.2,0.2,0.2]
      * And the predicted cluster indices:
      * 2
      * 2
      * 1
      * 0
      * 0
      * 0
      * As the results show, a point is assigned to whichever cluster center it is closest to.
      */
    // Cross-check by predicting on the original data
    val crossPredictRes = dataSourceRDD.map {
      lines =>
        val lineVectors = Vectors.dense(lines.split(" ").map(_.toDouble))
        val predictRes = model.predict(lineVectors)
        lineVectors + "==>" + predictRes
    }
    crossPredictRes.foreach(println)

    /**
      * [9.0,9.0,9.0]==>0
      * [9.1,9.1,9.1]==>0
      * [9.2,9.2,9.2]==>0
      * [0.0,0.0,0.0]==>1
      * [0.1,0.1,0.1]==>1
      * [0.2,0.2,0.2]==>2
      *
      */
  }
}
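A possible follow-up once the clustering looks reasonable (not part of the program above) is to persist the trained model and reload it later; spark.mllib's KMeansModel supports this directly. A sketch, reusing the model and context values from the program and a hypothetical output path:

  import org.apache.spark.mllib.clustering.KMeansModel

  // hypothetical path; adjust to your environment
  model.save(context, "file:///home/zyr/kmeans-model")
  val reloaded = KMeansModel.load(context, "file:///home/zyr/kmeans-model")
  println(reloaded.clusterCenters.mkString(", "))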

The input file:

0.0 0.0 0.0
0.1 0.1 0.1
0.2 0.2 0.2
9.0 9.0 9.0
9.1 9.1 9.1
9.2 9.2 9.2

Run output:

      1 /usr/lib/jvm/java-8-openjdk-amd64/bin/java -javaagent:/usr/local/idea/lib/idea_rt.jar=34781:/usr/local/idea/bin -Dfile.encoding=UTF-8 -classpath /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/charsets.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/cldrdata.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/dnsns.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/icedtea-sound.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/jaccess.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/java-atk-wrapper.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/localedata.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/nashorn.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/sunec.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/sunjce_provider.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/sunpkcs11.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/zipfs.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/jce.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/jsse.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/management-agent.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/resources.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar:/home/zyr/IdeaProjects/myspark/target/classes:/home/zyr/.m2/repository/org/scala-lang/scala-reflect/2.11.0/scala-reflect-2.11.0.jar:/home/zyr/.m2/repository/org/scala-lang/scala-library/2.11.0/scala-library-2.11.0.jar:/home/zyr/.m2/repository/org/apache/spark/spark-mllib_2.11/2.4.3/spark-mllib_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/scala-lang/modules/scala-parser-combinators_2.11/1.1.0/scala-parser-combinators_2.11-1.1.0.jar:/home/zyr/.m2/repository/org/scala-lang/scala-library/2.11.12/scala-library-2.11.12.jar:/home/zyr/.m2/repository/org/apache/spark/spark-core_2.11/2.4.3/spark-core_2.11-2.4.3.jar:/home/zyr/.m2/repository/com/thoughtworks/paranamer/paranamer/2.8/paranamer-2.8.jar:/home/zyr/.m2/repository/org/apache/avro/avro/1.8.2/avro-1.8.2.jar:/home/zyr/.m2/repository/org/codehaus/jackson/jackson-core-asl/1.9.13/jackson-core-asl-1.9.13.jar:/home/zyr/.m2/repository/org/codehaus/jackson/jackson-mapper-asl/1.9.13/jackson-mapper-asl-1.9.13.jar:/home/zyr/.m2/repository/org/apache/commons/commons-compress/1.8.1/commons-compress-1.8.1.jar:/home/zyr/.m2/repository/org/tukaani/xz/1.5/xz-1.5.jar:/home/zyr/.m2/repository/org/apache/avro/avro-mapred/1.8.2/avro-mapred-1.8.2-hadoop2.jar:/home/zyr/.m2/repository/org/apache/avro/avro-ipc/1.8.2/avro-ipc-1.8.2.jar:/home/zyr/.m2/repository/commons-codec/commons-codec/1.9/commons-codec-1.9.jar:/home/zyr/.m2/repository/com/twitter/chill_2.11/0.9.3/chill_2.11-0.9.3.jar:/home/zyr/.m2/repository/com/esotericsoftware/kryo-shaded/4.0.2/kryo-shaded-4.0.2.jar:/home/zyr/.m2/repository/com/esotericsoftware/minlog/1.3.0/minlog-1.3.0.jar:/home/zyr/.m2/repository/org/objenesis/objenesis/2.5.1/objenesis-2.5.1.jar:/home/zyr/.m2/repository/com/twitter/chill-java/0.9.3/chill-java-0.9.3.jar:/home/zyr/.m2/repository/org/apache/xbean/xbean-asm6-shaded/4.8/xbean-asm6-shaded-4.8.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-client/2.6.5/hadoop-client-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-common/2.6.5/hadoop-common-2.6.5.jar:/home/zyr/.m2/repository/commons-cli/commons-cli/1.2/commons-cli-1.2.jar:/home/zyr/.m2/repository/xmlenc/xmlenc/0.52/xmlenc-0.52.jar:/home/zyr/.m2/repository/commons-httpclient/commons-httpclient/3.1/commons-httpclient-3.1.jar:/home/zyr/.m2/repository/commons-io/commons-io/2.4/commons-io-2.4.jar:/home/zyr/.m2/repository/commons-collections/commons-collections/3.2.2/commons-collections-3.2.2.jar:/home/zyr/.m2
/repository/commons-configuration/commons-configuration/1.6/commons-configuration-1.6.jar:/home/zyr/.m2/repository/commons-digester/commons-digester/1.8/commons-digester-1.8.jar:/home/zyr/.m2/repository/commons-beanutils/commons-beanutils/1.7.0/commons-beanutils-1.7.0.jar:/home/zyr/.m2/repository/com/google/code/gson/gson/2.2.4/gson-2.2.4.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-auth/2.6.5/hadoop-auth-2.6.5.jar:/home/zyr/.m2/repository/org/apache/httpcomponents/httpclient/4.2.5/httpclient-4.2.5.jar:/home/zyr/.m2/repository/org/apache/httpcomponents/httpcore/4.2.4/httpcore-4.2.4.jar:/home/zyr/.m2/repository/org/apache/directory/server/apacheds-kerberos-codec/2.0.0-M15/apacheds-kerberos-codec-2.0.0-M15.jar:/home/zyr/.m2/repository/org/apache/directory/server/apacheds-i18n/2.0.0-M15/apacheds-i18n-2.0.0-M15.jar:/home/zyr/.m2/repository/org/apache/directory/api/api-asn1-api/1.0.0-M20/api-asn1-api-1.0.0-M20.jar:/home/zyr/.m2/repository/org/apache/directory/api/api-util/1.0.0-M20/api-util-1.0.0-M20.jar:/home/zyr/.m2/repository/org/apache/curator/curator-client/2.6.0/curator-client-2.6.0.jar:/home/zyr/.m2/repository/org/htrace/htrace-core/3.0.4/htrace-core-3.0.4.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-hdfs/2.6.5/hadoop-hdfs-2.6.5.jar:/home/zyr/.m2/repository/org/mortbay/jetty/jetty-util/6.1.26/jetty-util-6.1.26.jar:/home/zyr/.m2/repository/xerces/xercesImpl/2.9.1/xercesImpl-2.9.1.jar:/home/zyr/.m2/repository/xml-apis/xml-apis/1.3.04/xml-apis-1.3.04.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-app/2.6.5/hadoop-mapreduce-client-app-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-common/2.6.5/hadoop-mapreduce-client-common-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-yarn-client/2.6.5/hadoop-yarn-client-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-yarn-server-common/2.6.5/hadoop-yarn-server-common-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-shuffle/2.6.5/hadoop-mapreduce-client-shuffle-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-yarn-api/2.6.5/hadoop-yarn-api-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-core/2.6.5/hadoop-mapreduce-client-core-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-yarn-common/2.6.5/hadoop-yarn-common-2.6.5.jar:/home/zyr/.m2/repository/javax/xml/bind/jaxb-api/2.2.2/jaxb-api-2.2.2.jar:/home/zyr/.m2/repository/javax/xml/stream/stax-api/1.0-2/stax-api-1.0-2.jar:/home/zyr/.m2/repository/org/codehaus/jackson/jackson-jaxrs/1.9.13/jackson-jaxrs-1.9.13.jar:/home/zyr/.m2/repository/org/codehaus/jackson/jackson-xc/1.9.13/jackson-xc-1.9.13.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-jobclient/2.6.5/hadoop-mapreduce-client-jobclient-2.6.5.jar:/home/zyr/.m2/repository/org/apache/hadoop/hadoop-annotations/2.6.5/hadoop-annotations-2.6.5.jar:/home/zyr/.m2/repository/org/apache/spark/spark-launcher_2.11/2.4.3/spark-launcher_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/apache/spark/spark-kvstore_2.11/2.4.3/spark-kvstore_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/fusesource/leveldbjni/leveldbjni-all/1.8/leveldbjni-all-1.8.jar:/home/zyr/.m2/repository/com/fasterxml/jackson/core/jackson-core/2.6.7/jackson-core-2.6.7.jar:/home/zyr/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.6.7/jackson-annotations-2.6.7.jar:/home/zyr/.m2/repository/org/apache/spark/spark-network-common_2.11/2.4.3/spark-network-common_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/apache/
spark/spark-network-shuffle_2.11/2.4.3/spark-network-shuffle_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/apache/spark/spark-unsafe_2.11/2.4.3/spark-unsafe_2.11-2.4.3.jar:/home/zyr/.m2/repository/javax/activation/activation/1.1.1/activation-1.1.1.jar:/home/zyr/.m2/repository/org/apache/curator/curator-recipes/2.6.0/curator-recipes-2.6.0.jar:/home/zyr/.m2/repository/org/apache/curator/curator-framework/2.6.0/curator-framework-2.6.0.jar:/home/zyr/.m2/repository/com/google/guava/guava/16.0.1/guava-16.0.1.jar:/home/zyr/.m2/repository/org/apache/zookeeper/zookeeper/3.4.6/zookeeper-3.4.6.jar:/home/zyr/.m2/repository/javax/servlet/javax.servlet-api/3.1.0/javax.servlet-api-3.1.0.jar:/home/zyr/.m2/repository/org/apache/commons/commons-lang3/3.5/commons-lang3-3.5.jar:/home/zyr/.m2/repository/com/google/code/findbugs/jsr305/1.3.9/jsr305-1.3.9.jar:/home/zyr/.m2/repository/org/slf4j/slf4j-api/1.7.16/slf4j-api-1.7.16.jar:/home/zyr/.m2/repository/org/slf4j/jul-to-slf4j/1.7.16/jul-to-slf4j-1.7.16.jar:/home/zyr/.m2/repository/org/slf4j/jcl-over-slf4j/1.7.16/jcl-over-slf4j-1.7.16.jar:/home/zyr/.m2/repository/log4j/log4j/1.2.17/log4j-1.2.17.jar:/home/zyr/.m2/repository/org/slf4j/slf4j-log4j12/1.7.16/slf4j-log4j12-1.7.16.jar:/home/zyr/.m2/repository/com/ning/compress-lzf/1.0.3/compress-lzf-1.0.3.jar:/home/zyr/.m2/repository/org/xerial/snappy/snappy-java/1.1.7.3/snappy-java-1.1.7.3.jar:/home/zyr/.m2/repository/org/lz4/lz4-java/1.4.0/lz4-java-1.4.0.jar:/home/zyr/.m2/repository/com/github/luben/zstd-jni/1.3.2-2/zstd-jni-1.3.2-2.jar:/home/zyr/.m2/repository/org/roaringbitmap/RoaringBitmap/0.7.45/RoaringBitmap-0.7.45.jar:/home/zyr/.m2/repository/org/roaringbitmap/shims/0.7.45/shims-0.7.45.jar:/home/zyr/.m2/repository/commons-net/commons-net/3.1/commons-net-3.1.jar:/home/zyr/.m2/repository/org/json4s/json4s-jackson_2.11/3.5.3/json4s-jackson_2.11-3.5.3.jar:/home/zyr/.m2/repository/org/json4s/json4s-core_2.11/3.5.3/json4s-core_2.11-3.5.3.jar:/home/zyr/.m2/repository/org/json4s/json4s-ast_2.11/3.5.3/json4s-ast_2.11-3.5.3.jar:/home/zyr/.m2/repository/org/json4s/json4s-scalap_2.11/3.5.3/json4s-scalap_2.11-3.5.3.jar:/home/zyr/.m2/repository/org/scala-lang/modules/scala-xml_2.11/1.0.6/scala-xml_2.11-1.0.6.jar:/home/zyr/.m2/repository/org/glassfish/jersey/core/jersey-client/2.22.2/jersey-client-2.22.2.jar:/home/zyr/.m2/repository/javax/ws/rs/javax.ws.rs-api/2.0.1/javax.ws.rs-api-2.0.1.jar:/home/zyr/.m2/repository/org/glassfish/hk2/hk2-api/2.4.0-b34/hk2-api-2.4.0-b34.jar:/home/zyr/.m2/repository/org/glassfish/hk2/hk2-utils/2.4.0-b34/hk2-utils-2.4.0-b34.jar:/home/zyr/.m2/repository/org/glassfish/hk2/external/aopalliance-repackaged/2.4.0-b34/aopalliance-repackaged-2.4.0-b34.jar:/home/zyr/.m2/repository/org/glassfish/hk2/external/javax.inject/2.4.0-b34/javax.inject-2.4.0-b34.jar:/home/zyr/.m2/repository/org/glassfish/hk2/hk2-locator/2.4.0-b34/hk2-locator-2.4.0-b34.jar:/home/zyr/.m2/repository/org/javassist/javassist/3.18.1-GA/javassist-3.18.1-GA.jar:/home/zyr/.m2/repository/org/glassfish/jersey/core/jersey-common/2.22.2/jersey-common-2.22.2.jar:/home/zyr/.m2/repository/javax/annotation/javax.annotation-api/1.2/javax.annotation-api-1.2.jar:/home/zyr/.m2/repository/org/glassfish/jersey/bundles/repackaged/jersey-guava/2.22.2/jersey-guava-2.22.2.jar:/home/zyr/.m2/repository/org/glassfish/hk2/osgi-resource-locator/1.0.1/osgi-resource-locator-1.0.1.jar:/home/zyr/.m2/repository/org/glassfish/jersey/core/jersey-server/2.22.2/jersey-server-2.22.2.jar:/home/zyr/.m2/repository/org/glassfish/jersey/media/jersey-media-jaxb/2.22.2/jersey-medi
a-jaxb-2.22.2.jar:/home/zyr/.m2/repository/javax/validation/validation-api/1.1.0.Final/validation-api-1.1.0.Final.jar:/home/zyr/.m2/repository/org/glassfish/jersey/containers/jersey-container-servlet/2.22.2/jersey-container-servlet-2.22.2.jar:/home/zyr/.m2/repository/org/glassfish/jersey/containers/jersey-container-servlet-core/2.22.2/jersey-container-servlet-core-2.22.2.jar:/home/zyr/.m2/repository/io/netty/netty-all/4.1.17.Final/netty-all-4.1.17.Final.jar:/home/zyr/.m2/repository/io/netty/netty/3.9.9.Final/netty-3.9.9.Final.jar:/home/zyr/.m2/repository/com/clearspring/analytics/stream/2.7.0/stream-2.7.0.jar:/home/zyr/.m2/repository/io/dropwizard/metrics/metrics-core/3.1.5/metrics-core-3.1.5.jar:/home/zyr/.m2/repository/io/dropwizard/metrics/metrics-jvm/3.1.5/metrics-jvm-3.1.5.jar:/home/zyr/.m2/repository/io/dropwizard/metrics/metrics-json/3.1.5/metrics-json-3.1.5.jar:/home/zyr/.m2/repository/io/dropwizard/metrics/metrics-graphite/3.1.5/metrics-graphite-3.1.5.jar:/home/zyr/.m2/repository/com/fasterxml/jackson/core/jackson-databind/2.6.7.1/jackson-databind-2.6.7.1.jar:/home/zyr/.m2/repository/com/fasterxml/jackson/module/jackson-module-scala_2.11/2.6.7.1/jackson-module-scala_2.11-2.6.7.1.jar:/home/zyr/.m2/repository/org/scala-lang/scala-reflect/2.11.8/scala-reflect-2.11.8.jar:/home/zyr/.m2/repository/com/fasterxml/jackson/module/jackson-module-paranamer/2.7.9/jackson-module-paranamer-2.7.9.jar:/home/zyr/.m2/repository/org/apache/ivy/ivy/2.4.0/ivy-2.4.0.jar:/home/zyr/.m2/repository/oro/oro/2.0.8/oro-2.0.8.jar:/home/zyr/.m2/repository/net/razorvine/pyrolite/4.13/pyrolite-4.13.jar:/home/zyr/.m2/repository/net/sf/py4j/py4j/0.10.7/py4j-0.10.7.jar:/home/zyr/.m2/repository/org/apache/commons/commons-crypto/1.0.0/commons-crypto-1.0.0.jar:/home/zyr/.m2/repository/org/apache/spark/spark-streaming_2.11/2.4.3/spark-streaming_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/apache/spark/spark-sql_2.11/2.4.3/spark-sql_2.11-2.4.3.jar:/home/zyr/.m2/repository/com/univocity/univocity-parsers/2.7.3/univocity-parsers-2.7.3.jar:/home/zyr/.m2/repository/org/apache/spark/spark-sketch_2.11/2.4.3/spark-sketch_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/apache/spark/spark-catalyst_2.11/2.4.3/spark-catalyst_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/codehaus/janino/janino/3.0.9/janino-3.0.9.jar:/home/zyr/.m2/repository/org/codehaus/janino/commons-compiler/3.0.9/commons-compiler-3.0.9.jar:/home/zyr/.m2/repository/org/antlr/antlr4-runtime/4.7/antlr4-runtime-4.7.jar:/home/zyr/.m2/repository/org/apache/orc/orc-core/1.5.5/orc-core-1.5.5-nohive.jar:/home/zyr/.m2/repository/org/apache/orc/orc-shims/1.5.5/orc-shims-1.5.5.jar:/home/zyr/.m2/repository/com/google/protobuf/protobuf-java/2.5.0/protobuf-java-2.5.0.jar:/home/zyr/.m2/repository/commons-lang/commons-lang/2.6/commons-lang-2.6.jar:/home/zyr/.m2/repository/io/airlift/aircompressor/0.10/aircompressor-0.10.jar:/home/zyr/.m2/repository/org/apache/orc/orc-mapreduce/1.5.5/orc-mapreduce-1.5.5-nohive.jar:/home/zyr/.m2/repository/org/apache/parquet/parquet-column/1.10.1/parquet-column-1.10.1.jar:/home/zyr/.m2/repository/org/apache/parquet/parquet-common/1.10.1/parquet-common-1.10.1.jar:/home/zyr/.m2/repository/org/apache/parquet/parquet-encoding/1.10.1/parquet-encoding-1.10.1.jar:/home/zyr/.m2/repository/org/apache/parquet/parquet-hadoop/1.10.1/parquet-hadoop-1.10.1.jar:/home/zyr/.m2/repository/org/apache/parquet/parquet-format/2.4.0/parquet-format-2.4.0.jar:/home/zyr/.m2/repository/org/apache/parquet/parquet-jackson/1.10.1/parquet-jackson-1.10.1.jar:/home/zyr/.m2/repository/org/a
pache/arrow/arrow-vector/0.10.0/arrow-vector-0.10.0.jar:/home/zyr/.m2/repository/org/apache/arrow/arrow-format/0.10.0/arrow-format-0.10.0.jar:/home/zyr/.m2/repository/org/apache/arrow/arrow-memory/0.10.0/arrow-memory-0.10.0.jar:/home/zyr/.m2/repository/joda-time/joda-time/2.9.9/joda-time-2.9.9.jar:/home/zyr/.m2/repository/com/carrotsearch/hppc/0.7.2/hppc-0.7.2.jar:/home/zyr/.m2/repository/com/vlkan/flatbuffers/1.2.0-3f79e055/flatbuffers-1.2.0-3f79e055.jar:/home/zyr/.m2/repository/org/apache/spark/spark-graphx_2.11/2.4.3/spark-graphx_2.11-2.4.3.jar:/home/zyr/.m2/repository/com/github/fommil/netlib/core/1.1.2/core-1.1.2.jar:/home/zyr/.m2/repository/net/sourceforge/f2j/arpack_combined_all/0.1/arpack_combined_all-0.1.jar:/home/zyr/.m2/repository/org/apache/spark/spark-mllib-local_2.11/2.4.3/spark-mllib-local_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/scalanlp/breeze_2.11/0.13.2/breeze_2.11-0.13.2.jar:/home/zyr/.m2/repository/org/scalanlp/breeze-macros_2.11/0.13.2/breeze-macros_2.11-0.13.2.jar:/home/zyr/.m2/repository/net/sf/opencsv/opencsv/2.3/opencsv-2.3.jar:/home/zyr/.m2/repository/com/github/rwl/jtransforms/2.4.0/jtransforms-2.4.0.jar:/home/zyr/.m2/repository/org/spire-math/spire_2.11/0.13.0/spire_2.11-0.13.0.jar:/home/zyr/.m2/repository/org/spire-math/spire-macros_2.11/0.13.0/spire-macros_2.11-0.13.0.jar:/home/zyr/.m2/repository/org/typelevel/machinist_2.11/0.6.1/machinist_2.11-0.6.1.jar:/home/zyr/.m2/repository/com/chuusai/shapeless_2.11/2.3.2/shapeless_2.11-2.3.2.jar:/home/zyr/.m2/repository/org/typelevel/macro-compat_2.11/1.1.1/macro-compat_2.11-1.1.1.jar:/home/zyr/.m2/repository/org/apache/commons/commons-math3/3.4.1/commons-math3-3.4.1.jar:/home/zyr/.m2/repository/org/apache/spark/spark-tags_2.11/2.4.3/spark-tags_2.11-2.4.3.jar:/home/zyr/.m2/repository/org/spark-project/spark/unused/1.0.0/unused-1.0.0.jar com.kmeans.Kmeans
      2 Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
      3 19/07/10 11:40:11 WARN Utils: Your hostname, zyrpc resolves to a loopback address: 127.0.1.1; using 192.168.31.160 instead (on interface ens33)
      4 19/07/10 11:40:11 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
      5 19/07/10 11:40:11 INFO SparkContext: Running Spark version 2.4.3
      6 19/07/10 11:40:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      7 19/07/10 11:40:14 INFO SparkContext: Submitted application: Simple Application
      8 19/07/10 11:40:15 INFO SecurityManager: Changing view acls to: zyr
      9 19/07/10 11:40:15 INFO SecurityManager: Changing modify acls to: zyr
     10 19/07/10 11:40:15 INFO SecurityManager: Changing view acls groups to: 
     11 19/07/10 11:40:15 INFO SecurityManager: Changing modify acls groups to: 
     12 19/07/10 11:40:15 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(zyr); groups with view permissions: Set(); users  with modify permissions: Set(zyr); groups with modify permissions: Set()
     13 19/07/10 11:40:17 INFO Utils: Successfully started service 'sparkDriver' on port 45437.
     14 19/07/10 11:40:17 INFO SparkEnv: Registering MapOutputTracker
     15 19/07/10 11:40:18 INFO SparkEnv: Registering BlockManagerMaster
     16 19/07/10 11:40:18 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
     17 19/07/10 11:40:18 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
     18 19/07/10 11:40:18 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-2d502b3d-b275-49f2-9660-e1310680f61d
     19 19/07/10 11:40:18 INFO MemoryStore: MemoryStore started with capacity 345.0 MB
     20 19/07/10 11:40:18 INFO SparkEnv: Registering OutputCommitCoordinator
     21 19/07/10 11:40:20 INFO Utils: Successfully started service 'SparkUI' on port 4040.
     22 19/07/10 11:40:20 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.31.160:4040
     23 19/07/10 11:40:21 INFO Executor: Starting executor ID driver on host localhost
     24 19/07/10 11:40:22 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 44595.
     25 19/07/10 11:40:22 INFO NettyBlockTransferService: Server created on 192.168.31.160:44595
     26 19/07/10 11:40:22 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
     27 19/07/10 11:40:22 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.31.160, 44595, None)
     28 19/07/10 11:40:23 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.31.160:44595 with 345.0 MB RAM, BlockManagerId(driver, 192.168.31.160, 44595, None)
     29 19/07/10 11:40:23 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.31.160, 44595, None)
     30 19/07/10 11:40:23 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.31.160, 44595, None)
     31 19/07/10 11:40:25 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 214.6 KB, free 344.8 MB)
     32 19/07/10 11:40:26 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 20.4 KB, free 344.8 MB)
     33 19/07/10 11:40:26 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.31.160:44595 (size: 20.4 KB, free: 345.0 MB)
     34 19/07/10 11:40:26 INFO SparkContext: Created broadcast 0 from textFile at Kmeans.scala:25
     35 19/07/10 11:40:26 WARN KMeans: The input data is not directly cached, which may hurt performance if its parent RDDs are also uncached.
     36 19/07/10 11:40:26 INFO FileInputFormat: Total input paths to process : 1
     37 19/07/10 11:40:26 INFO SparkContext: Starting job: takeSample at KMeans.scala:386
     38 19/07/10 11:40:26 INFO DAGScheduler: Got job 0 (takeSample at KMeans.scala:386) with 2 output partitions
     39 19/07/10 11:40:26 INFO DAGScheduler: Final stage: ResultStage 0 (takeSample at KMeans.scala:386)
     40 19/07/10 11:40:26 INFO DAGScheduler: Parents of final stage: List()
     41 19/07/10 11:40:26 INFO DAGScheduler: Missing parents: List()
     42 19/07/10 11:40:26 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[5] at map at KMeans.scala:248), which has no missing parents
     43 19/07/10 11:40:27 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 4.3 KB, free 344.8 MB)
     44 19/07/10 11:40:27 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.5 KB, free 344.8 MB)
     45 19/07/10 11:40:27 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.31.160:44595 (size: 2.5 KB, free: 345.0 MB)
     46 19/07/10 11:40:27 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1161
     47 19/07/10 11:40:27 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[5] at map at KMeans.scala:248) (first 15 tasks are for partitions Vector(0, 1))
     48 19/07/10 11:40:27 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
     49 19/07/10 11:40:27 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 8200 bytes)
     50 19/07/10 11:40:27 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 8200 bytes)
     51 19/07/10 11:40:27 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
     52 19/07/10 11:40:27 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
     53 19/07/10 11:40:28 INFO HadoopRDD: Input split: file:/home/zyr/kmeanstest.txt:36+36
     54 19/07/10 11:40:28 INFO HadoopRDD: Input split: file:/home/zyr/kmeanstest.txt:0+36
     55 19/07/10 11:40:28 INFO MemoryStore: Block rdd_1_0 stored as values in memory (estimated size 288.0 B, free 344.8 MB)
     56 19/07/10 11:40:28 INFO MemoryStore: Block rdd_1_1 stored as values in memory (estimated size 152.0 B, free 344.8 MB)
     57 19/07/10 11:40:28 INFO BlockManagerInfo: Added rdd_1_0 in memory on 192.168.31.160:44595 (size: 288.0 B, free: 345.0 MB)
     58 19/07/10 11:40:28 INFO BlockManagerInfo: Added rdd_1_1 in memory on 192.168.31.160:44595 (size: 152.0 B, free: 345.0 MB)
     59 19/07/10 11:40:28 INFO BlockManager: Found block rdd_1_0 locally
     60 19/07/10 11:40:28 INFO BlockManager: Found block rdd_1_1 locally
     61 19/07/10 11:40:28 INFO MemoryStore: Block rdd_3_0 stored as values in memory (estimated size 48.0 B, free 344.8 MB)
     62 19/07/10 11:40:28 INFO BlockManagerInfo: Added rdd_3_0 in memory on 192.168.31.160:44595 (size: 48.0 B, free: 345.0 MB)
     63 19/07/10 11:40:28 INFO MemoryStore: Block rdd_3_1 stored as values in memory (estimated size 32.0 B, free 344.8 MB)
     64 19/07/10 11:40:28 INFO BlockManagerInfo: Added rdd_3_1 in memory on 192.168.31.160:44595 (size: 32.0 B, free: 345.0 MB)
     65 19/07/10 11:40:28 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 875 bytes result sent to driver
     66 19/07/10 11:40:28 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 875 bytes result sent to driver
     67 19/07/10 11:40:28 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 837 ms on localhost (executor driver) (1/2)
     68 19/07/10 11:40:28 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 942 ms on localhost (executor driver) (2/2)
     69 19/07/10 11:40:28 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
     70 19/07/10 11:40:28 INFO DAGScheduler: ResultStage 0 (takeSample at KMeans.scala:386) finished in 1.635 s
     71 19/07/10 11:40:28 INFO DAGScheduler: Job 0 finished: takeSample at KMeans.scala:386, took 1.961204 s
     72 19/07/10 11:40:28 INFO SparkContext: Starting job: takeSample at KMeans.scala:386
     73 19/07/10 11:40:28 INFO DAGScheduler: Got job 1 (takeSample at KMeans.scala:386) with 2 output partitions
     74 19/07/10 11:40:28 INFO DAGScheduler: Final stage: ResultStage 1 (takeSample at KMeans.scala:386)
     75 19/07/10 11:40:28 INFO DAGScheduler: Parents of final stage: List()
     76 19/07/10 11:40:28 INFO DAGScheduler: Missing parents: List()
     77 19/07/10 11:40:28 INFO DAGScheduler: Submitting ResultStage 1 (PartitionwiseSampledRDD[7] at takeSample at KMeans.scala:386), which has no missing parents
     78 19/07/10 11:40:28 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 5.1 KB, free 344.8 MB)
     79 19/07/10 11:40:28 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 2.9 KB, free 344.8 MB)
     80 19/07/10 11:40:28 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.31.160:44595 (size: 2.9 KB, free: 345.0 MB)
     81 19/07/10 11:40:28 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1161
     82 19/07/10 11:40:28 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 1 (PartitionwiseSampledRDD[7] at takeSample at KMeans.scala:386) (first 15 tasks are for partitions Vector(0, 1))
     83 19/07/10 11:40:28 INFO TaskSchedulerImpl: Adding task set 1.0 with 2 tasks
     84 19/07/10 11:40:28 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 2, localhost, executor driver, partition 0, PROCESS_LOCAL, 8309 bytes)
     85 19/07/10 11:40:28 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 3, localhost, executor driver, partition 1, PROCESS_LOCAL, 8309 bytes)
     86 19/07/10 11:40:28 INFO Executor: Running task 0.0 in stage 1.0 (TID 2)
     87 19/07/10 11:40:28 INFO Executor: Running task 1.0 in stage 1.0 (TID 3)
     88 19/07/10 11:40:28 INFO BlockManager: Found block rdd_1_0 locally
     89 19/07/10 11:40:28 INFO BlockManager: Found block rdd_3_0 locally
     90 19/07/10 11:40:28 INFO BlockManager: Found block rdd_1_1 locally
     91 19/07/10 11:40:28 INFO BlockManager: Found block rdd_3_1 locally
     92 19/07/10 11:40:28 INFO Executor: Finished task 0.0 in stage 1.0 (TID 2). 1283 bytes result sent to driver
     93 19/07/10 11:40:28 INFO Executor: Finished task 1.0 in stage 1.0 (TID 3). 1175 bytes result sent to driver
     94 19/07/10 11:40:28 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 92 ms on localhost (executor driver) (1/2)
     95 19/07/10 11:40:28 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 3) in 96 ms on localhost (executor driver) (2/2)
     96 19/07/10 11:40:28 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
     97 19/07/10 11:40:28 INFO DAGScheduler: ResultStage 1 (takeSample at KMeans.scala:386) finished in 0.132 s
     98 19/07/10 11:40:28 INFO DAGScheduler: Job 1 finished: takeSample at KMeans.scala:386, took 0.153980 s
     99 19/07/10 11:40:29 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 144.0 B, free 344.8 MB)
    100 19/07/10 11:40:29 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 344.0 B, free 344.8 MB)
    101 19/07/10 11:40:29 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on 192.168.31.160:44595 (size: 344.0 B, free: 345.0 MB)
    102 19/07/10 11:40:29 INFO SparkContext: Created broadcast 3 from broadcast at KMeans.scala:400
    103 19/07/10 11:40:29 INFO SparkContext: Starting job: sum at KMeans.scala:406
    104 19/07/10 11:40:29 INFO DAGScheduler: Got job 2 (sum at KMeans.scala:406) with 2 output partitions
    105 19/07/10 11:40:29 INFO DAGScheduler: Final stage: ResultStage 2 (sum at KMeans.scala:406)
    106 19/07/10 11:40:29 INFO DAGScheduler: Parents of final stage: List()
    107 19/07/10 11:40:29 INFO DAGScheduler: Missing parents: List()
    108 19/07/10 11:40:29 INFO DAGScheduler: Submitting ResultStage 2 (MapPartitionsRDD[9] at map at KMeans.scala:403), which has no missing parents
    109 19/07/10 11:40:29 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 5.4 KB, free 344.7 MB)
    110 19/07/10 11:40:29 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 3.0 KB, free 344.7 MB)
    111 19/07/10 11:40:29 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on 192.168.31.160:44595 (size: 3.0 KB, free: 345.0 MB)
    112 19/07/10 11:40:29 INFO SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1161
    113 19/07/10 11:40:29 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 2 (MapPartitionsRDD[9] at map at KMeans.scala:403) (first 15 tasks are for partitions Vector(0, 1))
    114 19/07/10 11:40:29 INFO TaskSchedulerImpl: Adding task set 2.0 with 2 tasks
    115 19/07/10 11:40:29 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 4, localhost, executor driver, partition 0, PROCESS_LOCAL, 8232 bytes)
    116 19/07/10 11:40:29 INFO TaskSetManager: Starting task 1.0 in stage 2.0 (TID 5, localhost, executor driver, partition 1, PROCESS_LOCAL, 8232 bytes)
    117 19/07/10 11:40:29 INFO Executor: Running task 0.0 in stage 2.0 (TID 4)
    118 19/07/10 11:40:29 INFO Executor: Running task 1.0 in stage 2.0 (TID 5)
    119 19/07/10 11:40:29 INFO BlockManager: Found block rdd_1_0 locally
    120 19/07/10 11:40:29 INFO BlockManager: Found block rdd_3_0 locally
    121 19/07/10 11:40:29 INFO BlockManager: Found block rdd_1_0 locally
    122 19/07/10 11:40:29 INFO BlockManager: Found block rdd_3_0 locally
    123 19/07/10 11:40:29 INFO BlockManager: Found block rdd_1_1 locally
    124 19/07/10 11:40:29 INFO BlockManager: Found block rdd_3_1 locally
    125 19/07/10 11:40:29 INFO BlockManager: Found block rdd_1_1 locally
    126 19/07/10 11:40:29 INFO BlockManager: Found block rdd_3_1 locally
    127 19/07/10 11:40:29 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
    128 19/07/10 11:40:29 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
    129 19/07/10 11:40:29 INFO MemoryStore: Block rdd_9_1 stored as values in memory (estimated size 32.0 B, free 344.7 MB)
    130 19/07/10 11:40:29 INFO BlockManagerInfo: Added rdd_9_1 in memory on 192.168.31.160:44595 (size: 32.0 B, free: 345.0 MB)
    131 19/07/10 11:40:29 INFO Executor: Finished task 1.0 in stage 2.0 (TID 5). 834 bytes result sent to driver
    132 19/07/10 11:40:29 INFO TaskSetManager: Finished task 1.0 in stage 2.0 (TID 5) in 149 ms on localhost (executor driver) (1/2)
    133 19/07/10 11:40:29 INFO MemoryStore: Block rdd_9_0 stored as values in memory (estimated size 48.0 B, free 344.7 MB)
    134 19/07/10 11:40:29 INFO BlockManagerInfo: Added rdd_9_0 in memory on 192.168.31.160:44595 (size: 48.0 B, free: 345.0 MB)
    135 19/07/10 11:40:29 INFO Executor: Finished task 0.0 in stage 2.0 (TID 4). 834 bytes result sent to driver
    136 19/07/10 11:40:29 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 4) in 168 ms on localhost (executor driver) (2/2)
    137 19/07/10 11:40:29 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool 
    138 19/07/10 11:40:29 INFO DAGScheduler: ResultStage 2 (sum at KMeans.scala:406) finished in 0.199 s
    139 19/07/10 11:40:29 INFO DAGScheduler: Job 2 finished: sum at KMeans.scala:406, took 0.221468 s
    140 19/07/10 11:40:29 INFO MapPartitionsRDD: Removing RDD 6 from persistence list
    141 19/07/10 11:40:29 INFO BlockManager: Removing RDD 6
    142 19/07/10 11:40:29 INFO SparkContext: Starting job: collect at KMeans.scala:414
    143 19/07/10 11:40:29 INFO DAGScheduler: Got job 3 (collect at KMeans.scala:414) with 2 output partitions
    144 19/07/10 11:40:29 INFO DAGScheduler: Final stage: ResultStage 3 (collect at KMeans.scala:414)
    145 19/07/10 11:40:29 INFO DAGScheduler: Parents of final stage: List()
    146 19/07/10 11:40:29 INFO DAGScheduler: Missing parents: List()
    147 19/07/10 11:40:29 INFO DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[11] at mapPartitionsWithIndex at KMeans.scala:411), which has no missing parents
    148 19/07/10 11:40:29 INFO MemoryStore: Block broadcast_5 stored as values in memory (estimated size 6.1 KB, free 344.7 MB)
    149 19/07/10 11:40:29 INFO MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 3.3 KB, free 344.7 MB)
    150 19/07/10 11:40:29 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on 192.168.31.160:44595 (size: 3.3 KB, free: 345.0 MB)
    151 19/07/10 11:40:29 INFO SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:1161
    152 19/07/10 11:40:29 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 3 (MapPartitionsRDD[11] at mapPartitionsWithIndex at KMeans.scala:411) (first 15 tasks are for partitions Vector(0, 1))
    153 19/07/10 11:40:29 INFO TaskSchedulerImpl: Adding task set 3.0 with 2 tasks
    154 19/07/10 11:40:29 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 6, localhost, executor driver, partition 0, PROCESS_LOCAL, 8264 bytes)
    155 19/07/10 11:40:29 INFO TaskSetManager: Starting task 1.0 in stage 3.0 (TID 7, localhost, executor driver, partition 1, PROCESS_LOCAL, 8264 bytes)
    156 19/07/10 11:40:29 INFO Executor: Running task 0.0 in stage 3.0 (TID 6)
    157 19/07/10 11:40:29 INFO Executor: Running task 1.0 in stage 3.0 (TID 7)
    158 19/07/10 11:40:29 INFO BlockManager: Found block rdd_1_0 locally
    159 19/07/10 11:40:29 INFO BlockManager: Found block rdd_3_0 locally
    160 19/07/10 11:40:29 INFO BlockManager: Found block rdd_9_0 locally
    161 19/07/10 11:40:29 INFO Executor: Finished task 0.0 in stage 3.0 (TID 6). 1078 bytes result sent to driver
    162 19/07/10 11:40:29 INFO BlockManager: Found block rdd_1_1 locally
    163 19/07/10 11:40:29 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID 6) in 21 ms on localhost (executor driver) (1/2)
    164 19/07/10 11:40:29 INFO BlockManager: Found block rdd_3_1 locally
    165 19/07/10 11:40:29 INFO BlockManager: Found block rdd_9_1 locally
    166 19/07/10 11:40:29 INFO Executor: Finished task 1.0 in stage 3.0 (TID 7). 1132 bytes result sent to driver
    167 19/07/10 11:40:29 INFO TaskSetManager: Finished task 1.0 in stage 3.0 (TID 7) in 31 ms on localhost (executor driver) (2/2)
    168 19/07/10 11:40:29 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks have all completed, from pool 
    169 19/07/10 11:40:29 INFO DAGScheduler: ResultStage 3 (collect at KMeans.scala:414) finished in 0.061 s
    170 19/07/10 11:40:29 INFO DAGScheduler: Job 3 finished: collect at KMeans.scala:414, took 0.084564 s
    171 19/07/10 11:40:29 INFO MemoryStore: Block broadcast_6 stored as values in memory (estimated size 320.0 B, free 344.7 MB)
    172 19/07/10 11:40:29 INFO MemoryStore: Block broadcast_6_piece0 stored as bytes in memory (estimated size 426.0 B, free 344.7 MB)
    173 19/07/10 11:40:29 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory on 192.168.31.160:44595 (size: 426.0 B, free: 345.0 MB)
    174 19/07/10 11:40:29 INFO SparkContext: Created broadcast 6 from broadcast at KMeans.scala:400
    175 19/07/10 11:40:29 INFO SparkContext: Starting job: sum at KMeans.scala:406
    176 19/07/10 11:40:29 INFO DAGScheduler: Got job 4 (sum at KMeans.scala:406) with 2 output partitions
    177 19/07/10 11:40:29 INFO DAGScheduler: Final stage: ResultStage 4 (sum at KMeans.scala:406)
    178 19/07/10 11:40:29 INFO DAGScheduler: Parents of final stage: List()
    179 19/07/10 11:40:29 INFO DAGScheduler: Missing parents: List()
    180 19/07/10 11:40:29 INFO DAGScheduler: Submitting ResultStage 4 (MapPartitionsRDD[13] at map at KMeans.scala:403), which has no missing parents
    181 19/07/10 11:40:29 INFO MemoryStore: Block broadcast_7 stored as values in memory (estimated size 5.7 KB, free 344.7 MB)
    182 19/07/10 11:40:29 INFO MemoryStore: Block broadcast_7_piece0 stored as bytes in memory (estimated size 3.1 KB, free 344.7 MB)
    183 19/07/10 11:40:29 INFO BlockManagerInfo: Added broadcast_7_piece0 in memory on 192.168.31.160:44595 (size: 3.1 KB, free: 345.0 MB)
    184 19/07/10 11:40:29 INFO SparkContext: Created broadcast 7 from broadcast at DAGScheduler.scala:1161
    185 19/07/10 11:40:29 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 4 (MapPartitionsRDD[13] at map at KMeans.scala:403) (first 15 tasks are for partitions Vector(0, 1))
    186 19/07/10 11:40:29 INFO TaskSchedulerImpl: Adding task set 4.0 with 2 tasks
    187 19/07/10 11:40:29 INFO TaskSetManager: Starting task 0.0 in stage 4.0 (TID 8, localhost, executor driver, partition 0, PROCESS_LOCAL, 8264 bytes)
    188 19/07/10 11:40:29 INFO TaskSetManager: Starting task 1.0 in stage 4.0 (TID 9, localhost, executor driver, partition 1, PROCESS_LOCAL, 8264 bytes)
    189 19/07/10 11:40:29 INFO Executor: Running task 0.0 in stage 4.0 (TID 8)
    190 19/07/10 11:40:29 INFO Executor: Running task 1.0 in stage 4.0 (TID 9)
    191 19/07/10 11:40:29 INFO BlockManager: Found block rdd_1_0 locally
    192 19/07/10 11:40:29 INFO BlockManager: Found block rdd_1_1 locally
    193 19/07/10 11:40:29 INFO BlockManager: Found block rdd_3_1 locally
    194 19/07/10 11:40:29 INFO BlockManager: Found block rdd_9_1 locally
    195 19/07/10 11:40:29 INFO BlockManager: Found block rdd_3_0 locally
    196 19/07/10 11:40:29 INFO BlockManager: Found block rdd_9_0 locally
    197 19/07/10 11:40:29 INFO MemoryStore: Block rdd_13_1 stored as values in memory (estimated size 32.0 B, free 344.7 MB)
    198 19/07/10 11:40:29 INFO BlockManagerInfo: Added rdd_13_1 in memory on 192.168.31.160:44595 (size: 32.0 B, free: 345.0 MB)
    199 19/07/10 11:40:29 INFO MemoryStore: Block rdd_13_0 stored as values in memory (estimated size 48.0 B, free 344.7 MB)
    200 19/07/10 11:40:29 INFO Executor: Finished task 1.0 in stage 4.0 (TID 9). 834 bytes result sent to driver
    201 19/07/10 11:40:29 INFO TaskSetManager: Finished task 1.0 in stage 4.0 (TID 9) in 64 ms on localhost (executor driver) (1/2)
    202 19/07/10 11:40:29 INFO BlockManagerInfo: Added rdd_13_0 in memory on 192.168.31.160:44595 (size: 48.0 B, free: 345.0 MB)
    203 19/07/10 11:40:29 INFO Executor: Finished task 0.0 in stage 4.0 (TID 8). 834 bytes result sent to driver
    204 19/07/10 11:40:29 INFO TaskSetManager: Finished task 0.0 in stage 4.0 (TID 8) in 75 ms on localhost (executor driver) (2/2)
    205 19/07/10 11:40:29 INFO TaskSchedulerImpl: Removed TaskSet 4.0, whose tasks have all completed, from pool 
    206 19/07/10 11:40:29 INFO DAGScheduler: ResultStage 4 (sum at KMeans.scala:406) finished in 0.108 s
    207 19/07/10 11:40:29 INFO DAGScheduler: Job 4 finished: sum at KMeans.scala:406, took 0.129273 s
    208 19/07/10 11:40:29 INFO MapPartitionsRDD: Removing RDD 9 from persistence list
    209 19/07/10 11:40:29 INFO BlockManager: Removing RDD 9
    210 19/07/10 11:40:29 INFO SparkContext: Starting job: collect at KMeans.scala:414
    211 19/07/10 11:40:29 INFO DAGScheduler: Got job 5 (collect at KMeans.scala:414) with 2 output partitions
    212 19/07/10 11:40:29 INFO DAGScheduler: Final stage: ResultStage 5 (collect at KMeans.scala:414)
    213 19/07/10 11:40:29 INFO DAGScheduler: Parents of final stage: List()
    214 19/07/10 11:40:29 INFO DAGScheduler: Missing parents: List()
    215 19/07/10 11:40:29 INFO DAGScheduler: Submitting ResultStage 5 (MapPartitionsRDD[15] at mapPartitionsWithIndex at KMeans.scala:411), which has no missing parents
    216 19/07/10 11:40:29 INFO MemoryStore: Block broadcast_8 stored as values in memory (estimated size 6.4 KB, free 344.7 MB)
    217 19/07/10 11:40:29 INFO MemoryStore: Block broadcast_8_piece0 stored as bytes in memory (estimated size 3.4 KB, free 344.7 MB)
    218 19/07/10 11:40:29 INFO BlockManagerInfo: Added broadcast_8_piece0 in memory on 192.168.31.160:44595 (size: 3.4 KB, free: 345.0 MB)
    219 19/07/10 11:40:30 INFO SparkContext: Created broadcast 8 from broadcast at DAGScheduler.scala:1161
    220 19/07/10 11:40:30 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 5 (MapPartitionsRDD[15] at mapPartitionsWithIndex at KMeans.scala:411) (first 15 tasks are for partitions Vector(0, 1))
    221 19/07/10 11:40:30 INFO TaskSchedulerImpl: Adding task set 5.0 with 2 tasks
    222 19/07/10 11:40:30 INFO TaskSetManager: Starting task 0.0 in stage 5.0 (TID 10, localhost, executor driver, partition 0, PROCESS_LOCAL, 8296 bytes)
    223 19/07/10 11:40:30 INFO TaskSetManager: Starting task 1.0 in stage 5.0 (TID 11, localhost, executor driver, partition 1, PROCESS_LOCAL, 8296 bytes)
    224 19/07/10 11:40:30 INFO Executor: Running task 0.0 in stage 5.0 (TID 10)
    225 19/07/10 11:40:30 INFO Executor: Running task 1.0 in stage 5.0 (TID 11)
    226 19/07/10 11:40:30 INFO BlockManager: Found block rdd_1_1 locally
    227 19/07/10 11:40:30 INFO BlockManager: Found block rdd_3_1 locally
    228 19/07/10 11:40:30 INFO BlockManager: Found block rdd_1_0 locally
    229 19/07/10 11:40:30 INFO BlockManager: Found block rdd_3_0 locally
    230 19/07/10 11:40:30 INFO BlockManager: Found block rdd_13_0 locally
    231 19/07/10 11:40:30 INFO Executor: Finished task 0.0 in stage 5.0 (TID 10). 1132 bytes result sent to driver
    232 19/07/10 11:40:30 INFO BlockManager: Found block rdd_13_1 locally
    233 19/07/10 11:40:30 INFO Executor: Finished task 1.0 in stage 5.0 (TID 11). 826 bytes result sent to driver
    234 19/07/10 11:40:30 INFO TaskSetManager: Finished task 0.0 in stage 5.0 (TID 10) in 56 ms on localhost (executor driver) (1/2)
    235 19/07/10 11:40:30 INFO TaskSetManager: Finished task 1.0 in stage 5.0 (TID 11) in 58 ms on localhost (executor driver) (2/2)
    236 19/07/10 11:40:30 INFO TaskSchedulerImpl: Removed TaskSet 5.0, whose tasks have all completed, from pool 
    237 19/07/10 11:40:30 INFO DAGScheduler: ResultStage 5 (collect at KMeans.scala:414) finished in 0.147 s
    238 19/07/10 11:40:30 INFO DAGScheduler: Job 5 finished: collect at KMeans.scala:414, took 0.178237 s
    239 19/07/10 11:40:30 INFO MapPartitionsRDD: Removing RDD 13 from persistence list
    240 19/07/10 11:40:30 INFO BlockManager: Removing RDD 13
    241 19/07/10 11:40:30 INFO TorrentBroadcast: Destroying Broadcast(3) (from destroy at KMeans.scala:421)
    242 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 27
    243 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 42
    244 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 51
    245 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 124
    246 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 32
    247 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 72
    248 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 47
    249 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 84
    250 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 36
    251 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 73
    252 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 46
    253 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 25
    254 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 130
    255 19/07/10 11:40:30 INFO TorrentBroadcast: Destroying Broadcast(6) (from destroy at KMeans.scala:421)
    256 19/07/10 11:40:30 INFO BlockManagerInfo: Removed broadcast_3_piece0 on 192.168.31.160:44595 in memory (size: 344.0 B, free: 345.0 MB)
    257 19/07/10 11:40:30 INFO BlockManagerInfo: Removed broadcast_2_piece0 on 192.168.31.160:44595 in memory (size: 2.9 KB, free: 345.0 MB)
    258 19/07/10 11:40:30 INFO MemoryStore: Block broadcast_9 stored as values in memory (estimated size 568.0 B, free 344.7 MB)
    259 19/07/10 11:40:30 INFO BlockManagerInfo: Removed broadcast_6_piece0 on 192.168.31.160:44595 in memory (size: 426.0 B, free: 345.0 MB)
    260 19/07/10 11:40:30 INFO MemoryStore: Block broadcast_9_piece0 stored as bytes in memory (estimated size 529.0 B, free 344.7 MB)
    261 19/07/10 11:40:30 INFO BlockManagerInfo: Added broadcast_9_piece0 in memory on 192.168.31.160:44595 (size: 529.0 B, free: 345.0 MB)
    262 19/07/10 11:40:30 INFO SparkContext: Created broadcast 9 from broadcast at KMeans.scala:431
    263 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 116
    264 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 94
    265 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 90
    266 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 96
    267 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 89
    268 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 81
    269 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 121
    270 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 149
    271 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 145
    272 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 50
    273 19/07/10 11:40:30 INFO BlockManagerInfo: Removed broadcast_7_piece0 on 192.168.31.160:44595 in memory (size: 3.1 KB, free: 345.0 MB)
    274 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 135
    275 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 108
    276 19/07/10 11:40:30 INFO BlockManagerInfo: Removed broadcast_8_piece0 on 192.168.31.160:44595 in memory (size: 3.4 KB, free: 345.0 MB)
    277 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 34
    278 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 106
    279 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 70
    280 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 44
    281 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 57
    282 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 105
    283 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 97
    284 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 31
    285 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 33
    286 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 113
    287 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 140
    288 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 100
    289 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 43
    290 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 68
    291 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 133
    292 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 138
    293 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 39
    294 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 129
    295 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 49
    296 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 37
    297 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 85
    298 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 132
    299 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 53
    300 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 82
    301 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 69
    302 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 104
    303 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 52
    304 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 98
    305 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 141
    306 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 76
    307 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 77
    308 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 102
    309 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 134
    310 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 79
    311 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 61
    312 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 59
    313 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 118
    314 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 74
    315 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 54
    316 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 86
    317 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 136
    318 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 110
    319 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 45
    320 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 30
    321 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 60
    322 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 64
    323 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 137
    324 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 95
    325 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 87
    326 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 38
    327 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 29
    328 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 56
    329 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 125
    330 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 131
    331 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 41
    332 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 55
    333 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 128
    334 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 127
    335 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 88
    336 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 91
    337 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 123
    338 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 103
    339 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 115
    340 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 139
    341 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 142
    342 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 147
    343 19/07/10 11:40:30 INFO BlockManagerInfo: Removed broadcast_5_piece0 on 192.168.31.160:44595 in memory (size: 3.3 KB, free: 345.0 MB)
    344 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 26
    345 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 48
    346 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 78
    347 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 93
    348 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 40
    349 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 35
    350 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 63
    351 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 75
    352 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 112
    353 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 146
    354 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 58
    355 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 62
    356 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 83
    357 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 101
    358 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 148
    359 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 144
    360 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 111
    361 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 117
    362 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 71
    363 19/07/10 11:40:30 INFO BlockManagerInfo: Removed broadcast_4_piece0 on 192.168.31.160:44595 in memory (size: 3.0 KB, free: 345.0 MB)
    364 19/07/10 11:40:30 INFO SparkContext: Starting job: countByValue at KMeans.scala:434
    365 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 28
    366 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 122
    367 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 143
    368 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 65
    369 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 126
    370 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 92
    371 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 114
    372 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 120
    373 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 80
    374 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 107
    375 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 109
    376 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 99
    377 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 66
    378 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 67
    379 19/07/10 11:40:30 INFO ContextCleaner: Cleaned accumulator 119
    380 19/07/10 11:40:31 INFO DAGScheduler: Registering RDD 18 (countByValue at KMeans.scala:434)
    381 19/07/10 11:40:31 INFO DAGScheduler: Got job 6 (countByValue at KMeans.scala:434) with 2 output partitions
    382 19/07/10 11:40:31 INFO DAGScheduler: Final stage: ResultStage 7 (countByValue at KMeans.scala:434)
    383 19/07/10 11:40:31 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 6)
    384 19/07/10 11:40:31 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 6)
    385 19/07/10 11:40:31 INFO DAGScheduler: Submitting ShuffleMapStage 6 (MapPartitionsRDD[18] at countByValue at KMeans.scala:434), which has no missing parents
    386 19/07/10 11:40:31 INFO MemoryStore: Block broadcast_10 stored as values in memory (estimated size 6.7 KB, free 344.8 MB)
    387 19/07/10 11:40:31 INFO MemoryStore: Block broadcast_10_piece0 stored as bytes in memory (estimated size 3.7 KB, free 344.8 MB)
    388 19/07/10 11:40:31 INFO BlockManagerInfo: Added broadcast_10_piece0 in memory on 192.168.31.160:44595 (size: 3.7 KB, free: 345.0 MB)
    389 19/07/10 11:40:31 INFO SparkContext: Created broadcast 10 from broadcast at DAGScheduler.scala:1161
    390 19/07/10 11:40:31 INFO DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 6 (MapPartitionsRDD[18] at countByValue at KMeans.scala:434) (first 15 tasks are for partitions Vector(0, 1))
    391 19/07/10 11:40:31 INFO TaskSchedulerImpl: Adding task set 6.0 with 2 tasks
    392 19/07/10 11:40:31 INFO TaskSetManager: Starting task 0.0 in stage 6.0 (TID 12, localhost, executor driver, partition 0, PROCESS_LOCAL, 8189 bytes)
    393 19/07/10 11:40:31 INFO TaskSetManager: Starting task 1.0 in stage 6.0 (TID 13, localhost, executor driver, partition 1, PROCESS_LOCAL, 8189 bytes)
    394 19/07/10 11:40:31 INFO Executor: Running task 0.0 in stage 6.0 (TID 12)
    395 19/07/10 11:40:31 INFO Executor: Running task 1.0 in stage 6.0 (TID 13)
    396 19/07/10 11:40:31 INFO BlockManager: Found block rdd_1_0 locally
    397 19/07/10 11:40:31 INFO BlockManager: Found block rdd_3_0 locally
    398 19/07/10 11:40:31 INFO BlockManager: Found block rdd_1_1 locally
    399 19/07/10 11:40:31 INFO BlockManager: Found block rdd_3_1 locally
    400 19/07/10 11:40:31 INFO Executor: Finished task 1.0 in stage 6.0 (TID 13). 1156 bytes result sent to driver
    401 19/07/10 11:40:31 INFO Executor: Finished task 0.0 in stage 6.0 (TID 12). 1113 bytes result sent to driver
    402 19/07/10 11:40:31 INFO TaskSetManager: Finished task 1.0 in stage 6.0 (TID 13) in 294 ms on localhost (executor driver) (1/2)
    403 19/07/10 11:40:31 INFO TaskSetManager: Finished task 0.0 in stage 6.0 (TID 12) in 300 ms on localhost (executor driver) (2/2)
    404 19/07/10 11:40:31 INFO TaskSchedulerImpl: Removed TaskSet 6.0, whose tasks have all completed, from pool 
    405 19/07/10 11:40:31 INFO DAGScheduler: ShuffleMapStage 6 (countByValue at KMeans.scala:434) finished in 0.367 s
    406 19/07/10 11:40:31 INFO DAGScheduler: looking for newly runnable stages
    407 19/07/10 11:40:31 INFO DAGScheduler: running: Set()
    408 19/07/10 11:40:31 INFO DAGScheduler: waiting: Set(ResultStage 7)
    409 19/07/10 11:40:31 INFO DAGScheduler: failed: Set()
    410 19/07/10 11:40:32 INFO DAGScheduler: Submitting ResultStage 7 (ShuffledRDD[19] at countByValue at KMeans.scala:434), which has no missing parents
    411 19/07/10 11:40:32 INFO MemoryStore: Block broadcast_11 stored as values in memory (estimated size 3.3 KB, free 344.7 MB)
    412 19/07/10 11:40:32 INFO MemoryStore: Block broadcast_11_piece0 stored as bytes in memory (estimated size 2.0 KB, free 344.7 MB)
    413 19/07/10 11:40:32 INFO BlockManagerInfo: Added broadcast_11_piece0 in memory on 192.168.31.160:44595 (size: 2.0 KB, free: 345.0 MB)
    414 19/07/10 11:40:32 INFO SparkContext: Created broadcast 11 from broadcast at DAGScheduler.scala:1161
    415 19/07/10 11:40:32 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 7 (ShuffledRDD[19] at countByValue at KMeans.scala:434) (first 15 tasks are for partitions Vector(0, 1))
    416 19/07/10 11:40:32 INFO TaskSchedulerImpl: Adding task set 7.0 with 2 tasks
    417 19/07/10 11:40:32 INFO TaskSetManager: Starting task 0.0 in stage 7.0 (TID 14, localhost, executor driver, partition 0, ANY, 7662 bytes)
    418 19/07/10 11:40:32 INFO TaskSetManager: Starting task 1.0 in stage 7.0 (TID 15, localhost, executor driver, partition 1, ANY, 7662 bytes)
    419 19/07/10 11:40:32 INFO Executor: Running task 0.0 in stage 7.0 (TID 14)
    420 19/07/10 11:40:32 INFO Executor: Running task 1.0 in stage 7.0 (TID 15)
    421 19/07/10 11:40:32 INFO ShuffleBlockFetcherIterator: Getting 2 non-empty blocks including 2 local blocks and 0 remote blocks
    422 19/07/10 11:40:32 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 19 ms
    423 19/07/10 11:40:32 INFO ShuffleBlockFetcherIterator: Getting 2 non-empty blocks including 2 local blocks and 0 remote blocks
    424 19/07/10 11:40:32 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 15 ms
    425 19/07/10 11:40:32 INFO Executor: Finished task 1.0 in stage 7.0 (TID 15). 1415 bytes result sent to driver
    426 19/07/10 11:40:32 INFO TaskSetManager: Finished task 1.0 in stage 7.0 (TID 15) in 290 ms on localhost (executor driver) (1/2)
    427 19/07/10 11:40:32 INFO Executor: Finished task 0.0 in stage 7.0 (TID 14). 1372 bytes result sent to driver
    428 19/07/10 11:40:32 INFO TaskSetManager: Finished task 0.0 in stage 7.0 (TID 14) in 308 ms on localhost (executor driver) (2/2)
    429 19/07/10 11:40:32 INFO TaskSchedulerImpl: Removed TaskSet 7.0, whose tasks have all completed, from pool 
    430 19/07/10 11:40:32 INFO DAGScheduler: ResultStage 7 (countByValue at KMeans.scala:434) finished in 0.347 s
    431 19/07/10 11:40:32 INFO DAGScheduler: Job 6 finished: countByValue at KMeans.scala:434, took 1.822957 s
    432 19/07/10 11:40:32 INFO TorrentBroadcast: Destroying Broadcast(9) (from destroy at KMeans.scala:436)
    433 19/07/10 11:40:32 INFO BlockManagerInfo: Removed broadcast_9_piece0 on 192.168.31.160:44595 in memory (size: 529.0 B, free: 345.0 MB)
    434 19/07/10 11:40:32 INFO LocalKMeans: Local KMeans++ converged in 2 iterations.
    435 19/07/10 11:40:32 INFO KMeans: Initialization with k-means|| took 5.977 seconds.
    436 19/07/10 11:40:32 INFO MemoryStore: Block broadcast_12 stored as values in memory (estimated size 296.0 B, free 344.7 MB)
    437 19/07/10 11:40:32 INFO MemoryStore: Block broadcast_12_piece0 stored as bytes in memory (estimated size 334.0 B, free 344.7 MB)
    438 19/07/10 11:40:32 INFO BlockManagerInfo: Added broadcast_12_piece0 in memory on 192.168.31.160:44595 (size: 334.0 B, free: 345.0 MB)
    439 19/07/10 11:40:32 INFO SparkContext: Created broadcast 12 from broadcast at KMeans.scala:299
    440 19/07/10 11:40:32 INFO SparkContext: Starting job: collectAsMap at KMeans.scala:320
    441 19/07/10 11:40:32 INFO DAGScheduler: Registering RDD 20 (mapPartitions at KMeans.scala:302)
    442 19/07/10 11:40:32 INFO DAGScheduler: Got job 7 (collectAsMap at KMeans.scala:320) with 2 output partitions
    443 19/07/10 11:40:32 INFO DAGScheduler: Final stage: ResultStage 9 (collectAsMap at KMeans.scala:320)
    444 19/07/10 11:40:32 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 8)
    445 19/07/10 11:40:32 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 8)
    446 19/07/10 11:40:32 INFO DAGScheduler: Submitting ShuffleMapStage 8 (MapPartitionsRDD[20] at mapPartitions at KMeans.scala:302), which has no missing parents
    447 19/07/10 11:40:32 INFO MemoryStore: Block broadcast_13 stored as values in memory (estimated size 6.3 KB, free 344.7 MB)
    448 19/07/10 11:40:32 INFO MemoryStore: Block broadcast_13_piece0 stored as bytes in memory (estimated size 3.5 KB, free 344.7 MB)
    449 19/07/10 11:40:32 INFO BlockManagerInfo: Added broadcast_13_piece0 in memory on 192.168.31.160:44595 (size: 3.5 KB, free: 345.0 MB)
    450 19/07/10 11:40:32 INFO SparkContext: Created broadcast 13 from broadcast at DAGScheduler.scala:1161
    451 19/07/10 11:40:32 INFO DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 8 (MapPartitionsRDD[20] at mapPartitions at KMeans.scala:302) (first 15 tasks are for partitions Vector(0, 1))
    452 19/07/10 11:40:32 INFO TaskSchedulerImpl: Adding task set 8.0 with 2 tasks
    453 19/07/10 11:40:32 INFO TaskSetManager: Starting task 0.0 in stage 8.0 (TID 16, localhost, executor driver, partition 0, PROCESS_LOCAL, 8189 bytes)
    454 19/07/10 11:40:32 INFO TaskSetManager: Starting task 1.0 in stage 8.0 (TID 17, localhost, executor driver, partition 1, PROCESS_LOCAL, 8189 bytes)
    455 19/07/10 11:40:32 INFO Executor: Running task 0.0 in stage 8.0 (TID 16)
    456 19/07/10 11:40:32 INFO Executor: Running task 1.0 in stage 8.0 (TID 17)
    457 19/07/10 11:40:32 INFO BlockManager: Found block rdd_1_0 locally
    458 19/07/10 11:40:32 INFO BlockManager: Found block rdd_3_0 locally
    459 19/07/10 11:40:32 INFO BlockManager: Found block rdd_1_1 locally
    460 19/07/10 11:40:32 INFO BlockManager: Found block rdd_3_1 locally
    461 19/07/10 11:40:32 INFO Executor: Finished task 1.0 in stage 8.0 (TID 17). 1226 bytes result sent to driver
    462 19/07/10 11:40:32 INFO Executor: Finished task 0.0 in stage 8.0 (TID 16). 1226 bytes result sent to driver
    463 19/07/10 11:40:32 INFO TaskSetManager: Finished task 1.0 in stage 8.0 (TID 17) in 73 ms on localhost (executor driver) (1/2)
    464 19/07/10 11:40:32 INFO TaskSetManager: Finished task 0.0 in stage 8.0 (TID 16) in 82 ms on localhost (executor driver) (2/2)
    465 19/07/10 11:40:32 INFO TaskSchedulerImpl: Removed TaskSet 8.0, whose tasks have all completed, from pool 
    466 19/07/10 11:40:32 INFO DAGScheduler: ShuffleMapStage 8 (mapPartitions at KMeans.scala:302) finished in 0.135 s
    467 19/07/10 11:40:32 INFO DAGScheduler: looking for newly runnable stages
    468 19/07/10 11:40:32 INFO DAGScheduler: running: Set()
    469 19/07/10 11:40:32 INFO DAGScheduler: waiting: Set(ResultStage 9)
    470 19/07/10 11:40:32 INFO DAGScheduler: failed: Set()
    471 19/07/10 11:40:32 INFO DAGScheduler: Submitting ResultStage 9 (ShuffledRDD[21] at reduceByKey at KMeans.scala:317), which has no missing parents
    472 19/07/10 11:40:32 INFO MemoryStore: Block broadcast_14 stored as values in memory (estimated size 2.8 KB, free 344.7 MB)
    473 19/07/10 11:40:32 INFO MemoryStore: Block broadcast_14_piece0 stored as bytes in memory (estimated size 1742.0 B, free 344.7 MB)
    474 19/07/10 11:40:32 INFO BlockManagerInfo: Added broadcast_14_piece0 in memory on 192.168.31.160:44595 (size: 1742.0 B, free: 345.0 MB)
    475 19/07/10 11:40:32 INFO SparkContext: Created broadcast 14 from broadcast at DAGScheduler.scala:1161
    476 19/07/10 11:40:32 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 9 (ShuffledRDD[21] at reduceByKey at KMeans.scala:317) (first 15 tasks are for partitions Vector(0, 1))
    477 19/07/10 11:40:32 INFO TaskSchedulerImpl: Adding task set 9.0 with 2 tasks
    478 19/07/10 11:40:32 INFO TaskSetManager: Starting task 0.0 in stage 9.0 (TID 18, localhost, executor driver, partition 0, ANY, 7662 bytes)
    479 19/07/10 11:40:32 INFO TaskSetManager: Starting task 1.0 in stage 9.0 (TID 19, localhost, executor driver, partition 1, ANY, 7662 bytes)
    480 19/07/10 11:40:32 INFO Executor: Running task 0.0 in stage 9.0 (TID 18)
    481 19/07/10 11:40:32 INFO Executor: Running task 1.0 in stage 9.0 (TID 19)
    482 19/07/10 11:40:32 INFO ShuffleBlockFetcherIterator: Getting 2 non-empty blocks including 2 local blocks and 0 remote blocks
    483 19/07/10 11:40:32 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
    484 19/07/10 11:40:32 INFO ShuffleBlockFetcherIterator: Getting 2 non-empty blocks including 2 local blocks and 0 remote blocks
    485 19/07/10 11:40:32 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
    486 19/07/10 11:40:32 INFO Executor: Finished task 0.0 in stage 9.0 (TID 18). 1531 bytes result sent to driver
    487 19/07/10 11:40:32 INFO Executor: Finished task 1.0 in stage 9.0 (TID 19). 1498 bytes result sent to driver
    488 19/07/10 11:40:32 INFO TaskSetManager: Finished task 0.0 in stage 9.0 (TID 18) in 73 ms on localhost (executor driver) (1/2)
    489 19/07/10 11:40:33 INFO TaskSetManager: Finished task 1.0 in stage 9.0 (TID 19) in 79 ms on localhost (executor driver) (2/2)
    490 19/07/10 11:40:33 INFO TaskSchedulerImpl: Removed TaskSet 9.0, whose tasks have all completed, from pool 
    491 19/07/10 11:40:33 INFO DAGScheduler: ResultStage 9 (collectAsMap at KMeans.scala:320) finished in 0.182 s
    492 19/07/10 11:40:33 INFO DAGScheduler: Job 7 finished: collectAsMap at KMeans.scala:320, took 0.369581 s
    493 19/07/10 11:40:33 INFO TorrentBroadcast: Destroying Broadcast(12) (from destroy at KMeans.scala:330)
    494 19/07/10 11:40:33 INFO KMeans: Iterations took 0.564 seconds.
    495 19/07/10 11:40:33 INFO KMeans: KMeans converged in 1 iterations.
    496 19/07/10 11:40:33 INFO KMeans: The cost is 0.07500000000004324.
    497 19/07/10 11:40:33 INFO BlockManagerInfo: Removed broadcast_12_piece0 on 192.168.31.160:44595 in memory (size: 334.0 B, free: 345.0 MB)
    498 19/07/10 11:40:33 INFO MapPartitionsRDD: Removing RDD 3 from persistence list
    499 19/07/10 11:40:33 INFO BlockManager: Removing RDD 3
    500 19/07/10 11:40:33 WARN KMeans: The input data was not directly cached, which may hurt performance if its parent RDDs are also uncached.
    501 [0.1,0.1,0.1]
    502 [9.05,9.05,9.05]
    503 [9.2,9.2,9.2]
    504 19/07/10 11:40:33 INFO MemoryStore: Block broadcast_15 stored as values in memory (estimated size 296.0 B, free 344.7 MB)
    505 19/07/10 11:40:33 INFO MemoryStore: Block broadcast_15_piece0 stored as bytes in memory (estimated size 334.0 B, free 344.7 MB)
    506 19/07/10 11:40:33 INFO BlockManagerInfo: Added broadcast_15_piece0 in memory on 192.168.31.160:44595 (size: 334.0 B, free: 345.0 MB)
    507 19/07/10 11:40:33 INFO SparkContext: Created broadcast 15 from broadcast at KMeansModel.scala:102
    508 19/07/10 11:40:33 INFO SparkContext: Starting job: sum at KMeansModel.scala:105
    509 19/07/10 11:40:33 INFO DAGScheduler: Got job 8 (sum at KMeansModel.scala:105) with 2 output partitions
    510 19/07/10 11:40:33 INFO DAGScheduler: Final stage: ResultStage 10 (sum at KMeansModel.scala:105)
    511 19/07/10 11:40:33 INFO DAGScheduler: Parents of final stage: List()
    512 19/07/10 11:40:33 INFO DAGScheduler: Missing parents: List()
    513 19/07/10 11:40:33 INFO DAGScheduler: Submitting ResultStage 10 (MapPartitionsRDD[22] at map at KMeansModel.scala:103), which has no missing parents
    514 19/07/10 11:40:33 INFO MemoryStore: Block broadcast_16 stored as values in memory (estimated size 5.3 KB, free 344.7 MB)
    515 19/07/10 11:40:33 INFO MemoryStore: Block broadcast_16_piece0 stored as bytes in memory (estimated size 3.0 KB, free 344.7 MB)
    516 19/07/10 11:40:33 INFO BlockManagerInfo: Added broadcast_16_piece0 in memory on 192.168.31.160:44595 (size: 3.0 KB, free: 345.0 MB)
    517 19/07/10 11:40:33 INFO SparkContext: Created broadcast 16 from broadcast at DAGScheduler.scala:1161
    518 19/07/10 11:40:33 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 10 (MapPartitionsRDD[22] at map at KMeansModel.scala:103) (first 15 tasks are for partitions Vector(0, 1))
    519 19/07/10 11:40:33 INFO TaskSchedulerImpl: Adding task set 10.0 with 2 tasks
    520 19/07/10 11:40:33 INFO TaskSetManager: Starting task 0.0 in stage 10.0 (TID 20, localhost, executor driver, partition 0, PROCESS_LOCAL, 7889 bytes)
    521 19/07/10 11:40:33 INFO TaskSetManager: Starting task 1.0 in stage 10.0 (TID 21, localhost, executor driver, partition 1, PROCESS_LOCAL, 7889 bytes)
    522 19/07/10 11:40:33 INFO Executor: Running task 0.0 in stage 10.0 (TID 20)
    523 19/07/10 11:40:33 INFO Executor: Running task 1.0 in stage 10.0 (TID 21)
    524 19/07/10 11:40:33 INFO BlockManager: Found block rdd_1_1 locally
    525 19/07/10 11:40:33 INFO BlockManager: Found block rdd_1_0 locally
    526 19/07/10 11:40:33 INFO Executor: Finished task 0.0 in stage 10.0 (TID 20). 834 bytes result sent to driver
    527 19/07/10 11:40:33 INFO Executor: Finished task 1.0 in stage 10.0 (TID 21). 834 bytes result sent to driver
    528 19/07/10 11:40:33 INFO TaskSetManager: Finished task 0.0 in stage 10.0 (TID 20) in 30 ms on localhost (executor driver) (1/2)
    529 19/07/10 11:40:33 INFO TaskSetManager: Finished task 1.0 in stage 10.0 (TID 21) in 31 ms on localhost (executor driver) (2/2)
    530 19/07/10 11:40:33 INFO TaskSchedulerImpl: Removed TaskSet 10.0, whose tasks have all completed, from pool 
    531 19/07/10 11:40:33 INFO DAGScheduler: ResultStage 10 (sum at KMeansModel.scala:105) finished in 0.066 s
    532 19/07/10 11:40:33 INFO DAGScheduler: Job 8 finished: sum at KMeansModel.scala:105, took 0.074275 s
    533 19/07/10 11:40:33 INFO TorrentBroadcast: Destroying Broadcast(15) (from destroy at KMeansModel.scala:106)
    534 误差为:0.07500000000004324
    535 19/07/10 11:40:33 INFO BlockManagerInfo: Removed broadcast_15_piece0 on 192.168.31.160:44595 in memory (size: 334.0 B, free: 345.0 MB)
    536 19/07/10 11:40:33 INFO SparkContext: Starting job: foreach at Kmeans.scala:74
    537 19/07/10 11:40:33 INFO DAGScheduler: Got job 9 (foreach at Kmeans.scala:74) with 2 output partitions
    538 19/07/10 11:40:33 INFO DAGScheduler: Final stage: ResultStage 11 (foreach at Kmeans.scala:74)
    539 19/07/10 11:40:33 INFO DAGScheduler: Parents of final stage: List()
    540 19/07/10 11:40:33 INFO DAGScheduler: Missing parents: List()
    541 19/07/10 11:40:33 INFO DAGScheduler: Submitting ResultStage 11 (MapPartitionsRDD[23] at map at Kmeans.scala:68), which has no missing parents
    542 19/07/10 11:40:33 INFO MemoryStore: Block broadcast_17 stored as values in memory (estimated size 4.6 KB, free 344.7 MB)
    543 19/07/10 11:40:33 INFO MemoryStore: Block broadcast_17_piece0 stored as bytes in memory (estimated size 2.7 KB, free 344.7 MB)
    544 19/07/10 11:40:33 INFO BlockManagerInfo: Added broadcast_17_piece0 in memory on 192.168.31.160:44595 (size: 2.7 KB, free: 345.0 MB)
    545 19/07/10 11:40:33 INFO SparkContext: Created broadcast 17 from broadcast at DAGScheduler.scala:1161
    546 19/07/10 11:40:33 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 11 (MapPartitionsRDD[23] at map at Kmeans.scala:68) (first 15 tasks are for partitions Vector(0, 1))
    547 19/07/10 11:40:33 INFO TaskSchedulerImpl: Adding task set 11.0 with 2 tasks
    548 19/07/10 11:40:33 INFO TaskSetManager: Starting task 0.0 in stage 11.0 (TID 22, localhost, executor driver, partition 0, PROCESS_LOCAL, 7889 bytes)
    549 19/07/10 11:40:33 INFO TaskSetManager: Starting task 1.0 in stage 11.0 (TID 23, localhost, executor driver, partition 1, PROCESS_LOCAL, 7889 bytes)
    550 19/07/10 11:40:33 INFO Executor: Running task 0.0 in stage 11.0 (TID 22)
    551 19/07/10 11:40:33 INFO Executor: Running task 1.0 in stage 11.0 (TID 23)
    552 19/07/10 11:40:33 INFO BlockManager: Found block rdd_1_1 locally
    553 19/07/10 11:40:33 INFO BlockManager: Found block rdd_1_0 locally
    554 [0.0,0.0,0.0]==>0
    555 [0.1,0.1,0.1]==>0
    556 [0.2,0.2,0.2]==>0
    557 [9.0,9.0,9.0]==>1
    558 19/07/10 11:40:33 INFO Executor: Finished task 0.0 in stage 11.0 (TID 22). 837 bytes result sent to driver
    559 [9.1,9.1,9.1]==>1
    560 [9.2,9.2,9.2]==>2
    561 19/07/10 11:40:33 INFO Executor: Finished task 1.0 in stage 11.0 (TID 23). 794 bytes result sent to driver
    562 19/07/10 11:40:33 INFO TaskSetManager: Finished task 0.0 in stage 11.0 (TID 22) in 35 ms on localhost (executor driver) (1/2)
    563 19/07/10 11:40:33 INFO TaskSetManager: Finished task 1.0 in stage 11.0 (TID 23) in 37 ms on localhost (executor driver) (2/2)
    564 19/07/10 11:40:33 INFO TaskSchedulerImpl: Removed TaskSet 11.0, whose tasks have all completed, from pool 
    565 19/07/10 11:40:33 INFO DAGScheduler: ResultStage 11 (foreach at Kmeans.scala:74) finished in 0.074 s
    566 19/07/10 11:40:33 INFO DAGScheduler: Job 9 finished: foreach at Kmeans.scala:74, took 0.090780 s
    567 19/07/10 11:40:33 INFO SparkContext: Invoking stop() from shutdown hook
    568 19/07/10 11:40:33 INFO SparkUI: Stopped Spark web UI at http://192.168.31.160:4040
    569 19/07/10 11:40:33 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
    570 19/07/10 11:40:34 INFO MemoryStore: MemoryStore cleared
    571 19/07/10 11:40:34 INFO BlockManager: BlockManager stopped
    572 19/07/10 11:40:34 INFO BlockManagerMaster: BlockManagerMaster stopped
    573 19/07/10 11:40:34 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
    574 19/07/10 11:40:34 INFO SparkContext: Successfully stopped SparkContext
    575 19/07/10 11:40:34 INFO ShutdownHookManager: Shutdown hook called
    576 19/07/10 11:40:34 INFO ShutdownHookManager: Deleting directory /tmp/spark-21be76b6-98d0-46ec-a1d6-640cb5556eff
    577 
    578 Process finished with exit code 0
    View Code
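
    Almost everything in the listing above is Spark's own INFO-level logging; the actual results (the three cluster centres, the cost, and the point-to-cluster assignments) are buried in the noise. One common way to make the output readable is to raise the log level at the top of main, before the SparkContext is created. The snippet below is only a sketch under the default log4j setup that Spark 2.4 ships with (the object name QuietSpark and the logger names to silence are illustrative and may need adjusting):

```scala
import org.apache.log4j.{Level, Logger}
import org.apache.spark.{SparkConf, SparkContext}

object QuietSpark {
  def main(args: Array[String]): Unit = {
    // Raise Spark's root loggers to WARN before the SparkContext is created,
    // so only warnings and errors reach the console and the program's own
    // println output stands out from the INFO noise seen in the listing above.
    Logger.getLogger("org").setLevel(Level.WARN)
    Logger.getLogger("akka").setLevel(Level.WARN)

    val conf = new SparkConf().setAppName("Quiet Demo").setMaster("local[2]")
    val sc = new SparkContext(conf)
    // ... the rest of the program as before ...
    sc.stop()
  }
}
```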

     To sum up: when a program fails to run purely because of version-mismatch problems like these, and the issue drags on for a long time with no useful error message to go by, it becomes very easy to give up on the product altogether.
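
     When a version mismatch is suspected, it also helps to print, from inside the program itself, which Scala, Spark, and Java versions actually ended up on the runtime classpath, rather than trusting what the IDE shows. The following is a minimal sketch (the object name VersionCheck is illustrative, and the values in the comments are simply what one would expect with the Java8 + Scala 2.11.0 + Spark 2.4.3 combination described above):

```scala
object VersionCheck {
  def main(args: Array[String]): Unit = {
    // Report the versions that are actually loaded at runtime; if these differ
    // from what pom.xml declares, the classpath is the first thing to fix.
    println("Scala: " + scala.util.Properties.versionString)  // expected: version 2.11.x
    println("Spark: " + org.apache.spark.SPARK_VERSION)       // expected: 2.4.3
    println("Java:  " + System.getProperty("java.version"))   // expected: 1.8.0_xxx
  }
}
```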
