zoukankan      html  css  js  c++  java
  • Centos7 Flink安装及WordCount

    Flink安装及WordCount

    大数据的框架几乎都不支持Windows环境,生活不易,猫猫叹气。

    一想到在服务器上捣鼓东西就开始怂,不过这次装Flink框架简直顺利得不行。

    打开宝塔面板,在/usr/local/路径下上传从官网下载的压缩包,下载的是2_11,最新版本。

    然后解压,这个注意没必要指定文件夹,解压后它会以"flink-1.12.1"命名一个文件夹的(脑抽指定了发现要多打开一次 不开心)

    image-20210124183726222

    要记得检查安全组是否放行:

    安全组放行

    访问8081端口,可以看到Flink的仪表盘:

    Flink面板

    整套流程跟着官方教程一步步走下来其实很顺利,但是写博客记录时找不到了。

    WordCount

    然后还是WordCount,本地开发环境,我这里就是idea+maven,maven导入依赖:

    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-java</artifactId>
      <version>1.12.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-streaming-java_2.11</artifactId>
      <version>1.12.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-clients_2.11</artifactId>
      <version>1.12.1</version>
    </dependency>
    

    代码:

    package site.buzhou;
    
    import org.apache.flink.api.common.functions.FlatMapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.util.Collector;
    
    
    /**
     * @program: FlinkTutorialDemo01
     * @description:
     * @author: 不周
     * @create: 2021-01-24 18:03
     **/
    public class WordCount {
        public static void main(String[] args) throws Exception {
            //创建执行环境
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            String inputPath = "E:\Project\Hadoop\FlinkTutorialDemo01\src\main\resources\words.txt";
            DataSet<String> stringDataSource = env.readTextFile(inputPath);
    
            //对数据集进行处理
            DataSet<Tuple2<String,Integer>> dataSet = stringDataSource.flatMap(new MyFlatMapper()).groupBy(0).sum(1);
            dataSet.print();
        }
    
        public static class MyFlatMapper implements FlatMapFunction<String, Tuple2<String,Integer>>{
    
    
            public void flatMap(String s, Collector<Tuple2<String, Integer>> collector) throws Exception {
                String[] words = s.split(" ");
                for(String word:words){
                    collector.collect(new Tuple2<String, Integer>(word,1));
                }
            }
        }
    }
    
    

    编译结果:

    image-20210124183951491

    看得出来就是复制官网的文段嘛hh:

    We recommend packaging the application code and all its required dependencies into one jar-with-dependencies which we refer to as the application jar. The application jar can be submitted to an already running Flink cluster, or added to a Flink application container image.
    
    Projects created from the Java Project Template or Scala Project Template are configured to automatically include the application dependencies into the application jar when running mvn clean package. For projects that are not set up from those templates, we recommend adding the Maven Shade Plugin (as listed in the Appendix below) to build the application jar with all required dependencies.
    
    Important: For Maven (and other build tools) to correctly package the dependencies into the application jar, these application dependencies must be specified in scope compile (unlike the core dependencies, which must be specified in scope provided).
    
    Scala Versions
    Scala versions (2.11, 2.12, etc.) are not binary compatible with one another. For that reason, Flink for Scala 2.11 cannot be used with an application that uses Scala 2.12.
    
    All Flink dependencies that (transitively) depend on Scala are suffixed with the Scala version that they are built for, for example flink-streaming-scala_2.11.
    
    Developers that only use Java can pick any Scala version, Scala developers need to pick the Scala version that matches their application’s Scala version.
    
    Please refer to the build guide for details on how to build Flink for a specific Scala version.
    

    https://flink.apache.org/downloads.html#apache-flink-1121

  • 相关阅读:
    沟通是项目管理知识体系中的九大知识领域之一
    项目管理的三要素时间、成本、质量
    项目管理提升效率的几大关键点
    收到FRDMKL02Z
    【转】arm 开发工具比较(ADS vs RealviewMDK vs RVDS)
    你不能自己把自己放弃写在毕业季
    【转】为什么你应该(从现在开始就)写博客
    Vivado 2014.4 FFT IP 使用及仿真
    项目需求的一些事
    娇荣电子工作室成立了
  • 原文地址:https://www.cnblogs.com/buzhouke/p/14321856.html
Copyright © 2011-2022 走看看