  • [Spark] Consuming Kafka in real time from Spark Streaming with the Receiver approach (yarn-cluster)

    1. Start ZooKeeper
    2. Start the Kafka service (broker)
    [root@master kafka_2.11-0.10.2.1]# ./bin/kafka-server-start.sh config/server.properties
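
Step 1 gives no command. The following is a minimal sketch of steps 1 and 2, assuming ZooKeeper is the copy bundled with the Kafka distribution. The function names and the `KAFKA_HOME` variable are illustrative and do not come from the original post; a standalone ZooKeeper install would be started differently.

```shell
#!/bin/sh
# Sketch only: start ZooKeeper and the Kafka broker in the background.
# Assumption: ZooKeeper is the one shipped with Kafka.
# KAFKA_HOME is a hypothetical variable; it defaults to the current directory,
# which matches running from the kafka_2.11-0.10.2.1 directory as in the post.
KAFKA_HOME="${KAFKA_HOME:-.}"

start_zookeeper() {
    # -daemon detaches the process so the terminal stays free
    "$KAFKA_HOME/bin/zookeeper-server-start.sh" -daemon \
        "$KAFKA_HOME/config/zookeeper.properties"
}

start_broker() {
    # The broker needs ZooKeeper to be reachable before it can register itself
    "$KAFKA_HOME/bin/kafka-server-start.sh" -daemon \
        "$KAFKA_HOME/config/server.properties"
}
```

Call `start_zookeeper` first, give it a few seconds to begin listening, then call `start_broker`.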
    
    3. Start a Kafka producer (prerequisite: the topic has already been created)
    [root@master kafka_2.11-0.10.2.1]# ./bin/kafka-console-producer.sh --broker-list master:9092 --topic test
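
Step 3 assumes the topic already exists but does not show how it was created. One way to create it might look like the sketch below. The replication factor and partition count of 1 are assumptions suited to a single-broker test setup, and the helper names and `KAFKA_HOME` are illustrative.

```shell
#!/bin/sh
# Sketch only: create and list topics through ZooKeeper, as Kafka 0.10.x tooling does.
# KAFKA_HOME is a hypothetical variable defaulting to the current directory.
KAFKA_HOME="${KAFKA_HOME:-.}"

create_topic() {
    # $1: topic name. One partition and one replica are assumed values.
    "$KAFKA_HOME/bin/kafka-topics.sh" --create \
        --zookeeper master:2181 \
        --replication-factor 1 \
        --partitions 1 \
        --topic "$1"
}

list_topics() {
    # Prints the topics currently registered in ZooKeeper
    "$KAFKA_HOME/bin/kafka-topics.sh" --list --zookeeper master:2181
}
```

Running `create_topic test` and then `list_topics` from the Kafka directory should show `test` before the producer is started.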
    
    4. Start a Kafka consumer
    [root@master kafka_2.11-0.10.2.1]# ./bin/kafka-console-consumer.sh --zookeeper master:2181 --topic test --from-beginning
    
    5. Build the jar and upload the jar with dependencies to the cluster
    mvn clean assembly:assembly
    
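    6. Write a launch script and start the job: sh run_receiver.sh
    # Note: --total-executor-cores is honored only by standalone and Mesos masters;
    # on YARN, executor sizing comes from --num-executors and --executor-cores.
    /usr/local/src/spark-2.0.2-bin-hadoop2.6/bin/spark-submit \
            --class com.skyell.streaming.ReceiverFromKafka \
            --master yarn \
            --deploy-mode cluster \
            --executor-memory 1G \
            --total-executor-cores 2 \
            --files $HIVE_HOME/conf/hive-site.xml \
            ./Spark8Pro-2.0-SNAPSHOT-jar-with-dependencies.jar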
    
    Monitor the job and view its logs

    http://master:8088/cluster
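
Besides the web UI, the same information can usually be read from the command line. The sketch below wraps two standard `yarn` subcommands; the function names are illustrative. It assumes log aggregation is enabled on the cluster, and aggregated logs are typically complete only after the application has finished.

```shell
#!/bin/sh
# Sketch only: inspect a submitted job without the web UI.

show_status() {
    # $1: application ID, e.g. application_1539421032843_0093
    yarn application -status "$1"
}

show_logs() {
    # Assumes yarn.log-aggregation-enable is true on the cluster
    yarn logs -applicationId "$1"
}
```

For example, `show_logs application_1539421032843_0093 | less` pages through the driver and executor output.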

    Stop the Spark Streaming job
    yarn application -kill application_1539421032843_0093
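
The application ID changes on every submission, so hard-coding it gets tedious. A possible helper, sketched below, looks the ID up by application name. It assumes `yarn application -list` prints the ID in the first column and the name in the second, and that the name contains no spaces. Both function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: kill running YARN applications by name instead of by ID.

find_app_id() {
    # $1: application name; reads `yarn application -list` output on stdin.
    # Header lines are skipped because their first field does not start with "application_".
    awk -v name="$1" '$1 ~ /^application_/ && $2 == name { print $1 }'
}

kill_app_by_name() {
    # Kills every RUNNING application whose name matches $1 exactly
    yarn application -list -appStates RUNNING 2>/dev/null |
        find_app_id "$1" |
        while read -r app_id; do
            yarn application -kill "$app_id"
        done
}
```

Pass whatever name appears in the `yarn application -list` output, for example `kill_app_by_name com.skyell.streaming.ReceiverFromKafka` if the job shows up under its main class name.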
    

    Data-driven change - 云将 (author's personal blog)

  • Original post: https://www.cnblogs.com/skyell/p/10048189.html