  • Installing a Storm cluster

    Storm diagram


    Basic Storm concepts

      Topologies: a topology, informally "a job"
      Spouts: the message sources of a topology
      Bolts: the processing logic units of a topology
      tuple: a message tuple, the data format passed between Spouts and Bolts
      Streams: streams of tuples
      Stream groupings: the strategy for partitioning a stream among bolt tasks
      Tasks: the units that actually process the data
      Executor: a worker thread
      Workers: worker processes
      Configuration: the configuration of a topology
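
      The concepts above map onto code roughly as shown in the sketch below. This is not from the original post: it is a minimal, hypothetical example (class and component names are made up) written against the Storm 2.0 Java API, assuming the storm-client and storm-server jars are on the classpath. One spout emits tuples on a stream, one bolt consumes them, a stream grouping plus parallelism hints decide how executors receive the tuples, and the whole thing runs in-process with LocalCluster for a quick test.

      import java.util.Map;
      import org.apache.storm.Config;
      import org.apache.storm.LocalCluster;
      import org.apache.storm.spout.SpoutOutputCollector;
      import org.apache.storm.task.TopologyContext;
      import org.apache.storm.topology.BasicOutputCollector;
      import org.apache.storm.topology.OutputFieldsDeclarer;
      import org.apache.storm.topology.TopologyBuilder;
      import org.apache.storm.topology.base.BaseBasicBolt;
      import org.apache.storm.topology.base.BaseRichSpout;
      import org.apache.storm.tuple.Fields;
      import org.apache.storm.tuple.Tuple;
      import org.apache.storm.tuple.Values;
      import org.apache.storm.utils.Utils;

      public class ConceptDemoTopology {

          // Spout: the message source of the topology; emits one tuple per second.
          public static class SentenceSpout extends BaseRichSpout {
              private SpoutOutputCollector collector;

              @Override
              public void open(Map<String, Object> conf, TopologyContext context,
                               SpoutOutputCollector collector) {
                  this.collector = collector;
              }

              @Override
              public void nextTuple() {
                  Utils.sleep(1000);
                  collector.emit(new Values("hello storm"));   // a tuple with one field
              }

              @Override
              public void declareOutputFields(OutputFieldsDeclarer declarer) {
                  declarer.declare(new Fields("sentence"));    // schema of the output stream
              }
          }

          // Bolt: a processing logic unit; here it just prints each tuple it receives.
          public static class PrintBolt extends BaseBasicBolt {
              @Override
              public void execute(Tuple tuple, BasicOutputCollector collector) {
                  System.out.println("got: " + tuple.getStringByField("sentence"));
              }

              @Override
              public void declareOutputFields(OutputFieldsDeclarer declarer) {
                  // terminal bolt, emits nothing
              }
          }

          public static void main(String[] args) throws Exception {
              // Topology: wires spouts and bolts together.
              TopologyBuilder builder = new TopologyBuilder();
              builder.setSpout("sentence-spout", new SentenceSpout(), 1);   // 1 executor (thread)
              builder.setBolt("print-bolt", new PrintBolt(), 2)             // 2 executors
                     .shuffleGrouping("sentence-spout");                    // stream grouping strategy

              // Configuration: per-topology settings.
              Config conf = new Config();

              // Run in-process for a quick test; on a real cluster you would use StormSubmitter.
              try (LocalCluster cluster = new LocalCluster()) {
                  cluster.submitTopology("concept-demo", conf, builder.createTopology());
                  Utils.sleep(10000);   // let it run for 10 seconds, then shut down
              }
          }
      }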

    Official site: http://storm.apache.org/
    storm:
      Real-time, online computation for stream processing: data keeps arriving like flowing water, and Storm has to process it as it comes.
      Storm is rarely used on its own because it does not store anything; typically data flows in from a message queue, is processed, and the results are written to MySQL or some other database.
      Apache Storm is a free and open source distributed realtime computation system. It makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. It is simple, can be used with any programming language, and is a lot of fun to use!
      Apache Storm has many use cases: realtime analytics, online machine learning, continuous computation, distributed RPC, ETL, and more. It is fast: a benchmark clocked it at over a million tuples processed per second per node. It is scalable, fault-tolerant, guarantees your data will be processed, and is easy to set up and operate.
      Apache Storm integrates with the queueing and database technologies you already use. A Storm topology consumes streams of data and processes those streams in arbitrarily complex ways, repartitioning the streams between each stage of the computation as needed.


    Storm compared with Hadoop
    Topology vs. MapReduce
      A key difference: a MapReduce job eventually finishes, while a topology runs forever (until it is explicitly killed, e.g. with storm kill).
    Nimbus vs. JobTracker
      A Storm cluster has two kinds of nodes: the master node and the worker nodes (by default each worker machine offers up to 4 slots). The master node runs a daemon called Nimbus, which plays a role similar to Hadoop's JobTracker: it distributes code around the cluster, assigns computation tasks to machines, and monitors their status.
    Supervisor vs. TaskTracker
      Each worker node runs a daemon called the Supervisor. The Supervisor listens for work assigned to its machine and starts or stops worker processes as needed. Each worker process executes a subset of one topology; a running topology consists of many worker processes spread across many machines.
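
      As a minimal illustration (the class name and numbers below are made up, not from the install that follows), a topology asks Nimbus for worker processes through its Config; each granted worker occupies one supervisor slot (one of the supervisor.slots.ports configured below) and runs a subset of the topology's executors:

      import org.apache.storm.Config;

      public class WorkerRequestSketch {
          // Build a per-topology configuration; the numbers are illustrative only.
          public static Config buildConf() {
              Config conf = new Config();
              conf.setNumWorkers(3);   // ask Nimbus for 3 worker processes = 3 supervisor slots
              conf.setDebug(false);    // keep worker logs quiet
              return conf;
          }
      }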

    Installation steps:

    1. Install a ZooKeeper cluster
    2. Download the Storm package and extract it
    3. Edit the configuration file storm.yaml

    #ZooKeeper hosts used by the cluster
    storm.zookeeper.servers:
        - "hadoop01"
        - "hadoop02"
        - "hadoop03"

    #host(s) running nimbus (nimbus.host in pre-1.0 releases)
    nimbus.seeds: ["hadoop01"]
    #4 slots per supervisor by default; configure more according to machine capacity
    supervisor.slots.ports:
        - 6701
        - 6702
        - 6703
        - 6704
        - 6705

    #start storm
    #on the nimbus host
    nohup ./storm nimbus 1>/dev/null 2>&1 &
    nohup ./storm ui 1>/dev/null 2>&1 &

    #on the supervisor hosts
    nohup ./storm supervisor 1>/dev/null 2>&1 &

    1. The ZooKeeper cluster was already installed earlier

    2. Download the Storm package and extract it

    [linyouyi@hadoop01 software]$ wget https://mirrors.aliyun.com/apache/storm/apache-storm-2.0.0/apache-storm-2.0.0.tar.gz
    [linyouyi@hadoop01 software]$ ll
    total 739172
    -rw-rw-r-- 1 linyouyi linyouyi 312465430 Apr 30 06:17 apache-storm-2.0.0.tar.gz
    -rw-r--r-- 1 linyouyi linyouyi 218720521 Aug  3 17:56 hadoop-2.7.7.tar.gz
    -rw-rw-r-- 1 linyouyi linyouyi 132569269 Mar 18 14:28 hbase-2.0.5-bin.tar.gz
    -rw-r--r-- 1 linyouyi linyouyi  54701720 Aug  3 17:47 server-jre-8u144-linux-x64.tar.gz
    -rw-r--r-- 1 linyouyi linyouyi  37676320 Aug  8 09:36 zookeeper-3.4.14.tar.gz
    [linyouyi@hadoop01 software]$ tar -zxvf apache-storm-2.0.0.tar.gz -C /hadoop/module/
    [linyouyi@hadoop01 software]$ cd /hadoop/module/apache-storm-2.0.0
    [linyouyi@hadoop01 apache-storm-2.0.0]$ ll
    total 308
    drwxrwxr-x  2 linyouyi linyouyi  4096 Aug 12 21:11 bin
    drwxrwxr-x  2 linyouyi linyouyi  4096 Aug 12 21:11 conf
    -rw-r--r--  1 linyouyi linyouyi 91939 Apr 30 05:13 DEPENDENCY-LICENSES
    drwxr-xr-x 19 linyouyi linyouyi  4096 Apr 30 05:13 examples
    drwxrwxr-x 19 linyouyi linyouyi  4096 Aug 12 21:11 external
    drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:59 extlib
    drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:59 extlib-daemon
    drwxrwxr-x  2 linyouyi linyouyi  4096 Aug 12 21:11 lib
    drwxrwxr-x  5 linyouyi linyouyi  4096 Aug 12 21:11 lib-tools
    drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:59 lib-webapp
    drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:58 lib-worker
    -rw-r--r--  1 linyouyi linyouyi 82390 Apr 30 05:13 LICENSE
    drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:13 licenses
    drwxrwxr-x  2 linyouyi linyouyi  4096 Aug 12 21:11 log4j2
    -rw-r--r--  1 linyouyi linyouyi 34065 Apr 30 05:13 NOTICE
    drwxrwxr-x  6 linyouyi linyouyi  4096 Aug 12 21:11 public
    -rw-r--r--  1 linyouyi linyouyi  7914 Apr 30 05:13 README.markdown
    -rw-r--r--  1 linyouyi linyouyi     6 Apr 30 05:13 RELEASE
    -rw-r--r--  1 linyouyi linyouyi 23865 Apr 30 05:13 SECURITY.md

    3. Edit the configuration file storm.yaml

    [linyouyi@hadoop01 apache-storm-2.0.0]$ vim conf/storm.yaml
    #ZooKeeper servers
    storm.zookeeper.servers:
        - "hadoop01"
        - "hadoop02"
        - "hadoop03"
    nimbus.seeds: ["hadoop01"]
    #nimbus.seeds: ["host1", "host2", "host3"]
    
    [linyouyi@hadoop01 apache-storm-2.0.0]$ cd ../ 
    [linyouyi@hadoop01 module]$ scp -r apache-storm-2.0.0 linyouyi@hadoop02:/hadoop/module/
    [linyouyi@hadoop01 module]$ scp -r apache-storm-2.0.0 linyouyi@hadoop03:/hadoop/module/

    4. Start the services

    [linyouyi@hadoop01 module]$ cd apache-storm-2.0.0
    //If it complains that JAVA_HOME cannot be found, set it in conf/storm-env.sh
    [linyouyi@hadoop01 apache-storm-2.0.0]$ bin/storm nimbus &
    [linyouyi@hadoop01 apache-storm-2.0.0]$ jps
    30051 Nimbus
    44057 QuorumPeerMain
    30381 Jps
    [linyouyi@hadoop01 apache-storm-2.0.0]$ netstat -tnpl | grep 30684
    (Not all processes could be identified, non-owned process info
     will not be shown, you would have to be root to see it all.)
    tcp6       0      0 :::6627                 :::*                    LISTEN      30684/java
    [linyouyi@hadoop01 apache-storm-2.0.0]$ bin/storm ui &
    [linyouyi@hadoop01 apache-storm-2.0.0]$ jps
    32674 UIServer
    44057 QuorumPeerMain
    30684 Nimbus
    32989 Jps
    [linyouyi@hadoop01 apache-storm-2.0.0]$ netstat -tnpl | grep 32674
    tcp6       0      0 :::8080                 :::*                    LISTEN      32674/java
    //Open http://hadoop01:8080 in a browser: the slot counts are all 0. After starting the supervisor on hadoop02 and hadoop03 below, the slots are no longer 0
    [linyouyi@hadoop02 apache-storm-2.0.0]$ bin/storm supervisor
    [linyouyi@hadoop02 apache-storm-2.0.0]$ jps
    70952 Jps
    70794 Supervisor
    34879 QuorumPeerMain
    [linyouyi@hadoop03 apache-storm-2.0.0]$ bin/storm supervisor
    [linyouyi@hadoop03 apache-storm-2.0.0]$ jps
    119587 QuorumPeerMain
    116291 Jps
    116143 Supervisor

     

    Common commands for submitting topologies to Storm

    //Command format: storm jar [jar path] [topology package.main class] [main-method arguments, e.g. the topology name]
    [linyouyi@hadoop01 apache-storm-2.0.0]$ bin/storm jar --help
    usage: storm jar [-h] [--jars JARS] [--artifacts ARTIFACTS]
                     [--artifactRepositories ARTIFACTREPOSITORIES]
                     [--mavenLocalRepositoryDirectory MAVENLOCALREPOSITORYDIRECTORY]
                     [--proxyUrl PROXYURL] [--proxyUsername PROXYUSERNAME]
                     [--proxyPassword PROXYPASSWORD] [--storm-server-classpath]
                     [--config CONFIG] [-storm_config_opts STORM_CONFIG_OPTS]
                     topology-jar-path topology-main-class
                     [topology_main_args [topology_main_args ...]]
    
    positional arguments:
      topology-jar-path     will upload the jar at topology-jar-path when the
                            topology is submitted.
      topology-main-class   main class of the topology jar being submitted
      topology_main_args    Runs the main method with the specified arguments.
    
    optional arguments:
      --artifactRepositories ARTIFACTREPOSITORIES
                            When you need to pull the artifacts from other than
                            Maven Central, you can pass remote repositories to
                            --artifactRepositories option with a comma-separated
                            string. Repository format is "<name>^<url>". '^' is
                            taken as separator because URL allows various
                            characters. For example, --artifactRepositories
                            "jboss-repository^http://repository.jboss.com/maven2,H
                            DPRepo^http://repo.hortonworks.com/content/groups/publ
                            ic/" will add JBoss and HDP repositories for
                            dependency resolver.
      --artifacts ARTIFACTS
                            When you want to ship maven artifacts and its
                            transitive dependencies, you can pass them to
                            --artifacts with comma-separated string. You can also
                            exclude some dependencies like what you're doing in
                            maven pom. Please add exclusion artifacts with '^'
                            separated string after the artifact. For example,
                            -artifacts "redis.clients:jedis:2.9.0,org.apache.kafka
                            :kafka-clients:1.0.0^org.slf4j:slf4j-api" will load
                            jedis and kafka-clients artifact and all of transitive
                            dependencies but exclude slf4j-api from kafka.
      --config CONFIG       Override default storm conf file
      --jars JARS           When you want to ship other jars which are not
                            included to application jar, you can pass them to
                            --jars option with comma-separated string. For
                            example, --jars "your-local-jar.jar,your-local-
                            jar2.jar" will load your-local-jar.jar and your-local-
                            jar2.jar.
      --mavenLocalRepositoryDirectory MAVENLOCALREPOSITORYDIRECTORY
                            You can provide local maven repository directory via
                            --mavenLocalRepositoryDirectory if you would like to
                            use specific directory. It might help when you don't
                            have '.m2/repository' directory in home directory,
                            because CWD is sometimes non-deterministic (fragile).
      --proxyPassword PROXYPASSWORD
                            password of proxy if it requires basic auth
      --proxyUrl PROXYURL   You can also provide proxy information to let
                            dependency resolver utilizing proxy if needed. URL
                            representation of proxy ('http://host:port')
      --proxyUsername PROXYUSERNAME
                            username of proxy if it requires basic auth
      --storm-server-classpath
                            If for some reason you need to have the full storm
                            classpath, not just the one for the worker you may
                            include the command line option `--storm-server-
                            classpath`. Please be careful because this will add
                            things to the classpath that will not be on the worker
                            classpath and could result in the worker not running.
      -h, --help            show this help message and exit
      -storm_config_opts STORM_CONFIG_OPTS, -c STORM_CONFIG_OPTS
                            Override storm conf properties , e.g.
                            nimbus.ui.port=4443
    
    
    [linyouyi@hadoop01 apache-storm-2.0.0]$ bin/storm jar /home/storm/storm-starter.jar storm.starter.WordCountTopology wordcountTop

    Submit storm-starter.jar to the remote cluster and launch the wordcountTop topology
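
    For reference, a word-count main class of this kind might look roughly like the sketch below. This is not the actual storm-starter source; the class and component names are assumptions, written against the Storm 2.0 Java API. The first command-line argument becomes the topology name that shows up in the Storm UI (wordcountTop above), and StormSubmitter ships the jar to Nimbus.

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.storm.Config;
    import org.apache.storm.StormSubmitter;
    import org.apache.storm.spout.SpoutOutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.BasicOutputCollector;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.TopologyBuilder;
    import org.apache.storm.topology.base.BaseBasicBolt;
    import org.apache.storm.topology.base.BaseRichSpout;
    import org.apache.storm.tuple.Fields;
    import org.apache.storm.tuple.Tuple;
    import org.apache.storm.tuple.Values;
    import org.apache.storm.utils.Utils;

    public class WordCountTopology {

        // Spout emitting one random sentence every 100 ms.
        public static class RandomSentenceSpout extends BaseRichSpout {
            private static final String[] SENTENCES = {
                    "the cow jumped over the moon",
                    "an apple a day keeps the doctor away"};
            private SpoutOutputCollector collector;

            @Override
            public void open(Map<String, Object> conf, TopologyContext context,
                             SpoutOutputCollector collector) {
                this.collector = collector;
            }

            @Override
            public void nextTuple() {
                Utils.sleep(100);
                String s = SENTENCES[(int) (Math.random() * SENTENCES.length)];
                collector.emit(new Values(s));
            }

            @Override
            public void declareOutputFields(OutputFieldsDeclarer declarer) {
                declarer.declare(new Fields("sentence"));
            }
        }

        // Bolt splitting each sentence into words.
        public static class SplitBolt extends BaseBasicBolt {
            @Override
            public void execute(Tuple tuple, BasicOutputCollector collector) {
                for (String word : tuple.getStringByField("sentence").split(" ")) {
                    collector.emit(new Values(word));
                }
            }

            @Override
            public void declareOutputFields(OutputFieldsDeclarer declarer) {
                declarer.declare(new Fields("word"));
            }
        }

        // Bolt counting words; fieldsGrouping sends the same word to the same task,
        // so this in-memory map keeps a correct per-word count.
        public static class CountBolt extends BaseBasicBolt {
            private final Map<String, Integer> counts = new HashMap<>();

            @Override
            public void execute(Tuple tuple, BasicOutputCollector collector) {
                String word = tuple.getStringByField("word");
                int count = counts.merge(word, 1, Integer::sum);
                collector.emit(new Values(word, count));
            }

            @Override
            public void declareOutputFields(OutputFieldsDeclarer declarer) {
                declarer.declare(new Fields("word", "count"));
            }
        }

        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("spout", new RandomSentenceSpout(), 1);
            builder.setBolt("split", new SplitBolt(), 2).shuffleGrouping("spout");
            builder.setBolt("count", new CountBolt(), 2).fieldsGrouping("split", new Fields("word"));

            Config conf = new Config();
            conf.setNumWorkers(2);   // two worker processes = two supervisor slots

            // The topology name comes from the storm jar command line (e.g. wordcountTop).
            String name = args.length > 0 ? args[0] : "wordcountTop";
            StormSubmitter.submitTopology(name, conf, builder.createTopology());
        }
    }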

  • Original article: https://www.cnblogs.com/linyouyi/p/11342906.html