  • Installing a Storm cluster

    Storm overview diagram


    Basic Storm concepts

      Topologies: a topology, i.e. one complete streaming job
      Spouts: the message sources of a topology (see the sketch after this list)
      Bolts: the processing units of a topology
      Tuple: the message unit; the format in which data is passed between spouts and bolts
      Streams: unbounded sequences of tuples
      Stream groupings: the strategy for partitioning a stream among the tasks of a bolt
      Tasks: task instances, the units that actually process the data
      Executors: worker threads
      Workers: worker processes
      Configuration: the configuration of a topology
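      A minimal sketch of what the spout and bolt concepts above look like in code, written against the standard Storm 2.x Java API (the class and field names here are invented for illustration and are not from the original post):

      import java.util.Map;
      import org.apache.storm.spout.SpoutOutputCollector;
      import org.apache.storm.task.TopologyContext;
      import org.apache.storm.topology.BasicOutputCollector;
      import org.apache.storm.topology.OutputFieldsDeclarer;
      import org.apache.storm.topology.base.BaseBasicBolt;
      import org.apache.storm.topology.base.BaseRichSpout;
      import org.apache.storm.tuple.Fields;
      import org.apache.storm.tuple.Tuple;
      import org.apache.storm.tuple.Values;

      // Spout: the message source of a topology; it emits a stream of tuples.
      class SentenceSpout extends BaseRichSpout {
          private SpoutOutputCollector collector;

          @Override
          public void open(Map<String, Object> conf, TopologyContext context, SpoutOutputCollector collector) {
              this.collector = collector;
          }

          @Override
          public void nextTuple() {
              // A real spout would read from a message queue (e.g. Kafka) here.
              collector.emit(new Values("the quick brown fox"));
          }

          @Override
          public void declareOutputFields(OutputFieldsDeclarer declarer) {
              declarer.declare(new Fields("sentence"));   // schema of the emitted tuples
          }
      }

      // Bolt: a processing unit; it consumes tuples and emits new tuples downstream.
      class SplitSentenceBolt extends BaseBasicBolt {
          @Override
          public void execute(Tuple tuple, BasicOutputCollector collector) {
              for (String word : tuple.getStringByField("sentence").split(" ")) {
                  collector.emit(new Values(word));       // each word becomes a new tuple on the outgoing stream
              }
          }

          @Override
          public void declareOutputFields(OutputFieldsDeclarer declarer) {
              declarer.declare(new Fields("word"));
          }
      }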

    Official site: http://storm.apache.org/
    Storm:
      Real-time, online computation for stream processing: data keeps arriving like flowing water, and Storm has to process it as it comes.
      Storm is rarely used on its own, because it does not store anything; typically data comes in from a message queue, is processed, and the results are written to MySQL or some other database.
      Apache Storm is a free and open-source distributed real-time computation system. It makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing. It is simple, can be used with any programming language, and is a lot of fun to use!
      Apache Storm has many use cases: real-time analytics, online machine learning, continuous computation, distributed RPC, ETL, and more. It is fast: a benchmark clocked it at over a million tuples processed per second per node. It is scalable, fault-tolerant, guarantees your data will be processed, and is easy to set up and operate.
      Apache Storm integrates with the queueing and database technologies you already use. A Storm topology consumes streams of data and processes those streams in arbitrarily complex ways, repartitioning the streams between each stage of the computation.


    Storm vs. Hadoop
    Topology vs. MapReduce
      A key difference: a MapReduce job eventually finishes, while a topology runs forever (unless it is killed manually).
    Nimbus vs. JobTracker
      A Storm cluster has two kinds of nodes: the control node (master node) and the worker nodes (by default each machine offers at most 4 worker slots). The control node runs a daemon called Nimbus, whose role is similar to Hadoop's JobTracker: it distributes code around the cluster, assigns computation tasks to machines, and monitors their status.
    Supervisor vs. TaskTracker
      Each worker node runs a daemon called the Supervisor. The Supervisor listens for the work assigned to its machine and starts/stops worker processes as needed. Each worker process executes a subset of a topology; a running topology consists of many worker processes spread over many machines.
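      A hypothetical sketch (the class name is invented; Config.setNumWorkers is the standard Storm Java API call) of how a topology requests worker processes, which Nimbus then assigns to free supervisor slots:

      import org.apache.storm.Config;

      public class WorkerConfigSketch {
          public static Config buildConf() {
              Config conf = new Config();
              // Run this topology as 3 worker processes (JVMs); Nimbus places each one
              // in a free slot (a port from supervisor.slots.ports) on some supervisor.
              conf.setNumWorkers(3);
              return conf;
          }
      }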

    Installation steps:

    1. Install a ZooKeeper cluster
    2. Download the Storm package and unpack it
    3. Edit the configuration file storm.yaml

    # ZooKeeper hosts used by the cluster
    storm.zookeeper.servers:
        - "hadoop01"
        - "hadoop02"
        - "hadoop03"

    # Host(s) running nimbus (Storm 2.x uses nimbus.seeds instead of the older nimbus.host)
    nimbus.seeds: ["hadoop01"]
    # 4 slots per supervisor by default; configure more if the machine can handle it
    supervisor.slots.ports:
        - 6701
        - 6702
        - 6703
        - 6704
        - 6705

    # Start Storm
    # On the nimbus host
    nohup ./storm nimbus 1>/dev/null 2>&1 &
    nohup ./storm ui 1>/dev/null 2>&1 &

    # On the supervisor hosts
    nohup ./storm supervisor 1>/dev/null 2>&1 &

    1. The ZooKeeper cluster was already installed earlier

    2. Download the Storm package and unpack it

    [linyouyi@hadoop01 software]$ wget https://mirrors.aliyun.com/apache/storm/apache-storm-2.0.0/apache-storm-2.0.0.tar.gz
    [linyouyi@hadoop01 software]$ ll
    total 739172
    -rw-rw-r-- 1 linyouyi linyouyi 312465430 Apr 30 06:17 apache-storm-2.0.0.tar.gz
    -rw-r--r-- 1 linyouyi linyouyi 218720521 Aug  3 17:56 hadoop-2.7.7.tar.gz
    -rw-rw-r-- 1 linyouyi linyouyi 132569269 Mar 18 14:28 hbase-2.0.5-bin.tar.gz
    -rw-r--r-- 1 linyouyi linyouyi  54701720 Aug  3 17:47 server-jre-8u144-linux-x64.tar.gz
    -rw-r--r-- 1 linyouyi linyouyi  37676320 Aug  8 09:36 zookeeper-3.4.14.tar.gz
    [linyouyi@hadoop01 software]$ tar -zxvf apache-storm-2.0.0.tar.gz -C /hadoop/module/
    [linyouyi@hadoop01 software]$ cd /hadoop/module/apache-storm-2.0.0
    [linyouyi@hadoop01 apache-storm-2.0.0]$ ll
    total 308
    drwxrwxr-x  2 linyouyi linyouyi  4096 Aug 12 21:11 bin
    drwxrwxr-x  2 linyouyi linyouyi  4096 Aug 12 21:11 conf
    -rw-r--r--  1 linyouyi linyouyi 91939 Apr 30 05:13 DEPENDENCY-LICENSES
    drwxr-xr-x 19 linyouyi linyouyi  4096 Apr 30 05:13 examples
    drwxrwxr-x 19 linyouyi linyouyi  4096 Aug 12 21:11 external
    drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:59 extlib
    drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:59 extlib-daemon
    drwxrwxr-x  2 linyouyi linyouyi  4096 Aug 12 21:11 lib
    drwxrwxr-x  5 linyouyi linyouyi  4096 Aug 12 21:11 lib-tools
    drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:59 lib-webapp
    drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:58 lib-worker
    -rw-r--r--  1 linyouyi linyouyi 82390 Apr 30 05:13 LICENSE
    drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:13 licenses
    drwxrwxr-x  2 linyouyi linyouyi  4096 Aug 12 21:11 log4j2
    -rw-r--r--  1 linyouyi linyouyi 34065 Apr 30 05:13 NOTICE
    drwxrwxr-x  6 linyouyi linyouyi  4096 Aug 12 21:11 public
    -rw-r--r--  1 linyouyi linyouyi  7914 Apr 30 05:13 README.markdown
    -rw-r--r--  1 linyouyi linyouyi     6 Apr 30 05:13 RELEASE
    -rw-r--r--  1 linyouyi linyouyi 23865 Apr 30 05:13 SECURITY.md

    3. Edit the configuration file storm.yaml

    [linyouyi@hadoop01 apache-storm-2.0.0]$ vim conf/storm.yaml
    # ZooKeeper servers
    storm.zookeeper.servers:
        - "hadoop01"
        - "hadoop02"
        - "hadoop03"
    nimbus.seeds: ["hadoop01"]
    #nimbus.seeds: ["host1", "host2", "host3"]
    
    [linyouyi@hadoop01 apache-storm-2.0.0]$ cd ../ 
    [linyouyi@hadoop01 module]$ scp -r apache-storm-2.0.0 linyouyi@hadoop02:/hadoop/module/
    [linyouyi@hadoop01 module]$ scp -r apache-storm-2.0.0 linyouyi@hadoop03:/hadoop/module/

    4. Start the services

    [linyouyi@hadoop01 module]$ cd apache-storm-2.0.0
    // If Storm complains that JAVA_HOME cannot be found, set it in conf/storm-env.sh
    [linyouyi@hadoop01 apache-storm-2.0.0]$ bin/storm nimbus &
    [linyouyi@hadoop01 apache-storm-2.0.0]$ jps
    30051 Nimbus
    44057 QuorumPeerMain
    30381 Jps
    [linyouyi@hadoop01 apache-storm-2.0.0]$ netstat -tnpl | grep 30684
    (Not all processes could be identified, non-owned process info
     will not be shown, you would have to be root to see it all.)
    tcp6       0      0 :::6627                 :::*                    LISTEN      30684/java
    [linyouyi@hadoop01 apache-storm-2.0.0]$ bin/storm ui &
    [linyouyi@hadoop01 apache-storm-2.0.0]$ jps
    32674 UIServer
    44057 QuorumPeerMain
    30684 Nimbus
    32989 Jps
    [linyouyi@hadoop01 apache-storm-2.0.0]$ netstat -tnpl | grep 32674
    tcp6       0      0 :::8080                 :::*                    LISTEN      32674/java
    // Open http://hadoop01:8080 in a browser; the worker slot counts are all 0. After starting the supervisors on hadoop02 and hadoop03 below, the slots are no longer 0.
    [linyouyi@hadoop02 apache-storm-2.0.0]$ bin/storm supervisor
    [linyouyi@hadoop02 apache-storm-2.0.0]$ jps
    70952 Jps
    70794 Supervisor
    34879 QuorumPeerMain
    [linyouyi@hadoop03 apache-storm-2.0.0]$ bin/storm supervisor
    [linyouyi@hadoop03 apache-storm-2.0.0]$ jps
    119587 QuorumPeerMain
    116291 Jps
    116143 Supervisor

     

    Common commands for submitting topologies to Storm

    // Command format: storm jar [path to topology jar] [topology main class] [topology args ...]
    [linyouyi@hadoop01 apache-storm-2.0.0]$ bin/storm jar --help
    usage: storm jar [-h] [--jars JARS] [--artifacts ARTIFACTS]
                     [--artifactRepositories ARTIFACTREPOSITORIES]
                     [--mavenLocalRepositoryDirectory MAVENLOCALREPOSITORYDIRECTORY]
                     [--proxyUrl PROXYURL] [--proxyUsername PROXYUSERNAME]
                     [--proxyPassword PROXYPASSWORD] [--storm-server-classpath]
                     [--config CONFIG] [-storm_config_opts STORM_CONFIG_OPTS]
                     topology-jar-path topology-main-class
                     [topology_main_args [topology_main_args ...]]
    
    positional arguments:
      topology-jar-path     will upload the jar at topology-jar-path when the
                            topology is submitted.
      topology-main-class   main class of the topology jar being submitted
      topology_main_args    Runs the main method with the specified arguments.
    
    optional arguments:
      --artifactRepositories ARTIFACTREPOSITORIES
                            When you need to pull the artifacts from other than
                            Maven Central, you can pass remote repositories to
                            --artifactRepositories option with a comma-separated
                            string. Repository format is "<name>^<url>". '^' is
                            taken as separator because URL allows various
                            characters. For example, --artifactRepositories
                            "jboss-repository^http://repository.jboss.com/maven2,H
                            DPRepo^http://repo.hortonworks.com/content/groups/publ
                            ic/" will add JBoss and HDP repositories for
                            dependency resolver.
      --artifacts ARTIFACTS
                            When you want to ship maven artifacts and its
                            transitive dependencies, you can pass them to
                            --artifacts with comma-separated string. You can also
                            exclude some dependencies like what you're doing in
                            maven pom. Please add exclusion artifacts with '^'
                            separated string after the artifact. For example,
                            -artifacts "redis.clients:jedis:2.9.0,org.apache.kafka
                            :kafka-clients:1.0.0^org.slf4j:slf4j-api" will load
                            jedis and kafka-clients artifact and all of transitive
                            dependencies but exclude slf4j-api from kafka.
      --config CONFIG       Override default storm conf file
      --jars JARS           When you want to ship other jars which are not
                            included to application jar, you can pass them to
                            --jars option with comma-separated string. For
                            example, --jars "your-local-jar.jar,your-local-
                            jar2.jar" will load your-local-jar.jar and your-local-
                            jar2.jar.
      --mavenLocalRepositoryDirectory MAVENLOCALREPOSITORYDIRECTORY
                            You can provide local maven repository directory via
                            --mavenLocalRepositoryDirectory if you would like to
                            use specific directory. It might help when you don't
                            have '.m2/repository' directory in home directory,
                            because CWD is sometimes non-deterministic (fragile).
      --proxyPassword PROXYPASSWORD
                            password of proxy if it requires basic auth
      --proxyUrl PROXYURL   You can also provide proxy information to let
                            dependency resolver utilizing proxy if needed. URL
                            representation of proxy ('http://host:port')
      --proxyUsername PROXYUSERNAME
                            username of proxy if it requires basic auth
      --storm-server-classpath
                            If for some reason you need to have the full storm
                            classpath, not just the one for the worker you may
                            include the command line option `--storm-server-
                            classpath`. Please be careful because this will add
                            things to the classpath that will not be on the worker
                            classpath and could result in the worker not running.
      -h, --help            show this help message and exit
      -storm_config_opts STORM_CONFIG_OPTS, -c STORM_CONFIG_OPTS
                            Override storm conf properties , e.g.
                            nimbus.ui.port=4443
    
    
    [linyouyi@hadoop01 apache-storm-2.0.0]$ storm jar /home/storm/storm-starter.jar storm.start.WordCountTopology.wordcountTop

    This submits storm-starter.jar to the remote cluster and starts the wordcountTop topology.
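    For reference, a hypothetical sketch of what the main class of such a topology might look like (the real storm-starter WordCountTopology differs; the spout and bolt below reuse the illustrative classes sketched in the concepts section above):

    import org.apache.storm.Config;
    import org.apache.storm.StormSubmitter;
    import org.apache.storm.topology.TopologyBuilder;

    public class WordCountTopologySketch {
        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("sentences", new SentenceSpout(), 1);        // 1 executor for the spout
            builder.setBolt("split", new SplitSentenceBolt(), 4)          // 4 executors for the split bolt
                   .shuffleGrouping("sentences");                         // stream grouping: random distribution
            // A counting bolt grouped by word would normally follow, e.g.
            // builder.setBolt("count", new WordCountBolt(), 4).fieldsGrouping("split", new Fields("word"));

            Config conf = new Config();
            conf.setNumWorkers(2);                                        // worker processes for this topology

            // "storm jar" uploads the jar and runs this main method on the client;
            // submitTopology registers the topology with Nimbus under the given name.
            StormSubmitter.submitTopology(args.length > 0 ? args[0] : "wordcountTop", conf, builder.createTopology());
        }
    }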

  • Source: https://www.cnblogs.com/linyouyi/p/11342906.html