  • Big Data Tools: Flume 1.4 Installation and Deployment Guide

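    1. Introduction

      flume-ng is a distributed, highly reliable, and efficient log collection system. The name denotes the new version of Flume: "ng" stands for "next generation". At the time of writing, flume-ng 1.4 is the latest release. flume-ng differs substantially from the original Flume. We had stayed on Flume 0.9 and never upgraded to flume-ng; a recent project required the upgrade, and I ran into a few problems along the way. They are recorded here to share.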

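    2. Version

      flume-ng 1.4.0

    3. Installation steps

      Downloading, unpacking, installing the JDK, and setting environment variables are already covered by many introductory articles and are not repeated here. One point worth noting: flume-ng does not require ZooKeeper, so no ZooKeeper setup is needed.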

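    4. The flume-ng bug

      After installation, running flume-ng prints an error. The cause is mainly a problem in the shell script. The complete modified flume-ng script is reproduced below; the line directly below each #zhangzl marker is one that needs to be changed. The complete script: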

      #!/bin/bash
      #
      #
      # Licensed to the Apache Software Foundation (ASF) under one
      # or more contributor license agreements.  See the NOTICE file
      # distributed with this work for additional information
      # regarding copyright ownership.  The ASF licenses this file
      # to you under the Apache License, Version 2.0 (the
      # "License"); you may not use this file except in compliance
      # with the License.  You may obtain a copy of the License at
      #
      #   http://www.apache.org/licenses/LICENSE-2.0
      #
      # Unless required by applicable law or agreed to in writing,
      # software distributed under the License is distributed on an
      # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
      # KIND, either express or implied.  See the License for the
      # specific language governing permissions and limitations
      # under the License.
      #

      ################################
      # constants
      ################################

      FLUME_AGENT_CLASS="org.apache.flume.node.Application"
      FLUME_AVRO_CLIENT_CLASS="org.apache.flume.client.avro.AvroCLIClient"
      FLUME_VERSION_CLASS="org.apache.flume.tools.VersionInfo"
      FLUME_TOOLS_CLASS="org.apache.flume.tools.FlumeToolsMain"

      CLEAN_FLAG=1
      ################################
      # functions
      ################################

      info() {
        if [ ${CLEAN_FLAG} -ne 0 ]; then
          local msg=$1
          echo "Info: $msg" >&2
        fi
      }

      warn() {
        if [ ${CLEAN_FLAG} -ne 0 ]; then
          local msg=$1
          echo "Warning: $msg" >&2
        fi
      }

      error() {
        local msg=$1
        local exit_code=$2

        echo "Error: $msg" >&2

        if [ -n "$exit_code" ] ; then
          exit $exit_code
        fi
      }

      # If avail, add Hadoop paths to the FLUME_CLASSPATH and to the
      # FLUME_JAVA_LIBRARY_PATH env vars.
      # Requires Flume jars to already be on FLUME_CLASSPATH.
      add_hadoop_paths() {
        local HADOOP_IN_PATH=$(PATH="${HADOOP_HOME:-${HADOOP_PREFIX}}/bin:$PATH" \
            which hadoop 2>/dev/null)

        if [ -f "${HADOOP_IN_PATH}" ]; then
          info "Including Hadoop libraries found via ($HADOOP_IN_PATH) for HDFS access"

          # determine hadoop java.library.path and use that for flume
          local HADOOP_CLASSPATH=""
          local HADOOP_JAVA_LIBRARY_PATH=$(HADOOP_CLASSPATH="$FLUME_CLASSPATH" \
              ${HADOOP_IN_PATH} org.apache.flume.tools.GetJavaProperty \
              java.library.path)

          # look for the line that has the desired property value
          # (considering extraneous output from some GC options that write to stdout)
          # IFS = InternalFieldSeparator (set to recognize only newline char as delimiter)
          IFS=$'\n'
          for line in $HADOOP_JAVA_LIBRARY_PATH; do
            # original line, kept for comparison; the quoted form below is the modified one
            #if [[ $line =~ ^java.library.path=(.*)$ ]]; then
            if [[ "$line" =~ "^java.library.path=(.*)$" ]]; then
              HADOOP_JAVA_LIBRARY_PATH=${BASH_REMATCH[1]}
              break
            fi
          done
          unset IFS

          if [ -n "${HADOOP_JAVA_LIBRARY_PATH}" ]; then
            FLUME_JAVA_LIBRARY_PATH="$FLUME_JAVA_LIBRARY_PATH:$HADOOP_JAVA_LIBRARY_PATH"
          fi

          # determine hadoop classpath
          HADOOP_CLASSPATH=$($HADOOP_IN_PATH classpath)

          # hack up and filter hadoop classpath
          local ELEMENTS=$(sed -e 's/:/ /g' <<<${HADOOP_CLASSPATH})
          local ELEMENT
          for ELEMENT in $ELEMENTS; do
            local PIECE
            for PIECE in $(echo $ELEMENT); do
                #zhangzl (pattern quoted; note that bash 3.2 and later match a quoted pattern as a literal string)
              if [[ $PIECE =~ "slf4j-(api|log4j12).*.jar" ]]; then
                info "Excluding $PIECE from classpath"
                continue
              else
                FLUME_CLASSPATH="$FLUME_CLASSPATH:$PIECE"
              fi
            done
          done

        fi
      }
      add_HBASE_paths() {
        local HBASE_IN_PATH=$(PATH="${HBASE_HOME}/bin:$PATH" \
            which hbase 2>/dev/null)

        if [ -f "${HBASE_IN_PATH}" ]; then
          info "Including HBASE libraries found via ($HBASE_IN_PATH) for HBASE access"

          # determine HBASE java.library.path and use that for flume
          local HBASE_CLASSPATH=""
          local HBASE_JAVA_LIBRARY_PATH=$(HBASE_CLASSPATH="$FLUME_CLASSPATH" \
              ${HBASE_IN_PATH} org.apache.flume.tools.GetJavaProperty \
              java.library.path)

          # look for the line that has the desired property value
          # (considering extraneous output from some GC options that write to stdout)
          # IFS = InternalFieldSeparator (set to recognize only newline char as delimiter)
          IFS=$'\n'
          for line in $HBASE_JAVA_LIBRARY_PATH; do
          #zhangzl (pattern quoted; note that bash 3.2 and later match a quoted pattern as a literal string)
            if [[ $line =~ "^java.library.path=(.*)$" ]]; then
              HBASE_JAVA_LIBRARY_PATH=${BASH_REMATCH[1]}
              break
            fi
          done
          unset IFS

          if [ -n "${HBASE_JAVA_LIBRARY_PATH}" ]; then
            FLUME_JAVA_LIBRARY_PATH="$FLUME_JAVA_LIBRARY_PATH:$HBASE_JAVA_LIBRARY_PATH"
          fi

          # determine HBASE classpath
          HBASE_CLASSPATH=$($HBASE_IN_PATH classpath)

          # hack up and filter HBASE classpath
          local ELEMENTS=$(sed -e 's/:/ /g' <<<${HBASE_CLASSPATH})
          local ELEMENT
          for ELEMENT in $ELEMENTS; do
            local PIECE
            for PIECE in $(echo $ELEMENT); do
            #zhangzl (pattern quoted; note that bash 3.2 and later match a quoted pattern as a literal string)
              if [[ $PIECE =~ "slf4j-(api|log4j12).*.jar" ]]; then
                info "Excluding $PIECE from classpath"
                continue
              else
                FLUME_CLASSPATH="$FLUME_CLASSPATH:$PIECE"
              fi
            done
          done
          FLUME_CLASSPATH="$FLUME_CLASSPATH:$HBASE_HOME/conf"

        fi
      }

      set_LD_LIBRARY_PATH(){
      #Append the FLUME_JAVA_LIBRARY_PATH to whatever the user may have specified in
      #flume-env.sh
        if [ -n "${FLUME_JAVA_LIBRARY_PATH}" ]; then
          export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${FLUME_JAVA_LIBRARY_PATH}"
        fi
      }

      display_help() {
        cat <<EOF
      Usage: $0 <command> [options]...

      commands:
        help                  display this help text
        agent                 run a Flume agent
        avro-client           run an avro Flume client
        version               show Flume version info

      global options:
        --conf,-c <conf>      use configs in <conf> directory
        --classpath,-C <cp>   append to the classpath
        --dryrun,-d           do not actually start Flume, just print the command
        --plugins-path <dirs> colon-separated list of plugins.d directories. See the
                              plugins.d section in the user guide for more details.
                              Default: $FLUME_HOME/plugins.d
        -Dproperty=value      sets a Java system property value
        -Xproperty=value      sets a Java -X option

      agent options:
        --conf-file,-f <file> specify a config file (required)
        --name,-n <name>      the name of this agent (required)
        --help,-h             display help text

      avro-client options:
        --rpcProps,-P <file>   RPC client properties file with server connection params
        --host,-H <host>       hostname to which events will be sent
        --port,-p <port>       port of the avro source
        --dirname <dir>        directory to stream to avro source
        --filename,-F <file>   text file to stream to avro source (default: std input)
        --headerFile,-R <file> File containing event headers as key/value pairs on each new line
        --help,-h              display help text

        Either --rpcProps or both --host and --port must be specified.

      Note that if <conf> directory is specified, then it is always included first
      in the classpath.

      EOF
      }

      run_flume() {
        local FLUME_APPLICATION_CLASS

        if [ "$#" -gt 0 ]; then
          FLUME_APPLICATION_CLASS=$1
          shift
        else
          error "Must specify flume application class" 1
        fi

        if [ ${CLEAN_FLAG} -ne 0 ]; then
          set -x
        fi
        $EXEC $JAVA_HOME/bin/java $JAVA_OPTS -cp "$FLUME_CLASSPATH" \
            -Djava.library.path=$FLUME_JAVA_LIBRARY_PATH "$FLUME_APPLICATION_CLASS" $*
      }

      ################################
      # main
      ################################

      # set default params
      FLUME_CLASSPATH=""
      FLUME_JAVA_LIBRARY_PATH=""
      JAVA_OPTS="-Xmx20m"
      LD_LIBRARY_PATH=""

      opt_conf=""
      opt_classpath=""
      opt_plugins_dirs=""
      opt_java_props=""
      opt_dryrun=""

      mode=$1
      shift

      case "$mode" in
        help)
          display_help
          exit 0
          ;;
        agent)
          opt_agent=1
          ;;
        node)
          opt_agent=1
          warn "The \"node\" command is deprecated. Please use \"agent\" instead."
          ;;
        avro-client)
          opt_avro_client=1
          ;;
        tool)
          opt_tool=1
          ;;
        version)
         opt_version=1
         CLEAN_FLAG=0
         ;;
        *)
          error "Unknown or unspecified command '$mode'"
          echo
          display_help
          exit 1
          ;;
      esac

      args=""
      while [ -n "$*" ] ; do
        arg=$1
        shift

        case "$arg" in
          --conf|-c)
            [ -n "$1" ] || error "Option --conf requires an argument" 1
            opt_conf=$1
            shift
            ;;
          --classpath|-C)
            [ -n "$1" ] || error "Option --classpath requires an argument" 1
            opt_classpath=$1
            shift
            ;;
          --dryrun|-d)
            opt_dryrun="1"
            ;;
          --plugins-path)
            opt_plugins_dirs=$1
            shift
            ;;
          -D*)
            opt_java_props="$opt_java_props $arg"
            ;;
          -X*)
            opt_java_props="$opt_java_props $arg"
            ;;
          *)
            args="$args $arg"
            ;;
        esac
      done

      # make opt_conf absolute
      if [[ -n "$opt_conf" && -d "$opt_conf" ]]; then
        opt_conf=$(cd $opt_conf; pwd)
      fi

      # allow users to override the default env vars via conf/flume-env.sh
      if [ -z "$opt_conf" ]; then
        warn "No configuration directory set! Use --conf <dir> to override."
      elif [ -f "$opt_conf/flume-env.sh" ]; then
        info "Sourcing environment configuration script $opt_conf/flume-env.sh"
        source "$opt_conf/flume-env.sh"
      fi

      # append command-line java options to stock or env script JAVA_OPTS
      if [ -n "${opt_java_props}" ]; then
        JAVA_OPTS="${JAVA_OPTS} ${opt_java_props}"
      fi

      # prepend command-line classpath to env script classpath
      if [ -n "${opt_classpath}" ]; then
        if [ -n "${FLUME_CLASSPATH}" ]; then
          FLUME_CLASSPATH="${opt_classpath}:${FLUME_CLASSPATH}"
        else
          FLUME_CLASSPATH="${opt_classpath}"
        fi
      fi

      if [ -z "${FLUME_HOME}" ]; then
        FLUME_HOME=$(cd $(dirname $0)/..; pwd)
      fi

      # prepend $FLUME_HOME/lib jars to the specified classpath (if any)
      if [ -n "${FLUME_CLASSPATH}" ] ; then
        FLUME_CLASSPATH="${FLUME_HOME}/lib/*:$FLUME_CLASSPATH"
      else
        FLUME_CLASSPATH="${FLUME_HOME}/lib/*"
      fi

      # load plugins.d directories
      PLUGINS_DIRS=""
      if [ -n "${opt_plugins_dirs}" ]; then
        PLUGINS_DIRS=$(sed -e 's/:/ /g' <<<${opt_plugins_dirs})
      else
        PLUGINS_DIRS="${FLUME_HOME}/plugins.d"
      fi

      unset plugin_lib plugin_libext plugin_native
      for PLUGINS_DIR in $PLUGINS_DIRS; do
        if [[ -d ${PLUGINS_DIR} ]]; then
          for plugin in ${PLUGINS_DIR}/*; do
            if [[ -d "$plugin/lib" ]]; then
              plugin_lib="${plugin_lib}${plugin_lib+:}${plugin}/lib/*"
            fi
            if [[ -d "$plugin/libext" ]]; then
              plugin_libext="${plugin_libext}${plugin_libext+:}${plugin}/libext/*"
            fi
            if [[ -d "$plugin/native" ]]; then
              plugin_native="${plugin_native}${plugin_native+:}${plugin}/native"
            fi
          done
        fi
      done

      if [[ -n "${plugin_lib}" ]]
      then
        FLUME_CLASSPATH="${FLUME_CLASSPATH}:${plugin_lib}"
      fi

      if [[ -n "${plugin_libext}" ]]
      then
        FLUME_CLASSPATH="${FLUME_CLASSPATH}:${plugin_libext}"
      fi

      if [[ -n "${plugin_native}" ]]
      then
        if [[ -n "${FLUME_JAVA_LIBRARY_PATH}" ]]
        then
          FLUME_JAVA_LIBRARY_PATH="${FLUME_JAVA_LIBRARY_PATH}:${plugin_native}"
        else
          FLUME_JAVA_LIBRARY_PATH="${plugin_native}"
        fi
      fi

      # find java
      if [ -z "${JAVA_HOME}" ] ; then
        warn "JAVA_HOME is not set!"
        # Try to use Bigtop to autodetect JAVA_HOME if it's available
        if [ -e /usr/libexec/bigtop-detect-javahome ] ; then
          . /usr/libexec/bigtop-detect-javahome
        elif [ -e /usr/lib/bigtop-utils/bigtop-detect-javahome ] ; then
          . /usr/lib/bigtop-utils/bigtop-detect-javahome
        fi

        # Using java from path if bigtop is not installed or couldn't find it
        if [ -z "${JAVA_HOME}" ] ; then
          JAVA_DEFAULT=$(type -p java)
          [ -n "$JAVA_DEFAULT" ] || error "Unable to find java executable. Is it in your PATH?" 1
          JAVA_HOME=$(cd $(dirname $JAVA_DEFAULT)/..; pwd)
        fi
      fi

      # look for hadoop libs
      add_hadoop_paths
      add_HBASE_paths

      # prepend conf dir to classpath
      if [ -n "$opt_conf" ]; then
        FLUME_CLASSPATH="$opt_conf:$FLUME_CLASSPATH"
      fi

      set_LD_LIBRARY_PATH
      # allow dryrun
      EXEC="exec"
      if [ -n "${opt_dryrun}" ]; then
        warn "Dryrun mode enabled (will not actually initiate startup)"
        EXEC="echo"
      fi

      # finally, invoke the appropriate command
      if [ -n "$opt_agent" ] ; then
        run_flume $FLUME_AGENT_CLASS $args
      elif [ -n "$opt_avro_client" ] ; then
        run_flume $FLUME_AVRO_CLIENT_CLASS $args
      elif [ -n "${opt_version}" ] ; then
        run_flume $FLUME_VERSION_CLASS $args
      elif [ -n "${opt_tool}" ] ; then
        run_flume $FLUME_TOOLS_CLASS $args
      else
        error "This message should never appear" 1
      fi

      exit 0
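
      The #zhangzl changes in the script concern how the pattern on the right-hand side of =~ is written. How bash treats a quoted pattern depends on the bash version, so a commonly recommended way to get the same behaviour across versions is to store the pattern in a variable and expand it unquoted. The sketch below illustrates that approach only; the helper names is_slf4j_jar and extract_library_path and the sample paths are hypothetical and are not part of the flume-ng script.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers, not part of the shipped flume-ng script.

# Return 0 when the classpath entry looks like an slf4j api/log4j12 jar.
is_slf4j_jar() {
  local re='slf4j-(api|log4j12).*\.jar'   # pattern kept in a variable
  [[ $1 =~ $re ]]                         # expanded unquoted, so it is used as a regex
}

# Print the value of a "java.library.path=..." line; return 1 for any other line.
extract_library_path() {
  local re='^java\.library\.path=(.*)$'
  if [[ $1 =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"
  else
    return 1
  fi
}

# Sample (made-up) classpath entries.
for piece in /opt/hadoop/lib/slf4j-log4j12-1.6.1.jar /opt/hadoop/lib/guava-11.0.2.jar; do
  if is_slf4j_jar "$piece"; then
    echo "exclude $piece"
  else
    echo "keep    $piece"
  fi
done

extract_library_path "java.library.path=/usr/lib/hadoop/lib/native"
```

      If this style were adopted in flume-ng itself, the same two patterns would replace the quoted strings under the #zhangzl markers; that substitution is a suggestion, not something the original post tested.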

    5. Test configuration file

      Create a file named example-conf.properties in the conf directory with the following properties:

      # Name the components on this agent
      a1.sources = r1
      a1.sinks = k1
      a1.channels = c1

      # Describe/configure the source
      a1.sources.r1.type = avro
      a1.sources.r1.bind = localhost
      a1.sources.r1.port = 44444

      # Describe the sink
      # Write the events to the log
      a1.sinks.k1.type = logger


      # Use a channel which buffers events in memory
      a1.channels.c1.type = memory
      a1.channels.c1.capacity = 1000
      a1.channels.c1.transactionCapacity = 100

      # Bind the source and sink to the channel
      a1.sources.r1.channels = c1
      a1.sinks.k1.channel = c1
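
      The last two properties wire the source and the sink to the channel, and a typo there is easy to miss. As a rough sanity check, the sketch below reads the relevant keys and compares them with the declared channel. It assumes a single channel, as in this example; the helper name prop is hypothetical, and the temporary file only stands in for example-conf.properties.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helper for checking the channel wiring.

# Print the value of key $2 from properties file $1 (first match wins).
prop() {
  awk -F'=' -v key="$2" '
    /^[ \t]*#/ { next }                      # skip comment lines
    {
      k = $1
      gsub(/^[ \t]+|[ \t]+$/, "", k)         # trim the key
      if (k == key) {
        v = substr($0, index($0, "=") + 1)   # everything after the first "="
        gsub(/^[ \t]+|[ \t]+$/, "", v)       # trim the value
        print v
        exit
      }
    }' "$1"
}

# Stand-in for conf/example-conf.properties (only the wiring-related keys).
conf=$(mktemp)
cat > "$conf" <<'EOF'
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF

declared=$(prop "$conf" a1.channels)
for key in a1.sources.r1.channels a1.sinks.k1.channel; do
  used=$(prop "$conf" "$key")
  if [ "$used" = "$declared" ]; then
    echo "ok: $key -> $used"
  else
    echo "mismatch: $key -> '$used' (declared: '$declared')"
  fi
done
rm -f "$conf"
```

      Note that the source key is plural (channels) while the sink key is singular (channel), exactly as in the configuration above.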

    6. Run commands

      6.1 Start the agent

    [hadoop@hadoop1 conf]$ flume-ng agent -n a1 -f example-conf.properties

      6.2 Start the avro-client to send data to the agent (run this in a separate terminal window)

    [hadoop@hadoop1 conf]$ flume-ng avro-client -H localhost -p 44444 -F file01

    7. Viewing the results

      14/01/16 22:26:34 INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:54289 => /127.0.0.1:44444] OPEN
      14/01/16 22:26:34 INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:54289 => /127.0.0.1:44444] BOUND: /127.0.0.1:44444
      14/01/16 22:26:34 INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:54289 => /127.0.0.1:44444] CONNECTED: /127.0.0.1:54289
      14/01/16 22:26:36 INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:54289 :> /127.0.0.1:44444] DISCONNECTED
      14/01/16 22:26:36 INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:54289 :> /127.0.0.1:44444] UNBOUND
      14/01/16 22:26:36 INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:54289 :> /127.0.0.1:44444] CLOSED
      14/01/16 22:26:36 INFO ipc.NettyServer: Connection to /127.0.0.1:54289 disconnected.
      14/01/16 22:26:38 INFO sink.LoggerSink: Event: { headers:{} body: 68 65 6C 6C 6F 20 77 6F 72 6C 64                hello world }
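
      The last log line shows the event body twice: as hex bytes and as text. As a small cross-check, the sketch below reproduces the hex form of "hello world" with standard tools (od, tr, sed). The helper name hex_body is hypothetical and is not part of Flume.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helper that prints a string as upper-case hex byte pairs,
# in the same style as the body dump in the LoggerSink line.
hex_body() {
  printf '%s' "$1" \
    | od -An -v -tx1 \
    | tr -d '\n' \
    | tr -s ' ' \
    | sed -e 's/^ //' -e 's/ $//' \
    | tr 'a-f' 'A-F'
}

echo "$(hex_body 'hello world')"
# prints: 68 65 6C 6C 6F 20 77 6F 72 6C 64
```

      The result matches the body field in the log, which suggests that file01 contained the single line "hello world".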
  • Original article: https://www.cnblogs.com/hadoopdev/p/3524041.html