Configuring monitoring
1. Edit flume-env.sh
export JAVA_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=5445 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"
2. Start the agent with the following command
flume-ng agent -n a1 -c $FLUME_HOME/conf -f $FLUME_HOME/conf/exec-memory-hdfs-partition.conf -Dflume.root.logger=INFO,console -Dflume.monitoring.type=http -Dflume.monitoring.port=1234
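Metrics overview
OS and JVM metrics
objectName | Metric | Description |
---|---|---|
java.lang:type=OperatingSystem | FreePhysicalMemorySize | Free physical memory |
java.lang:type=OperatingSystem | SystemCpuLoad | CPU utilization of the whole system |
java.lang:type=OperatingSystem | ProcessCpuLoad | CPU utilization of the agent's JVM process |
java.lang:type=GarbageCollector,name=PS Scavenge | CollectionCount | Number of garbage collections run by this collector |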
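The attributes in the table above are raw values. According to the `com.sun.management.OperatingSystemMXBean` documentation, `FreePhysicalMemorySize` is reported in bytes and the two CPU load attributes are fractions between 0.0 and 1.0 (negative when the value is unavailable). A minimal sketch for converting such readings into display units follows; the function names and the sample readings are made up for illustration and are not part of Flume or the JDK.

```shell
#!/bin/bash
# Hypothetical helpers; not part of Flume or the JDK.

# Convert a byte count (e.g. FreePhysicalMemorySize) to MiB, one decimal place.
bytes_to_mib() {
  awk -v b="$1" 'BEGIN { printf "%.1f\n", b / 1048576 }'
}

# Convert a load fraction (e.g. SystemCpuLoad) to a percentage.
# The JVM reports a negative value when the load is not available.
load_to_pct() {
  awk -v l="$1" 'BEGIN { if (l < 0) print "n/a"; else printf "%.1f\n", l * 100 }'
}

bytes_to_mib 2147483648   # made-up reading; prints 2048.0
load_to_pct 0.125         # made-up reading; prints 12.5
```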
JSON data format
# curl http://localhost:1234/metrics
{
"SOURCE.src-1":{
"OpenConnectionCount":"0",
"Type":"SOURCE",
"AppendBatchAcceptedCount":"1355",
"AppendBatchReceivedCount":"1355",
"EventAcceptedCount":"28286",
"AppendReceivedCount":"0",
"StopTime":"0",
"StartTime":"1442566410435",
"EventReceivedCount":"28286",
"AppendAcceptedCount":"0"
},
"CHANNEL.ch-1":{
"EventPutSuccessCount":"28286",
"ChannelFillPercentage":"0.0",
"Type":"CHANNEL",
"StopTime":"0",
"EventPutAttemptCount":"28286",
"ChannelSize":"0",
"StartTime":"1442566410326",
"EventTakeSuccessCount":"28286",
"ChannelCapacity":"1000000",
"EventTakeAttemptCount":"313734329512"
},
"SINK.sink-1":{
"Type":"SINK",
"ConnectionClosedCount":"0",
"EventDrainSuccessCount":"28286",
"KafkaEventSendTimer":"482493",
"BatchCompleteCount":"0",
"ConnectionFailedCount":"0",
"EventDrainAttemptCount":"0",
"ConnectionCreatedCount":"0",
"BatchEmptyCount":"0",
"StopTime":"0",
"RollbackCount":"9",
"StartTime":"1442566411897",
"BatchUnderflowCount":"0"
}
}
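For quick checks from a terminal, single values can be pulled out of this response with standard text tools. The sketch below assumes that every value is emitted as a quoted string, as in the sample response above; it does plain text matching and is not a real JSON parser. The function name `metric_values` and the trimmed `sample` string are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print every value reported for one metric name,
# one line per component that reports it. Reads the JSON on stdin.
metric_values() {
  grep -o "\"$1\":\"[^\"]*\"" | cut -d'"' -f4
}

# Trimmed copy of the sample response, for demonstration only.
sample='{"SOURCE.src-1":{"EventReceivedCount":"28286","StartTime":"1442566410435"},"CHANNEL.ch-1":{"ChannelSize":"0","StartTime":"1442566410326"}}'

printf '%s\n' "$sample" | metric_values ChannelSize   # prints 0
printf '%s\n' "$sample" | metric_values StartTime     # prints two lines, one per component
```

Against a running agent the same helper could be fed from the endpoint, for example `curl -s http://localhost:1234/metrics | metric_values ChannelFillPercentage`.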
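Metric descriptions
Source metrics
objectName (varies with the actual configuration) | Metric | Description |
---|---|---|
org.apache.flume.source:type=r1 | OpenConnectionCount | Number of connections currently open with clients or sinks |
org.apache.flume.source:type=r1 | AppendBatchAcceptedCount | Total number of batches successfully committed to the channel |
org.apache.flume.source:type=r1 | AppendBatchReceivedCount | Total number of event batches received |
org.apache.flume.source:type=r1 | AppendAcceptedCount | Total number of events that arrived one at a time and were written to the channel |
org.apache.flume.source:type=r1 | AppendReceivedCount | Total number of events received in batches containing a single event |
org.apache.flume.source:type=r1 | EventAcceptedCount | Total number of events successfully written to the channel |
org.apache.flume.source:type=r1 | EventReceivedCount | Total number of events the source has received so far |
org.apache.flume.source:type=r1 | StartTime | Time the source started, in milliseconds since the epoch |
org.apache.flume.source:type=r1 | StopTime | Time the source stopped, in milliseconds since the epoch; 0 means it is still running |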
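The two event counters are most useful when read together: if `EventAcceptedCount` falls behind `EventReceivedCount`, some received events were not written to the channel. The sketch below computes that ratio; `accept_pct` is a hypothetical helper name, and the first call uses the counters from the sample response.

```shell
#!/bin/bash
# Hypothetical helper: percentage of received events that reached the channel.
# Arguments: EventReceivedCount EventAcceptedCount
accept_pct() {
  awk -v recv="$1" -v acc="$2" 'BEGIN {
    if (recv == 0) { print "n/a"; exit }   # nothing received yet
    printf "%.2f\n", acc * 100 / recv
  }'
}

accept_pct 28286 28286   # counters from the sample response; prints 100.00
accept_pct 200 150       # made-up counters; prints 75.00
```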
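Channel metrics
objectName (varies with the actual configuration) | Metric | Description |
---|---|---|
org.apache.flume.channel:type=c1 | EventPutAttemptCount | Total number of events sources attempted to write to the channel |
org.apache.flume.channel:type=c1 | EventPutSuccessCount | Total number of events successfully written to the channel and committed |
org.apache.flume.channel:type=c1 | EventTakeAttemptCount | Total number of times sinks attempted to take events from the channel |
org.apache.flume.channel:type=c1 | EventTakeSuccessCount | Total number of events sinks successfully took from the channel |
org.apache.flume.channel:type=c1 | ChannelSize | Number of events currently in the channel |
org.apache.flume.channel:type=c1 | ChannelCapacity | Capacity of the channel |
org.apache.flume.channel:type=c1 | ChannelFillPercentage | Percentage of the channel capacity that is filled |
org.apache.flume.channel:type=c1 | StartTime | Time the channel started, in milliseconds since the epoch |
org.apache.flume.channel:type=c1 | StopTime | Time the channel stopped, in milliseconds since the epoch; 0 means it is still running |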
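A channel that keeps filling up usually means the sink cannot drain it fast enough, so the fill level is a natural value to alert on. The sketch below assumes the fill percentage is simply `ChannelSize` divided by `ChannelCapacity`; the helper name `channel_fill_check` and the default threshold of 80 percent are arbitrary choices for illustration.

```shell
#!/bin/bash
# Hypothetical helper. Arguments: ChannelSize ChannelCapacity [threshold-percent]
# Prints the fill percentage and returns non-zero when it exceeds the threshold.
channel_fill_check() {
  awk -v size="$1" -v cap="$2" -v limit="${3:-80}" 'BEGIN {
    if (cap == 0) { print "n/a"; exit 0 }   # capacity unknown
    pct = size * 100 / cap
    printf "%.1f\n", pct
    if (pct > limit) exit 1
  }'
}

channel_fill_check 0 1000000 && echo "channel OK"                  # sample values; prints 0.0
channel_fill_check 900000 1000000 || echo "channel filling up"     # made-up reading; prints 90.0
```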
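Sink metrics
objectName (varies with the actual configuration) | Metric | Description |
---|---|---|
org.apache.flume.sink:type=k1 | ConnectionCreatedCount | Number of connections created |
org.apache.flume.sink:type=k1 | ConnectionClosedCount | Number of connections closed |
org.apache.flume.sink:type=k1 | ConnectionFailedCount | Number of connections closed because of an error |
org.apache.flume.sink:type=k1 | BatchEmptyCount | Number of batches that contained no events; a high value means the source writes data more slowly than the sink processes it |
org.apache.flume.sink:type=k1 | BatchUnderflowCount | Number of batches that contained fewer events than the batch size |
org.apache.flume.sink:type=k1 | BatchCompleteCount | Number of batches that contained exactly the batch size |
org.apache.flume.sink:type=k1 | EventDrainAttemptCount | Total number of events the sink attempted to write to storage |
org.apache.flume.sink:type=k1 | EventDrainSuccessCount | Total number of events the sink successfully wrote to storage |
org.apache.flume.sink:type=k1 | StartTime | Time the sink started, in milliseconds since the epoch |
org.apache.flume.sink:type=k1 | StopTime | Time the sink stopped, in milliseconds since the epoch; 0 means it is still running |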
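The three batch counters can be read as a distribution: a large share of empty batches suggests the sink is mostly idle, while mostly complete batches suggest it is running at full batch size. The sketch below computes the share of empty batches; `batch_empty_pct` is a hypothetical helper name, and the first call uses the counters from the sample response, where no batches were recorded.

```shell
#!/bin/bash
# Hypothetical helper: share of sink batches that found the channel empty.
# Arguments: BatchEmptyCount BatchUnderflowCount BatchCompleteCount
batch_empty_pct() {
  awk -v e="$1" -v u="$2" -v c="$3" 'BEGIN {
    total = e + u + c
    if (total == 0) { print "n/a"; exit }   # no batches recorded
    printf "%.1f\n", e * 100 / total
  }'
}

batch_empty_pct 0 0 0       # counters from the sample response; prints n/a
batch_empty_pct 30 10 60    # made-up counters; prints 30.0
```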
Flume start/stop script:
# vim flume.sh
#!/bin/bash
path=/usr/local/apache-flume-1.9.0-bin
JAR="flume"
Flumeconf="flume.conf"
agentname="agent1"

# Count running Java processes whose command line mentions $JAR.
function count_procs(){
    ps -ef | grep java | grep "$JAR" | wc -l
}

function start(){
    num=$(count_procs)
    if [ "$num" -eq 0 ]; then
        nohup "$path/bin/flume-ng" agent \
            -c "$path/conf" -f "$path/conf/$Flumeconf" -n "$agentname" \
            -Dflume.root.logger=INFO,LOGFILE -Dflume.log.dir="$path/logs" \
            -Dflume.monitoring.type=http -Dflume.monitoring.port=1234 >/dev/null 2>&1 &
        echo "start successful ......"
        echo "Log path: $path/logs/flume.log"
    else
        echo "Process already exists, start failed, please check ......"
        exit 1
    fi
}

function stop(){
    num=$(count_procs)
    if [ "$num" -ne 0 ]; then
        # Force-kill every matching process, as in the original script.
        ps -ef | grep java | grep "$JAR" | awk '{print $2}' | xargs kill -9
        echo "stop successful ......"
    else
        echo "Service is not running, nothing to stop ......"
    fi
}

function restart(){
    stop
    # Wait until the old process has gone before starting a new one.
    num=$(count_procs)
    while [ "$num" -gt 0 ]; do
        sleep 5
        num=$(count_procs)
    done
    start
    echo "restarted successful ......"
}

case "$1" in
    "start")   start ;;
    "stop")    stop ;;
    "restart") restart ;;
    *)         echo "Usage: $0 {start|stop|restart}" ;;
esac
# bash flume.sh start