Logstash:
- Supports multiple ways of collecting data: TCP/UDP, files, syslog, Windows EventLogs, STDIN, and others; once data has been received, it can filter and modify it;
- Written in JRuby; runs on the JVM;
- agent/server模型
Configuration skeleton:
input {
...
}
filter {
...
}
output {
...
}
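- Four types of plugins:
input, filter, codec, output
- Data types:
Array: [item1, item2, ...]
Boolean: true, false
Bytes:
Codec: encoder
Hash: key => value
Number:
Password:
Path: filesystem path
String: string
- Field references:
[fieldname]; nested fields are written as [field][subfield]
- Conditionals:
comparison: ==, !=, <, <=, >, >=
regular expression: =~, !~
inclusion: in, not in
boolean: and, or
grouping: ()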
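A minimal sketch of how a field reference and a conditional combine inside a filter block. The type value "system" and the tag name "system_error" are assumptions made for illustration; mutate with add_tag is used here only as a convenient way to make the effect visible.

```
filter {
  # [type] and [message] are field references;
  # == is a comparison, =~ a regular-expression match, and joins the two tests
  if [type] == "system" and [message] =~ /error/ {
    mutate {
      # hypothetical tag name, for illustration only
      add_tag => ["system_error"]
    }
  }
}
```

Events that do not satisfy the condition pass through the filter section unchanged.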
Logstash workflow:
input | filter | output, similar to a shell pipeline; if the data needs no extra processing, filter can be omitted;
input {
stdin {}
}
output {
stdout {
codec => rubydebug
}
}
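Logstash plugins:
- input plugins:
1. File: reads a stream of events from the specified files;
Uses FileWatch (a Ruby gem) to watch the files for changes.
.sincedb: records the inode, major number, minor number, and pos (current read offset) of every watched file;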
input {
file {
path => ["/var/log/messages"]
type => "system"
start_position => "beginning"
}
}
output {
stdout {
codec => rubydebug
}
}
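Where the .sincedb file is kept can be set explicitly with the file input's sincedb_path option. A sketch under that assumption; the path below is hypothetical:

```
input {
  file {
    path => ["/var/log/messages"]
    type => "system"
    start_position => "beginning"
    # hypothetical location for the sincedb file
    sincedb_path => "/var/lib/logstash/sincedb-messages"
  }
}
```

Note that start_position => "beginning" only applies to files that have no sincedb entry yet. When re-running a test against the same file, the recorded position wins unless the sincedb file is removed first.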
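2. udp: reads messages from the network over UDP. Its required parameter is port, the port to listen on; host specifies the address to listen on;
collectd: a performance-monitoring daemon that can send the current host's performance data to logstash over UDP;
CentOS 7, EPEL repository: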
# yum install collectd -y
# vim /etc/collectd.conf
Hostname "node3.magedu.com"
LoadPlugin syslog
LoadPlugin cpu
LoadPlugin df
LoadPlugin interface
LoadPlugin load
LoadPlugin memory
LoadPlugin network
<Plugin network>
<Server "172.16.100.70" "25826">
# 172.16.100.70 is the address of the logstash host; 25826 is the UDP port it listens on;
</Server>
</Plugin>
Include "/etc/collectd.d"
# systemctl start collectd.service
# On the logstash side:
input {
udp {
port => 25826
codec => collectd {}
type => "collectd"
}
}
output {
stdout {
codec => rubydebug
}
}
3. redis plugin:
Reads data from redis; supports both redis channels and lists
input {
redis {
host => "localhost"
port => "6379"
data_type => "list"
key => "redisdata"
}
}
output {
stdout {
codec => rubydebug
}
}
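The example above uses a list. For comparison, a sketch of the channel variant, reusing the same host and key as assumptions:

```
input {
  redis {
    host => "localhost"
    port => 6379
    # "list" pops entries with BLPOP; "channel" subscribes with SUBSCRIBE
    data_type => "channel"
    key => "redisdata"
  }
}
```

The two modes behave differently when logstash is down: a list keeps queued entries until they are read, whereas messages published to a channel with no subscriber connected are lost. For buffering logs between a shipper and an indexer, a list is usually the safer choice.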
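- filter plugins:
- Apply some processing to an event before it is sent out through an output, e.g. grok.
- grok: parses and structures text data; currently the go-to choice in logstash for turning unstructured log data into structured, queryable data.
- e.g. syslog, apache, nginx logs
- Pattern definitions are located at: /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-0.3.0/patterns/grok-patterns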
Syntax:
%{SYNTAX:SEMANTIC}
SYNTAX: the name of a predefined pattern;
SEMANTIC: a custom identifier (field name) for the matched text;
1.1.1.1 GET /index.html 30 0.23
%{IP:clientip} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}
input {
stdin {}
}
filter {
grok {
match => { "message" => "%{IP:clientip} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
}
}
output {
stdout {
codec => rubydebug
}
}
{
"message" => "1.1.1.1 GET /index.html 30 0.23",
"@version" => "1",
"@timestamp" => "2018-01-27T02:13:52.558Z",
"host" => "node1",
"clientip" => "1.1.1.1",
"method" => "GET",
"request" => "/index.html",
"bytes" => "30",
"duration" => "0.23"
}
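In the output above, bytes and duration are strings, because grok stores every capture as a string by default. Grok also accepts an optional third part, %{SYNTAX:SEMANTIC:TYPE}, where TYPE is int or float. A sketch of the same filter with conversions applied:

```
filter {
  grok {
    # :int and :float convert the captured text to numbers
    match => { "message" => "%{IP:clientip} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes:int} %{NUMBER:duration:float}" }
  }
}
```

With this in place the two fields should be emitted as numbers rather than quoted strings, which matters once the events are indexed and aggregated.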
grok: dropping the original message field once it has been parsed
remove_field => "message"
grok {
match => { "message" => "XXX" }
remove_field => "message"
}
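Custom grok patterns:
grok patterns are written as regular expressions; their metacharacters differ little from those of other regex-based tools such as awk/sed/grep/pcre.
In a pattern file: PATTERN_NAME the pattern here
Inline, as a named capture: (?<field_name>the pattern here)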
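One way to use a custom pattern without editing the bundled grok-patterns file is to keep it in a separate directory and point grok at it with the patterns_dir option. A sketch; the directory ./patterns, the file name extra, the pattern name REQUEST_ID, and the field name request_id are all hypothetical:

```
# ./patterns/extra (hypothetical file) contains the single line:
#   REQUEST_ID [0-9a-f]{8}

filter {
  grok {
    # directories searched for additional pattern files
    patterns_dir => ["./patterns"]
    match => { "message" => "%{IP:clientip} %{REQUEST_ID:request_id}" }
  }
}
```

For a one-off pattern, the inline named-capture form avoids the extra file: (?<request_id>[0-9a-f]{8}) placed directly in the match string captures the same text into the same field.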
# Match apache logs
input {
file {
path => ["/var/log/httpd/access_log"]
type => "apachelog"
start_position => "beginning"
}
}
filter {
grok {
match => { "message" => "%{COMBINEDAPACHELOG}" }
}
}
output {
stdout {
codec => rubydebug
}
}
# Match nginx logs
How to match nginx logs:
Append the following to the end of the file /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-0.3.0/patterns/grok-patterns:
NGUSERNAME [a-zA-Z\.\@\-\+_%]+
NGUSER %{NGUSERNAME}
NGINXACCESS %{IPORHOST:clientip} - %{NOTSPACE:remote_user} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{NOTSPACE:http_x_forwarded_for}
input {
file {
path => ["/var/log/nginx/access.log"]
type => "nginxlog"
start_position => "beginning"
}
}
filter {
grok {
match => { "message" => "%{NGINXACCESS}" }
}
}
output {
stdout {
codec => rubydebug
}
}
- output plugins:
1. redis plugin: writes data to redis; supports both redis channels and lists
input {
file {
path => ["/var/log/nginx/access.log"]
type => "nginxlog"
start_position => "beginning"
}
}
output {
redis {
host => "localhost"
port => "6379"
data_type => "list"
key => "logstash-%{type}"
}
}
2. elasticsearch plugin:
Writes data to an elasticsearch cluster
input {
redis {
host => "localhost"
port => "6379"
data_type => "list"
key => "logstash-nginxlog"
}
}
output {
elasticsearch {
cluster => "loges"
index => "logstash-%{+YYYY.MM.dd}"
}
}
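The cluster option above belongs to the older releases of the elasticsearch output that these notes describe. In Logstash 2.0 and later the plugin connects over HTTP and takes a hosts list instead. A sketch for that case; the node address is an assumption, and the option set should be checked against the documentation of the installed version:

```
output {
  elasticsearch {
    # assumed address of an elasticsearch node (HTTP port)
    hosts => ["localhost:9200"]
    # one index per day, named after the event's @timestamp
    index => "logstash-%{+YYYY.MM.dd}"
  }
}
```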