  • Logstash Study Notes

    Tags: log collection


    Introduction

    Logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for
    later use (like, for searching). – http://logstash.net

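    Since Logstash was acquired by the company behind Elasticsearch in 2013, "ELK stack" has become the official term, and many companies have started putting ELK into practice. We are no exception. The diagram here is borrowed from the article "How Sina analyzes and processes 3.2 billion real-time log entries".
    (Figure: architecture diagram; image not included.)
    This is a very common architecture:
    (1) Kafka: the message queue that receives user logs.
    (2) Logstash: parses the logs and emits them to Elasticsearch in a unified JSON format.
    (3) Elasticsearch: the core of the real-time log analysis service. It is a schemaless, real-time data store that organizes data by index and offers powerful search and aggregation.
    (4) Kibana: the data visualization component built on Elasticsearch. Its strong visualization capability is a major reason many companies choose the ELK stack.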

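    But with so many log collection frameworks available, such as Flume, Scribe, and Fluentd, why choose Logstash?

    The reasons are simple:

    1. Deployment and startup are easy. All you need is a JDK.
    2. Configuration is simple and requires no coding.
    3. The paths to collect may contain wildcard (glob) patterns, so file names do not have to be hard-coded as they are in Flume. For example:

      path => ["/var/log/*.log"]

      There is also a "Flume vs. Fluentd vs. Logstash" comparison that is worth a look.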
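
    To illustrate the wildcard matching mentioned in point 3, here is a minimal Ruby sketch using the standard library's File.fnmatch. It only demonstrates glob semantics. The sample file paths are made up, and it is an assumption of this sketch that the file input matches paths in the same way.

```ruby
# Glob pattern taken from the config line above.
pattern = "/var/log/*.log"

# Hypothetical file names, for illustration only.
candidates = [
  "/var/log/system.log",
  "/var/log/install.log",
  "/var/log/system.log.1",
  "/var/log/nginx/access.log",
]

# FNM_PATHNAME keeps "*" from matching across "/" boundaries,
# so files in subdirectories are not picked up by this pattern.
matched = candidates.select do |path|
  File.fnmatch(pattern, path, File::FNM_PATHNAME)
end

puts matched
```

    Only the files directly under /var/log whose names end in .log are selected; the rotated file and the file in the subdirectory are skipped.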

    Logstash Examples

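    Logstash processes events in three stages: input, filter, and output. Many inputs are supported, such as file, redis, and kafka. Filters apply whatever processing you want to the incoming logs. Outputs send events to the third-party system where you store logs, such as kafka, redis, elasticsearch, or a database. See the official site for details.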
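
    The three stages can be pictured with a small Ruby sketch. This is only a conceptual model, not how Logstash is implemented: run_pipeline, add_length, and shout are hypothetical names, and a plain Hash stands in for an event.

```ruby
# Conceptual model of input -> filter -> output (not Logstash internals).
def run_pipeline(lines, filters)
  # Input stage: wrap each raw line in an event (a Hash of key/value pairs).
  events = lines.map { |line| { "message" => line.chomp } }

  # Filter stage: pass every event through each filter in order.
  filtered = events.map do |event|
    filters.reduce(event) { |current, filter| filter.call(current) }
  end

  # Output stage: render each event as a "key=value" line.
  filtered.map { |event| event.map { |key, value| "#{key}=#{value}" }.join(" ") }
end

# Two hypothetical filters: one adds a field, one rewrites a field.
add_length = ->(event) { event.merge("length" => event["message"].length) }
shout      = ->(event) { event.merge("message" => event["message"].upcase) }

rendered = run_pipeline(["hello world\n"], [add_length, shout])
puts rendered
```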
    Enough talk; on to the examples:
    1. The simplest example
    Both the input and the output are standard input/output.

    [joeywen@192 logstash]$ bin/logstash -e 'input { stdin { } } output { stdout {}}'
    
    Logstash startup completed
    >hello world  ## what you type
    >2015-08-02T05:26:55.564Z joeywens-MacBook-Pro.local hello world   ## what Logstash emits
    2. Write a config file
    input {
      file {
        path => ["/var/log/*.log"]
        type => "syslog"
      }
    }
    
    
    output {
      stdout {  codec => rubydebug }
      # elasticsearch {
      #     host => 'localhost'
      #     protocol => 'transport'
      #     cluster => 'elasticsearch'
      #     index => 'logstash-joeymac-%{+YYYY.MM.dd}'
      # }
    }

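    The input is of type file and collects the system logs. When an exception occurs, it usually spans several lines; codec => multiline can be used to merge those lines into a single event.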
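
    As a rough picture of what merging multiple lines means, the Ruby sketch below appends every line that matches a stack-trace pattern to the preceding event. It is an illustration under assumptions, not the plugin's code: merge_multiline is a hypothetical helper, and the sample log lines are invented.

```ruby
# Lines that look like part of a Java stack trace.
MULTILINE_PATTERN =
  /(^.+Exception:.+)|(^\s+at .+)|(^\s+\.\.\. \d+ more)|(^\s*Caused by:.+)/

# Append each matching line to the previous event; start a new event otherwise.
def merge_multiline(lines, pattern)
  events = []
  lines.each do |line|
    if line =~ pattern && !events.empty?
      events[-1] = events[-1] + "\n" + line
    else
      events << line
    end
  end
  events
end

# Invented sample input.
lines = [
  "2015-08-02 13:00:00 ERROR request failed",
  "java.lang.IllegalStateException: boom",
  "    at com.example.Foo.bar(Foo.java:42)",
  "Caused by: java.io.IOException: disk full",
  "2015-08-02 13:00:01 INFO recovered",
]

merged = merge_multiline(lines, MULTILINE_PATTERN)
puts merged.length
```

    The five input lines collapse into two events: the error line together with its stack trace, and the final INFO line.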

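    The output goes to Elasticsearch (commented out in the config). Alternatively, enable stdout for debugging to see exactly what is emitted. The command and its output are as follows: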

    [joeywen@192 logstash]$ bin/logstash -f sys.conf
    Logstash startup completed
    
    {
        "@timestamp" => "2015-08-02T05:36:08.972Z",
           "message" => "Aug  2 13:36:08 joeywens-MacBook-Pro.local GoogleSoftwareUpdateAgent[1976]: 2015-08-02 13:34:51.764 GoogleSoftwareUpdateAgent[1976/0xb029b000] [lvl=2] -[KSUpdateEngine(PrivateMethods) updateFinish] KSUpdateEngine update processing complete.",
          "@version" => "1",
              "host" => "joeywens-MacBook-Pro.local",
              "path" => "/var/log/system.log",
              "type" => "syslog"
    }
    {
        "@timestamp" => "2015-08-02T05:36:08.973Z",
           "message" => "Aug  2 13:36:08 joeywens-MacBook-Pro.local GoogleSoftwareUpdateAgent[1976]: 2015-08-02 13:36:08.105 GoogleSoftwareUpdateAgent[1976/0xb029b000] [lvl=3] -[KSAgentUploader fetcher:failedWithError:] Failed to upload stats to <NSMutableURLRequest https://tools.google.com/service/update2> with error Error Domain=NSURLErrorDomain Code=-1001 "The request timed out." UserInfo=0x3605f0 {NSErrorFailingURLStringKey=https://tools.google.com/service/update2, _kCFStreamErrorCodeKey=60, NSErrorFailingURLKey=https://tools.google.com/service/update2, NSLocalizedDescription=The request timed out., _kCFStreamErrorDomainKey=1, NSUnderlyingError=0x35fd30 "The request timed out."}",
          "@version" => "1",
              "host" => "joeywens-MacBook-Pro.local",
              "path" => "/var/log/system.log",
              "type" => "syslog"
    }
    {
        "@timestamp" => "2015-08-02T05:36:08.973Z",
           "message" => "Aug  2 13:36:08 joeywens-MacBook-Pro.local GoogleSoftwareUpdateAgent[1976]: 2015-08-02 13:36:08.272 GoogleSoftwareUpdateAgent[1976/0xb029b000] [lvl=3] -[KSAgentApp uploadStats:] Failed to upload stats <KSStatsCollection:0x4323e0 path="/Users/joeywen/Library/Google/GoogleSoftwareUpdate/Stats/Keystone.stats", count=6, stats={",
          "@version" => "1",
              "host" => "joeywens-MacBook-Pro.local",
              "path" => "/var/log/system.log",
              "type" => "syslog"
    }

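    What if you want to add or remove fields? This is where filters come in.

    3. filter
      Filters are very powerful and can change the input in any way. Each input line is converted into a map called an event, which holds key/value pairs. As the output above shows, @timestamp, type, @version, host, message, and so on are all keys of the event. Inside a filter you can use the ruby plugin to modify the event programmatically.
      For example: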
    input {
      file {
        path => ["/var/log/*.log"]
        type => "syslog"
      }
    }
    
    filter {
        multiline {
          pattern => "(^\d+\serror)|(^.+Exception:.+)|(^\s+at .+)|(^\s+... \d+ more)|(^\s*Caused by:.+)"
          what => "previous"
        }
        if [type] =~ /^syslog/ {
            ruby {
                code => "file_name = event['path'].split('/')[-1]
                event['file_name'] = file_name"
            }
        }
    }
    
    output {
      stdout {  codec => rubydebug }
     }

    In the config above, events whose type starts with syslog are modified using the ruby plugin.
    Here is the output:

    [joeywen@192 logstash]$ bin/logstash -f sys.conf
    Logstash startup completed
    {
        "@timestamp" => "2015-08-02T05:46:52.771Z",
           "message" => "Aug  2 13:46:40 joeywens-MacBook-Pro.local Dock[234]: CGSConnectionByID: 0 is not a valid connection ID.",
          "@version" => "1",
              "host" => "joeywens-MacBook-Pro.local",
              "path" => "/var/log/system.log",
              "type" => "syslog",
         "file_name" => "system.log"
    }

    As you can see, the event now has an additional file_name field.
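
    The logic of that ruby filter can be tried outside Logstash. In the sketch below a plain Hash stands in for the event, which is an assumption made for illustration; the config in this post uses the event['path'] accessor, while Logstash 5.0 and later use event.get and event.set instead.

```ruby
# A plain Hash standing in for a Logstash event (illustration only).
event = {
  "path" => "/var/log/system.log",
  "type" => "syslog",
}

# Same condition and same code as the ruby filter in the config.
if event["type"] =~ /^syslog/
  file_name = event["path"].split("/")[-1]
  event["file_name"] = file_name
end

puts event["file_name"]
```

    File.basename(event["path"]) would give the same result for this path and is arguably the more idiomatic Ruby.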
    To parse the message field itself, use the grok plugin, which is very powerful. For example:

    input {
      file {
        path => "/var/log/http.log"
      }
    }
    filter {
      grok {
        patterns_dir => ["/opt/logstash/patterns", "/opt/logstash/extra_patterns"]
        match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
      }
    }

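    The message field is matched against a regular expression written with the syntax %{SYNTAX:SEMANTIC}.
    SYNTAX is the name of a regular-expression pattern, and SEMANTIC is the name given to the field that holds the matched text. The patterns are defined in files under the directories listed in patterns_dir, one per line, in this format:

    NAME PATTERN
    e.g. NUMBER \d+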
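
    To make the %{SYNTAX:SEMANTIC} idea concrete, here is a minimal Ruby sketch that expands each reference into a named capture group and then matches a line. It is not grok's implementation: GROK_PATTERNS holds simplified stand-in definitions rather than the patterns shipped with Logstash, compile_grok and grok_match are hypothetical helpers, and the sample log line is invented.

```ruby
# Simplified stand-ins in "NAME PATTERN" form (not the shipped definitions).
GROK_PATTERNS = {
  "IP"           => '\d{1,3}(?:\.\d{1,3}){3}',
  "WORD"         => '\w+',
  "URIPATHPARAM" => '\S+',
  "NUMBER"       => '\d+(?:\.\d+)?',
}

# Turn every %{SYNTAX:SEMANTIC} into a named group (?<SEMANTIC>pattern).
def compile_grok(expression)
  source = expression.gsub(/%\{(\w+):(\w+)\}/) do
    syntax, semantic = $1, $2
    "(?<#{semantic}>#{GROK_PATTERNS.fetch(syntax)})"
  end
  Regexp.new(source)
end

# Return a Hash of field name => matched text, or nil when nothing matches.
def grok_match(expression, message)
  match = compile_grok(expression).match(message)
  match ? match.named_captures : nil
end

expression = "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} " \
             "%{NUMBER:bytes} %{NUMBER:duration}"
fields = grok_match(expression, "55.3.244.1 GET /index.html 15824 0.043")
puts fields["client"]
```

    Each SEMANTIC name becomes a key in the result, which mirrors how grok adds the matched values to the event as new fields.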

    You can also use mutate to change an event's keys and values, including remove, add, update, rename, and so on. For details, see the Logstash documentation: https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html

    Here is a concrete example.
    Config:

    input {
      file {
        path => ["/var/log/*.log"]
        type => "syslog"
    
      }
    }
    
    filter {
        multiline {
          pattern => "(^\d+\serror)|(^.+Exception:.+)|(^\s+at .+)|(^\s+... \d+ more)|(^\s*Caused by:.+)"
          what => "previous"
        }
        if [type] =~ /^syslog/ {
            ruby {
                code => "file_name = event['path'].split('/')[-1]
                event['file_name'] = file_name"
            }
            grok {
                patterns_dir => ["./patterns/*"]
                match => {"message" => "%{MAC_BOOK:joeymac}"}
            }
    
            mutate {
                rename => {"file_name" => "fileName"}
                add_field => {"foo_%{joeymac}" => "Hello world, from %{host}"}
            }
        }
    }
    
    output {
      stdout {  codec => rubydebug }
     }

    Output:

    [joeywen@192 logstash]$ bin/logstash -f sys.conf
    Logstash startup completed
    {
                      "@timestamp" => "2015-08-02T06:10:13.161Z",
                         "message" => "Aug  2 14:10:12 joeywens-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.quicklook[2206]): Endpoint has been activated through legacy launch(3) APIs. Please switch to XPC or bootstrap_check_in(): com.apple.quicklook",
                        "@version" => "1",
                            "host" => "joeywens-MacBook-Pro.local",
                            "path" => "/var/log/system.log",
                            "type" => "syslog",
                         "joeymac" => "joeywens-MacBook-Pro",
                        "fileName" => "system.log",
        "foo_joeywens-MacBook-Pro" => "Hello world, from joeywens-MacBook-Pro.local"
    }
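
    The rename and add_field steps, including the %{field} references in both the key and the value, can be mimicked with a short Ruby sketch. It is a simplified model under assumptions: interpolate and mutate_event are hypothetical helpers, a Hash stands in for the event, and leaving unknown references untouched is a choice made for this sketch.

```ruby
# Replace %{name} with the event's value for that key; leave unknown names as-is.
def interpolate(template, event)
  template.gsub(/%\{([^}]+)\}/) do |reference|
    key = $1
    event.key?(key) ? event[key].to_s : reference
  end
end

# Apply renames first, then add new fields with interpolated keys and values.
def mutate_event(event, rename: {}, add_field: {})
  result = event.dup
  rename.each do |old_key, new_key|
    result[new_key] = result.delete(old_key) if result.key?(old_key)
  end
  add_field.each do |key, value|
    result[interpolate(key, result)] = interpolate(value, result)
  end
  result
end

# Field values copied from the output shown above.
event = {
  "host"      => "joeywens-MacBook-Pro.local",
  "joeymac"   => "joeywens-MacBook-Pro",
  "file_name" => "system.log",
}

updated = mutate_event(
  event,
  rename:    { "file_name" => "fileName" },
  add_field: { "foo_%{joeymac}" => "Hello world, from %{host}" }
)
puts updated["foo_joeywens-MacBook-Pro"]
```

    The resulting keys match the fileName and foo_joeywens-MacBook-Pro fields in the output above.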

    Please credit the source when reposting.

  • Original article: https://www.cnblogs.com/yxysuanfa/p/7399028.html