  • Logstash Learning Path (2): Importing JSON Data Files into Elasticsearch

    I. Importing data from a file into Elasticsearch

    1. Data preparation:

    1. Data file: test.json
    2. Index name: index2
    3. Document type: type2
    4. Bulk operation API: _bulk

    Contents of test.json:

    {"index":{"_index":"index2","_type":"type2","_id":0}}
    {"age":10,"name":"jim"}
    {"index":{"_index":"index2","_type":"type2","_id":1}}
    {"age":16,"name":"tom"}

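    2. A JSON file imported through the _bulk API must follow a specific format: every record must be preceded by an action line that can carry the document ID, and every line, including the last one, must end with a newline (\n).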

    curl -H 'Content-Type: application/x-ndjson'  -s -XPOST localhost:9200/_bulk --data-binary @test.json
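    The format rules above can be checked before a file is sent to _bulk. The following is a minimal sketch, not part of the original article: the function name check_bulk_payload is hypothetical, and it assumes the file contains only index or create actions, each followed by one source line (delete actions have no source line and are not handled here). Invalid JSON on any line raises an exception from json.loads.

```python
import json


def check_bulk_payload(text):
    """Return a list of problems found in a _bulk body (empty list = looks fine)."""
    problems = []
    # The bulk API requires the final line to be newline-terminated as well.
    if not text.endswith("\n"):
        problems.append("payload must end with a newline")
    lines = text.splitlines()
    if len(lines) % 2 != 0:
        problems.append("expected action/source pairs, got an odd number of lines")
    # Walk the lines two at a time: action line, then source document.
    for i in range(0, len(lines) - 1, 2):
        action = json.loads(lines[i])
        source = json.loads(lines[i + 1])
        is_action = (
            isinstance(action, dict)
            and len(action) == 1
            and next(iter(action)) in ("index", "create")
        )
        if not is_action:
            problems.append(f"line {i + 1}: not an index/create action line")
        if not isinstance(source, dict):
            problems.append(f"line {i + 2}: source must be a JSON object")
    return problems


# Same content as test.json above.
sample = (
    '{"index":{"_index":"index2","_type":"type2","_id":0}}\n'
    '{"age":10,"name":"jim"}\n'
    '{"index":{"_index":"index2","_type":"type2","_id":1}}\n'
    '{"age":16,"name":"tom"}\n'
)
print(check_bulk_payload(sample))  # prints []
```

    Dropping the final newline from the sample makes the function report "payload must end with a newline", which is the mistake the format rule warns about.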

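    If test.json does not specify the index name, type, or ID, put the index and type in the request URL instead:

    curl -H 'Content-Type: application/x-ndjson'  -s -XPOST localhost:9200/index2/type2/_bulk --data-binary @test.json

    Each record in test.json is then preceded by an empty action line:

    { "index" : { } }
    {"age":16,"name":"tom"}

    In this case the document IDs are generated automatically.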
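    When the index and type come from the URL, a plain list of records can be turned into a bulk body mechanically. The sketch below is an illustration under assumptions: the helper name to_bulk_body is hypothetical, and the two records are the ones from test.json above.

```python
import json


def to_bulk_body(records):
    """Pair every record with an empty index action and newline-terminate each line."""
    lines = []
    for record in records:
        lines.append('{"index":{}}')  # empty metadata: index/type come from the URL
        lines.append(json.dumps(record, separators=(",", ":")))
    # Every line, including the last, ends with "\n" as _bulk requires.
    return "".join(line + "\n" for line in lines)


body = to_bulk_body([{"age": 10, "name": "jim"}, {"age": 16, "name": "tom"}])
print(body, end="")
```

    The resulting text can be saved to a file and posted with the curl command shown above; Elasticsearch then assigns an ID to each document.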

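    3. Plain JSON files (one document per line, without bulk action lines) can be imported with Logstash:

    For the detailed Logstash installation and setup steps, see:

    https://www.cnblogs.com/yfb918/p/10763292.html

    Prepare the JSON data: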

    [root@master mnt]# cat data.json
    {"age":16,"name":"tom"}
    {"age":11,"name":"tsd"}

    Create the configuration file:

    [root@master bin]# cat json.conf 
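    input{
            file{
                    path=>"/mnt/data.json"
                    # read the file from the start instead of tailing it
                    start_position=>"beginning"
                    # do not persist the read position, so the file is re-read on every run
                    sincedb_path=>"/dev/null"
                    # parse each line as a JSON document
                    codec=>json{
                            charset=>"ISO-8859-1"
                    }
            }
    }
    output{
            elasticsearch{
                    hosts=>"http://192.168.200.100:9200"
                    index=>"jsontestlogstash"
                    # mapping type (deprecated in Elasticsearch 7 and later)
                    document_type=>"doc"
            }
            # also print every event to the console
            stdout{}
    }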

    Execution result:

    [root@master bin]# ./logstash -f json.conf
    [2019-04-25T10:59:14,803][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
    [2019-04-25T10:59:16,084][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
    {
        "name" => "tom",
        "age" => 16,
        "path" => "/mnt/data.json",
        "@timestamp" => 2019-04-25T02:59:16.009Z,
        "host" => "master",
        "@version" => "1"
    }
    {
        "name" => "tsd",
        "age" => 11,
        "path" => "/mnt/data.json",
        "@timestamp" => 2019-04-25T02:59:16.096Z,
        "host" => "master",
        "@version" => "1"
    }

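    The result shows that Logstash added several fields by default (path, @timestamp, host, and @version). If we do not want these generated fields, we can remove them as follows:

    Use a filter in the configuration file to drop them: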

    [root@master bin]# cat json.conf 
    input{
            file{
                    path=>"/mnt/data.json"
                    start_position=>"beginning"
                    sincedb_path=>"/dev/null"
                    codec=>json{
                            charset=>"ISO-8859-1"
                    }
            }
    }
    filter{
            mutate {
                    remove_field => ["@timestamp", "@version", "host", "path"]
            }
    }
    output{
            elasticsearch{
                    hosts=>"http://192.168.200.100:9200"
                    index=>"jsontestlogstash"
                    document_type=>"doc"
            }
            stdout{}
    }

    Result after filtering: each event now contains only the name and age fields from the original data.
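    As an illustration only (this is a Python emulation, not Logstash output), the effect of removing the four generated fields can be reproduced on the first event from the earlier run. The helper name remove_fields is hypothetical, and the timestamp is kept as a plain string here.

```python
# Event as printed by stdout{} in the unfiltered run.
event = {
    "name": "tom",
    "age": 16,
    "path": "/mnt/data.json",
    "@timestamp": "2019-04-25T02:59:16.009Z",
    "host": "master",
    "@version": "1",
}

# The same field names that the mutate filter removes.
REMOVE_FIELDS = ["@timestamp", "@version", "host", "path"]


def remove_fields(event, names):
    """Return a copy of the event without the named fields (missing names are ignored)."""
    return {key: value for key, value in event.items() if key not in names}


print(remove_fields(event, REMOVE_FIELDS))  # prints {'name': 'tom', 'age': 16}
```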

  • Original article: https://www.cnblogs.com/yfb918/p/10762984.html