  • es-03: Basic usage of the DSL

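    The following operations are run in Kibana. From a Linux shell, use the equivalent curl form:

    curl -XGET -H 'Content-Type: application/json' 'http://node1:9200/index/type/id' -d '{ ... }'

    where -d passes the request body (Elasticsearch 6.0 and later reject a body sent without a Content-Type header).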
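
    The curl form above can also be reproduced from a script. The following is a minimal sketch using only the Python standard library; build_request is a hypothetical helper name, node1:9200 is the placeholder host from the curl line, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_request(method, url, body=None):
    """Build (but do not send) an HTTP request with an optional JSON body."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    # A Content-Type header is only needed when a body is attached.
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Same shape as: curl -XGET 'http://node1:9200/bank/_search' -d '{...}'
req = build_request(
    "GET",
    "http://node1:9200/bank/_search",
    {"query": {"match_all": {}}},
)
print(req.get_method(), req.full_url)
# To actually send it against a running cluster:
#   urllib.request.urlopen(req).read()
```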

    1, Cluster status

    1), Check cluster health:

    GET /_cat/health?v

    2), List nodes:

    GET /_cat/nodes?v

    2, Index operations (an index is analogous to a database)

    1, Index operations

    1), Create an index

    PUT lag
    {
        "settings": {
            "index": {
                "number_of_shards": 5,
                "number_of_replicas": 1
            }
        }
    }

    2), Modify settings

    The number of shards cannot be changed after the index is created; the number of replicas can.

    PUT lag/_settings
    {
        "number_of_replicas": 3
    }

    3), Get all indices

    GET _all

    Get the settings of one index, of all indices, or of several named indices

    GET lag/_settings
    GET _all/_settings
    GET .kibana,lagou/_settings
    GET _settings

    4), List all indices

    GET /_cat/indices?v

    5), Create a document

    PUT customer/_doc/1?pretty
    {
      "name": "vini"
    }

    6), Query the document

    GET customer/_doc/1?pretty

    7), Delete an index

    DELETE customer?pretty
    GET /_cat/indices?v

     

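    2, Document operations (a document is analogous to a record)

    1), Save a document

    The path is index/type/id. If the id is omitted (use POST in that case), a UUID is generated automatically.

    PUT lag/job/1
    {
        "title": "python crawler",
        "salary": 15000,
        "city": "bj",
        "company": {
            "name": "Baidu",
            "company_addr": "bj"
        },
        "publish_date": "2018"
    }

    2), Get a document

    GET lag/job/1

    Alternatively, use a search request; see the query section below.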

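    3), Replace a document

    PUT /customer/_doc/1?pretty
    {
      "name": "wenbronk"
    }

    Indexing to an existing id replaces the whole stored document, so name now holds the new value. (PUT is not accepted on the _update endpoint; partial updates use POST, as in the next step.)

    4), Partial update with POST, changing only some fields

    Only an existing id can be updated. The update is incremental: fields that are missing are added, and fields that already exist are overwritten.

    POST customer/_doc/1/_update?pretty
    {
      "doc": {
        "name": "vini",
        "age": 28
      }
    }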

    5), Simple computation with a script

    POST customer/_doc/1/_update?pretty
    {
      "script": "ctx._source.age += 5"
    }
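
    The script above hard-codes the increment. As a hedged sketch, the same update body can be built with the increment passed as a script parameter. script_update_body and the parameter name years are hypothetical; the "source" key assumes Elasticsearch 6.x (older versions used "inline").

```python
import json


def script_update_body(source, params=None):
    """Build the body of an _update request that runs a Painless script."""
    body = {"script": {"lang": "painless", "source": source}}
    if params:
        # Parameters are read inside the script as params.<name>.
        body["script"]["params"] = params
    return body


# Equivalent in effect to "ctx._source.age += 5", with 5 supplied as a parameter.
body = script_update_body("ctx._source.age += params.years", {"years": 5})
print(json.dumps(body, indent=2))
```

    The printed JSON can be pasted as the body of POST customer/_doc/1/_update in Kibana.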

    6), Delete a document

    DELETE /customer/_doc/1?pretty

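    3, Batch processing

    Several operations (index, delete, update) can be combined into one request; bulk requests are also a way to load data from one index into another.

    1), Bulk insert

    Each action takes two lines, except delete: the first line is the metadata line (the action and the document it targets), and the second is the data line. update is a special case: its data line may contain upsert, doc, or script.

    Every metadata line and every data line must fit on a single line; the body cannot be pretty-printed.

    POST /customer/_doc/_bulk?pretty
    {"index":{"_id":"1"}}
    {"name": "John Doe" }
    {"index":{"_id":"2"}}
    {"name": "Jane Doe" }

    If the index or type is not given in the URL, it must be specified in the metadata line.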
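
    Because every metadata line and data line has to sit on its own line, bulk bodies are usually generated rather than typed. A minimal sketch, assuming a hypothetical helper bulk_body that takes (action, metadata, source) tuples:

```python
import json


def bulk_body(actions):
    """Serialize (action, metadata, source_or_None) tuples as a bulk body.

    json.dumps without indent never emits newlines, so each JSON object
    stays on one line, as the bulk format requires.
    """
    lines = []
    for action, meta, source in actions:
        lines.append(json.dumps({action: meta}))  # metadata line
        if source is not None:                    # delete has no data line
            lines.append(json.dumps(source))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


payload = bulk_body([
    ("index", {"_id": "1"}, {"name": "John Doe"}),
    ("index", {"_id": "2"}, {"name": "Jane Doe"}),
    ("delete", {"_id": "3"}, None),
])
print(payload, end="")
```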

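    2), Update the first document and delete the second

    A delete action consists of the metadata line only; it has no data line.

    POST /customer/_doc/_bulk?pretty
    {"update":{"_id":"1"}}
    {"doc": { "name": "John Doe becomes Jane Doe" } }
    {"delete":{"_id":"2"}}

    A failure in one action does not affect the others; the remaining actions are still executed.

    3), Bulk update

    POST _bulk?pretty
    {"update": {"_index": "lag", "_type": "job", "_id": 1}}
    {"doc": {"field": "value"}}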

    4), Bulk get

    GET _mget
    {
        "docs": [
            {"_index": "tested",
                "_type": "job",
                "_id": 1
            },
            {"_index": "lag",
                "_type": "job2",
                "_id": 2
            }
        ]
    }

    When all documents share the same index, or the same index and type, put those in the URL

    GET lagou/job1/_mget
    {
        "docs": [
            {"_id": 1},
            {"_id": 2}
        ]
    }

    Or, in short form

    GET lagou/job1/_mget
    {
      "ids": [1, 2]
    }
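
    The long and short _mget forms differ only in how much is repeated per document. A small sketch of that choice, using a hypothetical helper mget_body: with no index or type it emits the "ids" short form (index and type are then expected in the URL), otherwise one "docs" entry per id.

```python
def mget_body(ids, index=None, doc_type=None):
    """Build an _mget body: short "ids" form, or explicit "docs" entries."""
    if index is None and doc_type is None:
        return {"ids": list(ids)}
    docs = []
    for doc_id in ids:
        doc = {"_id": doc_id}
        if index is not None:
            doc["_index"] = index
        if doc_type is not None:
            doc["_type"] = doc_type
        docs.append(doc)
    return {"docs": docs}


short_form = mget_body([1, 2])                       # for GET lagou/job1/_mget
long_form = mget_body([1, 2], index="lag", doc_type="job")  # for GET _mget
```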

    4, Queries

    Basic queries, compound queries, and filter queries

    1), Import sample data

    https://raw.githubusercontent.com/elastic/elasticsearch/master/docs/src/test/resources/accounts.json

    curl -H "Content-Type: application/json" -XPOST "10.110.122.172:9200/bank/_doc/_bulk?pretty&refresh" --data-binary "@accounts.json"
    GET /_cat/indices?v

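    2), Query with the q URL parameter

    GET /bank/_search?q=*&sort=account_number:asc&pretty

    Fetch only some fields of a document

    GET lag/job/1?_source=title,salary

    3)  Query with a request body

    from: the offset to start from; size: how many hits to return; sort: the sort order

    The wildcard query supports * wildcard patterns

    4), match: analyzed matching, so partial (per-term) matches are found

    a. match_all: match every document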

    get /bank/_search
    {
      "query": {"match_all": {}}, 
      "from": 10,
      "size": 10,
      "sort": [
        {"account_number": "asc"}
        ]
    }
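
    In the request above, "from": 10 with "size": 10 returns the second page of ten hits. A minimal sketch of that arithmetic, with page_params as a hypothetical helper and pages numbered from 1:

```python
def page_params(page, page_size=10):
    """Translate a 1-based page number into from/size search parameters."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {"from": (page - 1) * page_size, "size": page_size}


# Rebuilds the match_all request body for page 2.
body = {
    "query": {"match_all": {}},
    "sort": [{"account_number": "asc"}],
    **page_params(2),
}
```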

    _source: choose which fields are returned

    get bank/_search
    {
      "query": {"match": {
        "age": 37
      }}, 
      "_source": [
        "account_number", "age", "address"
        ]
      
    }

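    5), match_phrase: phrase matching

    The query string is analyzed into terms, and a document is returned only if it contains all of them as a phrase. slop is the maximum distance by which the terms may be out of position and still match.

    term: exact match, no analysis

    GET bank/_search
    {
      "query": {
        "match_phrase": {
          "address": { "query": "mill lane", "slop": 6 }
        }
      },
      "_source": [ "account_number", "age", "address" ]
    }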

     

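    6) term query: exact match

    If the field's type is not declared in the mapping, the query may find nothing: a string field mapped as analyzed text stores lowercased tokens, not the exact value being searched for.

    GET /bank/_search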
    {
        "query" : {
            "term" : {
                "abc": "1234"
            }
        }
    }
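
    Why a term query can miss on an analyzed field is easier to see with a toy model. The sketch below is not Elasticsearch code: toy_analyze is a rough, assumed stand-in for the standard analyzer (lowercase, split on non-word characters), and the address value is made up for illustration.

```python
import re


def toy_analyze(text):
    """Very rough approximation of analysis: lowercase and split into words."""
    return re.findall(r"\w+", text.lower())


def term_matches(indexed_terms, value):
    """A term query compares the value, unanalyzed, against the indexed terms."""
    return value in indexed_terms


address = "990 Mill Road"            # hypothetical field value
text_terms = toy_analyze(address)    # what an analyzed text field indexes
keyword_terms = [address]            # what a keyword field indexes

print(text_terms)
print(term_matches(text_terms, "Mill"), term_matches(text_terms, "mill"))
print(term_matches(keyword_terms, "990 Mill Road"))
```

    On the analyzed field only the lowercased token matches; on the keyword field only the complete original value does.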

    terms query

    Several values can be passed; a document matches if any one of them matches.

    GET /bank/_search
    {
        "query" : {
            "terms" : {
                "abc": ["1234", "568", "23"]
            }
        }
    }

    7), range query

    GET /bank/_search
    {
        "query": {
            "range": {
                "price": {
                    "gte": 10,
                    "lte": 99
                }
            }
        }
    }

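    8) multi_match: matching against several fields

    A ^N suffix on a field name boosts that field: with "author^3", matches in author are weighted three times as heavily, so those documents score higher.

    GET /bank/_search
    {
      "query": {
        "bool": {
          "must": {
            "multi_match": {
              "operator": "and",
              "fields": [ "name", "author^3" ],
              "query": "Guide"
            }
          },
          "filter": {
            "terms": {
              "price": [ 35.99, 188.99 ]
            }
          }
        }
      }
    }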

    5 bool queries

    1) must

    GET bank/_search
    {
      "query": {
        "bool": {
          "must": [
            {"match": {"address": "mill"}},
            {"match": {"address": "lane"}}
          ]
        }
      }
    }

    2) should: OR matching

    GET /bank/_search
    {
      "query": {
        "bool": {
          "should": [
            { "match": { "address": "mill" } },
            { "match": { "address": "lane" } }
          ]
        }
      }
    }

    3) must_not

    GET /bank/_search
    {
      "query": {
        "bool": {
          "must_not": [
            { "match": { "address": "mill" } },
            { "match": { "address": "lane" } }
          ]
        }
      }
    }

    4) Combining clauses

    GET /bank/_search
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "age": "40" } }
          ],
          "must_not": [
            { "match": { "state": "ID" } }
          ]
        }
      }
    }

    bool queries can be nested. The following request

    GET lag/testjob/_search
    {
        "query": {
            "bool": {
                "should": [
                    {"term": {"title": "python"}},
                    {"bool": {
                        "must": [
                            {"term": {"title": "es"}},
                            {"term": {"salary": 30}}
                        ]
                    }}
                ]
            }
        }
    }

    is equivalent to the SQL

    select * from testjob where title = 'python' or
    (title = 'es' and salary = 30)
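
    The correspondence between the nested bool query and the SQL condition can be checked with a toy evaluator. This is a simplified sketch, not how Elasticsearch executes queries: matches is a hypothetical function, term is modelled as plain equality on the whole field value, and should is treated as required only when no must clause is present.

```python
def matches(query, doc):
    """Evaluate a tiny subset of the query DSL (term, bool) against a dict."""
    (kind, body), = query.items()
    if kind == "term":
        (field, value), = body.items()
        return doc.get(field) == value
    if kind == "bool":
        must = body.get("must", [])
        should = body.get("should", [])
        must_not = body.get("must_not", [])
        if not all(matches(q, doc) for q in must):
            return False
        if any(matches(q, doc) for q in must_not):
            return False
        if should and not must:
            # With no must clause, at least one should clause has to match.
            return any(matches(q, doc) for q in should)
        return True
    raise ValueError("unsupported query type: " + kind)


query = {"bool": {"should": [
    {"term": {"title": "python"}},
    {"bool": {"must": [
        {"term": {"title": "es"}},
        {"term": {"salary": 30}},
    ]}},
]}}


def sql_where(doc):
    """title = 'python' or (title = 'es' and salary = 30)"""
    return doc["title"] == "python" or (doc["title"] == "es" and doc["salary"] == 30)


docs = [
    {"title": "python", "salary": 10},
    {"title": "es", "salary": 30},
    {"title": "es", "salary": 20},
    {"title": "java", "salary": 30},
]
results = [matches(query, d) for d in docs]
```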

    5)  filter queries: since ES 5.x the standalone filter query has been replaced by bool, so a filter is wrapped inside a bool query

    1), Using filter for gte / lte

    GET /bank/_search
    {
      "query": {
        "bool": {
          "must": { "match_all": {} },
          "filter": {
            "range": {
              "balance": {
                "gte": 20000,
                "lte": 30000,
                "boost": 2.0
              }
            }
          }
        }
      }
    }

    GET /bank/_search
    {
      "query": {
        "bool": {
          "filter": {
            "term": {
              "abc": "123"
            }
          }
        }
      }
    }

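    Filtering on multiple values (terms takes a field name and a list of values; abc is the placeholder field from the previous example)

    GET /bank/_search
    {
      "query": {
        "bool": {
          "must": { "match_all": {} },
          "filter": {
            "terms": {
              "abc": ["adb", "12"]
            }
          }
        }
      }
    }

    To test whether a field exists, use the exists query, e.g. "exists": { "field": "abc" }.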

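    7, Aggregations

    1), Grouping with a terms aggregation

    By default a terms aggregation returns the top 10 buckets.

    "size": 0 suppresses the search hits, so the response contains only the aggregation results.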

    GET /bank/_search
    {
      "size": 0,
      "aggs": {
        "group_by_state": {
          "terms": {
            "field": "state.keyword"
          }
        }
      }
    }

    This is equivalent to the SQL

    SELECT state, COUNT(*) FROM bank GROUP BY state ORDER BY COUNT(*) DESC LIMIT 10;

    2), Adding an avg sub-aggregation

    GET /bank/_search
    {
      "size": 0,
      "aggs": {
        "group_by_state": {
          "terms": {
            "field": "state.keyword",
            "order": {
              "average_balance": "desc"
            }
          },
          "aggs": {
            "average_balance": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      }
    }
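
    The response to the request above nests the avg result inside each terms bucket. A minimal sketch of flattening that structure into rows: bucket_rows is a hypothetical helper, and the sample response contains made-up keys and numbers purely to show the shape (aggregations, then the aggregation name, then buckets).

```python
def bucket_rows(response, agg_name, metric_name):
    """Flatten terms buckets into (key, doc_count, metric value) tuples."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [(b["key"], b["doc_count"], b[metric_name]["value"]) for b in buckets]


# Illustrative response fragment only; the values are not real results.
sample = {"aggregations": {"group_by_state": {"buckets": [
    {"key": "AA", "doc_count": 3, "average_balance": {"value": 300.0}},
    {"key": "BB", "doc_count": 2, "average_balance": {"value": 150.0}},
]}}}

rows = bucket_rows(sample, "group_by_state", "average_balance")
for state, count, avg_balance in rows:
    print(state, count, avg_balance)
```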

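    3), from / to: buckets of a range aggregation

    Inside a range aggregation, from and to set the boundaries of each bucket; they do not page through results, and size is not a valid key inside a range entry.

    from: lower bound of the bucket (inclusive)

    to: upper bound of the bucket (exclusive)

    GET /bank/_search
    {
      "size": 0,
      "aggs": {
        "group_by_age": {
          "range": {
            "field": "age",
            "ranges": [
              {
                "from": 20,
                "to": 30
              },
              {
                "from": 30,
                "to": 40
              },
              {
                "from": 40,
                "to": 50
              }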
            ]
          },
          "aggs": {
            "group_by_gender": {
              "terms": {
                "field": "gender.keyword"
              },
              "aggs": {
                "average_balance": {
                  "avg": {
                    "field": "balance"
                  }
                }
              }
            }
          }
        }
      }
    }
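
    In a range aggregation the from bound is inclusive and the to bound is exclusive, so a boundary value such as 30 lands in exactly one bucket. A toy sketch of that rule (in_range and assign are hypothetical helpers, not Elasticsearch code; the ranges are assumed for illustration):

```python
def in_range(value, rng):
    """True if value falls in a bucket: from is inclusive, to is exclusive."""
    low = rng.get("from")
    high = rng.get("to")
    return (low is None or value >= low) and (high is None or value < high)


ranges = [
    {"from": 20, "to": 30},
    {"from": 30, "to": 40},
    {"from": 40, "to": 50},
]


def assign(value):
    """Return every configured range that contains the value."""
    return [r for r in ranges if in_range(value, r)]


print(assign(30))   # only the 30-40 bucket
print(assign(50))   # no bucket: 50 is excluded from 40-50
```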

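    4), sort

    GET lagou/_search
    {
        "query": {
            "match_all": {}
        },
        "sort": [{
            "comments": {
                "order": "asc"
            }
        }]
    }

    Note: the sort field must be sortable, that is, backed by doc values (the default for keyword, numeric, and date fields). An analyzed text field cannot be sorted on unless fielddata is enabled for it.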

  • Original article: https://www.cnblogs.com/wenbronk/p/9356261.html