  • Elasticsearch: Indexes, Documents, Bool (Compound) Queries, Sorting, and filter Operations

    Elasticsearch: Index Operations

    # Elasticsearch's inverted index (further reading: 扩展阅读.md)
    - Split each article into terms, then build an index entry for every term
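
    The idea of tokenizing text and indexing every term can be sketched in a few lines of Python. This is only a toy model: the whitespace tokenizer and the sample sentences are assumptions made for illustration, and it is not how Elasticsearch's analyzers or Lucene's on-disk structures actually work.

```python
from collections import defaultdict


def tokenize(text):
    # Deliberately naive analyzer (an assumption): lowercase, split on whitespace.
    return text.lower().split()


def build_inverted_index(docs):
    # Map each term to the sorted list of ids of the documents containing it.
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "Elasticsearch builds an inverted index",
    2: "an index maps terms to documents",
}
index = build_inverted_index(docs)
print(index["index"])  # ids of the documents that contain the term "index"
```

    Looking a term up in the resulting dictionary is a single key access, which is why searching by term stays cheap even when there are many documents.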

    See the official documentation for the full set of index operations:

    https://www.elastic.co/guide/en/elasticsearch/reference/7.5/indices.html

    Official Chinese documentation (written for version 2.x):

    https://www.elastic.co/guide/cn/elasticsearch/guide/current/index-settings.html

    1. Index initialization

    # Create an index named lqz2 with 5 primary shards and 1 replica per primary shard
    PUT lqz2
    {
      "settings": {
        "index":{
          "number_of_shards":5,
          "number_of_replicas":1
        }
      }
    }
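    '''
    number_of_shards
    The number of primary shards per index. The default is 1 as of Elasticsearch 7.0 (it was 5 in earlier versions). This setting cannot be changed after the index is created.
    number_of_replicas
    The number of replica copies of each primary shard. The default is 1. This setting can be changed at any time on a live index.
    '''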
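
    To see what these two settings mean for cluster sizing, the arithmetic can be written out as below. The function name is hypothetical; the formula simply counts each primary shard plus its replicas.

```python
def total_shard_copies(number_of_shards, number_of_replicas):
    # Each primary shard has number_of_replicas replica copies, so the index
    # holds primaries * (1 + replicas) shard copies in total.
    return number_of_shards * (1 + number_of_replicas)


lqz2_total = total_shard_copies(5, 1)    # the lqz2 settings shown above
with_two_replicas = total_shard_copies(5, 2)
print(lqz2_total, with_two_replicas)
```

    Because only number_of_replicas can change later, the second figure is the kind of growth an existing index can undergo without being recreated.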

    2. Viewing index settings

    # Get the settings of the lqz2 index
    GET lqz2/_settings
    # Get the settings of all indices
    GET _all/_settings
    # Same as above
    GET _settings
    # Get the settings of the lqz and lqz2 indices
    GET lqz,lqz2/_settings

    3. Updating an index

    # Change the number of replicas to 2
    PUT lqz/_settings
    {
      "number_of_replicas": 2
    }
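    # If the request fails with cluster_block_exception, a data node has usually run low on disk space (disk usage passed the flood-stage watermark), and Elasticsearch has protected the data by marking the affected indices read-only (index.blocks.read_only_allow_delete). After freeing disk space, lift the block:
    PUT _all/_settings
    {
      "index": {
        "blocks": {
          "read_only_allow_delete": false
        }
      }
    }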

    4. Deleting an index

    # Delete the lqz index
    DELETE lqz

    Elasticsearch: Document Operations

    1. Adding a document

    # Add a book with id 1 (both POST and PUT work)
    POST lqz/_doc/1/_create
    # POST lqz/_doc/1
    # POST lqz/_doc generates the id automatically; this form requires POST
    {
      "title":"红楼梦",
      "price":12,
      "publish_addr":{
        "province":"黑龙江",
        "city":"鹤岗"
      },
      "publish_date":"2013-11-11",
      "read_num":199,
      "tag":["古典","名著"]
    }

    2. Getting a document

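    # Get the document with id 1 from the lqz index
    GET lqz/_doc/1
    # Get the document with id 7 from the lqz index, returning only the title field
    GET lqz/_doc/7?_source=title
    # Get the document with id 7 from the lqz index, returning only the title and price fields
    GET lqz/_doc/7?_source=title,price
    # Get the document with id 7 from the lqz index, returning all fields
    GET lqz/_doc/7?_source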

    3. Updating a document

    # Update by overwriting: the stored document is replaced, so fields missing from the new body are lost
    PUT lqz/_doc/1
    {
      "title":"xxxx",
      "price":333,
      "publish_addr":{
        "province":"黑龙江",
        "city":"福州"
      }
    }
    # Partial update: change only the listed fields (note that this uses POST, and the fields must be wrapped in "doc")
    POST lqz/_update/1
    {
      "doc":{
        "title":"修改"
      }
    }
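
    The difference between the two update styles can be modelled locally with plain dictionaries. The sketch below is an illustration, not client code: the function names are hypothetical, and the merge rule (nested objects merged recursively, scalars and arrays replaced) follows the documented behaviour of partial updates.

```python
import copy


def full_replace(stored, new_doc):
    # PUT <index>/_doc/<id>: the new body replaces the stored document entirely.
    return copy.deepcopy(new_doc)


def partial_update(stored, partial):
    # POST <index>/_update/<id> with {"doc": ...}: nested objects are merged
    # recursively, while scalar values and arrays are replaced.
    merged = copy.deepcopy(stored)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = partial_update(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


stored = {
    "title": "红楼梦",
    "price": 12,
    "publish_addr": {"province": "黑龙江", "city": "鹤岗"},
    "tag": ["古典", "名著"],
}
replaced = full_replace(stored, {"title": "xxxx", "price": 333})
updated = partial_update(stored, {"title": "修改", "publish_addr": {"city": "福州"}})
print(replaced)
print(updated)
```

    After the full replace the tag and publish_addr fields are gone, while the partial update keeps every field it did not mention.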

    4. Deleting a document

    # Delete the document with id 10
    DELETE lqz/_doc/10

    5. Batch operations: _mget

    # Fetch in one request the document with id 2 (type _doc) from the lqz index and the document with id 1 (type _doc) from the lqz2 index
    GET _mget
    {
      "docs":[
        {
          "_index":"lqz",
          "_type":"_doc",
          "_id":2
        },
        {
          "_index":"lqz2",
          "_type":"_doc",
          "_id":1
        }
      ]
    }

    # Fetch the documents with ids 1 and 2 from the lqz index in one request
    GET lqz/_mget
    {
      "docs":[
        {
          "_id":2
        },
        {
          "_id":1
        }
      ]
    }
    # Same as above
    GET lqz/_mget
    {
      "ids":[1,2]
    }

    6. Batch operations: bulk

    PUT test/_doc/2/_create
    {
      "field1" : "value22"
    }
    POST _bulk
    { "index" : { "_index" : "test", "_id" : "1" } }
    { "field1" : "value1" }
    { "delete" : { "_index" : "test", "_id" : "2" } }
    { "create" : { "_index" : "test", "_id" : "3" } }
    { "field1" : "value3" }
    { "update" : {"_id" : "1", "_index" : "test"} }
    { "doc" : {"field2" : "value2"} }
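
    The _bulk body is newline-delimited JSON: one action line, optionally followed by one source line, and a final trailing newline. A small helper for assembling such a body might look like the sketch below; build_bulk_body is a hypothetical name and the code only builds the string, it does not send anything to a cluster.

```python
import json


def build_bulk_body(operations):
    # operations: list of (action, metadata, source) tuples; source is None
    # for actions such as delete that take no second line.
    lines = []
    for action, metadata, source in operations:
        lines.append(json.dumps({action: metadata}, ensure_ascii=False))
        if source is not None:
            lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk body is newline-delimited JSON and must end with a newline.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body([
    ("index", {"_index": "test", "_id": "1"}, {"field1": "value1"}),
    ("delete", {"_index": "test", "_id": "2"}, None),
    ("create", {"_index": "test", "_id": "3"}, {"field1": "value3"}),
    ("update", {"_index": "test", "_id": "1"}, {"doc": {"field2": "value2"}}),
])
print(bulk_body, end="")
```

    The four operations mirror the request above: index, delete, create, and update, with delete being the only action that has no source line.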

    Elasticsearch: Two Ways to Search

    1. Introduction

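    Elasticsearch offers two ways to search:

    • Query string: a simple search in which the query is passed the way URL parameters are. It is known as simple search or query-string search.

    • Query DSL: a search expressed in DSL statements. The DSL is the rich, flexible query language that Elasticsearch provides; it takes the form of a JSON request body sent to Elasticsearch over its RESTful API.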
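
    To make the contrast concrete, the sketch below builds the same condition both ways: once as a URL query string and once as a JSON body. The helper names are hypothetical and nothing is sent over the network; the index and field names are taken from the examples that follow.

```python
import json
from urllib.parse import urlencode


def query_string_path(index, field, value):
    # Query-string search: the condition travels in the URL as q=field:value.
    return f"{index}/_search?" + urlencode({"q": f"{field}:{value}"})


def dsl_body(field, value):
    # Query DSL search: the same condition expressed as a JSON request body.
    return json.dumps({"query": {"match": {field: value}}})


path = query_string_path("lqz", "from", "gu")
body = dsl_body("from", "gu")
print(path)
print(body)
```

    Note that urlencode percent-encodes the colon, so the printed path contains q=from%3Agu; a server decodes this back to from:gu.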

    2. Preparing the data

    PUT lqz/doc/1
    {
      "name":"顾老二",
      "age":30,
      "from": "gu",
      "desc": "皮肤黑、武器长、性格直",
      "tags": ["", "", ""]
    }

    PUT lqz/doc/2
    {
      "name":"大娘子",
      "age":18,
      "from":"sheng",
      "desc":"肤白貌美,娇憨可爱",
      "tags":["", "",""]
    }

    PUT lqz/doc/3
    {
      "name":"龙套偏房",
      "age":22,
      "from":"gu",
      "desc":"mmp,没怎么看,不知道怎么形容",
      "tags":["造数据", "",""]
    }

    PUT lqz/doc/4
    {
      "name":"石头",
      "age":29,
      "from":"gu",
      "desc":"粗中有细,狐假虎威",
      "tags":["", "",""]
    }

    PUT lqz/doc/5
    {
      "name":"魏行首",
      "age":25,
      "from":"广云台",
      "desc":"仿佛兮若轻云之蔽月,飘飘兮若流风之回雪,mmp,最后竟然没有嫁给顾老二!",
      "tags":["闭月","羞花"]
    }

    3. Query-string search

    GET lqz/doc/_search?q=from:gu

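    We again use the GET command and search through _search. The condition here is: which people have the from attribute set to gu, that is, who belongs to the Gu family?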

    The result:
    
    {
      "took" : 1,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 3,
        "max_score" : 0.6931472,
        "hits" : [
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "4",
            "_score" : 0.6931472,
            "_source" : {
              "name" : "石头",
              "age" : 29,
              "from" : "gu",
              "desc" : "粗中有细,狐假虎威",
              "tags" : [
                "",
                "",
                ""
              ]
            }
          },
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "1",
            "_score" : 0.2876821,
            "_source" : {
              "name" : "顾老二",
              "age" : 30,
              "from" : "gu",
              "desc" : "皮肤黑、武器长、性格直",
              "tags" : [
                "",
                "",
                ""
              ]
            }
          },
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "3",
            "_score" : 0.2876821,
            "_source" : {
              "name" : "龙套偏房",
              "age" : 22,
              "from" : "gu",
              "desc" : "mmp,没怎么看,不知道怎么形容",
              "tags" : [
                "造数据",
                "",
                ""
              ]
            }
          }
        ]
      }
    }

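    Let's focus on hits. hits is the returned result set: every document whose from attribute is gu. The most important part is _score. What is the score? It is the degree of match with the query condition, as computed by a relevance algorithm: the better the match, the higher the score. The algorithm itself is covered later.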

    4. Structured search (Query DSL)

    Now let's run the same search with the DSL and see who comes from the Gu family.

    GET lqz/doc/_search
    {
      "query": {
        "match": {
          "from": "gu"
        }
      }
    }

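    In the example above the query is built up step by step: the condition goes inside match, and match returns every document whose from field contains gu. The hits, of course, are the same as before: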

    {
      "took" : 0,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 3,
        "max_score" : 0.6931472,
        "hits" : [
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "4",
            "_score" : 0.6931472,
            "_source" : {
              "name" : "石头",
              "age" : 29,
              "from" : "gu",
              "desc" : "粗中有细,狐假虎威",
              "tags" : [
                "",
                "",
                ""
              ]
            }
          },
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "1",
            "_score" : 0.2876821,
            "_source" : {
              "name" : "顾老二",
              "age" : 30,
              "from" : "gu",
              "desc" : "皮肤黑、武器长、性格直",
              "tags" : [
                "",
                "",
                ""
              ]
            }
          },
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "3",
            "_score" : 0.2876821,
            "_source" : {
              "name" : "龙套偏房",
              "age" : 22,
              "from" : "gu",
              "desc" : "mmp,没怎么看,不知道怎么形容",
              "tags" : [
                "造数据",
                "",
                ""
              ]
            }
          }
        ]
      }
    }

    Elasticsearch: Sorting with sort

    Descending: desc

    For example, let's find everyone in the Gu household and sort them by the age field in descending order:

    GET lqz/doc/_search
    {
      "query": {
        "match": {
          "from": "gu"
        }
      },
      "sort": [
        {
          "age": {
            "order": "desc"
          }
        }
      ]
    }

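    In the example above we add sort on top of the conditional query: the results are sorted by the age field, the order key controls the direction, and desc means descending.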

    {
      "took" : 0,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 3,
        "max_score" : null,
        "hits" : [
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "1",
            "_score" : null,
            "_source" : {
              "name" : "顾老二",
              "age" : 30,
              "from" : "gu",
              "desc" : "皮肤黑、武器长、性格直",
              "tags" : [
                "",
                "",
                ""
              ]
            },
            "sort" : [
              30
            ]
          },
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "4",
            "_score" : null,
            "_source" : {
              "name" : "石头",
              "age" : 29,
              "from" : "gu",
              "desc" : "粗中有细,狐假虎威",
              "tags" : [
                "",
                "",
                ""
              ]
            },
            "sort" : [
              29
            ]
          },
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "3",
            "_score" : null,
            "_source" : {
              "name" : "龙套偏房",
              "age" : 22,
              "from" : "gu",
              "desc" : "mmp,没怎么看,不知道怎么形容",
              "tags" : [
                "造数据",
                "",
                ""
              ]
            },
            "sort" : [
              22
            ]
          }
        ]
      }
    }

    In the example above the results come back in descending order.
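
    What the query plus sort does to the sample data can be reproduced locally. The sketch below is a plain-Python model with a hypothetical search_sorted helper; it uses exact equality instead of a real match query and carries only the fields needed for the illustration.

```python
# A subset of the sample documents, reduced to the fields used here.
people = [
    {"name": "顾老二", "age": 30, "from": "gu"},
    {"name": "龙套偏房", "age": 22, "from": "gu"},
    {"name": "石头", "age": 29, "from": "gu"},
    {"name": "大娘子", "age": 18, "from": "sheng"},
]


def search_sorted(docs, field, value, sort_field, order="asc"):
    # Keep documents whose field equals value, then order them by sort_field.
    hits = [doc for doc in docs if doc.get(field) == value]
    return sorted(hits, key=lambda doc: doc[sort_field], reverse=(order == "desc"))


ages_desc = [doc["age"] for doc in search_sorted(people, "from", "gu", "age", "desc")]
print(ages_desc)
```

    The ages come back as 30, 29, 22, the same order as the sort values in the response above.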

    Ascending: asc

    GET lqz/doc/_search
    {
      "query": {
        "match_all": {}
      },
      "sort": [
        {
          "age": {
            "order": "asc"
          }
        }
      ]
    }
    {
      "took" : 0,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 5,
        "max_score" : null,
        "hits" : [
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "2",
            "_score" : null,
            "_source" : {
              "name" : "大娘子",
              "age" : 18,
              "from" : "sheng",
              "desc" : "肤白貌美,娇憨可爱",
              "tags" : [
                "",
                "",
                ""
              ]
            },
            "sort" : [
              18
            ]
          },
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "3",
            "_score" : null,
            "_source" : {
              "name" : "龙套偏房",
              "age" : 22,
              "from" : "gu",
              "desc" : "mmp,没怎么看,不知道怎么形容",
              "tags" : [
                "造数据",
                "",
                ""
              ]
            },
            "sort" : [
              22
            ]
          },
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "5",
            "_score" : null,
            "_source" : {
              "name" : "魏行首",
              "age" : 25,
              "from" : "广云台",
              "desc" : "仿佛兮若轻云之蔽月,飘飘兮若流风之回雪,mmp,最后竟然没有嫁给顾老二!",
              "tags" : [
                "闭月",
                "羞花"
              ]
            },
            "sort" : [
              25
            ]
          },
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "4",
            "_score" : null,
            "_source" : {
              "name" : "石头",
              "age" : 29,
              "from" : "gu",
              "desc" : "粗中有细,狐假虎威",
              "tags" : [
                "",
                "",
                ""
              ]
            },
            "sort" : [
              29
            ]
          },
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "1",
            "_score" : null,
            "_source" : {
              "name" : "顾老二",
              "age" : 30,
              "from" : "gu",
              "desc" : "皮肤黑、武器长、性格直",
              "tags" : [
                "",
                "",
                ""
              ]
            },
            "sort" : [
              30
            ]
          }
        ]
      }
    }

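    Note: not every field type can be sorted on directly. Numeric, date, and keyword fields sort out of the box; analyzed text fields cannot be sorted by default, so sort on a keyword sub-field instead.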

    Elasticsearch: Pagination

    Paginated search: from/size

    GET lqz/doc/_search
    {
      "query": {
        "match_all": {}
      },
      "sort": [
        {
          "age": {
            "order": "desc"
          }
        }
      ], 
      "from": 2,
      "size": 1
    }
    
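    # The example above matches all documents and sorts them by `age` in descending order. The two extra properties, `from` and `size`, control which slice of the result set is returned.

    - from: the offset of the first hit to return (0-based)
    - size: the number of hits to return

    # How does this give us pagination?
    With 10 documents per page:
    Page 1:
      "from": 0,
      "size": 10
    Page 2:
      "from": 10,
      "size": 10
    Page 3:
      "from": 20,
      "size": 10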
    {
      "took" : 0,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 5,
        "max_score" : null,
        "hits" : [
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "5",
            "_score" : null,
            "_source" : {
              "name" : "魏行首",
              "age" : 25,
              "from" : "广云台",
              "desc" : "仿佛兮若轻云之蔽月,飘飘兮若流风之回雪,mmp,最后竟然没有嫁给顾老二!",
              "tags" : [
                "闭月",
                "羞花"
              ]
            },
            "sort" : [
              25
            ]
          }
        ]
      }
    }
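
    The page arithmetic listed earlier reduces to from = (page - 1) * size. A minimal sketch follows, with hypothetical helper names, 1-based page numbers as an assumption, and a local list standing in for the sorted hits.

```python
def page_params(page, page_size=10):
    # Pages are 1-based; "from" is the number of hits to skip.
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {"from": (page - 1) * page_size, "size": page_size}


def paginate(sorted_hits, from_, size):
    # Local stand-in for what from/size select out of an ordered result list.
    return sorted_hits[from_:from_ + size]


ages_desc = [30, 29, 25, 22, 18]  # the sample ages, sorted descending
third_page = page_params(3)
window = paginate(ages_desc, 2, 1)
print(third_page, window)
```

    With from 2 and size 1 the window holds the single age 25, which matches the one hit returned above.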

    Elasticsearch: Bool (Compound) Queries

    Multiple conditions can be combined with:

    - must (and)
    - should (or)
    - must_not (not)
    - filter

    Compound query: must

    # Find the documents where from is gu and age is 30
    GET lqz/doc/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "from": "gu"
              }
            },
            {
              "match": {
                "age": "30"
              }
            }
          ]
        }
      }
    }
    {
      "took" : 8,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 1,
        "max_score" : 1.287682,
        "hits" : [
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "1",
            "_score" : 1.287682,
            "_source" : {
              "name" : "顾老二",
              "age" : 30,
              "from" : "gu",
              "desc" : "皮肤黑、武器长、性格直",
              "tags" : [
                "",
                "",
                ""
              ]
            }
          }
        ]
      }
    }

    Note: any clause whose value is a list (array) can hold several conditions side by side.

    Compound query: should

    # Find the documents where `from` is `gu` or `tags` contains `闭月`
    GET lqz/doc/_search
    {
      "query": {
        "bool": {
          "should": [
            {
              "match": {
                "from": "gu"
              }
            },
            {
              "match": {
                "tags": "闭月"
              }
            }
          ]
        }
      }
    }
    {
      "took" : 1,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 4,
        "max_score" : 0.6931472,
        "hits" : [
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "4",
            "_score" : 0.6931472,
            "_source" : {
              "name" : "石头",
              "age" : 29,
              "from" : "gu",
              "desc" : "粗中有细,狐假虎威",
              "tags" : [
                "",
                "",
                ""
              ]
            }
          },
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "5",
            "_score" : 0.5753642,
            "_source" : {
              "name" : "魏行首",
              "age" : 25,
              "from" : "广云台",
              "desc" : "仿佛兮若轻云之蔽月,飘飘兮若流风之回雪,mmp,最后竟然没有嫁给顾老二!",
              "tags" : [
                "闭月",
                "羞花"
              ]
            }
          },
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "1",
            "_score" : 0.2876821,
            "_source" : {
              "name" : "顾老二",
              "age" : 30,
              "from" : "gu",
              "desc" : "皮肤黑、武器长、性格直",
              "tags" : [
                "",
                "",
                ""
              ]
            }
          },
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "3",
            "_score" : 0.2876821,
            "_source" : {
              "name" : "龙套偏房",
              "age" : 22,
              "from" : "gu",
              "desc" : "mmp,没怎么看,不知道怎么形容",
              "tags" : [
                "造数据",
                "",
                ""
              ]
            }
          }
        ]
      }
    }

    Compound query: must_not

    # Find the documents where `from` is not `gu`, `tags` is not `可爱`, and `age` is not `18`
    
    GET lqz/doc/_search
    {
      "query": {
        "bool": {
          "must_not": [
            {
              "match": {
                "from": "gu"
              }
            },
            {
              "match": {
                "tags": "可爱"
              }
            },
            {
              "match": {
                "age": 18
              }
            }
          ]
        }
      }
    }

    filter queries

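    filter applies conditions that narrow the results without affecting the score; a range of values is expressed with `range`, where `gt` means greater than
    gt: greater than   lt: less than     gte: greater than or equal     lte: less than or equal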
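
    The four range operators map directly onto ordinary comparisons. The sketch below shows that mapping with a hypothetical matches_range helper; it evaluates a range clause against plain numbers and is not part of any Elasticsearch client.

```python
import operator

# Range keyword -> Python comparison function.
RANGE_OPERATORS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def matches_range(value, conditions):
    # conditions looks like the body of a range clause, e.g. {"gt": 25, "lte": 30}.
    return all(RANGE_OPERATORS[name](value, bound) for name, bound in conditions.items())


ages = [18, 22, 25, 29, 30]
older_than_25 = [age for age in ages if matches_range(age, {"gt": 25})]
print(older_than_25)
```

    Several bounds can be combined in one clause, for example {"gte": 22, "lt": 30}, and all of them must hold.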

    # Find the documents where `from` is `gu` and `age` is greater than `25`
    
    GET lqz/doc/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "from": "gu"
              }
            }
          ],
          "filter": {
            "range": {
              "age": {
                "gt": 25
              }
            }
          }
        }
      }
    }
    {
      "took" : 2,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 2,
        "max_score" : 0.6931472,
        "hits" : [
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "4",
            "_score" : 0.6931472,
            "_source" : {
              "name" : "石头",
              "age" : 29,
              "from" : "gu",
              "desc" : "粗中有细,狐假虎威",
              "tags" : [
                "",
                "",
                ""
              ]
            }
          },
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "1",
            "_score" : 0.2876821,
            "_source" : {
              "name" : "顾老二",
              "age" : 30,
              "from" : "gu",
              "desc" : "皮肤黑、武器长、性格直",
              "tags" : [
                "",
                "",
                ""
              ]
            }
          }
        ]
      }
    }

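    Summary:

    • must: the "and" relation, equivalent to and in a relational database

    • should: the "or" relation, equivalent to or in a relational database

    • must_not: the "not" relation, equivalent to not in a relational database

    • filter: filtering conditions (they do not contribute to the score).

    • range: the range a condition selects.

    • gt: greater than, equivalent to > in a relational database

    • gte: greater than or equal, equivalent to >= in a relational database

    • lt: less than, equivalent to < in a relational database

    • lte: less than or equal, equivalent to <= in a relational database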
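
    How the four clause types combine can be modelled with a small local evaluator. This is a simplified sketch, not Elasticsearch's implementation: bool_matches and search are hypothetical names, matching is plain equality (or membership for list fields) rather than analyzed full-text matching, scoring is ignored, and the sample documents are reduced to the fields used here with empty tags dropped.

```python
def term_matches(doc, field, value):
    # Simplified stand-in for match: equality, or membership for list fields.
    stored = doc.get(field)
    return value in stored if isinstance(stored, list) else stored == value


def bool_matches(doc, must=(), should=(), must_not=(), filter_=()):
    # Each clause is a (field, value) pair or a predicate taking the document.
    def check(clause):
        return clause(doc) if callable(clause) else term_matches(doc, *clause)

    if not all(check(c) for c in must):
        return False
    if not all(check(c) for c in filter_):
        return False
    if any(check(c) for c in must_not):
        return False
    # With no must or filter clauses, at least one should clause has to match.
    if should and not must and not filter_:
        return any(check(c) for c in should)
    return True


def search(docs, **clauses):
    return [doc["_id"] for doc in docs if bool_matches(doc, **clauses)]


people = [
    {"_id": 1, "age": 30, "from": "gu", "tags": []},
    {"_id": 2, "age": 18, "from": "sheng", "tags": []},
    {"_id": 3, "age": 22, "from": "gu", "tags": ["造数据"]},
    {"_id": 4, "age": 29, "from": "gu", "tags": []},
    {"_id": 5, "age": 25, "from": "广云台", "tags": ["闭月", "羞花"]},
]

gu_over_25 = search(people, must=[("from", "gu")], filter_=[lambda d: d["age"] > 25])
gu_or_biyue = search(people, should=[("from", "gu"), ("tags", "闭月")])
none_of = search(people, must_not=[("from", "gu"), ("tags", "可爱"), ("age", 18)])
print(gu_over_25, gu_or_biyue, none_of)
```

    The first two id lists agree with the filter and should results shown earlier (ids 1 and 4, and ids 1, 3, 4, 5); the order differs because this model does not rank by score.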

    Elasticsearch: Filtering the Returned Fields

    1. Introduction

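    In practice a document may have many fields, and every search returns all of them by default. With a large amount of data that is wasteful: if I only want someone's phone number, there is no point in also returning their hobbies and other unrelated fields. So we filter the result and tell Elasticsearch exactly which fields we want.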

    2. Preparing the data

    PUT lqz/doc/1
    {
      "name":"顾老二",
      "age":30,
      "from": "gu",
      "desc": "皮肤黑、武器长、性格直",
      "tags": ["", "", ""]
    }

    3. Filtering the result: _source

    Now, out of everything in the result, I only want to see the name and age properties. How do I drop the rest?

    GET lqz/doc/_search
    {
      "query": {
        "match": {
          "name": "顾老二"
        }
      },
      "_source": ["name", "age"]
    }
    {
      "took" : 8,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 1,
        "max_score" : 0.8630463,
        "hits" : [
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "1",
            "_score" : 0.8630463,
            "_source" : {
              "name" : "顾老二",
              "age" : 30
            }
          }
        ]
      }
    }

    When the data volume is large, returning only the fields we actually need keeps responses small and improves query efficiency.
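
    The effect of "_source": ["name", "age"] on a single hit can be mimicked with a dictionary comprehension. The sketch below uses a hypothetical filter_source helper and handles only top-level field names; it does not cover wildcard patterns or nested paths.

```python
def filter_source(source, fields):
    # Keep only the requested top-level fields, like "_source": ["name", "age"].
    return {key: value for key, value in source.items() if key in fields}


doc = {
    "name": "顾老二",
    "age": 30,
    "from": "gu",
    "desc": "皮肤黑、武器长、性格直",
    "tags": ["", "", ""],
}
trimmed = filter_source(doc, ["name", "age"])
print(trimmed)
```

    The trimmed dictionary has the same two fields as the _source object in the response above.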

  • Original article: https://www.cnblogs.com/baohanblog/p/12843189.html