  • Elasticsearch Search APIs

    Search APIs
    Official Search APIs documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/search.html

    The Search API comes in two flavors: URI Search and Request Body Search. URI Search passes the query as URL parameters; Request Body Search uses the more complete, JSON-based Query Domain Specific Language (DSL) that Elasticsearch provides.

    Syntax                    Scope
    /_search                  all indices in the cluster
    /index1/_search           index1
    /index1,index2/_search    index1 and index2
    /index*/_search           indices whose names start with "index"

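    The index portion of the request path controls which indices are searched, which gives several flexible options:

    /_search - the whole cluster
    /index1/_search - a single index
    /index1,index2/_search - multiple indices
    /index*/_search - indices matching a wildcard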
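The path patterns above can be generated from a list of index names. This is a minimal sketch in Python; `search_path` is a hypothetical helper name, not part of any Elasticsearch client:

```python
def search_path(indices=None):
    """Return the _search path for zero or more index names or patterns."""
    if not indices:
        # No index given: the search runs against the whole cluster.
        return "/_search"
    # Several names or wildcard patterns are joined with commas.
    return "/" + ",".join(indices) + "/_search"


print(search_path())                      # /_search
print(search_path(["index1"]))            # /index1/_search
print(search_path(["index1", "index2"]))  # /index1,index2/_search
print(search_path(["index*"]))            # /index*/_search
```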
    

    Use the q parameter to specify the query string

    #URI Query
    GET kibana_sample_data_ecommerce/_search?q=customer_first_name:Eddie
    - q carries the query
    - customer_first_name:Eddie searches for customers whose first name is Eddie
    
    GET kibana*/_search?q=customer_first_name:Eddie
    GET /_all/_search?q=customer_first_name:Eddie
    

    Request Body Query

    #REQUEST Body
    POST kibana_sample_data_ecommerce/_search
    {
        "profile": true,
        "query": {
            "match_all": {}
        }
    }
    query: the query to run
    match_all: returns all documents
    

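    Response

    took - time spent on the request, in milliseconds

    timed_out - whether the request timed out

    _shards - shard information

    total - total number of matching documents

    hits - the result set, the top 10 by default

    _index - index name

    _id - document id

    _score - relevance score

    _source - the original document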
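To make the field list concrete, the sketch below walks a response in Python. The `sample` JSON is hand-written for illustration (made-up values, not real output), and `summarize` is a hypothetical helper. It assumes the 7.x response shape, where `hits.total` is an object with `value` and `relation`; earlier versions return a plain integer there.

```python
import json

# Hand-written sample with the same shape as a search response (values are made up).
sample = json.loads("""
{
  "took": 3,
  "timed_out": false,
  "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 2, "relation": "eq"},
    "max_score": 1.0,
    "hits": [
      {"_index": "movies", "_id": "1", "_score": 1.0,
       "_source": {"title": "Example A", "year": 2012}},
      {"_index": "movies", "_id": "2", "_score": 1.0,
       "_source": {"title": "Example B", "year": 2009}}
    ]
  }
}
""")


def summarize(resp):
    """Return (took, timed_out, total, rows) extracted from a search response."""
    total = resp["hits"]["total"]
    # 7.x returns {"value": ..., "relation": ...}; older versions return an int.
    count = total["value"] if isinstance(total, dict) else total
    rows = [(h["_index"], h["_id"], h["_score"], h["_source"])
            for h in resp["hits"]["hits"]]
    return resp["took"], resp["timed_out"], count, rows


took, timed_out, count, rows = summarize(sample)
print(took, timed_out, count)
for index, doc_id, score, source in rows:
    print(index, doc_id, score, source["title"])
```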

    https://searchenginewatch.com/sew/news/2065080/search-engines-101

    https://www.huffpost.com/entry/search-engines-101-part-i_b_1104525

    https://www.entrepreneur.com/article/176398

    https://www.searchtechnologies.com/meaning-of-relevancy

    URI Search

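    • Field-specific queries vs. all-field queries
    • Using the profile parameter
    • The difference between term queries and phrase queries
    • How to group query conditions
    • Logical operators
    • Wildcards and fuzzy matching

    A basic query looks like this: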

    GET /movies/_search?q=2012&df=title&sort=year:desc&from=0&size=10&timeout=1s
    {
      "profile": "true"
    }
    
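    • q specifies the query, written in Query String Syntax
    • df is the default field; when it is omitted, all fields are searched
    • sort specifies the sort order
    • from and size control pagination
    • profile shows how the query is executed
    • timeout sets a timeout; by default there is none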
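The parameters above can be assembled into a request URL with Python's standard library. This is a sketch that assumes every value should be percent-encoded; `uri_search_url` is a hypothetical helper name:

```python
from urllib.parse import quote, urlencode


def uri_search_url(index, params):
    """Build a URI Search path; params maps parameter names (q, df, sort, ...) to values."""
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return "/" + index + "/_search?" + urlencode(params, quote_via=quote)


url = uri_search_url("movies", {
    "q": "2012",
    "df": "title",
    "sort": "year:desc",
    "from": 0,
    "size": 10,
    "timeout": "1s",
})
print(url)  # /movies/_search?q=2012&df=title&sort=year%3Adesc&from=0&size=10&timeout=1s
```

A dict is used rather than keyword arguments because `from` is a reserved word in Python.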

    Field-specific query vs. all-field query, i.e. q=title:2012 vs. q=2012

    # With profile, using df to set the default field
    GET /movies/_search?q=2012&df=title
    {
        "profile":"true"
    }
    
    # All-field query: no field is given, so every field is searched
    GET /movies/_search?q=2012
    {
        "profile":"true"
    }
    
    # Field-specific query
    GET /movies/_search?q=title:2012&sort=year:desc&from=0&size=10&timeout=1s
    {
        "profile":"true"
    }
    
    # Field-specific query again, without the sorting and paging parameters
    GET /movies/_search?q=title:2012
    {
        "profile":"true"
    }
    

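    Term vs Phrase

    • Beautiful Mind is equivalent to Beautiful OR Mind
    • "Beautiful Mind" is equivalent to Beautiful AND Mind; a phrase query additionally requires the terms to appear in the same order
    # Search for "A Beautiful Mind"; without quotes, only Beautiful is bound to title and Mind is searched across all fields
    GET /movies/_search?q=title:Beautiful Mind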
    {
        "profile":"true"
    }
    
    # Quoted: phrase query
    GET /movies/_search?q=title:"Beautiful Mind"
    {
        "profile":"true"
    }
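The difference between term matching and phrase matching can be illustrated with a toy matcher. This is only a sketch of the semantics described above, using whitespace tokenization and made-up titles; it is not how Elasticsearch or Lucene implement these queries.

```python
def tokens(text):
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return text.lower().split()


def matches_any_term(title, query):
    """Term semantics with OR: at least one query term occurs in the title."""
    words = tokens(title)
    return any(term in words for term in tokens(query))


def matches_all_terms(title, query):
    """Term semantics with AND: every query term occurs, in any order."""
    words = tokens(title)
    return all(term in words for term in tokens(query))


def matches_phrase(title, query):
    """Phrase semantics: the query terms occur adjacently and in the same order."""
    words, phrase = tokens(title), tokens(query)
    n = len(phrase)
    return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))


for title in ["A Beautiful Mind", "Beautiful Creatures", "Mind the Beautiful"]:
    print(title,
          matches_any_term(title, "Beautiful Mind"),
          matches_all_terms(title, "Beautiful Mind"),
          matches_phrase(title, "Beautiful Mind"))
```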
    

    Grouping and quotes

    • title:(Beautiful AND Mind)

    • title:"Beautiful Mind"

    # Grouping with parentheses produces a bool query
    GET /movies/_search?q=title:(Beautiful Mind)
    {
        "profile":"true"
    }
    

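    Boolean operators: AND, OR, NOT (uppercase), or equivalently &&, ||, !

    # Search for "A Beautiful Mind": both terms must appear in the title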
    GET /movies/_search?q=title:(Beautiful AND Mind)
    {
        "profile":"true"
    }
    
    # Titles that contain Beautiful but not Mind
    GET /movies/_search?q=title:(Beautiful NOT Mind)
    {
        "profile":"true"
    }
    

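    Prefix operators: + means must and - means must_not, e.g. title:(+matrix -reloaded)

    # Search for "A Beautiful Mind"; + is written as %2B because a literal + in a URL query string is decoded as a space
    GET /movies/_search?q=title:(Beautiful %2BMind)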
    {
        "profile":"true"
    }
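In a URL query string a literal + is conventionally decoded as a space, so a must operator generally has to be sent as %2B. The sketch below shows the effect with Python's own query-string parser, which follows that convention; it is assumed here that the server decodes the parameter the same way.

```python
from urllib.parse import parse_qs, quote

expression = "title:(Beautiful +Mind)"

# Sent unencoded, the '+' is decoded as a space and the must operator is lost.
unencoded = parse_qs("q=" + expression)["q"][0]

# Percent-encoding turns '+' into %2B, which survives decoding.
encoded = quote(expression, safe=":()")
decoded = parse_qs("q=" + encoded)["q"][0]

print(repr(unencoded))  # 'title:(Beautiful  Mind)'
print(encoded)          # title:(Beautiful%20%2BMind)
print(repr(decoded))    # 'title:(Beautiful +Mind)'
```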
    

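    Range queries use interval notation, with [] for inclusive and {} for exclusive bounds: year:{2010 TO 2018}, year:[* TO 2018]

    Comparison operators: year:>2010, year:(>2010 && <=2018), year:(+>2010 +<=2018)

    # Inclusive lower bound, exclusive upper bound; the closing } is URL-encoded as %7D
    GET /movies/_search?q=title:beautiful AND year:[2002 TO 2018%7D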
    {
        "profile":"true"
    }
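The bracket rules for ranges are easy to get wrong, so a small builder can help. This is a sketch; `range_expr` is a hypothetical helper that only formats the expression and does not URL-encode it.

```python
def range_expr(field, low="*", high="*", include_low=True, include_high=True):
    """Format a query-string range: [] bounds are inclusive, {} bounds are exclusive."""
    left = "[" if include_low else "{"
    right = "]" if include_high else "}"
    return f"{field}:{left}{low} TO {high}{right}"


print(range_expr("year", 2002, 2018, include_high=False))  # year:[2002 TO 2018}
print(range_expr("year", high=2018))                       # year:[* TO 2018]
print(range_expr("year", 2010, 2018, False, False))        # year:{2010 TO 2018}
```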
    
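    # Wildcard queries
    - Wildcard queries are slow and memory-hungry; avoid them, especially with a leading wildcard
    - ? matches exactly one character, * matches zero or more characters
    - title:mi?d
    - title:be*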
    
    GET /movies/_search?q=title:b*
    {
        "profile":"true"
    }
    
    
    # Regular expressions
    - wrap the pattern in forward slashes, e.g. title:/[bt]oy/
    
    
    # Fuzzy matching and proximity matching
    title:beautifl~1
    title:"lord rings"~2
    
    GET /movies/_search?q=title:beautifl~1
    {
        "profile":"true"
    }
    
    
    GET /movies/_search?q=title:"Lord Rings"~2
    {
        "profile":"true"
    }
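The number after ~ on a single term is the maximum edit distance allowed. The sketch below computes plain Levenshtein distance to show why beautifl~1 can match beautiful. Elasticsearch's fuzzy matching by default also counts a swap of two adjacent characters as a single edit, which this simplified version does not.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


print(edit_distance("beautifl", "beautiful"))  # 1, within ~1
print(edit_distance("befutifl", "beautiful"))  # 2, outside ~1
```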
    

    https://www.elastic.co/guide/en/elasticsearch/reference/7.0/search-uri-request.html

    https://www.elastic.co/guide/en/elasticsearch/reference/7.0/search-search.html

    https://www.jianshu.com/p/8fa06643a682

    Request Body Search

    • Pagination / sorting
    • Source filtering
    • Match queries and Match Phrase queries
    • Tuning precision & recall
    # Index a batch of sample documents
    POST _bulk
    {"index":{ "_index": "books", "_type": "IT", "_id": "1" }}
    {"id":"1","title":"Java编程思想","language":"java","author":"Bruce Eckel","price":70.20,"publish_time":"2007-10-01","description":"Java学习必读经典,殿堂级著作!赢得了全球程序员的广泛赞誉。"}
    {"index":{ "_index": "books", "_type": "IT", "_id": "2" }}
    {"id":"2","title":"Java程序性能优化","language":"java","author":"葛一鸣","price":46.50,"publish_time":"2012-08-01","description":"让你的Java程序更快、更稳定。深入剖析软件设计层面、代码层面、JVM虚拟机层面的优化方法"}
    {"index":{ "_index": "books", "_type": "IT", "_id": "3" }}
    {"id":"3","title":"Python科学计算","language":"python","author":"张若愚","price":81.40,"publish_time":"2016-05-01","description":"零基础学python,光盘中作者独家整合开发winPython运行环境,涵盖了Python各个扩展库"}
    {"index":{ "_index": "books", "_type": "IT", "_id": "4" }}
    {"id":"4","title":"Python基础教程","language":"python","author":"Helant","price":54.50,"publish_time":"2014-03-01","description":"经典的Python入门教程,层次鲜明,结构严谨,内容翔实"}
    {"index":{ "_index": "books", "_type": "IT", "_id": "5" }}
    {"id":"5","title":"JavaScript高级程序设计","language":"javascript","author":"Nicholas C. Zakas","price":66.40,"publish_time":"2012-10-01","description":"JavaScript技术经典名著"}
    

    Query all documents

    GET books/_search
    {
      "query": {
        "match_all": {}
      }
    }
    

    This is equivalent to

    GET books/_search
    

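    Request Body Search sends the query to Elasticsearch in the HTTP request body, written in the Query DSL.

    Setting ignore_unavailable=true suppresses the error that would otherwise result from targeting a nonexistent index such as "404_idx":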

    POST /movies,404_idx/_search?ignore_unavailable=true
    {
      "profile": true,
      "query": {
        "match_all": {}
      }
    }
    

    Pagination

    from is a zero-based offset (default 0) and size is the number of results to return (default 10):

    POST /kibana_sample_data_ecommerce/_search
    {
      "from":10,
      "size":20,
      "query":{
        "match_all": {}
      }
    }
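Translating a page number into from and size is simple arithmetic. A sketch follows; `page_body` is a hypothetical helper, and the 10000 limit reflects the default value of the index.max_result_window setting, which an index may override.

```python
MAX_RESULT_WINDOW = 10000  # default index.max_result_window; may differ per index


def page_body(page, size=10):
    """Build a match_all request body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    offset = (page - 1) * size  # from is zero-based
    if offset + size > MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds the result window")
    return {"from": offset, "size": size, "query": {"match_all": {}}}


print(page_body(1))      # first page: from 0, size 10
print(page_body(3, 20))  # third page of 20: from 40, size 20
```

For deeper paging than the result window allows, Elasticsearch offers other mechanisms such as search_after.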
    

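    Sorting

    1. Prefer sorting on numeric and date fields.

    2. When sorting on a multi-valued or analyzed field, the system picks one of the values as the sort key, and you cannot tell which one it picked.

    # Sort by date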
    POST kibana_sample_data_ecommerce/_search
    {
      "sort":[{"order_date":"desc"}],
      "query":{
        "match_all": {}
      }
    }
    

    Source filtering

    Source filtering restricts which parts of _source are returned. If _source is not stored, only the metadata of the matching documents is returned. _source accepts wildcards, e.g. "_source": ["name*"]

    POST kibana_sample_data_ecommerce/_search
    {
      "_source":["order_date"],
      "query":{
        "match_all": {}
      }
    }
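The effect of a wildcard pattern in _source can be imitated on a plain dictionary. This is a toy sketch that handles top-level keys only, whereas Elasticsearch also understands nested field paths; the sample document is made up.

```python
from fnmatch import fnmatchcase


def filter_source(source, patterns):
    """Keep only the top-level keys that match at least one pattern."""
    return {key: value for key, value in source.items()
            if any(fnmatchcase(key, pattern) for pattern in patterns)}


doc = {"order_date": "2020-06-01", "order_id": 42, "customer_first_name": "Eddie"}
print(filter_source(doc, ["order_date"]))  # {'order_date': '2020-06-01'}
print(filter_source(doc, ["order*"]))      # {'order_date': '2020-06-01', 'order_id': 42}
```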
    

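    Script fields

    Use case: orders carry different exchange rates, and the order price has to be combined with the exchange rate before sorting. The example below only demonstrates the mechanism by appending a string to order_date.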

    GET kibana_sample_data_ecommerce/_search
    {
      "script_fields": {
        "new_field": {
          "script": {
            "lang": "painless",
            "source": "doc['order_date'].value+'hello'"
          }
        }
      },
      "query": {
        "match_all": {}
      }
    }
    

    Query expressions - Match

    POST movies/_search
    {
      "query": {
        "match": {
          "title": "last christmas"
        }
      }
    }
    
    
    POST movies/_search
    {
      "query": {
        "match": {
          "title": {
            "query": "last christmas",
            "operator": "and"
          }
        }
      }
    }
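The two match requests above differ only in the operator: the default is or, so a document matches if it contains any of the terms, while and requires all of them, trading recall for precision. A sketch that builds either body; `match_query` is a hypothetical helper name:

```python
import json


def match_query(field, text, operator="or"):
    """Build a match request body; operator is 'or' (the default) or 'and'."""
    if operator not in ("or", "and"):
        raise ValueError("operator must be 'or' or 'and'")
    return {"query": {"match": {field: {"query": text, "operator": operator}}}}


print(json.dumps(match_query("title", "last christmas"), indent=2))
print(json.dumps(match_query("title", "last christmas", "and"), indent=2))
```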
    

    Phrase search - Match Phrase

    POST movies/_search
    {
      "query": {
        "match_phrase": {
          "title":{
            "query": "one love"
          }
        }
      }
    }
    
    
    POST movies/_search
    {
      "query": {
        "match_phrase": {
          "title":{
            "query": "one love",
            "slop": 1
          }
        }
      }
    }
    

    Query string && Simple query string

    • URI Query
    • Query String
    • Simple Query String
    POST users/_search
    {
      "query": {
        "query_string": {
          "default_field": "name",
          "query": "Ruan AND Yiming"
        }
      }
    }
    

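    Simple Query String is similar to Query String, but it ignores invalid syntax and supports only a subset of the query syntax.

    • AND, OR and NOT are not supported as operators; they are treated as plain terms;
    • The default relation between terms is OR, and it can be changed with default_operator;
    • A subset of the logic is supported: + in place of AND, | in place of OR, - in place of NOT.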
    POST users/_search
    {
      "query": {
        "query_string": {
          "fields":["name","about"],
          "query": "(Ruan AND Yiming) OR (Java AND Elasticsearch)"
        }
      }
    }
    
    # simple_query_string defaults to OR, and AND here is treated as a plain term
    POST users/_search
    {
      "query": {
        "simple_query_string": {
          "query": "Ruan AND Yiming",
          "fields": ["name"]
        }
      }
    }
    
    POST users/_search
    {
      "query": {
        "simple_query_string": {
          "query": "Ruan Yiming",
          "fields": ["name"],
          "default_operator": "AND"
        }
      }
    }
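The request bodies for the two query types have nearly the same shape, which makes it easy to switch between them. A sketch follows; `string_query` is a hypothetical helper, and only the fields, query and default_operator parameters used in this article are covered.

```python
import json


def string_query(kind, text, fields, default_operator=None):
    """Build a query_string or simple_query_string request body."""
    if kind not in ("query_string", "simple_query_string"):
        raise ValueError("unknown query type: " + kind)
    clause = {"query": text, "fields": list(fields)}
    if default_operator is not None:
        # Applies to terms that are not joined by an explicit operator.
        clause["default_operator"] = default_operator
    return {"query": {kind: clause}}


# query_string parses AND as an operator.
print(json.dumps(string_query("query_string", "Ruan AND Yiming", ["name"])))
# simple_query_string needs default_operator (or +) to require both terms.
print(json.dumps(string_query("simple_query_string", "Ruan Yiming", ["name"], "AND")))
```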
    
    GET /movies/_search
    {
      "profile": true,
      "query":{
        "query_string":{
          "default_field": "title",
          "query": "Beautiful AND Mind"
        }
      }
    }
    
    # Multiple fields
    GET /movies/_search
    {
      "profile": true,
      "query":{
        "query_string":{
          "fields":[
            "title",
            "year"
          ],
          "query": "2012"
        }
      }
    }
    
    GET /movies/_search
    {
      "profile":true,
      "query":{
        "simple_query_string":{
          "query":"Beautiful +mind",
          "fields":["title"]
        }
      }
    }
    

    References

    https://www.elastic.co/guide/en/elasticsearch/reference/7.0/search-uri-request.html

    https://www.elastic.co/guide/en/elasticsearch/reference/7.0/search-search.html

    https://www.jianshu.com/p/6ba5c755fe3f

  • Original article: https://www.cnblogs.com/shuiyj/p/13185075.html