  • Searching with ElasticSearch

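    1 DSL Search

    DSL (Domain Specific Language) is the JSON-based search syntax provided by ES. A search request carries JSON in a specific format, and different JSON bodies express different search requirements.

    1.1. Search all records with paging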

      @Test
        public void testSearchAll() throws Exception {
            // Search request object
            SearchRequest searchRequest = new SearchRequest("xc_course");
            searchRequest.types("doc");
            // Search source builder
            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
            // Paging
            // Page number (1-based)
            int page = 1;
            // Number of records per page
            int size = 2;
            // Index of the first record on the page
            int from = (page - 1) * size;
            searchSourceBuilder.from(from); // index of the first record, starting at 0
            searchSourceBuilder.size(size); // number of records per page
            // Query type
            searchSourceBuilder.query(QueryBuilders.matchAllQuery()); // match every document
            // Source filtering: the first argument lists the fields to include, the second the fields to exclude
            searchSourceBuilder.fetchSource(new String[]{"name","studymodel","price","timestamp"},new String[]{});
            // Attach the search source to the search request
            searchRequest.source(searchSourceBuilder);
            // Execute the search (sends an HTTP request to ES)
            SearchResponse search = client.search(searchRequest);
            // Search results
            SearchHits hits = search.getHits();
            // Total number of matching records
            long totalHits = hits.getTotalHits();
            // The best-matching documents
            SearchHit[] hits1 = hits.getHits();
            SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            for (SearchHit hit : hits1){
                 // Document id
                 String id = hit.getId();
                 // Source document content
                 Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                 System.out.println(sourceAsMap);
                 // Date
                 // Date timestamp = dateFormat.parse((String) sourceAsMap.get("timestamp"));
                 // System.out.println(timestamp);
            }
        }
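
    The paging arithmetic above can be checked in isolation. The following is a minimal sketch in plain Java with no ES dependency; the class name PageOffset and the method from are hypothetical helpers introduced only for illustration.

```java
final class PageOffset {
    /** Converts a 1-based page number into the 0-based index of its first record. */
    static int from(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must both be >= 1");
        }
        return (page - 1) * size;
    }

    public static void main(String[] args) {
        int size = 2;
        // Print the "from" value that would be passed to the search source for each page
        for (int page = 1; page <= 3; page++) {
            System.out.println("page " + page + " -> from " + from(page, size));
        }
    }
}
```

    The returned value is what would be passed to searchSourceBuilder.from(...), with size passed unchanged to searchSourceBuilder.size(...).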
    

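    1.2. Term Query

    A term query is an exact-match query: the search keyword is matched as a whole and is not analyzed (split into terms).

    Change the query in 1.1 to:

    QueryBuilders.termQuery("name","spring") // search by name: documents whose name contains the term "spring"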
    

    1.3. Exact match by id

    Change the query in 1.1 to:

    QueryBuilders.termsQuery("_id", "1", "2") // match documents by id; "1" and "2" are placeholder ids
    

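    1.4. match Query

    1. Basic usage

    A match query is a full-text search: the search string is first analyzed into terms, and each term is then looked up in the index.

    The difference from a term query is that a match query analyzes the search keywords before searching, while a term query searches for the keyword as given.

    Send: POST http://localhost:9200/xc_course/doc/_search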

    { "query": 
     { "match" :
      { "description" :
       { "query" : "spring开发", "operator" : "or" }
      }
     }
    }
    

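    operator: "or" means a document matches if at least one term appears in it; "and" means a document matches only if every term appears in it.

    query: the search keywords. Multiple English words need a separator between them, such as a space or a half-width comma. Chinese keywords can be written with or without a separator, provided the analyzer configured for the field can segment Chinese.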
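
    To make the difference between term, match with "or", and match with "and" concrete, here is a minimal sketch in plain Java. It does not use ES: the analyze method is a hypothetical stand-in for an analyzer (it lowercases and splits on whitespace and commas), and the class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class TermVsMatchSketch {
    /** Hypothetical analyzer: lowercases the text and splits it on whitespace and commas. */
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase().split("[\\s,]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    /** Term-style lookup: the keyword is NOT analyzed and must equal one indexed term. */
    static boolean term(String fieldText, String keyword) {
        return analyze(fieldText).contains(keyword);
    }

    /** Match-style lookup: the query IS analyzed, then combined with "and" or "or". */
    static boolean match(String fieldText, String query, boolean and) {
        Set<String> indexed = new HashSet<>(analyze(fieldText));
        List<String> queryTerms = analyze(query);
        if (queryTerms.isEmpty()) {
            return false;
        }
        int found = 0;
        for (String t : queryTerms) {
            if (indexed.contains(t)) {
                found++;
            }
        }
        return and ? found == queryTerms.size() : found > 0;
    }

    public static void main(String[] args) {
        String description = "Spring framework development";
        System.out.println(term(description, "spring"));
        System.out.println(term(description, "spring framework"));
        System.out.println(match(description, "spring css", false));
        System.out.println(match(description, "spring css", true));
    }
}
```

    In this sketch the keyword "spring framework" finds nothing as a term lookup because it is compared as one unsplit string, while the same words succeed as a match lookup because they are split first.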

    2. minimum_should_match

      Specifies the proportion of query terms that a document must match:

    { "query": 
     { "match" :
      { "description" :
       { 
           "query" : "spring开发框架", "minimum_should_match": "80%"
       }
      } 
     } 
    }
    

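    Setting "minimum_should_match": "80%" requires 80% of the query terms to match. Assuming the analyzer splits the query into three terms, 3 * 0.8 = 2.4, which is rounded down to 2, so at least two of the three terms must match in a document.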
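
    The arithmetic can be sketched in plain Java. This is only an illustration of the rounding described above for positive percentages; the class and method names are hypothetical, and other forms of minimum_should_match (negative values, fixed counts, combinations) are not covered.

```java
final class MinimumShouldMatch {
    /**
     * Number of query terms that must match for a positive percentage,
     * rounded down as in the 3 * 0.8 = 2.4 -> 2 example.
     */
    static int requiredTerms(int termCount, int percent) {
        if (termCount < 0 || percent < 0 || percent > 100) {
            throw new IllegalArgumentException("termCount must be >= 0 and percent in 0..100");
        }
        // Integer division truncates, which rounds down for non-negative values
        return termCount * percent / 100;
    }

    public static void main(String[] args) {
        System.out.println(requiredTerms(3, 80)); // 2
        System.out.println(requiredTerms(2, 50)); // 1
    }
}
```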

    In Java:

    // Query type
    searchSourceBuilder.query(QueryBuilders.matchQuery("description","spring开发框架")
              .minimumShouldMatch("80%"));
    

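    1.5. multi_match Query

    termQuery and matchQuery match against only one field at a time. This section covers the multi_match query, which matches several fields in one query.

    1. Basic usage

    A single-field match looks for the keywords in one field; a multi-field match looks for the same keywords in several fields.

    Example: match the keywords "spring css" against the name and description fields.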

    { 
        "query": { 
            "multi_match" : {
                "query" : "spring css",
                "minimum_should_match": "50%",
                "fields": [ "name", "description" ]
            }
        }
    }
    

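    2. Boosting a field

    When matching several fields, you can raise the boost (weight) of a field so that matches in that field score higher.

    If a keyword match in name carries more weight than a match in description, documents that match in name are ranked first.

    In the Java client: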

            // Query type
            searchSourceBuilder.query(QueryBuilders.multiMatchQuery("spring css","name","description")
                    .minimumShouldMatch("80%")  // minimum share of terms that must match
                    .field("name",10)); // boost the weight of the name field by a factor of 10
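
    The effect of a boost on ranking can be illustrated with a toy scoring function in plain Java. This is a sketch under loud assumptions: it counts matching terms per field, multiplies by the field's boost, and keeps the best field. It does not reproduce the relevance scoring ES actually performs, and all names in it are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BoostSketch {
    /** Counts how many query terms occur in the whitespace-split, lowercased field text. */
    static int countMatches(String fieldText, String[] queryTerms) {
        List<String> tokens = Arrays.asList(fieldText.toLowerCase().split("\\s+"));
        int count = 0;
        for (String term : queryTerms) {
            if (tokens.contains(term)) {
                count++;
            }
        }
        return count;
    }

    /** Toy score: best value of (boost * matching term count) over the boosted fields. */
    static double score(Map<String, String> doc, String[] queryTerms, Map<String, Double> boosts) {
        double best = 0;
        for (Map.Entry<String, Double> field : boosts.entrySet()) {
            String text = doc.getOrDefault(field.getKey(), "");
            best = Math.max(best, field.getValue() * countMatches(text, queryTerms));
        }
        return best;
    }

    public static void main(String[] args) {
        String[] query = {"spring", "css"};

        Map<String, String> a = new HashMap<>();
        a.put("name", "spring in action");
        a.put("description", "java book");

        Map<String, String> b = new HashMap<>();
        b.put("name", "java basics");
        b.put("description", "covers spring and css");

        Map<String, Double> boosted = new HashMap<>();
        boosted.put("name", 10.0);       // same factor as .field("name", 10)
        boosted.put("description", 1.0);

        System.out.println("a: " + score(a, query, boosted));
        System.out.println("b: " + score(b, query, boosted));
    }
}
```

    Without the boost, document b would outrank a in this toy model because it matches two terms in description; with name boosted by 10, the single match in a's name wins.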
    

    1.6. Bool query

    Example: run a full-text match on name and description, and an exact match on studymodel

    { "_source" :["name", "studymodel", "description"],
     "from" : 0,
     "size" : 1,
     "query":
     { "bool" :
      { "must":
       [
           { "multi_match" :
            { "query" : "spring框架", 
             "minimum_should_match": "50%",
             "fields": [ "name^10", "description" ] 
            }
           },
           {
               "term":{
                   "studymodel" : "201001"
               } 
           } 
       ]
      }
     } 
    }
    

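    must: every query clause must match (must is the one most commonly used).

    should: at least one of the query clauses needs to match.

    must_not: none of the query clauses may match.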
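
    The three clause types can be modelled as plain boolean predicates. The sketch below is an illustration in plain Java, not the ES implementation: BoolSketch is a hypothetical class, scoring is ignored, and should clauses are treated as required only when no must clause is present (an assumption of this sketch).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class BoolSketch<T> {
    private final List<Predicate<T>> must = new ArrayList<>();
    private final List<Predicate<T>> should = new ArrayList<>();
    private final List<Predicate<T>> mustNot = new ArrayList<>();

    BoolSketch<T> must(Predicate<T> clause) { must.add(clause); return this; }
    BoolSketch<T> should(Predicate<T> clause) { should.add(clause); return this; }
    BoolSketch<T> mustNot(Predicate<T> clause) { mustNot.add(clause); return this; }

    boolean matches(T doc) {
        // must: every clause has to match
        for (Predicate<T> clause : must) {
            if (!clause.test(doc)) {
                return false;
            }
        }
        // must_not: no clause may match
        for (Predicate<T> clause : mustNot) {
            if (clause.test(doc)) {
                return false;
            }
        }
        // should: with no must clause, at least one should clause has to match
        if (must.isEmpty() && !should.isEmpty()) {
            for (Predicate<T> clause : should) {
                if (clause.test(doc)) {
                    return true;
                }
            }
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new HashMap<>();
        doc.put("name", "spring framework");
        doc.put("studymodel", "201001");

        BoolSketch<Map<String, String>> query = new BoolSketch<Map<String, String>>()
                .must(d -> d.get("name").contains("spring"))
                .must(d -> "201001".equals(d.get("studymodel")));
        System.out.println(query.matches(doc));
    }
}
```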

    /**
         * BoolQuery
         *
         */
        @Test
        public void testBoolQuery() throws Exception {
            // Search request object
            SearchRequest searchRequest = new SearchRequest("xc_course");
            searchRequest.types("doc");
            // Search source builder
            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
            /**
            * Build the query with BoolQuery
            */
            // 1. Define a multiMatchQuery
            MultiMatchQueryBuilder multiMatchQueryBuilder = QueryBuilders.multiMatchQuery("spring css", "name", "description")
                    .minimumShouldMatch("80%")  // minimum share of terms that must match
                    .field("name", 10);
            // 2. Define a termQuery
            TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("studymodel", "201001");
            // BoolQuery
            BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
            boolQueryBuilder.must(multiMatchQueryBuilder);
            boolQueryBuilder.must(termQueryBuilder);  // both conditions must be satisfied

            searchSourceBuilder.query(boolQueryBuilder);
            // Source filtering: the first argument lists the fields to include, the second the fields to exclude
            searchSourceBuilder.fetchSource(new String[]{"name","studymodel","price","timestamp"},new String[]{});
            // Attach the search source to the search request
            searchRequest.source(searchSourceBuilder);
            // Execute the search (sends an HTTP request to ES)
            SearchResponse search = client.search(searchRequest);
            // Search results
            SearchHits hits = search.getHits();
            // Total number of matching records
            long totalHits = hits.getTotalHits();
            // The best-matching documents
            SearchHit[] hits1 = hits.getHits();
            SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            for (SearchHit hit : hits1){
                // Document id
                String id = hit.getId();
                // Source document content
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                System.out.println(sourceAsMap);
            }
        }
    
    

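    1.7. Filters

    A filter narrows down the search results. It only decides whether a document matches and does not compute a relevance score, so filters perform better than queries and are easy to cache. Prefer filters where possible, or combine filters with queries.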

    {
        "_source" : [ "name", "studymodel", "description","price"],
        "query": {
            "bool" : {
                "must":[{
                    "multi_match" : { 
                        "query" : "spring框架", 
                        "minimum_should_match": "50%",
                        "fields": [ "name^10", "description" ] 
                        }} ],
                "filter": [ { 
                    "term": { "studymodel": "201001" }},
                            { "range": { 
                                "price": {
                                    "gte": 60 ,"lte" : 100
                                }}} ] } } 
    }
    

    range: a range filter; it keeps records whose price is greater than or equal to 60 and less than or equal to 100.
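
    The two properties described above, inclusive bounds and no effect on the score, can be shown with a small plain-Java sketch. The Hit class and filterByPrice method are hypothetical and only model the idea; they are not part of any ES API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RangeFilterSketch {
    /** Hypothetical search hit carrying a relevance score computed earlier by a query. */
    static final class Hit {
        final String name;
        final double price;
        final double score;

        Hit(String name, double price, double score) {
            this.name = name;
            this.price = price;
            this.score = score;
        }
    }

    /** Keeps hits with gte <= price <= lte; order and scores are left untouched. */
    static List<Hit> filterByPrice(List<Hit> hits, double gte, double lte) {
        List<Hit> kept = new ArrayList<>();
        for (Hit hit : hits) {
            if (hit.price >= gte && hit.price <= lte) {
                kept.add(hit);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit("a", 59.9, 3.0),
                new Hit("b", 60.0, 2.0),
                new Hit("c", 100.0, 1.0),
                new Hit("d", 100.5, 0.5));
        for (Hit hit : filterByPrice(hits, 60, 100)) {
            System.out.println(hit.name + " price=" + hit.price + " score=" + hit.score);
        }
    }
}
```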

     /**
         * filter
         *
         */
        @Test
        public void testBoolQueryByFilter() throws Exception {
            // Search request object
            SearchRequest searchRequest = new SearchRequest("xc_course");
            searchRequest.types("doc");
            // Search source builder
            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

            /**
            * Build the query with BoolQuery
            */
            // 1. Define a multiMatchQuery
            MultiMatchQueryBuilder multiMatchQueryBuilder = QueryBuilders.multiMatchQuery("spring css", "name", "description")
                    .minimumShouldMatch("80%")  // minimum share of terms that must match
                    .field("name", 10);
            // BoolQuery
            BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
            boolQueryBuilder.must(multiMatchQueryBuilder);
            // 2. Define the filters
            boolQueryBuilder.filter(QueryBuilders.termQuery("studymodel","201001"));
            boolQueryBuilder.filter(QueryBuilders.rangeQuery("price").gte(60).lte(100));

            searchSourceBuilder.query(boolQueryBuilder);

            // Source filtering: the first argument lists the fields to include, the second the fields to exclude
            searchSourceBuilder.fetchSource(new String[]{"name","studymodel","price","timestamp"},new String[]{});
            // Attach the search source to the search request
            searchRequest.source(searchSourceBuilder);
            // Execute the search (sends an HTTP request to ES)
            SearchResponse search = client.search(searchRequest);
            // Search results
            SearchHits hits = search.getHits();
            // Total number of matching records
            long totalHits = hits.getTotalHits();
            // The best-matching documents
            SearchHit[] hits1 = hits.getHits();
            SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            for (SearchHit hit : hits1){
                // Document id
                String id = hit.getId();
                // Source document content
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                System.out.println(sourceAsMap);
            }
        }
    

    1.8. Sorting (sort)

    {
        "_source" : [ "name", "studymodel", "description","price"], 
        "query": { 
            "bool" : { 
                "filter": [ { 
                    "range": { 
                        "price": {
                            "gte": 0 ,"lte" : 100}}}
                          ] } }, 
        "sort" : [ {
            "studymodel" : "desc" }, 
             { "price" : "asc" } 
                 ]
    }
    

    java client:

            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
            // BoolQuery
            BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
            // Define a filter
            boolQueryBuilder.filter(QueryBuilders.rangeQuery("price").gte(60).lte(100));
            searchSourceBuilder.query(boolQueryBuilder);
            // Add sorting
            searchSourceBuilder.sort("studymodel", SortOrder.DESC);
            searchSourceBuilder.sort("price",SortOrder.ASC);
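
    The order in which the two sort keys are applied can be illustrated with a plain-Java comparator: studymodel descending first, then price ascending among equal studymodel values. This is a sketch with a hypothetical Course class, not ES code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SortSketch {
    /** Hypothetical course record with just the fields used for sorting. */
    static final class Course {
        final String name;
        final String studymodel;
        final double price;

        Course(String name, String studymodel, double price) {
            this.name = name;
            this.studymodel = studymodel;
            this.price = price;
        }
    }

    /** studymodel descending, then price ascending as the tie-breaker. */
    static final Comparator<Course> ORDER =
            Comparator.comparing((Course c) -> c.studymodel, Comparator.reverseOrder())
                    .thenComparingDouble(c -> c.price);

    /** Returns a sorted copy and leaves the input list unchanged. */
    static List<Course> sorted(List<Course> courses) {
        List<Course> copy = new ArrayList<>(courses);
        copy.sort(ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<Course> courses = Arrays.asList(
                new Course("a", "201001", 80),
                new Course("b", "201002", 90),
                new Course("c", "201002", 70));
        for (Course c : sorted(courses)) {
            System.out.println(c.name + " " + c.studymodel + " " + c.price);
        }
    }
}
```

    With this sample data the result order is c, b, a: both 201002 courses come first, and the cheaper of the two leads.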
    

    1.9. Highlighting

    {
        "_source" : [ "name", "studymodel", "description","price"],
        "query": {
            "bool" : {
                "must":[{
                    "multi_match" : { 
                        "query" : "开发框架", 
                        "minimum_should_match": "50%",
                        "fields": [ "name^10", "description" ] 
                        }} ],
                "filter": [ { 
                    "term": { "studymodel": "201001" }},
                            { "range": { 
                                "price": {
                                    "gte": 60 ,"lte" : 100
                                }}} ] } },
        "sort" : [ {
            "studymodel" : "desc" }, 
             { "price" : "asc" } 
                 ],
        "highlight": {
            "pre_tags": ["<tag1>"],
            "post_tags": ["</tag1>"],
            "fields": { "name": {}, "description":{} } }
    }
    

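    When terms from the query "开发框架" appear in name or description, they are highlighted by wrapping each match in the pre_tags value (prefix) and the post_tags value (suffix).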
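
    The wrapping itself can be sketched in plain Java. This is a deliberately simplified illustration: it wraps literal occurrences of one term, whereas ES highlights analyzed terms and returns fragments. The class and method names are hypothetical.

```java
final class HighlightSketch {
    /** Wraps every literal occurrence of term in text with the given prefix and suffix tags. */
    static String highlight(String text, String term, String preTag, String postTag) {
        if (term.isEmpty()) {
            return text;
        }
        // String.replace substitutes all literal (non-regex) occurrences
        return text.replace(term, preTag + term + postTag);
    }

    public static void main(String[] args) {
        System.out.println(highlight("spring framework", "framework", "<tag1>", "</tag1>"));
    }
}
```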

    java client:

    /**
         * Highlighting
         *
         */
        @Test
        public void testHighlight() throws Exception {
            // Search request object
            SearchRequest searchRequest = new SearchRequest("xc_course");
            searchRequest.types("doc");
            // Search source builder
            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

            /**
            * Build the query with BoolQuery
            */
            // 1. Define a multiMatchQuery
            MultiMatchQueryBuilder multiMatchQueryBuilder = QueryBuilders.multiMatchQuery("开发框架", "name", "description")
                    .minimumShouldMatch("80%")  // minimum share of terms that must match
                    .field("name", 10);
            // BoolQuery
            BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
            boolQueryBuilder.must(multiMatchQueryBuilder);
            // Define a filter
            boolQueryBuilder.filter(QueryBuilders.rangeQuery("price").gte(0).lte(100));
            searchSourceBuilder.query(boolQueryBuilder);

            // Source filtering: the first argument lists the fields to include, the second the fields to exclude
            searchSourceBuilder.fetchSource(new String[]{"name","studymodel","price","timestamp"},new String[]{});

            // Configure highlighting
            HighlightBuilder highlightBuilder = new HighlightBuilder();
            highlightBuilder.preTags("<Tag>");
            highlightBuilder.postTags("</Tag>");
            highlightBuilder.fields().add(new HighlightBuilder.Field("name"));
            searchSourceBuilder.highlighter(highlightBuilder);
            // Attach the search source to the search request
            searchRequest.source(searchSourceBuilder);
            // Execute the search (sends an HTTP request to ES)
            SearchResponse search = client.search(searchRequest);
            // Search results
            SearchHits hits = search.getHits();
            // Total number of matching records
            long totalHits = hits.getTotalHits();
            // The best-matching documents
            SearchHit[] hits1 = hits.getHits();
            SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            for (SearchHit hit : hits1){
                // Document id
                String id = hit.getId();
                // Source document content
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                // Extract the highlighted name field
                String name = null;
                Map<String, HighlightField> highlightFields = hit.getHighlightFields();
                if (highlightFields != null){
                    HighlightField nameHighlightField = highlightFields.get("name");
                    if (nameHighlightField != null){
                        Text[] fragments = nameHighlightField.getFragments();
                        // Join the highlighted fragments into one string
                        StringBuilder builder = new StringBuilder();
                        for (Text fra : fragments){
                            builder.append(fra);
                        }
                        name = builder.toString();
                    }
                }
                System.out.println(name);
            }
        }
    }
    
  • Original article: https://www.cnblogs.com/cqyp/p/13651279.html