  • Scroll with RestHighLevelClient

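    By default, Elasticsearch returns at most 10,000 hits for a search: from + size may not exceed index.max_result_window (default 10000). A request past that limit, for example one with from set above 10000, returns no results and fails with an error saying the result window is too large.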
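    The limit can be pictured as a simple bounds check. The sketch below is only an illustration of the from + size rule, not Elasticsearch code; the class and method names are made up for this example.

```java
/** Toy model of the rule "from + size must not exceed index.max_result_window". */
final class ResultWindowCheck {
    // Default value of index.max_result_window.
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    /** Returns true when the requested page reaches past the result window. */
    static boolean exceedsWindow(long from, long size, long maxResultWindow) {
        return from + size > maxResultWindow;
    }

    public static void main(String[] args) {
        // The first 10,000 hits fit exactly inside the default window.
        System.out.println(exceedsWindow(0, 10_000, DEFAULT_MAX_RESULT_WINDOW));  // prints "false"
        // from = 9990 with size = 20 would need hits 9990..10009, which is too deep.
        System.out.println(exceedsWindow(9_990, 20, DEFAULT_MAX_RESULT_WINDOW)); // prints "true"
    }
}
```

    Note that the check is on from + size, so a request can fail even when from itself is still below 10000.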

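    There are two ways to get around this limit.

    Option 1: raise max_result_window

    Many people use this approach because it is quick and blunt. The drawback is exactly that bluntness: it works in some situations, but deep pages cost more memory and time as from + size grows, so it may not hold up in special cases.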

    PUT index/_settings
    
    {
      "index":{
        "max_result_window":100000000
      }
    }
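    The same settings call can also be issued from Java. The following is a minimal sketch that only builds the PUT request with the JDK's own java.net.http classes (Java 11+). The host http://localhost:9200, the index name my_index, and the value 50000 are placeholders chosen for illustration, and the class name is hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Builds (but does not send) a PUT {index}/_settings request that sets max_result_window. */
final class MaxResultWindowRequest {

    /** JSON body in the same shape as the settings snippet above. */
    static String settingsBody(int maxResultWindow) {
        return "{\"index\":{\"max_result_window\":" + maxResultWindow + "}}";
    }

    /** baseUrl and index are supplied by the caller; nothing here contacts a cluster. */
    static HttpRequest build(String baseUrl, String index, int maxResultWindow) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(settingsBody(maxResultWindow)))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder host and index name, assumed for this example only.
        HttpRequest request = build("http://localhost:9200", "my_index", 50_000);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

    To actually apply the setting, the request would be passed to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) against a reachable cluster.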

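    A blog post for reference: "Pitfalls when searching more than 10,000 documents in Elasticsearch: two ways to set max_result_window"


    Option 2: Scroll

    The scroll API can be used to retrieve a large number of results, or even all of them, from a single search request, much like a cursor in a traditional database.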

    Official documentation for this approach: https://www.elastic.co/guide/en/elasticsearch/reference/7.2/search-request-scroll.html#scroll-search-context

    Chinese translation for reference: https://blog.csdn.net/ctwy291314/article/details/82751898
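    The cursor analogy can be sketched without Elasticsearch at all. The toy class below is a hypothetical in-memory stand-in, not the scroll API: it remembers a position between calls and hands back one batch at a time, and the caller keeps asking until a batch comes back empty, which is the same loop shape a scroll consumer uses.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration of cursor-style batch retrieval over an in-memory list. */
final class CursorSketch {

    /** Hypothetical stand-in for a scroll context: it keeps its position between calls. */
    static final class Cursor {
        private final List<String> docs;
        private final int batchSize;
        private int position = 0;

        Cursor(List<String> docs, int batchSize) {
            if (batchSize <= 0) {
                throw new IllegalArgumentException("batchSize must be positive");
            }
            this.docs = docs;
            this.batchSize = batchSize;
        }

        /** Returns the next batch; an empty list means there is nothing left. */
        List<String> next() {
            int end = Math.min(position + batchSize, docs.size());
            List<String> batch = new ArrayList<>(docs.subList(position, end));
            position = end;
            return batch;
        }
    }

    /** Pulls batches until an empty one arrives and returns everything collected. */
    static List<String> drain(Cursor cursor) {
        List<String> all = new ArrayList<>();
        List<String> batch = cursor.next();
        while (!batch.isEmpty()) {
            all.addAll(batch);
            batch = cursor.next();
        }
        return all;
    }

    public static void main(String[] args) {
        Cursor cursor = new Cursor(List.of("a", "b", "c", "d", "e"), 2);
        System.out.println(drain(cursor)); // prints "[a, b, c, d, e]"
    }
}
```

    With real scroll requests the position lives on the server in the scroll context, and the client only passes the scroll ID back on each call.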

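    The code below fetches the nid field of every document in the index and writes the values to a file. It was written as a unit test, and NID is a nested (static inner) class.

    Code: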

    // Imports used by the snippet (package names as of Elasticsearch 7.2).
    // JSON is assumed to be fastjson's com.alibaba.fastjson.JSON; @Test comes from JUnit.
    import static org.elasticsearch.index.query.QueryBuilders.matchAllQuery;

    import com.alibaba.fastjson.JSON;
    import java.io.BufferedWriter;
    import java.io.File;
    import java.io.FileWriter;
    import java.io.IOException;
    import org.elasticsearch.action.search.ClearScrollRequest;
    import org.elasticsearch.action.search.ClearScrollResponse;
    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.action.search.SearchScrollRequest;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.common.unit.TimeValue;
    import org.elasticsearch.search.Scroll;
    import org.elasticsearch.search.SearchHit;
    import org.elasticsearch.search.builder.SearchSourceBuilder;

    public static class NID {
        private String nid;
        public String getNid() {
            return nid;
        }
        public void setNid(String nid) {
            this.nid = nid;
        }
    }

    @Test
    public void testScroll() throws IOException {
        RestHighLevelClient client = esConfig.client();
        // Keep-alive for the scroll context. It does not need to cover processing all of
        // the data, only the previous batch: every scroll request that carries the scroll
        // parameter sets a new expiry time.
        final Scroll scroll = new Scroll(TimeValue.timeValueMinutes(1L));
        SearchRequest searchRequest = new SearchRequest(esConfig.getCaterIndex()); // search request for the index
        searchRequest.scroll(scroll);
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        searchSourceBuilder.query(matchAllQuery());
        searchSourceBuilder.size(5000); // number of documents returned per batch
        searchSourceBuilder.fetchSource(new String[]{"nid"}, null); // included and excluded fields
        searchRequest.source(searchSourceBuilder);

        File outFile = new File("E://cater_nid.csv"); // output CSV file
        String scrollId = null;
        int page = 0;
        // try-with-resources closes the writer even if a request fails part-way through.
        try (BufferedWriter writer = new BufferedWriter(new FileWriter(outFile))) {
            SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
            scrollId = searchResponse.getScrollId();
            SearchHit[] searchHits = searchResponse.getHits().getHits();

            // Process each batch, then fetch the next one, until a batch comes back empty.
            while (searchHits != null && searchHits.length > 0) {
                page++;
                System.out.println("----- page " + page + " -----");
                for (SearchHit searchHit : searchHits) {
                    NID t = JSON.parseObject(searchHit.getSourceAsString(), NID.class);
                    if (t != null && t.getNid() != null) { // skip documents without an nid
                        writer.write(t.getNid());
                        writer.newLine();
                    }
                }
                SearchScrollRequest scrollRequest = new SearchScrollRequest(scrollId);
                scrollRequest.scroll(scroll);
                searchResponse = client.scroll(scrollRequest, RequestOptions.DEFAULT);
                scrollId = searchResponse.getScrollId();
                searchHits = searchResponse.getHits().getHits();
            }
        } finally {
            // Clear the scroll context so the server can release it right away.
            if (scrollId != null) {
                ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
                clearScrollRequest.addScrollId(scrollId); // setScrollIds() can clear several scroll IDs at once
                ClearScrollResponse clearScrollResponse = client.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);
                System.out.println("succeeded:" + clearScrollResponse.isSucceeded());
            }
        }
    }

    Code reference: https://www.cnblogs.com/chentop/p/10296517.html



  • Original article: https://www.cnblogs.com/betterwgo/p/11430140.html