  • Distributed full-text search server: ElasticSearch (2)

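    Preface

    The previous post covered how to install ElasticSearch and operate it through its built-in RESTful API. This post looks at the corresponding Java API.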

    Setting up the environment

    We use a Maven project to pull in the required jars:

    <dependencies>
            <dependency>
                <groupId>org.elasticsearch</groupId>
                <artifactId>elasticsearch</artifactId>
                <version>7.8.0</version>
            </dependency>
            <dependency>
                <groupId>org.elasticsearch.client</groupId>
                <artifactId>transport</artifactId>
                <version>7.8.0</version>
            </dependency>
            <dependency>
                <groupId>org.elasticsearch.client</groupId>
                <artifactId>elasticsearch-rest-high-level-client</artifactId>
                <version>7.8.0</version>
            </dependency>
            <dependency>
                <groupId>org.apache.logging.log4j</groupId>
                <artifactId>log4j-to-slf4j</artifactId>
                <version></version>
            </dependency>
            <dependency>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-api</artifactId>
                <version>1.7.24</version>
            </dependency>
            <dependency>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-simple</artifactId>
                <version>1.7.21</version>
            </dependency>
            <dependency>
                <groupId>log4j</groupId>
                <artifactId>log4j</artifactId>
                <version>1.2.12</version>
            </dependency>
            <dependency>
                <groupId>junit</groupId>
                <artifactId>junit</artifactId>
                <version>4.12</version>
            </dependency>
            <dependency>
                <groupId>com.fasterxml.jackson.core</groupId>
                <artifactId>jackson-databind</artifactId>
                <version>2.9.6</version>
            </dependency>
            <dependency>
                <groupId>com.fasterxml.jackson.core</groupId>
                <artifactId>jackson-annotations</artifactId>
                <version>2.9.0</version>
            </dependency>
        </dependencies>
    

    Creating an index

    Before anything else, initialize the client object that every test needs:

    private RestHighLevelClient client;
        @Before
        public void init() {
            client = new RestHighLevelClient(RestClient.builder(
                    new HttpHost("localhost", 9201,"http"),
                    new HttpHost("localhost", 9202,"http"),
                    new HttpHost("localhost", 9203,"http")
            ));
        }
    

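    Note: TransportClient has been marked as deprecated, and the official recommendation is to use RestHighLevelClient as its replacement, so that is what we use here.
    With the client in place, we can create an index: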

        @Test
        public void createIndex() throws Exception {
            // Build the create-index request
            CreateIndexRequest request = new CreateIndexRequest("index_hello2");
            // Index settings: 5 primary shards, each with 1 replica
            request.settings(Settings.builder()
                    .put("index.number_of_shards", 5)
                    .put("index.number_of_replicas", 1)
            );
            // Create the index synchronously
            CreateIndexResponse createIndexResponse = client.indices().create(request, RequestOptions.DEFAULT);
            // Use the response to check whether the index was created
            if(createIndexResponse.isAcknowledged()){
                System.out.println("Index created!");
            }else{
                System.out.println("Failed to create the index!");
            }
        }
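
    The settings above ask for 5 primary shards with 1 replica each, so the cluster ends up holding 10 shard copies in total (5 primaries plus 5 replicas). A minimal sketch of that arithmetic in plain Java; the class and method names are hypothetical and are not part of the ElasticSearch client:

```java
class ShardMath {
    // Total shard copies = primaries * (1 primary copy + replicas per primary).
    static int totalShardCopies(int primaryShards, int replicasPerPrimary) {
        if (primaryShards < 1 || replicasPerPrimary < 0) {
            throw new IllegalArgumentException("need primaryShards >= 1 and replicasPerPrimary >= 0");
        }
        return primaryShards * (1 + replicasPerPrimary);
    }
}
```

    With the values used in createIndex, ShardMath.totalShardCopies(5, 1) gives 10, which is the number of shard copies the three nodes have to host between them.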
    

    Setting the mapping

    There are several ways to set a mapping; here we use XContentBuilder:

    @Test
        public void setMapping() throws Exception{
            PutMappingRequest request = new PutMappingRequest("index_hello2");
            // Build the mapping definition with XContentBuilder
            XContentBuilder builder = XContentFactory.jsonBuilder()
                    .startObject()
                    .startObject("properties")
                    .startObject("ids")
                    .field("type", "long")
                    .field("store", true)
                    .endObject()
                    .startObject("title")
                    .field("type", "text")
                    .field("store", true)
                    .field("analyzer", "ik_smart")
                    .endObject()
                    .startObject("content")
                    .field("type", "text")
                    .field("store", true)
                    .field("analyzer", "ik_smart")
                    .endObject()
                    .endObject()
                    .endObject();
            // Attach the mapping definition to the request
            request.source(builder);
            // Apply the mapping synchronously
            AcknowledgedResponse acknowledgedResponse = client.indices().putMapping(request, RequestOptions.DEFAULT);
            // Use the response to check whether the mapping was applied
            if(acknowledgedResponse.isAcknowledged()){
                System.out.println("Mapping created!");
            }else{
                System.out.println("Failed to create the mapping!");
            }
            client.close();
        }
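
    For reference, the XContentBuilder calls above should serialize to a JSON body with a single "properties" object containing the three fields. The sketch below rebuilds that body using only the JDK so the shape is easy to inspect. MappingJsonSketch and its helpers are hypothetical names, and the tiny serializer does no string escaping, so treat it as an illustration rather than a replacement for XContentBuilder:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class MappingJsonSketch {
    // One field definition: type, store=true, and an optional analyzer.
    static Map<String, Object> field(String type, String analyzer) {
        Map<String, Object> def = new LinkedHashMap<>();
        def.put("type", type);
        def.put("store", true);
        if (analyzer != null) {
            def.put("analyzer", analyzer);
        }
        return def;
    }

    // Minimal serializer for nested maps, strings and booleans (no escaping).
    static String toJson(Object value) {
        if (value instanceof Map) {
            StringJoiner joiner = new StringJoiner(",", "{", "}");
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                joiner.add("\"" + entry.getKey() + "\":" + toJson(entry.getValue()));
            }
            return joiner.toString();
        }
        if (value instanceof String) {
            return "\"" + value + "\"";
        }
        return String.valueOf(value);
    }

    // Same fields, in the same order, as the XContentBuilder example.
    static String mappingJson() {
        Map<String, Object> properties = new LinkedHashMap<>();
        properties.put("ids", field("long", null));
        properties.put("title", field("text", "ik_smart"));
        properties.put("content", field("text", "ik_smart"));
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("properties", properties);
        return toJson(root);
    }
}
```

    Printing MappingJsonSketch.mappingJson() is a quick way to compare the structure against what the mapping API of the index reports after setMapping has run.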
    

    Adding a document

        @Test
        public void testAddDocument() throws Exception{
            // Create an IndexRequest to index a document
            IndexRequest indexRequest = new IndexRequest("index_hello2");
            // Set the id of the document being added
            indexRequest.id("2");
            // Supply the field values through a Map
            Map<String,Object> map = new HashMap<>(4);
            map.put("ids",2L);
            map.put("title","我爱谔谔");
            map.put("content","????");
            // Attach the map to the indexRequest as the document source
            indexRequest.source(map);
            // Execute the request
            IndexResponse indexResponse = client.index(indexRequest, RequestOptions.DEFAULT);
            // Print the response
            System.out.println(indexResponse.toString());
            // Close the client
            client.close();
        }
    

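    A Map is used here to supply the document source, but there are several other ways to do it. For details, see the official API documentation: https://www.elastic.co/guide/en/elasticsearch/client/java-api/7.8/index.html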

    Querying

    First, pull the shared variables and setup out into one place:

        private RestHighLevelClient client;
        private SearchRequest searchRequest;
        private SearchSourceBuilder searchSourceBuilder;
        @Before
        public void init() {
            client = new RestHighLevelClient(RestClient.builder(
                    new HttpHost("localhost", 9201,"http"),
                    new HttpHost("localhost", 9202,"http"),
                    new HttpHost("localhost", 9203,"http")
            ));
            // Create a search request
            searchRequest = new SearchRequest("index_hello");
            // Create a builder for the search conditions
            searchSourceBuilder = new SearchSourceBuilder();
        }
    

    I extracted three search helpers: a plain search, a search with highlighting, and a paginated search.

    private void search(QueryBuilder queryBuilder) throws IOException {
            // Set the query on the builder
            searchSourceBuilder.query(queryBuilder);
            // Set the builder on the request
            searchRequest.source(searchSourceBuilder);
            // Send the request and read the result
            SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
            SearchHits hits = response.getHits();
            System.out.println("Total hits: "+hits.getTotalHits());
            Iterator<SearchHit> iterator = hits.iterator();
            while(iterator.hasNext()){
                SearchHit searchHit = iterator.next();
                // Print the whole document as a JSON string
                System.out.println(searchHit.getSourceAsString());
                System.out.println("----------- document fields");
                Map<String, Object> source = searchHit.getSourceAsMap();
                System.out.println(source.get("ids"));
                System.out.println(source.get("title"));
                System.out.println(source.get("content"));
            }
        }
        private void search(QueryBuilder queryBuilder, String highLightField) throws IOException {
            // Set the query on the builder
            searchSourceBuilder.query(queryBuilder);
            // Highlight matches in the given field, wrapped in <em> tags
            HighlightBuilder highlightBuilder = new HighlightBuilder();
            highlightBuilder.field(highLightField);
            highlightBuilder.preTags("<em>");
            highlightBuilder.postTags("</em>");
            searchSourceBuilder.highlighter(highlightBuilder);
            // Set the builder on the request
            searchRequest.source(searchSourceBuilder);
            // Send the request and read the result
            SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
            SearchHits hits = response.getHits();
            System.out.println("Total hits: "+hits.getTotalHits());
            Iterator<SearchHit> iterator = hits.iterator();
            while(iterator.hasNext()){
                SearchHit searchHit = iterator.next();
                // Print the whole document as a JSON string
                System.out.println(searchHit.getSourceAsString());
                System.out.println("----------- document fields");
                Map<String, Object> source = searchHit.getSourceAsMap();
                // Use "ids" to match the mapping and the other search helpers
                System.out.println(source.get("ids"));
                System.out.println(source.get("title"));
                System.out.println(source.get("content"));
                System.out.println("*********** highlight fields");
                Map<String, HighlightField> highlightFields = searchHit.getHighlightFields();
                System.out.println(highlightFields);
                // Get the highlighted result for the requested field
                HighlightField field = highlightFields.get(highLightField);
                // The entry is missing when this hit has no highlight for the field
                if(field != null && field.getFragments() != null && field.getFragments().length > 0){
                    String  fragment = field.getFragments()[0].toString();
                    System.out.println(fragment);
                }
            }
        }
        private void searchByPage(QueryBuilder queryBuilder) throws IOException {
            // Set the query on the builder
            searchSourceBuilder.query(queryBuilder);
            // Pagination: start at offset 0, five hits per page
            searchSourceBuilder.from(0);
            searchSourceBuilder.size(5);
            // Set the builder on the request
            searchRequest.source(searchSourceBuilder);
            // Send the request and read the result
            SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
            SearchHits hits = response.getHits();
            System.out.println("Total hits: "+hits.getTotalHits());
            Iterator<SearchHit> iterator = hits.iterator();
            while(iterator.hasNext()){
                SearchHit searchHit = iterator.next();
                // Print the whole document as a JSON string
                System.out.println(searchHit.getSourceAsString());
                System.out.println("----------- document fields");
                Map<String, Object> source = searchHit.getSourceAsMap();
                System.out.println(source.get("ids"));
                System.out.println(source.get("title"));
                System.out.println(source.get("content"));
            }
        }
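
    searchByPage hard-codes from(0) and size(5), which returns the first page of five hits. To fetch any other page, from has to be the offset of the first hit on that page. A minimal sketch of the calculation, assuming 1-based page numbers; PageMath is a hypothetical helper, not part of the client:

```java
class PageMath {
    // Offset of the first hit on a 1-based page: (page - 1) * size.
    static int from(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must both be >= 1");
        }
        return (page - 1) * size;
    }
}
```

    In searchByPage this would replace the constants, for example searchSourceBuilder.from(PageMath.from(3, 5)) together with searchSourceBuilder.size(5) to read hits 10 through 14.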
    

    Query by ID

        @Test
        public void findById() throws Exception{
            // Query by document ID
            QueryBuilder queryBuilder = QueryBuilders.idsQuery().addIds("1");
            search(queryBuilder);
        }
    

    Keyword query with Term

        @Test
        public void findByTerm() throws Exception{
            // Query with an exact term
            QueryBuilder queryBuilder = QueryBuilders.termQuery("title","淫荡");
            search(queryBuilder);
        }
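
    A term query does not analyze its input: it looks for the given value as an exact token in the index, so it only matches if the analyzer produced that very token when the document was indexed. The toy sketch below illustrates the difference from an analyzed query. It assumes a simplistic analyzer that lowercases and splits on whitespace, which is not how ik_smart segments Chinese text, and all names in it are hypothetical:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class TermLookupSketch {
    // Toy analyzer (assumption): lowercase, then split on whitespace.
    static Set<String> analyze(String text) {
        return new LinkedHashSet<>(Arrays.asList(text.toLowerCase().trim().split("\\s+")));
    }

    // Term-style lookup: the query value is used as-is, with no analysis.
    static boolean termMatches(String fieldText, String term) {
        return analyze(fieldText).contains(term);
    }

    // Analyzed lookup: the query is analyzed too, and any shared token matches.
    static boolean analyzedMatches(String fieldText, String query) {
        Set<String> indexed = analyze(fieldText);
        for (String token : analyze(query)) {
            if (indexed.contains(token)) {
                return true;
            }
        }
        return false;
    }
}
```

    Here termMatches("Quick brown fox", "Quick") is false because the indexed token is "quick", while analyzedMatches("Quick brown fox", "Quick dog") is true. This is why a term query on an analyzed text field can come back empty even when the word appears in the document.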
    

    Analyzed query with QueryString

        @Test
        public void testQueryString() throws Exception{
            // Create a QueryBuilder object
            QueryBuilder queryBuilder = QueryBuilders.queryStringQuery("谁不爱呢").defaultField("content");
            // Run the search with highlighting on the content field
            search(queryBuilder,"content");
            // Run the paginated search instead
    //        searchByPage(queryBuilder);
        }
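
    The highlighted search returns fragments of the field in which each match is wrapped in the configured pre and post tags. The sketch below only imitates the shape of that output with a plain string replacement. ElasticSearch highlights analyzed tokens rather than raw substrings, so this is an illustration under that simplifying assumption, and HighlightSketch is a hypothetical name:

```java
class HighlightSketch {
    // Wrap every literal occurrence of term in preTag/postTag.
    static String highlight(String text, String term, String preTag, String postTag) {
        if (term.isEmpty()) {
            return text; // nothing to highlight
        }
        return text.replace(term, preTag + term + postTag);
    }
}
```

    For example, HighlightSketch.highlight("full text search", "text", "<em>", "</em>") yields "full <em>text</em> search", which is the kind of string fragments[0].toString() prints in the highlighting helper.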
    

    As you can see, the three queries are very similar: each one just builds a QueryBuilder object and passes it to a search helper.

    SpringData-ElasticSearch

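    This is an API wrapped by the Spring team that simplifies common operations while keeping the basic features. However, the material I found through Baidu was outdated and relied on deprecated APIs, and I found the official documentation hard to follow. I am setting it aside for now and will study it properly when I actually need it.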

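    Summary

    As you can see, ElasticSearch evolves quickly, and it is fairly complex to operate and use. Many large companies rely on it, but I have only scratched the surface of this search server here; I will study the rest properly when I need it.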

  • Original post: https://www.cnblogs.com/wushenjiang/p/13190788.html