  • Full-Text Search with ElasticSearch

    Version and download links

      ES 7.6.1

    • ES: https://mirrors.huaweicloud.com/elasticsearch/7.6.1/?C=N&O=D
    • logstash: https://mirrors.huaweicloud.com/logstash/?C=N&O=D
    • kibana: https://mirrors.huaweicloud.com/kibana/?C=N&O=D

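    Directory layout

    bin        startup scripts
    config     configuration files
        log4j2.properties   logging configuration
        jvm.options         JVM settings
        elasticsearch.yml   Elasticsearch settings (default HTTP port 9200)
    lib        bundled JAR files
    logs       log files
    modules    built-in modules
    plugins    plugins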

    ES cluster visualization tool: elasticsearch-head

      Download: https://codeload.github.com/mobz/elasticsearch-head/zip/master

      Start

    cnpm install
    npm run start

      Fixing cross-origin errors (add these settings to elasticsearch.yml to allow cross-origin access, then restart ES)

    http.cors.enabled: true
    http.cors.allow-origin: "*"

    Kibana

    To switch the UI language, set i18n.locale: "zh-CN" in kibana.yml.

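    ES core concepts

    • Index
    • Field types (mapping)
    • Documents

    IK analyzer (Chinese word segmentation)

    Download the IK analyzer release that matches your ES version and unpack it into the plugins directory of ES.

    Run elasticsearch-plugin list to see which plugins are installed.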

    • ik_smart: coarsest segmentation (fewest tokens)
    • ik_max_word: finest-grained segmentation, emitting every word the dictionary can produce

    GET _analyze
    {
      "analyzer": "ik_smart",
      "text": "林允儿爱吃苹果"
    }
    
    GET _analyze
    {
      "analyzer": "ik_max_word",
      "text": "林允儿爱吃苹果"
    }

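    Adding custom words (edit IKAnalyzer.cfg.xml in the IK plugin's config directory; sgrslim.dic is a dictionary file with one word per line)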

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
    <properties>
        <comment>IK Analyzer extension configuration</comment>
        <!-- Configure your own extension dictionary here -->
        <entry key="ext_dict">sgrslim.dic</entry>
         <!-- Configure your own extension stop-word dictionary here -->
        <entry key="ext_stopwords"></entry>
        <!-- Configure a remote extension dictionary here -->
        <!-- <entry key="remote_ext_dict">words_location</entry> -->
        <!-- Configure a remote extension stop-word dictionary here -->
        <!-- <entry key="remote_ext_stopwords">words_location</entry> -->
    </properties>

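    REST style overview

    Method  URL                                                 Description
    PUT     localhost:9200/index-name/type-name/doc-id          Create a document (with the given id)
    POST    localhost:9200/index-name/type-name                 Create a document (with a generated id)
    POST    localhost:9200/index-name/type-name/doc-id/_update  Update a document
    DELETE  localhost:9200/index-name/type-name/doc-id          Delete a document
    GET     localhost:9200/index-name/type-name/doc-id          Get a document (by id)
    POST    localhost:9200/index-name/type-name/_search         Query all documents

    In ES 7.x the type-name segment is deprecated: use _doc in its place, and POST localhost:9200/index-name/_update/doc-id for updates.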
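    Index operations

    Create an index (indexing a document creates the index if it does not exist yet; the type-name segment is deprecated in ES 7)

    PUT /index-name/type-name/doc-id
    {request body}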
    
    PUT /test1/type1/1
    {
      "name":"sgrslim",
      "age":"18"
    }

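    Data types

    • String types: text, keyword
    • Numeric types: long, integer, short, byte, double, float, scaled_float, half_float
    • Date type: date
    • Boolean type: boolean
    • Binary type: binary
    • ...

    Specifying field types

    Create an index with an explicit mapping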

    PUT /test2
    {
      "mappings": {
        "properties": {
          "name":{
            "type": "text"
          },
          "age":{
            "type": "integer"
          }
        }
      }
    }

    Get index information

    GET test2

    Create an index (ES 7 and later)

    PUT /index-name/_doc/doc-id
    {request body}
    
    PUT /test3/_doc/1
    {
      "name":"sgrslim",
      "age":"18"
    }

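    Tip: the GET _cat/ APIs return a lot of information about the current state of ES, for example GET _cat/health or GET _cat/indices?v.

    Updating a document in the index

    POST /test3/_update/1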
    {
      "doc":{
        "name":"法外狂徒"
      }
    }

    Delete an index

    DELETE test1

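    Document operations

    Updating: PUT vs. POST _update

    ## POST _update changes only desc; PUT replaces the whole document, so any field left out of the body is lost
    POST /test4/_update/1
    {
      "doc": {
        "desc": "111"
      }
    }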
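
    The difference between the two update styles can be modelled with plain maps. This is only a sketch under simplifying assumptions: the class and method names are made up for illustration, the merge is shallow, and no Elasticsearch call is involved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class UpdateSemanticsSketch {

    // Models PUT /index/_doc/id: the request body replaces the stored document.
    public static Map<String, Object> fullReplace(Map<String, Object> stored, Map<String, Object> body) {
        return new LinkedHashMap<>(body);
    }

    // Models POST /index/_update/id with {"doc": {...}}: the given fields are merged in.
    public static Map<String, Object> partialUpdate(Map<String, Object> stored, Map<String, Object> doc) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        merged.putAll(doc);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("name", "sgrslim");
        stored.put("desc", "000");

        Map<String, Object> change = new LinkedHashMap<>();
        change.put("desc", "111");

        System.out.println(fullReplace(stored, change));   // prints {desc=111}
        System.out.println(partialUpdate(stored, change)); // prints {name=sgrslim, desc=111}
    }
}
```

    With the PUT-style replacement the name field disappears, which is why POST _update is the safer choice when only one field changes.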

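    Simple queries, source filtering, sorting, pagination

    ## _source filters the returned fields; sort orders the hits (assuming age is mapped as a numeric field); from/size paginate
    GET /user/_search
    {
      "query": {
        "match": {
          "name": "sgrslim2"
        }
      },
      "_source": ["name", "age"],
      "sort": [
        { "age": { "order": "desc" } }
      ],
      "from": 0,
      "size": 10
    }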
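
    Pagination in a search body is driven by from (the zero-based offset of the first hit) and size (the number of hits per page). A UI usually works with page numbers instead, so the offset has to be computed. The helper below is a hypothetical sketch that assumes 1-based page numbers; the class and method names are not from the original notes.

```java
public class PageOffsetSketch {

    // Converts a 1-based page number into the zero-based "from" offset of a search request.
    public static int fromOffset(int pageNo, int pageSize) {
        if (pageNo < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNo and pageSize must be at least 1");
        }
        return (pageNo - 1) * pageSize;
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(1, 10)); // prints 0
        System.out.println(fromOffset(3, 10)); // prints 20
    }
}
```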

    Bool queries (multi-condition queries)

    must (AND): every condition has to match

    should (OR): at least one condition has to match

    GET /user/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "name": "sgrslim"
              }
            },
            {
              "match": {
                "age": 19
              }
            }
          ]
        }
      }
    }
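
    The must/should distinction can be illustrated with predicates over in-memory documents. This is a toy model, not how Elasticsearch evaluates queries: it ignores scoring, treats should as a pure OR (which is the behaviour when a bool query contains only should clauses), and all names in it are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class BoolQuerySketch {

    // Builds a tiny document with the two fields used in the query above.
    public static Map<String, Object> doc(String name, int age) {
        Map<String, Object> d = new HashMap<>();
        d.put("name", name);
        d.put("age", age);
        return d;
    }

    public static Predicate<Map<String, Object>> nameIs(String name) {
        return d -> name.equals(d.get("name"));
    }

    public static Predicate<Map<String, Object>> ageIs(int age) {
        return d -> Integer.valueOf(age).equals(d.get("age"));
    }

    // must: all clauses have to match (logical AND).
    public static Predicate<Map<String, Object>> must(List<Predicate<Map<String, Object>>> clauses) {
        return d -> clauses.stream().allMatch(c -> c.test(d));
    }

    // should: at least one clause has to match (logical OR).
    public static Predicate<Map<String, Object>> should(List<Predicate<Map<String, Object>>> clauses) {
        return d -> clauses.stream().anyMatch(c -> c.test(d));
    }

    public static void main(String[] args) {
        Map<String, Object> d = doc("sgrslim", 20);
        List<Predicate<Map<String, Object>>> clauses = Arrays.asList(nameIs("sgrslim"), ageIs(19));
        System.out.println(must(clauses).test(d));   // prints false: the age clause fails
        System.out.println(should(clauses).test(d)); // prints true: the name clause matches
    }
}
```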

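    term query

    • term query: the search term is not analyzed; it is looked up as an exact token
    • keyword: a field type whose stored value is not analyzed

    GET /user/_search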
    {
      "query": {
        "term": {
          "name": "sgrslim"
        }
      }
    }
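
    Why a term query can miss a value that a match query finds is easier to see in a toy model. The sketch below assumes a simplified analyzer (lowercase, split on non-alphanumerics) as a stand-in for the real one; every name in it is hypothetical and nothing here calls Elasticsearch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class TermVsMatchSketch {

    // Toy analyzer: lowercases the text and splits it on anything that is not a letter or digit.
    public static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // A text field stores the analyzed tokens.
    public static Set<String> indexAsText(String value) {
        return new HashSet<>(analyze(value));
    }

    // A keyword field stores the whole value as one token.
    public static Set<String> indexAsKeyword(String value) {
        return Collections.singleton(value);
    }

    // term query: the search term is compared as-is with the indexed tokens.
    public static boolean termQuery(Set<String> indexed, String term) {
        return indexed.contains(term);
    }

    // match query on a text field: the query string is analyzed first, then any token may hit.
    public static boolean matchQuery(Set<String> indexedText, String query) {
        for (String t : analyze(query)) {
            if (indexedText.contains(t)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> text = indexAsText("Sgrslim Java");
        Set<String> keyword = indexAsKeyword("Sgrslim Java");
        System.out.println(termQuery(text, "Sgrslim Java"));    // prints false
        System.out.println(termQuery(text, "sgrslim"));         // prints true
        System.out.println(termQuery(keyword, "Sgrslim Java")); // prints true
        System.out.println(matchQuery(text, "SGRSLIM"));        // prints true
    }
}
```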

    Spring Boot integration

    Add the dependency

    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>elasticsearch-rest-high-level-client</artifactId>
        <version>7.6.1</version>
    </dependency>

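    ES configuration class

    import org.apache.http.HttpHost;
    import org.elasticsearch.client.RestClient;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class ElasticSearchClientConfig {

        // Exposes a client bound to the local single-node ES instance
        @Bean
        public RestHighLevelClient restHighLevelClient() {
            return new RestHighLevelClient(
                    RestClient.builder(
                            new HttpHost("localhost", 9200, "http")));
        }
    }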

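    Index API

    Create an index, get an index

    // 1. Build the create-index request
    CreateIndexRequest request = new CreateIndexRequest("sgr_index");
    RequestOptions requestOptions = RequestOptions.DEFAULT;
    // 2. Execute it through the IndicesClient and read the response
    CreateIndexResponse createIndexResponse = restHighLevelClient.indices().create(request, requestOptions);
    System.out.println(createIndexResponse.isAcknowledged());
    // 3. Check that the index exists (GetIndexRequest from org.elasticsearch.client.indices)
    boolean exists = restHighLevelClient.indices().exists(new GetIndexRequest("sgr_index"), requestOptions);
    System.out.println(exists);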

    Document API

    /**
     * Add a document
     */
    @Test
    void testAddDocument() throws IOException {
        // 1. Create the object (User is a plain POJO with name and age, not shown here)
        User sgrslim = new User("sgrslim", 18);
        // 2. Create the request
        IndexRequest indexRequest = new IndexRequest("sgr_index");

        // 3. Set the request rules, equivalent to PUT /sgr_index/_doc/1
        indexRequest.id("1");
        // Same as indexRequest.timeout("1s")
        indexRequest.timeout(TimeValue.timeValueSeconds(1));

        // 4. Put the data into the request as JSON (JSON is fastjson's com.alibaba.fastjson.JSON)
        indexRequest.source(JSON.toJSONString(sgrslim), XContentType.JSON);

        // 5. Send the request through the client
        IndexResponse index = restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
        System.out.println(index.toString());
        System.out.println(index.status());
    }

    /**
     * Check whether a document exists
     * @throws IOException
     */
    @Test
    void testIsExist() throws IOException {
        GetRequest getRequest = new GetRequest("sgr_index", "1");
        // Do not fetch _source or stored fields; only existence matters
        getRequest.fetchSourceContext(new FetchSourceContext(false));
        getRequest.storedFields("_none_");

        boolean exists = restHighLevelClient.exists(getRequest, RequestOptions.DEFAULT);
        System.out.println(exists);
    }

    /**
     * Get the content of a document
     * @throws IOException
     */
    @Test
    void testGetDocument() throws IOException {
        GetRequest getRequest = new GetRequest("sgr_index", "1");
        GetResponse documentFields = restHighLevelClient.get(getRequest, RequestOptions.DEFAULT);
        // Print the document's _source as a JSON string
        System.out.println(documentFields.getSourceAsString());
    }

    /**
     * Update the content of a document
     * @throws IOException
     */
    @Test
    void testUpdateDocument() throws IOException {
        // Only the fields set here end up in the partial document
        User user = new User();
        user.setAge(29);
        UpdateRequest updateRequest = new UpdateRequest("sgr_index", "1");
        updateRequest.doc(JSON.toJSONString(user), XContentType.JSON);
        UpdateResponse update = restHighLevelClient.update(updateRequest, RequestOptions.DEFAULT);
        System.out.println(update.status());
    }

    /**
     * Bulk-insert documents
     * @throws IOException
     */
    @Test
    void testBulkCreateDocument() throws IOException {
        BulkRequest bulkRequest = new BulkRequest();
        ArrayList<User> userList = new ArrayList<>();
        userList.add(new User("sgr", 3));
        userList.add(new User("sgrs", 4));
        userList.add(new User("sgrsl", 5));

        // Ids start at 2 because document 1 was added by testAddDocument
        for (int i = 0; i < userList.size(); i++) {
            IndexRequest indexRequest = new IndexRequest("sgr_index")
                    .id((i + 2) + "")
                    .source(JSON.toJSONString(userList.get(i)), XContentType.JSON);
            bulkRequest.add(indexRequest);
        }
        BulkResponse bulk = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
        // false means every item succeeded
        System.out.println(bulk.hasFailures());
    }

    /**
     * Search test
     * @throws IOException
     */
    @Test
    void testSearchDocument() throws IOException {
        SearchRequest searchRequest = new SearchRequest("sgr_index");
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        // Exact-term lookup on the name field
        TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("name", "sgr");
        searchSourceBuilder.query(termQueryBuilder);
        //searchSourceBuilder.highlighter();
        searchRequest.source(searchSourceBuilder);
        SearchResponse search = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        for (SearchHit hit : search.getHits().getHits()) {
            System.out.println(hit.toString());
        }
    }

    Integration test

    @Autowired
    private RestHighLevelClient restHighLevelClient;

    // Bulk insert: parse the results for the keyword and index them into goods_index
    public Boolean parseContent(String keyword) throws IOException {
        // HtmlParseUtil and Content are the author's own helper classes (not shown here);
        // the original hard-coded "java" here and ignored the keyword parameter
        List<Content> contents = new HtmlParseUtil().getList(keyword);
        BulkRequest bulkRequest = new BulkRequest();
        for (Content content : contents) {
            IndexRequest indexRequest = new IndexRequest("goods_index");
            indexRequest.source(JSON.toJSONString(content), XContentType.JSON);
            bulkRequest.add(indexRequest);
        }
        BulkResponse bulk = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
        return !bulk.hasFailures();
    }

    // Query: match the keyword against the title field and return one page of hits
    public List<Map<String, Object>> searchList(String keyword, int pageNo, int pageSize) throws IOException {
        SearchRequest searchRequest = new SearchRequest("goods_index");
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        MatchQueryBuilder title = QueryBuilders.matchQuery("title", keyword);
        searchSourceBuilder.query(title);
        // from is a zero-based hit offset; this assumes pageNo is a 1-based page number
        searchSourceBuilder.from(Math.max(pageNo - 1, 0) * pageSize);
        searchSourceBuilder.size(pageSize);
        searchRequest.source(searchSourceBuilder);
        SearchResponse search = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        List<Map<String, Object>> maps = new ArrayList<>();
        for (SearchHit hit : search.getHits().getHits()) {
            maps.add(hit.getSourceAsMap());
        }
        return maps;
    }
  • Original article: https://www.cnblogs.com/sgrslimJ/p/13740757.html