zoukankan      html  css  js  c++  java
  • SpringBoot+Elasticsearch7.7.0实战

    一:什么是Elasticsearch
    1.Elasticsearch是一个基于Lucene的搜索服务器。它提供了一个分布式多用户能力的全文搜索引擎,基于RESTful web接口。Elasticsearch是用Java语言开发的,并作为Apache许可条款下的开放源码发布,是一种流行的企业级搜索引擎。Elasticsearch用于云计算中,能够达到实时搜索,稳定,可靠,快速,安装使用方便。官方客户端在Java、.NET(C#)、PHP、Python、Apache Groovy、Ruby和许多其他语言中都是可用的。根据DB-Engines的排名显示,Elasticsearch是最受欢迎的企业搜索引擎,其次是Apache Solr,也是基于Lucene

    1.1什么是ELK?

    ELK=elasticsearch+Logstash+kibana 
    elasticsearch:后台分布式存储以及全文检索 
    logstash: 日志加工、“搬运工” 
    kibana:数据可视化展示。 
    ELK架构为数据分布式存储、可视化查询和日志解析创建了一个功能强大的管理链。 三者相互配合,取长补短,共同完成分布式大数据处理工作。

    2.不仅仅是全文搜索引擎

    • 分布式实时文件存储,并将每一个字段都编入索引,使其可以被搜索。
    • 实时分析的分布式搜索引擎。
    • 可以扩展到上百台服务器,处理PB级别的结构化或非结构化数据。

    二:安装并运行Elasticsearch

    1.官网下载:https://www.elastic.co/cn/downloads/elasticsearch

    2.解压并运行 elasticsearch.bat
    3.修改配置文件elasticsearch.yml,解决与ElasticSearch-head整合跨域问题(开启跨域配置)

    http.cors.enabled: true
    http.cors.allow-origin: "*"

    3:ElasticSearch-head 插件安装:在GitHub上找到head插件,地址:https://github.com/mobz/elasticsearch-head。将其下载/克隆到本地(elasticsearch-head)

    # 安装插件;由于需要下载一些数据,所以可能会比较慢。
    npm install
    # 启动
    npm run start

    4:ES运行成功页面

     

    5:运行Kibana(进入config下的kibana.yml 修改配置为)

    # 本机的端口
    server.port: 5601
    # 本机IP地址
    server.host: "127.0.0.1"
    # ES的服务IP+端口
    elasticsearch.hosts: ["http://localhost:9200"]

    启动命令

    .kibana.bat

    6:Kibana运行成功页面(http://localhost:5601/app/kibana#/home)


    7:运行Logstash(进入E:ELKlogstash-7.7.0config 新建配置文件logstash.conf 修改内容如下)

    input {
        file {
            type => "nginx_access"
            path => "E:/ELK/logs/logstash.log"
        }
    }
    output {
        elasticsearch {
            hosts => ["127.0.0.1:9200"]
            index => "access-%{+YYYY.MM.dd}"
        }
        stdout {
            codec => json_lines
        }
    }

    8:启动logstash

    .logstash.bat -f ../config/logstash.conf

    9:logback_spring.xml

    <?xml version="1.0" encoding="UTF-8"?>
    
    <!--scan: 为true时,配置文档如果发生改变,将会被重新加载,默认值true
        scanPeriod: 设置监测配置文档是否有修改的时间间隔,如果没有给出单位,默认毫秒。 为true时生效,默认时间间隔1分钟
        debug: 为true时,将打印出logback内部日志信息,实时查看logback运行状态。默认值false -->
    <configuration scan="true" scanPeriod="10 seconds" debug="false">
    
        <contextName>logback</contextName>
    
        <!-- 属性值会被插入到上下文中,在下面配置中可以使“${}”来使用 -->
        <!-- 格式化输出:%d表示日期,%thread表示线程名,%-5level:级别从左显示5个字符宽度 %msg:日志消息,%n是换行符 -->
        <!-- <property name="LOG.PATTERN" value="[%d{yyyy-MM-dd HH:mm:ss.SSS}] [%-5level] [%15.15thread] [%45.45file{15}(%-4line\) - %-45.45logger{15}] : %msg%n" /> -->
        <property name="LOG.PATTERN" value="[%d{yyyy-MM-dd HH:mm:ss.SSS}] [%-5level] %15.15thread -- %-45.45logger{15} : %msg%n"/>
        <!-- 日志文件 -->
        <property name="LOG.PATH" value="E:/ELK/logs/logstash.log"/>
        <property name="LOG.FILE.DEBUG" value="${LOG.PATH}/debug.log"/>
        <property name="LOG.FILE.INFO" value="${LOG.PATH}/info.log"/>
        <property name="LOG.FILE.WARN" value="${LOG.PATH}/warn.log"/>
        <property name="LOG.FILE.ERROR" value="${LOG.PATH}/error.log"/>
        <property name="LOG.FILE.NAME" value="springboot.%d{yyyy-MM-dd}.%i.log"/>
    
        <!-- 输出日志到控制台 -->
        <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
            <encoder>
                <Pattern>${LOG.PATTERN}</Pattern>
                <charset>UTF-8</charset>
            </encoder>
            <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
                <level>INFO</level>
            </filter>
        </appender>
    
    
        <!-- 输出日志到文件:DEBUG -->
        <appender name="FILE-DEBUG" class="ch.qos.logback.core.rolling.RollingFileAppender">
            <file>${LOG.FILE.DEBUG}</file>
            <append>true</append>
            <encoder>
                <Pattern>${LOG.PATTERN}</Pattern>
                <charset>UTF-8</charset>
            </encoder>
    
            <!-- 日志记录级别 -->
            <filter class="ch.qos.logback.classic.filter.LevelFilter">
                <level>DEBUG</level>
                <onMatch>ACCEPT</onMatch>
                <onMismatch>DENY</onMismatch>
            </filter>
    
            <!-- 时间滚动策略 -->
            <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
                <timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
                    <maxFileSize>100MB</maxFileSize>
                </timeBasedFileNamingAndTriggeringPolicy>
    
                <!-- 保留天数 -->
                <maxHistory>7</maxHistory>
    
                <fileNamePattern>${LOG.FILE.DEBUG}/${LOG.FILE.NAME}</fileNamePattern>
            </rollingPolicy>
        </appender>
    
        <!-- 输出日志到文件:INFO -->
        <appender name="FILE-INFO" class="ch.qos.logback.core.rolling.RollingFileAppender">
            <file>${LOG.FILE.INFO}</file>
            <append>true</append>
            <encoder>
                <Pattern>${LOG.PATTERN}</Pattern>
                <charset>UTF-8</charset>
            </encoder>
    
            <!-- 日志记录级别 -->
            <filter class="ch.qos.logback.classic.filter.LevelFilter">
                <level>INFO</level>
                <onMatch>ACCEPT</onMatch>
                <onMismatch>DENY</onMismatch>
            </filter>
    
            <!-- 时间滚动策略 -->
            <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
                <timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
                    <maxFileSize>100MB</maxFileSize>
                </timeBasedFileNamingAndTriggeringPolicy>
    
                <!-- 保留天数 -->
                <maxHistory>7</maxHistory>
    
                <fileNamePattern>${LOG.FILE.INFO}/${LOG.FILE.NAME}</fileNamePattern>
            </rollingPolicy>
        </appender>
    
        <!-- 输出日志到文件:ERROR -->
        <appender name="FILE-ERROR" class="ch.qos.logback.core.rolling.RollingFileAppender">
            <file>${LOG.FILE.ERROR}</file>
            <append>true</append>
            <encoder>
                <Pattern>${LOG.PATTERN}</Pattern>
                <charset>UTF-8</charset>
            </encoder>
    
            <!-- 日志记录级别 -->
            <filter class="ch.qos.logback.classic.filter.LevelFilter">
                <level>ERROR</level>
                <onMatch>ACCEPT</onMatch>
                <onMismatch>DENY</onMismatch>
            </filter>
    
            <!-- 时间滚动策略 -->
            <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
                <timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
                    <maxFileSize>100MB</maxFileSize>
                </timeBasedFileNamingAndTriggeringPolicy>
    
                <!-- 保留天数 -->
                <maxHistory>7</maxHistory>
    
                <fileNamePattern>${LOG.FILE.ERROR}/${LOG.FILE.NAME}</fileNamePattern>
            </rollingPolicy>
        </appender>
    
        <!-- 输出日志到文件:WARN -->
        <appender name="FILE-WARN" class="ch.qos.logback.core.rolling.RollingFileAppender">
            <file>${LOG.FILE.WARN}</file>
            <append>true</append>
            <encoder>
                <Pattern>${LOG.PATTERN}</Pattern>
                <charset>UTF-8</charset>
            </encoder>
    
            <!-- 日志记录级别 -->
            <filter class="ch.qos.logback.classic.filter.LevelFilter">
                <level>WARN</level>
                <onMatch>ACCEPT</onMatch>
                <onMismatch>DENY</onMismatch>
            </filter>
    
            <!-- 时间滚动策略 -->
            <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
                <timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
                    <maxFileSize>100MB</maxFileSize>
                </timeBasedFileNamingAndTriggeringPolicy>
    
                <!-- 保留天数 -->
                <maxHistory>7</maxHistory>
    
                <fileNamePattern>${LOG.FILE.WARN}/${LOG.FILE.NAME}</fileNamePattern>
            </rollingPolicy>
        </appender>
    
        <!-- 下面配置一些第三方包的日志过滤级别,用于避免刷屏-->
        <Logger name="org.springframework" level="INFO"/>
        <Logger name="org.springframework.beans.factory.aspectj" level="WARN"/>
        <Logger name="org.springframework.transaction.interceptor" level="WARN"/>
        <Logger name="org.springframework.beans.factory.support" level="WARN"/>
        <Logger name="org.springframework.web.servlet" level="DEBUG"/>
        <Logger name="org.springframework.transaction" level="WARN"/>
        <Logger name="org.springframework.jdbc.datasource.DataSourceTransactionManager" level="WARN"/>
        <Logger name="org.springframework.transaction.support.AbstractPlatformTransactionManager" level="WARN"/>
        <Logger name="org.springframework.security" level="WARN"/>
        <Logger name="org.apache.commons" level="WARN"/>
        <Logger name="org.apache.kafka" level="WARN"/>
        <Logger name="org.apache.http" level="ERROR"/>
        <Logger name="org.logicalcobwebs" level="WARN"/>
        <Logger name="httpclient" level="ERROR"/>
        <Logger name="net.sf.ehcache" level="WARN"/>
        <Logger name="org.apache.zookeeper" level="WARN"/>
        <Logger name="org.I0Itec" level="WARN"/>
        <Logger name="com.meetup.memcached" level="WARN"/>
        <Logger name="org.mongodb.driver" level="INFO"/>
        <Logger name="org.quartz.core" level="INFO"/>
        <Logger name="io.netty" level="INFO"/>
        <Logger name="net.rubyeye.xmemcached" level="WARN"/>
        <Logger name="com.google" level="WARN"/>
        <logger name="com.ibatis" level="DEBUG"/>
        <logger name="com.ibatis.common.jdbc.SimpleDataSource" level="DEBUG"/>
        <logger name="com.ibatis.common.jdbc.ScriptRunner" level="DEBUG"/>
        <logger name="com.ibatis.sqlmap.engine.impl.SqlMapClientDelegate" level="DEBUG"/>
        <logger name="java.sql.Connection" level="DEBUG"/>
        <logger name="java.sql.Statement" level="DEBUG"/>
        <logger name="java.sql.PreparedStatement" level="DEBUG"/>
        <logger name="com.es.elsaticsearch" level="DEBUG"/>
    
        <!-- 生产环境下,将此级别配置为适合的级别,以免日志文件太多或影响程序性能
            level取值:TRACE < DEBUG < INFO < WARN < ERROR < FATAL 和 OFF -->
        <root level="INFO">
            <appender-ref ref="FILE-DEBUG"/>
            <appender-ref ref="FILE-INFO"/>
            <appender-ref ref="FILE-ERROR"/>
            <appender-ref ref="FILE-WARN"/>
    
            <!-- 生产环境将请console去掉 -->
            <appender-ref ref="CONSOLE"/>
        </root>
    
    </configuration>

    ES核心概念

    索引

    字段类型(mapping)

    文档(document)

    elasticsearch是面向文档,一切数据都是json,与关系型数据库对比

    DB ElasticSearch
    数据库(database) 索引
    表(table) types
    行(rows) documents
    字段(columns) fields

    三:springboot整合ElasticSearch7.7.0

    1.添加maven依赖

       <properties>
            <java.version>1.8</java.version>
            <elasticsearch.version>7.7.0</elasticsearch.version>
        </properties>
    
        <dependencies>
            <dependency>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
            </dependency>
        </dependencies>    

    2.编写配置类(ES可以作为一个独立的单个搜索服务器。不过,为了处理大型数据集,实现容错和高可用性,ES可以运行在许多互相合作的服务器上【集群】),本案例是单个服务器

    /**
     * @author : ywb
     * @createdDate : 2020/5/31
     * @updatedDate
     */
    @Configuration
    public class ElasticSearchConfig {
        @Bean
        public RestHighLevelClient restHighLevelClient() {
            RestHighLevelClient restHighLevelClient = new RestHighLevelClient(
                    RestClient.builder(
                            new HttpHost("localhost", 9200, "http")
                    )
            );
            return restHighLevelClient;
        }
    }

    3:ES对索引、文档进行增删改查(java对ES操作的基本API)

    @RunWith(SpringRunner.class)
    @SpringBootTest
    public class ElasticsearchApplicationTests {
    
    
        @Autowired
        private RestHighLevelClient restHighLevelClient;
    
        //创建索引
        @Test
        public void testCreateIndex() throws IOException {
            CreateIndexRequest createIndexRequest = new CreateIndexRequest("ywb");
            CreateIndexResponse response = restHighLevelClient.indices().create(createIndexRequest, RequestOptions.DEFAULT);
            System.out.println(response);
        }
    
        /**
         * 测试索引是否存在
         *
         * @throws IOException
         */
        @Test
        public void testExistIndex() throws IOException {
            GetIndexRequest request = new GetIndexRequest("ywb");
            boolean exists = restHighLevelClient.indices().exists(request, RequestOptions.DEFAULT);
            System.out.println(exists);
        }
    
        /**
         * 删除索引
         */
        @Test
        public void deleteIndex() throws IOException {
            DeleteIndexRequest deleteIndexRequest = new DeleteIndexRequest("ywb");
            AcknowledgedResponse delete = restHighLevelClient.indices().delete(deleteIndexRequest, RequestOptions.DEFAULT);
            System.out.println(delete.isAcknowledged());
        }
    
        /**
         * 测试添加文档
         *
         * @throws IOException
         */
        @Test
        public void createDocument() throws IOException {
            User user = new User("ywb", 18);
            IndexRequest request = new IndexRequest("ywb");
            request.id("1");
            request.timeout(TimeValue.timeValueSeconds(1));
            request.timeout("1s");
            //将我们的数据放入请求,json
            request.source(JSON.toJSONString(user), XContentType.JSON);
            //客服端发送请求
            IndexResponse index = restHighLevelClient.index(request, RequestOptions.DEFAULT);
            System.out.println(index.toString());
            //对应我们的命令返回状态
            System.out.println(index.status());
        }
    
        //判断是否存在文档
        @Test
        public void testIsExist() throws IOException {
            GetRequest getRequest = new GetRequest("ywb", "1");
            //不获取返回的source的上下文
            getRequest.fetchSourceContext(new FetchSourceContext(false));
            getRequest.storedFields("_none_");
            boolean exists = restHighLevelClient.exists(getRequest, RequestOptions.DEFAULT);
            System.out.println(exists);
        }
    
        //获取文档信息
        @Test
        public void testGetDocument() throws IOException {
            GetRequest getRequest = new GetRequest("ywb", "1");
            GetResponse response = restHighLevelClient.get(getRequest, RequestOptions.DEFAULT);
            //打印文档信息
            System.out.println(response.getSourceAsString());
            System.out.println(response);
        }
    
        //更新文档信息
        @Test
        public void testUpdateDocument() throws IOException {
            UpdateRequest request = new UpdateRequest("ywb", "1");
            request.timeout("1s");
            User user = new User("ywb java", 19);
            request.doc(JSON.toJSONString(user), XContentType.JSON);
            UpdateResponse update = restHighLevelClient.update(request, RequestOptions.DEFAULT);
            System.out.println(update);
            System.out.println(update.status());
        }
        //删除文档
        @Test
        public void testDeleteDocument() throws IOException {
            DeleteRequest request = new DeleteRequest("ywb", "1");
            request.timeout("10s");
            User user = new User("ywb java", 19);
            DeleteResponse update = restHighLevelClient.delete(request, RequestOptions.DEFAULT);
            System.out.println(update.status());
        }
        //批量插入数据
        @Test
        public void testBulkRequest() throws IOException {
            BulkRequest bulkRequest = new BulkRequest();
            bulkRequest.timeout("10s");
            ArrayList<User> users = new ArrayList<>();
            users.add(new User("zhangsan", 1));
            users.add(new User("lishi", 12));
            users.add(new User("wangwu", 13));
            users.add(new User("zhaoliu", 14));
            users.add(new User("tianqi", 15));
            for (int i = 0; i < users.size(); i++) {
                bulkRequest.add(
                        new IndexRequest("ywb")
                                .id("" + i + 1)
                                .source(JSON.toJSONString(users.get(i)), XContentType.JSON)
                );
            }
            BulkResponse bulk = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
            System.out.println(bulk);
    
        }
    }

    四:1实战(爬取京东网页数据,添加至ElasticSearch)

      <!--        解析网页-->
            <dependency>
                <groupId>org.jsoup</groupId>
                <artifactId>jsoup</artifactId>
                <version>1.10.2</version>
            </dependency>

    2:工具类

    /**
     * @author : ywb
     * @createdDate : 2020/5/31
     * @updatedDate
     */
    @Component
    public class HtmlParseUtils {
    
        /**
         * 爬取京东商城搜索数据
         * @param keyWork
         * @return
         * @throws IOException
         */
        public List<Content> listGoods(String keyWork) throws IOException {
            String url = "https://search.jd.com/Search?keyword="+keyWork;
            Document document = Jsoup.parse(new URL(url),30000);
            Element element = document.getElementById("J_goodsList");
            Elements li = element.getElementsByTag("li");
            List<Content> list = new ArrayList<>();
            for (Element elements : li) {
                String img = elements.getElementsByTag("img").eq(0).attr("src");
                String price = elements.getElementsByClass("p-price").eq(0).text();
                String title = elements.getElementsByClass("p-name").eq(0).text();
                Content content = new Content();
                content.setImg(img);
                content.setPrices(price);
                content.setTitle(title);
                list.add(content);
            }
            return list;
        }
    }

    3:service(查询京东数据,存储到ES,并显示页面)

    /**
     * @author : ywb
     * @createdDate : 2020/5/31
     * @updatedDate
     */
    @Service
    public class ContentService {
        @Autowired
        private HtmlParseUtils htmlParseUtils;
    
        @Autowired
        @Qualifier("restHighLevelClient")
        private RestHighLevelClient client;
    
        /**
         * 插叙数据
         * @param keyWords
         * @return
         * @throws IOException
         */
        public Boolean parseContent(String keyWords) throws IOException {
            List<Content> contents = htmlParseUtils.listGoods("java");
            BulkRequest bulkRequest = new BulkRequest();
            bulkRequest.timeout("2m");
            for (int i = 0; i < contents.size(); i++) {
                bulkRequest.add(
                        new IndexRequest("jd")
                        .source(JSON.toJSONString(contents.get(i)), XContentType.JSON));
    
            }
            BulkResponse bulk = client.bulk(bulkRequest, RequestOptions.DEFAULT);
            return !bulk.hasFailures();
        }
    
        /**
         * 分页查询ElasticSearch数据,并高亮显示
         * @param keyWords
         * @param pageNum
         * @param pageSize
         * @return
         * @throws IOException
         */
        public List<Map<String,Object>> searchPage(String keyWords,Integer pageNum,Integer pageSize) throws IOException {
            if(pageNum<=1){
                pageNum=1;
            }
            SearchRequest searchRequest = new SearchRequest("jd");
            SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
            //分页
            sourceBuilder.from(pageNum);
            sourceBuilder.size(pageSize);
            TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("title",keyWords);
            sourceBuilder.query(termQueryBuilder);
            sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
    
            //高亮
            HighlightBuilder highlightBuilder = new HighlightBuilder();
            highlightBuilder.field("title");
            //多个高亮显示设置为true
            highlightBuilder.requireFieldMatch(false);
            highlightBuilder.preTags("<span style = 'color:red'>");
            highlightBuilder.postTags("</span>");
            sourceBuilder.highlighter(highlightBuilder);
            //执行搜索
            searchRequest.source(sourceBuilder);
            SearchResponse search = client.search(searchRequest, RequestOptions.DEFAULT);
            //解析结果
            List<Map<String, Object>> maps = new ArrayList<>();
            for (SearchHit hit : search.getHits().getHits()) {
                Map<String, HighlightField> highlightFields = hit.getHighlightFields();
                HighlightField title = highlightFields.get("title");
                //原来的结果
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                //解析高亮字段 ,将原来的字段换成高亮字段显示
                if(title!=null){
                    Text[] fragments = title.fragments();
                    String newTitle = "";
                    for (Text fragment : fragments) {
                        newTitle+=fragment;
                    }
                    //高亮字段替换原来内容
                    sourceAsMap.put("title",newTitle);
                }
                maps.add(hit.getSourceAsMap());
            }
            return maps;
        }
    }

    结果:

     

  • 相关阅读:
    asp.net core 2.0的认证和授权
    数据库性能优化详解
    StringUtils.defaultIfBlank
    SQL优化(二) 快速计算Distinct Count
    SQL语句中Left join,right join,inner join用法
    sql中的limit关键字
    多线程之间的资源共享
    面试长谈的String,StringBuffer,StringBuilder三兄弟有啥区别
    关于java中的值传递与引用传递遇到的问题
    Struts1和Struts2的区别和对比:
  • 原文地址:https://www.cnblogs.com/ywbmaster/p/13026742.html
Copyright © 2011-2022 走看看