  • Lucene: efficient search with keyword highlighting

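      Environment: JDK 8 or later

      References: 1. how2j-lucene

            2. Importing txt data into MySQL

      Result: fast search compared with an ordinary database query, with the matched keywords marked in red (paste the output into an HTML page to see the effect)

      Performance comparison: 1. Lucene returns results across different degrees of relevance, ranked by score, which a LIKE fuzzy query cannot do

           2. On large data sets the time difference is substantial, as in the comparison below, which uses 140,000 records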
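
To make the first point concrete, here is a toy sketch in plain Java (no Lucene involved) that contrasts a LIKE-style substring test with a simple term-overlap score. The class name `RelevanceSketch`, the whitespace tokenization, and the overlap score are all assumptions made for illustration; Lucene's analyzers and scoring are far more elaborate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class RelevanceSketch {

    /** LIKE '%query%' semantics: the whole query must appear as one substring. */
    static boolean likeMatch(String text, String query) {
        return text.contains(query);
    }

    /** Toy relevance: the number of distinct query terms that occur in the text. */
    static int overlapScore(String text, String query) {
        Set<String> textTerms = new HashSet<>(Arrays.asList(text.toLowerCase().split("\\s+")));
        Set<String> queryTerms = new HashSet<>(Arrays.asList(query.toLowerCase().split("\\s+")));
        queryTerms.retainAll(textTerms);
        return queryTerms.size();
    }

    /** Returns the texts with a non-zero score, best match first. */
    static List<String> rank(List<String> texts, String query) {
        List<String> hits = new ArrayList<>();
        for (String text : texts) {
            if (overlapScore(text, query) > 0) {
                hits.add(text);
            }
        }
        hits.sort((a, b) -> Integer.compare(overlapScore(b, query), overlapScore(a, query)));
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample names, for illustration only.
        List<String> names = Arrays.asList(
                "red leather wallet", "leather belt", "red cotton shirt", "wallet red");
        String query = "red wallet";
        for (String name : names) {
            System.out.println(name + " | LIKE: " + likeMatch(name, query)
                    + " | score: " + overlapScore(name, query));
        }
        System.out.println(rank(names, query));
    }
}
```

None of the sample names contains the exact substring "red wallet", so the LIKE-style test rejects all of them, while the term-based score still finds and orders the partial matches.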

     

    Java test files:

     1. MySQL data: download link (contains the txt data and the MySQL table-creation script)

     2. Jar download: https://files.cnblogs.com/files/meditation5201314/lucene-lib.rar

     3. Java files:

    package com.empirefree.lucene;

    /**
     * One product record: a row of the product table, or a line of the txt file.
     *
     * @author Empirefree 胡宇乔
     * @version Created 2020-03-31 17:48:13
     */
    public class Product {
        int id;
        String name;
        String category;
        float price;
        String place;
        String code;

        public int getId() {
            return id;
        }
        public void setId(int id) {
            this.id = id;
        }
        public String getName() {
            return name;
        }
        public void setName(String name) {
            this.name = name;
        }
        public String getCategory() {
            return category;
        }
        public void setCategory(String category) {
            this.category = category;
        }
        public float getPrice() {
            return price;
        }
        public void setPrice(float price) {
            this.price = price;
        }
        public String getPlace() {
            return place;
        }
        public void setPlace(String place) {
            this.place = place;
        }
        public String getCode() {
            return code;
        }
        public void setCode(String code) {
            this.code = code;
        }

        @Override
        public String toString() {
            return "Product [id=" + id + ", name=" + name + ", category=" + category + ", price=" + price + ", place="
                    + place + ", code=" + code + "]";
        }
    }
    Product.java
    package com.empirefree.lucene;

    import java.io.File;
    import java.io.IOException;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.commons.io.FileUtils;

    /**
     * @author Empirefree 胡宇乔
     * @version Created 2020-03-31 17:49:56
     */
    public class ProductUtil {
        private static final String URL = "jdbc:mysql://127.0.0.1:3306/campus_system?useUnicode=true&characterEncoding=utf-8";
        private static final String USER = "root";
        private static final String PASSWORD = "root";

        // Shared connection: opened once, reused by every query, released by deleteconnection().
        private static Connection connection = null;

        static {
            try {
                // 1. Load the driver
                Class.forName("com.mysql.jdbc.Driver");
                // 2. Open the database connection
                connection = DriverManager.getConnection(URL, USER, PASSWORD);
            } catch (ClassNotFoundException | SQLException e) {
                e.printStackTrace();
            }
        }

        // Parse one comma-separated line: id,name,category,price,place,code
        public static Product lineproduct(String line) {
            Product p = new Product();
            String[] fields = line.split(",");
            p.setId(Integer.parseInt(fields[0]));
            p.setName(fields[1]);
            p.setCategory(fields[2]);
            p.setPrice(Float.parseFloat(fields[3]));
            p.setPlace(fields[4]);
            p.setCode(fields[5]);
            return p;
        }

        // Read every line of the txt file into a list of products
        public static List<Product> filelist(String filename) throws IOException {
            File file = new File(filename);
            List<String> lines = FileUtils.readLines(file, "UTF-8");
            List<Product> products = new ArrayList<>();
            for (String line : lines) {
                products.add(lineproduct(line));
            }
            return products;
        }

        // Copy the current row of the result set into a Product
        private static Product rowproduct(ResultSet resultSet) throws SQLException {
            Product product = new Product();
            product.setId(resultSet.getInt("id"));
            product.setName(resultSet.getString("name"));
            product.setCategory(resultSet.getString("category"));
            product.setPrice(resultSet.getFloat("price"));
            product.setPlace(resultSet.getString("place"));
            product.setCode(resultSet.getString("code"));
            return product;
        }

        // Load the whole product table
        public static List<Product> mysqllist() {
            List<Product> products = new ArrayList<>();
            // Note: use the java.sql classes here, not the com.mysql.jdbc ones.
            // try-with-resources closes the statement and result set; the shared
            // connection stays open so that later queries can still use it.
            try (Statement statement = connection.createStatement();
                 ResultSet resultSet = statement.executeQuery("select * from product")) {
                while (resultSet.next()) {
                    products.add(rowproduct(resultSet));
                }
            } catch (SQLException e) {
                e.printStackTrace();
            }
            return products;
        }

        // LIKE fuzzy query on name; the keyword is bound as a parameter
        // instead of being concatenated into the SQL string.
        public static List<Product> mysqllist2(String searchname) {
            List<Product> products = new ArrayList<>();
            String sql = "select * from product where name like ?";
            try (PreparedStatement statement = connection.prepareStatement(sql)) {
                statement.setString(1, "%" + searchname + "%");
                try (ResultSet resultSet = statement.executeQuery()) {
                    while (resultSet.next()) {
                        products.add(rowproduct(resultSet));
                    }
                }
            } catch (SQLException e) {
                e.printStackTrace();
            }
            return products;
        }

        // Close the shared connection once all queries are done
        public static void deleteconnection() throws SQLException {
            if (connection != null) {
                connection.close();
            }
        }

        public static void main(String[] args) throws IOException, SQLException {
            // List<Product> products = filelist("140k_products.txt");
            List<Product> products = mysqllist();
            for (Product product : products) {
                System.out.println(product);
            }
            // System.out.println(products.size());
            deleteconnection();
        }
    }
    ProductUtil.java (the MySQL connection, kept in its own file so it can be reused later)
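
One detail worth noting about the LIKE query in mysqllist2: the keyword is wrapped in `%` wildcards, so a keyword that itself contains `%` or `_` is interpreted as a pattern rather than as literal text. A minimal sketch of an escaping helper follows, assuming MySQL's default backslash escape character; the class `LikePattern` and its methods are hypothetical names introduced here, not part of the original code.

```java
final class LikePattern {

    /** Escapes the LIKE wildcards % and _ (and the backslash itself) with a backslash. */
    static String escape(String keyword) {
        StringBuilder sb = new StringBuilder(keyword.length() + 8);
        for (char c : keyword.toCharArray()) {
            if (c == '\\' || c == '%' || c == '_') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Builds the "contains" pattern to bind to a query such as: ... where name like ? */
    static String contains(String keyword) {
        return "%" + escape(keyword) + "%";
    }

    public static void main(String[] args) {
        // Made-up keyword containing both wildcards.
        System.out.println(contains("100%_cotton"));
    }
}
```

The resulting pattern is meant to be passed through `PreparedStatement.setString`, so that quotes in the keyword cannot alter the SQL statement either.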
    package com.empirefree.lucene;

    import java.io.IOException;
    import java.io.StringReader;
    import java.util.List;
    import java.util.Scanner;

    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.IndexableField;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.highlight.Highlighter;
    import org.apache.lucene.search.highlight.QueryScorer;
    import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.RAMDirectory;
    import org.wltea.analyzer.lucene.IKAnalyzer;

    /**
     * @author Empirefree 胡宇乔
     * @version Created 2020-03-31 17:45:39
     */
    public class TestLucene2 {

        // Build an in-memory index from every product in the database
        private static Directory createIndex(IKAnalyzer analyzer) throws IOException {
            Directory index = new RAMDirectory();
            IndexWriterConfig config = new IndexWriterConfig(analyzer);
            IndexWriter writer = new IndexWriter(index, config);

            // List<Product> products = ProductUtil.filelist("140k_products.txt");
            List<Product> products = ProductUtil.mysqllist();
            int total = products.size();
            int count = 0;
            int per = 0;
            int oldPer = 0;
            for (Product p : products) {
                addDoc(writer, p);
                count++;
                per = count * 100 / total;
                // Print progress only when the percentage changes
                if (per != oldPer) {
                    oldPer = per;
                    System.out.printf("Indexing: %d records in total, progress: %d%%%n", total, per);
                }
            }
            writer.close();
            return index;
        }

        // Add one product to the index; only the name field is indexed here
        private static void addDoc(IndexWriter w, Product p) throws IOException {
            Document doc = new Document();
            // doc.add(new TextField("id", String.valueOf(p.getId()), Field.Store.YES));
            doc.add(new TextField("name", p.getName(), Field.Store.YES));
            // doc.add(new TextField("category", p.getCategory(), Field.Store.YES));
            // doc.add(new TextField("price", String.valueOf(p.getPrice()), Field.Store.YES));
            // doc.add(new TextField("place", p.getPlace(), Field.Store.YES));
            // doc.add(new TextField("code", p.getCode(), Field.Store.YES));
            w.addDocument(doc);
        }

        // Print the hits, wrapping the matched terms of the name field in a red <span>
        private static void showSearchResults(IndexSearcher searcher, ScoreDoc[] hits, Query query, IKAnalyzer analyzer) throws Exception {
            SimpleHTMLFormatter simpleHTMLFormatter = new SimpleHTMLFormatter("<span style='color:red'>", "</span>");
            Highlighter highlighter = new Highlighter(simpleHTMLFormatter, new QueryScorer(query));

            System.out.println("Found " + hits.length + " hits.");
            System.out.println("No.\tScore\tResult");
            for (int i = 0; i < hits.length; ++i) {
                ScoreDoc scoreDoc = hits[i];
                int docId = scoreDoc.doc;
                Document d = searcher.doc(docId);
                List<IndexableField> fields = d.getFields();
                System.out.print(i + 1);
                System.out.print("\t" + scoreDoc.score);
                for (IndexableField f : fields) {
                    if ("name".equals(f.name())) {
                        TokenStream tokenStream = analyzer.tokenStream(f.name(), new StringReader(d.get(f.name())));
                        String fieldContent = highlighter.getBestFragment(tokenStream, d.get(f.name()));
                        System.out.print("\t" + fieldContent);
                    } else {
                        System.out.print("\t" + d.get(f.name()));
                    }
                }
                System.out.println("<br>");
            }
        }

        public static void main(String[] args) throws Exception {
            Scanner s = new Scanner(System.in);
            System.out.print("Enter a search keyword: ");
            String keyword = s.nextLine();
            System.out.println("Current keyword: " + keyword);
            // currentTimeMillis() returns milliseconds, so the elapsed time is in ms
            long startTime = System.currentTimeMillis();
            List<Product> products = ProductUtil.mysqllist2(keyword);
            long endTime = System.currentTimeMillis();
            System.out.println("LIKE query time: " + (endTime - startTime) + "ms");

            for (Product product : products) {
                System.out.println(product.getName());
            }

            /******************************************************************************/
            // 1. Prepare the Chinese analyzer
            IKAnalyzer analyzer = new IKAnalyzer();
            // 2. Build the index
            Directory index = createIndex(analyzer);

            // 3. Build the query
            System.out.print("Enter a search keyword: ");
            keyword = s.nextLine();
            System.out.println("Current keyword: " + keyword);
            Query query = new QueryParser("name", analyzer).parse(keyword);

            startTime = System.currentTimeMillis();
            // 4. Search (the timing covers the search only, not the index build)
            IndexReader reader = DirectoryReader.open(index);
            IndexSearcher searcher = new IndexSearcher(reader);
            int numberPerPage = 10;
            ScoreDoc[] hits = searcher.search(query, numberPerPage).scoreDocs;
            endTime = System.currentTimeMillis();
            System.out.println("Lucene query time: " + (endTime - startTime) + "ms");

            // 5. Show the results
            showSearchResults(searcher, hits, query, analyzer);
            // 6. Close the reader
            reader.close();

            ProductUtil.deleteconnection();
        }
    }
    TestLucene2.java (database version)
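
A note on the timing in this listing: `System.currentTimeMillis()` returns milliseconds, and for short operations `System.nanoTime()` is usually the better clock for measuring elapsed time. The following is a small sketch of a reusable timing helper; the class `Timed` is a hypothetical name introduced for illustration and is not part of the original code.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class Timed<T> {
    final T value;     // result of the timed task
    final long nanos;  // elapsed time in nanoseconds

    private Timed(T value, long nanos) {
        this.value = value;
        this.nanos = nanos;
    }

    /** Runs the task once and records how long it took. */
    static <T> Timed<T> run(Supplier<T> task) {
        long start = System.nanoTime();
        T value = task.get();
        return new Timed<>(value, System.nanoTime() - start);
    }

    /** Elapsed time converted to whole milliseconds. */
    long millis() {
        return TimeUnit.NANOSECONDS.toMillis(nanos);
    }

    public static void main(String[] args) {
        Timed<Integer> t = Timed.run(() -> {
            int sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i % 7;
            }
            return sum;
        });
        System.out.println("result=" + t.value + ", elapsed=" + t.millis() + " ms (" + t.nanos + " ns)");
    }
}
```

With such a helper, the LIKE measurement could be written as `Timed.run(() -> ProductUtil.mysqllist2(keyword))`. A single run is only a rough indication; repeating the measurement gives a fairer comparison.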

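    Notes on TestLucene2.java:

    1. Product carries every column. If you only need to search name (or another field such as username, just change it accordingly), comment out the other doc.add calls.

    2. In doc.add, p.getId() is an int and must be converted to a String first.

    3. The final results can be collected in a List and rendered on the front end with EL expressions (the number of titles displayed can also be limited).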

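      How the Lucene code works:

        1. addDoc(): copies the fields of a Product into a Document so that they can be searched later.

        2. createIndex(): creates the index. It calls mysqllist() to load the data from the database and addDoc() to store each record in the index.

        3. showSearchResults(): displays the hits returned by the search over that index and marks the matched keywords in red.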
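
To illustrate what "marking the keyword in red" amounts to, here is a plain-Java sketch that wraps every occurrence of a term in the same tags the listing passes to SimpleHTMLFormatter. This is not how Lucene's Highlighter works internally (it operates on the analyzer's token stream and the query terms); the class `HighlightSketch` and its regex approach are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HighlightSketch {
    static final String PRE = "<span style='color:red'>";
    static final String POST = "</span>";

    /** Wraps every case-insensitive occurrence of term in the red span tags. */
    static String highlight(String text, String term) {
        if (term.isEmpty()) {
            return text;
        }
        Matcher m = Pattern.compile(Pattern.quote(term), Pattern.CASE_INSENSITIVE).matcher(text);
        StringBuffer out = new StringBuffer(); // StringBuffer keeps this JDK 8 compatible
        while (m.find()) {
            // quoteReplacement keeps '$' and '\' in the matched text literal
            m.appendReplacement(out, Matcher.quoteReplacement(PRE + m.group() + POST));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up sample text.
        System.out.println(highlight("Red wallet, red belt", "red"));
    }
}
```

The sketch does not HTML-escape the text, so a real page would need to escape the content before inserting the tags.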

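        Step by step: createIndex() first builds an in-memory index. An ordinary LIKE query runs inside the database every time, whereas Lucene loads the data into memory once and then answers every query from there (load once, query many times). While the in-memory Directory is being built,

    the Product fields are added to a Document. The listing above adds only name; once the other doc.add lines are uncommented, any field of Product can be searched just by changing "name" in the query to that field's name.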
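
The "load once, query many times" idea can be sketched with a toy in-memory inverted index in plain Java. The class `TinyIndex`, its whitespace tokenization, and its single-term lookup are assumptions made for illustration; Lucene's index structures, analyzers, and scoring are much more elaborate.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TinyIndex {
    private final List<String> docs = new ArrayList<>();
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Indexing step, done once per document: record which terms occur in it. */
    int add(String text) {
        int id = docs.size();
        docs.add(text);
        for (String term : text.toLowerCase().split("\\s+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
            }
        }
        return id;
    }

    /** Query step: a map lookup instead of a scan over every document. */
    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        for (int id : postings.getOrDefault(term.toLowerCase(), Collections.<Integer>emptySet())) {
            hits.add(docs.get(id));
        }
        return hits;
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        // Made-up sample names.
        index.add("red leather wallet");
        index.add("leather belt");
        index.add("canvas wallet");
        System.out.println(index.search("wallet"));
        System.out.println(index.search("leather"));
    }
}
```

The cost of tokenizing every record is paid once while the index is built; each later query only looks up its term, which is why repeated searches are cheap compared with a LIKE scan.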

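    ---------------------------------------------------- Further notes --------------------------------------------------------

    1. MySQL connection: the usual pattern is to open a connection, run the query, and close it. During development the connection is queried many times, so it is kept in a static field and released by calling deleteconnection() when it is no longer needed.

    (See ProductUtil.java for details.)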

    2. Importing txt data into a MySQL table:

    LOAD DATA INFILE 'E:/xxx.txt' 
    REPLACE INTO TABLE test FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'

    The txt data should hold one record per line, with the fields separated by commas.
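
For reference, ProductUtil.lineproduct reads each line of the txt file as six comma-separated fields in the order id, name, category, price, place, code. A defensive version of that parsing can be sketched as follows; the class `ProductLine` is a hypothetical stand-alone type introduced for illustration, and the sample line in `main` is made up rather than taken from the real data file.

```java
import java.util.Optional;

final class ProductLine {
    final int id;
    final String name;
    final String category;
    final float price;
    final String place;
    final String code;

    private ProductLine(int id, String name, String category, float price, String place, String code) {
        this.id = id;
        this.name = name;
        this.category = category;
        this.price = price;
        this.place = place;
        this.code = code;
    }

    /** Parses "id,name,category,price,place,code"; returns empty for malformed lines. */
    static Optional<ProductLine> parse(String line) {
        String[] f = line.split(",", -1); // -1 keeps trailing empty fields
        if (f.length != 6) {
            return Optional.empty();
        }
        try {
            return Optional.of(new ProductLine(
                    Integer.parseInt(f[0].trim()), f[1], f[2],
                    Float.parseFloat(f[3].trim()), f[4], f[5]));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Invented sample line, for illustration only.
        Optional<ProductLine> p = parse("1,Sample Wallet,Accessories,19.9,Shanghai,A001");
        System.out.println(p.map(x -> x.id + " " + x.name + " " + x.price).orElse("malformed"));
        System.out.println(parse("not,a,valid,line").isPresent());
    }
}
```

Returning an empty Optional lets the caller skip or log bad lines instead of aborting the whole import on the first malformed record. The sketch assumes that no field contains a comma.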

  • Original post: https://www.cnblogs.com/meditation5201314/p/12612057.html