  • Lucene full-text search: add, delete, update, and query examples

      Creating the index

    The previous post already covered the overall flow of building an index with Lucene, so here is just a brief recap:

    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
    Directory directory = FSDirectory.open(new File("/tmp/testindex"));
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_CURRENT, analyzer);
    IndexWriter iwriter = new IndexWriter(directory, config);
    Document doc = new Document();
    String text = "This is the text to be indexed.";
    doc.add(new Field("fieldname", text, TextField.TYPE_STORED));
    iwriter.addDocument(doc);
    iwriter.close();

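      1 Create a Directory that points at the index directory

      2 Create the analyzer, then create the IndexWriter

      3 Create Document objects to hold the data and add them to the IndexWriter

      4 Close the IndexWriter, which commits the changes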

    /**
         * Build the index.
         * 
         * @throws Exception
         */
        public static void index() throws Exception {
            
            String text1 = "hello,man!";
            String text2 = "goodbye,man!";
            String text3 = "hello,woman!";
            String text4 = "goodbye,woman!";
            
            Date date1 = new Date();
            analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
            directory = FSDirectory.open(new File(INDEX_DIR));
    
            IndexWriterConfig config = new IndexWriterConfig(
                    Version.LUCENE_CURRENT, analyzer);
            indexWriter = new IndexWriter(directory, config);
    
            Document doc1 = new Document();
            doc1.add(new TextField("filename", "text1", Store.YES));
            doc1.add(new TextField("content", text1, Store.YES));
            indexWriter.addDocument(doc1);
            
            Document doc2 = new Document();
            doc2.add(new TextField("filename", "text2", Store.YES));
            doc2.add(new TextField("content", text2, Store.YES));
            indexWriter.addDocument(doc2);
            
            Document doc3 = new Document();
            doc3.add(new TextField("filename", "text3", Store.YES));
            doc3.add(new TextField("content", text3, Store.YES));
            indexWriter.addDocument(doc3);
            
            Document doc4 = new Document();
            doc4.add(new TextField("filename", "text4", Store.YES));
            doc4.add(new TextField("content", text4, Store.YES));
            indexWriter.addDocument(doc4);
            
            indexWriter.commit();
            indexWriter.close();
    
            Date date2 = new Date();
            System.out.println("Index creation took: " + (date2.getTime() - date1.getTime()) + "ms\n");
        }
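
    To see why a later search for "man" matches some of these documents and not others, it helps to look at the terms the analyzer produces. The sketch below is not Lucene code: SimpleTokenizer is a hypothetical stand-in that splits on non-alphanumeric characters and lower-cases each piece. For these plain ASCII samples that roughly approximates StandardAnalyzer, which in reality follows Unicode word-break rules and, depending on version and configuration, may also drop common stop words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; it is NOT part of Lucene.
public class SimpleTokenizer {

    /** Splits on anything that is not a letter or digit and lower-cases each piece. */
    public static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                terms.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("hello,man!"));      // [hello, man]
        System.out.println(tokenize("goodbye,woman!"));  // [goodbye, woman]
    }
}
```

    Under this approximation "hello,woman!" yields the terms hello and woman, so a term query for man would not be expected to match it.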

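      Adding to the index incrementally

    Lucene supports incremental indexing: new documents can be added without affecting what is already indexed, and Lucene merges the index files automatically at an appropriate time.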

    /**
         * Add a document to the existing index.
         * 
         * @throws Exception
         */
        public static void insert() throws Exception {
            String text5 = "hello,goodbye,man,woman";
            Date date1 = new Date();
            analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
            directory = FSDirectory.open(new File(INDEX_DIR));
    
            IndexWriterConfig config = new IndexWriterConfig(
                    Version.LUCENE_CURRENT, analyzer);
            indexWriter = new IndexWriter(directory, config);
    
            Document doc1 = new Document();
            doc1.add(new TextField("filename", "text5", Store.YES));
            doc1.add(new TextField("content", text5, Store.YES));
            indexWriter.addDocument(doc1);
    
            indexWriter.commit();
            indexWriter.close();
    
            Date date2 = new Date();
            System.out.println("Index insertion took: " + (date2.getTime() - date1.getTime()) + "ms\n");
        }

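      Deleting from the index

    Deletion also goes through IndexWriter, via its deleteDocuments method. Passing a Term deletes every document that contains that term. If you only want to delete a single document, it is best to define a unique ID field and delete by that field.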

    /**
         * Delete documents from the index.
         * 
         * @param str the filename term whose documents are deleted
         * @throws Exception
         */
        public static void delete(String str) throws Exception {
            Date date1 = new Date();
            analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
            directory = FSDirectory.open(new File(INDEX_DIR));
    
            IndexWriterConfig config = new IndexWriterConfig(
                    Version.LUCENE_CURRENT, analyzer);
            indexWriter = new IndexWriter(directory, config);
            
            indexWriter.deleteDocuments(new Term("filename",str));  
            
            indexWriter.close();
            
            Date date2 = new Date();
            System.out.println("Index deletion took: " + (date2.getTime() - date1.getTime()) + "ms\n");
        }
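
    The difference between deleting by a shared keyword and deleting by a unique ID can be sketched without Lucene. ToyDeleteDemo below is a hypothetical in-memory model, not the Lucene API: each toy document has a unique filename and a set of content terms, and deleteDocuments removes every document that contains the given term in the given field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model for illustration only; it is NOT the Lucene API.
public class ToyDeleteDemo {

    /** A toy document: a unique filename plus the set of terms in its content. */
    static final class Doc {
        final String filename;
        final Set<String> contentTerms;

        Doc(String filename, String... terms) {
            this.filename = filename;
            this.contentTerms = new HashSet<>(Arrays.asList(terms));
        }
    }

    final List<Doc> docs = new ArrayList<>();

    /** Removes every document containing the term in the given field; returns the number removed. */
    public int deleteDocuments(String field, String term) {
        int removed = 0;
        for (Iterator<Doc> it = docs.iterator(); it.hasNext();) {
            Doc d = it.next();
            boolean match = "filename".equals(field)
                    ? d.filename.equals(term)
                    : "content".equals(field) && d.contentTerms.contains(term);
            if (match) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    /** Builds a toy index holding the four sample documents from the article. */
    public static ToyDeleteDemo sample() {
        ToyDeleteDemo index = new ToyDeleteDemo();
        index.docs.add(new Doc("text1", "hello", "man"));
        index.docs.add(new Doc("text2", "goodbye", "man"));
        index.docs.add(new Doc("text3", "hello", "woman"));
        index.docs.add(new Doc("text4", "goodbye", "woman"));
        return index;
    }

    public static void main(String[] args) {
        // A shared keyword removes every document that contains it.
        System.out.println(sample().deleteDocuments("content", "man"));     // 2
        // A unique ID removes exactly one document.
        System.out.println(sample().deleteDocuments("filename", "text3"));  // 1
    }
}
```

    In the article's code the filename field plays the role of the unique ID, which is why delete("text5") removes only the document added by insert().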

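      Updating the index

    Lucene has no true in-place update. updateDocument takes a Term and a replacement document, but under the hood it first deletes every document that matches the term and then adds the new document.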

    /**
         * Update the index.
         * 
         * @throws Exception
         */
        public static void update() throws Exception {
            String text1 = "update,hello,man!";
            Date date1 = new Date();
            analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
            directory = FSDirectory.open(new File(INDEX_DIR));
    
            IndexWriterConfig config = new IndexWriterConfig(
                    Version.LUCENE_CURRENT, analyzer);
            indexWriter = new IndexWriter(directory, config);
    
            Document doc1 = new Document();
            doc1.add(new TextField("filename", "text1", Store.YES));
            doc1.add(new TextField("content", text1, Store.YES));
    
            // Deletes every document whose filename term is "text1", then adds doc1.
            indexWriter.updateDocument(new Term("filename", "text1"), doc1);
    
            // close() also commits the pending changes.
            indexWriter.close();
    
            Date date2 = new Date();
            System.out.println("Index update took: " + (date2.getTime() - date1.getTime()) + "ms\n");
        }
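
    The delete-then-add behaviour described above can be modelled in a few lines. ToyUpdateDemo is a hypothetical in-memory stand-in, not the Lucene API; its fields are matched by exact value, and updateDocument is simply deleteDocuments followed by addDocument.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model for illustration only; it is NOT the Lucene API.
public class ToyUpdateDemo {

    public final List<Map<String, String>> docs = new ArrayList<>();

    /** Builds a toy document with the two fields used in the article. */
    public static Map<String, String> doc(String filename, String content) {
        Map<String, String> d = new LinkedHashMap<>();
        d.put("filename", filename);
        d.put("content", content);
        return d;
    }

    public void addDocument(Map<String, String> d) {
        docs.add(d);
    }

    /** Removes every document whose field has exactly the given value. */
    public void deleteDocuments(String field, String value) {
        docs.removeIf(d -> value.equals(d.get(field)));
    }

    /** "Update" modelled as delete-by-term followed by add. */
    public void updateDocument(String field, String value, Map<String, String> replacement) {
        deleteDocuments(field, value);
        addDocument(replacement);
    }

    public static void main(String[] args) {
        ToyUpdateDemo index = new ToyUpdateDemo();
        index.addDocument(doc("text1", "hello,man!"));
        index.addDocument(doc("text2", "goodbye,man!"));

        index.updateDocument("filename", "text1", doc("text1", "update,hello,man!"));

        System.out.println(index.docs.size());                 // 2
        System.out.println(index.docs.get(1).get("content"));  // update,hello,man!
    }
}
```

    Two consequences follow from this model: if the term matches several documents, all of them are replaced by the single new one, and if it matches none, the call behaves like a plain add.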

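      Searching the index by keyword

    Lucene offers many kinds of queries, which are not covered in detail here. A search returns an array of ScoreDoc hits, loosely comparable to a ResultSet; for each hit we load the Document and read the stored values by field name.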

    /**
         * Keyword search.
         * 
         * @param str the query string to search for in the content field
         * @throws Exception
         */
        public static void search(String str) throws Exception {
            directory = FSDirectory.open(new File(INDEX_DIR));
            analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
            DirectoryReader ireader = DirectoryReader.open(directory);
            IndexSearcher isearcher = new IndexSearcher(ireader);
    
            QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "content",analyzer);
            Query query = parser.parse(str);
    
            ScoreDoc[] hits = isearcher.search(query, null, 1000).scoreDocs;
            for (int i = 0; i < hits.length; i++) {
                Document hitDoc = isearcher.doc(hits[i].doc);
                System.out.println(hitDoc.get("filename"));
                System.out.println(hitDoc.get("content"));
            }
            ireader.close();
            directory.close();
        }
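
    The two-step pattern in search(), where hits carry only a document number and a score and the stored values are fetched afterwards, can be illustrated with a toy model. ToySearchDemo is hypothetical and not the Lucene API; in particular its score (matches divided by the number of terms) is made up for the example and is not Lucene's scoring formula.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical in-memory model for illustration only; it is NOT the Lucene API.
public class ToySearchDemo {

    /** Toy counterpart of a hit: a document number plus a score. */
    public static final class Hit {
        public final int doc;
        public final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Stored "content" values, indexed by document number.
    public static final List<String> STORED = Arrays.asList(
            "hello,man!", "goodbye,man!", "hello,woman!", "hello,goodbye,man,woman");

    /** Returns up to n hits for the term, best score first. */
    public static List<Hit> search(String term, int n) {
        List<Hit> hits = new ArrayList<>();
        for (int doc = 0; doc < STORED.size(); doc++) {
            String[] terms = STORED.get(doc).toLowerCase(Locale.ROOT).split("[^a-z0-9]+");
            long matches = Arrays.stream(terms).filter(term::equals).count();
            if (matches > 0) {
                // Made-up score: fraction of the document's terms that match.
                hits.add(new Hit(doc, (float) matches / terms.length));
            }
        }
        // Stable sort, highest score first.
        hits.sort(Comparator.comparingDouble((Hit h) -> -h.score));
        return hits.subList(0, Math.min(n, hits.size()));
    }

    public static void main(String[] args) {
        for (Hit hit : search("man", 1000)) {
            // Step two: look up the stored value by document number.
            System.out.println(hit.doc + " " + STORED.get(hit.doc));
        }
    }
}
```

    With the sample data, a search for man returns documents 0, 1 and 3 in that order, and skips "hello,woman!" because woman is a different term.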

      Full code

    package test;
    
    import java.io.File;
    import java.util.Date;
    import java.util.List;
    
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.LongField;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.document.Field.Store;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;
    
    public class TestLucene {
        // Directory where the index is stored
        private static String INDEX_DIR = "D:\\luceneIndex";
        private static Analyzer analyzer = null;
        private static Directory directory = null;
        private static IndexWriter indexWriter = null;
    
        public static void main(String[] args) {
            try {
    //            index();
                search("man");
    //            insert();
    //            delete("text5");
    //            update();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        /**
         * Update the index.
         * 
         * @throws Exception
         */
        public static void update() throws Exception {
            String text1 = "update,hello,man!";
            Date date1 = new Date();
            analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
            directory = FSDirectory.open(new File(INDEX_DIR));
    
            IndexWriterConfig config = new IndexWriterConfig(
                    Version.LUCENE_CURRENT, analyzer);
            indexWriter = new IndexWriter(directory, config);
    
            Document doc1 = new Document();
            doc1.add(new TextField("filename", "text1", Store.YES));
            doc1.add(new TextField("content", text1, Store.YES));
    
            // Deletes every document whose filename term is "text1", then adds doc1.
            indexWriter.updateDocument(new Term("filename", "text1"), doc1);
    
            // close() also commits the pending changes.
            indexWriter.close();
    
            Date date2 = new Date();
            System.out.println("Index update took: " + (date2.getTime() - date1.getTime()) + "ms\n");
        }
        /**
         * Delete documents from the index.
         * 
         * @param str the filename term whose documents are deleted
         * @throws Exception
         */
        public static void delete(String str) throws Exception {
            Date date1 = new Date();
            analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
            directory = FSDirectory.open(new File(INDEX_DIR));
    
            IndexWriterConfig config = new IndexWriterConfig(
                    Version.LUCENE_CURRENT, analyzer);
            indexWriter = new IndexWriter(directory, config);
            
            indexWriter.deleteDocuments(new Term("filename",str));  
            
            indexWriter.close();
            
            Date date2 = new Date();
            System.out.println("Index deletion took: " + (date2.getTime() - date1.getTime()) + "ms\n");
        }
        /**
         * Add a document to the existing index.
         * 
         * @throws Exception
         */
        public static void insert() throws Exception {
            String text5 = "hello,goodbye,man,woman";
            Date date1 = new Date();
            analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
            directory = FSDirectory.open(new File(INDEX_DIR));
    
            IndexWriterConfig config = new IndexWriterConfig(
                    Version.LUCENE_CURRENT, analyzer);
            indexWriter = new IndexWriter(directory, config);
    
            Document doc1 = new Document();
            doc1.add(new TextField("filename", "text5", Store.YES));
            doc1.add(new TextField("content", text5, Store.YES));
            indexWriter.addDocument(doc1);
    
            indexWriter.commit();
            indexWriter.close();
    
            Date date2 = new Date();
            System.out.println("Index insertion took: " + (date2.getTime() - date1.getTime()) + "ms\n");
        }
        /**
         * Build the index.
         * 
         * @throws Exception
         */
        public static void index() throws Exception {
            
            String text1 = "hello,man!";
            String text2 = "goodbye,man!";
            String text3 = "hello,woman!";
            String text4 = "goodbye,woman!";
            
            Date date1 = new Date();
            analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
            directory = FSDirectory.open(new File(INDEX_DIR));
    
            IndexWriterConfig config = new IndexWriterConfig(
                    Version.LUCENE_CURRENT, analyzer);
            indexWriter = new IndexWriter(directory, config);
    
            Document doc1 = new Document();
            doc1.add(new TextField("filename", "text1", Store.YES));
            doc1.add(new TextField("content", text1, Store.YES));
            indexWriter.addDocument(doc1);
            
            Document doc2 = new Document();
            doc2.add(new TextField("filename", "text2", Store.YES));
            doc2.add(new TextField("content", text2, Store.YES));
            indexWriter.addDocument(doc2);
            
            Document doc3 = new Document();
            doc3.add(new TextField("filename", "text3", Store.YES));
            doc3.add(new TextField("content", text3, Store.YES));
            indexWriter.addDocument(doc3);
            
            Document doc4 = new Document();
            doc4.add(new TextField("filename", "text4", Store.YES));
            doc4.add(new TextField("content", text4, Store.YES));
            indexWriter.addDocument(doc4);
            
            indexWriter.commit();
            indexWriter.close();
    
            Date date2 = new Date();
            System.out.println("Index creation took: " + (date2.getTime() - date1.getTime()) + "ms\n");
        }
    
        /**
         * Keyword search.
         * 
         * @param str the query string to search for in the content field
         * @throws Exception
         */
        public static void search(String str) throws Exception {
            directory = FSDirectory.open(new File(INDEX_DIR));
            analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
            DirectoryReader ireader = DirectoryReader.open(directory);
            IndexSearcher isearcher = new IndexSearcher(ireader);
    
            QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "content",analyzer);
            Query query = parser.parse(str);
    
            ScoreDoc[] hits = isearcher.search(query, null, 1000).scoreDocs;
            for (int i = 0; i < hits.length; i++) {
                Document hitDoc = isearcher.doc(hits[i].doc);
                System.out.println(hitDoc.get("filename"));
                System.out.println(hitDoc.get("content"));
            }
            ireader.close();
            directory.close();
        }
    }
  • Original article: https://www.cnblogs.com/chen-lhx/p/5586327.html