Lucene article

The code for creating a Lucene index is:

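// Imports needed by this method (the File-based IndexWriter constructor is from the Lucene 2.x API)
import java.io.File;
import java.io.IOException;
import net.paoding.analysis.analyzer.PaodingAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.LockObtainFailedException;

public static void index() throws CorruptIndexException,
           LockObtainFailedException, IOException {
      File file = new File("c://index"); // directory in which the index is created
      if (!file.exists()) {    // if the directory does not exist
           file.mkdir();   // create it
      }

      // Create an index writer: the index is written under file, using the Paoding
      // analyzer (a third-party Chinese analyzer); true means an existing index is overwritten
      IndexWriter writer = new IndexWriter(file, new PaodingAnalyzer(), true);

      // Create a Document; a Document is roughly the equivalent of one object (record)
      Document document = new Document();
      String s = "好人一个";
      // Define a field named "name" whose value is the String s
      Field field = new Field("name", s, Field.Store.YES,
                 Field.Index.ANALYZED);

      // Create a second Document
      Document document1 = new Document();
      String s1 = "abcdef好人吗jjj";
      // Likewise, define a "name" field that holds the string s1
      Field field1 = new Field("name", s1, Field.Store.YES,
                 Field.Index.ANALYZED);

      document.add(field);   // add the first field to document
      document1.add(field1); // add the second field to document1

      // Write both Document objects to the index, then close the writer
      writer.addDocument(document);
      writer.addDocument(document1);
      writer.close();
}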



The code for reading (searching) the Lucene index is:

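// Create an index searcher that points at the index directory
IndexSearcher searcher = new IndexSearcher("c://index");

// The query parser needs an analyzer to tokenize the query text;
// "name" is the default Field that the query is run against
QueryParser queryParser = new QueryParser("name", new PaodingAnalyzer());

// Matches documents whose name Field contains a term starting with l or with h
// (note: this query is only parsed here; the search below runs query1)
Query query = queryParser.parse("l* OR h*");

// Wildcard query on the name Field: any term ending in 好
Term word1 = new Term("name", "*好");
WildcardQuery query1 = new WildcardQuery(word1);

// Fetch at most the top 20 hits and print the stored name value of each
TopDocs doc = searcher.search(query1, 20);
ScoreDoc[] docArray = doc.scoreDocs;
for (ScoreDoc d : docArray) {
      Document document = searcher.doc(d.doc);
      System.out.println(document.getField("name").stringValue());
}
searcher.close();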
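
A wildcard such as `*好` or `h*` is compared against the individual terms that the analyzer put into the index, not against the whole stored string. The following is a minimal sketch of that idea in plain Java, not Lucene's own implementation: `*` stands for any run of characters and `?` for exactly one. The class name `WildcardTermSketch` and the term list are hypothetical; the terms Paoding really emits for the two sample strings depend on its dictionary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Illustration only: wildcard matching against a list of already-tokenized terms. */
final class WildcardTermSketch {

    /** Translates a wildcard pattern ('*' = any run of characters, '?' = exactly one) into a regex. */
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                // Quote every literal character so regex metacharacters lose their meaning
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    /** Returns the terms that match the wildcard pattern in full. */
    static List<String> matchingTerms(String wildcard, List<String> terms) {
        Pattern pattern = toRegex(wildcard);
        List<String> matches = new ArrayList<>();
        for (String term : terms) {
            if (pattern.matcher(term).matches()) {
                matches.add(term);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical terms for "好人一个" and "abcdef好人吗jjj"; real analyzer output may differ
        List<String> terms = Arrays.asList("好人", "一个", "abcdef", "jjj");
        System.out.println(matchingTerms("好*", terms)); // terms that start with 好
        System.out.println(matchingTerms("*好", terms)); // terms that end with 好
    }
}
```

With this assumed term list, `好*` finds the term 好人 while `*好` finds nothing, because no term ends in 好. If the real index behaves the same way, the `*好` search above would print no results.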

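Analyzers (tokenizers): the two popular ones at present are IKAnalyzer and Paoding.

The tokens an analyzer produces must be meaningful words.

For example, given

我是一头狼

the analyzer splits it as follows:

我 | 是 | 一头 | 狼     ---1

Searching for '我', '是', '一头', or '狼' finds the text above.

Searching for '头狼' returns no result, because the query is matched against the tokens in ---1, and '头狼' is not one of them.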
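
The behaviour described above can be sketched with a toy inverted index in plain Java. This is an illustration under stated assumptions, not how Lucene stores its index: the class `TokenIndexSketch` is hypothetical, and the token list is copied by hand from segmentation ---1 instead of being produced by a real analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index: maps each token to the ids of the texts that contain it. */
final class TokenIndexSketch {
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<String> stored = new ArrayList<>();

    /** Stores the original text and indexes it under each of its tokens; returns its id. */
    int add(String text, List<String> tokens) {
        int docId = stored.size();
        stored.add(text);
        for (String token : tokens) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    /** Returns the stored texts that were indexed under exactly this term. */
    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        for (int docId : postings.getOrDefault(term, Collections.<Integer>emptySet())) {
            hits.add(stored.get(docId));
        }
        return hits;
    }

    public static void main(String[] args) {
        TokenIndexSketch index = new TokenIndexSketch();
        // Tokens taken from segmentation ---1; a real analyzer would generate them
        index.add("我是一头狼", Arrays.asList("我", "是", "一头", "狼"));
        System.out.println(index.search("一头")); // the text is found
        System.out.println(index.search("头狼")); // empty: 头狼 is not an indexed token
    }
}
```

The lookup is an exact match on tokens, so '头狼' fails even though those two characters appear next to each other in the original text.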

Original article: https://www.cnblogs.com/liaomin416100569/p/9332068.html