  • java lucene2.0

    I recently had to learn Java Lucene 2.0 for work. I am just getting started, so to keep it short, here are my notes for the record.

    1. Download Lucene 2.0
    http://lucene.apache.org/

    http://archive.apache.org/dist/lucene/java/

    lucene-2.0.0.zip
    lucene-core-2.0.0.jar

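    2. Set the CLASSPATH environment variable. It must contain the Lucene jar, plus the current directory (".") so that java can also find the classes compiled in the steps below.

    #export CLASSPATH=.:/home/tomcat/lucene-core-2.0.0.jar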
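
    Before going further, it can help to confirm that the JVM actually sees the jar. The following is only a minimal sketch: the class name ClasspathCheck and the helper isOnClasspath are made up for illustration, and org.apache.lucene.index.IndexWriter is used as the probe simply because the indexing program in this post imports it.

```java
// Sketch (hypothetical helper, not part of Lucene): report whether a class
// can be found on the classpath the JVM was started with.
class ClasspathCheck {
    // Returns true if the named class can be located by this class's loader.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class was found but could not be linked; treat it as unusable
            return false;
        }
    }

    public static void main(String[] args) {
        // Class imported by the indexing program in this post
        String lucene = "org.apache.lucene.index.IndexWriter";
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        System.out.println(lucene + (isOnClasspath(lucene) ? " found" : " NOT found"));
    }
}
```

    Compile and run it the same way as the other programs here (javac ClasspathCheck.java, then java ClasspathCheck). If it reports NOT found, the CLASSPATH setting most likely did not take effect in the current shell.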

    3. Write a small index-creation program for practice: write it, compile it, and run it, all in one go.

    #vi IndexTest.java

    import java.io.FileReader;
    import java.io.IOException;
    import java.io.Reader;
    import java.util.Date;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;


    public class IndexTest {
            public static void main(String[] args) throws Exception{
                    Date start = new Date();

                    // File to index
                    String path = "/lucene_test/doc/1.txt";

                    try {
                            // true = create a new index in this directory, replacing any existing one
                            IndexWriter indexWriter = new IndexWriter("/lucene_test/data/", new StandardAnalyzer(), true);

                            System.out.println("Indexing file " + path);

                            Document document = new Document();
                            Reader reader = new FileReader(path);

                            // "path" is stored so it can be printed with search results, but it is not indexed (not searchable)
                            document.add(new Field("path", path, Field.Store.YES, Field.Index.NO));
                            // "contents" is tokenized and indexed from the reader, but not stored
                            document.add(new Field("contents", reader));

                            indexWriter.addDocument(document);
                            indexWriter.optimize();
                            indexWriter.close();
                    } catch(IOException e) {
                            e.printStackTrace();
                    }

                    Date end = new Date();

                    System.out.print(end.getTime() - start.getTime());
                    System.out.println(" total milliseconds");
            }
    }

    #javac IndexTest.java

    #java IndexTest

    4. Write a small search program

    #vi SearchTest.java

    import java.io.File;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.Hits;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.TermQuery;
    import org.apache.lucene.store.FSDirectory;

    public class SearchTest {
            public static void main(String[] args) throws Exception {
                    // Must be the same directory that IndexTest wrote the index to
                    File indexDir = new File("/lucene_test/data/");
                    // Check for the index before trying to open it
                    if(!indexDir.exists()) {
                            System.out.println("index does not exist");
                            return;
                    }
                    FSDirectory directory = FSDirectory.getDirectory(indexDir, false);
                    IndexSearcher searcher = new IndexSearcher(directory);

                    // A TermQuery is not analyzed. StandardAnalyzer lowercased the terms at index time,
                    // so the term searched for here must be lowercase too.
                    Term term = new Term("contents", "twinkle");
                    TermQuery query = new TermQuery(term);
                    Hits hits = searcher.search(query);
                    for(int i=0; i<hits.length(); i++) {
                            Document document = hits.doc(i);
                            // getBoost() on a retrieved document does not reflect the boost used at index time;
                            // hits.score(i) gives the relevance score of the hit
                            System.out.println("File: " + document.get("path") + " " + String.valueOf(document.getBoost()));
                    }
                    searcher.close();
            }
    }
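
    The search above only works with a lowercase term because of how the index was built. The toy inverted index below illustrates the idea in plain Java. It is not Lucene code and not how Lucene is implemented: the class ToyIndex, its methods, and the sample sentence are all made up for illustration, and the "analysis" is reduced to splitting on non-alphanumeric characters and lowercasing.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy inverted index (hypothetical, for illustration only; this is NOT Lucene).
class ToyIndex {
    // term -> paths of the documents that contain the term
    private final Map<String, List<String>> postings = new HashMap<String, List<String>>();

    // Index time: split the text into tokens and lowercase each one,
    // a crude stand-in for what an analyzer does.
    void addDocument(String path, String contents) {
        for (String token : contents.split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue;
            }
            String term = token.toLowerCase(Locale.ROOT);
            List<String> paths = postings.get(term);
            if (paths == null) {
                paths = new ArrayList<String>();
                postings.put(term, paths);
            }
            if (!paths.contains(path)) {
                paths.add(path);
            }
        }
    }

    // Query time: exact lookup with no analysis, so case matters.
    List<String> termQuery(String term) {
        List<String> paths = postings.get(term);
        return paths == null ? Collections.<String>emptyList() : paths;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        // Sample text is an assumption; the post does not show the contents of 1.txt
        index.addDocument("/lucene_test/doc/1.txt", "Twinkle, twinkle, little star");
        System.out.println(index.termQuery("twinkle")); // prints [/lucene_test/doc/1.txt]
        System.out.println(index.termQuery("Twinkle")); // prints []
    }
}
```

    The second lookup finds nothing because only the lowercased form was ever put into the index, which is the same reason the TermQuery in SearchTest uses "twinkle" rather than "Twinkle".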

    5. I have no time to go deeper this week. Next week I plan to study properly how setBoost controls Lucene document scores.

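    Reference sites:

    http://www.chedong.com/tech/lucene.html Lucene: an introduction to the Java-based full-text search engine

    http://www.javaeye.com/topic/33241 Getting started with Lucene

    http://kb.cnblogs.com/b/243888/ Lucene source code analysis notes: [org.apache.lucene.document] (part 4)

    Conquering Ajax + Lucene: Building a Search Engine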

  • Original article: https://www.cnblogs.com/coffee_cn/p/1631447.html