zoukankan      html  css  js  c++  java
  • Lucene实战之初体验

    前言

         最早做非结构化数据搜索时用的还是lucene.net,一直说在学习java的同时把lucene这块搞一搞,这拖了2年多了,终于开始搞这块了。

    开发环境

        idea2016、lucene6.0、jdk1.8

    使用lucene准备条件

    1、pom.xml

    2、测试数据。  我从博客园首页拿了几条数据,直接写了个伪Dao返回一个List<Blog>

    public List<Blog> listBlogs(){
            List<Blog> list = new ArrayList<Blog>();
    
            Blog b1=new Blog(6538488,"风过无痕-唐","你真的了解volatile吗,关于volatile的那些事","http://www.cnblogs.com/tangyanbo/p/6538488.html");
            Blog b2=new Blog(6541312,"布鲁克石","TypeScript设计模式之职责链、状态","http://www.cnblogs.com/brookshi/p/6541312.html");
            Blog b7=new Blog(6547215,"嵘么么","轻松理解JavaScript闭包","http://www.cnblogs.com/rongmm/p/6547215.html");
            Blog b3=new Blog(6541220,"一线码农","使用Task的一些知识优化了一下同事的多线程协作取消的一串代码","http://www.cnblogs.com/huangxincheng/p/6541220.html");
    
            Blog b4=new Blog(6516387,"xybaby","并发与同步、信号量与管程、生产者消费者问题","http://www.cnblogs.com/xybaby/p/6516387.html");
            Blog b5=new Blog(6532713,"王福朋","深入理解 JavaScript 异步系列(4)—— Generator","http://www.cnblogs.com/wangfupeng1988/p/6532713.html");
            Blog b6=new Blog(6517788,"欢醉","入坑系列之HAProxy负载均衡","http://www.cnblogs.com/zhangs1986/p/6517788.html");
    
            list.add(b1);
            list.add(b2);
            list.add(b3);
            list.add(b4);
            list.add(b5);
            list.add(b6);
            list.add(b7);
    
            return list;
        }
    

    IndexSearcher搜索

    1、搜索体验,基于servlet开发,直接接受路径参数key

    2、先写入索引

    public class IndexWriterServlet extends HttpServlet {
    
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
            String index_path="d:\indexDir";
            Directory dir=null;
            Analyzer analyzer=null;
            IndexWriterConfig config=null;
            IndexWriter indexWriter=null;
            try {
                dir =new SimpleFSDirectory(Paths.get(index_path));
    
                analyzer=new StandardAnalyzer();
                config =new IndexWriterConfig(analyzer);
    
                indexWriter =new IndexWriter(dir,config);
                BlogDao blogDao=new BlogDao();
                List<Blog> list = blogDao.listBlogs();
    
                for (Blog blog:list){
    
                    Document document=new Document();
                    document.add(new Field("id",blog.getId().toString(),TextField.TYPE_STORED));
                    document.add(new Field("author",blog.getAuthor(),TextField.TYPE_STORED));
                    document.add(new Field("title",blog.getTitle(),TextField.TYPE_STORED));
                    document.add(new Field("url",blog.getUrl(),TextField.TYPE_STORED));
    
                    indexWriter.addDocument(document);
                }
                indexWriter.commit();
                System.out.println("====索引创建完成====");
            }catch (Exception ex){
                ex.printStackTrace();
            }finally {
                if(indexWriter !=null){
                    indexWriter.close();
                }
            }
        }
    
        @Override
        protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
            doGet(req,resp);
        }
    }
    

    2、查询索引数据

    public class IndexReaderServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
            Directory dir =null;
            DirectoryReader reader =null;
            IndexSearcher searcher=null;
            String index_path="d:\indexDir";
            try{
                //dir =new RAMDirectory();
                dir =new SimpleFSDirectory(Paths.get(index_path));
                reader = DirectoryReader.open(dir);
                searcher=new IndexSearcher(reader);
    
                String key = req.getParameter("key");
                Query query = new TermQuery(new Term("title",key));
    
                TopDocs rs=searcher.search(query,100);
                resp.setContentType("text/html");
                resp.setCharacterEncoding("UTF-8");
                PrintWriter out = resp.getWriter();
    
                StringBuilder strHtml =new StringBuilder();
                for (int i = 0; i < rs.scoreDocs.length; i++) {
                    Document firstHit = searcher.doc(rs.scoreDocs[i].doc);
                    strHtml.append("title:"+firstHit.getField("title").stringValue()+"<br>");
                    strHtml.append("author:"+firstHit.getField("author").stringValue()+"<br>");
                    strHtml.append("<a href='"+firstHit.getField("url").stringValue()+"'>"+ firstHit.getField("url").stringValue()+"</a><br>");
                    strHtml.append("====================================================<br>");
                }
    
    
                out.write(strHtml.toString());
    
            }catch (Exception ex){
                ex.printStackTrace();
            }finally {
                reader.close();
            }
        }
    
        @Override
        protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
            doGet(req,resp);
        }
    }
    

      

    总结

         读写索引用到的几个关键类:IndexWriter、IndexSearcher、Directory、DirectoryReader、Document、TermQuery、Field

  • 相关阅读:
    从零开始——PowerShell应用入门(全例子入门讲解)
    详解C# Tuple VS ValueTuple(元组类 VS 值元组)
    How To Configure VMware fencing using fence_vmware_soap in RHEL High Availability Add On——RHEL Pacemaker中配置STONITH
    DB太大?一键帮你收缩所有DB文件大小(Shrink Files for All Databases in SQL Server)
    SQL Server on Red Hat Enterprise Linux——RHEL上的SQL Server(全截图)
    SQL Server on Ubuntu——Ubuntu上的SQL Server(全截图)
    微软SQL Server认证最新信息(17年5月22日更新),感兴趣的进来看看哟
    Configure Always On Availability Group for SQL Server on RHEL——Red Hat Enterprise Linux上配置SQL Server Always On Availability Group
    3分钟带你了解PowerShell发展历程——PowerShell各版本资料整理
    由Find All References引发的思考。,
  • 原文地址:https://www.cnblogs.com/sword-successful/p/6547965.html
Copyright © 2011-2022 走看看