  • lucene vs zoie

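    A while ago I used the performance test code in zoie's perf package to compare the real-time search of lucene and zoie. The result surprised me: judging by the data, lucene is a better fit than zoie for typical real-time search scenarios.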

    zoie's perf package evaluates four metrics: search latency, indexing latency, indexing event rate, and indexing event size. Figure 1 shows the results for zoie, and Figure 2 shows the results for lucene nrt.

    Zoie Perf Console 2012-10-09 17-32-50

    Figure 1: zoie test data

    Zoie Perf Console 2012-10-09 17-34-29

    Figure 2: lucene nrt test data

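    The data make it easy to see that lucene wins on search response time, while zoie performs better when indexing data. In a reply to a comment on his blog post "Lucene's near-real-time search is fast!", Mike McCandless explained the difference between nrt and zoie: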

    The biggest difference is that Zoie aims for immediate consistency
    (reopen after every index change & next query), which I think very few
    apps really require, given how fast NRT is.
    Also, NRTCachingDir (caching small segments in RAM) achieves the
    biggest (in my opinion) benefit of Zoie, but with substantially less
    added complexity. Reducing complexity is important because it means
    less risk of bugs; for example, Zoie had some scary corruption bugs,
    which took quite some time to track down; see
    https://issues.apache.org/jira/browse/LUCENE-2729
    The other part of Zoie I remember is deferring resolving deletions to
    Lucene docIDs, and instead using a bloom filter to post-filter
    collected documents. While I understand the motivation for this
    ("immediate consistency") I think it's the wrong tradeoff since it
    necessarily slows down all searching (checking a bloom filter is more
    costly than Lucene's checking a bit set), not to mention the added RAM
    required for the bloom filter.
    Ie, it's better to spend more time during reopen to resolve the
    deletions, so that searches don't slow down.
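
The tradeoff described in the quote can be illustrated with a toy sketch. It is not code from Lucene or Zoie: the class names, the `doc_to_uid` mapping, the hash scheme, and the exact set backing the filter are all assumptions made for illustration. It only contrasts a per-hit bit test on a docID with a per-hit filter probe on an application-level UID.

```python
import hashlib


class BitSetDeletes:
    """Deletions already resolved to docIDs: one bit per document."""

    def __init__(self, max_doc):
        self.bits = bytearray((max_doc + 7) // 8)

    def delete(self, doc_id):
        self.bits[doc_id >> 3] |= 1 << (doc_id & 7)

    def is_deleted(self, doc_id):
        # A single bit test per collected hit.
        return bool(self.bits[doc_id >> 3] & (1 << (doc_id & 7)))


class BloomFilterDeletes:
    """Deletions kept as UIDs and checked per hit (hypothetical layout)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)
        self.exact = set()  # assumed exact set, used to rule out false positives

    def _positions(self, uid):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{uid}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def delete(self, uid):
        self.exact.add(uid)
        for pos in self._positions(uid):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def is_deleted(self, uid):
        # Several hash probes per hit, then an exact lookup on a possible match.
        for pos in self._positions(uid):
            if not self.bits[pos >> 3] & (1 << (pos & 7)):
                return False
        return uid in self.exact


def search_with_bitset(hits, deletes):
    """Drop deleted docs using the docID bit set."""
    return [doc_id for doc_id in hits if not deletes.is_deleted(doc_id)]


def search_with_bloom(hits, doc_to_uid, deletes):
    """Post-filter collected hits by looking up each hit's UID."""
    return [d for d in hits if not deletes.is_deleted(doc_to_uid[d])]
```

Both paths return the same hits, but the filter path pays for a UID lookup plus several hash probes on every collected document and keeps extra structures in memory, which is the per-search cost the quote objects to.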

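    In short, zoie's strong consistency and its deferred resolution of deletions make its search response time longer than lucene's. zoie's special design also adds code complexity, which makes bugs hard to track down. For users, documentation is scarce and reading the code takes time and effort, which I suspect is one reason it never became popular. Search scenarios with data updated as frequently as at linkedin are rare; in the more common case lucene nrt is fully up to the job, so I really think cntv and NetEase have no need for zoie...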

  • Original post: https://www.cnblogs.com/nanpo/p/2731713.html