  • [Repost] Lucene performance tuning, with data

    Although these numbers are quite old, they are still a useful reference:

    lucene.commit.batch.size=0
    lucene.commit.time.interval=0

    These properties enable batched commits. You can either set how many document changes a batch contains (a commit happens after X documents are modified) or set a time interval in milliseconds (a commit happens every X milliseconds).
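The two triggers described above can be sketched as a small policy object. This is only an illustration of the idea, not the portal's actual implementation: `CommitPolicy` and its method names are hypothetical, and it assumes that leaving both properties at 0 means every change is committed immediately.

```python
import time


class CommitPolicy:
    """Decide when to commit: after N changed documents, or after T milliseconds.

    A value of 0 disables the corresponding trigger, mirroring
    lucene.commit.batch.size=0 and lucene.commit.time.interval=0.
    Assumption: with both at 0, every change is committed right away.
    """

    def __init__(self, batch_size=0, time_interval_ms=0, clock=time.monotonic):
        self.batch_size = batch_size
        self.time_interval_ms = time_interval_ms
        self.clock = clock  # injectable so the time trigger can be tested
        self.pending = 0
        self.last_commit = clock()

    def on_document_changed(self):
        """Record one change; return True if a commit should happen now."""
        self.pending += 1
        if self.batch_size == 0 and self.time_interval_ms == 0:
            return self._commit()
        if self.batch_size and self.pending >= self.batch_size:
            return self._commit()
        elapsed_ms = (self.clock() - self.last_commit) * 1000
        if self.time_interval_ms and elapsed_ms >= self.time_interval_ms:
            return self._commit()
        return False

    def _commit(self):
        # Reset the counters; a real writer would call its commit here.
        self.pending = 0
        self.last_commit = self.clock()
        return True


policy = CommitPolicy(batch_size=3)
print([policy.on_document_changed() for _ in range(7)])
# [False, False, True, False, False, True, False]
```

Note that this sketch only checks the time trigger when a document changes; a real implementation would more likely use a background timer so that pending changes are committed even when no new documents arrive.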

    lucene.buffer.size=16

    This value is passed to IndexWriter's setRAMBufferSizeMB method. It is the maximum amount of memory, in megabytes, that Lucene uses to buffer documents before flushing them to disk; a higher number means fewer disk writes.
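To see why a larger buffer means fewer disk writes, here is a back-of-the-envelope sketch. It assumes a flush happens only when the RAM buffer fills up (commits also force flushes, which this ignores), and the 960 MB workload is a made-up figure for illustration. `estimated_flushes` is a hypothetical helper, not part of Lucene.

```python
import math


def estimated_flushes(total_indexed_mb, buffer_size_mb):
    """Rough number of flushes if one happens each time the RAM buffer fills."""
    if buffer_size_mb <= 0:
        raise ValueError("buffer_size_mb must be positive")
    return math.ceil(total_indexed_mb / buffer_size_mb)


# Hypothetical workload: 960 MB of buffered index data in total.
for size_mb in (16, 32, 48):
    print(size_mb, estimated_flushes(960, size_mb))
# 16 60
# 32 30
# 48 20
```

Fewer flushes do not translate linearly into speed, so treat these counts as a rough intuition rather than a prediction of reindex time.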

    These three new properties make reindexing faster. In my stress tests I measured roughly a 30% improvement in certain configurations. I also learned the following while tweaking the Lucene properties:

    A higher lucene.buffer.size helps a lot during reindexing, but there is an upper limit beyond which reindexing does not get any faster. For example, 32MB and 48MB gave the same results, but 32MB was much better than the default of 16MB.

    It's a bad idea to set lucene.merge.factor to a very high number: there will be many more disk writes, which degrades performance. Keep it at 10.

    Set lucene.autocommit.documents.interval to the number of documents in the index, so that only one commit happens. I expected this property to make a bigger difference, but it only made reindexing about 15% faster.

    Setting a higher lucene.optimize.interval can bring some improvement, but since the reindex process also runs searches, it's important to still optimize the index often during reindexing. You need to find a balance.

    My stress tests consisted of reindexing 30,000 blog entries on an Intel quad-core 2.66GHz machine with 4GB of RAM running 32-bit Ubuntu. The best result was around 4 minutes, with this configuration:

    lucene.commit.time.interval=30000
    lucene.merge.factor=10
    lucene.optimize.interval=10000
    lucene.buffer.size=48

    The worst result was 7:35 minutes (not counting the run where I set lucene.merge.factor to 1000):

    lucene.commit.time.interval=0
    lucene.merge.factor=50
    lucene.optimize.interval=1000
    lucene.buffer.size=16

    Here are the other results:

    lucene.commit.time.interval=0
    lucene.merge.factor=15
    lucene.optimize.interval=30000
    lucene.buffer.size=16
    7:30 minutes

    lucene.commit.time.interval=0
    lucene.merge.factor=10
    lucene.optimize.interval=100
    lucene.buffer.size=16
    7:18 minutes

    lucene.commit.time.interval=10000
    lucene.merge.factor=10
    lucene.optimize.interval=100
    lucene.buffer.size=16
    6:23 minutes

    lucene.commit.time.interval=1000
    lucene.merge.factor=10
    lucene.optimize.interval=100
    lucene.buffer.size=16
    6:00 minutes

    lucene.commit.time.interval=30000
    lucene.merge.factor=10
    lucene.optimize.interval=100
    lucene.buffer.size=32
    5:00 minutes

    lucene.commit.time.interval=30000
    lucene.merge.factor=10
    lucene.optimize.interval=100
    lucene.buffer.size=48
    5:00 minutes

    lucene.commit.time.interval=30000
    lucene.merge.factor=10
    lucene.optimize.interval=50000
    lucene.buffer.size=48
    5:00 minutes

    lucene.commit.time.interval=15000
    lucene.merge.factor=10
    lucene.optimize.interval=10000
    lucene.buffer.size=48
    5:00 minutes

    lucene.commit.time.interval=30000
    lucene.merge.factor=10
    lucene.optimize.interval=1000
    lucene.buffer.size=48
    4:30 minutes
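To compare the runs at a glance, the timings above can be put in a small script that sorts them and reports how much time each run saves relative to the worst one. The figures are copied from the runs listed here; the one assumption is that the "around 4 minutes" best run is entered as exactly 4:00. `to_seconds` and `RUNS` are names made up for this sketch.

```python
def to_seconds(mm_ss):
    """Convert an 'M:SS' or 'MM:SS' string to seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


# (commit.time.interval, merge.factor, optimize.interval, buffer.size, elapsed)
# Assumption: the "around 4 minutes" best run is recorded as 4:00.
RUNS = [
    (30000, 10, 10000, 48, "4:00"),
    (0, 50, 1000, 16, "7:35"),
    (0, 15, 30000, 16, "7:30"),
    (0, 10, 100, 16, "7:18"),
    (10000, 10, 100, 16, "6:23"),
    (1000, 10, 100, 16, "6:00"),
    (30000, 10, 100, 32, "5:00"),
    (30000, 10, 100, 48, "5:00"),
    (30000, 10, 50000, 48, "5:00"),
    (15000, 10, 10000, 48, "5:00"),
    (30000, 10, 1000, 48, "4:30"),
]

worst = max(to_seconds(run[4]) for run in RUNS)

print("commit  merge  optimize  buffer  time  vs worst")
for commit, merge, optimize, buffer_mb, elapsed in sorted(
    RUNS, key=lambda run: to_seconds(run[4])
):
    # Percentage of wall-clock time saved compared with the slowest run.
    saved = 100 * (worst - to_seconds(elapsed)) / worst
    print(f"{commit:>6}  {merge:>5}  {optimize:>8}  {buffer_mb:>6}  {elapsed:>4}  {saved:5.1f}%")
```

Read this way, every run that finished in 5:00 or less used a non-zero commit interval of at least 15000 ms together with a buffer of 32MB or 48MB, which matches the observations above about batching commits and raising lucene.buffer.size.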

     

  • Original article: https://www.cnblogs.com/jinzhao/p/2444440.html