  • Sphinx incremental indexing

    First, create a counter table that stores the latest record ID of the data table.

    CREATE TABLE `sph_counter` (
      `id` int(11) unsigned NOT NULL,
      `max_id` int(11) unsigned NOT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Max record ID for the Sphinx delta index';

    #Define the main index source

    source test
    {
        type                    = mysql
        sql_host                = localhost
        sql_user                = root
        sql_pass                = 8888
        sql_db                    = test
        sql_port                = 3306
        sql_query_pre            = SET NAMES utf8
        sql_query_pre           = REPLACE INTO sph_counter SELECT 1, MAX(id) FROM test where status=1 #store the current maximum record ID

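        #The main and delta queries must return the same columns, otherwise --merge fails with a schema mismatch
        sql_query = select * from test where id<(select max_id from sph_counter where id=1) and status = 1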

     ##Without the id< condition here, merging the indexes fails with a field count mismatch error:

     #FATAL: failed to merge index 'test_delta' into index 'test': fulltext fields count mismatch (me=/usr/local/sphinx/var/data/test, in=/usr/local/sphinx/var/data/test_delta, myfields=4, infields=5)
        sql_query_info = select * from test where id = $id
    }

    #Define the delta index source
    source test_delta : test
    {
            sql_query_pre = SET NAMES utf8
            sql_query = select * from test  where id>=(select max_id from sph_counter where id=1) and status = 1
            sql_query_info = select * from test where id = $id

    }
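The split between the two sources is easier to see on a toy dataset. The following is a minimal sketch, not part of the original setup: it uses Python's built-in sqlite3 in place of MySQL, and the `title` column and the five sample rows are hypothetical. It runs the same `REPLACE INTO sph_counter` statement and the same `id<` / `id>=` conditions as the two sources above.

```python
import sqlite3


def build_demo_db():
    """In-memory stand-in for the `test` and `sph_counter` tables (columns are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE test (id INTEGER PRIMARY KEY, title TEXT, status INTEGER);
        CREATE TABLE sph_counter (id INTEGER PRIMARY KEY, max_id INTEGER NOT NULL);
        """
    )
    conn.executemany(
        "INSERT INTO test (id, title, status) VALUES (?, ?, 1)",
        [(i, f"doc {i}") for i in range(1, 6)],
    )
    return conn


def snapshot_counter(conn):
    """Mirror the main source's sql_query_pre: remember the current maximum id."""
    conn.execute(
        "REPLACE INTO sph_counter SELECT 1, MAX(id) FROM test WHERE status = 1"
    )


def main_ids(conn):
    """Rows the main source would fetch: everything below the stored max_id."""
    sql = (
        "SELECT id FROM test "
        "WHERE id < (SELECT max_id FROM sph_counter WHERE id = 1) "
        "AND status = 1 ORDER BY id"
    )
    return [row[0] for row in conn.execute(sql)]


def delta_ids(conn):
    """Rows the delta source would fetch: the stored max_id and everything after it."""
    sql = (
        "SELECT id FROM test "
        "WHERE id >= (SELECT max_id FROM sph_counter WHERE id = 1) "
        "AND status = 1 ORDER BY id"
    )
    return [row[0] for row in conn.execute(sql)]
```

Calling `snapshot_counter` corresponds to a main index build. Rows inserted afterwards are picked up only by `delta_ids` until the counter is refreshed again, and every row belongs to exactly one of the two sets.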

    #Define the main index

    index test
    {
        source            = test            #name of the corresponding source
        path              = /usr/local/sphinx/var/data/test #change to the absolute path actually used, e.g. /usr/local/coreseek/var/...
        docinfo           = extern
        mlock             = 0
        morphology        = none
        min_word_len      = 2
        html_strip        = 1

        #Chinese word segmentation settings; for details see http://www.coreseek.cn/products-install/coreseek_mmseg/
        charset_dictpath  = /usr/local/mmseg/etc/ #for BSD and Linux; must end with /
        #charset_dictpath = etc/                  #for Windows; must end with /, preferably an absolute path, e.g. C:/usr/local/coreseek/etc/...
        charset_type      = zh_cn.utf-8
    }
    #Define the delta index
    index test_delta:test
    {
        source            = test_delta      #name of the corresponding source
        path              = /usr/local/sphinx/var/data/test_delta #change to the absolute path actually used, e.g. /usr/local/coreseek/var/...
        docinfo           = extern
        mlock             = 0
        morphology        = none
        min_word_len      = 2
        html_strip        = 1

        #Chinese word segmentation settings; for details see http://www.coreseek.cn/products-install/coreseek_mmseg/
        charset_dictpath  = /usr/local/mmseg/etc/ #for BSD and Linux; must end with /
        #charset_dictpath = etc/                  #for Windows; must end with /, preferably an absolute path, e.g. C:/usr/local/coreseek/etc/...
        charset_type      = zh_cn.utf-8
    }

    #Global indexer settings
    indexer
    {
        mem_limit         = 128M
    }

    #searchd service settings
    searchd
    {
        listen            = 9312
        read_timeout      = 5
        max_children      = 30
        max_matches       = 1000
        seamless_rotate   = 0
        preopen_indexes   = 0
        unlink_old        = 1
        pid_file = /usr/local/sphinx/var/log/searchd_mysql.pid  #change to the absolute path actually used, e.g. /usr/local/coreseek/var/...
        log = /usr/local/sphinx/var/log/searchd_mysql.log        #change to the absolute path actually used, e.g. /usr/local/coreseek/var/...
        query_log = /usr/local/sphinx/var/log/query_mysql.log #change to the absolute path actually used, e.g. /usr/local/coreseek/var/...
        binlog_path =                                #disable binary logging
    }

    Save the configuration file and exit, then stop the searchd process, start it again, and rebuild the indexes.

    Stop the process
    /usr/local/sphinx/bin/searchd -c /usr/local/sphinx/etc/csft.conf --stop

    Start the process
    /usr/local/sphinx/bin/searchd -c /usr/local/sphinx/etc/csft.conf

    Rebuild all indexes
    /usr/local/sphinx/bin/indexer -c /usr/local/sphinx/etc/csft.conf --all --rotate
    Build the delta index
    /usr/local/sphinx/bin/indexer -c /usr/local/sphinx/etc/csft.conf test_delta --rotate
    Merge the indexes
    /usr/local/sphinx/bin/indexer -c /usr/local/sphinx/etc/csft.conf --merge test test_delta --rotate
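If the delta build and the merge are always run as a pair, they can be wrapped in a small script. The following is a rough sketch, assuming the paths and index names used above; the helper names `delta_cmd`, `merge_cmd` and `refresh` are made up for illustration. It only assembles the same two indexer command lines and stops if the first one fails. It does not wait for searchd to finish rotating the delta before merging.

```python
import subprocess

# Paths taken from the commands above; adjust to the actual installation.
INDEXER = "/usr/local/sphinx/bin/indexer"
CONFIG = "/usr/local/sphinx/etc/csft.conf"


def delta_cmd(delta="test_delta"):
    """Command line that rebuilds the delta index."""
    return [INDEXER, "-c", CONFIG, delta, "--rotate"]


def merge_cmd(main="test", delta="test_delta"):
    """Command line that merges the delta index into the main index."""
    return [INDEXER, "-c", CONFIG, "--merge", main, delta, "--rotate"]


def refresh(run=subprocess.run):
    """Rebuild the delta, then merge it into main; stop at the first failure.

    `run` is injectable so the sequence can be exercised without Sphinx installed.
    """
    for cmd in (delta_cmd(), merge_cmd()):
        result = run(cmd)
        if result.returncode != 0:
            return False
    return True
```

Calling `refresh()` runs both steps in order and returns `False` as soon as indexer exits with a non-zero status, so a failed delta build is never merged.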

    If the following error appears when merging the indexes:

    FATAL: failed to merge index 'test_delta' into index 'test': source index preload failed: failed to open /usr/local/sphinx/var/data/test_delta.sph: No such file or directory

    stop the searchd process and then start it again.
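The error above complains about a missing `test_delta.sph` file, so a script that automates the merge could check for that file first. This is only a sketch: the function name `delta_is_ready` is hypothetical, and it tests just the `.sph` file named in the error message, not every file that makes up an index.

```python
import os


def delta_is_ready(index_path):
    """Return True if the header file <index_path>.sph exists on disk.

    `index_path` is the `path` value from the index definition,
    e.g. /usr/local/sphinx/var/data/test_delta.
    """
    return os.path.isfile(index_path + ".sph")
```

A wrapper could skip the merge step when `delta_is_ready("/usr/local/sphinx/var/data/test_delta")` returns `False`, rather than letting indexer fail.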

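    The delta index build can be scheduled in crontab to run every few minutes as needed, followed by an index merge; the main index rebuild can be scheduled for low-traffic periods or the middle of the night.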

    ##Build the delta index every 5 minutes

    */5 * * * * /usr/local/sphinx/bin/indexer -c /usr/local/sphinx/etc/csft.conf test_delta --rotate > /dev/null 2>&1

    ##Merge the delta index into the main index every 10 minutes

    */10 * * * * /usr/local/sphinx/bin/indexer -c /usr/local/sphinx/etc/csft.conf --merge test test_delta --rotate

    ##Rebuild the main index at 00:05

    5 0 * * * /usr/local/sphinx/bin/indexer -c /usr/local/sphinx/etc/csft.conf --all --rotate > /dev/null 2>&1
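With these schedules, the delta build and the merge start in the same minute every ten minutes. If the jobs are launched through a wrapper script, one way to keep them from overlapping is a non-blocking file lock. The following is a minimal sketch for Unix-like systems using Python's standard fcntl module; the name `single_instance` and the lock file location are assumptions, not part of the original setup.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path):
    """Yield True if this run obtained the lock, False if another run holds it."""
    handle = open(lock_path, "w")
    try:
        try:
            # Exclusive, non-blocking: fail immediately instead of waiting.
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            yield False
            return
        yield True
    finally:
        # Closing the file releases the lock if it was taken.
        handle.close()
```

A cron-driven script would wrap its indexer calls in `with single_instance("/tmp/sphinx_indexer.lock") as acquired:` and return early when `acquired` is `False`.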

  • Original article: https://www.cnblogs.com/latma/p/6019834.html