  • Sphinx Quick Start

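    1. Create the configuration file. You can copy one of the existing templates to a new file in the sphinx/etc directory.
      # MySQL data source configuration; for details see: http://www.coreseek.cn/products-install/mysql/
      # First import var/test/documents.sql into the database, then set the MySQL user, password, and database below
      
      # Source definition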
      source mysql
      {
          type                    = mysql
      
          sql_host                = localhost
          sql_user                = root
          sql_pass                = 
          sql_db                    = test
          sql_port                = 3306
          sql_query_pre            = SET NAMES utf8
      
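          sql_query                = SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, title, content FROM documents
                                                                    # the first column of sql_query (id) must be an integer
                                                                    # title and content are string/text fields and are full-text indexed
          sql_attr_uint            = group_id             # the value read from SQL must be an integer
          sql_attr_timestamp        = date_added          # the value read from SQL must be an integer; used as a timestamp attribute
      
          sql_query_info_pre      = SET NAMES utf8                                        # sets the correct character set for command-line queries
          sql_query_info            = SELECT * FROM documents WHERE id=$id                # fetches the original row from the database for command-line queries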
      }
      
      # Index definition
      index mysql
      {
          source            = mysql                   # name of the corresponding source
          path            = C:\wamp\apps\sphinx\var   # change to the absolute path actually used, e.g. /usr/local/coreseek/var/...
          docinfo            = extern
          mlock            = 0
          morphology        = none
          min_word_len        = 1
          html_strip                = 0
      
          # Chinese word segmentation settings; for details see: http://www.coreseek.cn/products-install/coreseek_mmseg/
          #charset_dictpath = /usr/local/mmseg3/etc/ # for BSD and Linux; must end with /
          charset_dictpath = C:\wamp\apps\sphinx\etc                        # for Windows; must end with /; an absolute path is best, e.g. C:/usr/local/coreseek/etc/...
          charset_type        = zh_cn.utf-8
      }
      
      # Global indexer definition
      indexer
      {
          mem_limit            = 128M
      }
      
      # searchd service definition
      searchd
      {
          listen                  =   9312
          read_timeout        = 5
          max_children        = 30
          max_matches            = 1000
          seamless_rotate        = 0
          preopen_indexes        = 0
          unlink_old            = 1
          pid_file = C:\wamp\apps\sphinx\var\log\searchd_mysql.pid    # change to the absolute path actually used, e.g. /usr/local/coreseek/var/...
          log = C:\wamp\apps\sphinx\var\log\searchd_mysql.log         # change to the absolute path actually used, e.g. /usr/local/coreseek/var/...
          query_log = C:\wamp\apps\sphinx\var\log\query_mysql.log     # change to the absolute path actually used, e.g. /usr/local/coreseek/var/...
          binlog_path =                                               # leave empty to disable the binlog
      }
    2. Install searchd as a Windows service:

      c:\wamp\apps\sphinx\bin>searchd --install --config c:\wamp\apps\sphinx\etc\sphinx_mysql.conf

       

        Coreseek Fulltext 4.0 [ Sphinx 1.11-dev (r2540)]
        Copyright (c) 2007-2011,
        Beijing Choice Software Technologies Inc (http://www.coreseek.com)

        Installing service...
        Service 'searchd' installed succesfully.

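        The searchd service should now appear in the list under Control Panel -> Administrative Tools -> Services. It has not been started yet, because before starting it we still need to finish the configuration and build the index with indexer. For those steps, see the quick start tutorial.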

    3. Build the index

      c:\wamp\apps\sphinx\bin>indexer --all --config C:\wamp\apps\sphinx\etc\sphinx.conf

      

      indexing index 'mysql'...
      collected 3 docs, 0.0 MB
      sorted 0.0 Mhits, 100.0% done
      total 3 docs, 7545 bytes
      total 0.026 sec, 287198 bytes/sec, 114.19 docs/sec
      total 2 reads, 0.000 sec, 4.2 kb/call avg, 0.0 msec/call avg
      total 9 writes, 0.000 sec, 2.2 kb/call avg, 0.0 msec/call avg

      Run indexer by itself, with no arguments, to see its help options:

    --config <file>         read configuration from specified file
                            (default is csft.conf)
    --all                   reindex all configured indexes
    --quiet                 be quiet, only print errors
    --verbose               verbose indexing issues report
    --noprogress            do not display progress
                            (automatically on if output is not to a tty)
    --buildstops <output.txt> <N>
                            build top N stopwords and write them to given file
    --buildfreqs            store words frequencies to output.txt
                            (used with --buildstops only)
    --merge <dst-index> <src-index>
                            merge 'src-index' into 'dst-index'
                            'dst-index' will receive merge result
                            'src-index' will not be modified
    --merge-dst-range <attr> <min> <max>
                            filter 'dst-index' on merge, keep only those documents
                            where 'attr' is between 'min' and 'max' (inclusive)
    --merge-klists
    --merge-killlists       merge src and dst kill-lists (default is to
                            apply src kill-list to dst index)
    --dump-rows <FILE>      dump indexed rows into FILE
    
    Examples:

    4. Run a search

      c:\wamp\apps\sphinx\bin>search.exe -c c:\wamp\apps\sphinx\etc\sphinx.conf twitter

      This specifies the configuration file with -c and searches for the keyword twitter.

    5. Integrate with PHP

    <?php
    /*
     * Test Sphinx search from PHP
     */
    include_once('sphinxapi.php');

    $sp = new SphinxClient();

    $sp->SetServer('127.0.0.1', 9312);  // searchd host and port
    $sp->SetConnectTimeout(5);          // connection timeout, in seconds
    $sp->SetLimits(0, 10);              // (offset, limit): return the first 10 matches

    // search keywords; fall back to an empty string when none were posted
    $keywords = isset($_POST['kw']) ? trim($_POST['kw']) : '';

    $res = $sp->Query($keywords, '*');  // '*' searches all indexes; pass an index name to search a specific one
    print_r($res);

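    The result looks like the following. The matches array holds the matching documents, and each key in it is the primary key (document id) of a match.

    Extract those keys and use them in an IN clause against the database to fetch the matching rows.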

    Array
    (
        [error] => 
        [warning] => 
        [status] => 0
        [fields] => Array
            (
                [0] => title
                [1] => content
            )
    
        [attrs] => Array
            (
                [group_id] => 1
                [date_added] => 2
            )
    
        [matches] => Array
            (
                [2] => Array
                    (
                        [weight] => 2
                        [attrs] => Array
                            (
                                [group_id] => 3
                                [date_added] => 1270135548
                            )
    
                    )
    
                [1] => Array
                    (
                        [weight] => 1
                        [attrs] => Array
                            (
                                [group_id] => 2
                                [date_added] => 1270131607
                            )
    
                    )
    
            )
    
        [total] => 2
        [total_found] => 2
        [time] => 0.001
        [words] => Array
            (
                [twitter] => Array
                    (
                        [docs] => 2
                        [hits] => 5
                    )
    
            )
    
    )
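
    The step of extracting the keys and querying the database with IN can be sketched roughly as follows. This is only an illustration under some assumptions: buildInQuery() is a hypothetical helper, not part of sphinxapi.php; the documents table and its columns are taken from the config in step 1; and the sample array just mimics the shape of the result printed above. ORDER BY FIELD() is MySQL-specific and is used here to keep the rows in Sphinx's relevance order.

```php
<?php
// Hypothetical helper: turn a SphinxClient::Query() result into a
// parameterized "IN (...)" query plus the values to bind to it.
function buildInQuery($res)
{
    // Query() returns false on failure; matches may also be missing or empty
    if ($res === false || empty($res['matches'])) {
        return null;
    }

    // The keys of matches are the document ids, already in relevance order
    $ids = array_map('intval', array_keys($res['matches']));
    $placeholders = implode(',', array_fill(0, count($ids), '?'));

    // FIELD() (MySQL) makes the rows come back in the same order as the ids
    $sql = "SELECT id, title, content FROM documents"
         . " WHERE id IN ($placeholders)"
         . " ORDER BY FIELD(id, $placeholders)";

    // The ids are bound twice: once for IN (...), once for FIELD(...)
    return array($sql, array_merge($ids, $ids));
}

// Sample data shaped like the result shown above (assumed, for illustration)
$res = array('matches' => array(
    2 => array('weight' => 2, 'attrs' => array('group_id' => 3, 'date_added' => 1270135548)),
    1 => array('weight' => 1, 'attrs' => array('group_id' => 2, 'date_added' => 1270131607)),
));

list($sql, $params) = buildInQuery($res);
echo $sql, "\n";
print_r($params);
```

    With a PDO connection in hand, the pair would typically be used as $stmt = $pdo->prepare($sql); $stmt->execute($params);. Binding the ids as parameters, rather than concatenating them into the SQL string, avoids injection problems even though the ids come from Sphinx.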
  • Original article: https://www.cnblogs.com/gophper/p/4398518.html