  • Full-text search with acts_as_ferret

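    acts_as_ferret is a plugin that adds full-text search to Rails. It is built on Ferret, a Ruby port of Apache Lucene. There are plenty of introductions and tutorials for acts_as_ferret online, and it was the most important full-text search plugin of the early Rails days. Expecting its foreign authors to support Chinese search, however, is wishful thinking, and the write-ups on JavaEye about adding Chinese support are not satisfactory either; all of them are old and steadily losing their value as references. So, before I give up on acts_as_ferret, here is a detailed walkthrough of how to use it for Chinese full-text search. Consider it a record for the day I might need it again.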

    Install the gems standalone

    Do not install it as a plugin, and do not unpack it into your project. I am not sure why, but this is a lesson I learned the hard way!

    gem install ferret -v=0.11.5 --platform mswin32
    gem install acts_as_ferret
    gem install rmmseg
    

    Then copy ferret_ext.so from C:\ruby\lib\ruby\gems\1.8\gems\ferret-0.11.5-x86-mswin32\ext to C:\ruby\lib\ruby\gems\1.8\gems\ferret-0.11.5-x86-mswin32\lib!

    Reference the gems

    Add the following to your project's environment.rb, inside the Rails::Initializer.run block:

      config.gem 'ferret'
      config.gem 'rmmseg'
      config.gem 'acts_as_ferret'
    

    Add index support to the models

    
    require 'rmmseg'
    require 'rmmseg/ferret'
    class Topic < ActiveRecord::Base
      # ... other implementation ...
      # To rebuild the index, delete the corresponding index folder and restart the server, or call Model.rebuild_index
      #======================= Search =====================
      acts_as_ferret({
          :fields => {
            :title => {
              :store => :yes,
              :boost=> 20 # field weight
            },
            :body => {
              :boost=> 1,
              :store => :yes,
              :term_vector => :with_positions_offsets
            },
            :author => {:store => :yes},
            :created_at_s => {:index => :untokenized,:store => :yes},
            :updated_at_s => {:index => :untokenized,:store => :yes}
          },
          :store_class_name=>true,
          :analyzer => RMMSeg::Ferret::Analyzer.new
        })
      def created_at_s
        created_at.to_s(:db)
      end
    
      def updated_at_s
        updated_at.to_s(:db)
      end
    
      def body
        first_post.body
      end
    # ... other implementation ...
    end
    
    require 'rmmseg'
    require 'rmmseg/ferret'
    class Post < ActiveRecord::Base
      # ... other implementation ...
      delegate :title, :to => :topic
      
      # To rebuild the index, delete the corresponding index folder and restart the server, or call Model.rebuild_index
      #======================= Search =====================
      acts_as_ferret({
          :fields => {
            :title => {
              :store => :yes,
              :boost=> 20 # field weight
            },
            :body => {
              :boost=> 1,
              :store => :yes,
              :term_vector => :with_positions_offsets
            },
            :author => {:store => :yes},
            :created_at_s => {:index => :untokenized,:store => :yes},
            :updated_at_s => {:index => :untokenized,:store => :yes}
          },
          :store_class_name => true,
          :analyzer => RMMSeg::Ferret::Analyzer.new
        })
    
      def created_at_s
        created_at.to_s(:db)
      end
    
      def updated_at_s
        updated_at.to_s(:db)
      end
    # ... other implementation ...
    end
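
    The created_at_s and updated_at_s fields above are indexed :untokenized, presumably so that each timestamp stays a single term that can be compared as a plain string. The sketch below shows why that works: the format produced by to_s(:db) sorts lexicographically in the same order as the times themselves. db_timestamp is a hypothetical stand-in written with plain strftime so the example runs without Rails.

```ruby
# Hypothetical stand-in for ActiveSupport's Time#to_s(:db) format.
def db_timestamp(time)
  time.strftime("%Y-%m-%d %H:%M:%S")
end

times = [
  Time.utc(2009, 7, 21, 9, 5, 0),
  Time.utc(2008, 12, 1, 18, 30, 0),
  Time.utc(2009, 7, 3, 0, 0, 0)
]

stamps = times.map { |t| db_timestamp(t) }

# Sorting the strings gives the same order as sorting the Time objects.
puts stamps.sort
```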
    

    Here, :analyzer => RMMSeg::Ferret::Analyzer.new is what gives us Chinese word segmentation.
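
    To illustrate why a dictionary-based analyzer matters for Chinese, which has no spaces between words, here is a minimal sketch of forward maximum matching. It is a much simpler relative of the MMSEG algorithm that RMMSeg implements, not RMMSeg's own code: the segment method and the tiny DICT are made up for this example, and the real analyzer uses a full dictionary plus extra disambiguation rules.

```ruby
require 'set'

# Hypothetical toy dictionary; a real segmenter ships with a far larger one.
DICT = Set.new(%w[全文 检索 全文检索 插件 中文])
MAX_WORD_LENGTH = DICT.map(&:length).max

# Forward maximum matching: at each position take the longest dictionary
# word that starts there; fall back to a single character if none matches.
def segment(text)
  chars = text.chars
  tokens = []
  i = 0
  while i < chars.length
    len = [MAX_WORD_LENGTH, chars.length - i].min
    len -= 1 while len > 1 && !DICT.include?(chars[i, len].join)
    tokens << chars[i, len].join
    i += len
  end
  tokens
end

p segment("中文全文检索插件")  # => ["中文", "全文检索", "插件"]
```

    Each token becomes a term in the index, so a query for 全文检索 can match the document even though the original text contains no word boundaries.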

    Build the search module

    ruby script/generate controller search index
    

    Add a routing rule.

     map.search '/search', :controller => 'search', :action => 'index'
    

    Edit search_controller.rb.

    class SearchController < ApplicationController
      def index
        @class = params[:class] || "topic"
        @query = params[:query] || ''
        unless @query.blank?
          if @class == "topic"
            @results = Topic.find_with_ferret @query
          else
            @results = Post.find_with_ferret @query
          end
        end
      end
    end
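
    The if/else on the class parameter can also be written as a lookup table, which keeps the action short if more searchable models are added later. This is only a sketch: Topic and Post below are stand-ins that imitate nothing but the class-level find_with_ferret call so the example runs without Rails, and SEARCHABLE and search are hypothetical names. Unlike the controller above, unknown values fall back to Topic rather than Post.

```ruby
# Stand-ins for the real models; only find_with_ferret is imitated here.
class Topic
  def self.find_with_ferret(query)
    ["topic hit for #{query}"]
  end
end

class Post
  def self.find_with_ferret(query)
    ["post hit for #{query}"]
  end
end

# Map the "class" request parameter to a model.
SEARCHABLE = { "topic" => Topic, "post" => Post }.freeze

def search(params)
  query = params[:query].to_s.strip
  return [] if query.empty?
  # Anything not listed in SEARCHABLE falls back to Topic.
  model = SEARCHABLE.fetch(params[:class].to_s, Topic)
  model.find_with_ferret(query)
end

p search({ :class => "post", :query => "ferret" })
p search({ :query => "ferret" })
```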
    
    

    Edit the corresponding view:

    <% form_tag '/search', :method => :get ,:style => "margin-left:40%" do %>
      <input type="radio" name="class" value = "topic" <%= @class == "topic"? 'checked="checked"':'' %>>Topics only
      <input type="radio" name="class" value = "post" <%= @class == "post"? 'checked="checked"':'' %>>All posts<br>
      <p>
        <%= text_field_tag :query, @query %>
        <%= submit_tag "Search", :name => nil %>
      </p>
    <% end %>
    
    <% if defined? @results %>
      <style type="text/css">
        .hilite{
          color:#0042BD;
          background:#F345CC;
        }
      </style>
    
      <div id="search_result">
        <% @results.each do |result| %>
          <h3>
            <%= result.highlight(@query,:field => :title,:pre_tag => "<span class='hilite'>",:post_tag => "</span>")%>
          </h3>
          <div><%= result.highlight(@query,:field => :body,:num_excerpts => 3,:excerpt_length => 250) %></div>
          <p>Author: <%=result.author %>  Posted: <%= result.created_at_s %></p>
        <% end %>
      </div>
    <% end %>
    

    Another highlighting approach.

      def hilight(a,b)
        # a is the string to highlight and b is the phrase to mark; matches are wrapped in the hilite class
        highlight a, b, '<span class="hilite">\1</span>'
      end
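
    In spirit, this kind of highlighting is a case-insensitive substitution that wraps each match in a tag. The sketch below shows the idea in plain Ruby so it can be tried outside a view; hilite is a hypothetical method name and is not how Rails implements its highlight helper. It also escapes the text first, which the view does separately with h.

```ruby
require 'cgi'

# Wrap every case-insensitive occurrence of phrase in a span with the
# hilite class. The text is HTML-escaped first so user content stays inert.
def hilite(text, phrase)
  escaped = CGI.escapeHTML(text)
  return escaped if phrase.to_s.empty?
  pattern = Regexp.new(Regexp.escape(CGI.escapeHTML(phrase)), Regexp::IGNORECASE)
  escaped.gsub(pattern) { |match| %(<span class="hilite">#{match}</span>) }
end

puts hilite("Ferret 全文检索 with ferret", "ferret")
```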
    
    
      
    <% @results.each do |result| %>

    <%= hilight h(result.title),@query %>

    <%= hilight simple_format(truncate(result.body,:length => 250)), @query %>

    Author: <%= hilight h(result.author),@query %> Posted: <%= result.created_at_s %>

    <% end %>

    Pagination

    Add to application_controller.rb:

      def pages_for(result,options = {})
        # params[:page] arrives as a string, so convert it before doing arithmetic
        page, per_page, total = (options[:page] || 1).to_i,(options[:per_page] || 30),(result.total_hits || 0)
        index = (page - 1) * per_page
        returning WillPaginate::Collection.new(page, per_page, total) do |pager|
          # slicing past the last hit returns nil, so fall back to an empty page
          pager.replace(result[index,per_page] || [])
        end
      end
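
    The offset arithmetic inside pages_for can be checked on its own with a plain array. This is a minimal sketch; page_slice is a hypothetical helper and uses neither will_paginate nor Ferret.

```ruby
# Return the slice of hits that belongs on the given 1-based page.
def page_slice(hits, page, per_page)
  page = [page.to_i, 1].max        # request parameters arrive as strings
  offset = (page - 1) * per_page
  hits[offset, per_page] || []     # Array#[] returns nil past the end
end

hits = %w[a b c d e f g]
p page_slice(hits, "1", 3)  # => ["a", "b", "c"]
p page_slice(hits, "3", 3)  # => ["g"]
p page_slice(hits, "4", 3)  # => []
```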
    

    Update the controller:

    class SearchController < ApplicationController
      def index
        @class = params[:class] || "topic"
        @query = params[:query] || ''
        unless @query.blank?
          if @class == "topic"
            results = Topic.find_with_ferret @query
            @results = pages_for(results  ,:per_page => 3,:page=> (params[:page] || 1))
          else
            results = Post.find_with_ferret @query
            @results = pages_for(results  ,:per_page => 3,:page=> (params[:page] || 1))
          end
        end
      end
    end
    

    Add one line at the very bottom of the corresponding view (this uses the will_paginate plugin):

    <% form_tag '/search', :method => :get ,:style => "margin-left:40%" do %>
      <input type="radio" name="class" value = "topic" <%= @class == "topic"? 'checked="checked"':'' %>>Topics only
      <input type="radio" name="class" value = "post" <%= @class == "post"? 'checked="checked"':'' %>>All posts<br>
      <p>
        <%= text_field_tag :query, @query %>
        <%= submit_tag "Search", :name => nil %>
      </p>
    <% end %>
    
    <% if defined? @results %>
      <style type="text/css">
        .hilite{
          color:#0042BD;
          background:#F345CC;
        }
      </style>
    
      <div id="search_result">
        <% @results.each do |result| %>
          <h3>
            <%= result.highlight(@query,:field => :title,:pre_tag => "<span class='hilite'>",:post_tag => "</span>")%>
          </h3>
          <div><%= result.highlight(@query,:field => :body,:num_excerpts => 3,:excerpt_length => 250) %></div>
          <p>Author: <%=result.author %>  Posted: <%= result.created_at_s %> Relevance: <%= number_to_percentage result.ferret_score*100,:precision => 2 %></p>
        <% end %>
      </div>
      <%= will_paginate @results ,:class => "non_ajax"%>
    <% end %>
    

    Production environment

    • In the model, pass :remote => true to acts_as_ferret
    • Copy ferret_server.yml from vendor/plugins/acts_as_ferret/config/ to config/
    • Run ruby script/runner vendor/plugins/acts_as_ferret/script/ferret_server -e production

    Some useful links

    http://www.pluitsolutions.com/2007/07/30/acts-as-ferret-drbserver-win32-service/

    http://ferret.davebalmain.com/api/classes/Ferret/Index.html

  • Original post: https://www.cnblogs.com/rubylouvre/p/1528544.html