  • [ Solr Basics ] Adding Chinese Word Segmentation (IKAnalyzer) to schema.xml

    http://www.cnblogs.com/huangfox/archive/2012/02/08/2342881.html

    The article linked above described how to deploy Solr into Eclipse. This post builds on that setup and adds IKAnalyzer.

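    1. Download the IKAnalyzer source code and copy it into the solr3.5 project (the original post shows a screenshot of the resulting project layout here).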

    2. Configure IKAnalyzer in schema.xml

    <!-- IKAnalyzer3.2.8 Chinese word segmentation -->
    <fieldType name="text" class="solr.TextField">
        <!-- Index time: isMaxWordLength="false" selects finest-grained segmentation -->
        <analyzer type="index">
            <tokenizer class="org.wltea.analyzer.solr.IKTokenizerFactory" isMaxWordLength="false"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
            <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>
        <!-- Query time: isMaxWordLength="true" selects maximum-word-length segmentation -->
        <analyzer type="query">
            <tokenizer class="org.wltea.analyzer.solr.IKTokenizerFactory" isMaxWordLength="true"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
            <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
            <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>
    </fieldType>
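
    The fieldType only takes effect once a field refers to it. A minimal sketch of such a declaration, assuming a hypothetical field named content (the field name is illustrative and does not come from the original post); it would go in the <fields> section of schema.xml:

    ```xml
    <!-- Hypothetical field: "content" is an assumed name.
         type="text" points at the IKAnalyzer-backed fieldType declared above. -->
    <field name="content" type="text" indexed="true" stored="true"/>
    ```

    With a field like this in place, text indexed into or queried against content would pass through the index and query analyzer chains respectively.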
    

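    3. Start Solr and verify

    On the analysis page, set Field to type and enter text (the name of the fieldType defined above), type some Chinese text into Field value, and click Analyze to see the segmentation result.

    The verbose output option shows detailed information about each token.

    For details on configuring schema.xml, see the Solr wiki: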

    http://wiki.apache.org/solr/SchemaXml

    Data Types
    
    The <types> section allows you to define a list of <fieldtype> declarations you wish to use in your schema, along with the underlying Solr class that should be used for that type, as well as the default options you want for fields that use that type.
    
    Any subclass of FieldType may be used as a field type class, using either its full package name, or the "solr" alias if it is in the default Solr package. For common numeric types (integer, float, etc...) there are multiple implementations provided depending on your needs. Please see SolrPlugins for information on how to ensure that your own custom Field Types can be loaded into Solr.
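
    As an illustration of the two ways to name a field type class, the following sketch declares the same underlying class twice. The names string_alias and string_fq are hypothetical and chosen only for this example:

    ```xml
    <!-- Short form: the "solr." alias resolves to Solr's default packages. -->
    <fieldType name="string_alias" class="solr.StrField"/>

    <!-- Long form: the same class, referenced by its full package name. -->
    <fieldType name="string_fq" class="org.apache.solr.schema.StrField"/>
    ```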
    
    Common options that field types can have are...
    sortMissingLast=true|false
    sortMissingFirst=true|false
    indexed=true|false
    stored=true|false
    multiValued=true|false
    omitNorms=true|false
    omitTermFreqAndPositions=true|false  (since Solr 1.4)
    omitPositions=true|false  (since Solr 3.4)
    positionIncrementGap=N
    TextFields can also support Analyzers with highly configurable Tokenizers and Token Filters.
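
    A hedged sketch of how a few of these options might appear in schema.xml. The names string_sorted, text_gap, and tags are hypothetical, and the option values are illustrative rather than recommendations:

    ```xml
    <!-- Documents missing a value sort after those with one; norms are not stored. -->
    <fieldType name="string_sorted" class="solr.StrField"
               sortMissingLast="true" omitNorms="true"/>

    <!-- positionIncrementGap inserts a position gap between the values of a
         multiValued field, so phrase queries do not match across values. -->
    <fieldType name="text_gap" class="solr.TextField" positionIncrementGap="100">
        <analyzer>
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>
    </fieldType>

    <!-- Options such as indexed, stored, and multiValued can also be set per field. -->
    <field name="tags" type="text_gap" indexed="true" stored="true" multiValued="true"/>
    ```

    Options set on the fieldType act as defaults for every field of that type, and an individual field declaration can override them.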
    
    Field types that store text (TextField, StrField) support compression of stored contents:
    
    compressed=true|false
    compressThreshold=<integer>
    compressThreshold is the minimum length required for text compression to be invoked. This applies only if compressed=true; a common pattern is to set compressThreshold on the field type definition, and turn compression on and off in the individual field definitions.
    

      

  • Original article: https://www.cnblogs.com/huangfox/p/2342915.html