  • Solr feature: Schemaless Mode (automatically adding fields to the schema)

    Wiki: https://cwiki.apache.org/confluence/display/solr/Schemaless+Mode

    Introduction:

    Schemaless Mode is a set of Solr features that, when used together, allow users to rapidly construct an effective schema by simply indexing sample data, without having to manually edit the schema. These Solr features, all specified in solrconfig.xml, are:

    1. Managed schema: Schema modifications are made through Solr APIs rather than manual edits (see Managed Schema Definition in SolrConfig).
    2. Field value class guessing: Previously unseen fields are run through a cascading set of value-based parsers, which guess the Java class of field values. Parsers for Boolean, Integer, Long, Float, Double, and Date are currently available.
    3. Automatic schema field addition, based on field value class(es): Previously unseen fields are added to the schema, based on field value Java classes, which are mapped to schema field types (see Solr Field Types).

    Configuration:

    1. Enable Managed Schema

    As described in the section Managed Schema Definition in SolrConfig, changing the schemaFactory will allow the schema to be modified by the Schema API. Your solrconfig.xml should have a section like the one below (and the ClassicIndexSchemaFactory should be commented out or removed).

    <schemaFactory class="ManagedIndexSchemaFactory">
      <bool name="mutable">true</bool>
      <str name="managedSchemaResourceName">managed-schema</str>
    </schemaFactory>
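
    A minimal sketch of what "commented out" might look like in practice, assuming the existing configuration declared the classic factory as a single self-closing element. Only one schemaFactory should remain active:

    <!-- Assumed earlier declaration, disabled so that the managed factory below takes effect -->
    <!-- <schemaFactory class="ClassicIndexSchemaFactory"/> -->
    <schemaFactory class="ManagedIndexSchemaFactory">
      <bool name="mutable">true</bool>
      <str name="managedSchemaResourceName">managed-schema</str>
    </schemaFactory>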

    2. Define an UpdateRequestProcessorChain

    The UpdateRequestProcessorChain is what lets Solr guess field types, and it is also where you define the default field type to use when no parser matches. To start, you should define it as follows:

    <updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
      <!-- UUIDUpdateProcessorFactory will generate an id if none is present in the incoming document -->
      <processor class="solr.UUIDUpdateProcessorFactory" />
      <processor class="solr.LogUpdateProcessorFactory"/>
      <processor class="solr.DistributedUpdateProcessorFactory"/>
      <processor class="solr.RemoveBlankFieldUpdateProcessorFactory"/>
      <processor class="solr.FieldNameMutatingUpdateProcessorFactory">
        <str name="pattern">[^\w-\.]</str>
        <str name="replacement">_</str>
      </processor>
      <processor class="solr.ParseBooleanFieldUpdateProcessorFactory"/>
      <processor class="solr.ParseLongFieldUpdateProcessorFactory"/>
      <processor class="solr.ParseDoubleFieldUpdateProcessorFactory"/>
      <processor class="solr.ParseDateFieldUpdateProcessorFactory">
        <arr name="format">
          <str>yyyy-MM-dd'T'HH:mm:ss.SSSZ</str>
          <str>yyyy-MM-dd'T'HH:mm:ss,SSSZ</str>
          <str>yyyy-MM-dd'T'HH:mm:ss.SSS</str>
          <str>yyyy-MM-dd'T'HH:mm:ss,SSS</str>
          <str>yyyy-MM-dd'T'HH:mm:ssZ</str>
          <str>yyyy-MM-dd'T'HH:mm:ss</str>
          <str>yyyy-MM-dd'T'HH:mmZ</str>
          <str>yyyy-MM-dd'T'HH:mm</str>
          <str>yyyy-MM-dd HH:mm:ss.SSSZ</str>
          <str>yyyy-MM-dd HH:mm:ss,SSSZ</str>
          <str>yyyy-MM-dd HH:mm:ss.SSS</str>
          <str>yyyy-MM-dd HH:mm:ss,SSS</str>
          <str>yyyy-MM-dd HH:mm:ssZ</str>
          <str>yyyy-MM-dd HH:mm:ss</str>
          <str>yyyy-MM-dd HH:mmZ</str>
          <str>yyyy-MM-dd HH:mm</str>
          <str>yyyy-MM-dd</str>
        </arr>
      </processor>
      <processor class="solr.AddSchemaFieldsUpdateProcessorFactory">
        <str name="defaultFieldType">strings</str>
        <lst name="typeMapping">
          <str name="valueClass">java.lang.Boolean</str>
          <str name="fieldType">booleans</str>
        </lst>
        <lst name="typeMapping">
          <str name="valueClass">java.util.Date</str>
          <str name="fieldType">tdates</str>
        </lst>
        <lst name="typeMapping">
          <str name="valueClass">java.lang.Long</str>
          <str name="valueClass">java.lang.Integer</str>
          <str name="fieldType">tlongs</str>
        </lst>
        <lst name="typeMapping">
          <str name="valueClass">java.lang.Number</str>
          <str name="fieldType">tdoubles</str>
        </lst>
      </processor>
      <processor class="solr.RunUpdateProcessorFactory"/>
    </updateRequestProcessorChain>
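
    To illustrate how this chain behaves, here is a hypothetical document in Solr's XML update format. The field names and values are invented for this sketch and are not part of the original configuration:

    <add>
      <doc>
        <field name="id">album-1</field>
        <!-- No parser matches, so the value stays a String -->
        <field name="title">Sample Album</field>
        <!-- ParseBooleanFieldUpdateProcessorFactory: java.lang.Boolean -->
        <field name="in_stock">true</field>
        <!-- ParseLongFieldUpdateProcessorFactory: java.lang.Long -->
        <field name="year">1975</field>
        <!-- Not a long, so ParseDoubleFieldUpdateProcessorFactory: java.lang.Double -->
        <field name="price">19.95</field>
        <!-- Matches the yyyy-MM-dd format: java.util.Date -->
        <field name="released">2014-04-01</field>
      </doc>
    </add>

    Assuming none of these fields exist yet and the schema already defines the id field, the typeMapping entries above would lead to field definitions roughly equivalent to:

    <field name="title" type="strings"/>
    <field name="in_stock" type="booleans"/>
    <field name="year" type="tlongs"/>
    <field name="price" type="tdoubles"/>
    <field name="released" type="tdates"/>

    Note that the guess is made from the first value seen for a field, so the sample data should be representative: a price first indexed as 20 would be mapped to tlongs rather than tdoubles.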

    3. Make the UpdateRequestProcessorChain the Default for the UpdateRequestHandler

    Once the UpdateRequestProcessorChain has been defined, you must instruct your UpdateRequestHandlers to use it when working with index updates (i.e., adding, removing, replacing documents). Here is an example using InitParams to set the defaults on all /update request handlers:

    <initParams path="/update/**">
      <lst name="defaults">
        <str name="update.chain">add-unknown-fields-to-the-schema</str>
      </lst>
    </initParams>
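
    If only a single handler should use the chain, a narrower alternative is to set update.chain in that handler's own defaults. This is a sketch, assuming the handler is declared explicitly under the name /update with the class solr.UpdateRequestHandler:

    <requestHandler name="/update" class="solr.UpdateRequestHandler">
      <lst name="defaults">
        <!-- Same chain name as defined in step 2 -->
        <str name="update.chain">add-unknown-fields-to-the-schema</str>
      </lst>
    </requestHandler>

    Because update.chain is an ordinary request parameter, a client can still override this default on an individual update request.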
  • Original article: https://www.cnblogs.com/lvfeilong/p/345dgdfgf.html