  • Setting the similarity algorithm for queries in Elasticsearch

    similarity

    Elasticsearch allows you to configure a scoring algorithm, or similarity, per field. The similarity setting provides a simple way of choosing a similarity algorithm other than the default BM25, such as TF/IDF.

    Similarities are mostly useful for text fields, but can also apply to other field types.

    Custom similarities can be configured by tuning the parameters of the built-in similarities. For more details about these expert options, see the similarity module.
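
    As a rough illustration of tuning a built-in similarity, the sketch below builds a create-index body that defines a BM25 variant with its own k1 and b values and assigns it to one field. The names custom_bm25_index_body, my_bm25 and title are hypothetical and not from the original text; the k1 and b options are described in the similarity module, and the range checks are only sanity checks added here.

```python
import json


def custom_bm25_index_body(name="my_bm25", k1=1.2, b=0.75):
    """Build a create-index body with a tuned BM25 similarity.

    `name` is a hypothetical label for the custom similarity; the
    defaults for k1 and b mirror the usual BM25 defaults.
    """
    if k1 < 0:
        raise ValueError("k1 must be non-negative")
    if not 0.0 <= b <= 1.0:
        raise ValueError("b must be between 0 and 1")
    return {
        "settings": {
            "index": {
                "similarity": {
                    # The custom similarity is declared once in the settings...
                    name: {"type": "BM25", "k1": k1, "b": b}
                }
            }
        },
        "mappings": {
            "my_type": {
                "properties": {
                    # ...and referenced by name from a (hypothetical) field.
                    "title": {"type": "text", "similarity": name}
                }
            }
        },
    }


body = custom_bm25_index_body(k1=1.6, b=0.5)
print(json.dumps(body, indent=2))
```

    The printed JSON could be sent as the body of a PUT my_index request when the index is created.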

    The only similarities which can be used out of the box, without any further configuration are:

    BM25
    The Okapi BM25 algorithm. The algorithm used by default in Elasticsearch and Lucene. See Pluggable Similarity Algorithms for more information.
    classic
    The TF/IDF algorithm which used to be the default in Elasticsearch and Lucene. See Lucene’s Practical Scoring Function for more information.
    boolean
    A simple boolean similarity, which is used when full-text ranking is not needed and the score should only be based on whether the query terms match or not. Boolean similarity gives terms a score equal to their query boost.
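
    To make the difference between the three similarities more concrete, here is a simplified single-term sketch of each. It assumes the commonly cited forms of the formulas and leaves out queryNorm(), coord(), query boosts for BM25 and classic, and the lossy encoding of field norms, so the numbers will not match what Elasticsearch reports. All function and parameter names are hypothetical.

```python
import math


def bm25_score(freq, doc_len, avg_doc_len, doc_count, doc_freq, k1=1.2, b=0.75):
    """Simplified BM25 score of one term in one document."""
    idf = math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalisation: longer-than-average documents are penalised.
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    # Term frequency saturates: the fraction approaches (k1 + 1) as freq grows.
    return idf * freq * (k1 + 1) / (freq + length_norm)


def classic_score(freq, doc_len, doc_count, doc_freq):
    """Simplified TF/IDF score (no queryNorm, no coord)."""
    tf = math.sqrt(freq)
    idf = 1 + math.log((doc_count + 1) / (doc_freq + 1))
    field_norm = 1 / math.sqrt(doc_len)
    return tf * idf * idf * field_norm


def boolean_score(boost=1.0):
    """Boolean similarity: a matching term scores exactly its query boost."""
    return boost


# Compare how each score reacts to a growing term frequency.
for freq in (1, 2, 4, 8, 16):
    print(
        freq,
        round(bm25_score(freq, 100, 100, 1000, 10), 3),
        round(classic_score(freq, 100, 1000, 10), 3),
        boolean_score(),
    )
```

    In this sketch the BM25 score levels off as the term frequency grows, the classic score keeps growing with the square root of the frequency, and the boolean score ignores the frequency entirely.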

    The similarity can be set on the field level when a field is first created, as follows:

    PUT my_index
    {
      "mappings": {
        "my_type": {
          "properties": {
            "default_field": { 
              "type": "text"
            },
            "classic_field": {
              "type": "text",
              "similarity": "classic" 
            },
            "boolean_sim_field": {
              "type": "text",
              "similarity": "boolean" 
            }
          }
        }
      }
    }

    The default_field uses the BM25 similarity.

    The classic_field uses the classic similarity (i.e. TF/IDF).

    The boolean_sim_field uses the boolean similarity.
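
    The rule illustrated by these three fields (a field without a similarity setting falls back to the default) can be sketched as a small helper that reads a mapping body and reports the similarity each field ends up with. The helper effective_similarities is hypothetical, and it assumes the index-level default has not been changed from BM25.

```python
def effective_similarities(index_body, default="BM25"):
    """Return {field name: similarity name} for every mapped field.

    Assumes the index-level default similarity is `default`; fields
    without an explicit "similarity" entry inherit it.
    """
    result = {}
    for type_mapping in index_body.get("mappings", {}).values():
        for field, spec in type_mapping.get("properties", {}).items():
            result[field] = spec.get("similarity", default)
    return result


# The same mapping as in the request above.
index_body = {
    "mappings": {
        "my_type": {
            "properties": {
                "default_field": {"type": "text"},
                "classic_field": {"type": "text", "similarity": "classic"},
                "boolean_sim_field": {"type": "text", "similarity": "boolean"},
            }
        }
    }
}

print(effective_similarities(index_body))
```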

    Default and Base Similarities

    By default, Elasticsearch will use whatever similarity is configured as default. However, the similarity functions queryNorm() and coord() are not per-field. Consequently, for expert users wanting to change the implementation used for these two methods, while not changing the default, it is possible to configure a similarity with the name base. This similarity will then be used for the two methods.

    You can change the default similarity for all fields in an index when it is created:

    PUT /my_index
    {
      "settings": {
        "index": {
          "similarity": {
            "default": {
              "type": "classic"
            }
          }
        }
      }
    }

    If you want to change the default similarity after creating the index, you must close the index, send the following request, and then open it again:

    PUT /my_index/_settings
    {
      "settings": {
        "index": {
          "similarity": {
            "default": {
              "type": "classic"
            }
          }
        }
      }
    }
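
    The close, update, open sequence could be scripted roughly as follows. This is a sketch under assumptions: the helper names and the base URL are hypothetical, it uses only the Python standard library rather than an official client, and it relies on the _close and _open index endpoints. Printing the steps needs no cluster; the send helper is only defined, not called.

```python
import json
import urllib.request


def change_default_similarity_requests(index, similarity_type):
    """Return the ordered (method, path, body) steps for the change."""
    settings = {
        "settings": {
            "index": {"similarity": {"default": {"type": similarity_type}}}
        }
    }
    return [
        ("POST", f"/{index}/_close", None),      # 1. close the index
        ("PUT", f"/{index}/_settings", settings),  # 2. update the settings
        ("POST", f"/{index}/_open", None),       # 3. open it again
    ]


def send(base_url, method, path, body=None):
    """Send one step to a cluster, e.g. base_url="http://localhost:9200" (assumed)."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


steps = change_default_similarity_requests("my_index", "classic")
for method, path, body in steps:
    print(method, path, json.dumps(body) if body is not None else "")
```

    Against a running cluster, each step would be passed to send in order, stopping at the first failure so the index is not reopened with half-applied settings.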

    Source: https://www.elastic.co/guide/en/elasticsearch/reference/5.4/index-modules-similarity.html
  • Original article: https://www.cnblogs.com/bonelee/p/7451929.html