zoukankan      html  css  js  c++  java
  • elasticsearch 深入 —— normalizer

    keyword字段的normalizer属性类似于分析器,只是它保证分析链生成单个token。

    索引关键字之前,以及在通过诸如match查询之类的查询解析器或者通过诸如term查询之类的术语级查询搜索keyword字段时的搜索,应用规范化器——normalizer。

    PUT index
    {
      "settings": {
        "analysis": {
          "normalizer": {
            "my_normalizer": {
              "type": "custom",
              "char_filter": [],
              "filter": ["lowercase", "asciifolding"]
            }
          }
        }
      },
      "mappings": {
        "_doc": {
          "properties": {
            "foo": {
              "type": "keyword",
              "normalizer": "my_normalizer"
            }
          }
        }
      }
    }
    
    PUT index/_doc/1
    {
      "foo": "BÀR"
    }
    
    PUT index/_doc/2
    {
      "foo": "bar"
    }
    
    PUT index/_doc/3
    {
      "foo": "baz"
    }
    
    POST index/_refresh
    
    GET index/_search
    {
      "query": {
        "term": {
          "foo": "BAR"
        }
      }
    }
    
    GET index/_search
    {
      "query": {
        "match": {
          "foo": "BAR"
        }
      }
    }

    上述查询与文档1和2匹配,因为在索引和查询时都将BÀR转换为bar 。

    {
      "took": $body.took,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped" : 0,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 0.2876821,
        "hits": [
          {
            "_index": "index",
            "_type": "_doc",
            "_id": "2",
            "_score": 0.2876821,
            "_source": {
              "foo": "bar"
            }
          },
          {
            "_index": "index",
            "_type": "_doc",
            "_id": "1",
            "_score": 0.2876821,
            "_source": {
              "foo": "BÀR"
            }
          }
        ]
      }
    }

    此外,关键字在索引之前被转换的事实也意味着聚合返回归一化值:

    GET index/_search
    {
      "size": 0,
      "aggs": {
        "foo_terms": {
          "terms": {
            "field": "foo"
          }
        }
      }
    }

    返回:

    {
      "took": 43,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped" : 0,
        "failed": 0
      },
      "hits": {
        "total": 3,
        "max_score": 0.0,
        "hits": []
      },
      "aggregations": {
        "foo_terms": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "bar",
              "doc_count": 2
            },
            {
              "key": "baz",
              "doc_count": 1
            }
          ]
        }
      }
    }
  • 相关阅读:
    设计模式的原则
    命令模式
    访问者模式
    策略模式
    外观模式
    组合模式
    原型模式
    合并有序数组
    判断二叉树是否对称
    中序遍历二叉树
  • 原文地址:https://www.cnblogs.com/gmhappy/p/11864041.html
Copyright © 2011-2022 走看看