  • HTML Strip Char Filter

    The html_strip character filter strips HTML elements from the text and replaces HTML entities with their decoded value (e.g. replacing &amp; with &).

    Example output

    POST _analyze
    {
      "tokenizer":      "keyword", 
      "char_filter":  [ "html_strip" ],
      "text": "<p>I&apos;m so <b>happy</b>!</p>"
    }

    The keyword tokenizer returns a single term.

    The above example returns the term:

    [ \nI'm so happy!\n ]

    The same example with the standard tokenizer would return the following terms:

    [ I'm, so, happy ]
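To build intuition for what the filter does to the sample text, here is a minimal Python sketch that drops tags and decodes entities using only the standard library. It is an approximation, not the Lucene implementation behind html_strip: the class `_TextOnly` and the function `strip_html` are hypothetical names introduced for illustration, and the sketch makes no attempt to match the real filter's exact output, such as any whitespace it leaves where a tag was removed.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Hypothetical helper: keeps text content and ignores every tag."""

    def __init__(self):
        # convert_charrefs=True makes the parser decode entities
        # such as &apos; before handing text to handle_data.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for the text between tags; tags themselves are skipped.
        self.parts.append(data)


def strip_html(text):
    """Rough stand-in for html_strip: remove tags, decode entities."""
    parser = _TextOnly()
    parser.feed(text)
    parser.close()
    return "".join(parser.parts)


print(strip_html("<p>I&apos;m so <b>happy</b>!</p>"))
```

Running the sketch on the sample text yields the single string that the keyword tokenizer would then emit as one term; a word-splitting tokenizer would break that string up further.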

    Configuration

    The html_strip character filter accepts the following parameter:

    escaped_tags

    An array of HTML tags which should not be stripped from the original text.

    Example configuration

    In this example, we configure the html_strip character filter to leave <b> tags in place:

    PUT my_index
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "my_analyzer": {
              "tokenizer": "keyword",
              "char_filter": ["my_char_filter"]
            }
          },
          "char_filter": {
            "my_char_filter": {
              "type": "html_strip",
              "escaped_tags": ["b"]
            }
          }
        }
      }
    }
    
    POST my_index/_analyze
    {
      "analyzer": "my_analyzer",
      "text": "<p>I&apos;m so <b>happy</b>!</p>"
    }

    The above example produces the following term:

    [ \nI'm so <b>happy</b>!\n ]
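The effect of escaped_tags can be approximated outside Elasticsearch with a short Python sketch that strips every tag except those on an allow-list. This is an illustration under assumptions, not the filter's actual code: `_KeepSomeTags` and `strip_html_keeping` are hypothetical names, the sketch relies on Python's `html.parser` rather than Lucene, and details such as whitespace around removed tags or self-closing tags are not modelled.

```python
from html.parser import HTMLParser


class _KeepSomeTags(HTMLParser):
    """Hypothetical helper: drops tags unless they are in escaped_tags."""

    def __init__(self, escaped_tags):
        # Entities such as &apos; are decoded before reaching handle_data.
        super().__init__(convert_charrefs=True)
        self.escaped_tags = {tag.lower() for tag in escaped_tags}
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.escaped_tags:
            # Re-emit the opening tag exactly as it appeared in the input.
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.escaped_tags:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)


def strip_html_keeping(text, escaped_tags=()):
    """Remove tags except those listed in escaped_tags; decode entities."""
    parser = _KeepSomeTags(escaped_tags)
    parser.feed(text)
    parser.close()
    return "".join(parser.parts)


print(strip_html_keeping("<p>I&apos;m so <b>happy</b>!</p>", ["b"]))
```

As in the configured analyzer, the `<p>` wrapper disappears while the `<b>` element survives, because only `b` is on the allow-list.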


    Source: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-htmlstrip-charfilter.html#analysis-htmlstrip-charfilter
  • Original post: https://www.cnblogs.com/a-du/p/7278302.html