  • Implementing Elasticsearch-based keyword search suggestions with Flask

    1. The result

    2. Fuzzy queries and suggest queries

    • Fuzzy query
    GET chaxun/job/_search
    {
      "query": {
        "fuzzy": {
          "title": {
            "value": "pythn",
            "fuzziness": 2,
            "prefix_length": 2
          }
        }
      }
    }

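    Note: "fuzziness" is the edit distance, a measure of how similar two strings are. The edit distance between two strings is the minimum number of operations (inserting, deleting, or substituting a character, or swapping two adjacent characters) needed to turn one string into the other. "prefix_length" is the number of leading characters that must match exactly.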
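
The edit distance described in the note can be sketched in a few lines of Python. This is a minimal illustration (the "optimal string alignment" variant, which counts an adjacent swap as one operation), not Elasticsearch's internal implementation. The helper names `edit_distance` and `fuzzy_match` are hypothetical, and `fuzzy_match` is only a simplified model of how "fuzziness" and "prefix_length" combine.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, substitutions,
    or adjacent swaps needed to turn `a` into `b`."""
    rows, cols = len(a) + 1, len(b) + 1
    # dist[i][j] holds the distance between a[:i] and b[:j].
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete all i characters
    for j in range(cols):
        dist[0][j] = j  # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Swap of two adjacent characters counts as one operation.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                dist[i][j] = min(dist[i][j], dist[i - 2][j - 2] + 1)
    return dist[rows - 1][cols - 1]


def fuzzy_match(query: str, term: str, fuzziness: int, prefix_length: int) -> bool:
    """Simplified model (an assumption, for illustration only): the first
    `prefix_length` characters must be identical, and the whole strings
    must be within `fuzziness` edits of each other."""
    if query[:prefix_length] != term[:prefix_length]:
        return False
    return edit_distance(query, term) <= fuzziness


print(edit_distance("pythn", "python"))        # 1 (one insertion)
print(fuzzy_match("pythn", "python", 2, 2))    # True
print(fuzzy_match("pythn", "cython", 2, 2))    # False: prefix "py" != "cy"
```

With the values from the query above ("pythn", fuzziness 2, prefix_length 2), a term such as "python" qualifies because it shares the prefix "py" and is one edit away.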

    • Suggest query
    POST chaxun/_search?pretty
    {
        "suggest": {
            "suggest" : {
                "prefix" : "python",
                "completion" : {
                    "field" : "suggest",
                    "fuzzy" : {
                        "fuzziness" : 2
                    }
                }
            }
        }
    }
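
When the same request is issued from application code, it can help to build the body in one place. The sketch below assembles the request body shown above as a plain Python dict. The function name `build_suggest_body` and its defaults are hypothetical, and the check on "fuzziness" is a simplification that accepts only 0, 1, 2, or "AUTO".

```python
import json

# Simplified set of accepted values; an edit distance above 2 is rejected.
ALLOWED_FUZZINESS = (0, 1, 2, "AUTO")


def build_suggest_body(prefix, field="suggest", fuzziness=2, size=10):
    """Return a completion-suggester request body for `prefix`.

    The suggestion is named "suggest", matching the query above."""
    if fuzziness not in ALLOWED_FUZZINESS:
        raise ValueError(f"fuzziness must be one of {ALLOWED_FUZZINESS}")
    return {
        "suggest": {
            "suggest": {
                "prefix": prefix,
                "completion": {
                    "field": field,
                    "fuzzy": {"fuzziness": fuzziness},
                    "size": size,  # maximum number of options returned
                },
            }
        }
    }


print(json.dumps(build_suggest_body("python"), indent=2))
```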

    3. Defining the model

    from elasticsearch_dsl import Document, Completion, Text, Date, Keyword, Integer
    from elasticsearch_dsl.analysis import CustomAnalyzer as _CustomAnalyzer


    class CustomAnalyzer(_CustomAnalyzer):
        # Return an empty definition so that elasticsearch_dsl does not try to
        # define ik_max_word itself; the analyzer is provided by the IK plugin.
        def get_analysis_definition(self):
            return {}


    ik_analyzer = CustomAnalyzer("ik_max_word", filter=["lowercase"])


    class BoleAtricle(Document):
        # Completion field that backs the suggest query
        suggest = Completion(analyzer=ik_analyzer)
        title = Text(analyzer="ik_max_word")
        create_date = Date()
        url = Keyword()
        url_object_id = Keyword()
        front_image_url = Keyword()
        praise_nums = Integer()
        comments_nums = Integer()
        fav_nums = Integer()
        tags = Text(analyzer="ik_max_word")
        content = Text(analyzer="ik_max_word")

        class Index:
            name = "jobbole"

        class Meta:
            doc_type = "article"
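
The model declares a `suggest` completion field, but the post does not show how that field gets filled. A completion field accepts entries of the form `{"input": [...], "weight": n}`, so one possible way to populate it is sketched below. The helper `build_suggest_entries` is hypothetical, the weights are arbitrary illustration values, and using the whole title and each tag as an input is an assumption rather than the author's method.

```python
def build_suggest_entries(title, tags, title_weight=10, tag_weight=5):
    """Build completion-field entries from a title and a list of tags.

    Blank strings and case-insensitive duplicates are skipped; the first
    occurrence (and therefore its weight) wins."""
    entries = []
    seen = set()
    candidates = [(title, title_weight)] + [(tag, tag_weight) for tag in tags]
    for text, weight in candidates:
        text = text.strip()
        key = text.lower()
        if text and key not in seen:
            seen.add(key)
            entries.append({"input": [text], "weight": weight})
    return entries


print(build_suggest_entries("Python tips", ["python", "Python tips", " "]))
```

Under these assumptions, the result would be assigned to the document before saving, for example `article.suggest = build_suggest_entries(title, tags)` followed by `article.save()`.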

    4. The view function

    from app.models import BoleAtricle
    import json

    # `main` is the Flask blueprint this view is registered on; it must be
    # imported from wherever the application defines it.


    @main.route("/suggest/<text>")
    def get_suggest_phrase(text):
        s = BoleAtricle.search()
        # Attach a fuzzy completion suggestion named "suggest" to the search
        suggest = s.suggest("suggest", text, completion={
            "field": "suggest", "fuzzy": {
                "fuzziness": 2
            },
            "size": 10
        })
        suggestions = suggest.execute()
        suggest_phrase = []
        li_suggest = suggestions.suggest.to_dict().get("suggest", [])
        if li_suggest:
            # Collect the title of every suggested document
            for item in li_suggest[0].get("options", []):
                suggest_phrase.append(item["_source"]["title"])
        return json.dumps(suggest_phrase)
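
The response handling in the view can be exercised without a running cluster by pulling it into a plain function. The sketch below is a slightly more defensive variant of that logic: it walks every entry instead of only the first, skips options without a title, and drops duplicate titles. The name `extract_titles` and the sample response are hypothetical; the sample only mimics the shape of `suggestions.suggest.to_dict()`.

```python
def extract_titles(suggest_response, name="suggest"):
    """Collect unique titles from a completion-suggest response dict."""
    titles = []
    for entry in suggest_response.get(name, []):
        for option in entry.get("options", []):
            title = option.get("_source", {}).get("title")
            if title is not None and title not in titles:
                titles.append(title)
    return titles


# Made-up response for illustration; not real query output.
sample = {
    "suggest": [
        {
            "text": "pythn",
            "offset": 0,
            "length": 5,
            "options": [
                {"text": "python", "_source": {"title": "Python basics"}},
                {"text": "python", "_source": {"title": "Python basics"}},
                {"text": "pytest", "_source": {}},
            ],
        }
    ]
}

print(extract_titles(sample))  # ['Python basics']
```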
  • Original article: https://www.cnblogs.com/dowi/p/10174174.html