  • Implementing the backend of a search-engine website with Django and Elasticsearch

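    1. Search-box autocomplete (Elasticsearch provides an API for this)
    Modifying the type
    The mapping needs a field suggest: {"type": "completion"}
    so the type we defined has to be changed.
    Add a new field, suggest, to the type. Because of an issue in the elasticsearch-dsl source code, a plain definition raises an error, so define your own CustomAnalyzer, create a custom analyzer object ik_analyzer from it, and assign that object to suggest in the type: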

    ...
    from elasticsearch_dsl.analysis import CustomAnalyzer as _CustomAnalyzer


    class CustomAnalyzer(_CustomAnalyzer):

        def get_analysis_definition(self):
            # Deliberately does nothing; this override exists only to avoid the error
            return ()


    # Create a custom analyzer object that uses ik_max_word and lowercases tokens
    ik_analyzer = CustomAnalyzer('ik_max_word', filter=['lowercase'])


    class DuowanType(DocType):
        ...
        # suggest is defined to support autocomplete.
        # Because of an issue in the elasticsearch-dsl source, a plain definition raises an error, hence the CustomAnalyzer above
        suggest = Completion(analyzer=ik_analyzer)

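    Generating the suggest values
    The search suggestions are generated inside save_to_es.
    The suggest structure has to be built by hand, with help from the Elasticsearch analyze API.
    In the items module, define a module-level function gen_suggests that takes index and info_tuple (which carries the weight information). Create a set for deduplication and a suggests list for the return value. Iterate over info_tuple; if the text string is non-empty, call the Elasticsearch analyze API on it, then assemble the structure to return.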

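    def gen_suggests(index, info_tuple):  # a tuple can carry several (text, weight) pairs, in order
        # Build the list of search suggestions from the given strings
        used_words = set()  # for deduplication
        suggests = []  # return value
        for text, weight in info_tuple:
            if text:  # skip empty strings
                # Call the Elasticsearch analyze API on the string (es is the client created elsewhere in this module)
                words = es.indices.analyze(index=index, analyzer='ik_max_word', params={'filter': ['lowercase']}, body=text)
                analyzed_words = set(r["token"] for r in words["tokens"] if len(r["token"]) > 1)  # drop single-character tokens
                new_words = analyzed_words - used_words  # deduplicate
                used_words.update(new_words)  # remember emitted words so later fields do not repeat them
            else:
                new_words = set()
            if new_words:
                suggests.append({'input': list(new_words), 'weight': weight})
        return suggests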
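
The deduplication step can be tried without a running cluster. The sketch below is only an illustration under stated assumptions: `fake_analyze` is a hypothetical whitespace tokenizer standing in for es.indices.analyze (it is not the ik_max_word analyzer), and `build_suggests` is a hypothetical variant of gen_suggests that takes the analyzer as a parameter.

```python
def fake_analyze(text):
    """Hypothetical stand-in for es.indices.analyze: split on whitespace, lowercase."""
    return {"tokens": [{"token": word.lower()} for word in text.split()]}


def build_suggests(info_tuple, analyze=fake_analyze):
    """Return completion inputs per (text, weight) pair, never repeating a word."""
    used_words = set()   # words already emitted for an earlier field
    suggests = []
    for text, weight in info_tuple:
        if not text:     # skip empty strings
            continue
        tokens = analyze(text)["tokens"]
        analyzed = {t["token"] for t in tokens if len(t["token"]) > 1}
        new_words = analyzed - used_words
        used_words |= new_words
        if new_words:
            # sorted() only makes the output deterministic for this demo
            suggests.append({"input": sorted(new_words), "weight": weight})
    return suggests


# Made-up sample data: a title with weight 10 and an author with weight 7
result = build_suggests((("Django Search a", 10), ("django Admin", 7), ("", 3)))
print(result)
```

Because "django" was already emitted for the first field, the second entry only contains "admin"; the single-character token "a" and the empty string are dropped.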

    Then call this function in save_to_es:

    info_tuple = ((duowan.title, 10), (duowan.author, 7))
    duowan.suggest = (gen_suggests(DuowanType._doc_type.index, info_tuple))

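    Setting up the Django search site
    Create a new virtual environment.
    Activate it and install Django: pip install -i https://pypi.douban.com/simple/ django
    Then create a new Django project in PyCharm and run it; the server address appears in the log.
    Next, create a static directory and paste the css and js files into it, and paste the html files into the templates directory.
    Add a URL in the urls file: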

    from django.contrib import admin
    from django.urls import path
    from django.conf.urls import url
    from django.views.generic import TemplateView

    urlpatterns = [
        path('admin/', admin.site.urls),
        url(r'^$', TemplateView.as_view(template_name='index.html'), name='index'),
    ]

    Add a setting in settings:

    # A tuple also works here; a one-element tuple needs a trailing comma after the path
    STATICFILES_DIRS = [
        os.path.join(BASE_DIR, 'static')  # several directories can be listed
    ]

    In index.html, change the statements that import css and js, such as <link href="css/style.css" rel="stylesheet" type="text/css" />, to:

    {% load static %}
    <head>
        ...
        <link href="{% static 'css/style.css' %}" rel="stylesheet" type="text/css" />
        ...
    </head>

    This prepends STATIC_URL from settings to the path inside the quotes, so the page can locate its css and js files.
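
As a rough illustration of that joining step, the sketch below concatenates a prefix and a relative path. `static_url` is a hypothetical helper, and "/static/" is an assumed value for STATIC_URL; Django's real static tag goes through the configured static files storage, so treat this only as a picture of the simplest case.

```python
def static_url(path, static_prefix="/static/"):
    """Join an assumed STATIC_URL prefix and a relative asset path with one slash."""
    return static_prefix.rstrip("/") + "/" + path.lstrip("/")


css_href = static_url("css/style.css")
print(css_href)
```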

    Search suggestions
    Install the same version of elasticsearch-dsl in the virtual environment.
    Fuzzy search
    fuzzy:

    GET duowan/video/_search
    {
      "query": {
        "fuzzy": {
          "title": {
            "value": "军团骑士",
            "fuzziness": 2,
            "prefix_length": 3
          }
        }
      },
      "_source": ["title"]
    }

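    fuzziness: the maximum edit distance allowed
    prefix_length: the number of leading characters that must match exactly and are not subject to fuzzy matching
    "_source": ["title"]: restricts the returned source to the listed fields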
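
To make the two parameters concrete, here is a minimal sketch of edit-distance matching with a fixed prefix. It uses plain Levenshtein distance; Elasticsearch by default also counts a swap of two adjacent characters as a single edit, which this sketch does not. `edit_distance` and `fuzzy_match` are hypothetical helper names, not part of any Elasticsearch client.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def fuzzy_match(term, candidate, fuzziness=2, prefix_length=3):
    """True if the first prefix_length characters agree and the distance is small enough."""
    if term[:prefix_length] != candidate[:prefix_length]:
        return False
    return edit_distance(term, candidate) <= fuzziness


print(edit_distance("kitten", "sitting"))
print(fuzzy_match("search", "searhc"))
print(fuzzy_match("search", "xearch"))
```

The last call fails on the prefix check alone, even though only one character differs.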

    suggest:

    POST duowan/video/_search
    {
      "suggest": {
        "my-suggest": {
          "text": "PVQ",
          "completion": {
            "field": "suggest",
            "fuzzy": {
              "fuzziness": 1
            }
          }
        }
      },
      "_source": ["title"]
    }

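    my-suggest is an arbitrary name of your choosing; field cannot be changed and must name the completion field from the mapping (suggest).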
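
The same request body can be assembled in Python, which makes it clearer which parts are free and which are fixed. This is a sketch only: `completion_suggest_body` and its parameter names are hypothetical, and the function just builds a dict without contacting Elasticsearch.

```python
import json


def completion_suggest_body(text, name="my-suggest", field="suggest",
                            fuzziness=1, source_fields=("title",)):
    """Build a completion-suggest request body; `name` is the caller's own label."""
    return {
        "suggest": {
            name: {
                "text": text,
                "completion": {
                    "field": field,  # must be the completion field from the mapping
                    "fuzzy": {"fuzziness": fuzziness},
                },
            }
        },
        "_source": list(source_fields),
    }


body = completion_suggest_body("PVQ")
print(json.dumps(body, indent=2))
```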

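    index.html embeds a js script that binds the input event: whenever the content of the search box changes, it sends a request to the server whose parameters are the input text and the selected type.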

    $(function(){
        $('.searchInput').bind(' input propertychange ',function(){
            var searchText = $(this).val();
            var tmpHtml = ""
            $.ajax({
                cache: false,
                type: 'get', // fetch with the GET method
                dataType:'json',
                url:suggest_url+"?s="+searchText+"&s_type="+$(".searchItem.current").attr('data-type'),
                async: true,
                success: function(data) {
                    for (var i=0;i<data.length;i++){
                        tmpHtml += '<li><a href="'+search_url+'?q='+data[i]+'">'+data[i]+'</a></li>'
                    }
                    $(".dataList").html("")
                    $(".dataList").append(tmpHtml);
                    if (data.length == 0){
                        $('.dataList').hide()
                    }else {
                        $('.dataList').show()
                    }
                }
            });
        } );
    })

    Add the following to urls:

    url(r'^suggest/$', TemplateView.as_view(template_name='index.html'), name='index')
    Then copy the contents of es-type from the crawler project into the Django project's models,
    and edit the views file:

    import json
    from django.shortcuts import render
    from django.views.generic.base import View
    from search.models import DuowanType
    from django.http import HttpResponse


    # Create your views here.
    # Inherits from View
    class SearchSuggest(View):
        def get(self, request):
            key_words = request.GET.get('s', '')  # read the s parameter from the request; defaults to an empty string
            re_dates = []  # titles returned by the suggester
            if key_words:
                s = DuowanType.search()
                # Build the query
                s = s.suggest('my_suggest', key_words, completion={
                    "field": "suggest",
                    "fuzzy": {
                        "fuzziness": 2
                    },
                    "size": 10
                })
                # Execute it and collect the results
                suggestions = s.execute_suggest()
                for match in suggestions.my_suggest[0].options:
                    source = match._source
                    re_dates.append(source['title'])
            # Return the list as JSON through HttpResponse
            return HttpResponse(json.dumps(re_dates), content_type='application/json')
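
The result-collecting loop can be exercised against a plain dictionary shaped like a suggest response. The sketch below is an illustration, not the view's code: `titles_from_suggest` is a hypothetical helper, and `sample` is made-up data that only mimics the suggest / options / _source nesting.

```python
def titles_from_suggest(response, name="my_suggest"):
    """Collect _source.title from every option of the named suggester."""
    titles = []
    for entry in response.get("suggest", {}).get(name, []):
        for option in entry.get("options", []):
            title = option.get("_source", {}).get("title")
            if title is not None:
                titles.append(title)
    return titles


# Made-up response fragment for demonstration only
sample = {
    "suggest": {
        "my_suggest": [
            {
                "text": "pvq",
                "options": [
                    {"text": "pvp", "_source": {"title": "PVP highlights"}},
                    {"text": "pve", "_source": {"title": "PVE guide"}},
                ],
            }
        ]
    }
}

titles = titles_from_suggest(sample)
print(titles)
```

Using .get with defaults means an empty or missing suggest section yields an empty list instead of an exception.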

    In urls, change

    url(r'^suggest/$', TemplateView.as_view(template_name='index.html'), name='index')
    to

    url(r'^suggest/$', SearchSuggest.as_view(), name='suggest')

    Remember to write SearchSuggest.as_view() with the parentheses, not SearchSuggest.as_view; otherwise the following error is raised:
    TypeError: as_view() takes 1 positional argument but 2 were given
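
The error makes sense once as_view is seen as a classmethod that accepts only keyword arguments and returns the actual view function. The sketch below reproduces the situation in plain Python; `MiniView` is a hypothetical stand-in for a Django class-based view, and passing a string as the request is purely for illustration.

```python
class MiniView:
    """Hypothetical miniature of a class-based view, without Django."""

    @classmethod
    def as_view(cls, **initkwargs):
        # Returns the callable that a URL pattern would invoke for each request
        def view(request):
            return cls().get(request)
        return view

    def get(self, request):
        return "handled " + request


# Correct: call as_view() once, then call the returned function per request
view = MiniView.as_view()
result = view("request-1")

# Incorrect: treating the uncalled classmethod as the view hands it the
# request as a positional argument, which it does not accept
try:
    MiniView.as_view("request-1")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(result)
print(error_message)
```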

    2. Search functionality
    In urls:

    from search.views import SearchSuggest, SearchView
    ...
    url(r'^search/$', SearchView.as_view(), name='search')

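    Add a SearchView(View) to views.
    It receives the query keyword and the page number passed in the request,
    creates a client connected to the Elasticsearch server, and runs the raw query with client.search. The returned hits are then extracted into a list and handed to the page with render.
    Query time: record the time before and after client.search and subtract.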

    from elasticsearch import Elasticsearch
    from datetime import datetime
    client = Elasticsearch(hosts='127.0.0.1')
    .......
    class SearchView(View):
        def get(self, request):
            key_words = request.GET.get('q', '')
            pagesize = 10
            page = request.GET.get('p', '')
            try:
                page = int(page)
            except ValueError:
                page = 1
            page = max(page, 1)  # guard against zero or negative page numbers

            # client.search accepts the query body in its raw form
            body = {
                "query": {
                    "multi_match": {
                        "query": key_words,
                        "fields": ["title", "author"]
                    }
                },
                "from": (page-1)*pagesize,
                "size": pagesize,
                # Highlighting: matched fragments come back in each hit's highlight field
                "highlight": {
                    # The html tags wrapped around each match can be set here
                    "pre_tags": ["<span class='keyword'>"],
                    "post_tags": ["</span>"],
                    "fields": {
                        "title": {},
                        "content": {}
                    }
                }
            }
            start_time = datetime.now()
            response = client.search(
                index="duowan",
                body=body
            )
            end_time = datetime.now()
            last_seconds = (end_time-start_time).total_seconds()
            # Total number of matches, regardless of paging
            total_nums = response['hits']['total']
            if (total_nums % pagesize) > 0:
                page_nums = int(total_nums/pagesize)+1
            else:
                page_nums = int(total_nums/pagesize)
            # Collect the values the template needs into a list
            hit_list = []
            for hit in response['hits']['hits']:
                hit_dict = {}
                # highlight may be absent when no highlighted field matched
                if 'title' in hit.get('highlight', {}):
                    hit_dict['title'] = hit['highlight']['title'][0]
                else:
                    # To truncate instead: hit_dict['title'] = hit['_source']['title'][:100]
                    hit_dict['title'] = hit['_source']['title']
                hit_dict['len'] = hit['_source']['len']
                hit_dict['tag'] = hit['_source']['tag']
                hit_dict['update_time'] = hit['_source']['update_time']
                hit_dict['author'] = hit['_source']['author']
                hit_dict['playnum_text'] = hit['_source']['playnum_text']
                hit_dict['url'] = hit['_source']['url']
                hit_list.append(hit_dict)
            return render(request, 'result.html', {'page': page,
                                                   'total_nums': total_nums,
                                                   'all_hits': hit_list,
                                                   'key_words': key_words,
                                                   'page_nums': page_nums,
                                                   'last_seconds': last_seconds})
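
The paging arithmetic is easy to get wrong, so it can help to isolate it into small functions that are testable on their own. The helpers below are a sketch; `parse_page`, `page_offset` and `page_count` are hypothetical names that do not appear in the original view.

```python
def parse_page(raw, default=1):
    """Turn the raw p parameter into a page number of at least 1."""
    try:
        page = int(raw)
    except (TypeError, ValueError):
        return default
    return page if page >= 1 else default


def page_offset(page, page_size=10):
    """Value for the "from" field: how many hits to skip."""
    return (page - 1) * page_size


def page_count(total_nums, page_size=10):
    """Number of pages needed to show total_nums hits (ceiling division)."""
    return (total_nums + page_size - 1) // page_size


print(parse_page(""), parse_page("3"), parse_page("-2"))
print(page_offset(3))
print(page_count(0), page_count(10), page_count(11))
```

Ceiling division with integers avoids both the float division and the separate remainder check.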

    In the page:
    Find the item div and wrap it in {% for hit in all_hits %} <div>...</div> {% endfor %} to loop over all_hits, the result list passed in from the view, filling in the values on the page.

    {% for hit in all_hits %}
    <div class="resultItem">
        <div class="itemHead">
            <a href="{{ hit.url }}" target="_blank" class="title">{{ hit.title }}</a>
            <span class="divsion">-</span>
            <span class="fileType">
                <span class="label">Category:</span>
                <span class="value">{{ hit.tag }}</span>
            </span>
            <span class="dependValue">
                <span class="label">Play count:</span>
                <span class="value">{{ hit.playnum_text }}</span>
            </span>
        </div>
        <div class="itemBody">

        </div>
        <div class="itemFoot">
            <span class="info">
                <label>Site:</label>
                <span class="value">伯乐在线</span>
            </span>
            <span class="info">
                <label>Published:</label>
                <span class="value">{{ hit.update_time }}</span>
            </span>
        </div>
    </div>
    {% endfor %}

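    Implementing search history in js:
    Clicking the search button triggers add_search(), which reads the keyword, deduplicates the search history with KillRepeat(), stores the array in the browser's localStorage, and then displays the search history.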

    // Triggered when the search button is clicked
    function add_search(){
        var val = $(".searchInput").val();
        if (val.length>=2){
            // Deduplicate when the search button is clicked
            KillRepeat(val);
            // Then store the array in the browser's localStorage
            localStorage.search = searchArr;
            // Then display the search history
            MapSearchArr();
        }

        window.location.href=search_url+'?q='+val+"&s_type="+$(".searchItem.current").attr('data-type')

    }

    function MapSearchArr(){
        var tmpHtml = "";
        var arrLen = 0
        if (searchArr.length >= 5){
            arrLen = 5
        }else {
            arrLen = searchArr.length
        }
        // Join the array items into html
        for (var i=0;i<arrLen;i++){
            tmpHtml += '<a href="'+search_url+'?q='+searchArr[i]+'">'+searchArr[i]+'</a>'
        }
        $(".mysearch .all-search").html(tmpHtml);
    }
    // Deduplicate: remove an earlier record of this term and put the current term first
    function KillRepeat(val){
        var kill = 0;
        for (var i=0;i<searchArr.length;i++){
            // Check whether the term is already in the search history
            if(val===searchArr[i]){
                kill ++;
            }
        }
        if(kill<1){// not present
            // Put it at the head of the queue
            searchArr.unshift(val);
        }else {// present
            // Remove the old entry
            removeByValue(searchArr, val)
            searchArr.unshift(val)
        }
    }
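
The move-to-front behaviour of KillRepeat, together with the five-item cap used when displaying the history, can be summarised in a few lines. The sketch is in Python rather than js, purely to show the logic; `remember_search` and its `limit` parameter are hypothetical and not part of the page's script.

```python
def remember_search(history, term, limit=None):
    """Return a new history with term first and any older copy of it removed."""
    updated = [term] + [item for item in history if item != term]
    return updated if limit is None else updated[:limit]


history = ["django", "elastic", "scrapy"]
history = remember_search(history, "elastic")
print(history)
print(remember_search(history, "redis", limit=3))
```

Building a new list avoids the separate "exists" and "does not exist" branches: filtering out the term is a no-op when it was never there.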

    Original: https://blog.csdn.net/qq_40916110/article/details/87855502

  • Original address: https://www.cnblogs.com/hanzeng1993/p/11280518.html