Installing the IK Chinese Analysis Plugin for Elasticsearch

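The default analyzer that ships with Elasticsearch splits Chinese text into individual characters instead of segmenting it into meaningful words, which is what we actually want. For example: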

curl -XPOST "http://localhost:9200/userinfo/_analyze?analyzer=standard&pretty=true&text=我是中国人"
This returns the following result:

{
  "tokens": [
    { "token": "text", "start_offset": 2, "end_offset": 6, "type": "<ALPHANUM>", "position": 1 },
    { "token": "我", "start_offset": 9, "end_offset": 10, "type": "<IDEOGRAPHIC>", "position": 2 },
    { "token": "是", "start_offset": 10, "end_offset": 11, "type": "<IDEOGRAPHIC>", "position": 3 },
    { "token": "中", "start_offset": 11, "end_offset": 12, "type": "<IDEOGRAPHIC>", "position": 4 },
    { "token": "国", "start_offset": 12, "end_offset": 13, "type": "<IDEOGRAPHIC>", "position": 5 },
    { "token": "人", "start_offset": 13, "end_offset": 14, "type": "<IDEOGRAPHIC>", "position": 6 }
  ]
}
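In most cases this is not the result we want. We would rather get tokens such as "中国人", "中国", and "我". For that we need to install a Chinese analysis plugin, and ik provides exactly this.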
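
To make it easier to try the same text against different analyzers, the request above can be wrapped in a small helper. This is only a sketch: the host and the userinfo index are taken from the command above, while the function name analyze and the ES_URL, ES_INDEX and ES_CURL variables are made up for illustration. It sends a GET request and lets curl URL-encode the Chinese text, and it assumes the same query-parameter form of _analyze used in this article (more recent Elasticsearch releases expect a JSON request body instead).

```shell
#!/bin/sh
# Assumed defaults, copied from the article's curl command.
ES_URL=${ES_URL:-http://localhost:9200}
ES_INDEX=${ES_INDEX:-userinfo}
# Hypothetical hook: set ES_CURL=echo to print the arguments instead of sending them.
ES_CURL=${ES_CURL:-curl}

# analyze ANALYZER TEXT
# Calls the _analyze endpoint with the given analyzer and text.
analyze() {
    analyzer=$1
    text=$2
    # -G moves the --data-urlencode pairs into the query string,
    # so the non-ASCII text is percent-encoded by curl itself.
    "$ES_CURL" -s -G "$ES_URL/$ES_INDEX/_analyze" \
        --data-urlencode "analyzer=$analyzer" \
        --data-urlencode "pretty=true" \
        --data-urlencode "text=$text"
}
```

For example, analyze standard "我是中国人" should reproduce the per-character tokens shown above, and the same call with ik as the first argument can be used once the plugin is installed.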

elasticsearch-analysis-ik is a Chinese word-segmentation plugin that supports custom dictionaries.

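Installation steps:

1. Download the source code from GitHub. The repository is at: https://github.com/medcl/elasticsearch-analysis-ik

In the lower right of the page there is a "Download ZIP" button. Click it to download the source as elasticsearch-analysis-ik-master.zip.

2. Unzip elasticsearch-analysis-ik-master.zip. Change to the download directory and run: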

unzip elasticsearch-analysis-ik-master.zip

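3. Copy the config/ik directory from the extracted directory into the config directory of the ES installation.
4. Because this is source code, it has to be packaged with Maven. Change into the extracted directory and run: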
mvn clean package
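5. Copy the jar produced by the build, elasticsearch-analysis-ik-1.2.8-sources.jar, into the lib directory of the ES installation.
6. Add the ik configuration to the ES configuration file config/elasticsearch.yml by appending the following at the end: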

index:
  analysis:
    analyzer:
      ik:
        alias: [ik_analyzer]
        type: org.elasticsearch.index.analysis.IkAnalyzerProvider
      ik_max_word:
        type: ik
        use_smart: false
      ik_smart:
        type: ik
        use_smart: true
or
index.analysis.analyzer.ik.type : "ik"
7. Restart the elasticsearch service. The configuration is now complete. Enter the command:
curl -XPOST "http://localhost:9200/userinfo/_analyze?analyzer=ik&pretty=true&text=我是中国人"
The test result is as follows:
{
  "tokens": [
    { "token": "text", "start_offset": 2, "end_offset": 6, "type": "ENGLISH", "position": 1 },
    { "token": "我", "start_offset": 9, "end_offset": 10, "type": "CN_CHAR", "position": 2 },
    { "token": "中国人", "start_offset": 11, "end_offset": 14, "type": "CN_WORD", "position": 3 },
    { "token": "中国", "start_offset": 11, "end_offset": 13, "type": "CN_WORD", "position": 4 },
    { "token": "国人", "start_offset": 12, "end_offset": 14, "type": "CN_WORD", "position": 5 }
  ]
}
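
Steps 2 through 5 above can be sketched as one small script. It is not part of the plugin. The function names run and install_ik, the DRY_RUN switch, and both arguments are made up for illustration. It assumes the zip extracts to a directory named after the archive, that Maven writes the jar to its default target directory, and it copies jars by pattern because the exact file name depends on the version that was built.

```shell
#!/bin/sh
# run CMD ARGS...
# Executes the command, or only prints it when DRY_RUN=1 (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# install_ik ES_HOME SRC_ZIP
#   ES_HOME  Elasticsearch installation directory (assumption: adjust to your setup)
#   SRC_ZIP  path to the downloaded elasticsearch-analysis-ik-master.zip
install_ik() {
    es_home=$1
    src_zip=$2
    src_dir=${src_zip%.zip}   # assumed name of the extracted directory

    # Step 2: unpack the source next to the zip file.
    run unzip "$src_zip" -d "$(dirname "$src_zip")" || return 1
    # Step 3: copy the ik configuration directory into the ES config directory.
    run cp -r "$src_dir/config/ik" "$es_home/config/" || return 1
    # Step 4: build with Maven; -f points mvn at the extracted pom.xml.
    run mvn -f "$src_dir/pom.xml" clean package || return 1
    # Step 5: copy the built jar(s) into the ES lib directory.
    run cp "$src_dir"/target/elasticsearch-analysis-ik-*.jar "$es_home/lib/" || return 1
}
```

Setting DRY_RUN=1 before calling install_ik prints the four commands without touching the file system, which is a cheap way to check the paths first. Step 6 (editing config/elasticsearch.yml) and the restart in step 7 are left as manual steps.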
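Notes:

1. ES plugins are normally installed with the plugin command, but installing ik that way kept failing on my machine, so I built and installed it from source instead.

2. For how to define a custom dictionary, see https://github.com/medcl/elasticsearch-analysis-ik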

Original article: https://www.cnblogs.com/jzssuanfa/p/6855654.html