I. Concepts
1.1 Basic concepts
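ELK: an acronym formed from the first letters of three products: ElasticSearch, LogStash and Kibana.
Lucene: Apache's full-text search engine toolkit.
Elasticsearch: a document-oriented, schema-free database built on the Lucene full-text search engine. All configuration, monitoring and operations go through a RESTful interface, and the data format is JSON. By default it supports automatic node discovery, automatic data replication, automatic distribution and scaling, and automatic load balancing. It searches data sets of tens of millions of records very efficiently. In short, Elasticsearch is Lucene plus a RESTful interface and distributed-systems technology.
Elasticsearch: HTTP access uses port 9200 by default.
Elasticsearch: TCP (transport) access uses port 9300 by default.
Four ways to operate Elasticsearch:
Kibana: uses HTTP
Native API (TransportClient): uses TCP
RestAPI: uses HTTP
SDE (SpringDataElasticsearch): uses TCP
TCP transport is more efficient than HTTP.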
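1.2 Elasticsearch concepts
Index: the logical area where data is stored, similar to a database in a relational database; it is the namespace for documents. In the example figure (cyan part) the index is twitter.
Type: similar to a table in a relational database; JSON data that groups a set of similar fields. In the example figure (yellow part) the type is tweet. Fields with the same name in different documents must have the same data type.
Document: the stored entity, similar to a row in a relational database; a concrete record made up of a set of fields. In the example figure (orange part) it contains three fields: user, post_date and message.
Field: the equivalent of a column in a relational database. It is one component of a document and consists of two parts, a name and a value. In the example figure (purple part), post_date together with its value is a field.
Mapping: stores the mapping information for fields; different document types have different mappings.
Term: an indivisible word, the smallest unit of search. Different analyzers produce different results for the same content, and therefore different terms.
Token: one occurrence of a term; it holds the term's text, its start and end position in the document, and its type.
Node: corresponds to a database instance in a relational database.
Cluster: a group of service instances made up of multiple nodes.
Shard: relational databases have no equivalent concept. A shard is the smallest unit of a Lucene search. One index may be spread over several shards, and different shards may live on different nodes. A single Lucene index is what ES calls a shard, while an ES index is a collection of shards. When ES runs a search, it sends the request to every shard of the index and then gathers the results from each shard into the final result. The number of primary shards cannot be changed after the index has been created!
Replica: a copy of a shard. One copy is the primary shard and the others are called replica shards. Elasticsearch uses a push replication model: when you index a document on the primary shard, that shard copies the document to all of its replica shards, and those shards index the document as well.
When a document is written, Elasticsearch hashes its document id to decide which shard it belongs to, and then indexes and stores it on that shard.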
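The routing rule described above, hashing the document id to pick a shard, can be sketched in a few lines. This is only an illustration: the class and method names are hypothetical, and String.hashCode() is an assumed stand-in for the hash function Elasticsearch uses internally.

```java
class ShardRoutingSketch {
    // Maps a document id to a shard number in [0, numberOfShards).
    // String.hashCode() is a stand-in for Elasticsearch's own hash function.
    static int shardFor(String docId, int numberOfShards) {
        return Math.floorMod(docId.hashCode(), numberOfShards);
    }

    public static void main(String[] args) {
        int shards = 5;
        for (String id : new String[] {"1", "2", "IkXNN2wBr0WPOOKNJpRg"}) {
            System.out.println(id + " -> shard " + shardFor(id, shards));
        }
        // The same id can map to a different shard once the shard count changes.
        System.out.println("1 with 6 shards -> shard " + shardFor("1", 6));
    }
}
```

Because the shard number depends on the shard count, changing the count would send existing ids to different shards, which is one way to see why the number of shards is fixed when the index is created.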
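Correspondence with a relational database:
MySQL | Elasticsearch
Database | Indices (the plural of index)
Table | Type (an index usually has only one type)
Row (data) | Document
Constraints (e.g. which data type a column stores) | Mapping (defines each field's data type and analyzer)
Column | Field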
II. Operating indices with Kibana
1. Connecting
2. Operations
Create a type and define each field's properties (data type, whether it is stored, whether it is indexed, which analyzer):
put ahd/_mapping/goods
{
  "properties": {
    "goodsName": { "type": "text", "analyzer": "ik_max_word", "index": "true", "store": "true" },
    "price": { "type": "double", "index": "true", "store": "false" },
    "brand": { "type": "keyword", "index": "true", "store": "true" }
  }
}

View the index mapping (the /goods suffix is optional):
get ahd/_mapping[/goods]

Create an index with 5 shards and 1 replica:
put /heima
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  }
}

Create a second index:
put ahd2

Create an index together with its fields:
put ahd2
{
  "mappings": {
    "goods": {
      "properties": {
        "goodsname": { "analyzer": "ik_max_word", "type": "text", "store": "true", "index": "true" },
        "price": { "type": "double", "index": "true", "store": "true" },
        "brand": { "type": "text", "index": "true", "store": "true" }
      }
    }
  }
}

Add a document with an explicit id:
post ahd/goods/1
{
  "goodsname": "华为p20手机",
  "brand": "华为",
  "price": 2299
}

Get a document by id:
get ahd/goods/1

Update a document (send the full document again under the same id):
post ahd/goods/1
{
  "goodsname": "华为p20手机",
  "brand": "华为",
  "price": 2599
}

Insert a document without specifying an id:
post ahd/goods
{
  "goodsname": "小米手机6",
  "brand": "小米",
  "price": 2500
}

When an id is given, put and post have the same effect; by convention, use post to insert and put to update. When no id is given, only post works.

Delete a document by id:
delete ahd/goods/IkXNN2wBr0WPOOKNJpRg

Custom (dynamic) templates
1. First create an index:
put ahd3
{
  "mappings": {
    "goods": {
      "properties": {
        "image": { "type": "text", "index": "false", "store": "true" },
        "goodsname": { "analyzer": "ik_max_word", "type": "text", "store": "true", "index": "true" },
        "price": { "type": "double", "index": "true", "store": "true" },
        "brand": { "type": "text", "index": "true", "store": "true" }
      }
    }
  }
}

2. Add the template by extending the creation statement above. Creating an index that already exists returns an error, so delete ahd3 first if it is already there:
put ahd3
{
  "mappings": {
    "goods": {
      "properties": {
        "image": { "type": "text", "index": "false", "store": "true" },
        "goodsname": { "analyzer": "ik_max_word", "type": "text", "store": "true", "index": "true" },
        "price": { "type": "double", "index": "true", "store": "true" },
        "brand": { "type": "text", "index": "true", "store": "true" }
      },
      "dynamic_templates": [
        {
          "mystring": {
            "match_mapping_type": "string",
            "mapping": { "type": "keyword" }
          }
        }
      ]
    }
  }
}

Add a document to ahd3 (without an id, only post can be used):
post ahd3/goods
{
  "goodsname": "小米6X手机",
  "price": 1199,
  "image": "http://image.im.com/123.jpg",
  "brand": "小米"
}

View the mapping of the goods type:
get ahd3/_mapping/goods

Queries (key topic)

1. Match all documents
get ahd3/_search
{
  "query": {
    "match_all": {}
  }
}

2. term query: exact match
get ahd3/_search
{
  "query": {
    "term": { "goodsname": "小米" }
  }
}
Note: the opening brace { must not be on the request line; the body starts on the next line.

Add another document for testing:
post ahd3/goods
{
  "goodsname": "大米",
  "brand": "吊牌",
  "price": 200,
  "image": "http://localhost:8080/a.jpg"
}

Run the query again:
get ahd3/_search
{
  "query": {
    "term": { "goodsname": "小米" }
  }
}

Insert one more document:
post ahd3/goods
{
  "goodsname": "大米手机",
  "price": 20000,
  "brand": "大米",
  "image": "http://baidu.com/a.jpg"
}

3. match query: the search text is analyzed before matching
get ahd3/_search
{
  "query": {
    "match": { "brand": "米" }
  }
}

4. range query
get ahd3/_search
{
  "query": {
    "range": {
      "price": { "lte": 1000, "gte": 100 }
    }
  }
}

Add another document:
post ahd3/goods
{
  "goodsname": "appla",
  "brand": "apple",
  "price": 5000,
  "image": "http://www.baidu.com/sadf.jpg"
}

5. fuzzy query: tolerates small spelling differences
get ahd3/goods/_search
{
  "query": {
    "fuzzy": {
      "goodsname": { "value": "apple", "fuzziness": 1 }
    }
  }
}

6. bool query: combining conditions
get ahd3/goods/_search
{
  "query": {
    "bool": {
      "must": {
        "match": { "goodsname": "大米" }
      }
    }
  }
}

With several conditions (also a check that the JSON is written correctly):
get ahd3/goods/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "goodsname": "大米" } },
        { "term": { "brand": "大米" } }
      ]
    }
  }
}

7. Filtering the returned fields
Return only goodsname:
get ahd3/_search
{
  "_source": { "includes": ["goodsname"] }
}
Exclude goodsname:
get ahd3/_search
{
  "_source": { "excludes": ["goodsname"] }
}

8. Filtering the query results
get ahd3/_search
{
  "query": {
    "bool": {
      "must": {
        "term": { "goodsname": "小米" }
      },
      "filter": {
        "range": {
          "price": { "gte": 10, "lte": 20000 }
        }
      }
    }
  }
}

9. Paging
get ahd3/_search
{
  "query": {
    "match_all": {}
  },
  "from": 2,
  "size": 2
}

10. Sorting (descending)
get ahd3/_search
{
  "query": {
    "match_all": {}
  },
  "sort": { "price": "desc" }
}

11. Highlighting
get ahd3/_search
{
  "query": {
    "term": {
      "goodsname": { "value": "小米" }
    }
  },
  "highlight": {
    "pre_tags": "<a href='www.baidu.com'>",
    "post_tags": "</a>",
    "fields": {
      "goodsname": {}
    }
  }
}

12. Aggregation
get /ahd3/goods/_search
{
  "size": 0,
  "aggs": {
    "populor_color": {
      "terms": { "field": "price", "size": 10 }
    }
  }
}
III. Operating indices with the native API (TCP: 9300)
3.1 Import the dependencies
<dependencies>
    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>fastjson</artifactId>
        <version>1.2.35</version>
    </dependency>
    <!-- the remaining dependencies are not listed in these notes -->
</dependencies>
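3.2 Operating indices with the native API
Create a TransportClient before each test and close it afterwards:
import java.net.InetAddress;

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;
import org.junit.After;
import org.junit.Before;

public class EsManager {
    private TransportClient client = null;

    // Connect to the local node over the transport (TCP) port 9300.
    @Before
    public void init() throws Exception {
        client = new PreBuiltTransportClient(Settings.EMPTY)
                .addTransportAddress(new TransportAddress(InetAddress.getByName("127.0.0.1"), 9300));
    }

    // Release the connection after each test.
    @After
    public void end() {
        client.close();
    }
}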
Step 3: queries
@Test
public void queryTest() throws Exception {
    // Alternative queries; uncomment one and comment out the bool query below to try it.
    // QueryBuilder queryBuilder = QueryBuilders.matchAllQuery();
    // QueryBuilder queryBuilder = QueryBuilders.matchQuery("goodsName", "小米手机");
    // QueryBuilder queryBuilder = QueryBuilders.termQuery("goodsName", "小米");
    // FuzzyQueryBuilder queryBuilder = QueryBuilders.fuzzyQuery("goodsName", "大米");
    // queryBuilder.fuzziness(Fuzziness.ONE);
    // QueryBuilder queryBuilder = QueryBuilders.rangeQuery("price").gte(1000).lte(2000);

    // bool query: price between 1000 and 8000, and goodsName must not contain the term "华为"
    BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
    queryBuilder.must(QueryBuilders.rangeQuery("price").gte(1000).lte(8000));
    queryBuilder.mustNot(QueryBuilders.termQuery("goodsName", "华为"));

    SearchResponse searchResponse = client.prepareSearch("heima").setQuery(queryBuilder).get();
    SearchHits searchHits = searchResponse.getHits();
    long totalHits = searchHits.getTotalHits();
    System.out.println("Total hits: " + totalHits);

    // Convert each hit's JSON source into a Goods object.
    SearchHit[] hits = searchHits.getHits();
    for (SearchHit hit : hits) {
        String sourceAsString = hit.getSourceAsString();
        Goods goods = JSON.parseObject(sourceAsString, Goods.class);
        System.out.println(goods);
    }
}
IV. Operating indices with the RestAPI (HTTP: 9200)
4.1 Maven coordinates
<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.1.3.RELEASE</version>
</parent>
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-logging</artifactId>
    </dependency>
    <dependency>
        <groupId>com.google.code.gson</groupId>
        <artifactId>gson</artifactId>
        <version>2.8.5</version>
    </dependency>
    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-lang3</artifactId>
        <version>3.8.1</version>
    </dependency>
    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>elasticsearch-rest-high-level-client</artifactId>
        <version>6.4.3</version>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-maven-plugin</artifactId>
        </plugin>
    </plugins>
</build>
4.2 Operating indices with the RestAPI
1. Initialize the client
private RestHighLevelClient client = null;
2. Prepare the POJO (using Lombok)
@Data
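// Insert or update: IndexRequest
Item item = new Item(1L, "大米6X手机", "手机", "小米", 1199.0, "http.jpg");
String jsonStr = gson.toJson(item);
// index "item", type "docs", document id taken from the item
IndexRequest request = new IndexRequest("item", "docs", item.getId().toString());
request.source(jsonStr, XContentType.JSON);
client.index(request, RequestOptions.DEFAULT);
Updating a document
Use the same index call as above: it inserts the document if the id is new and replaces it if the id already exists.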
Getting a document by id
GetRequest request = new GetRequest("item", "docs", "1");
GetResponse getResponse = client.get(request, RequestOptions.DEFAULT);
String sourceAsString = getResponse.getSourceAsString();
Item item = gson.fromJson(sourceAsString, Item.class);
System.out.println(item);
Deleting a document
DeleteRequest deleteRequest = new DeleteRequest("item", "docs", "1");
client.delete(deleteRequest, RequestOptions.DEFAULT);
Bulk-inserting documents
// Prepare the documents:
List<Item> list = new ArrayList<>();
list.add(new Item(1L, "小米手机7", "手机", "小米", 3299.00, "http://image.leyou.com/13123.jpg"));
list.add(new Item(2L, "坚果手机R1", "手机", "锤子", 3699.00, "http://image.leyou.com/13123.jpg"));
list.add(new Item(3L, "华为META10", "手机", "华为", 4499.00, "http://image.leyou.com/13123.jpg"));
list.add(new Item(4L, "小米Mix2S", "手机", "小米", 4299.00, "http://image.leyou.com/13123.jpg"));
list.add(new Item(5L, "荣耀V10", "手机", "华为", 2799.00, "http://image.leyou.com/13123.jpg"));
// Add one IndexRequest per item and send them in a single bulk call.
BulkRequest bulkRequest = new BulkRequest();
for (Item item : list) {
    bulkRequest.add(new IndexRequest("item", "docs", item.getId().toString())
            .source(JSON.toJSONString(item), XContentType.JSON));
}
client.bulk(bulkRequest, RequestOptions.DEFAULT);
Queries
@Test
public void testQuery() throws Exception {
    SearchRequest searchRequest = new SearchRequest("item");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Each call to query(...) replaces the previous one, so only the last call takes effect.
    // Keep only the query you want to run.
    searchSourceBuilder.query(QueryBuilders.matchAllQuery());
    searchSourceBuilder.query(QueryBuilders.termQuery("title", "小米"));
    searchSourceBuilder.query(QueryBuilders.matchQuery("title", "小米手机"));
    searchSourceBuilder.query(QueryBuilders.fuzzyQuery("title", "大米").fuzziness(Fuzziness.ONE));
    searchSourceBuilder.query(QueryBuilders.rangeQuery("price").gte(3000).lte(4000));
    searchSourceBuilder.query(QueryBuilders.boolQuery()
            .must(QueryBuilders.termQuery("title", "手机"))
            .must(QueryBuilders.rangeQuery("price").gte(3000).lte(3500)));
    searchRequest.source(searchSourceBuilder);

    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    SearchHits searchHits = searchResponse.getHits();
    long total = searchHits.getTotalHits();
    System.out.println("Total hits: " + total);

    // Convert each hit's JSON source into an Item object.
    SearchHit[] hits = searchHits.getHits();
    for (SearchHit hit : hits) {
        String sourceAsString = hit.getSourceAsString();
        Item item = JSON.parseObject(sourceAsString, Item.class);
        System.out.println(item);
    }
}
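Filtering
1) Filtering the returned fields
// Return only title and category; the second argument lists fields to exclude.
searchSourceBuilder.fetchSource(new String[]{"title", "category"}, null);
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
2) Filtering the query results
searchSourceBuilder.query(QueryBuilders.termQuery("title", "手机"));
// The post filter narrows the hits after the query has run.
searchSourceBuilder.postFilter(QueryBuilders.termQuery("brand", "小米"));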
Paging
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
searchSourceBuilder.from(0); // offset of the first hit to return
searchSourceBuilder.size(3); // number of hits per page
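Note that from is an offset counted in documents, not a page number. A minimal sketch of the conversion, assuming zero-based page numbers; the class and method names are hypothetical and not part of any Elasticsearch API:

```java
class PagingSketch {
    // Converts a zero-based page number and a page size into the "from" offset.
    static int fromOffset(int page, int size) {
        if (page < 0 || size <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and size must be > 0");
        }
        // multiplyExact throws on int overflow instead of wrapping around
        return Math.multiplyExact(page, size);
    }

    public static void main(String[] args) {
        int size = 3;
        for (int page = 0; page < 3; page++) {
            System.out.println("page " + page + " -> from " + fromOffset(page, size));
        }
    }
}
```

The result would then be passed to searchSourceBuilder.from(...) together with searchSourceBuilder.size(size).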
Sorting
searchSourceBuilder.sort("id", SortOrder.ASC); // argument 1: the field to sort by; argument 2: the sort order
Highlighting
Build the highlight conditions:
searchSourceBuilder.query(QueryBuilders.termQuery("title", "小米"));
HighlightBuilder highlightBuilder = new HighlightBuilder();
highlightBuilder.preTags("<font style='color:red'>");
highlightBuilder.postTags("</font>");
highlightBuilder.field("title");
searchSourceBuilder.highlighter(highlightBuilder);
Parse the highlighted result:
for (SearchHit hit : hits) {
    Map<String, HighlightField> highlightFields = hit.getHighlightFields();
    HighlightField highlightField = highlightFields.get("title");
    // The first fragment holds the title with the highlight tags inserted.
    String title = highlightField.getFragments()[0].toString();
    String sourceAsString = hit.getSourceAsString();
    Item item = JSON.parseObject(sourceAsString, Item.class);
    // Replace the plain title with the highlighted one.
    item.setTitle(title);
    System.out.println(item);
}
Aggregation
Requirement: count the documents per brand.
Build the conditions:
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
searchSourceBuilder.aggregation(AggregationBuilders.terms("brandAvg").field("brand"));
Parse the result:
Aggregations aggregations = searchResponse.getAggregations();
Terms terms = aggregations.get("brandAvg");
List<? extends Terms.Bucket> buckets = terms.getBuckets();
// Each bucket is one brand together with its document count.
for (Terms.Bucket bucket : buckets) {
    System.out.println(bucket.getKeyAsString() + ":" + bucket.getDocCount());
}
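To make the bucket idea concrete, here is a plain-Java sketch of what a terms aggregation on brand computes: one bucket per distinct value, each with a document count. It runs in memory with no Elasticsearch involved, the class and method names are hypothetical, and the brand values are those of the five sample items from the bulk insert. Unlike Elasticsearch, which by default orders buckets by document count, this sketch keeps first-seen order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TermsBucketSketch {
    // Groups values into buckets and counts how many times each value occurs.
    static Map<String, Long> countByBrand(List<String> brands) {
        Map<String, Long> buckets = new LinkedHashMap<>();
        for (String brand : brands) {
            buckets.merge(brand, 1L, Long::sum);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> brands = Arrays.asList("小米", "锤子", "华为", "小米", "华为");
        for (Map.Entry<String, Long> bucket : countByBrand(brands).entrySet()) {
            System.out.println(bucket.getKey() + ":" + bucket.getValue());
        }
    }
}
```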
V. Operating indices with SpringDataElasticsearch
1. Set up the environment
1) Add the dependency
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
2) Create the bootstrap class
@SpringBootApplication
public class EsApplication {
    public static void main(String[] args) {
        SpringApplication.run(EsApplication.class, args);
    }
}
3) Add the configuration file application.yml
spring:
  data:
    elasticsearch:
      cluster-name: leyou-elastic
      cluster-nodes: 127.0.0.1:9301,127.0.0.1:9302,127.0.0.1:9303
4) Create a test class and inject the template that SDE provides
@RunWith(SpringRunner.class)
@SpringBootTest
public class SpringDataEsManager {
    @Autowired
    private ElasticsearchTemplate elasticsearchTemplate;
}
2. Working with the index and mapping
Step 1: prepare a POJO and declare how it maps to the index
@Data
@AllArgsConstructor
@NoArgsConstructor
@Document(indexName = "leyou", type = "goods", shards = 3, replicas = 1)
public class Goods implements Serializable {
    @Field(type = FieldType.Long)
    private Long id;
    @Field(type = FieldType.Text, analyzer = "ik_max_word", store = true)
    private String title; // title
    @Field(type = FieldType.Keyword, index = true, store = true)
    private String category; // category
    @Field(type = FieldType.Keyword, index = true, store = true)
    private String brand; // brand
    @Field(type = FieldType.Double, index = true, store = true)
    private Double price; // price
    @Field(type = FieldType.Keyword, index = false, store = true)
    private String images; // image URL
}
Step 2: create the index and mapping
@Test
public void addIndexAndMapping() {
    // elasticsearchTemplate.createIndex(Goods.class); // create the index from the annotations on the POJO
    elasticsearchTemplate.putMapping(Goods.class); // create the mapping from the annotations on the POJO
}
3. Working with documents
// Insert or update
// Goods goods = new Goods(1L, "大米6X手机", "手机", "小米", 1199.0, "http.jpg");
// goodsRespository.save(goods); // save or update
// Find by id
// Optional<Goods> optional = goodsRespository.findById(1L);
// Goods goods = optional.get();
// System.out.println(goods);
// Delete
// goodsRespository.deleteById(1L);
// Bulk insert
/* List<Goods> list = new ArrayList<>();
list.add(new Goods(1L, "小米手机7", "手机", "小米", 3299.00, "http://image.leyou.com/13123.jpg"));
list.add(new Goods(2L, "坚果手机R1", "手机", "锤子", 3699.00, "http://image.leyou.com/13123.jpg"));
list.add(new Goods(3L, "华为META10", "手机", "华为", 4499.00, "http://image.leyou.com/13123.jpg"));
list.add(new Goods(4L, "小米Mix2S", "手机", "小米", 4299.00, "http://image.leyou.com/13123.jpg"));
list.add(new Goods(5L, "荣耀V10", "手机", "华为", 2799.00, "http://image.leyou.com/13123.jpg"));
goodsRespository.saveAll(list);*/
4. Queries
4.1 Queries built into goodsRespository
// Iterable<Goods> goodsList = goodsRespository.findAll(); // find all
// Iterable<Goods> goodsList = goodsRespository.findAll(Sort.by(Sort.Direction.ASC, "price")); // sorted
// Paging: page numbers start at 0 (the first page); the page size here is 3
Iterable<Goods> goodsList = goodsRespository.findAll(PageRequest.of(0, 3));
for (Goods goods : goodsList) {
    System.out.println(goods);
}
4.2 Custom query methods
Methods declared in the repository interface according to the naming rules can be used directly, without writing an implementation.
public interface GoodsRespository extends ElasticsearchRepository<Goods, Long> {
    List<Goods> findByTitle(String title);
    List<Goods> findByBrand(String brand);
    List<Goods> findByTitleOrBrand(String title, String brand);
    List<Goods> findByPriceBetween(Double low, Double high);
    // Parameters are bound in the order the properties appear in the method name.
    List<Goods> findByBrandAndCategoryAndPriceBetween(String brand, String category, Double low, Double high);
}
Usage:
// List<Goods> goodsList = goodsRespository.findByTitle("手机");
List<Goods> goodsList = goodsRespository.findByBrandAndCategoryAndPriceBetween("小米", "手机", 4000.0, 5000.0);
for (Goods goods : goodsList) {
    System.out.println(goods);
}
5. Combining SpringDataElasticsearch with native API queries
1) Querying with the native builders
@Test
public void testQuery() {
    NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder();
    nativeSearchQueryBuilder.withQuery(QueryBuilders.termQuery("title", "小米"));
    // nativeSearchQueryBuilder.withQuery(QueryBuilders.matchAllQuery());
    // nativeSearchQueryBuilder.withPageable(PageRequest.of(0, 3, Sort.by(Sort.Direction.DESC, "price")));
    nativeSearchQueryBuilder.addAggregation(AggregationBuilders.terms("brandAvg").field("brand"));
    AggregatedPage<Goods> aggregatedPage = elasticsearchTemplate.queryForPage(
            nativeSearchQueryBuilder.build(), Goods.class, new GoodsHighLightResultMapper());

    // Aggregation result: one bucket per brand with its document count.
    Aggregations aggregations = aggregatedPage.getAggregations();
    Terms terms = aggregations.get("brandAvg");
    List<? extends Terms.Bucket> buckets = terms.getBuckets();
    for (Terms.Bucket bucket : buckets) {
        System.out.println(bucket.getKeyAsString() + ":" + bucket.getDocCount());
    }

    // The matching documents of the current page.
    List<Goods> content = aggregatedPage.getContent();
    for (Goods goods : content) {
        System.out.println(goods);
    }
}
2) Handling highlighting yourself
You need to write your own implementation class to process the highlighted results.
class GoodsHighLightResultMapper implements SearchResultMapper {
    @Override
    public <T> AggregatedPage<T> mapResults(SearchResponse searchResponse, Class<T> aClass, Pageable pageable) {
        List<T> content = new ArrayList<>();
        Aggregations aggregations = searchResponse.getAggregations();
        String scrollId = searchResponse.getScrollId();
        SearchHits searchHits = searchResponse.getHits();
        long total = searchHits.getTotalHits();
        float maxScore = searchHits.getMaxScore();
        for (SearchHit searchHit : searchHits) {
            String sourceAsString = searchHit.getSourceAsString();
            T t = JSON.parseObject(sourceAsString, aClass);
            Map<String, HighlightField> highlightFields = searchHit.getHighlightFields();
            HighlightField highlightField = highlightFields.get("title");
            // The hit has no highlight entry when the query did not request highlighting on title.
            if (highlightField != null) {
                String title = highlightField.getFragments()[0].toString();
                try {
                    // Overwrite the title property with the highlighted fragment.
                    BeanUtils.setProperty(t, "title", title);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
            content.add(t);
        }
        // Constructor arguments: List<T> content, Pageable pageable, long total,
        // Aggregations aggregations, String scrollId, float maxScore
        return new AggregatedPageImpl<T>(content, pageable, total, aggregations, scrollId, maxScore);
    }
}
3) Usage