  • search(16) - elastic4s - embedded documents: nested and join

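       Users coming from the SQL world often find the way ES maintains relationships between documents unfamiliar. ES is a distributed data store that handles individual flat documents efficiently, and it does not support joining documents the way a relational database joins tables. Yet hardly any database application can avoid tree-shaped relationships, because the business model itself calls for them. In ES, the parent and child data of both the nested and the join type are stored in the same index, and the concept of a table (doc_type) no longer exists. At the operational level, however, ES provides a relation type to support operations on parent-child data. The nested type is generally used for embedded data that is relatively stable, because every update to a nested object requires reindexing the whole document. With the join type, the two ends of the relationship can be updated independently, which is far more convenient.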

    Let us first demonstrate the nested data type. A nested field can be declared in the mapping to represent embedded documents, as follows:

      val fruitMapping = client.execute(
        putMapping("fruits").fields(
          KeywordField("code"),
          SearchAsYouTypeField("name")
            .fields(KeywordField("keyword")),
          floatField("price"),
          NestedField("location").fields(
            KeywordField("shopid"),
            textField("shopname"),
            longField("qty"))
          )
      ).await

    This code produces the following mapping:

    {
      "fruits" : {
        "mappings" : {
          "properties" : {
            "code" : {
              "type" : "keyword"
            },
            "location" : {
              "type" : "nested",
              "properties" : {
                "qty" : {
                  "type" : "long"
                },
                "shopid" : {
                  "type" : "keyword"
                },
                "shopname" : {
                  "type" : "text"
                }
              }
            },
            "name" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword"
                }
              }
            },
            "price" : {
              "type" : "float"
            }
          }
        }
      }
    }

    location is a nested field whose embedded documents contain the fields shopid, shopname and qty. The following example adds several documents containing location to the fruits index:

      val f1 = indexInto("fruits").id("f001")
          .fields(
            "code" -> "f001",
            "name" -> "东莞荔枝",
            "price" -> 11.5,
            "location" -> List(Map(
              "shopid" -> "s001",
              "shopname" -> "中心店",
              "qty" -> 500.0
              ),
              Map(
                "shopid" -> "s002",
                "shopname" -> "东门店",
                "qty" -> 0.0
              )
            )
          )
      val f2 = indexInto("fruits").id("f002")
        .fields(
          "code" -> "f002",
          "name" -> "陕西富士苹果",
          "price" -> 11.5,
          "location" -> List(Map(
            "shopid" -> "s001",
            "shopname" -> "中心店",
            "qty" -> 300.0
          ),
            Map(
              "shopid" -> "s003",
              "shopname" -> "龙岗店",
              "qty" -> 200.0
            )
          )
        )
      val f3 = indexInto("fruits").id("f003")
        .fields(
          "code" -> "f003",
          "name" -> "进口菲律宾香蕉",
          "price" -> 5.3,
          "location" -> List(Map(
            "shopid" -> "s001",
            "shopname" -> "中心店",
            "qty" -> 300.0
          ),
            Map(
              "shopid" -> "s003",
              "shopname" -> "龙岗店",
              "qty" -> 200.0
            ),
            Map(
              "shopid" -> "s002",
              "shopname" -> "东门店",
              "qty" -> 200.0
            )
          )
        )
      val newIndex = for {
         _ <- client.execute(f1)
         _ <- client.execute(f2)
         _ <- client.execute(f3)
      } yield ("three records added successfully")
    
      newIndex.onComplete {
        case Success(trb) => println(s"${trb}")
        case Failure(err) => println(s"error: ${err.getMessage}")
      }

    With elastic4s it is fairly easy to update nested data. Here is an example of updating a nested document:

      val f002 = client.execute(get("fruits","f002").fetchSourceInclude("location")).await
      val locs: List[Map[String,Any]] = f002.result.source("location").asInstanceOf[List[Map[String,Any]]]
      val newloc = Map("shopid" -> "s004","shopname" -> "宝安店", "qty" -> 23)
      val newlocs = locs.foldLeft(List[Map[String,Any]]()) { (b, m) =>
        if (m("shopid") != newloc("shopid"))
          m :: b
        else b
      }
    
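      val newdoc = updateById("fruits","f002")
        .doc(
          Map(
            "location" -> (newloc :: newlocs)
          )
        )
      // send the update request; without this call nothing is written to ES
      val updateResult = client.execute(newdoc).await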

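    In the example above, a new embedded document s004 needs to be written into document f002. We first read the existing location list from f002, drop any existing s004 entry, prepend the new entry to the list, and then update f002 with the result.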
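
The read-modify-write step described above can be isolated into a small pure function, which makes it easy to check without a running cluster. The following is a minimal sketch in plain Scala; `NestedUpdate` and `upsertByShopId` are hypothetical names chosen for illustration and are not part of elastic4s. Unlike the foldLeft version, this variant keeps the retained entries in their original order.

```scala
object NestedUpdate {
  // One embedded location entry, in the same shape the article uses.
  type Loc = Map[String, Any]

  // Drop any existing entry with the same shopid, then prepend the new one.
  def upsertByShopId(locs: List[Loc], newLoc: Loc): List[Loc] =
    newLoc :: locs.filterNot(_.get("shopid") == newLoc.get("shopid"))
}
```

The resulting list would then be passed as the "location" value of the update document, exactly as in the example above.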

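    As mentioned earlier, the join type is also implemented within a single index. Suppose I want to record the purchase history of each fruit, which means fruit now needs a child document, purchase_history. This purchase_history is defined in the same mapping: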

      val fruitMapping = client.execute(
        putMapping("fruits").fields(
          KeywordField("code"),
          SearchAsYouTypeField("name")
            .fields(KeywordField("keyword")),
          floatField("price"),
          NestedField("location").fields(
            KeywordField("shopid"),
            textField("shopname"),
            longField("qty")),
          // purchase_history
          keywordField("supplier_code"),
          textField("supplier_name"),
          dateField("purchase_date")
            .ignoreMalformed(true)
            .format("strict_date_optional_time||epoch_millis"),
          joinField("purchase_history")
            .relation("fruit","purchase")
        )
      ).await

    Here is an example of indexing the top-level parent documents:

      val f1 = indexInto("fruits").id("f001").routing("f001")
          .fields(
            "code" -> "f001",
            "name" -> "东莞荔枝",
            "price" -> 11.5,
            "location" -> List(Map(
              "shopid" -> "s001",
              "shopname" -> "中心店",
              "qty" -> 500.0
              ),
              Map(
                "shopid" -> "s002",
                "shopname" -> "东门店",
                "qty" -> 0.0
              )
            ),
            "purchase_history" -> "fruit"
          )
      val f2 = indexInto("fruits").id("f002").routing("f002")
        .fields(
          "code" -> "f002",
          "name" -> "陕西富士苹果",
          "price" -> 11.5,
          "location" -> List(Map(
            "shopid" -> "s001",
            "shopname" -> "中心店",
            "qty" -> 300.0
          ),
            Map(
              "shopid" -> "s003",
              "shopname" -> "龙岗店",
              "qty" -> 200.0
            )
          ),
          "purchase_history" -> "fruit"
        )
      val f3 = indexInto("fruits").id("f003").routing("f003")
        .fields(
          "code" -> "f003",
          "name" -> "进口菲律宾香蕉",
          "price" -> 5.3,
          "location" -> List(Map(
            "shopid" -> "s001",
            "shopname" -> "中心店",
            "qty" -> 300.0
          ),
            Map(
              "shopid" -> "s003",
              "shopname" -> "龙岗店",
              "qty" -> 200.0
            ),
            Map(
              "shopid" -> "s002",
              "shopname" -> "东门店",
              "qty" -> 200.0
            )
          ),
          "purchase_history" -> "fruit"
        )
      val newIndex = for {
         _ <- client.execute(f1)
         _ <- client.execute(f2)
         _ <- client.execute(f3)
      } yield ("three records added successfully")

    Indexing child documents with elastic4s looks like this:

      val h1 = indexInto("fruits").id("h001").routing("f003")
        .fields(
          "supplier_code" -> "v001",
          "supplier_name" -> "百果园",
          "purchase_date" -> "2020-02-09",
          "purchase_history" -> Child("purchase", "f003"))
    
      val h2 = indexInto("fruits").id("h002").routing("f002")
        .fields(
          "supplier_code" -> "v001",
          "supplier_name" -> "百果园",
          "purchase_date" -> "2019-10-11",
          "purchase_history" -> Child("purchase", "f002"))
    
      val h3 = indexInto("fruits").id("h003").routing("f002")
        .fields(
          "supplier_code" -> "v002",
          "supplier_name" -> "华南城花果批发市场",
          "purchase_date" -> "2020-01-23",
          "purchase_history" -> Child("purchase", "f002"))
    
    
      val childIndex = for {
        _ <- client.execute(h1)
        _ <- client.execute(h2)
        _ <- client.execute(h3)
      } yield ("three child records added successfully")
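
In the requests above, each child is routed with its parent's id, and each parent is routed with its own id, so that a parent and its children end up on the same shard, which the ES join field requires. The sketch below models that convention in plain Scala, with no elastic4s dependency. `JoinDocs`, `Doc`, `parentDoc` and `childDoc` are hypothetical names used only for illustration; the name/parent shape of the join value mirrors what the query output later in this article prints for purchase_history.

```scala
object JoinDocs {
  // Plain-Scala stand-in for an index request: document id, routing key, source fields.
  final case class Doc(id: String, routing: String, fields: Map[String, Any])

  // A parent document is routed by its own id and tagged with the parent relation name.
  def parentDoc(id: String, fields: Map[String, Any]): Doc =
    Doc(id, routing = id, fields + ("purchase_history" -> "fruit"))

  // A child document is routed by its parent's id and records that id in the join value.
  def childDoc(id: String, parentId: String, fields: Map[String, Any]): Doc =
    Doc(id, routing = parentId,
      fields + ("purchase_history" -> Map("name" -> "purchase", "parent" -> parentId)))
}
```

Deriving the routing key from the parent id in one place, rather than typing it twice per request, removes the risk of a child landing on a different shard than its parent.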

    The fruits index now contains both kinds of embedded data, nested and join. Let us try the various ways of reading them. First, nested data can be read with nestedQuery:

      val qNested = search("fruits").query(
        nestedQuery("location").query(
          matchQuery("location.shopname","中心")
        )
      )
      println(s"${qNested.show}")
      val nestedResult = client.execute(qNested).await
      if(nestedResult.isSuccess)
        nestedResult.result.hits.hits.foreach(m => println(s"${m.sourceAsMap}"))
      else println(s"Error: ${nestedResult.error.causedBy.getOrElse("unknown")}")
    
    ...
    
    POST:/fruits/_search?
    StringEntity({"query":{"nested":{"path":"location","query":{"match":{"location.shopname":{"query":"中心"}}}}}},Some(application/json))
    HashMap(name -> 东莞荔枝, location -> List(Map(shopid -> s001, shopname -> 中心店, qty -> 500.0), Map(shopid -> s002, shopname -> 东门店, qty -> 0.0)), price -> 11.5, purchase_history -> fruit, code -> f001)
    HashMap(name -> 进口菲律宾香蕉, location -> List(Map(shopid -> s001, shopname -> 中心店, qty -> 300.0), Map(shopid -> s003, shopname -> 龙岗店, qty -> 200.0), Map(shopid -> s002, shopname -> 东门店, qty -> 200.0)), price -> 5.3, purchase_history -> fruit, code -> f003)
    HashMap(name -> 陕西富士苹果, location -> List(Map(shopname -> 宝安店, qty -> 23, shopid -> s004), Map(shopname -> 龙岗店, qty -> 200.0, shopid -> s003), Map(shopname -> 中心店, qty -> 300.0, shopid -> s001)), price -> 11.5, purchase_history -> fruit, code -> f002)

    Child documents of a join field can be read with a ParentId query, given the parent's id:

      val qPid = search("fruits").query(
         ParentIdQuery("purchase","f002")
      )
      println(s"${qPid.show}")
    
      val pidResult = client.execute(qPid).await
      if(pidResult.isSuccess)
        pidResult.result.hits.hits.foreach(m => println(s"${m.sourceAsMap}"))
      else println(s"Error: ${pidResult.error.causedBy.getOrElse("unknown")}")
    
    ...
    
    POST:/fruits/_search?
    StringEntity({"query":{"parent_id":{"type":"purchase","id":"f002"}}},Some(application/json))
    Map(supplier_code -> v001, supplier_name -> 百果园, purchase_date -> 2019-10-11, purchase_history -> Map(name -> purchase, parent -> f002))
    Map(supplier_code -> v002, supplier_name -> 华南城花果批发市场, purchase_date -> 2020-01-23, purchase_history -> Map(name -> purchase, parent -> f002))

    Parent documents of a join field can be retrieved by searching their children with hasChild:

      val qHaschild = search("fruits").query(
         hasChildQuery("purchase",
           matchQuery("supplier_name","百果")
         )
      )
      println(s"${qHaschild.show}")
      val haschildResult = client.execute(qHaschild).await
      if(haschildResult.isSuccess)
        haschildResult.result.hits.hits.foreach(m => println(s"${m.sourceAsMap}"))
      else println(s"Error: ${haschildResult.error.causedBy.getOrElse("unknown")}")
    
    ...
    
    POST:/fruits/_search?
    StringEntity({"query":{"has_child":{"type":"purchase","score_mode":"none","query":{"match":{"supplier_name":{"query":"百果"}}}}}},Some(application/json))
    HashMap(name -> 进口菲律宾香蕉, location -> List(Map(shopid -> s001, shopname -> 中心店, qty -> 300.0), Map(shopid -> s003, shopname -> 龙岗店, qty -> 200.0), Map(shopid -> s002, shopname -> 东门店, qty -> 200.0)), price -> 5.3, purchase_history -> fruit, code -> f003)
    HashMap(name -> 陕西富士苹果, location -> List(Map(shopname -> 宝安店, qty -> 23, shopid -> s004), Map(shopname -> 龙岗店, qty -> 200.0, shopid -> s003), Map(shopname -> 中心店, qty -> 300.0, shopid -> s001)), price -> 11.5, purchase_history -> fruit, code -> f002)

    Child documents of a join field can likewise be retrieved by searching their parents:

     val qHasparent= search("fruits").query(
        hasParentQuery("fruit",
          nestedQuery("location").query(
            matchQuery("location.shopname","中心")
          ),false
        )
      )
      println(s"${qHasparent.show}")
      val hasparentResult = client.execute(qHasparent).await
      if(hasparentResult.isSuccess)
        hasparentResult.result.hits.hits.foreach(m => println(s"${m.sourceAsMap}"))
      else println(s"Error: ${hasparentResult.error.causedBy.getOrElse("unknown")}")
    
    ...
    
    POST:/fruits/_search?
    StringEntity({"query":{"has_parent":{"parent_type":"fruit","query":{"nested":{"path":"location","query":{"match":{"location.shopname":{"query":"中心"}}}}}}}},Some(application/json))
    Map(supplier_code -> v001, supplier_name -> 百果园, purchase_date -> 2020-02-09, purchase_history -> Map(name -> purchase, parent -> f003))
    Map(supplier_code -> v001, supplier_name -> 百果园, purchase_date -> 2019-10-11, purchase_history -> Map(name -> purchase, parent -> f002))
    Map(supplier_code -> v002, supplier_name -> 华南城花果批发市场, purchase_date -> 2020-01-23, purchase_history -> Map(name -> purchase, parent -> f002))

    The example above is slightly more involved: we want all child documents whose parent document contains an embedded nested document with location.shopname matching "中心".
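
To make the direction of the two join queries concrete, here is a minimal in-memory model in plain Scala. It only sketches the semantics and is not how ES executes these queries: `JoinSemantics`, `Fruit`, `Purchase`, `hasChild` and `hasParent` are hypothetical names, and a plain Scala predicate stands in for the analyzed match query, so it only approximates real text matching.

```scala
object JoinSemantics {
  final case class Fruit(id: String, shopNames: List[String])
  final case class Purchase(id: String, parentId: String, supplierName: String)

  // has_child: search the children, return the parents of the matching children.
  def hasChild(fruits: List[Fruit], purchases: List[Purchase])
              (p: Purchase => Boolean): List[Fruit] = {
    val parentIds = purchases.filter(p).map(_.parentId).toSet
    fruits.filter(f => parentIds(f.id))
  }

  // has_parent: search the parents, return the children of the matching parents.
  def hasParent(fruits: List[Fruit], purchases: List[Purchase])
               (p: Fruit => Boolean): List[Purchase] = {
    val matchingIds = fruits.filter(p).map(_.id).toSet
    purchases.filter(c => matchingIds(c.parentId))
  }
}
```

With the sample data from this article, a has_parent condition on the shop name 中心店 selects all three fruits, so all three purchase records come back, which agrees with the output shown above.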

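    These examples mainly show how to obtain one side of a parent-child relationship by searching the other: searching children to get the matching parents, or searching parents to get their children. In other words, the documents being searched and the documents being returned (parent and child, or child and parent) are not of the same kind. With inner_hits we can also retrieve, in the same response, the documents that satisfied the search condition. For example, nestedQuery.inner():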

     val qNested = search("fruits").query(
        nestedQuery("location").query(
          matchQuery("location.shopname","中心")
        ).inner(InnerHit("locations"))
      )
      println(s"${qNested.show}")
      val nestedResult = client.execute(qNested).await
      if(nestedResult.isSuccess) {
        nestedResult.result.hits.hits.foreach{ m =>
            println(s"${m.sourceAsMap}")
            m.innerHits.foreach { i =>
              val n = i._1
              i._2.hits.foreach(h => println(s"$n, ${h.source}"))
            }
        }
      } else println(s"Error: ${nestedResult.error.causedBy.getOrElse("unknown")}")
    
    ...
    
    POST:/fruits/_search?
    StringEntity({"query":{"nested":{"path":"location","query":{"match":{"location.shopname":{"query":"中心"}}},"inner_hits":{"name":"locations"}}}},Some(application/json))
    HashMap(name -> 东莞荔枝, location -> List(Map(shopid -> s001, shopname -> 中心店, qty -> 500.0), Map(shopid -> s002, shopname -> 东门店, qty -> 0.0)), price -> 11.5, purchase_history -> fruit, code -> f001)
    locations, Map(shopid -> s001, shopname -> 中心店, qty -> 500.0)
    HashMap(name -> 进口菲律宾香蕉, location -> List(Map(shopid -> s001, shopname -> 中心店, qty -> 300.0), Map(shopid -> s003, shopname -> 龙岗店, qty -> 200.0), Map(shopid -> s002, shopname -> 东门店, qty -> 200.0)), price -> 5.3, purchase_history -> fruit, code -> f003)
    locations, Map(shopid -> s001, shopname -> 中心店, qty -> 300.0)
    HashMap(name -> 陕西富士苹果, location -> List(Map(shopname -> 宝安店, qty -> 23, shopid -> s004), Map(shopname -> 龙岗店, qty -> 200.0, shopid -> s003), Map(shopname -> 中心店, qty -> 300.0, shopid -> s001)), price -> 11.5, purchase_history -> fruit, code -> f002)
    locations, Map(shopname -> 中心店, qty -> 300.0, shopid -> s001)

    hasChildQuery.innerHit():

      val qHaschild = search("fruits").query(
         hasChildQuery("purchase",
           matchQuery("supplier_name","百果")
         ).innerHit("purchases")
      )
      println(s"${qHaschild.show}")
      val haschildResult = client.execute(qHaschild).await
      if(haschildResult.isSuccess) {
        haschildResult.result.hits.hits.foreach{m =>
          println(s"${m.sourceAsMap}")
          m.innerHits.foreach { i =>
            val n = i._1
            i._2.hits.foreach(h => println(s"$n, ${h.source}"))
          }
        }
      } else println(s"Error: ${haschildResult.error.causedBy.getOrElse("unknown")}")
    
    ...
    
    POST:/fruits/_search?
    StringEntity({"query":{"has_child":{"type":"purchase","score_mode":"none","query":{"match":{"supplier_name":{"query":"百果"}}},"inner_hits":{"name":"purchases"}}}},Some(application/json))
    HashMap(name -> 进口菲律宾香蕉, location -> List(Map(shopid -> s001, shopname -> 中心店, qty -> 300.0), Map(shopid -> s003, shopname -> 龙岗店, qty -> 200.0), Map(shopid -> s002, shopname -> 东门店, qty -> 200.0)), price -> 5.3, purchase_history -> fruit, code -> f003)
    purchases, Map(supplier_code -> v001, supplier_name -> 百果园, purchase_date -> 2020-02-09, purchase_history -> Map(name -> purchase, parent -> f003))
    HashMap(name -> 陕西富士苹果, location -> List(Map(shopname -> 宝安店, qty -> 23, shopid -> s004), Map(shopname -> 龙岗店, qty -> 200.0, shopid -> s003), Map(shopname -> 中心店, qty -> 300.0, shopid -> s001)), price -> 11.5, purchase_history -> fruit, code -> f002)
    purchases, Map(supplier_code -> v001, supplier_name -> 百果园, purchase_date -> 2019-10-11, purchase_history -> Map(name -> purchase, parent -> f002))
    purchases, Map(supplier_code -> v002, supplier_name -> 华南城花果批发市场, purchase_date -> 2020-01-23, purchase_history -> Map(name -> purchase, parent -> f002))

    hasParentQuery.innerHit():

      val qHasparent= search("fruits").query(
        hasParentQuery("fruit",
          nestedQuery("location").query(
            matchQuery("location.shopname","中心")
          ),false
        ).innerHit(InnerHit("fruits"))
      )
      println(s"${qHasparent.show}")
      val hasparentResult = client.execute(qHasparent).await
      if(hasparentResult.isSuccess) {
        hasparentResult.result.hits.hits.foreach{m =>
          println(s"${m.sourceAsMap}")
          m.innerHits.foreach { i =>
            val n = i._1
            i._2.hits.foreach(h => println(s"$n, ${h.source}"))
          }
        }
      } else println(s"Error: ${hasparentResult.error.causedBy.getOrElse("unknown")}")
    
    ...
    
    POST:/fruits/_search?
    StringEntity({"query":{"has_parent":{"parent_type":"fruit","query":{"nested":{"path":"location","query":{"match":{"location.shopname":{"query":"中心"}}}}},"inner_hits":{"name":"fruits"}}}},Some(application/json))
    Map(supplier_code -> v001, supplier_name -> 百果园, purchase_date -> 2020-02-09, purchase_history -> Map(name -> purchase, parent -> f003))
    fruits, HashMap(name -> 进口菲律宾香蕉, location -> List(Map(shopid -> s001, shopname -> 中心店, qty -> 300.0), Map(shopid -> s003, shopname -> 龙岗店, qty -> 200.0), Map(shopid -> s002, shopname -> 东门店, qty -> 200.0)), price -> 5.3, purchase_history -> fruit, code -> f003)
    Map(supplier_code -> v001, supplier_name -> 百果园, purchase_date -> 2019-10-11, purchase_history -> Map(name -> purchase, parent -> f002))
    fruits, HashMap(name -> 陕西富士苹果, location -> List(Map(shopname -> 宝安店, qty -> 23, shopid -> s004), Map(shopname -> 龙岗店, qty -> 200.0, shopid -> s003), Map(shopname -> 中心店, qty -> 300.0, shopid -> s001)), price -> 11.5, purchase_history -> fruit, code -> f002)
    Map(supplier_code -> v002, supplier_name -> 华南城花果批发市场, purchase_date -> 2020-01-23, purchase_history -> Map(name -> purchase, parent -> f002))
    fruits, HashMap(name -> 陕西富士苹果, location -> List(Map(shopname -> 宝安店, qty -> 23, shopid -> s004), Map(shopname -> 龙岗店, qty -> 200.0, shopid -> s003), Map(shopname -> 中心店, qty -> 300.0, shopid -> s001)), price -> 11.5, purchase_history -> fruit, code -> f002)
  • Original article: https://www.cnblogs.com/tiger-xc/p/12941077.html