  • (14) Exploring Your Data

    Sample Dataset

    Now that we’ve gotten a glimpse of the basics, let’s try to work on a more realistic dataset. I’ve prepared a sample of fictitious JSON documents of customer bank account information. Each document has the following schema:

    {
        "account_number": 0,
        "balance": 16623,
        "firstname": "Bradshaw",
        "lastname": "Mckenzie",
        "age": 29,
        "gender": "F",
        "address": "244 Columbus Place",
        "employer": "Euron",
        "email": "bradshawmckenzie@euron.com",
        "city": "Hobucken",
        "state": "CO"
    }
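To make the schema concrete, here is a minimal Python sketch that models one account document and checks its field types. The `Account` class and `parse_account` helper are hypothetical names introduced for illustration, and the field types are inferred from the single sample document above rather than taken from any official mapping.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Account:
    # Types are an assumption inferred from the sample document.
    account_number: int
    balance: int
    firstname: str
    lastname: str
    age: int
    gender: str
    address: str
    employer: str
    email: str
    city: str
    state: str


def parse_account(raw: str) -> Account:
    """Parse one JSON document and check each field against its inferred type."""
    data = json.loads(raw)
    for f in fields(Account):
        # A missing key yields None, which fails the type check as well.
        if not isinstance(data.get(f.name), f.type):
            raise ValueError(f"field {f.name!r} should be {f.type.__name__}")
    return Account(**data)


SAMPLE = """
{
    "account_number": 0,
    "balance": 16623,
    "firstname": "Bradshaw",
    "lastname": "Mckenzie",
    "age": 29,
    "gender": "F",
    "address": "244 Columbus Place",
    "employer": "Euron",
    "email": "bradshawmckenzie@euron.com",
    "city": "Hobucken",
    "state": "CO"
}
"""

sample_account = parse_account(SAMPLE)
print(sample_account.firstname, sample_account.balance)
```

A check like this can be handy before indexing, since a document with a wrongly typed field is easier to catch locally than after it has influenced the index mapping.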

    For the curious, this data was generated using www.json-generator.com/, so please ignore the actual values and semantics of the data as these are all randomly generated.


    Loading the Sample Dataset

    You can download the sample dataset (accounts.json) from here. Extract it to our current directory and let’s load it into our cluster as follows:

    curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_doc/_bulk?pretty&refresh" --data-binary "@accounts.json"
    curl "localhost:9200/_cat/indices?v"

    And the response:

    health status index uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    yellow open   bank  l7sSYV2cQXmu6_4rJWVIww   5   1       1000            0    128.6kb        128.6kb

    This means that we have just successfully bulk indexed 1000 documents into the bank index (under the _doc type).

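The curl command above posts accounts.json to the _bulk endpoint with --data-binary, which preserves newlines. That matters because the bulk API expects newline-delimited JSON: an action line followed by a source line for each document, with a trailing newline at the end of the body. The sketch below shows how such a body could be built. It assumes accounts.json follows this alternating layout and that account_number is reused as the document ID. The helper name `to_bulk_body` and the second document are hypothetical.

```python
import json


def to_bulk_body(docs):
    """Build a newline-delimited bulk body: an action line, then a source line, per document."""
    lines = []
    for doc in docs:
        # Assumption: the document ID is taken from account_number.
        lines.append(json.dumps({"index": {"_id": str(doc["account_number"])}}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


docs = [
    # Abbreviated version of the sample document shown earlier.
    {"account_number": 0, "balance": 16623, "firstname": "Bradshaw", "lastname": "Mckenzie"},
    # Hypothetical document, included only to show the alternating layout.
    {"account_number": 1, "balance": 100, "firstname": "Jane", "lastname": "Doe"},
]

bulk_body = to_bulk_body(docs)
print(bulk_body, end="")
```

Writing `bulk_body` to a file and posting it with the same curl command would index two documents instead of 1000.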
     
  • Original article: https://www.cnblogs.com/shuaiandjun/p/10273257.html