  • Counting statistics with Linux command-line tools

    Scenario:

    Count how many times each value appears in the category field of the data below.

    Data source

    es_ip10000.json

    {"_index":"order","_type":"service","_id":"107.151.83.180:22","_score":1,"_source":{"ip":"107.151.83.180","parent_category":["支撑系统"],"category":["其他支撑系统"]}}
    {"_index":"order","_type":"service","_id":"107.151.84.167:22","_score":1,"_source":{"ip":"107.151.84.167","parent_category":["支撑系统"],"category":["其他支撑系统"]}}
    {"_index":"order","_type":"service","_id":"107.151.84.177:22","_score":1,"_source":{"ip":"107.151.84.177","parent_category":["支撑系统"],"category":["其他支撑系统"]}}
    {"_index":"order","_type":"service","_id":"107.152.188.252:1723","_score":1,"_source":{"ip":"107.152.188.252","parent_category":["网络产品"],"category":["路由器"]}}
    {"_index":"order","_type":"service","_id":"107.151.89.125:1025","_score":1,"_source":{"ip":"107.151.89.125"}}
    {"_index":"order","_type":"service","_id":"107.152.58.217:22","_score":1,"_source":{"ip":"107.152.58.217","parent_category":["支撑系统"],"category":["服务"]}}
    {"_index":"order","_type":"subdomain","_id":"107.15.221.83:443","_score":1,"_source":{"ip":"107.15.221.83","parent_category":["办公外设","系统软件"],"category":["打印机","操作系统"]}}
    

    Extract the category field under _source

    cat es_ip10000.json | jq ._source.category > category.txt
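
As a possible variation, jq can flatten the arrays itself, so that each category lands on its own line without brackets, commas, or quotes. The sketch below is not from the original post: the helper name `extract_categories` and the output file `category_flat.txt` are hypothetical, and it assumes a jq version that supports the `?` error-suppression operator.

```shell
#!/bin/sh
# Hypothetical helper: print one category per line from a file of
# newline-delimited Elasticsearch hits.
extract_categories() {
    # .[]? iterates over the category array; the ? suppresses the error
    # jq would otherwise raise when category is missing (null).
    # -r prints raw strings instead of JSON-quoted ones.
    jq -r '._source.category[]?' "$1"
}

# Run only if the data file from the post is present.
# category_flat.txt is an assumed name, chosen to leave category.txt untouched.
if [ -f es_ip10000.json ]; then
    extract_categories es_ip10000.json > category_flat.txt
fi
```

Records without a category produce no line at all in this variant, so they would not appear in the final counts.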

    Output

    [
      "其他支撑系统"
    ]
    [
      "其他支撑系统"
    ]
    [
      "其他支撑系统"
    ]
    [
      "路由器"
    ]
    null
    [
      "服务"
    ]
    [
      "打印机",
      "操作系统"
    ]
    
    

    Using an editor, remove the commas and square brackets ( , [ ] )

    Result after processing

    
      "其他支撑系统"
    
    
      "其他支撑系统"
    
    
      "其他支撑系统"
    
    
      "路由器"
    
    null
    
      "服务"
    
    
      "打印机"
      "操作系统"
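
The manual editor step above could also be scripted. A minimal sketch, assuming no category name itself contains a comma or square bracket (the helper name `strip_array_punctuation` is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: remove the JSON array punctuation that jq printed.
# In the bracket expression [][,] the leading ] is literal, so the set
# matches ']', '[' and ','.
strip_array_punctuation() {
    sed 's/[][,]//g'
}

# Rewrite category.txt via a temporary file if it exists.
if [ -f category.txt ]; then
    strip_array_punctuation < category.txt > category.tmp &&
        mv category.tmp category.txt
fi
```

Lines that held only a bracket become empty lines, matching the processed result shown above.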
    
    

    Sort -> deduplicate and count -> sort again by count

    cat category.txt | sort | uniq -c | sort -n >category_count.txt

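    Notes:

    uniq -c # collapse adjacent duplicate lines and prefix each with its count

    sort -n # sort numerically (ascending by default)

    sort -r # reverse the sort order (descending)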
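
Because uniq only collapses adjacent duplicates, the input has to be sorted first. The sketch below illustrates this with made-up input; the helper names `sample` and `count_desc` are hypothetical.

```shell
#!/bin/sh
# Made-up sample input with a non-adjacent duplicate.
sample() { printf 'b\na\nb\n'; }

# Without sorting, the two "b" lines are not adjacent, so uniq -c
# reports three separate runs of length 1.
sample | uniq -c

# Sorting first makes duplicates adjacent, so each value is counted once.
# sort -rn then orders the counts numerically, largest first.
count_desc() { sort | uniq -c | sort -rn; }
sample | count_desc
```

Combining `-r` with `-n` gives the most frequent value first, which is often more convenient than the ascending order used in the post.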

    Output:

          1 null
          1   "操作系统"
          1   "打印机"
          1   "服务"
          1   "路由器"
          3   "其他支撑系统"
         12 
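
The last row, a count with no value, comes from the blank lines left where the brackets were removed. One way to keep it out of the result is to drop blank lines and trim the indentation and quotes before counting. This is a sketch under that assumption, with a hypothetical helper name `normalize_categories`:

```shell
#!/bin/sh
# Hypothetical helper: normalise cleaned category lines before counting.
normalize_categories() {
    # 1) delete lines that are empty or contain only whitespace
    # 2) strip leading whitespace
    # 3) strip trailing whitespace
    # 4) strip one pair of surrounding double quotes, if present
    sed -e '/^[[:space:]]*$/d' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e 's/^"\(.*\)"$/\1/'
}

# Recount from category.txt if it exists, in the same ascending order.
if [ -f category.txt ]; then
    normalize_categories < category.txt | sort | uniq -c | sort -n > category_count.txt
fi
```

The `null` row is kept here on purpose, since it records how many entries had no category.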
    
    [Haima's blog] http://www.cnblogs.com/haima/
  • Original article: https://www.cnblogs.com/haima/p/15118877.html