  • Counting values with Linux command-line tools

    Scenario:

    Count how many times each value occurs in the category field of the data below.

    Data source:

    es_ip10000.json

    {"_index":"order","_type":"service","_id":"107.151.83.180:22","_score":1,"_source":{"ip":"107.151.83.180","parent_category":["支撑系统"],"category":["其他支撑系统"]}}
    {"_index":"order","_type":"service","_id":"107.151.84.167:22","_score":1,"_source":{"ip":"107.151.84.167","parent_category":["支撑系统"],"category":["其他支撑系统"]}}
    {"_index":"order","_type":"service","_id":"107.151.84.177:22","_score":1,"_source":{"ip":"107.151.84.177","parent_category":["支撑系统"],"category":["其他支撑系统"]}}
    {"_index":"order","_type":"service","_id":"107.152.188.252:1723","_score":1,"_source":{"ip":"107.152.188.252","parent_category":["网络产品"],"category":["路由器"]}}
    {"_index":"order","_type":"service","_id":"107.151.89.125:1025","_score":1,"_source":{"ip":"107.151.89.125"}}
    {"_index":"order","_type":"service","_id":"107.152.58.217:22","_score":1,"_source":{"ip":"107.152.58.217","parent_category":["支撑系统"],"category":["服务"]}}
    {"_index":"order","_type":"subdomain","_id":"107.15.221.83:443","_score":1,"_source":{"ip":"107.15.221.83","parent_category":["办公外设","系统软件"],"category":["打印机","操作系统"]}}
    

    Extract the category field under _source:

    cat es_ip10000.json | jq ._source.category > category.txt

    Output:

    [
      "其他支撑系统"
    ]
    [
      "其他支撑系统"
    ]
    [
      "其他支撑系统"
    ]
    [
      "路由器"
    ]
    null
    [
      "服务"
    ]
    [
      "打印机",
      "操作系统"
    ]
    
    

    Use an editor to remove the commas (,) and square brackets ([ ]).
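The editor step can also be scripted. The following is a minimal sketch, assuming that no category value itself contains a comma or a square bracket (such a value would be mangled). The variable names are illustrative and not from the original post.

```shell
# Sample of what `jq ._source.category` prints for two records from the data above.
category_raw=$(printf '%s\n' '[' '  "打印机",' '  "操作系统"' ']' 'null')

# Delete every '[', ']' and ',' character; bracket-only lines become empty lines.
category_clean=$(printf '%s\n' "$category_raw" | tr -d '[],')

printf '%s\n' "$category_clean"
```

For the real file, the same idea would be `tr -d '[],' < category.txt > category_clean.txt`, where category_clean.txt is a hypothetical output name. Writing to a separate file avoids truncating category.txt while it is still being read.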

    Result after cleanup:

    
      "其他支撑系统"
    
    
      "其他支撑系统"
    
    
      "其他支撑系统"
    
    
      "路由器"
    
    null
    
      "服务"
    
    
      "打印机"
      "操作系统"
    
    

    Sort -> deduplicate -> count -> sort again:

    cat category.txt | sort | uniq -c | sort -n >category_count.txt

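    Notes:

    uniq -c # collapse adjacent identical lines and prefix each with its count

    sort -n # sort numerically, ascending

    sort -r # reverse the sort order (descending)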
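Two details in the notes above are easy to miss: uniq only merges lines that are adjacent, which is why the pipeline sorts first, and sort -n compares numbers rather than text. A small sketch with made-up sample values (not from the data set) shows both:

```shell
# Hypothetical sample: the value "a" appears twice, but not on adjacent lines.
sample=$(printf '%s\n' a b a)

# Without sorting, uniq -c sees three separate runs: a, b, a.
unsorted_runs=$(printf '%s\n' "$sample" | uniq -c | wc -l)

# After sorting, the two "a" lines are adjacent and collapse into one counted line.
sorted_runs=$(printf '%s\n' "$sample" | sort | uniq -c | wc -l)

# sort -n compares numerically, so 9 comes before 10; plain sort compares as text.
numeric_first=$(printf '%s\n' 10 9 | sort -n | head -n 1)
lexical_first=$(printf '%s\n' 10 9 | sort | head -n 1)

echo "uniq -c lines without sort: $unsorted_runs, with sort: $sorted_runs"
echo "first line with sort -n: $numeric_first, with plain sort: $lexical_first"
```

To list the most frequent value first, the two flags can be combined as `sort -rn`.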

    Output (category_count.txt):

          1 null
          1   "操作系统"
          1   "打印机"
          1   "服务"
          1   "路由器"
          3   "其他支撑系统"
         12 
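The last row, with count 12 and no label, is the twelve empty lines left behind by the six pairs of brackets. As a possible alternative, the whole job can be done in one pass by letting jq emit one raw value per line, so no editor step is needed and no empty lines are produced. This is a sketch rather than the author's method: count_categories is a hypothetical helper name, and unlike the output above, records without a category are skipped instead of being counted as null, and the values are printed without quotes.

```shell
# count_categories FILE: hypothetical helper, not part of the original post.
count_categories() {
  # -r prints raw strings without quotes; [] emits one line per array element;
  # the trailing ? skips records whose category is missing (null).
  jq -r '._source.category[]?' "$1" | sort | uniq -c | sort -rn
}

# Run it on the article's file only if that file is present.
if [ -f es_ip10000.json ]; then
  count_categories es_ip10000.json > category_count.txt
fi
```

Using `sort -rn` here puts the most frequent category at the top; replace it with `sort -n` to keep the ascending order used above.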
    
    [Haima's blog] http://www.cnblogs.com/haima/
  • Original article: https://www.cnblogs.com/haima/p/15118877.html