  • [svc]sort-uniq

    uniq - report or omit repeated lines

    sort
        -r
        -t
    
    uniq
        -c
    

    What uniq does: it removes adjacent duplicate lines

    [root@n1 data]# cat ip.txt
    10.0.0.9
    10.0.0.8
    10.0.0.7
    10.0.0.7
    10.0.0.8
    10.0.0.8
    10.0.0.9
    
    [root@n1 data]# uniq ip.txt
    10.0.0.9
    10.0.0.8
    10.0.0.7
    10.0.0.8
    10.0.0.9
    

    What sort does: it places identical lines next to each other

    - Make identical lines adjacent
    [root@n1 data]# sort ip.txt
    10.0.0.7
    10.0.0.7
    10.0.0.8
    10.0.0.8
    10.0.0.8
    10.0.0.9
    10.0.0.9
    
    - Remove the duplicate lines, method 1:
    [root@n1 data]# sort ip.txt |uniq
    10.0.0.7
    10.0.0.8
    10.0.0.9
    
    - Method 2:
    [root@n1 data]# sort -u ip.txt
    10.0.0.7
    10.0.0.8
    10.0.0.9
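
The difference between the three approaches shown so far can be checked in one short script. This is a minimal sketch that recreates the sample ip.txt in a temporary file and counts how many lines each command keeps; the variable names are illustrative and not part of the original post.

```shell
#!/bin/sh
# Recreate the sample ip.txt from the listings above in a temporary file.
tmp=$(mktemp)
printf '%s\n' 10.0.0.9 10.0.0.8 10.0.0.7 10.0.0.7 10.0.0.8 10.0.0.8 10.0.0.9 > "$tmp"

# uniq alone only collapses runs of identical neighbouring lines.
adjacent_only=$(uniq "$tmp" | wc -l | tr -d ' ')

# Sorting first makes every duplicate adjacent, so uniq removes all repeats.
sort_then_uniq=$(sort "$tmp" | uniq)

# sort -u does the sorting and the deduplication in a single step.
sort_u=$(sort -u "$tmp")

echo "uniq alone keeps $adjacent_only lines"
echo "sort | uniq keeps $(printf '%s\n' "$sort_then_uniq" | wc -l | tr -d ' ') lines"
rm -f "$tmp"
```

On this data, uniq alone keeps five lines, while both sorted variants keep the same three.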
    

    Deduplicate and count occurrences

    [root@n1 data]# sort ip.txt |uniq -c
    2 10.0.0.7
    3 10.0.0.8
    2 10.0.0.9
    

    Exercise: [Baidu/Sohu interview question] Count how many times each URL occurs

    url.txt
    
    http://www.maotai.com/index.html
    http://www.maotai.com/1.html
    http://post.maotai.com/index.html
    http://mp3.maotai.com/3.html
    http://www.maotai.com/1.html
    http://post.maotai.com/2.html
    
    - Extract the domain from each URL
    [root@n1 data]# awk -F / '{print $3}' url.txt
    www.maotai.com
    www.maotai.com
    post.maotai.com
    mp3.maotai.com
    www.maotai.com
    post.maotai.com
     
    - sort + uniq -c: count each domain (the output is ordered by domain name)
    [root@n1 data]# awk -F / '{print $3}' url.txt|sort|uniq -c
    1 mp3.maotai.com
    2 post.maotai.com
    3 www.maotai.com
    
    - Sort by count, descending:

    Method 1: awk
    [root@n1 data]# awk -F / '{print $3}' url.txt|sort|uniq -c|sort -rn
    3 www.maotai.com
    2 post.maotai.com
    1 mp3.maotai.com
    
    Method 2: cut
    [root@n1 data]# cut -d / -f3 url.txt |sort|uniq -c|sort -rn
    3 www.maotai.com
    2 post.maotai.com
    1 mp3.maotai.com
    
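    Shortcut (only safe for this data: sort -r orders the domains by name, not by count, and the two orders happen to coincide here):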
    [root@n1 data]# cut -d / -f3 url.txt |sort -r|uniq -c
    3 www.maotai.com
    2 post.maotai.com
    1 mp3.maotai.com
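
The whole exercise can be reproduced in one self-contained script. This is a sketch rather than the author's exact method: it writes the sample URLs to a temporary file, ranks the domains with a numeric reverse sort, and then uses two made-up lines ('9 b' and '10 a') to show why the numeric flag matters once counts reach two digits.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL   # fixed collation so the comparison below is predictable

# Sample data from the exercise, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
http://www.maotai.com/index.html
http://www.maotai.com/1.html
http://post.maotai.com/index.html
http://mp3.maotai.com/3.html
http://www.maotai.com/1.html
http://post.maotai.com/2.html
EOF

# Field 3 (split on "/") is the domain; count it, then rank by count.
ranking=$(cut -d / -f3 "$tmp" | sort | uniq -c | sort -rn | awk '{print $1, $2}')
printf '%s\n' "$ranking"

# Hypothetical unpadded counts: plain -r compares text, -rn compares numbers.
lexical_first=$(printf '%s\n' '9 b' '10 a' | sort -r | head -n 1)
numeric_first=$(printf '%s\n' '9 b' '10 a' | sort -rn | head -n 1)
echo "sort -r  puts first: $lexical_first"
echo "sort -rn puts first: $numeric_first"
rm -f "$tmp"
```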
    

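    Sorting by the second column

    sort
        -t  field separator, like awk's -F (fields are then $1, $2, ...) or cut's -d (fields are then picked with -f N)
        -k  which column to sort on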
    
    [root@n1 test]# cat ip.txt
    10.0.0.9 o
    10.0.0.9 a
    10.0.0.8 z
    10.0.0.8 k
    10.0.0.8 c
    10.0.0.7 n
    10.0.0.7 f
    
    [root@n1 test]# sort -t " " -k2 ip.txt
    10.0.0.9 a
    10.0.0.8 c
    10.0.0.7 f
    10.0.0.8 k
    10.0.0.7 n
    10.0.0.9 o
    10.0.0.8 z
    Note: by default sort splits fields at the transition from non-blank to blank characters, so -t can be omitted here
    
    [root@n1 test]# sort -k2 ip.txt
    [root@n1 test]# sort -rk2 ip.txt # reverse order
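
One detail worth knowing about -k2: the key runs from field 2 to the end of the line, not just over field 2. With the two-column ip.txt this makes no difference, but with more columns it can. The sketch below uses two hypothetical three-column lines (not from the original post) to show how -k2,2 limits the key to the second field only.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL   # fixed collation so the result is predictable

# -k2: the key is "x 2" versus "x 1", so the third column decides the order.
open_key_first=$(printf '%s\n' 'a x 2' 'b x 1' | sort -k2 | head -n 1)

# -k2,2: both keys are "x"; the tie is broken by comparing the whole line.
closed_key_first=$(printf '%s\n' 'a x 2' 'b x 1' | sort -k2,2 | head -n 1)

echo "sort -k2   puts first: $open_key_first"
echo "sort -k2,2 puts first: $closed_key_first"
```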
    
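    sort -runtk
        -r --reverse             reverse the order
        -u --unique              drop duplicate lines
        -n --numeric-sort        compare by numeric value
        -t --field-separator=SEP field separator
        -k --key=KEYDEF          sort by the given key
    
    uniq
        -c --count               prefix each line with its number of occurrences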
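
The options above are often combined. As a small sketch, assuming hypothetical name:score records that do not appear in the original post, the following sorts on the second colon-separated field, once as text and once as numbers.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL   # fixed collation so the text comparison is predictable

# Hypothetical records in the form name:score.
records='alice:9
bob:10
carol:2'

# -t: sets the separator, -k2,2 selects field 2, r reverses, n compares numbers.
by_number=$(printf '%s\n' "$records" | sort -t: -k2,2nr)

# Without n the scores are compared as text, so "10" sorts below "2".
by_text=$(printf '%s\n' "$records" | sort -t: -k2,2r)

printf 'numeric:\n%s\n' "$by_number"
printf 'text:\n%s\n' "$by_text"
```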
    

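    Exercise: sort by the third octet of the IP in descending order; when the third octets are equal, sort by the fourth octet in descending order.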

    [root@n1 test]# cat arp.txt
    192.168.0.3 00:e0:4c:41:d2:a5
    192.168.2.2 00:e0:4c:41:d1:7d
    192.168.3.7 00:50:bf:11:94:60
    192.168.3.5 00:e0:4c:43:a3:46
    192.168.2.4 00:0a:eb:6d:08:10
    192.168.1.2 00:01:6c:99:37:47
    192.168.4.9 00:0a:e6:b5:d1:4b
    192.168.0.4 00:0e:1f:51:74:24
    192.168.6.7 00:1d:72:40:b2:e1
    192.168.8.4 00:01:6c:36:5d:64
    192.168.1.22 00:e0:4c:41:ce:73
    192.168.0.15 00:e0:4c:41:d7:0e
    192.168.2.9 00:e0:4c:41:d1:8b
    192.168.0.122 00:16:ec:c5:46:45
    192.168.9.115 00:01:6c:98:f7:07
    192.168.7.111 00:17:31:b6:6e:a9
    
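    sort -t. -k3.1,3.1nr -k4.1,4.3nr arp.txt
        -k          which column
        -k3.1,3.1   from the 1st character of field 3 to the 1st character of field 3
        -k4.1,4.3   from the 1st character of field 4 to the 3rd character of field 4
        nr          compare numerically (n), in reverse order (r)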
    
    [root@n1 test]# sort -t. -k3.1,3.1nr -k4.1,4.3nr arp.txt
    192.168.9.115 00:01:6c:98:f7:07
    192.168.8.4 00:01:6c:36:5d:64
    192.168.7.111 00:17:31:b6:6e:a9
    192.168.6.7 00:1d:72:40:b2:e1
    192.168.4.9 00:0a:e6:b5:d1:4b
    192.168.3.7 00:50:bf:11:94:60
    192.168.3.5 00:e0:4c:43:a3:46
    192.168.2.9 00:e0:4c:41:d1:8b
    192.168.2.4 00:0a:eb:6d:08:10
    192.168.2.2 00:e0:4c:41:d1:7d
    192.168.1.22 00:e0:4c:41:ce:73
    192.168.1.2 00:01:6c:99:37:47
    192.168.0.122 00:16:ec:c5:46:45
    192.168.0.15 00:e0:4c:41:d7:0e
    192.168.0.4 00:0e:1f:51:74:24
    192.168.0.3 00:e0:4c:41:d2:a5
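
The character offsets in -k3.1,3.1 look at only the first digit of the third octet, which is enough here because every third octet in arp.txt has a single digit. A variant that keys on whole fields, -k3,3nr -k4,4nr, avoids that limit; for field 4 the numeric comparison reads the leading digits and stops at the space before the MAC address. The sketch below is an alternative, not the original post's command, and checks that both forms agree on the sample data.

```shell
#!/bin/sh
# Sample arp.txt from the exercise, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
192.168.0.3 00:e0:4c:41:d2:a5
192.168.2.2 00:e0:4c:41:d1:7d
192.168.3.7 00:50:bf:11:94:60
192.168.3.5 00:e0:4c:43:a3:46
192.168.2.4 00:0a:eb:6d:08:10
192.168.1.2 00:01:6c:99:37:47
192.168.4.9 00:0a:e6:b5:d1:4b
192.168.0.4 00:0e:1f:51:74:24
192.168.6.7 00:1d:72:40:b2:e1
192.168.8.4 00:01:6c:36:5d:64
192.168.1.22 00:e0:4c:41:ce:73
192.168.0.15 00:e0:4c:41:d7:0e
192.168.2.9 00:e0:4c:41:d1:8b
192.168.0.122 00:16:ec:c5:46:45
192.168.9.115 00:01:6c:98:f7:07
192.168.7.111 00:17:31:b6:6e:a9
EOF

# Original form: keys limited by character positions inside each field.
char_keys=$(sort -t. -k3.1,3.1nr -k4.1,4.3nr "$tmp")

# Alternative form: whole fields as keys, compared numerically in reverse.
field_keys=$(sort -t. -k3,3nr -k4,4nr "$tmp")

printf '%s\n' "$field_keys"
rm -f "$tmp"
```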
    

    Exercise: [Baidu/Sohu interview question] Count how many times each URL occurs, solved with awk

    url.txt
    http://www.maotai.com/index.html
    http://www.maotai.com/1.html
    http://post.maotai.com/index.html
    http://mp3.maotai.com/3.html
    http://www.maotai.com/1.html
    http://post.maotai.com/2.html
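
The awk solution itself is not included in this section. One common approach, offered here only as a sketch and not as the author's answer, counts the domains in an awk associative array; the array name count is an illustrative choice. Because awk's for-in loop visits keys in no particular order, the result is piped through sort -rn for ranking.

```shell
#!/bin/sh
# Sample data from the exercise, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
http://www.maotai.com/index.html
http://www.maotai.com/1.html
http://post.maotai.com/index.html
http://mp3.maotai.com/3.html
http://www.maotai.com/1.html
http://post.maotai.com/2.html
EOF

# Split on "/", use the domain (field 3) as the array key, print totals at the end.
awk_counts=$(awk -F / '{ count[$3]++ } END { for (d in count) print count[d], d }' "$tmp" | sort -rn)

printf '%s\n' "$awk_counts"
rm -f "$tmp"
```

Compared with the sort | uniq -c pipeline, this reads the input once and needs no sorting before the counting step.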
    
  • Original article: https://www.cnblogs.com/iiiiher/p/8570415.html