zoukankan      html  css  js  c++  java
  • awk对某个字段分割处理

    工作中遇到要根据文件中某个字段分割成多行文本的处理,想到用awk处理,这里记录下:

    问题:

        原文件:假设一共2个字段,用“|”分割,其中第二个字段用“#”分割,但该字段中也有不含“#”的值和空值

        要求:根据第二个字段,若含#,将这条数据根据#分割成多条数据,无#和无值的行不变

    202143108500|#0_1000_VOICE#0_1000_VOICE#0_1000_VOICE#0_TRAFFIC#0_TRAFFIC#0_TRAFFIC
    202121366359|#0_1000_VOICE#0_TRAFFIC
    202143108500|#0_1000_VOICE#0_1000_VOICE#0_1000_VOICE#0_TRAFFIC#0_TRAFFIC#0_TRAFFIC
    202121366359|#0_1000_VOICE#0_TRAFFIC
    202113492312|W_GH_YYM
    202132164529|

    用awk解决:

    1、将含“#”的一行变多行

    awk -F "|"  -vOFS="|"  '{l=split($2,arr,"#");for(i=1;i<l;i++){$2=arr[i+1];print}}' ./test.txt

    结果:

    202143108500|0_1000_VOICE
    202143108500|0_1000_VOICE
    202143108500|0_1000_VOICE
    202143108500|0_TRAFFIC
    202143108500|0_TRAFFIC
    202143108500|0_TRAFFIC
    202121366359|0_1000_VOICE
    202121366359|0_TRAFFIC
    202143108500|0_1000_VOICE
    202143108500|0_1000_VOICE
    202143108500|0_1000_VOICE
    202143108500|0_TRAFFIC
    202143108500|0_TRAFFIC
    202143108500|0_TRAFFIC
    202121366359|0_1000_VOICE
    202121366359|0_TRAFFIC

    2、将不含“#”筛选出来

    awk -F "|"  '$2!~/#/{print}' ./test.txt

    结果:

    202113492312|W_GH_YYM
    202132164529|

    经过上面两步就可以解决,将结果生成新的文件 a.txt

    awk -F "|"  -vOFS="|"  '{l=split($2,arr,"#");for(i=1;i<l;i++){$2=arr[i+1];print}}' ./test.txt >a.txt
    awk -F "|"  '$2!~/#/{print}' ./test.txt >>a.txt

    a.txt:

    202143108500|0_1000_VOICE
    202143108500|0_1000_VOICE
    202143108500|0_1000_VOICE
    202143108500|0_TRAFFIC
    202143108500|0_TRAFFIC
    202143108500|0_TRAFFIC
    202121366359|0_1000_VOICE
    202121366359|0_TRAFFIC
    202143108500|0_1000_VOICE
    202143108500|0_1000_VOICE
    202143108500|0_1000_VOICE
    202143108500|0_TRAFFIC
    202143108500|0_TRAFFIC
    202143108500|0_TRAFFIC
    202121366359|0_1000_VOICE
    202121366359|0_TRAFFIC
    202113492312|W_GH_YYM
    202132164529|
  • 相关阅读:
    Java IO输入输出流 FileWriter 字符流
    Java IO输入输出流File 字节流
    Java List集合和Map集合的综合应用
    表单提交中的重复问题(表单令牌验证)
    php中const与define的区别
    阿里云中获取文件及目录列表的方法
    巧用php中的array_filter()函数去掉多维空值
    文件大小格式化函数
    UTC 通用格式时间 转换为 时间戳,并格式化为2017-01-01 12:00:00
    关于匿名函数的使用,购物车中计算销售税的应用
  • 原文地址:https://www.cnblogs.com/jnba/p/10641991.html
Copyright © 2011-2022 走看看