Format of the sample log file:
120.197.87.216 - - [04/Jan/2012:00:00:02 +0800] "GET /home.php?mod=space&uid=563413&mobile=yes HTTP/1.1" 200 3388 "-" "-"
123.126.50.73 - - [04/Jan/2012:00:00:02 +0800] "GET /thread-679411-1-1.html HTTP/1.1" 200 5251 "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
203.208.60.187 - - [04/Jan/2012:00:00:02 +0800] "GET /archiver/tid-3003.html HTTP/1.1" 200 2056 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
114.112.141.6 - - [04/Jan/2012:00:00:02 +0800] "GET /ctp080113.php?action=getgold HTTP/1.1" 200 13886 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; InfoPath.3; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"
awk '{print $1}' access.20120104.log            # field 1: client IP
awk '{print substr($4,2)}' access.20120104.log  # field 4 without the leading "[": timestamp
Examples:
1. Imitate the output style of the Windows dir command
ls -l | awk '{printf "%s %s %s ",$6,$7,$8; if (substr($1,1,1)=="d") {printf "<dir> "} else {printf "%s ",$5}; print $9}'
4月 27 2014 <dir> a
3月 11 2014 688211055 access.20120104.log
10月 9 2013 3025757 access.log.10
5月 31 2014 <dir> Algorithms
Enhanced version: a. show the date as year-month-day hour:minute; b. align the fields (using printf format strings)
ls -al --time-style=long-iso | awk '{printf "%s %s ",$6,$7; if (substr($1,1,1)=="d") {printf "%-15s","<dir>"} else {printf "%15s",$5}; print " "$8}'
2014-03-11 14:11       688211055 access.20120104.log
2013-10-09 10:35         3025757 access.log.10
2014-05-31 00:56 <dir>           Algorithms
2015-05-02 18:47           18748 .bash_history
2014-03-13 14:06             220 .bash_logout
2014-03-13 14:06            3486 .bashrc
2014-05-09 15:45 <dir>           .cache
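The two printf widths used for alignment can be seen in isolation in the small sketch below. The function name show_align and the | column markers are only for illustration: %-15s pads on the right (left-justified), while %15s pads on the left (right-justified).

```shell
# show_align (hypothetical name): print a label left-justified and a number
# right-justified, each in a 15-character column; | marks the column edges.
show_align() {
    awk 'BEGIN { printf "|%-15s|%15s|\n", "<dir>", "3486" }'
}

show_align
```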
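2. Count the site's distinct IPs and its page views (PV)
awk '{ip[$1]++} END{for (i in ip) {print i,ip[i]}}' access.20120104.log | wc -l
Each output line is one distinct IP with its hit count, so wc -l yields the IP count; the PV count is the total number of lines in the log.
The same result can also be obtained by combining sort and uniq.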
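A minimal sketch of the sort/uniq variant, assuming the log format shown at the top and that one log line counts as one page view. The function names count_ips, count_pv and top_ips are made up for this example.

```shell
# Hypothetical helpers; the log path is passed as the first argument.

# Distinct IPs: take field 1, sort, drop duplicates, count the remaining lines.
# tr strips the padding that some wc implementations add.
count_ips() {
    awk '{print $1}' "$1" | sort | uniq | wc -l | tr -d ' '
}

# Page views: every log line is one request, so NR in END is the PV count.
count_pv() {
    awk 'END {print NR}' "$1"
}

# Ten busiest IPs: uniq -c prefixes each IP with its number of occurrences.
top_ips() {
    awk '{print $1}' "$1" | sort | uniq -c | sort -rn | head -n 10
}
```

Usage would look like count_ips access.20120104.log and count_pv access.20120104.log.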
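3. Compute the average size of the regular files whose names start with a
find . -name "a*" -type f -exec ls -l {} \; | awk 'BEGIN{count=0;sum=0};{count+=1;sum+=$5};END{print "avg="sum/count}'
Of course, count is not needed here; NR can be used directly in the calculation instead.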
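The NR-based variant could look like the sketch below. The function name avg_size is hypothetical, and the guard for empty input is an addition that the original one-liner does not have (without it, awk divides by zero when find matches nothing).

```shell
# avg_size (hypothetical name): read ls -l style lines on stdin and print
# the mean of field 5 (the size in bytes). NR in END is the number of lines read.
avg_size() {
    awk '{sum += $5} END {if (NR > 0) print "avg=" sum / NR; else print "avg=0"}'
}

# Intended use:
#   find . -name "a*" -type f -exec ls -l {} \; | avg_size
```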
4. Using the table below, compute each person's total and average
vi pay.txt
Name 1st 2st 3st
grid 23000 24000 25000
lily 21000 23000 20000
david 25000 19000 24000
awk '{if(NR==1) {printf "%10s %10s %10s %10s %10s %10s\n",$1,$2,$3,$4,"total","avg"};if(NR>=2){total=$2+$3+$4;avg=total/(NF-1);printf "%10s %10d %10d %10d %10.2f %10.2f\n",$1,$2,$3,$4,total,avg}}' pay.txt
      Name        1st        2st        3st      total        avg
      grid      23000      24000      25000   72000.00   24000.00
      lily      21000      23000      20000   64000.00   21333.33
     david      25000      19000      24000   68000.00   22666.67
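Notes: 1. Get fluent with printf format strings. 2. When a program contains conditionals, take care not to mismatch the braces; remember that if-else and for are each a single statement, and statements are separated only by semicolons.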
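The second note can be illustrated with a sketch that applies for and if-else to pay.txt. The function name pay_totals and the 22000 threshold for the "high"/"low" label are invented for this example; the loop sums every column after the name instead of hard-coding $2+$3+$4.

```shell
# pay_totals (hypothetical name): print name, total, average and a label
# for every data row of the file given as the first argument.
pay_totals() {
    awk 'NR >= 2 {
        total = 0
        for (i = 2; i <= NF; i++) total += $i    # for with a one-statement body
        avg = total / (NF - 1)
        # if-else is a single statement; the semicolon ends the first branch
        if (avg >= 22000) level = "high"; else level = "low"
        printf "%-10s %10d %10.2f %s\n", $1, total, avg, level
    }' "$1"
}
```

Called as pay_totals pay.txt, it prints one line per person and skips the header row.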