  • Awk Basics [6]: Additional Awk Commands 3

    1. Argument Processing (ARGC, ARGV, ARGIND)


    The built-in variables we discussed earlier (FS, NF, RS, NR, FILENAME, OFS, and ORS) are available in all versions of awk, including nawk and gawk.

    • Of the built-in variables discussed in this section, ARGC and ARGV are available in nawk and gawk; ARGIND is available only in gawk.
    • Use ARGC and ARGV to pass parameters to the awk script from the command line.
    • ARGC contains the total number of arguments passed to the awk script, counting the awk command itself.
    • ARGV is an array that contains all the arguments passed to the awk script, indexed from 0 through ARGC-1.
    • When you pass 5 arguments, ARGC will contain the value 6, because ARGV[0] is counted as well.
    • ARGV[0] contains the name of the awk command itself (typically awk).

    The following simple arguments.awk shows how ARGC and ARGV behave:

    $ cat arguments.awk
    BEGIN {
        print "ARGC=",ARGC
        for (i = 0; i < ARGC; i++)
            print ARGV[i]
    }
    
    $ awk -f arguments.awk arg1 arg2 arg3 arg4 arg5
    ARGC= 6
    awk
    arg1
    arg2
    arg3
    arg4
    arg5
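
    As a small illustration of passing parameters through ARGV, the sketch below adds up its command-line arguments. The sum_args wrapper is a hypothetical name introduced only for this example. Because the program consists of a BEGIN block alone, awk does not try to open the arguments as input files.

```shell
# sum_args is a hypothetical helper name used only for this illustration.
sum_args() {
    awk 'BEGIN {
        total = 0
        for (i = 1; i < ARGC; i++)      # start at 1 to skip ARGV[0], the awk command name
            total += ARGV[i]
        print "count=" (ARGC - 1), "sum=" total
    }' "$@"
}

sum_args 10 20 30
```

    With the three arguments shown, this should print count=3 sum=60.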

    In gawk, the name of the input file currently being processed is available in the ARGV array from within the body block. ARGIND is the index into ARGV of that current file.
    When you process only one file in an awk script, ARGIND will be 1, and ARGV[ARGIND] gives the name of the file currently being processed.

    The following example contains only the body block, which prints the value of ARGIND and the current file name taken from ARGV[ARGIND]:

     

    $ cat argind.awk
    {
        print "ARGIND:", ARGIND
        print "Current file:", ARGV[ARGIND]
    }
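
    Since ARGIND is specific to gawk, a rough portable stand-in is to count files yourself by watching for FNR == 1. The sketch below assumes every argument is a non-empty input file (no var=value assignments); the file names one.txt and two.txt are hypothetical and are created only for the demonstration.

```shell
# Two throwaway input files with hypothetical names, created only for this demo.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one.txt"
printf 'c\n'    > "$tmpdir/two.txt"

# Portable stand-in for ARGIND: bump a counter on the first record of each file.
file_index_out=$(cd "$tmpdir" && awk '
    FNR == 1 { idx++ }
    { print idx, FILENAME, $0 }
' one.txt two.txt)

rm -rf "$tmpdir"
printf '%s\n' "$file_index_out"
```

    Each output line carries the file counter, the file name, and the record, so the two records of one.txt are tagged 1 and the record of two.txt is tagged 2.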

    2. GAWK Built-in Environment Variables


    Of the built-in variables discussed in this section, IGNORECASE and ERRNO are available only in gawk. ENVIRON is also supported by POSIX awk implementations such as nawk.

    ENVIRON

    ENVIRON is an array that contains all the environment variables. The index into the ENVIRON array is the environment variable name.
    For example, the array element ENVIRON["PATH"] will contain the value of the PATH environment variable.

    $ cat environ.awk
    BEGIN {
        OFS="="
        for(x in ENVIRON)
            print x,ENVIRON[x];
    }        

    Partial output is shown below.

    $ awk -f environ.awk
    SHELL=/bin/bash
    PATH=/home/ramesh/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
    HOME=/home/ramesh
    TERM=xterm
    USERNAME=ramesh
    DISPLAY=:0.0
    AWKPATH=.:/usr/share/awk
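
    Looking up a single variable works the same way as looping over the whole array. In the sketch below, GREETING_NAME and the greet wrapper are hypothetical names chosen for illustration; the variable is set only for the one awk invocation.

```shell
# GREETING_NAME is a hypothetical variable name used only for this illustration.
greet() {
    GREETING_NAME="$1" awk 'BEGIN {
        name = ENVIRON["GREETING_NAME"]
        if (name == "")                 # unset or empty: fall back to a default
            name = "unknown"
        print "Hello, " name
    }'
}

greet ramesh
```

    Reading a value through ENVIRON avoids splicing shell text into the awk program itself.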

    IGNORECASE

    By default IGNORECASE is set to 0, so the awk program is case sensitive.
    When you set IGNORECASE to 1, the awk program becomes case insensitive. This affects both regular expression matching and string comparisons.

    The following will not print anything, as it is looking for "video" with lower case "v". But, the items.txt file contains only "Video" with upper case "V".

    $ awk '/video/ {print}' items.txt

    However when you set IGNORECASE to 1, and search for "video", it will print the line containing "Video", as it will not do a case sensitive pattern match.

    $ awk 'BEGIN{IGNORECASE=1} /video/ {print}' items.txt
    101,HD Camcorder,Video,210,10

    As you see in the example below, this works for both string and regular expression comparisons.

    $ cat ignorecase.awk
    BEGIN {
        FS=",";
        IGNORECASE=1;
    }
    {
        if ($3 == "video") print $0;
        if ($2 ~ "TENNIS") print $0;
    }
    
    
    $ awk -f ignorecase.awk items.txt
    101,HD Camcorder,Video,210,10
    104,Tennis Racket,Sports,190,20
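
    On an awk without IGNORECASE, a comparable effect for a single comparison can be sketched by folding both sides to lower case with tolower(). The match_ci wrapper is a hypothetical name, and the two input records are copied from the items.txt output shown above rather than read from the file.

```shell
# match_ci is a hypothetical helper: print the item name ($2) of every record
# whose category ($3) equals the given word, ignoring case.
match_ci() {
    printf '%s\n' \
        '101,HD Camcorder,Video,210,10' \
        '104,Tennis Racket,Sports,190,20' |
    awk -F, -v want="$1" 'tolower($3) == tolower(want) { print $2 }'
}

match_ci video
```

    Unlike IGNORECASE, this only affects the comparisons where tolower() is applied explicitly.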

    ERRNO

    When an I/O operation (for example, getline) fails, the ERRNO variable contains the corresponding error message.
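
    A minimal sketch of checking for such a failure is shown below. It relies on getline returning -1 on error and prints ERRNO when it is set; on awks other than gawk ERRNO stays empty, so only the generic message appears. The read_first_line wrapper and the path /no/such/file are hypothetical.

```shell
# read_first_line is a hypothetical helper: print the first line of a file,
# or an error message (with ERRNO under gawk) if the file cannot be read.
read_first_line() {
    awk -v path="$1" 'BEGIN {
        status = (getline line < path)  # 1 = line read, 0 = end of file, -1 = error
        if (status < 0) {
            # ERRNO is gawk-specific; other awks leave it empty.
            print "error: cannot read " path (ERRNO != "" ? ": " ERRNO : "")
            exit 1
        }
        print line
    }'
}

read_first_line /no/such/file || echo "exit status: $?"
```

    The exact ERRNO text comes from the operating system, so it is not reproduced here.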

    3. Awk Profiler - pgawk


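    The pgawk program is used to create an execution profile of your awk program. Using pgawk you can see how many times each awk statement (and each user-defined function) was executed. In gawk 4.1 and later, pgawk is no longer a separate program; the same profiling is available through gawk --profile.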

    First, create a sample awk program that we'll run through pgawk to see what the profiler output looks like.

    $ cat profiler.awk
    BEGIN {
        FS=",";
        print "Report Generated On:" strftime("%a %b %d %H:%M:%S %Z %Y",systime());
    }
    {
        if ( $5 <= 5 )
            print "Buy More: Order", $2, "immediately!"
        else
            print "Sell More: Give discount on", $2, "immediately!"
    }
    END {
        print "----"
    }

    Next, execute the sample awk program using pgawk (instead of just calling awk).

    $ pgawk -f profiler.awk items.txt
    Report Generated On:Mon Jan 31 08:35:59 PST 2011
    Sell More: Give discount on HD Camcorder immediately!
    Buy More: Order Refrigerator immediately!
    Sell More: Give discount on MP3 Player immediately!
    Sell More: Give discount on Tennis Racket immediately!
    Buy More: Order Laser Printer immediately!
    ----

    By default pgawk writes the profile to a file called awkprof.out in the current directory. You can specify your own profile output file name using the --profile option, as shown below.

    $ pgawk --profile=myprofiler.out -f profiler.awk items.txt

    View the default awkprof.out to understand the execution counts of the individual awk statements.

    $ cat awkprof.out
    # gawk profile, created Mon Jan 31 08:35:59 2011
    # BEGIN block(s)
    BEGIN {
    1 FS = ","
    1 print ("Report Generated On:" strftime("%a %b %d %H:%M:%S %Z %Y", systime()))
    }
    # Rule(s)
    5 {
    5 if ($5 <= 5) { # 2
    2 print "Buy More: Order", $2,"immediately!"
    3 } else {
    3 print "Sell More: Give discount on", $2,"immediately!"
    }
    }
    # END block(s)
    END {
    1 print "----"
    }

    While reading the awkprof.out, please keep the following in mind:

    • The column on the left contains a number. This indicates how many times that particular awk statement was executed. For example, the print statement in the BEGIN block executed only once, while the main rule body executed 5 times, once for each record in items.txt.
    • A condition check has two numbers: one on the left, and another on the right after the closing parenthesis. The left number indicates how many times the condition was evaluated. The right number indicates how many times it was true. In the above example, the if was evaluated 5 times but was true only 2 times, as indicated by # 2 next to the if statement.

     

  • Original article: https://www.cnblogs.com/yangfengtao/p/3310199.html