Quick Guide to nawk - Examples, Field Separators, Arrays
Posted on Tuesday, January 10, 2006 at 11:56 PM by Malcolm
Here is a quick guide to nawk. I prefer nawk to awk because it offers more functionality. Most systems now have both programs installed.
To run nawk
- From command line : nawk 'program' inputfile1 inputfile2 …
- From a file : nawk -f programfile inputfile1 inputfile2 …
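The two invocation styles can be compared with a small sketch. The file names (input1, input2, first.awk) are made up for illustration, and the sketch falls back to awk on systems where nawk is not installed.

```shell
AWK=$(command -v nawk || command -v awk)   # prefer nawk, fall back to awk
tmp=$(mktemp -d)
printf 'alpha 1\nbeta 2\n' > "$tmp/input1"
printf 'gamma 3\n' > "$tmp/input2"
printf '{ print $1 }\n' > "$tmp/first.awk"   # hypothetical program file

# program given on the command line
inline=$("$AWK" '{ print $1 }' "$tmp/input1" "$tmp/input2")
# the same program read from a file with -f
fromfile=$("$AWK" -f "$tmp/first.awk" "$tmp/input1" "$tmp/input2")

echo "$inline"
rm -rf "$tmp"
```

Both forms should print the first field of every line from both input files, in order.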
Structure of nawk program
- A nawk program can consist of three sections: nawk 'BEGIN{…} {… BODY …} END{…}' inputfile
- Both the 'BEGIN' and 'END' blocks are optional and each is executed only once: 'BEGIN' before any input is read, 'END' after the last line.
- The body is executed for each line in the input file.
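As a minimal sketch of the three sections, the program below counts its input lines. The three-line input is made up, and awk is used where nawk is not installed.

```shell
AWK=$(command -v nawk || command -v awk)   # prefer nawk, fall back to awk
out=$(printf 'a\nb\nc\n' | "$AWK" '
BEGIN { print "start" }        # runs once, before any input is read
      { n++ }                  # body: runs for every input line
END   { print "lines: " n }    # runs once, after the last line
')
echo "$out"
```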
Field Separators
- The following example makes '=' a field separator in addition to blank space : nawk 'BEGIN{FS = "[ =]+"}{print $2}' inputfile
- For example, if the input file contains the line "Total = 500", then the output will be 500.
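A small sketch of two ways to set the separator: assigning a regular expression to FS in a BEGIN block, and using the -F option. The bracket expression "[ =]+" is one possible choice that treats any run of spaces and '=' as a single separator. The input lines are made up.

```shell
AWK=$(command -v nawk || command -v awk)   # prefer nawk, fall back to awk

# FS as a regular expression: a run of spaces and/or "=" separates fields
out=$(printf 'Total = 500\nCount=7\n' | "$AWK" 'BEGIN { FS = "[ =]+" } { print $2 }')

# -F sets FS from the command line; here a single colon
users=$(printf 'root:x:0\ndaemon:x:1\n' | "$AWK" -F: '{ print $1 }')

echo "$out"
echo "$users"
```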
Printing Environment Variables
- The following example prepends the current directory to each name in a directory listing:
ls -alg | nawk '{print ENVIRON["PWD"] "/" $8}'
- ENVIRON is an array of environment variables indexed by the variable name, for example ENVIRON["PWD"].
- The variable FILENAME is a string that holds the name of the file nawk is currently reading.
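The sketch below combines ENVIRON and FILENAME. DEMO_DIR and notes.txt are made-up names used only for illustration. Note that the array index is the variable name itself, without a leading $.

```shell
AWK=$(command -v nawk || command -v awk)   # prefer nawk, fall back to awk
tmp=$(mktemp -d)
printf 'one\ntwo\n' > "$tmp/notes.txt"

# DEMO_DIR is a hypothetical variable, set only for this one command
out=$(cd "$tmp" && DEMO_DIR=/srv/data "$AWK" \
    '{ print ENVIRON["DEMO_DIR"] "/" FILENAME ": " $0 }' notes.txt)

echo "$out"
rm -rf "$tmp"
```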
Examples of usage
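- To kill all the processes of the current user (matching on the user column only, so that other lines which merely mention the login name are not picked up) : kill -9 `ps -ef | nawk -v user="$LOGNAME" '$1 == user {print $2}'`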
Multi-dimensional array
- To use a 2D or multi-dimensional array, separate the array indices with commas: matrix[3, 5] = $(i+5)
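As a sketch of comma-separated indices, the program below stores a made-up 2x3 table by (row, column) and prints its transpose. The variable names are illustrative only.

```shell
AWK=$(command -v nawk || command -v awk)   # prefer nawk, fall back to awk
out=$(printf '1 2 3\n4 5 6\n' | "$AWK" '
{
    ncols = NF
    for (c = 1; c <= NF; c++) matrix[NR, c] = $c    # store by (row, column)
}
END {
    # print the transpose: one output line per input column
    for (c = 1; c <= ncols; c++) {
        line = ""
        for (r = 1; r <= NR; r++) line = (r > 1 ? line " " : "") matrix[r, c]
        print line
    }
    # membership test with a parenthesised index list
    if ((2, 3) in matrix) print "matrix[2,3] = " matrix[2, 3]
}')
echo "$out"
```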
Another example
- The example below calculates the averages for 16 items from 10 sets of readings.
- Example of an input line the program is trying to match : Total elapsed time is 560
BEGIN {
    printf("--------- Execution Time -----------\n");
    item = 16;
    set = 10;
}
{
    # all new variables are initialized to 0
    for (; j < set; j++)
        for (i = 0; i < item; i++) {
            # skip input until the second word matches "elapsed";
            # stop if the input runs out first
            while ($2 != "elapsed")
                if ((getline) <= 0) exit;
            # notice the use of an array without declaring its dimension
            sum[i] += $5;
            getline;
        }
    if (j == set) {
        for (i = 0; i < item; i++) {
            # this and the next 2 lines are comments
            # you can use either print or printf for output
            # print sum[i]/set;
            printf("Item %d : %6.3f\n", i, sum[i]/set);
        }
        j++;
    }
}
END {
    printf("-------------- End --------------\n");
}
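For comparison, here is a scaled-down sketch of the same averaging idea. It is a pattern-driven alternative, not the getline loop used above. It assumes the readings for each set arrive in the same item order. The input (3 sets of 2 items, with unrelated lines mixed in) is made up.

```shell
AWK=$(command -v nawk || command -v awk)   # prefer nawk, fall back to awk
# Hypothetical input: 3 sets of 2 readings each
input='run 1
Total elapsed time is 10
Total elapsed time is 20
run 2
Total elapsed time is 20
Total elapsed time is 40
run 3
Total elapsed time is 30
Total elapsed time is 60'

out=$(printf '%s\n' "$input" | "$AWK" -v item=2 '
$2 == "elapsed" { sum[n % item] += $5; n++ }    # n counts matching lines
END {
    sets = n / item                             # number of complete sets seen
    for (i = 0; i < item; i++)
        printf("Item %d : %6.3f\n", i, sum[i] / sets)
}')
echo "$out"
```

Letting the pattern select the matching lines avoids explicit getline calls, at the cost of relying on the line order to identify each item.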
Examples from the man page
-
Write to the standard output all input lines for which field 3 is
greater than 5:
$3 > 5
-
Write every tenth line:
(NR % 10) == 0
-
Write any line with a substring matching the regular expression:
/(G|D)(2[0-9][[:alpha:]]*)/
-
Print any line with a substring containing a G or D, followed by a
sequence of digits and characters:
/(G|D)([[:digit:][:alpha:]]*)/
-
Write any line in which the second field contains a backslash:
$2 ~ /\\/
-
Write any line in which the second field contains a backslash
(alternate method). Note that backslash escapes are interpreted twice,
once in lexical processing of the string and once in processing the
regular expression.
$2 ~ "\\\\"
-
Write the second to the last and the last field in each line,
separating the fields by a colon:
{OFS=":";print $(NF-1), $NF}
-
Write lines longer than 72 characters:
length($0) > 72
-
Write first two fields in opposite order separated by the OFS:
{ print $2, $1 }
-
Same, with input fields separated by comma or space and tab
characters, or both:
BEGIN { FS = ",[ \t]*|[ \t]+" }{ print $2, $1 }
-
Add up first column, print sum and average:
{s += $1 }END{print "sum is ", s, " average is", s/NR}
-
Write fields in reverse order, one per line (many lines out for each
line in):
{ for (i = NF; i > 0; --i) print $i }
-
Write all lines between occurrences of the strings "start" and "stop":
/start/, /stop/
-
Write all lines whose first field is different from the previous one:
$1 != prev { print; prev = $1 }
-
Simulate the echo command:
BEGIN { for (i = 1; i < ARGC; ++i) printf "%s%s", ARGV[i], i==ARGC-1?"\n":" "}
-
Write the path prefixes contained in the PATH environment variable,
one per line:
BEGIN{n = split (ENVIRON["PATH"], path, ":"); for (i = 1; i <= n; ++i) print path[i]}
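A few of the one-liners above can be tried on small, made-up inputs. The sketch below runs the sum and average, the start/stop range, and the changed-first-field examples. The label strings in the sum example are trimmed slightly so the output has single spaces.

```shell
AWK=$(command -v nawk || command -v awk)   # prefer nawk, fall back to awk

# add up the first column, print sum and average
total=$(printf '3\n5\n10\n' | "$AWK" '{ s += $1 } END { print "sum is", s, "average is", s/NR }')

# all lines between occurrences of "start" and "stop", inclusive
range=$(printf 'x\nstart\ny\nstop\nz\n' | "$AWK" '/start/, /stop/')

# lines whose first field differs from that of the previous line
changed=$(printf 'a 1\na 2\nb 3\na 4\n' | "$AWK" '$1 != prev { print; prev = $1 }')

echo "$total"
echo "$range"
echo "$changed"
```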
Edited on: Sunday, May 27, 2012 12:18 PM