Here are some examples of GNU Awk: common command line tools re-implemented as Awk one-liners to illustrate how Awk works. In each example, the first command is the original tool and the second is the Awk equivalent. Obviously you would not do these particular tasks in Awk; rather, you would combine these techniques into more complex actions that are difficult or impossible in the shell.
Awk is a text processing language and tool that sits somewhere between common Unix command line tools and full-blown languages like Python and Ruby. It’s ideal for quick hacks that are difficult to achieve without complex shell pipelines or full scripts, and for many tasks it’s extremely fast.
The language structure is a series of statements of the form `pattern { action }`; the action is executed for each line the pattern matches. A null pattern matches every line; a null action prints the line.
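A minimal sketch of those three cases, using a made-up input file `app.log`:

```sh
awk '/ERROR/ { print $1 }' app.log   # pattern and action: first field of matching lines
awk '/ERROR/' app.log                # pattern only: the null action prints the whole line
awk '{ print $1 }' app.log           # action only: the null pattern matches every line
```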
Matching certain lines
- Search for a regexp
  - a pattern without an action means the default action: print
  - `grep foo file`
  - `awk '/foo/' file`
- First 10 lines
  - patterns can be any expression; `NR` is the current line number
  - `head -n 10 file`
  - `awk 'NR <= 10' file`
- Lines 11 through 20
  - a more complex expression
  - `tail -n +11 file | head -n 10`
  - `awk 'NR >= 11 && NR <= 20' file`
- Multiple positive and negative regexes
  - the Awk version is a single process
  - `grep foo file | grep -v bar | grep baz`
  - `awk '/foo/ && !/bar/ && /baz/' file`
- Last 10 lines
  - Awk must buffer 10 lines to be able to print them at the end; `$0` is the entire line; the special pattern `END` runs after all input
  - `tail -n 10 file`
  - `awk '{ buf[NR%10] = $0 } END { for (i = NR-9; i <= NR; i++) { print buf[i%10] } }' file`
- Numeric condition in a single field
  - expressions can use any valid variable; `$x` is field number x; by default fields are separated by runs of whitespace and leading whitespace is ignored
  - (not possible in a single shell line)
  - `awk -F: '$4 > 5 && $4 < 100' /etc/passwd`
- Match a regex in a single field
  - (needs a complex regex for grep)
  - `awk -F: '$5 ~ /foo/' /etc/passwd`
- Select lines with at least 5 fields
  - `NF` is the number of fields in the current line
  - (not possible in a single shell line)
  - `awk 'NF >= 5'`
- Choose a random line
  - `srand()` is needed to properly initialize the random number generator; the special `BEGIN` action runs before any input (see the expanded sketch after this list)
  - `shuf -n 1 file`
  - `awk 'BEGIN { srand() } rand() < 1/NR { l = $0 } END { print l }' file`
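The random-line one-liner is worth expanding. This is the same program, reformatted with comments; the closing note sketches why every line is equally likely (the standard reservoir-sampling argument):

```sh
awk '
BEGIN { srand() }          # seed the RNG once, before any input is read
rand() < 1/NR { l = $0 }   # keep line number NR with probability 1/NR
END { print l }            # after all input, l holds the surviving line
' file
# Line i survives if it is kept at step i and never replaced afterwards:
# (1/i) * (i/(i+1)) * ... * ((N-1)/N) = 1/N, the same for every line.
```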
Modifying output
- Extract one column
  - an action without a pattern is run for every line; `-F` separates fields by the given single character or regex, and runs are not compressed
  - `cut -d: -f6 /etc/passwd`
  - `awk -F: '{ print $6 }' /etc/passwd`
- Extract multiple columns
  - `cut -d: -f4,6 /etc/passwd`
  - `awk -F: '{ print $4 ":" $6 }' /etc/passwd`
- Replace a regex with another value
  - `sed 's/foo/bar/g' file`
  - `awk '{ gsub(/foo/, "bar"); print }' file`
- Replace a regex with another value in a single field
  - (needs a complex regex in sed)
  - `awk '{ gsub(/foo/, "bar", $5); print }' file`
- Print file with line numbers
  - `print` always adds a newline but `printf` needs it explicitly
  - `cat -n file`
  - `awk '{ printf("%6d %s\n", NR, $0) }' file`
- Coerce a field to numeric
  - need to set the output field separator to preserve the file format
  - (not possible in a single shell line)
  - `awk -F: 'BEGIN { OFS = FS } { $5 = 0 + $5; print }' file`
- Write lines in reverse order (caution: memory)
  - `tac file`
  - `awk '{ buf[++l] = $0 } END { for (i = l; i > 0; --i) { print buf[i] } }' file`
- Show unique lines in sorted input
  - `sort file | uniq` or `sort -u file`
  - `sort file | awk 'prev != $0 { print; prev = $0 }'`
- Show unique lines in unsorted input (caution: memory)
  - (not possible in a single shell line)
  - `awk '!seen[$0]++' file`

The last one needs some explanation. It uses `seen` as an associative array with lines as keys. If the line has been seen before, `seen[$0]++` is greater than zero and thus true, and the preceding `!` inverts the condition, so nothing is printed. The first time a line is seen, its entry is null (and thus false) and is incremented to 1 after the test; the `!` inverts the condition, causing the line to be printed, since that is the default action. A longhand version is sketched below; see the counting section for how to show the counts.
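Here is the same filter written out longhand, for anyone who finds the condensed form too clever; the behaviour is identical, just more explicit:

```sh
awk '{
    if (seen[$0] == 0)   # the count is still null/zero: first occurrence
        print            # what the default action did for us before
    seen[$0]++           # count the line so later duplicates are suppressed
}' file
```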
Counting and summing
- Count all lines
  - `wc -l file`
  - `awk 'END { print NR }' file`
- Count words
  - these are only approximately the same
  - `wc -w file`
  - `awk '{ t += NF } END { print t }' file`
- Count bytes
  - `wc -c file`
  - (not possible: GNU Awk works with encoded characters, not bytes)
- Count all lines matching a regex
  - variables spring into existence as required, and take on numeric values in a numeric context
  - `grep foo *.c | wc -l`
  - `awk '/foo/ { t++ } END { print t }' *.c`
- Count unique lines (caution: memory)
  - all arrays are associative; the Awk version does not need to sort
  - `sort -u file | wc -l`
  - `awk '{ seen[$0]++ } END { print length(seen) }' file`
- Count each unique line
  - the Awk version does not need to sort
  - `sort file | uniq -c`
  - `awk '{ count[$0]++ } END { for (line in count) { printf("%6d %s\n", count[line], line) } }' file`
- Count lines by file
  - the special `ENDFILE` action runs after each distinct input file; `FNR` is the line number within each file
  - `wc -l *.c`
  - `awk 'ENDFILE { printf("%4d %s\n", FNR, FILENAME) } END { printf("%4d total\n", NR) }' *.c`
- Sum the fifth field, treating non-numbers as zero
  - (needs a complex loop in the shell)
  - `awk '{ t += 0 + $5 } END { print t }' file`
- Sum all numbers in a file
  - `patsplit` puts all values matching a regex into an array, similar to Python's `re.findall` or Ruby's `String#scan` (a short demo follows this list)
  - (not possible in a single shell line)
  - `awk '{ patsplit($0, n, /[[:digit:]]+/); for (i in n) { t += n[i] } } END { print t }' file`
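A tiny demonstration of what `patsplit` produces (GNU Awk only; the input string here is made up):

```sh
awk 'BEGIN {
    n_pieces = patsplit("a1 b22 c333", nums, /[[:digit:]]+/)
    for (i = 1; i <= n_pieces; i++)
        print i, nums[i]    # prints: 1 1, then 2 22, then 3 333
}'
```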
I/O
- Split a file into chunks of 100 lines with a numeric suffix for each chunk
  - line numbers start at 1; the `>` construct takes an expression used as a string; `close` avoids running out of file descriptors (an expanded version follows below)
  - `split -d -l 100 file foo`
  - `awk 'NR % 100 == 1 { close(fname); fname = sprintf("foo%02d", NR/100) } { print >fname }' file`
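The same chunking program spread over several lines so the two rules are easier to see (same assumptions as above: 100-line chunks, `foo` prefix):

```sh
awk '
NR % 100 == 1 {                            # first line of a new chunk
    close(fname)                           # close the previous chunk file, if any
    fname = sprintf("foo%02d", NR / 100)   # foo00, foo01, foo02, ...
}
{ print > fname }                          # every line goes to the current chunk
' file
```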
Running external commands
- Run a command for a column
  - `system` takes a single command string, so the pieces are concatenated; the quoting is fragile if a value contains quotes (a more defensive sketch follows below)
  - `cut -d: -f6 /etc/passwd | xargs ls -ld`
  - `awk -F: '{ system("ls -ld \"" $6 "\"") }' /etc/passwd`
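A sketch of a more defensive version: wrap the argument in single quotes and escape any embedded single quotes (`'` becomes `'\''`) before calling `system`. `\047` is the octal escape for a single quote, which avoids fighting the shell's own quoting:

```sh
awk -F: '{
    dir = $6
    gsub(/\047/, "\047\\\\\047\047", dir)   # escape embedded single quotes
    system("ls -ld \047" dir "\047")        # runs ls -ld with the value safely single-quoted
}' /etc/passwd
```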