
awk — Aggregation and Reports

Practice aggregation with awk: the order in which BEGIN { } and END { } run, accumulating column 1 into a total with s += $1, and counting with NR or { c++ } to build aggregation reports. Illustrated, with hands-on practice in a browser terminal.

Running Before and After — BEGIN and END

awk processes its input one line at a time, but work that should run only once, before or after that loop, goes in BEGIN { ... } and END { ... }. BEGIN runs once before the first line is read, and END runs once after all lines have been read. Use them to print a header up front or a summary at the very end.

The order BEGIN and END run in
BEGIN { ... }: once before reading (print a header first)
{ body }: repeats per line, one line at a time
END { ... }: once after all lines (print sum or count)
BEGIN runs once before processing, the body runs per line, and END runs once after.
Form                      Meaning
BEGIN { ... }             Run once before the first line is read
{ ... }                   The body run repeatedly, one line at a time
END { ... }               Run once after all lines have been read
s += $1                   Keep adding column 1 into the variable s (a running total)
NR                        Lines read so far (the total line count at the end)
{ c++ } END { print c }   Increment c per line and print the count at the end
printf 'start\nmiddle\nend\n' > lines.txt          # create a 3-line sample file
awk 'BEGIN { print "--- report ---" } { print $0 } END { print "rows:", NR }' lines.txt
# a header up front, rows: 3 at the end
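
As a quick check of that ordering, the sketch below prints NR from both blocks. It recreates the lines.txt file from the example above so that it stands alone, and it uses /dev/null as an empty input, which is an illustration choice rather than part of the lesson.

```shell
printf 'start\nmiddle\nend\n' > lines.txt   # same 3-line sample file as above

# NR is 0 in BEGIN (nothing has been read yet) and 3 in END (all lines read)
awk 'BEGIN { print "NR in BEGIN:", NR } END { print "NR in END:", NR }' lines.txt

# With empty input the body never runs, but BEGIN and END still run once each
awk 'BEGIN { print "--- report ---" } END { print "rows:", NR }' /dev/null
```

The second command prints the header followed by rows: 0, which shows that BEGIN and END do not depend on the input containing any lines.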

① Create a 3-line sample file with printf 'apple\nbanana\ncherry\n' > fruits.txt.

② Check the contents with cat fruits.txt.

③ Using awk, print a header line once in BEGIN, print each line as-is in the body, and print the total line count once in END using NR.

④ Check that a header is at the top, the 3 lines are in the middle, and the line count is at the end. (If you run it correctly, an explanation will appear.)


Sum and Count — s += $1 and NR

In awk you can use variables without declaring them, and numbers add up directly. Writing { s += $1 } in the body keeps adding each line's first column into the variable s, and END { print s } prints the total at the end. For a count, either print NR (the number of lines read) directly in END, or increment a counter with { c++ } and print it with END { print c }. With this you can build aggregation reports like a sales total or a record count.

Accumulate column 1 into a total
Input lines: 10 apple / 20 banana / 30 cherry
Steps: s += $1 -> s=10, s += $1 -> s=30, END print s -> 60
s += $1 adds column 1 line by line, and END prints the total of 60.
printf '120 mon\n80 tue\n200 wed\n' > sales.txt   # create 3 lines with numbers
awk '{ s += $1 } END { print "total:", s }' sales.txt   # total: 400
awk 'END { print "days:", NR }' sales.txt               # days: 3
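
The total and the count can also come from a single pass over the file. The following is a minimal sketch that combines the pieces covered so far (a BEGIN header, a body that accumulates, and an END summary). The header text is made up for illustration, and sales.txt is recreated here so the snippet runs on its own.

```shell
printf '120 mon\n80 tue\n200 wed\n' > sales.txt   # same sample data as above

# One awk program: header once, each line echoed while s accumulates, summary once
awk 'BEGIN { print "--- sales report ---" }
     { s += $1; print $0 }
     END { print "total:", s; print "days:", NR }' sales.txt
# last two lines of output: total: 400 and days: 3
```

Doing both in one program reads the file only once, which matters little for three lines but keeps the report consistent if the file changes between runs.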

① Create 3 lines with numbers using printf '50 a\n70 b\n30 c\n' > nums.txt.

② Check the contents with cat nums.txt.

③ In the awk body, keep adding column 1 into a variable, and print its total once in END.

④ Then, in awk's END, use NR to print the line count (the number of records).

⑤ Check that the total and the count are shown on screen.


Counting with a Counter — { c++ } END { print c }

NR counts every line, but when you want to count only the lines that match a condition, use a counter in the body. { c++ } increments the variable c by 1 for each line read, and END { print c } prints that count. Adding a pattern like /pat/{ c++ } counts only the lines containing a particular string.

Count only matching lines with a counter
Input lines: pass x / fail y / pass z
Steps: /pass/ c++ -> c=1, fail -> not counted, END print c -> 2
/pass/{ c++ } increments c only on matching lines, and END prints the count of 2.
Form               Meaning
{ c++ }            Increment the variable c by 1 for each line read
/pattern/{ c++ }   Increment c only on lines containing pattern
END { print c }    Print the count c once at the end
printf 'ok pay\nng pay\nok ship\n' > log.txt        # create 3 lines with a status
awk '/ok/{ c++ } END { print "ok count:", c }' log.txt   # count lines containing ok -> 2
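
One edge case worth knowing: if no line matches, c is never assigned, and an unassigned awk variable prints as an empty string rather than 0. The sketch below shows this using a pattern (error) that does not occur in log.txt. The pattern and the c + 0 idiom are additions for illustration, not part of the lesson's exercises.

```shell
printf 'ok pay\nng pay\nok ship\n' > log.txt   # same sample data as above

# No line contains "error", so c stays unset and nothing is printed after the label
awk '/error/{ c++ } END { print "error count:", c }' log.txt

# Adding 0 forces c to be treated as a number -> error count: 0
awk '/error/{ c++ } END { print "error count:", c + 0 }' log.txt
```

Writing c + 0 in END is a small habit that keeps a report readable when a count happens to be zero.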

① Create 3 lines with a status using printf 'pass x\nfail y\npass z\n' > result.txt.

② Check the contents with cat result.txt.

③ With awk, increment a counter only on lines containing pass, and print that count once in END.

④ Check that the count shown matches the number of lines containing pass.

⑤ On the same result.txt, run both awk 'END { print NR }' result.txt (total lines) and awk '/pass/{ c++ } END { print "pass count:", c }' result.txt (matching count) and compare the difference.

QUIZ

Knowledge Check

Answer each question one by one.

Q1. When does the body of BEGIN { ... } run?

Q2. What does awk '{ s += $1 } END { print s }' f show?

Q3. What comes out when you print NR in the END block?