
awk — Field Extraction

Practice awk: '{print $1}' pulls out a column, $NF grabs the last column, -F',' switches the separator to a comma, NR and NF report the line number and the field count, and /error/{print $2} processes only matching lines. Illustrated and hands-on in a browser terminal.

Pulling Out Columns — $1 / $NF / -F

awk is a command that splits each line into columns (fields) by whitespace and lets you process them column by column. Writing awk '{print $1}' pulls out and prints just the first column of each line. Inside {} you write the action to run on each line (here, print). $1 is the first column, $2 is the second, and $0 refers to the whole line.

$NF refers to the last column. NF is a variable holding the number of fields on the current line, so $NF gives you that line's final column; even when the column count differs from line to line, you always get the last one. When the separator is not whitespace, specify it with -F, as in -F','. Quote the separator so the shell passes it to awk unchanged: a comma happens to work unquoted, but characters such as | or ; would be interpreted by the shell.

printf 'alice 30 tokyo\nbob 25 osaka\n' > users.txt   # create the material
awk '{print $1}' users.txt                              # column 1: alice and bob
awk '{print $2}' users.txt                              # column 2: 30 and 25
awk '{print $NF}' users.txt                             # last column: tokyo and osaka
printf 'alice,30\nbob,25\n' > csv.txt                  # comma-separated material
awk -F',' '{print $1}' csv.txt                          # comma separator, column 1
Figure: How awk counts columns
alice 30 tokyo  ->  $1 = alice, $2 = 30, $NF = tokyo (last)
$1 is the first column, and $NF points to that line's last column.

Form               Meaning                                   Example
$1                 Pull out column 1                         awk '{print $1}' users.txt
$2                 Pull out column 2                         awk '{print $2}' users.txt
$NF                Pull out the last column                  awk '{print $NF}' users.txt
$0                 Pull out the whole line                   awk '{print $0}' users.txt
-F','              Change the separator to a comma           awk -F',' '{print $1}' csv.txt
NR                 Current line number                       awk '{print NR, $0}' nf.txt
NF                 Number of fields on the line              awk '{print NF}' nf.txt
/pat/{print $1}    Print column 1 of lines matching pat      awk '/error/{print $1}' log.txt
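
To see why $NF is handy when lines have different column counts, here is a minimal sketch. The file name ragged.txt and its contents are made up for illustration and are not part of the lesson material.

```shell
# Hypothetical sample file: the three lines have 2, 4, and 1 fields.
printf 'a b\nc d e f\ng\n' > ragged.txt

# NF is the field count of the current line, so $NF always lands on
# that line's last column, however many columns it has.
awk '{print NF, $NF}' ragged.txt
# 2 b
# 4 f
# 1 g
```

On the one-field line, $1 and $NF refer to the same column.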

① Create whitespace-separated material with printf 'alice 30 tokyo\nbob 25 osaka\n' > users.txt.

② Check the contents with cat users.txt.

③ Use awk to print just the first column of each line.

④ Then use awk '{print $2}' users.txt to print the second column of each line.

⑤ Next, use the form that refers to the last column to print the final column of each line.

⑥ Create comma-separated material with printf 'alice,30\nbob,25\n' > csv.txt, then print column 1 using the option that specifies the separator. (If you run it correctly, an explanation will appear.)
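
The same -F option works for separators other than a comma. The sketch below assumes two hypothetical files, colon.txt and pipe.txt, created only for this illustration; it also shows a case where the quotes around the separator are required rather than just a good habit.

```shell
# Hypothetical colon-separated file (not part of the lesson material).
printf 'alice:30:tokyo\nbob:25:osaka\n' > colon.txt
awk -F':' '{print $1, $NF}' colon.txt   # alice tokyo / bob osaka

# Hypothetical pipe-separated file. The quotes matter here: without
# them the shell would treat | as the start of a pipeline.
printf 'alice|30\nbob|25\n' > pipe.txt
awk -F'|' '{print $2}' pipe.txt         # 30 / 25
```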


Line Number and Field Count — NR / NF

NR is a built-in variable holding the number of the line currently being processed, and NF holds the field count of that line. Writing awk '{print NR, $0}' prints each whole line with its line number in front. With awk '{print NF}' you can see how many columns each line was split into.

printf 'red\ngreen blue\n' > nf.txt   # create the material
awk '{print NR, $0}' nf.txt             # 1 red / 2 green blue
awk '{print NF}' nf.txt                 # line 1 has 1 column, line 2 has 2
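
NR, NF, and $NF can be combined in a single print. The following is a small sketch; fields.txt is a hypothetical file introduced here only for illustration.

```shell
# Hypothetical sample file with 1, 2, and 3 fields per line.
printf 'red\ngreen blue\none two three\n' > fields.txt

# For each line: line number, field count, and the last field.
awk '{print NR, NF, $NF}' fields.txt
# 1 1 red
# 2 2 blue
# 3 3 three
```

NR increases by one for every line read, while NF is recomputed for each line.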

① Create two lines with different field counts using printf 'red\ngreen blue\n' > nf.txt.

② With awk, combine the variable for the line number with the whole line to print each line with its number in front.

③ Then print the variable for the field count and check that line 1 and line 2 have different counts.


Processing Only Matching Lines — /pat/{print ...}

When you write a pattern before the {} action, awk runs that action only on the lines that match. /error/ is a regular expression, so awk '/error/{print $1}' prints column 1 only for lines containing error. Where grep shows the whole line, awk can pull just the columns you need from the matching lines.

Figure: Process only matching lines with a pattern
awk '{print $1}' f            print column 1 of every line
awk '/error/{print $1}' f     print column 1 only of error lines
A pattern before the action narrows down which lines get processed.

printf 'error disk\ninfo start\nerror cpu\n' > log.txt   # create the material
awk '/error/{print $2}' log.txt                            # column 2 of error lines: disk and cpu
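
A pattern can be combined with the NR variable from the previous section. This sketch reuses the lesson's log.txt contents; it is one possible illustration, not a required step.

```shell
# Same material as in the lesson.
printf 'error disk\ninfo start\nerror cpu\n' > log.txt

# NR counts every input line, including the ones the pattern skips,
# so the numbers printed are the positions in the original file.
awk '/error/{print NR, $2}' log.txt
# 1 disk
# 3 cpu
```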

① Create a material file with printf 'error disk\ninfo start\nerror cpu\n' > log.txt.

② With awk, write a pattern before the {} action and print column 2 only for lines containing error.

③ Check that the printed words are only the columns pulled from the error lines.

④ Also run grep error log.txt and compare: grep prints the whole line, while awk prints only column 2.
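
The comparison in the last step can be sketched side by side as follows, again assuming the lesson's log.txt contents.

```shell
# Same material as in the lesson.
printf 'error disk\ninfo start\nerror cpu\n' > log.txt

grep error log.txt                  # whole lines: error disk / error cpu
awk '/error/{print $2}' log.txt     # column 2 only: disk / cpu

# Printing $0 from awk gives the same lines that grep prints.
awk '/error/{print $0}' log.txt
```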

QUIZ

Knowledge Check

Answer each question one by one.

Q1. What does awk '{print $1}' f output?

Q2. Which option do you add to make awk use a comma as the separator?

Q3. Which lines does awk '/error/{print $1}' f process?