AWK INTRO
Usage:
# awk '{ instructions }' file(s)
# awk '/pattern/ { procedure }' file(s)
# awk -f script_file file(s)
Tasks:
1. Print each row of an input file, one at a time
Note: $0 represents the current record (row)
a. # awk '{ print $0 }' animals.txt
2. Print specific columns from animals.txt
# awk '{ print $1 }' animals.txt --- prints the first column of the file
# awk '{ print $2 }' animals.txt --- prints the second column
Changing $1 to $2 prints the second column instead. For a line that has no
second column, awk prints an empty line.
3. Print multiple columns from animals.txt
awk '{ print $1; print $2; }' animals.txt --- two print statements; each column goes on its own line
awk '{ print $1, $2; }' animals.txt ---------- if the file has no delimiter other than whitespace, this is enough
Note: awk '{ print $1, print $2; }' is a syntax error; after the comma, list the next field, not another print.
awk -F: '{ print $1; print $2; }' /etc/passwd ------- if the file uses another delimiter, pass it with the '-F' option
awk -F: '{ print $1, $2; }' /etc/passwd
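The difference between the semicolon form and the comma form can be checked without any input file by piping one line into awk. This is only a small sketch; the sample line is made up for illustration and is not taken from animals.txt.

```shell
# Hypothetical sample line (not from animals.txt).
line='lion 4 brown'

# Two print statements: each field is printed on its own line.
separate=$(printf '%s\n' "$line" | awk '{ print $1; print $2 }')

# One print statement with a comma: the fields are joined by the
# output field separator, which is a single space by default.
joined=$(printf '%s\n' "$line" | awk '{ print $1, $2 }')

printf '%s\n' "$separate"
printf '%s\n' "$joined"
```

The first command prints lion and 4 on two lines; the second prints them on one line separated by a space.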
4. Print lines containing 'deer' --- pattern
a. awk '/deer/ { print $0 }' animals.txt
5. Print lines containing digits
a. awk '/[0-9]/ { print $0 }' animals.txt
dog1
deer200
The regular expression [0-9] matches any single digit.
6. Print lines beginning with a numeric character
a. awk '/^[0-9]/ { print $0 }' animals.txt
123lion
7
7. Print lines made up only of numeric characters
a. awk '/^[0-9]*$/ { print $0 }' animals.txt
Because * also matches zero digits, blank lines match this pattern as well.
8. Remove blank lines using sed and pipe the output to awk for processing
# sed -e '/^$/d' animals.txt | awk '/^[0-9]*$/ { print $0 }'
If no pattern is specified, awk operates on every line; if a pattern is specified, it operates only on the lines that match.
awk '/^$/ { print $0 }' animals.txt --- prints all blank lines
Pattern matching is case sensitive by default, the same as in sed.
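The three digit patterns can be compared side by side on a few inline lines. This is a rough sketch with invented sample data; the last pattern uses + (one or more digits) instead of *, an assumption made here so that blank lines do not match and sed is not needed.

```shell
# Hypothetical sample data, including one blank line.
sample=$(printf 'dog1\ndeer200\n123lion\n7\n\ncat\n')

# A rule with a pattern and no action prints the matching line.
has_digit=$(printf '%s\n' "$sample" | awk '/[0-9]/')       # a digit anywhere
starts_digit=$(printf '%s\n' "$sample" | awk '/^[0-9]/')   # a digit at the start
only_digits=$(printf '%s\n' "$sample" | awk '/^[0-9]+$/')  # digits only, at least one

printf 'contains a digit:\n%s\n' "$has_digit"
printf 'starts with a digit:\n%s\n' "$starts_digit"
printf 'digits only:\n%s\n' "$only_digits"
```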
Delimiters
Default delimiter: whitespace (spaces, tabs)
Use '-F' to change the default delimiter
awk -F: '{ print $1; print $2; }' /etc/passwd ---- prints the fields on separate lines
awk -F: '{ print $1, $2; }' /etc/passwd --- prints the fields on a single line
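To see what '-F' changes, the same passwd-style line can be split both ways. A minimal sketch; the entry below is a made-up example in /etc/passwd format, not a real account.

```shell
# Hypothetical entry in /etc/passwd format.
entry='alice:x:1000:1000:Alice Example:/home/alice:/bin/bash'

# Default delimiter (whitespace): $1 runs up to the first space.
default_split=$(printf '%s\n' "$entry" | awk '{ print $1 }')

# Colon delimiter: $1 is the login name, $5 is the comment field.
colon_split=$(printf '%s\n' "$entry" | awk -F: '{ print $1, $5 }')

printf '%s\n' "$default_split"
printf '%s\n' "$colon_split"
```

Without '-F', the first field is everything before the first space (alice:x:1000:1000:Alice); with '-F:', the fields follow the colons.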
awk Scripts
Organize patterns and procedures into a script file
awk loops through lines of input from various sources: STDIN, pipes, files
Tasks:
1. Print something to the screen without reading input
a. # awk 'BEGIN { print "testing awk without input file" } '
2. Set a system variable: field separator to colon in the BEGIN block
a. awk 'BEGIN { FS = ":" } '
b. awk 'BEGIN { FS = ":" ; print "testing awk without input file" } '
Here the FS (field separator) assignment and the print statement are separated by a semicolon (;). Multiple print statements can be used the same way, separated by semicolons.
c. awk 'BEGIN { FS = ":" ; print "testing awk without input file"; print FS } '
Task: Print the lines containing the word deer from animals.txt
# awk script to find deer
# Component 1 - BEGIN
BEGIN { print " processing various records" }
# Component 2 - Main Loop
/deer/ { print } # this reads as "if the line contains deer, print the line"
# Component 3 - END
END { print "process complete" }
In this script the BEGIN and END components are optional; the main loop alone is enough to run it.
Save the file as animals.awk and then run the awk script:
# awk -f animals.awk animals.txt
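The order in which the three components run can be checked end to end. The sketch below writes a three-part script to a temporary file and feeds it three invented input lines; the temporary path and the sample data are assumptions for illustration only.

```shell
# Write a three-part awk script to a temporary file.
script=$(mktemp)
cat > "$script" <<'EOF'
# Component 1 - BEGIN: runs once, before any input is read
BEGIN { print "processing various records" }
# Component 2 - main loop: runs for every input line that matches /deer/
/deer/ { print }
# Component 3 - END: runs once, after the last input line
END { print "process complete" }
EOF

# Hypothetical input: three lines, only one of which contains "deer".
output=$(printf 'lion\ndeer200\ndog1\n' | awk -f "$script")
rm -f "$script"

printf '%s\n' "$output"
```

The BEGIN message comes first, then the single matching line, then the END message.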
Task: Parse /etc/passwd
Print entire lines
# Component 1 - BEGIN
BEGIN { FS = ":" ; print " processing various records" }
# Component 2 - Main Loop
{ print } # this simply prints every line in full
# Component 3 - END
END { print "process complete" }
Save it as test.awk and run it:
# awk -f test.awk /etc/passwd
Task: Print a specific name or column from /etc/passwd
# Component 1 - BEGIN
BEGIN { FS = ":" ; print " processing various records" }
# Component 2 - Main Loop
{ print $1, $5 } # print specific columns
# Component 3 - END
END { print "process complete" }
Task: Print specific columns for a specific user: /linuxcbt/ { print $1, $5 }
# Component 1 - BEGIN
BEGIN { FS = ":" ; print " processing various records" }
# Component 2 - Main Loop
/linuxcbt/ { print $1, $5 } # print specific columns for a specific user
# Component 3 - END
END { print "process complete" }
# awk -f test.awk /etc/passwd
Task: Print specific columns for a user matched on a given column: $1 ~ /name/ { print $1, $5 }
Here $1 is followed by a tilde (~) and a regular expression; the rule applies only to lines whose first column contains a match for that expression.
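A whole-line pattern and a column match can give different results when the name also appears in another column. A small sketch, using two made-up entries in /etc/passwd format:

```shell
# Hypothetical entries; "alice" also appears in the comment field of bob.
entries=$(printf '%s\n' \
  'alice:x:1000:1000:Alice Example:/home/alice:/bin/bash' \
  'bob:x:1001:1001:Friend of alice:/home/bob:/bin/sh')

# /alice/ is tested against the whole line, so both entries match.
any_field=$(printf '%s\n' "$entries" | awk -F: '/alice/ { print $1 }')

# $1 ~ /alice/ is tested against the first column only.
first_field=$(printf '%s\n' "$entries" | awk -F: '$1 ~ /alice/ { print $1 }')

printf 'whole-line match:\n%s\n' "$any_field"
printf 'first-column match:\n%s\n' "$first_field"
```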
Task: Print all users whose shell is "bash"
# Component 1 - BEGIN
BEGIN { FS = ":" ; print " processing various records" }
# Component 2 - Main Loop
$7 ~ /bash/ { print } # print the entire line when column 7 (the shell) contains bash
# Component 3 - END
END { print "process complete" }
# awk -f test.awk /etc/passwd
AWK Variables
Three types:
1. System variables -- e.g. FILENAME, RS (record separator)
2. Scalars -- e.g. a = 3
3. Arrays -- e.g. variable_name[index]
NOTE:- Variables do not need to be declared. AWK automatically registers them in memory on first use.
NOTE:- Variable names are case sensitive
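Scalars and arrays can be tried on a few inline lines. A minimal sketch; the names total and seen and the sample animals are invented for this example.

```shell
# Count all lines in a scalar and count each animal name in an array.
counts=$(printf 'deer\nlion\ndeer\n' | awk '
  { total = total + 1   # scalar: not declared, starts out empty (0 in arithmetic)
    seen[$1]++ }        # array: indexed by the text of the first field
  END { print "total", total
        print "deer", seen["deer"]
        print "lion", seen["lion"] }')

printf '%s\n' "$counts"
```

Neither total nor seen is declared anywhere; both come into existence the first time the main loop uses them.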
System Variables:
1. FILENAME: name of the current input file
2. FNR: number of the current record within the current input file; useful when multiple input files are given
3. FS (field separator): defaults to whitespace; can be a single character or a regular expression
4. OFS: output field separator
5. NF: number of fields in the current record
# Component 1 - BEGIN
BEGIN { print " processing various records" }
# Component 2 - Main Loop
/deer/ { print } # this reads as "if the line contains deer, print the line"
{ print NF } # print the number of fields on every line
# Component 3 - END
END { print "process complete"
print "Filename: " FILENAME }
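FILENAME, FNR, NF and OFS can be seen together by running one command over two small files. This is only a sketch; the temporary directory, the names a.txt and b.txt, and their contents are assumptions for the example.

```shell
# Create two small input files in a temporary directory.
dir=$(mktemp -d)
printf 'lion 4\ndeer 200 brown\n' > "$dir/a.txt"
printf 'dog 1\n' > "$dir/b.txt"

# OFS is set once in BEGIN; print then joins its arguments with "|".
# FNR restarts at 1 for each file; NF is the field count of each line.
report=$(cd "$dir" && awk 'BEGIN { OFS = "|" } { print FILENAME, FNR, NF }' a.txt b.txt)
rm -rf "$dir"

printf '%s\n' "$report"
```

Each output line has the form file|record number|field count, and the record number starts again at 1 when awk moves on to b.txt.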
AWK Process Records
Processing multiple delimiters in the same file
awk -F "[:; ]" '{print }' animal2.txt
awk -F "[:; ]" '{print $1 }' animal2.txt
awk -F "[:; ]" '{print $1,$2 }' animal2.txt
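The bracket expression passed to '-F' treats a colon, a semicolon or a space each as a delimiter. A small sketch on one invented line (the contents of animal2.txt are not shown here, so this record is an assumption):

```shell
# Hypothetical record that mixes three different delimiters.
record='lion:4;brown Africa'

# Print the number of fields, then the four fields joined by spaces.
fields=$(printf '%s\n' "$record" | awk -F '[:; ]' '{ print NF; print $1, $2, $3, $4 }')

printf '%s\n' "$fields"
```

The record splits into four fields, one at each delimiter.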