AWK has a number of built-in functions that are always available to the programmer. This tutorial describes AWK's arithmetic, string, time, bit manipulation, and other miscellaneous functions with suitable examples.
Arithmetic Functions
AWK has the following built-in arithmetic functions:
atan2(y, x)
It returns the arctangent of (y/x) in radians. The following example illustrates this:
[jerry]$ awk 'BEGIN {
   PI = 3.14159265
   x = -10
   y = 10
   result = atan2(y, x) * 180 / PI;
   printf "The arc tangent for (x=%f, y=%f) is %f degrees\n", x, y, result
}'
On executing the above code, you get the following result:
The arc tangent for (x=-10.000000, y=10.000000) is 135.000000 degrees
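The examples in this section hard-code PI. As a small sketch (not part of the original examples), the constant can also be derived from atan2 itself, because atan2(0, -1) is the angle of the point (-1, 0), which is PI radians. The shell variable names pi and deg are only for illustration.

```shell
# atan2(0, -1) is the angle of the point (-1, 0), which is PI radians.
pi=$(awk 'BEGIN { printf "%.6f", atan2(0, -1) }')
echo "PI = $pi"

# The same degree conversion as in the atan2 example, without a hand-typed constant.
deg=$(awk 'BEGIN { PI = atan2(0, -1); printf "%.1f", atan2(10, -10) * 180 / PI }')
echo "atan2(10, -10) = $deg degrees"
```

Deriving the constant this way avoids the rounding error introduced by a truncated literal.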
cos(expr)
This function returns the cosine of expr, which is in radians. The following example illustrates this:
[jerry]$ awk 'BEGIN {
   PI = 3.14159265
   param = 60
   result = cos(param * PI / 180.0);
   printf "The cosine of %f degrees is %f.\n", param, result
}'
On executing the above code, you get the following result:
The cosine of 60.000000 degrees is 0.500000.
exp(expr)
This function returns e raised to the power expr.
[jerry]$ awk 'BEGIN {
   param = 5
   result = exp(param);
   printf "The exponential value of %f is %f.\n", param, result
}'
On executing the above code, you get the following result:
The exponential value of 5.000000 is 148.413159.
int(expr)
This function truncates expr to an integer value by discarding the fractional part. The following example illustrates this:
[jerry]$ awk 'BEGIN {
   param = 5.12345
   result = int(param)
   print "Truncated value =", result
}'
On executing the above code, you get the following result:
Truncated value = 5
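Because int discards the fractional part, negative values move toward zero rather than down. The sketch below, which is an illustration and not part of the original examples, shows this along with a common way to round a non-negative value to the nearest integer. The shell variable names are hypothetical.

```shell
# int() drops the fractional part, so -5.9 becomes -5, not -6.
trunc=$(awk 'BEGIN { print int(5.9), int(-5.9) }')
echo "Truncated: $trunc"

# Rounding a non-negative value: add 0.5, then truncate.
rounded=$(awk 'BEGIN { print int(5.5 + 0.5), int(5.4 + 0.5) }')
echo "Rounded: $rounded"
```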
log(expr)
This function returns the natural logarithm of expr.
[jerry]$ awk 'BEGIN {
   param = 5.5
   result = log(param)
   printf "log(%f) = %f\n", param, result
}'
On executing the above code, you get the following result:
log(5.500000) = 1.704748
rand
This function returns a random number N such that 0 <= N < 1. For instance, the example below generates three random numbers:
[jerry]$ awk 'BEGIN {
   print "Random num1 =" , rand()
   print "Random num2 =" , rand()
   print "Random num3 =" , rand()
}'
On executing the above code, you get the following result:
Random num1 = 0.237788
Random num2 = 0.291066
Random num3 = 0.845814
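A frequent use of rand is to pick a random integer in a range. The following is a minimal sketch, assuming a range of 1 to 6. It first calls srand (described later in this section) so that the result differs between runs; the variable name roll is hypothetical.

```shell
# Seed the generator from the clock, then scale rand() into the range 1..6.
# rand() is always less than 1, so int(rand() * 6) is one of 0..5.
roll=$(awk 'BEGIN { srand(); print int(rand() * 6) + 1 }')
echo "Dice roll = $roll"
```

The value differs from run to run, so no fixed output is shown here.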
sin(expr)
This function returns the sine of expr, which is in radians. The following example illustrates this:
[jerry]$ awk 'BEGIN {
   PI = 3.14159265
   param = 30.0
   result = sin(param * PI / 180)
   printf "The sine of %f degrees is %f.\n", param, result
}'
On executing the above code, you get the following result:
The sine of 30.000000 degrees is 0.500000.
sqrt(expr)
This function returns the square root of expr.
[jerry]$ awk 'BEGIN {
   param = 1024.0
   result = sqrt(param)
   printf "sqrt(%f) = %f\n", param, result
}'
On executing the above code, you get the following result:
sqrt(1024.000000) = 32.000000
srand([expr])
This function sets the seed of the random number generator to expr and returns the previous seed. If expr is omitted, it uses the time of day as the seed value.
[jerry]$ awk 'BEGIN {
   param = 10
   printf "srand() = %d\n", srand()
   printf "srand(%d) = %d\n", param, srand(param)
}'
On executing the above code, you get the following result:
srand() = 1
srand(10) = 1417959587
Each call returns the seed that was in effect before it: the first call returns the default seed, and the second returns the time-based seed set by the first call.
String Functions
AWK has the following built-in string functions:
asort(arr [, d [, how] ])
This function, a gawk extension, sorts the contents of arr using gawk's normal rules for comparing values, and replaces the indices of the sorted values of arr with sequential integers starting with 1.
[jerry]$ awk 'BEGIN {
   arr[0] = "Three"
   arr[1] = "One"
   arr[2] = "Two"
   print "Array elements before sorting:"
   for (i in arr) {
      print arr[i]
   }
   asort(arr)
   print "Array elements after sorting:"
   for (i in arr) {
      print arr[i]
   }
}'
On executing the above code, you get the following result:
Array elements before sorting:
Three
One
Two
Array elements after sorting:
One
Three
Two
asorti(arr [, d [, how] ])
The behaviour of this function is the same as that of asort(), except that the array indices, rather than the values, are sorted.
[jerry]$ awk 'BEGIN {
   arr["Two"] = 1
   arr["One"] = 2
   arr["Three"] = 3
   asorti(arr)
   print "Array indices after sorting:"
   for (i in arr) {
      print arr[i]
   }
}'
On executing the above code, you get the following result:
Array indices after sorting:
One
Three
Two
gsub(regex, sub, string)
gsub stands for global substitution. It replaces every occurrence of regex with sub. The third parameter is optional; if it is omitted, $0 is used.
[jerry]$ awk 'BEGIN {
   str = "Hello, World"
   print "String before replacement = " str
   gsub("World", "Jerry", str)
   print "String after replacement = " str
}'
On executing the above code, you get the following result:
String before replacement = Hello, World
String after replacement = Hello, Jerry
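Two details of gsub are easy to miss: it returns the number of substitutions made, and an & in the replacement text stands for the text that was matched. The sketch below illustrates both. The sample string and the shell variable name result are chosen only for illustration.

```shell
result=$(awk 'BEGIN {
   str = "Hello, World"
   # gsub returns the number of substitutions; & stands for the matched text.
   n = gsub(/o/, "[&]", str)
   print n, str
}')
echo "$result"
# prints: 2 Hell[o], W[o]rld
```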
index(str, sub)
It checks whether sub is a substring of str. On success, it returns the position where sub starts; otherwise it returns 0. The first character of str is at position 1.
[jerry]$ awk 'BEGIN {
   str = "One Two Three"
   subs = "Two"
   ret = index(str, subs)
   printf "Substring \"%s\" found at %d location.\n", subs, ret
}'
On executing the above code, you get the following result:
Substring "Two" found at 5 location.
length(str)
It returns the length of the string str.
[jerry]$ awk 'BEGIN {
   str = "Hello, World !!!"
   print "Length =", length(str)
}'
On executing the above code, you get the following result:
Length = 16
match(str, regex)
It returns the index of the leftmost longest match of regex in string str, or 0 if no match is found. It also sets the built-in variables RSTART and RLENGTH to the position and length of the match.
[jerry]$ awk 'BEGIN {
   str = "One Two Three"
   subs = "Two"
   ret = match(str, subs)
   printf "Substring \"%s\" found at %d location.\n", subs, ret
}'
On executing the above code, you get the following result:
Substring "Two" found at 5 location.
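In addition to its return value, match sets the built-in variables RSTART and RLENGTH, which can be passed to substr to extract the matched text. A minimal sketch follows, with a regular expression chosen only for illustration.

```shell
found=$(awk 'BEGIN {
   str = "One Two Three"
   # match() sets RSTART (start position) and RLENGTH (match length).
   if (match(str, /T[a-z]+/))
      print RSTART, RLENGTH, substr(str, RSTART, RLENGTH)
}')
echo "$found"
# prints: 5 3 Two
```

The leftmost match wins, so "Two" is reported even though "Three" also matches the pattern.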
split(str, arr, regex)
This function splits string str into fields using the regular expression regex, loads the fields into array arr, and returns the number of fields. If regex is omitted, FS is used.
[jerry]$ awk 'BEGIN {
   str = "One,Two,Three,Four"
   split(str, arr, ",")
   print "Array contains following values"
   for (i in arr) {
      print arr[i]
   }
}'
On executing the above code, you get the following result:
Array contains following values
One
Two
Three
Four
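The order in which a for (i in arr) loop visits elements is not guaranteed by AWK. Since split stores the fields at indices 1 to n and returns n, a counted loop is a safer way to walk the result in order. The following is a sketch of that approach; the variable name fields is hypothetical.

```shell
fields=$(awk 'BEGIN {
   # split() returns the number of elements, stored at indices 1..n.
   n = split("One,Two,Three,Four", arr, ",")
   for (i = 1; i <= n; i++)
      print i ":" arr[i]
}')
echo "$fields"
```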
sprintf(format, expr-list)
This function returns a string constructed from expr-list according to format.
[jerry]$ awk 'BEGIN {
   str = sprintf("%s", "Hello, World !!!")
   print str
}'
On executing the above code, you get the following result:
Hello, World !!!
strtonum(str)
This function examines str and returns its numeric value. If str begins with a leading 0, it is treated as an octal number. If str begins with a leading 0x or 0X, it is treated as a hexadecimal number. Otherwise, it is assumed to be a decimal number.
[jerry]$ awk 'BEGIN {
   print "Decimal num = " strtonum("123")
   print "Octal num = " strtonum("0123")
   print "Hexadecimal num = " strtonum("0x123")
}'
On executing the above code, you get the following result:
Decimal num = 123
Octal num = 83
Hexadecimal num = 291
sub(regex, sub, string)
This function performs a single substitution. It replaces the first occurrence of regex with sub. The third parameter is optional; if it is omitted, $0 is used.
[jerry]$ awk 'BEGIN {
   str = "Hello, World"
   print "String before replacement = " str
   sub("World", "Jerry", str)
   print "String after replacement = " str
}'
On executing the above code, you get the following result:
String before replacement = Hello, World
String after replacement = Hello, Jerry
substr(str, start, l)
This function returns the substring of string str, starting at index start, of length l. If l is omitted, the suffix of str starting at index start is returned.
[jerry]$ awk 'BEGIN {
   str = "Hello, World !!!"
   subs = substr(str, 1, 5)
   print "Substring = " subs
}'
On executing the above code, you get the following result:
Substring = Hello
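The form of substr without a length can be sketched as follows. It reuses the sample string from the example above; the shell variable name tail is hypothetical.

```shell
# With no length argument, substr returns everything from position 8 onward.
tail=$(awk 'BEGIN { print substr("Hello, World !!!", 8) }')
echo "Suffix = $tail"
# prints: Suffix = World !!!
```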
tolower(str)
This function returns a copy of string str with all upper case characters converted to lower case.
[jerry]$ awk 'BEGIN {
   str = "HELLO, WORLD !!!"
   print "Lowercase string = " tolower(str)
}'
On executing the above code, you get the following result:
Lowercase string = hello, world !!!
toupper(str)
This function returns a copy of string str with all lower case characters converted to upper case.
[jerry]$ awk 'BEGIN {
   str = "hello, world !!!"
   print "Uppercase string = " toupper(str)
}'
On executing the above code, you get the following result:
Uppercase string = HELLO, WORLD !!!
Time Functions
AWK has the following built-in time functions:
systime()
This function returns the current time of day as the number of seconds since the Epoch (1970-01-01 00:00:00 UTC on POSIX systems).
[jerry]$ awk 'BEGIN {
   print "Number of seconds since the Epoch = " systime()
}'
On executing the above code, you get the following result:
Number of seconds since the Epoch = 1418574432
mktime(datespec)
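This function converts the datespec string into a timestamp of the same form as returned by systime(). The datespec is a string of the form YYYY MM DD HH MM SS.
[jerry]$ awk 'BEGIN {
   print "Number of seconds since the Epoch = " mktime("2014 12 14 30 20 10")
}'
On executing the above code, you get the following result:
Number of seconds since the Epoch = 1418604610
Note that the hour field in this example is 30, which is outside the usual 0 to 23 range. gawk normalizes out-of-range fields, so this is treated as 06:20:10 on the following day. The result also depends on the local time zone.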
strftime([format [, timestamp[, utc-flag]]])
This function formats timestamps according to the specification in format.
[jerry]$ awk 'BEGIN {
   print strftime("Time = %m/%d/%Y %H:%M:%S", systime())
}'
On executing the above code, you get the following result:
Time = 12/14/2014 22:08:42
Following are the various time formats supported by AWK:
| Date format specification | Description |
|---|---|
| %a | The locale’s abbreviated weekday name. |
| %A | The locale’s full weekday name. |
| %b | The locale’s abbreviated month name. |
| %B | The locale’s full month name. |
| %c | The locale’s appropriate date and time representation. (This is %A %B %d %T %Y in the C locale.) |
| %C | The century part of the current year. This is the year divided by 100 and truncated to the next lower integer. |
| %d | The day of the month as a decimal number (01–31). |
| %D | Equivalent to specifying %m/%d/%y. |
| %e | The day of the month, padded with a space if it is only one digit. |
| %F | Equivalent to specifying %Y-%m-%d. This is the ISO 8601 date format. |
| %g | The year modulo 100 of the ISO 8601 week number, as a decimal number (00–99). For example, January 1, 1993 is in week 53 of 1992. Thus, the year of its ISO 8601 week number is 1992, even though its year is 1993. Similarly, December 31, 1973 is in week 1 of 1974. Thus, the year of its ISO week number is 1974, even though its year is 1973. |
| %G | The full year of the ISO week number, as a decimal number. |
| %h | Equivalent to %b. |
| %H | The hour (24-hour clock) as a decimal number (00–23). |
| %I | The hour (12-hour clock) as a decimal number (01–12). |
| %j | The day of the year as a decimal number (001–366). |
| %m | The month as a decimal number (01–12). |
| %M | The minute as a decimal number (00–59). |
| %n | A newline character (ASCII LF). |
| %p | The locale’s equivalent of the AM/PM designations associated with a 12-hour clock. |
| %r | The locale’s 12-hour clock time. (This is %I:%M:%S %p in the C locale.) |
| %R | Equivalent to specifying %H:%M. |
| %S | The second as a decimal number (00–60). |
| %t | A TAB character. |
| %T | Equivalent to specifying %H:%M:%S. |
| %u | The weekday as a decimal number (1–7). Monday is day one. |
| %U | The week number of the year (the first Sunday as the first day of week one) as a decimal number (00–53). |
| %V | The week number of the year (the first Monday as the first day of week one) as a decimal number (01–53). |
| %w | The weekday as a decimal number (0–6). Sunday is day zero. |
| %W | The week number of the year (the first Monday as the first day of week one) as a decimal number (00–53). |
| %x | The locale’s appropriate date representation. (This is %A %B %d %Y in the C locale.) |
| %X | The locale’s appropriate time representation. (This is %T in the C locale.) |
| %y | The year modulo 100 as a decimal number (00–99). |
| %Y | The full year as a decimal number (e.g. 2011). |
| %z | The time-zone offset in a +HHMM format (e.g., the format necessary to produce RFC 822/RFC 1036 date headers). |
| %Z | The time zone name or abbreviation; no characters if no time zone is determinable. |
Bit Manipulation Functions
AWK has the following built-in bit manipulation functions:
and
Performs a bitwise AND operation.
[jerry]$ awk 'BEGIN {
   num1 = 10
   num2 = 6
   printf "(%d AND %d) = %d\n", num1, num2, and(num1, num2)
}'
On executing the above code, you get the following result:
(10 AND 6) = 2
compl
Performs a bitwise COMPLEMENT operation.
[jerry]$ awk 'BEGIN {
   num1 = 10
   printf "compl(%d) = %d\n", num1, compl(num1)
}'
On executing the above code, you get the following result:
compl(10) = 9007199254740981
lshift
Performs a bitwise LEFT SHIFT operation.
[jerry]$ awk 'BEGIN {
   num1 = 10
   printf "lshift(%d) by 1 = %d\n", num1, lshift(num1, 1)
}'
On executing the above code, you get the following result:
lshift(10) by 1 = 20
rshift
Performs a bitwise RIGHT SHIFT operation.
[jerry]$ awk 'BEGIN {
   num1 = 10
   printf "rshift(%d) by 1 = %d\n", num1, rshift(num1, 1)
}'
On executing the above code, you get the following result:
rshift(10) by 1 = 5
or
Performs a bitwise OR operation.
[jerry]$ awk 'BEGIN {
   num1 = 10
   num2 = 6
   printf "(%d OR %d) = %d\n", num1, num2, or(num1, num2)
}'
On executing the above code, you get the following result:
(10 OR 6) = 14
xor
Performs a bitwise XOR operation.
[jerry]$ awk 'BEGIN {
   num1 = 10
   num2 = 6
   printf "(%d XOR %d) = %d\n", num1, num2, xor(num1, num2)
}'
On executing the above code, you get the following result:
(10 XOR 6) = 12
Miscellaneous Functions
AWK has the following miscellaneous functions:
close(expr)
This function closes a file or pipe.
[jerry]$ awk 'BEGIN {
   cmd = "tr [a-z] [A-Z]"
   print "hello, world !!!" |& cmd
   close(cmd, "to")
   cmd |& getline out
   print out;
   close(cmd);
}'
On executing the above code, you get the following result:
HELLO, WORLD !!!
Does the script look cryptic? Let us demystify it.
The first statement, cmd = "tr [a-z] [A-Z]", is the command with which we are going to establish two-way communication from AWK.
The next statement, the print command, provides input to the tr command. Here the |& operator (a gawk extension) indicates two-way communication.
The third statement, close(cmd, "to"), closes the "to" end of the two-way pipe, so that tr sees end-of-file on its input and can finish producing its output.
The next statement, cmd |& getline out, stores the output in the out variable with the aid of the getline function.
The next print statement prints the output, and finally the close function closes the command.
delete
This statement deletes an element from an array. The following example shows the usage of delete:
[jerry]$ awk 'BEGIN {
   arr[0] = "One"
   arr[1] = "Two"
   arr[2] = "Three"
   arr[3] = "Four"
   print "Array elements before delete operation:"
   for (i in arr) {
      print arr[i]
   }
   delete arr[0]
   delete arr[1]
   print "Array elements after delete operation:"
   for (i in arr) {
      print arr[i]
   }
}'
On executing the above code, you get the following result:
Array elements before delete operation:
One
Two
Three
Four
Array elements after delete operation:
Three
Four
exit
This statement stops the execution of the script. It also accepts an optional expr, which becomes AWK's exit status. The following example describes the usage of exit:
[jerry]$ awk 'BEGIN {
   print "Hello, World !!!"
   exit 10
   print "AWK never executes this statement."
}'
On executing the above code, you get the following result:
Hello, World !!!
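The value passed to exit is not printed; it becomes the exit status that the shell sees. As a small sketch, the status can be captured from the shell variable $? right after the command finishes. The variable name status is hypothetical, and the `|| status=$?` form is used only so the snippet also works in shells running with `set -e`.

```shell
status=0
# exit 10 makes awk terminate with status 10; capture it from $?.
awk 'BEGIN { exit 10 }' || status=$?
echo "Exit status = $status"
# prints: Exit status = 10
```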
fflush
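This function flushes any buffers associated with an open output file or pipe. Below is the syntax of the function:
fflush([output-expr])
If no output-expr is supplied, it flushes standard output. If output-expr is the null string (""), it flushes all open files and pipes. Note that this describes older gawk behaviour; according to the gawk manual, version 4.0.2 and later flush all open output files and pipes when no argument is given.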
getline
This function instructs AWK to read the next line. The example below reads the marks.txt file using the getline function:
[jerry]$ awk '{getline; print $0}' marks.txt
On executing the above code, you get the following result:
2) Rahul Maths 90
4) Kedar English 85
5) Hari History 89
The script worked, but where is the first line? Let us find out.
At startup, AWK reads the first line from the file marks.txt and stores it in the $0 variable.
In the next statement, we instruct AWK to read the next line using getline. Hence AWK reads the second line and stores it in $0, replacing the first.
Finally, AWK's print statement prints the second line. This process continues until the file is exhausted. When AWK reaches the last line, getline finds no more input and leaves $0 unchanged, so the fifth line is printed as well.
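To avoid surprises at the end of input, the return value of getline can be checked: it is 1 when a line was read, 0 at end of input, and -1 on error. The sketch below uses the getline var form, which stores the line in a variable and leaves $0 untouched. The three-line input and the variable names nxt and out are made up for illustration.

```shell
out=$(printf 'line1\nline2\nline3\n' | awk '{
   # getline returns 1 on success, 0 at end of input, and -1 on error.
   if ((getline nxt) > 0)
      print $0 " -> " nxt
   else
      print $0 " -> (no more input)"
}')
echo "$out"
```

Here the first record is paired with the second, and the third record finds nothing left to read.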
next
The next statement changes the flow of the program. It stops processing of the current record. The program reads the next line and starts executing the commands again with the new line. For instance, the program below does not perform any processing when the pattern match succeeds:
[jerry]$ awk '{if ($0 ~/Shyam/) next; print $0}' marks.txt
On executing the above code, you get the following result:
1) Amit Physics 80
2) Rahul Maths 90
4) Kedar English 85
5) Hari History 89
nextfile
The nextfile statement changes the flow of the program. It stops processing the current input file and starts a new cycle through the pattern/procedure statements, beginning with the first record of the next file. For instance, the example below stops processing the first file when the pattern match succeeds.
First create two files. Our file1.txt looks as follows:
file1:str1
file1:str2
file1:str3
file1:str4
And our file2.txt looks as follows:
file2:str1
file2:str2
file2:str3
file2:str4
Now let us use nextfile:
[jerry]$ awk '{ if ($0 ~ /file1:str2/) nextfile; print $0 }' file1.txt file2.txt
On executing the above code, you get the following result:
file1:str1
file2:str1
file2:str2
file2:str3
file2:str4
return
This statement can be used within a user-defined function to return a value. Please note that the return value of a function is undefined if expr is not provided. The example below describes the usage of return.
First, create a functions.awk file containing the AWK commands shown below:
function addition(num1, num2)
{
   result = num1 + num2
   return result
}
BEGIN {
   res = addition(10, 20)
   print "10 + 20 = " res
}
On executing the above code with awk -f functions.awk, you get the following result:
10 + 20 = 30
system
This function executes the specified command and returns its exit status. A return status of 0 indicates that the command succeeded. A non-zero value indicates that the command failed. For instance, the example below displays the current date and also shows the return status of the command:
[jerry]$ awk 'BEGIN { ret = system("date"); print "Return value = " ret }'
On executing the above code, you get the following result:
Sun Dec 21 23:16:07 IST 2014
Return value = 0
