using awk to check between two dates
Solution 1
The key observation is that you can compare your timestamps using alphanumeric comparisons and get the correct answer - that is the beauty of ISO 8601 notation.
Thus, adapting your code slightly - and formatting to avoid scroll bars:
awk 'BEGIN {
FS = "\n"
RS = ""
OFS = ";"
ORS = "\n"
t1 = "2010-03-23T07:45:00"
t2 = "2010-03-23T08:00:00"
m1 = "eventTimestamp: " t1
m2 = "eventTimestamp: " t2
}
$1 ~ /eventTimestamp:/ && $4 ~ /SMS-MO-FSM(-INFO)?$/ {
if ($1 >= m1 && $1 <= m2) print $1, $2, $3, $4;
}' "$@"
Obviously, you could put this into a script file - you wouldn't want to type it often. And getting the date range entered accurately and conveniently is one of the hard parts. Note that I've adjusted the time range to match the data.
When run on the sample data, it outputs one record:
eventTimestamp: 2010-03-23T07:56:19.186;result: Allowed;protocol: SMS;payload: SMS-MO-FSM
Solution 2
A bit of a kludge, but this script assumes you have the unix "date" command. Also hard coded your start and end timestamps in the BEGIN block. Note that your test data listed above does not fall within your sample start/end times.
#!/usr/bin/awk -f
BEGIN {
command="date -f\"%s\" -d \"2010-03-23 12:56:47\""; command | getline startTime; close(command)
command="date -f\"%s\" -d \"2010-03-23 13:56:47\""; command | getline endTime; close(command)
}
$0 ~ /^eventTimestamp:/ {
command="date -f\"%s\" -d " $2; command | getline currTime; close(command)
if (currTime >= startTime && currTime <= endTime) {
printIt="true"
}else{
printIt="false";
}
}
printIt == "true" { print }
amadain
Updated on June 04, 2022Comments
-
amadain almost 2 years
I have a file with multiple data structures in it like so:
eventTimestamp: 2010-03-23T07:56:19.166 result: Allowed protocol: SMS payload: RCOMM_SMS eventTimestamp: 2010-03-23T07:56:19.167 result: Allowed protocol: SMS payload: RCOMM_SMS eventTimestamp: 2010-03-23T07:56:19.186 result: Allowed protocol: SMS payload: SMS-MO-FSM eventTimestamp: 2010-03-23T07:56:19.197 result: Allowed protocol: SMS payload: COPS eventTimestamp: 2010-03-23T07:56:29.519 result: Blocked protocol: SMS payload: COPS type: URL_IWF result: Blocked
I want to find all of the events that are payload: SMS-MO-FSM or payload: SMS-MO-FSM-INFO that occurred between the times 2010-03-23 12:56:47 and 2010-03-23 13:56:47. When querying this file so far I have used awk in the following manner:
cat checkThis.txt | awk 'BEGIN{FS="\n"; RS=""; OFS=";"; ORS="\n"} $1~/eventTimestamp: 2010-03-23T14\:16\:35/ && $4~/SMS-MO-FSM-INFO|SMS-MO-FSM$/ {$1=$1 ""; print $0}'
Which will give me all of the events that occurred on the second of 14:16:35 in 2010-03-23. I am struggling, however, to think of how I could put the date range into my query. I could use the following to put the dates into epoch time but how can I use the following in my awk to check whether the date is between the times needed:
python -c "import time; ENGINE_TIME_FORMAT='%Y-%m-%dT%H:%M:%S'; print int(time.mktime(time.strptime('2010-03-23T12:52:52', ENGINE_TIME_FORMAT)))"
I know this could done in Python but I have written a parser in Python for this and I want this method as an alternative checker so I want to use awk if at all possible.
I took this a little further and created a python script for time conversion:
#!/usr/local/bin/python import time, sys ENGINE_TIME_FORMAT='%Y-%m-%dT%H:%M:%S' testTime = sys.argv[1] try: print int(time.mktime(time.strptime(testTime, ENGINE_TIME_FORMAT))) except: print "Time to convert %s" % testTime raise
I then tried to use getline to assign the conversion to a variable for comparison:
cat checkThis.txt| awk 'BEGIN {FS="\n"; RS=""; OFS=";"; ORS="\n"; "./firstDate '2010-03-23T12:56:47'" | getline start_time; close("firstDate"); "./firstDate '2010-03-23T13:56:47'" | getline end_time; close("firstDate");} ("./firstDate $1" | getline) > start_time {$1=$1 ""; print $0}' Traceback (most recent call last): File "./firstDate", line 4, in <module> testTime = sys.argv[1] IndexError: list index out of range
The getline works in the BEGIN and I checked it in the final print but I seem to have problems in the comparison part of the script.
-
cryptochaos over 13 yearsAs I don't have enough reputation to vote yet, I can't vote for Jonathan Leffler's solution. But it's a good one.