How to filter logs between a time range

text-processing awk sed grep

Solution 1

With awk:

awk -v 'start=2018-04-12 14:44:00.000' -v end='2018-04-12 14:45:00.000' '
   /^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2} / {
     inrange = $0 >= start && $0 <= end
   }
   inrange' < your-file

It won't work with older versions of mawk, which support neither POSIX character classes nor interval regexp operators.
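For awk implementations without those regexp features, the same idea can be sketched with plain bracket ranges. This is only a sketch under assumptions: `filter_range` is a hypothetical helper name, the timestamp is assumed to be exactly 23 characters wide (`YYYY-MM-DD HH:MM:SS.mmm`), and the end of the range is treated as exclusive.

```shell
# filter_range START END FILE: print entries with START <= stamp < END
filter_range() {
  awk -v start="$1" -v end="$2" '
    # A line starting with YYYY-MM-DD opens a new entry; compare its
    # fixed-width 23-character stamp as a plain string.
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] / {
      stamp = substr($0, 1, 23)
      inrange = (stamp >= start && stamp < end)
    }
    # Continuation lines inherit the decision made for their entry.
    inrange' "$3"
}
```

Usage: `filter_range '2018-04-12 14:44:00.000' '2018-04-12 14:45:00.000' your-file`. String comparison is enough here because the timestamp format sorts lexically in time order.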

Solution 2

If you just want particular lines between certain times, awk will work. Here is a short tutorial.

To start, find out which lines you want:

cat -n logfile

That will show the contents of the file with the line numbers.

To print the lines you want by line number:

awk 'NR==2,NR==4' logfile

That prints out the range between lines 2 and 4.

If you want to print several lines or ranges that aren't consecutive, separate the ranges with ; (single-line tests such as NR==5 can also be joined with ||, but a range pattern cannot be combined with ||):

awk 'NR==5,NR==10;NR==15,NR==20' logfile
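A quick way to see the difference between the two separators is to feed awk numbered input. The following is a small sketch that uses `seq` as a stand-in for a log file; the variable names are only for illustration.

```shell
# Single-line tests can be joined with ||
single=$(seq 1 20 | awk 'NR==2 || NR==4')

# Ranges are separate rules, so they are joined with ;
ranges=$(seq 1 20 | awk 'NR==5,NR==7; NR==15,NR==16')

echo "$single"   # prints 2 and 4, one per line
echo "$ranges"   # prints 5, 6, 7, 15 and 16, one per line
```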

Moving on to printing the lines within a certain time range, combine the above with egrep (grep -E):

egrep -n "2018-04-12 14:44:01.000|2018-04-12 14:46:00.000" logfile

egrep accepts several alternative patterns, separated by the | symbol. With -n, that prints the lines containing the start and end of the time range (I changed the end to a later time to include more lines), each prefixed with its line number. Note that this only works if both timestamps actually occur in the log. You can then use awk to print the range between and including the two lines, substituting the line numbers that egrep reported (5 and 10 here):

awk 'NR==5,NR==10' logfile

You can take all of this as an example and modify it to suit your needs for your log files and what you want to print out according to the times.
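The two manual steps (look up the line numbers, then print that range) could be combined along these lines. This is a sketch, not a definitive implementation: `print_between` is a hypothetical helper, and it assumes both timestamps occur literally in the file.

```shell
# print_between START END FILE: print from the first line containing START
# through the first line containing END (both inclusive).
print_between() {
  # grep -n prefixes each match with "LINENO:"; keep the first match only
  first=$(grep -nF -- "$1" "$3" | head -n 1 | cut -d: -f1)
  last=$(grep -nF -- "$2" "$3" | head -n 1 | cut -d: -f1)
  # Give up if either timestamp does not occur in the file
  [ -n "$first" ] && [ -n "$last" ] || return 1
  awk -v first="$first" -v last="$last" 'NR >= first && NR <= last' "$3"
}
```

Usage: `print_between '2018-04-12 14:44:01.000' '2018-04-12 14:46:00.000' logfile`. The function returns a non-zero status when a timestamp is missing, which is exactly the case where this line-number approach breaks down.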

Solution 3

I had a similar problem recently, but the "simple" sed/awk approach failed when there were no log entries for a given minute (e.g. on idle routers).

In the end I generated a grep command covering the last n minutes, like this:

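searchterms() {
  backlog_minutes=15
  # Print one "Mon DD HH:MM" stamp per minute, from now back to backlog_minutes ago.
  # If your log pads single-digit days with a space ("Apr  5"), %d will not match it.
  for searchstamp in $(seq 0 60 $((60*backlog_minutes))); do
    LANG=en_US.UTF-8 date "+%b %d %H:%M" -d @"$(($(date +%s)-searchstamp))"
  done
}
# Wrap each stamp as a quoted -e "..." option for grep
greptarget=$(searchterms | sed 's/^/-e "/;s/$/"/')

## OpenWrt
which logread | grep -q logread && ( grepcmd=$(echo grep "$greptarget"); echo "logread|$(echo $grepcmd)" | sh )
## Linux Debian/Ubuntu (greptarget already carries the -e options)
which logread | grep -q logread || ( echo grep $greptarget /var/log/syslog | sh; exit 0 )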

What I tried before (https://unix.stackexchange.com/a/437445/374376):

##_date_syslog_15_min() { LANG=en_US.UTF-8 date "+%b %d %H:%M" -d "15 min ago"   ; } ;
##_date_syslog_now() { LANG=en_US.UTF-8 date "+%b %d %H:%M" ; } ;
## ↑↑ these failed when there were no log entries
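The same per-minute idea can be sketched without building a command string for sh, by writing the stamps to a pattern file for `grep -F -f`. The helper names `minute_stamps` and `recent_lines` are hypothetical, and the sketch assumes a `date` that accepts `-d @EPOCH` (as the code above already does) and a log whose timestamps look like `Apr 12 14:44`.

```shell
# minute_stamps NOW_EPOCH MINUTES: one "Mon DD HH:MM" stamp per minute,
# from NOW_EPOCH back to MINUTES minutes earlier
minute_stamps() {
  now=$1 minutes=$2 i=0
  while [ "$i" -le "$minutes" ]; do
    LC_ALL=C date -d "@$((now - 60 * i))" '+%b %d %H:%M'
    i=$((i + 1))
  done
}

# recent_lines FILE [MINUTES]: lines stamped within the last MINUTES (default 15)
recent_lines() {
  patterns=$(mktemp) || return 1
  minute_stamps "$(date +%s)" "${2:-15}" > "$patterns"
  # -F: treat each stamp as a fixed string; -f: read one pattern per line
  grep -F -f "$patterns" "$1"
  status=$?
  rm -f "$patterns"
  return "$status"
}
```

Usage: `recent_lines /var/log/syslog 15`. Because every minute in the window gets its own pattern, a minute with no log entries does no harm.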

Solution 4

Drop your date limits into file1:

2018-04-12 14:44:00.000
2018-04-12 14:45:00.000

Then we can run awk over both files (mktime is an extension, available in gawk):

awk -F' |-|:' '
  {this=mktime($1" "$2" "$3" "$4" "$5" "$6);
     this=(this==-1)?last:this; last=this}
  NR==1{from=this;next}
  NR==2{to=this;next}
  this>=from&&this<=to' file1 file

2018-04-12 14:44:01.000 ERROR world
2018-04-12 14:44:03.000 INFO this is a multi-line log
NOTICE THIS LINE, this line is also part of the log

Walkthrough

Set up your field separators to split out the date/time elements

awk -F' |-|:'
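To see what that separator does, here is a one-line sketch that splits a sample log line and prints the first six fields (the variable name `fields` is only for illustration):

```shell
# Split on space, hyphen or colon and show the date/time pieces
fields=$(echo '2018-04-12 14:44:01.000 ERROR world' |
  awk -F' |-|:' '{print $1, $2, $3, $4, $5, $6}')
echo "$fields"   # prints: 2018 04 12 14 44 01.000
```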

Convert the first 6 fields to a timestamp

  '{this=mktime($1" "$2" "$3" "$4" "$5" "$6);

If this is a textfield without a stamp then take the last valid timestamp

     this=(this==-1)?last:this; last=this}

If this is the first record of the first file then store the timestamp in from and go get the next record

  NR==1{from=this;next}

Ditto for to

  NR==2{to=this;next}

Then just iterate over the second (log) file, check whether the timestamp is in range, and print the line if it is

  this>=from&&this<=to' file1 file
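If mktime is not available, the same two-file layout can be sketched with plain string comparison, since this timestamp format sorts lexically in time order. This swaps the technique rather than reproducing the answer: `filter_by_limits` is a hypothetical helper, the stamp is assumed to be 23 characters wide, and file1 is assumed to hold exactly the two limit lines.

```shell
# filter_by_limits LIMITS_FILE LOG_FILE: print entries with from <= stamp <= to
filter_by_limits() {
  awk '
    # First file: line 1 is the lower limit, line 2 the upper limit
    NR == FNR {
      if (FNR == 1) from = $0
      else if (FNR == 2) to = $0
      next
    }
    # Log file: a dated line sets the current stamp; other lines keep the last one
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] / { stamp = substr($0, 1, 23) }
    stamp >= from && stamp <= to' "$1" "$2"
}
```

Usage: `filter_by_limits file1 file`. As in the answer above, both limits are inclusive and continuation lines follow the entry they belong to.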

Author by

aLeX

Updated on September 18, 2022

Comments

aLeX over 1 year
Here's my log format (simplified for demonstration):
```
2018-04-12 14:43:00.000 ERROR hello
2018-04-12 14:44:01.000 ERROR world
2018-04-12 14:44:03.000 INFO this is a multi-line log
NOTICE THIS LINE, this line is also part of the log
2018-04-12 14:46:00.000 INFO foo
```
So how do I filter the log for the range [2018-04-12 14:44:00.000, 2018-04-12 14:45:00.000) to produce the following output?
```
2018-04-12 14:44:01.000 ERROR world
2018-04-12 14:44:03.000 INFO this is a multi-line log
NOTICE THIS LINE, this line is also part of the log
```
- WashichawbachaW about 6 years
  
  So you're trying to get the log entries within one minute, between 14:44:00.000 and 14:45:00.000. I guess a large number of lines will be produced in that time, right?
- aLeX about 6 years
  
  @WashichawbachaW yes exactly
aLeX about 6 years

@WashichawbachaW thanks for your mention. I accepted this because of the usage of the comma(,) in sed and awk.