How to grep log files during a specific time period

Solution 1

I have found the answer in the form I was looking for:

grep '*someword*' /dir/dir/dir/2014-07-30.txt | cut -d',' -f1,4,3,7 | grep -E '[^ ]+ (2[0-2]):[0-9]'

The command above gets me everything I need: grep selects the lines containing someword, cut keeps only the fields I want, and the final regular-expression match restricts the output to the hours I am interested in.
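
Note that the pattern `2[0-2]` in the command above matches hours 20 through 22. As a minimal sketch of adapting it to the 18 to 20 range mentioned in the question, assuming each line begins with a `YYYY-MM-DD HH:MM:SS.mmm` timestamp as in the sample line, the filter could look like the following. The function name `hour_filter` and the sample lines are illustrative only.

```shell
# hour_filter: keep lines whose hour (the field after the date) is 18, 19 or 20.
# Hypothetical helper; assumes lines start with "YYYY-MM-DD HH:MM:SS...".
hour_filter() {
    grep -E '^[^ ]+ (1[89]|20):[0-9]{2}'
}

# Illustrative sample lines modelled on the log format in the question.
printf '%s\n' \
    '2014-07-30 14:33:19.259 ;; (p=0,siso=0)' \
    '2014-07-30 19:17:34.542 ;; (p=0,siso=0)' \
    '2014-07-30 20:59:59.999 ;; (p=0,siso=0)' \
    '2014-07-30 21:00:00.000 ;; (p=0,siso=0)' | hour_filter
```

Anchoring the pattern with `^` keeps it from matching a time-like string later in the line.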

Solution 2

Using awk:

logsearch() {
    grep "$3" "$4" | awk -v start="$1" -v end="$2" '{split($2, a, /:/)} (a[1] >= start) && (a[1] <= end)'
}

# logsearch <START> <END> <PATTERN> <FILE>
logsearch 18 20 '*someword*' /dir/dir/dir/2014-07-30.txt

Or with only awk (possibly different pattern quoting requirements):

logsearch2 ()
{
    awk -v start="$1" -v end="$2" -v pat="$3" '($0 ~ pat) {split($2, a, /:/)} ($0 ~ pat) && (a[1] >= start) && (a[1] <= end)' "$4"
}
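
Both functions compare only the hour. If finer boundaries are needed, one possible variation is to compare the whole time field as a string, since zero-padded times sort lexically in time order. This is only a sketch: it assumes the second whitespace-separated field is a zero-padded `HH:MM:SS` time as in the sample line, and `timesearch` is a hypothetical name, not part of the answer above.

```shell
# timesearch <START> <END> <PATTERN> <FILE>   (hypothetical helper)
# START and END are zero-padded times such as 18:00:00.
# Field 2 (e.g. 19:17:34.542) is compared with them as a string, so an END of
# 20:30:00 excludes 20:30:00.542; pass 20:30:00.999 to include that second.
timesearch() {
    awk -v start="$1" -v end="$2" -v pat="$3" \
        '$0 ~ pat && $2 >= start && $2 <= end' "$4"
}
```

It would be called in the same way as the functions above, for example `timesearch 18:00:00 20:30:00 someword /dir/dir/dir/2014-07-30.txt`.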

Solution 3

Not having seen the original input data I'm guessing from your cut what's going on.

Will this give you something similar to your desired outcome?

 awk -F, '/someword/ && $4 ~ /^(18|19|20)/{printf "%s %s %s %s\n", $1,$4,$3,$7}' /dir/dir/dir/2014-07-30.txt

That said: a bit of sample data typically goes a long way!

Edit1:

Given the input line you added to both your comment and the original post, the following awk statement does what you're asking:

awk '/something/ && $2 ~ /^(18|19|20)/{printf "%s %s %s %s\n", $1,$2,$3,$4}' /path/to/your/input_file

Solution 4

This is a very interesting question. The pure BASH solution offers quite a bit of flexibility in how you process the entries once you have identified those that fall within the date/time range of interest. The simplest way in BASH is to get your start time and stop time in seconds since the epoch, then test each log entry to determine whether it falls within that range and, if it does, do something with the log entry. The basic logic involved is relatively short.

The width of the date_time field within the log can be set by passing the width as argument 4. Set the default dwidth as needed (currently 15, to match the syslog and journalctl format). The only required argument is the logfile name. If no start/stop time is specified, it will find all entries:

## set filename, set start time and stop time (in seconds since epoch) 
#  and time_field width (number of chars that make up date in log entry)
lfname=${1}
test -n "$2" && starttm=`date --date "$2" +%s` || starttm=0
test -n "$3" && stoptm=`date --date "$3" +%s`  ||  stoptm=${3:-`date --date "Jan 01 2037 00:01:00" +%s`}
dwidth=${4:-15}

## read each line from the log file and act on only those with
#  date_time between starttm and stoptm (inclusive)
while IFS=$'\n' read -r line || test -n "$line"; do

    test "${line:0:1}" != - || continue           # exclude journalctl first line
    logtm=`date --date "${line:0:$dwidth}" +%s`   # get logtime from entry in seconds since epoch

    if test $logtm -ge $starttm && test $logtm -le $stoptm ; then
        echo "logtm: ${line:0:$dwidth} => $logtm"
    fi

done < "${lfname}"

working example:

#!/bin/bash

## log date format      len
#   journalctl          15
#   syslog              15
#   your log example    23

function usage {
    test -n "$1" && printf "\n Error: %s\n" "$1"
    printf "\n  usage  : %s logfile ['start datetime' 'stop datetime' tmfield_width]\n\n" "${0//*\//}"
    printf "  example: ./date-time-diff.sh syslog \"Jul 31 00:15:02\" \"Jul 31 00:18:30\"\n\n"
    exit 1
}

## test for required input & respond to help
test -n "$1" || usage "insufficient input."
test "$1" = "-h" || test "$1" = "--help" && usage

## set filename, set start time and stop time (in seconds since epoch) 
#  and time_field width (number of chars that make up date in log entry)
lfname=${1}
test -n "$2" && starttm=`date --date "$2" +%s` || starttm=0
test -n "$3" && stoptm=`date --date "$3" +%s`  ||  stoptm=${3:-`date --date "Jan 01 2037 00:01:00" +%s`}
dwidth=${4:-15}

## read each line from the log file and act on only those with
#  date_time between starttm and stoptm (inclusive)
while IFS=$'\n' read -r line || test -n "$line"; do

    test "${line:0:1}" != - || continue           # exclude journalctl first line
    logtm=`date --date "${line:0:$dwidth}" +%s`   # get logtime from entry in seconds since epoch

    if test $logtm -ge $starttm && test $logtm -le $stoptm ; then
        echo "logtm: ${line:0:$dwidth} => $logtm"
    fi

done < "${lfname}"

exit 0

usage:

$ ./date-time-diff.sh -h

  usage  : date-time-diff.sh logfile ['start datetime' 'stop datetime' tmfield_width]

  example: ./date-time-diff.sh syslog "Jul 31 00:15:02" "Jul 31 00:18:30"

Remember to quote your start and stop datetime strings. The test below used a logfile with 20 entries between Jul 31 00:12:58 and Jul 31 00:21:10.

test output:

$ ./date-time-diff.sh jc.log "Jul 31 00:15:02" "Jul 31 00:18:30"
logtm: Jul 31 00:15:02 => 1406783702
logtm: Jul 31 00:15:10 => 1406783710
logtm: Jul 31 00:15:11 => 1406783711
logtm: Jul 31 00:15:11 => 1406783711
logtm: Jul 31 00:15:11 => 1406783711
logtm: Jul 31 00:15:11 => 1406783711
logtm: Jul 31 00:18:30 => 1406783910

Depending on what you need, another one of the solutions may fit your needs, but if you need to be able to process or manipulate the matching log entries, it is hard to beat a BASH script.
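
The core of the script is the epoch comparison. As a minimal sketch of just that step, it can be pulled out into a reusable function. This assumes GNU date (the `-d`/`--date` option does not behave the same way on BSD date), and the names `to_epoch` and `in_range` are hypothetical, not part of the script above.

```shell
# to_epoch: convert a date string to seconds since the epoch (GNU date assumed).
to_epoch() {
    date -d "$1" +%s
}

# in_range <TIMESTAMP> <START> <STOP>   (hypothetical helper)
# Succeeds if START <= TIMESTAMP <= STOP (inclusive).
# Returns 2 if any of the three strings cannot be parsed by date.
in_range() {
    _t=$(to_epoch "$1") && _s=$(to_epoch "$2") && _e=$(to_epoch "$3") || return 2
    [ "$_t" -ge "$_s" ] && [ "$_t" -le "$_e" ]
}

if in_range "2014-07-31 00:15:10" "2014-07-31 00:15:02" "2014-07-31 00:18:30"; then
    echo "in range"
fi
```

Since date is started once per log line, this approach can be slow on large files; the awk solutions avoid that cost when only the hour matters.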

Author: ZeroLoop

Updated on July 07, 2022

Comments

  • ZeroLoop
    ZeroLoop almost 2 years

    Okay, so I have log files and I would like to search within specific time ranges. These ranges will be different throughout the day. Below is a piece of a log file, and this is the only piece I can show you; sorry, work stuff. I am using the cat command, if that matters.

    Working EXAMPLE : cat /dir/dir/dir/2014-07-30.txt | grep *someword* | cut -d',' -f1,4,3,7

    2014-07-30 19:17:34.542 ;; (p=0,siso=0)

    The above gets me the info I need along with the time stamp, but it shows all time ranges, and that is what I would like to correct. Let's say I only want hours 18 to 20 in the first column of the time.

    Actual --> 2014-07-30 19:17:34.542 ;; (p=0,siso=0)

    Only range I am looking for --> [18-20]:00:00.000 ;; (p=0,siso=0)

    I am not worried about the 00s as they can be any digit.

    Thanks for looking. I have not used much in the way of scripting as you can tell from my example, but any help is greatly appreciated.

    I have included a log file, the colons and commas are where they should be.

    2014-07-30 14:33:19.259 ;; (p=0,ser=0,siso=0) IN ### Word:Numbers=00000,word=None something goes here and here (something here andhere:here also here:2222),codeword=8,codeword=0,Noideanumbers=00000000,something=something, ;;
    
    • Etan Reisner
      Etan Reisner over 9 years
      That's a useless use of cat for the record. grep '*someword*' /dir/dir/dir/2014-07-30.txt does the same thing without the extra process and pipe.
    • ZeroLoop
      ZeroLoop over 9 years
      It sure does, but I use the pipes and the extra process because I need certain pieces of info from the log file. I realize and know I can put grep in front of it. Thanks for your input.
    • Etan Reisner
      Etan Reisner over 9 years
      I don't follow. The cat in that pipeline doesn't do anything at all for you. It can't (except stop grep from knowing that you are reading from a file and what the filename is).
    • ZeroLoop
      ZeroLoop over 9 years
      Well, if I use your command with grep in the front with my pipes and delimiters, I get the same info but with the directory info at the front, whereas with mine I get only the info I need without the extra directory jargon. We search through log files in hundreds of directories at a time and only need key info.
    • Etan Reisner
      Etan Reisner over 9 years
      Are you talking about the filename prefix (/path/to/file:) that grep puts on output lines when fed more than one file? Because -h turns that off.
    • ZeroLoop
      ZeroLoop over 9 years
      I'm really new to Linux and did not know that, but it does the same thing, so I will try them both and let you know.
  • ZeroLoop
    ZeroLoop over 9 years
    No such file or directory error is returned.
  • ZeroLoop
    ZeroLoop over 9 years
    I can't do one grep at a time as the log file contains info that needs to be together on the same line. Thanks.
  • ZeroLoop
    ZeroLoop over 9 years
    I will see if I can create something that will help a little better
  • ZeroLoop
    ZeroLoop over 9 years
    This is a sample and the colons and commas are where they should be. 2014-07-30 14:33:19.259 ;; (p=0,ser=0,siso=0) IN ### Word:Numbers=000000000000,word=None something goes here and here (something here andhere:here also here:2222),codeword=8,codeword=0,Noideanumbers=00000000,something=something, ;;
  • tink
    tink over 9 years
    Hmmm ... with that input your cut leaves the line intact. I still don't know what you're doing. Unless your commas are something other than what you pasted.
  • Etan Reisner
    Etan Reisner over 9 years
    The cut in the OP modifies that example line. It doesn't drop much from the line but it does drop a little bit.
  • tripleee
    tripleee over 9 years
    That's a wacky thing to say. The only file or directory is exactly as in your question.