Sort logfile by timestamp on Linux command line


Solution 1

Use sort's -k flag:

sort -k1 -r freeswitch.log

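That sorts the file in reverse order using a key that starts at the first field. Because -k1 gives no end field, the key runs to the end of the line, so it begins with the filename and timestamp (e.g. freeswitch.log:2011-09-08 12:21:07.282236). If the filename is always the same (freeswitch.log), the lines end up ordered by date and time.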
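To see what this does on data shaped like the question's, here is a small sketch. The helper names sample_lines and sort_k1_reverse are made up for illustration, the sample messages are shortened, and LC_ALL=C is an added assumption so that the comparison is bytewise regardless of locale.

```shell
#!/usr/bin/env bash

# Hypothetical sample data, adapted from the log lines in the question.
sample_lines() {
    printf '%s\n' \
        'freeswitch.log:2011-09-08 12:21:07.282236 [ERR] first' \
        'freeswitch.log:2011-08-08 13:21:07.514261 [ERR] second' \
        'freeswitch.log:2011-06-04 16:21:08.998227 [ERR] third' \
        'freeswitch.log:2011-09-08 12:21:10.374238 [ERR] fourth'
}

# The command from this answer, reading stdin instead of a file.
# LC_ALL=C forces bytewise comparison (an assumption, not in the original).
sort_k1_reverse() {
    LC_ALL=C sort -k1 -r
}

# Newest line first, oldest line last.
sample_lines | sort_k1_reverse

# With no end field, -k1 spans the whole line, so a plain "sort -r"
# should produce the same output on this data.
if [ "$(sample_lines | sort_k1_reverse)" = "$(sample_lines | LC_ALL=C sort -r)" ]; then
    echo "same as plain sort -r"
fi
```

This only holds up while every line carries the same filename prefix, since the prefix is part of the key.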

Solution 2

Use sort's --stable, --reverse, and --key options:

sort --stable --reverse --key=1,2 freeswitch.log

(For non-didactic purposes, this can be shortened to -srk1,2.)

The sort command (as you might expect) outputs each line of the named files (or STDIN) in sorted order. What each of these options does:

  • The --reverse option tells sort to sort lines with greater values (later dates) higher, rather than lower. It's assumed, based on other answers, that this is what you mean by "descending" (even though this kind of sorting would normally be considered "ascending"). If you want to sort the lines in chronological order, you would omit this option.
  • The --key=1,2 option tells sort to only use the first two whitespace-separated "fields" (the "freeswitch.log:"-prefixed date, and the time) as the key for sorting. It is important that you specify the last field to use, even if you are only sorting by one field (for instance, if each line kept time and date together in an ISO-8601 standard field like freeswitch.log 2011-09-08T12:21:07.282236, you would use -k 2,2), as, by default, the fields used by a key extend to the end of the line.
  • The --stable option tells sort to not perform "last-resort ordering". Without this option, two lines with equal keys (as specified with the --key option) will then be sorted according to the entire line, meaning that the filename and/or content will change the sort order of the lines.
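As a sketch of the single-field case mentioned in the list above, the following assumes the hypothetical layout with the filename as its own field and an ISO-8601 timestamp as the second field. The helper name sort_by_iso_field and the sample lines are invented for illustration.

```shell
#!/usr/bin/env bash

# Sort chronologically on the second field only (bounded key: 2,2),
# keeping lines with identical timestamps in their input order.
# LC_ALL=C is an added assumption to make the comparison bytewise.
sort_by_iso_field() {
    LC_ALL=C sort --stable --key=2,2
}

# The first and third lines share a timestamp; "zebra" comes first in the input.
printf '%s\n' \
    'freeswitch.log 2011-09-08T12:21:07.282236 zebra' \
    'freeswitch.log 2011-06-04T16:21:08.998227 yak' \
    'freeswitch.log 2011-09-08T12:21:07.282236 apple' |
    sort_by_iso_field
```

The "yak" line moves to the top because it is the oldest, while "zebra" stays ahead of "apple" because the key stops at the timestamp and --stable suppresses the whole-line tie-break.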

It is important to specify both extents of the --key, as well as the --stable option. Without them, multiple lines of output that occurred at the same time (in other words, a multi-line message) would be sorted according to the content of the message (without the second field in --key) and/or the filename (without --stable, if the filename is a separate field, as described below).

In other words, a log message like this:

freeswitch.log:2011-09-08 12:21:10.374238 Warning: Syntax error on line 20:
freeswitch.log:2011-09-08 12:21:10.374238
freeswitch.log:2011-09-08 12:21:10.374238    My[brackets(call)
freeswitch.log:2011-09-08 12:21:10.374238               ^
freeswitch.log:2011-09-08 12:21:10.374238 Suggestion:
freeswitch.log:2011-09-08 12:21:10.374238   did you forget to
freeswitch.log:2011-09-08 12:21:10.374238   close your brackets?

would get "sorted" into:

freeswitch.log:2011-09-08 12:21:10.374238
freeswitch.log:2011-09-08 12:21:10.374238               ^
freeswitch.log:2011-09-08 12:21:10.374238   close your brackets?
freeswitch.log:2011-09-08 12:21:10.374238   did you forget to
freeswitch.log:2011-09-08 12:21:10.374238    My[brackets(call)
freeswitch.log:2011-09-08 12:21:10.374238 Suggestion:
freeswitch.log:2011-09-08 12:21:10.374238 Warning: Syntax error on line 20:

This is "sorted" (because "c" comes before "d", and "S" comes before "W"), but it's not in order. Specifying --stable (and keeping your --key bounded) will skip the extra sorting and preserve the order, which is what you want.
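A reduced version of this effect can be reproduced with the sketch below. The function names and the three sample lines are made up for illustration, and LC_ALL=C is an added assumption so the tie-break is bytewise; in other locales the unstable order may differ.

```shell
#!/usr/bin/env bash

# Hypothetical input: one earlier line, then a two-line message whose
# lines share a timestamp.
sample_message() {
    printf '%s\n' \
        'freeswitch.log:2011-09-08 12:21:07.282236 Earlier message' \
        'freeswitch.log:2011-09-08 12:21:10.374238 Error on line 20:' \
        'freeswitch.log:2011-09-08 12:21:10.374238 Suggestion: close your brackets'
}

# Bounded key plus --stable: ties keep their input order.
sort_stable() {
    LC_ALL=C sort --stable --reverse --key=1,2
}

# Same key without --stable: ties fall back to a (reversed) whole-line comparison.
sort_unstable() {
    LC_ALL=C sort --reverse --key=1,2
}

echo '--- with --stable ---'
sample_message | sort_stable
echo '--- without --stable ---'
sample_message | sort_unstable
```

With --stable, "Error on line 20:" stays above "Suggestion:", as in the input. Without it, the two lines swap, because the last-resort comparison is reversed as well and "S" sorts after "E".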


Also, sorting by this combined filename-and-date field will only work if every line in your output starts with the same filename. Given the syntax you posted, if your input has multiple, different filenames that you want to ignore in sorting, you need to use a program like sed to convert the filename to its own space-separated field, then pipe the converted lines to sort (after which you may then convert the field separators back):

sed 's/:/ /' freeswitch.log | sort -srk2,3 | sed 's/ /:/'

Note that the fields used by the key are changed to 2,3, skipping the first (filename) field.
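Here is a sketch of that pipeline on input with several filenames. The filenames a.log, b.log and c.log, the helper name sort_ignoring_filename, and the LC_ALL=C setting are assumptions added for illustration; it also assumes the filenames contain no spaces or colons, since each sed only touches the first occurrence on a line.

```shell
#!/usr/bin/env bash

# Split "filename:date" into two fields, sort newest-first on date and
# time (fields 2 and 3), then rejoin the filename and date with a colon.
sort_ignoring_filename() {
    sed 's/:/ /' | LC_ALL=C sort -srk2,3 | sed 's/ /:/'
}

printf '%s\n' \
    'b.log:2011-09-08 12:21:07.282236 one' \
    'a.log:2011-09-08 12:21:10.374238 two' \
    'c.log:2011-08-08 13:21:07.514261 three' |
    sort_ignoring_filename
```

The a.log line comes out first because it has the latest timestamp, even though b.log and c.log would sort after it by name.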

Solution 3

Crude but effective technique: Prefix each line with a numeric representation of the date, sort numerically, then remove the extra info.

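One-liner:

while IFS=' ' read -r name_date trailing ; do date=$(cut -d: -f2 <<<"$name_date") ; time=${trailing%% *} ; printf '%s:%s\n' "$(date -d "$date $time" +%s.%N)" "$name_date $trailing" ; done < freeswitch.log | sort -s -t: -k1,1n | cut -d: -f2-

Shell script:

#!/usr/bin/env bash

logfile="$1"

if [ -f "$logfile" ] ; then
    while IFS=' ' read -r name_date trailing ; do
        # Date is the part after "filename:"; time is the first word of the rest
        date=$(cut -d: -f2 <<<"$name_date")
        time=${trailing%% *}
        # Prefix the line with seconds.nanoseconds since the epoch
        printf '%s:%s\n' "$(date -d "$date $time" +%s.%N)" "$name_date $trailing"
    # Sort numerically on the prefix only, keep ties in input order, then drop the prefix
    done < "$logfile" | sort -s -t: -k1,1n | cut -d: -f2-
fi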

Note: Requires GNU date.

If the output at this point is the reverse of what you want, it is simple to pipe it through tac or to modify the script to also pass -r to sort.

EDIT: I missed the part where the filename was literally on each line. Updated version will now actually work.

Solution 4

You can try using sort:

sort -k1,2 file
Author by

markus

Updated on July 10, 2022

Comments

  • markus
    markus almost 2 years

    I have a logfile with entries like:

    ...    
    freeswitch.log:2011-09-08 12:21:07.282236 [ERR] ftdm_queue.c:136 Failed to enqueue obj 0x7f2cda3525c0 in queue 0x7f2ce8005990, no more room! windex == rindex == 58!
    freeswitch.log:2011-08-08 13:21:07.514261 [ERR] ftdm_queue.c:136 Failed to enqueue obj 0x7f2cda354460 in queue 0x7f2ce8005990, no more room! windex == rindex == 58!
    freeswitch.log:2011-06-04 16:21:08.998227 [ERR] ftdm_queue.c:136 Failed to enqueue obj 0x7f2cda356300 in queue 0x7f2ce8005990, no more room! windex == rindex == 58! 
    freeswitch.log:2011-09-08 12:21:10.374238 [ERR] ftdm_queue.c:136 Failed to enqueue obj 0x7f2cda3581a0 in queue 0x7f2ce8005990, no more room! windex == rindex == 58!
    ...
    

    How can I sort the file with Linux command-line tools by the timestamp in each row, descending?

  • markus
    markus over 12 years
    The entries are not completely sorted by the timestamp yet
  • markus
    markus over 12 years
    no it's not appending data, I already grep the lines I want to analyze from the original logfile into a new file.
  • markus
    markus over 12 years
    But how do I tell it to sort by the timestamp in the row? Note they are not ordered yet (I changed it in my question)
  • Patrick B.
    Patrick B. over 12 years
    If I understand your comment correctly, the answer is: grep ftdm_queue.c freeswitch.log | sort -r # this is a guess, as you haven't provided your full command line.
  • user3479901
    user3479901 about 9 years
    Without specifying an end field or a -s flag, that -k1 is effectively meaningless. See the explanation in my answer.
  • user3479901
    user3479901 about 9 years
    Even with an end field specified, last-resort comparison makes it so -k1,2 ultimately doesn't mean much of anything. See the explanation of --stable in my answer.