Parsing a log file in Python to extract events in real time

26,682

Solution 1

C programs usually seek to the current position to clear any end-of-file flag. But as @9000 correctly pointed out, Python apparently takes care of this, so you can keep reading from the same file even after it has reached end of file.

You might have to take care of incomplete lines, though. If your application writes its log in pieces, make sure you process whole lines rather than those pieces. The following code will accomplish that:

import time

f = open('some.log', 'r')
while True:
    line = ''
    # keep reading until the buffered text ends with a newline
    while len(line) == 0 or line[-1] != '\n':
        tail = f.readline()
        if tail == '':
            time.sleep(0.1)          # avoid busy waiting
            # f.seek(0, io.SEEK_CUR) # appears to be unnecessary
            continue
        line += tail
    process(line)                    # process() is your own handler
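
To try the partial-line handling without a second process, here is a minimal sketch that factors the same idea into a helper and feeds it from a temporary file. The names `read_available_lines` and `demo` are illustrative assumptions, not part of the answer above.

```python
import os
import tempfile

def read_available_lines(f, pending=''):
    """Return (complete_lines, pending) from whatever is readable right now.

    `pending` carries an unterminated fragment over to the next call.
    """
    complete = []
    while True:
        chunk = f.readline()
        if chunk == '':                 # nothing more to read at the moment
            return complete, pending
        pending += chunk
        if pending.endswith('\n'):      # only hand out whole lines
            complete.append(pending)
            pending = ''

def demo():
    """Simulate a writer that emits a line in two pieces."""
    fd, path = tempfile.mkstemp(suffix='.log')
    os.close(fd)
    try:
        with open(path, 'a') as writer, open(path, 'r') as reader:
            writer.write('first line\npart')    # second line is incomplete
            writer.flush()
            lines1, pending = read_available_lines(reader)
            writer.write('ial line\n')          # the writer finishes the line
            writer.flush()
            lines2, pending = read_available_lines(reader, pending)
        return lines1, lines2, pending
    finally:
        os.remove(path)
```

Calling `demo()` shows that the fragment `part` is held back on the first pass and only delivered once its newline arrives.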

Solution 2

No need to run tail -f. Plain Python files should work:

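import time

with open('/tmp/track-this') as f:
  while True:
    line = f.readline()
    if line:
      print(line, end='')  # line already ends with a newline
    else:
      time.sleep(0.1)      # no new data yet; avoid busy waiting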

This works almost exactly like tail -f, except that it starts from the beginning of the file. Check it by running in another terminal:

echo "more" >> /tmp/track-this
# alt-tab here to the terminal with Python and see 'more' printed
echo "even more" >> /tmp/track-this

Don't forget to create /tmp/track-this before you run the Python snippet.

Parsing and taking appropriate actions are up to you. Long-running actions should probably be handed off to separate threads or processes so that the reader does not fall behind the log.
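
One possible way to keep slow actions out of the reading loop is a queue feeding a worker thread. This is only a sketch: `handle_event` and `start_worker` are hypothetical names, and the hard-coded list stands in for lines read from the log.

```python
import queue
import threading

def handle_event(line):
    """Hypothetical slow action; replace with the real work."""
    return line.strip().upper()

def start_worker(events, results):
    """Start a thread that handles queued lines until it sees None."""
    def run():
        while True:
            line = events.get()
            if line is None:        # sentinel: stop the worker
                break
            results.append(handle_event(line))
    t = threading.Thread(target=run, daemon=True)
    t.start()
    return t

events = queue.Queue()
results = []
worker = start_worker(events, results)

# Stand-in for lines coming out of the readline() loop.
for line in ['ERROR disk full\n', 'ERROR timeout\n']:
    events.put(line)    # returns immediately, so reading is never blocked

events.put(None)        # ask the worker to finish
worker.join()
```

With a single worker the events are handled in the order they were read. For CPU-heavy actions a separate process may be a better fit than a thread.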

The stop condition is also up to you, but a plain ^C works.

Solution 3

Thanks, everyone, for the answers. I also found this: http://www.dabeaz.com/generators/follow.py
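
For readers who cannot open the link, a generator-based follower might look roughly like the sketch below. It is written from the general pattern rather than copied from that file; the `from_end` and `poll_interval` parameters and the `errors_only` filter are assumptions added for illustration.

```python
import time

def follow(f, poll_interval=0.1, from_end=True):
    """Yield lines from an open file as they are appended to it."""
    if from_end:
        f.seek(0, 2)                # skip existing content, like tail -f
    while True:
        line = f.readline()
        if not line:
            time.sleep(poll_interval)   # nothing new yet; wait and retry
            continue
        yield line

def errors_only(lines):
    """Keep only the lines of interest (here: those containing 'ERROR')."""
    return (line for line in lines if 'ERROR' in line)
```

A consumer can then write `for line in errors_only(follow(open('some.log'))): ...` and treat the growing log as an ordinary iterable. The generator never ends on its own, so the loop needs its own stop condition.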

Author: Soumya Simanta

Updated on August 12, 2020

Comments

  • Soumya Simanta almost 4 years

    I have a process that is logging messages to a file.

    I want to implement another process (in Python) that parses these logs (as they are written to the file), filters the lines that I'm interested in and then performs certain actions based on the state of the first process.

    Before I go ahead and write something on my own, I was wondering if there is a library in Python that does something like this.

    Also, ideas on how to implement something like this in Python would be appreciated.

    Thanks.