Python - How can I open a file and specify the offset in bytes?
Solution 1
You can manage the position in the file using the seek and tell methods of the file class; see https://docs.python.org/2/tutorial/inputoutput.html
The tell method returns the current position in the file, which tells you where to seek the next time you open it.
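As a rough illustration of how seek and tell fit together, here is a minimal sketch. The temporary file and the variable names (first, offset, rest) are made up for the example and are not part of the original answer. It opens the file in binary mode so that the value returned by tell is a plain byte offset.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"line1\nline2\nline3\n")

# Binary mode keeps tell()/seek() values as plain byte offsets.
with open(path, "rb") as f:
    first = f.readline()
    offset = f.tell()   # byte position just after the first line

# Reopen later and continue from the remembered position.
with open(path, "rb") as f:
    f.seek(offset)
    rest = f.read()

os.remove(path)
```

After this runs, first holds the first line and rest holds everything that followed it.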
Solution 2
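# First run: read a line, then remember where we stopped.
log = open('myfile.log')
pos = open('pos.dat', 'w')
print(log.readline())
pos.write(str(log.tell()))
log.close()
pos.close()

# Later run: jump back to the saved offset and carry on reading.
log = open('myfile.log')
pos = open('pos.dat')
log.seek(int(pos.readline()))
print(log.readline())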
Of course you shouldn't use it like that - you should wrap the operations up in functions like save_position(myfile) and load_position(myfile), but the functionality is all there.
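One possible shape for such helpers is sketched below. The signatures are an assumption: they take the path of the position file rather than a file object, and read_new_lines is a hypothetical wrapper added only to show the two helpers working together. Treat it as a starting point, not a finished implementation.

```python
def load_position(pos_path):
    """Return the saved byte offset, or 0 if nothing usable is stored."""
    try:
        with open(pos_path) as f:
            return int(f.read().strip() or 0)
    except (IOError, OSError, ValueError):
        # Missing or corrupt position file: start from the beginning.
        return 0


def save_position(pos_path, offset):
    """Store the byte offset as text so the next run can find it."""
    with open(pos_path, "w") as f:
        f.write(str(offset))


def read_new_lines(log_path, pos_path):
    """Return the lines added since the last call and record the new offset."""
    with open(log_path, "rb") as log:   # binary mode: offsets are bytes
        log.seek(load_position(pos_path))
        lines = log.readlines()
        save_position(pos_path, log.tell())
    return lines
```

Calling read_new_lines('myfile.log', 'pos.dat') on each run would then return only the lines appended since the previous run. Note that this sketch does not handle a log that has been truncated or rotated, or a final line that is still being written.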
Solution 3
If your logfiles fit easily in memory (that is, you have a reasonable rotation policy), you can easily do something like:
log_lines = open('logfile','r').readlines()
last_line = get_last_lineprocessed() #From some persistent storage
last_line = parse_log(log_lines[last_line:])
store_last_lineprocessed(last_line)
If you cannot do this, you can use something like the approach in "Get last n lines of a file with Python, similar to tail" (see the accepted answer's use of seek and tell, in case you need to do it with them).
dave
Updated on March 01, 2020
Comments
-
dave over 4 years
I'm writing a program that will parse an Apache log file periodically to log its visitors, bandwidth usage, etc.
The problem is, I don't want to open the log and parse data I've already parsed. For example:
line1
line2
line3
If I parse that file, I'll save all the lines then save that offset. That way, when I parse it again, I get:
line1
line2
line3 - The log will open from this point
line4
line5
Second time round, I'll get line4 and line5. Hopefully this makes sense...
What I need to know is, how do I accomplish this? Python has the seek() function to specify the offset... So do I just get the filesize of the log (in bytes) after parsing it then use that as the offset (in seek()) the second time I log it?
I can't seem to think of a way to code this >.<
-
Duncan almost 14 years
That would actually put the read position 3 characters from the EOF, not 3 lines.
-
dave almost 14 years
This seems like it'll do exactly what I want. Cheers.
-
dave almost 14 years
The logs are for virtual hosts so, currently, no log rotation. I suppose I should look into setting that up... Which would make your solution rather useful. Cheers.
-
cevaris about 8 years
Hmm, seems that link needs to be updated. It has no reference to file objects; perhaps: docs.python.org/2/tutorial/inputoutput.html