Watching something be written to a file live with tail

Solution 1

You may also need to explicitly flush the buffer for the output to reach the file as it is generated. This is because output is typically only written out when the output buffer fills up (which is a few kilobytes, I believe) or when the program ends. This is probably to save on reads/writes. You could do this after every print, or, if you are looping, after the last print within the loop.

import sys
...
print('Some message')
sys.stdout.flush()
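
On Python 3.3 or newer, print itself takes a flush keyword argument, so the same effect can be had without a separate sys.stdout.flush() call; a minimal equivalent:

# flush=True pushes the buffered output out as part of the print call itself
print('Some message', flush=True)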

Solution 2

Run python with the unbuffered flag:

python -u myprog.py > output.txt

Output will then print in real time.
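
If you can't change how the program is invoked, a related option (assuming Python 3.7 or newer, where sys.stdout.reconfigure() exists) is to switch stdout to line buffering from inside the script itself; a minimal sketch:

import sys

# Flush after every newline even when stdout is redirected to a file
# (by default Python only line-buffers when stdout is a TTY).
sys.stdout.reconfigure(line_buffering=True)

print('Some message')  # appears in output.txt as soon as it is printed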

Solution 3

Instead of trying to tail a live file, use tee. It was made to do exactly what you're trying to do.

From man tee:

tee(1) - Linux man page

Name

tee - read from standard input and write to standard output and files

Synopsis

tee [OPTION]... [FILE]...

Description

Copy standard input to each FILE, and also to standard output.

-a, --append  
   append to the given FILEs, do not overwrite  
-i, --ignore-interrupts  
   ignore interrupt signals   
--help  
   display this help and exit  
--version
   output version information and exit

If a FILE is -, copy again to standard output.

So in your case you'd run:

python myprog.py | tee output.txt

EDIT: As others have pointed out, this answer will run into the same issue OP was originally having unless sys.stdout.flush() is used in the python program as described in Davey's accepted answer. The testing I did before posting this answer did not accurately reflect OP's use-case.

tee can still be used as an alternative--albeit less than optimal--method of displaying the output while also writing to the file, but Davey's answer is clearly the correct and best answer.
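
If you do want to see the output in the terminal while also writing the file, combining tee with the unbuffered flag from the other answers should avoid the buffering issue, e.g.:

python -u myprog.py | tee output.txt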

Solution 4

Terminology: There is no pipe anywhere in this scenario. (I edited the question to fix that). Pipes are a different type of file (a buffer inside the kernel).

This is a redirect to a regular file.

C stdio and Python both default to making stdout line-buffered when it's connected to a TTY, and full-buffered otherwise. Line-buffered means the buffer is flushed after a newline. Full-buffered means it's only flushed to become visible to the OS (i.e. with a write() system call) when it's full.

You will see output eventually, in chunks of maybe 4kiB at a time. (I don't know the default buffer size.) This is generally more efficient, and means fewer writes to your actual disk. But not great for interactive monitoring, because output is hidden inside the memory of the writing process until it's flushed.

On Stack Overflow, there's a Disable output buffering Python Q&A which lists many ways to get unbuffered (or line-buffered?) output to stdout in Python. The question itself summarizes the answers.

Options include running python -u (or, I guess, putting #!/usr/bin/python -u at the top of your script), setting the PYTHONUNBUFFERED environment variable for that program, or explicitly flushing after some/all print calls, as @Davey's answer suggests.
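
For example, setting the environment variable for a single run (any non-empty value enables it) should behave like python -u:

PYTHONUNBUFFERED=1 python myprog.py > output.txt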


Some other programs have similar options for use-cases like this, or for piping the output of your Python program: GNU grep has --line-buffered and GNU sed has -u / --unbuffered. For example: ./slowly-output-stuff | grep --line-buffered 'foo.*bar'.


Comments

  • interstar
    interstar over 1 year

    I have a python program which is, slowly, generating some output.

    I want to capture that in a file, but I also thought I could watch it live with tail.

    So in one terminal I'm doing:

    python myprog.py > output.txt
    

    and in another terminal :

    tail -f output.txt
    

    But it seems like the tail isn't showing me anything while the python program is running.

    If I hit ctrl-c to kill the python script, suddenly the tail of output.txt starts filling up. But not while python is running.

    What am I doing wrong?

    • n8te
      n8te about 5 years
      How about python myprog.py | tee output.txt instead?
    • JPhi1618
      JPhi1618 about 5 years
      @n8te tee might show the same problem if the program isn't flushing the output buffer regularly. This needs flush() and tee.
    • studog
      studog about 5 years
      stdbuf can be used to alter the buffering status of file descriptors.
    • Peter Cordes
      Peter Cordes about 5 years
      Terminology: There is no pipe anywhere in this scenario. There's a redirect to a regular file. (Which causes C stdio and Python to decide to make stdout full-buffered instead of line-buffered because it's not a TTY). Pipes are a different type of file (a buffer inside the kernel). I edited your question to correct that.
    • Mark Wagner
      Mark Wagner about 5 years
      Probably not needed in your situation but if you don't want to terminate the program you can use gdb and call fflush: see stackoverflow.com/questions/8251269/…
  • mckenzm
    mckenzm about 5 years
    If you have read this far, please don't be thinking of closing and re-opening the file to do this; the seeks will be a problem, especially for very large files. (I've seen this done!)
  • Baldrickk
    Baldrickk about 5 years
    tail in another thread is a good solution for when you've started the application before you decide you want to see the output though.
  • Mathew Lionnet
    Mathew Lionnet about 5 years
    That requires a permanent console session, which is why it's often much easier to use tail -F, or even better the follow function of less. But in all cases the flush should be used.
  • Dan
    Dan about 5 years
    You can also use print's flush parameter to do just as well. For example, print('some message', flush=True).
  • glglgl
    glglgl about 5 years
    It has nothing to do with the pipe's buffer, but with the stdout mechanism which doesn't flush after newline if it doesn't write to a tty.
  • Barmar
    Barmar about 5 years
    This won't solve the problem that the OP is having. Python's output to the pipe will be buffered just like output to the file.
  • Roel Schroeven
    Roel Schroeven about 5 years
    This is the correct answer. Python by default writes unbuffered (or actually line-buffered for text I/O) when writing to the console, but buffered when stdout is redirected to a file. -u forces Python to be unbuffered (or line-buffered for text) for writes.
  • Roel Schroeven
    Roel Schroeven about 5 years
    Instead of adding flush() calls to the program, you can also use Python's -u command line switch; see BHC's answer.
  • wizzwizz4
    wizzwizz4 about 5 years
    @glglgl Please explain. I don't actually know what you mean by that; how is "the stdout mechanism" different to that of any other file?
  • ivanivan
    ivanivan about 5 years
    Know nothing of python but when I read what was happening I thought "gee, sounds like it is buffering output...."
  • Charles Duffy
    Charles Duffy about 5 years
    @wizzwizz4, it's only different in that Python follows the standard C library's convention of, by default, configuring stderr to be unbuffered and stdout to be line-buffered when they point to a TTY, but stdout to be fully buffered if it's opened to a non-TTY device.
  • glglgl
    glglgl about 5 years
    @wizzwizz4 Essentially, you have to differentiate between what the program itself does (by means of the libc and other layers of accessing file descriptors) and what the OS does. The libc flushes to a terminal after each newline; otherwise it flushes/writes only if its buffer is full. In a pipe, these data are then written to the pipe buffer, whose contents are ready for reading by the other side(s) of the pipe immediately. If the pipe buffer is full, writing blocks until enough has been read to make room. In short: different levels of what happens and different mechanisms.