Is closing a file after having opened it with `open()` required in Python?


When do files get closed?

As we can learn from Is explicitly closing files important? (StackOverflow), the Python interpreter closes the file in the following cases:

  • you invoke the close() method of the file object, either explicitly or implicitly by leaving a with open(...): block. This always works, on any Python implementation.
  • the last reference to the file object gets removed and the object is therefore processed by the garbage collector. This is not a language guarantee but an implementation detail of CPython, so for portability don't rely on it!
  • the Python interpreter terminates. In that case it should close all file handles that were opened. Some older Python 3 versions also printed a warning that you should have closed them manually yourself. However, if the interpreter crashes or you force-kill it, this cleanup cannot happen, so it is not reliable either.

So only the first (explicit) method is reliable!
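To illustrate the explicit methods, here is a minimal sketch (the file name `example.txt` is just a placeholder for this demo) showing both the with-block and the manual try/finally pattern:

```python
import os

path = "example.txt"  # hypothetical demo file

# Recommended: the with-block closes the file even if an exception occurs.
with open(path, "w") as f:
    f.write("hello\n")
print(f.closed)  # prints True - the handle was closed on leaving the block

# Equivalent manual pattern without a context manager:
f = open(path)
try:
    data = f.read()
finally:
    f.close()  # reliable on every Python implementation

os.remove(path)  # clean up the demo file
```

The with-statement is usually preferable because the close() call cannot be skipped by an early return or an exception.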

What would happen if a file stays open?

First, depending on the implementation of your Python interpreter, if you opened a file with write access, you cannot be sure that your modifications have been flushed to disk until you either flush them manually or the file handle gets closed.
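A minimal sketch of manual flushing (the file name is a placeholder): flush() empties Python's internal buffer into the operating system, and os.fsync() additionally asks the OS to commit its own cache to disk.

```python
import os

path = "buffered.txt"  # hypothetical demo file

f = open(path, "w")
f.write("important data\n")
# At this point the data may still sit in Python's (or the OS's) buffer.
f.flush()             # push Python's buffer to the OS
os.fsync(f.fileno())  # ask the OS to write its cache to disk
f.close()             # closing flushes too, but fsync is the strongest guarantee

os.remove(path)       # clean up the demo file
```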

Second, you may only open a limited number of files per user on your system. If you exceed this limit, e.g. by opening many files in a loop in your Python program without closing them as soon as possible, the system may refuse to open further file handles for you and you'll receive an exception. It may also happen that your program takes the last allowed open file, and another program will then fail because its request gets declined.
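The leak can be sketched like this (paths are placeholders; with a low enough per-process limit, such as one set via `ulimit -n` on Linux, the loop would eventually raise `OSError: [Errno 24] Too many open files`):

```python
import os
import tempfile

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "data.txt")
with open(path, "w") as f:
    f.write("x")

# Each iteration leaks one file descriptor until the handles are closed.
leaked = [open(path) for _ in range(100)]

for f in leaked:
    f.close()  # releasing handles promptly keeps you under the limit

os.remove(path)
os.rmdir(tmpdir)
```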

Third, open files on a removable device prevent it from being unmounted or ejected. On some file systems such as ext4 you can still delete an open file: only the directory entry (hard link) pointing to the file's inode gets removed/unlinked, while the program that opened the file can still access the inode through its own file handle. This is, for example, also the mechanism that allows you to update packages while the respective software is running. NTFS, however, has no such feature. In any case, the file can never be modified by two concurrent processes, so it remains somewhat blocked for others.
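On Unix-like systems with such a file system, this behavior can be demonstrated with a few lines (a sketch, assuming a Linux/ext4-style environment; it will not work the same way on Windows/NTFS):

```python
import os
import tempfile

# Create a temporary file and write some data into it.
fd, path = tempfile.mkstemp()
os.write(fd, b"still here")
os.close(fd)

f = open(path, "rb")
os.remove(path)      # unlink the name; the inode lives on while f is open
print(f.read())      # prints b'still here' - the open handle still works
f.close()            # only now can the inode's space be reclaimed
```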



Author: TellMeWhy

Updated on September 18, 2022

Comments

  • TellMeWhy
    TellMeWhy over 1 year

    Regarding my previous question, I noticed that in both answers that used the open() function, there was no mention of closing the file.

    I've read that it's good practice to do so, but is there actually any need for it? Is it just unnecessary code?

    Does the file get closed automatically?

    • Admin
      Admin over 8 years
      with open(file_name, ...) as variable: automatically invokes the close() method as soon as you leave this code block.
    • Admin
      Admin over 8 years
      Cross-site duplicate of stackoverflow.com/q/7395542/4464570
    • Admin
      Admin over 8 years
      In python3, the file is garbage-collected automatically once it has no more references.
    • Admin
      Admin over 8 years
      I'm voting to close this question as off-topic because it is a general question about Python with no connection to scripting on Ubuntu. it is asking about best practices in a programming language.
  • Jacob Vlijm
    Jacob Vlijm over 8 years
    Not entirely true for python3 !!
  • Byte Commander
    Byte Commander over 8 years
    @JacobVlijm Would you mind explaining it?
  • Jacob Vlijm
    Jacob Vlijm over 8 years
    See the comment below the question. The answer about "not good practice" in the link of your answer is from 2011. If it ever existed, the warning in python3 does not exist any more and the answer is outdated. Automatic garbage collection exists for a reason and works perfectly. It has been years since I used close() specifically. Never ever ran into a single error caused by not using it.
  • Byte Commander
    Byte Commander over 8 years
    Soo... You know whether the other implementations like PyPy, Jython or IronPython close files in the same way CPython does? And if I killed the interpreter, it would not have a chance to close the file, right? And I also think that the open file limit still applies. What about flushing written data? You know anything more up to date there?
  • Byte Commander
    Byte Commander over 8 years
    @JacobVlijm Thanks for the tip. :P My machine already crashed while creating that file... I did not even need Python for that.
  • Jacob Vlijm
    Jacob Vlijm over 8 years
    In gedit, it will have a hard time, but with two files, it works if you start with a few lines, then toggle cat file1 >> file2 and vice versa a number of times :)
  • Byte Commander
    Byte Commander over 8 years
    @JacobVlijm I ran dd if=/dev/zero of=bigfile bs=2G - probably it tried to allocate 2GB of RAM to buffer that thing which I don't have (2GB RAM, but ~80% full)
  • Jacob Vlijm
    Jacob Vlijm over 8 years
    Ah, then better run the test with a smaller file :)
  • DGoiko
    DGoiko over 5 years
    @JacobVlijm it is a terrible practice to rely on the underlying mechanics to do that. Just because "the garbage collector does the job" it doesn't mean you should leave it to do so. For instance, I'm coding a threaded Python 3.7 program that runs in the background which recently ran into a "too many open files" error after reaching 1 million files opened in ~3 hours. Is that a bug in the interpreter? I don't know; all I can tell you is that after adding a forgotten close() the program has been running in test for more than 48 hours with no problems.
  • DGoiko
    DGoiko over 5 years
    @JacobVlijm what I mean is that for desktop toys and non-serious environments your solution may be OK: at most, everything will be closed when the user closes the main process (as Linux will take care of the cleaning), but in high-concurrency scenarios, you should DEFINITELY close resources AS SOON as you don't need them. Using with will always be a good practice (provided it properly works if an exception is raised or a return performed inside; I'm not a Python expert so I don't know if it's properly implemented)