Haskell IO and closing files

12,258

Solution 1

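As others have stated, it is because of lazy evaluation. After hGetContents the handle is semi-closed, and it is closed automatically once all the data has been read. Both hGetContents and readFile are lazy in this way. If you close the handle yourself before the contents have been consumed, the unread part is lost, which is why the second program in the question prints nothing. When you need to close a handle explicitly, or handles are being kept open longer than you want, the usual fix is to force the read first. Here's the easy way: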

import Control.DeepSeq (rnf) -- deepseq package; formerly exported by Control.Parallel.Strategies
import System.IO
-- rnf means "reduce to normal form"
main = do inFile <- openFile "foo" ReadMode
          contents <- hGetContents inFile
          rnf contents `seq` hClose inFile -- force the whole file to be read, then close
          putStr contents

These days, however, many people no longer use String for file I/O. The newer way is to use Data.ByteString (from the bytestring package on Hackage), and Data.ByteString.Lazy when you want lazy reads.

import qualified Data.ByteString as Str

main = do contents <- Str.readFile "foo"
          -- readFile is strict, so the entire string is read here
          Str.putStr contents

ByteStrings are the way to go for big strings (like file contents). They are much faster and more memory efficient than String (= [Char]).
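
If you have to go through a handle rather than a file name, the strict ByteString module has its own hGetContents. A minimal sketch, assuming the bytestring package is available; readAll and the file demo.bin are names made up for the demo:

```haskell
import qualified Data.ByteString as B
import System.IO

-- Hypothetical helper: open a file, read all of it strictly, close it.
readAll :: FilePath -> IO B.ByteString
readAll path = withFile path ReadMode B.hGetContents

main :: IO ()
main = do
  B.writeFile "demo.bin" (B.pack [104, 105, 10])  -- the bytes of "hi\n"
  contents <- readAll "demo.bin"
  B.putStr contents
```

Because the strict hGetContents has read everything before it returns, nothing is left pending when withFile closes the handle.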

Notes:

I used a library rnf only for convenience (it was originally exported by Control.Parallel.Strategies and now lives in Control.DeepSeq, in the deepseq package). You could write something like it yourself pretty easily:

  forceList :: [a] -> ()
  forceList [] = ()
  forceList (_:xs) = forceList xs

This just forces a traversal of the spine (not the values) of the list, which would have the effect of reading the whole file.
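
To see the spine-versus-values distinction in isolation, here is a small sketch that needs no file at all. The list of undefined elements is purely illustrative:

```haskell
-- Walk the list cells without ever looking at the elements.
forceList :: [a] -> ()
forceList []     = ()
forceList (_:xs) = forceList xs

main :: IO ()
main = do
  -- The elements are never inspected, so undefined ones are harmless.
  print (forceList [undefined, undefined :: Int])  -- prints ()
  -- With file contents, walking the spine is what pulls the data in.
  print (forceList "abc")                          -- prints ()
```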

Lazy I/O is increasingly regarded as evil by experts; I recommend using strict bytestrings for most file I/O for the time being. There are a few solutions in the oven which attempt to bring back composable incremental reads, the most promising of which is "Iteratee" by Oleg Kiselyov.
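
In the meantime, incremental reading without lazy I/O can be done by hand. The following is not Iteratee, just a plain strict loop over fixed-size chunks; countBytes, the 4096-byte chunk size, and demo.txt are chosen only for illustration:

```haskell
import qualified Data.ByteString as B
import System.IO

-- Hypothetical helper: count the bytes in a file, one chunk at a time.
countBytes :: FilePath -> IO Int
countBytes path = withFile path ReadMode (go 0)
  where
    go acc h = do
      chunk <- B.hGet h 4096          -- strict read of at most 4096 bytes
      if B.null chunk                 -- an empty chunk means end of file
        then return acc
        else do
          let acc' = acc + B.length chunk
          acc' `seq` go acc' h        -- keep the running total evaluated

main :: IO ()
main = do
  B.writeFile "demo.txt" (B.replicate 10000 120)  -- 10000 bytes of 'x'
  n <- countBytes "demo.txt"
  print n                                         -- prints 10000
```

Only one chunk is held in memory at a time, and the handle is closed by withFile as soon as the loop finishes.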

Solution 2

This is because hGetContents doesn't do anything yet: it's lazy I/O. Only when you use the resulting string is the file (or the part of it that is needed) actually read. If you want to force it to be read, you can compute its length and use the seq function to force the length to be evaluated. Lazy I/O can be cool, but it can also be confusing.
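
A minimal sketch of that length trick. The helper name readFileStrict and the file demo.txt are made up for the example:

```haskell
import System.IO

-- Hypothetical helper: read a whole file, and only then close the handle.
readFileStrict :: FilePath -> IO String
readFileStrict path = do
  h <- openFile path ReadMode
  contents <- hGetContents h
  -- Computing the length walks the entire string, so the whole file
  -- has been read by the time hClose runs.
  length contents `seq` hClose h
  return contents

main :: IO ()
main = do
  writeFile "demo.txt" "blahblahblah\n"  -- made-up demo file
  s <- readFileStrict "demo.txt"
  putStr s
```

The cost is that the whole file sits in memory as a String, so this suits small files best.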

For more information, see the part about lazy I/O in Real World Haskell, for example.

Solution 3

As previously noted, hGetContents is lazy. readFile is lazy too (it is built on hGetContents), but it manages the handle for you, and the file is closed once the contents have been read to the end:

main = do contents <- readFile "foo"
          putStr contents

yields the following in Hugs

> main
blahblahblah

where foo is

blahblahblah

Interestingly, seq only guarantees that some initial portion of the input is read, not all of it, because it evaluates contents only as far as its first list cell (weak head normal form):

main = do inFile <- openFile "foo" ReadMode
          contents <- hGetContents $! inFile
          contents `seq` hClose inFile
          putStr contents

yields

> main
b
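
The same effect can be reproduced without any file. In this sketch the value partial is a made-up stand-in for contents that have only been read up to the first character:

```haskell
import Control.Exception (SomeException, evaluate, try)

-- The first cell exists, but touching the tail raises an error,
-- standing in for the part of the file that has not been read yet.
partial :: String
partial = 'b' : error "rest not read"

main :: IO ()
main = do
  -- seq stops at the first list cell, so the tail is never touched.
  partial `seq` putStrLn "seq: ok, only the first cell was evaluated"
  -- length has to walk the whole list, so it runs into the tail.
  r <- try (evaluate (length partial)) :: IO (Either SomeException Int)
  putStrLn (either (const "length: hit the unread tail") show r)
```

This is why forcing with length or rnf, rather than a bare seq, is what makes the whole file get read before hClose.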

A good resource is: Making Haskell programs faster and smaller: hGetContents, hClose, readFile

Solution 4

If you want to keep your IO lazy, but to do it safely so that errors such as this don't occur, use a package designed for this such as safe-lazy-io. (However, safe-lazy-io doesn't support bytestring I/O.)

Author: Jay Conrod

Updated on June 16, 2022

Comments

  • Jay Conrod
    Jay Conrod almost 2 years

    When I open a file for reading in Haskell, I've found that I can't use the contents of the file after closing it. For example, this program will print the contents of a file:

    main = do inFile <- openFile "foo" ReadMode
              contents <- hGetContents inFile
              putStr contents
              hClose inFile
    

    I expected that interchanging the putStr line with the hClose line would have no effect, but this program prints nothing:

    main = do inFile <- openFile "foo" ReadMode
              contents <- hGetContents inFile
              hClose inFile
              putStr contents
    

    Why does this happen? I'm guessing it has something to do with lazy evaluation, but I thought these expressions would get sequenced so there wouldn't be a problem. How would you implement a function like readFile?

  • Jay Conrod
    Jay Conrod over 15 years
    Could you link any good things to read on these topics? I wasn't able to find much other than sparse documentation and mailing list messages about specific issues.
  • sclv
    sclv over 13 years
    Two comments. First, lots of people still use strings for file IO. They're perfectly fine, when what you want to get out of a file is a string! Second, Lazy IO is not considered evil by lots of folks, but it is considered tricky. It lets us do all sorts of neat things with a very low syntactic overhead, but at the cost of maintaining certain limited types of operational reasoning alongside equational reasoning.
  • Xavier Ho
    Xavier Ho almost 13 years
    Came across this answer and thanks, @liqui! Just wanted to point out (3 years later) that your rnf should be: rnf contents `seq` hClose inFile, with the backticks around seq. Also, rnf has been moved to Control.DeepSeq.
  • alternative
    alternative almost 13 years
    readFile uses hGetContents and doesn't close the file. It's lazy, according to Real World Haskell and the source code itself.
  • luqui
    luqui over 12 years
    @Peter, I think we were talking about lazy IO, which your comment does not address.
  • Mauricio Scheffer
    Mauricio Scheffer about 12 years
    "Lazy IO in serious, server-side programming is unprofessional" – Oleg Kiselyov
  • Ben Millwood
    Ben Millwood almost 12 years
    Firstly, readFile is not strict, as mentioned; secondly, the use of $! with hGetContents is wholly redundant.
  • Ben Millwood
    Ben Millwood almost 12 years
    I don't think unsafePerformIO is relevant here. Maybe unsafeInterleaveIO.