What is the best way to slurp a file into a string in Perl?

Solution 1

How about this:

use File::Slurp;
my $text = read_file($filename);

ETA: note Bug #83126 for File-Slurp: Security hole with encoding(UTF-8). I now recommend using File::Slurper (disclaimer: I wrote it), also because it has better defaults around encodings:

use File::Slurper 'read_text';
my $text = read_text($filename);

or Path::Tiny:

use Path::Tiny;
my $text = path($filename)->slurp_utf8;

Solution 2

I like doing this with a do block in which I localize @ARGV so I can use the diamond operator to do the file magic for me.

my $contents = do { local(@ARGV, $/) = $file; <> };

If you need this to be a bit more robust, you can easily turn this into a subroutine.
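A minimal sketch of such a subroutine follows. The name slurp_file and the readability check are assumptions added for illustration; they are not part of the original one-liner or of any module.

```perl
use strict;
use warnings;

# slurp_file is a hypothetical helper name chosen for this sketch.
sub slurp_file {
    my ($file) = @_;

    # <> only warns and returns undef when it cannot open a file,
    # so fail loudly up front instead.
    die "Not a readable file: $file\n" unless -f $file && -r _;

    # local restores @ARGV and $/ when the do block exits.
    # Caveat: <> opens names with two-argument open semantics,
    # so unusual file names can be misinterpreted.
    my $contents = do { local (@ARGV, $/) = ($file); <> };
    return $contents;
}
```

Called as my $contents = slurp_file($file);, it returns the whole file as one string and dies if the path is not a readable plain file.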

If you need something really robust that handles all sorts of special cases, the usual suggestion is File::Slurp. Even if you aren't going to use it, take a look at the source to see all the wacky situations it has to handle. Be aware, though, that File::Slurp has a big security problem that doesn't appear to have a fix. Part of this is its failure to handle encodings properly. Even my quick answer above has that problem. If you need to handle the encoding (maybe because you don't make everything UTF-8 by default), the one-liner expands to:

my $contents = do {
    open my $fh, '<:encoding(UTF-8)', $file or die '...';
    local $/;
    <$fh>;
    };
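To see why the encoding layer matters, here is a small sketch that slurps the same file twice, once as raw bytes and once decoded as UTF-8. The scratch file and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write the two-byte UTF-8 sequence for U+00E9 (e with acute) to a scratch file.
my ($out, $file) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "\xC3\xA9";
close $out or die "Can't close $file: $!";

# Slurp without decoding: the scalar holds raw bytes.
my $bytes = do {
    open my $fh, '<:raw', $file or die "Can't open $file: $!";
    local $/;
    <$fh>;
};

# Slurp through the encoding layer: the scalar holds decoded characters.
my $chars = do {
    open my $fh, '<:encoding(UTF-8)', $file or die "Can't open $file: $!";
    local $/;
    <$fh>;
};

printf "raw: %d, decoded: %d\n", length($bytes), length($chars);
# prints "raw: 2, decoded: 1"
```

The raw read yields a two-byte string, while the decoded read yields a single character, which is what string functions such as length and regex matching should normally see.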

If you don't need to change the file, you might be able to use File::Map.

Solution 3

In writing File::Slurp (which is the best way), Uri Guttman did a lot of research into the many ways of slurping and which is most efficient. He wrote down his findings and incorporated them into File::Slurp.

Solution 4

open(my $f, '<', $filename) or die "OPENING $filename: $!\n";
my $string = do { local($/); <$f> };
close($f);

Solution 5

Things to think about (especially when compared with other solutions):

  1. Lexical filehandles
  2. Reduce scope
  3. Reduce magic

So I get:

my $contents = do {
  local $/;
  open my $fh, '<', $filename or die "Can't open $filename: $!";
  <$fh>
};

I'm not a big fan of magic <> except when actually using magic <>. Instead of faking it out, why not just use the open call directly? It's not much more work, and is explicit. (True magic <>, especially when handling "-", is far more work to perfectly emulate, but we aren't using it here anyway.)

Author: dreeves

Updated on July 09, 2022

Comments

  • dreeves
    dreeves almost 2 years

    Yes, There's More Than One Way To Do It but there must be a canonical or most efficient or most concise way. I'll add answers I know of and see what percolates to the top.

    To be clear, the question is how best to read the contents of a file into a string. One solution per answer.

  • user2522201
    user2522201 over 15 years
    This is probably the most inefficient way I can think of, especially for large files. Now you have two copies of the same data and you have processed it twice just to load it into a scalar.
  • dland
    dland over 15 years
    And in case it's not obvious to those following along at home, at the end of the curly block, $fh goes out of scope and the file handle is closed automatically.
  • ephemient
    ephemient over 15 years
    I'm lazy and write my $contents = do {local (@ARGV,$/) = $file; <>};, which is the exact same thing in less characters :)
  • Mr.Ree
    Mr.Ree over 15 years
    It's all situational. For a small file or a run-only-once quickie script, where $string=`cat $filename` is not available, this is perfectly reasonable. Inefficient yes! But that's not necessarily the only consideration.
  • Powerlord
    Powerlord over 15 years
    I'm wondering why local @ARGV = $file; <> would be any different than <$file>.
  • brian d foy
    brian d foy over 15 years
    @Bemrose: because $file is not a filehandle.
  • Kip
    Kip about 14 years
    this does have the disadvantage that it is not included in out-of-the-box perl. at least not my ActiveState perl for windows (v5.10.0).
  • unixman83
    unixman83 about 12 years
    This answer doesn't deserve a negative rating. Bunch of script kiddies that don't understand or care about what perl means by <FILEHANDLE>. It's an array silly. No worse performance than some of the other answers on this page. Very informative on how to think about Perl filehandles and slurping, as an array.
  • Leon Timmermans
    Leon Timmermans almost 12 years
    Actually, File::Map (disclaimer: written by me) would be a better choice nowadays. It's far more portable (it works on both Unix and Windows), but also easier to use («map_file my $str, $file_name;»).
  • brian d foy
    brian d foy about 10 years
    Note that File::Slurp has recently been discovered to be a huge security problem: rt.cpan.org/Ticket/Display.html?id=83126
  • Adam Millerchip
    Adam Millerchip over 8 years
    I got shot in the foot adding this method to a file that further down was already using <>, expecting it to read from STDIN. The behaviour of <> differs from the first call to subsequent calls, and since I changed the first call, I altered the behaviour of the existing call too (which was expecting the <STDIN> behaviour of <>).
  • stenlytw
    stenlytw almost 8 years
    Hi, I got Undefined subroutine &main::read_text. It should be use File::Slurper 'read_text';. metacpan.org/pod/File::Slurper
  • Britton Kerin
    Britton Kerin about 5 years
    Makes for a nice way to check that an attempted line substitution in a file actually happens: perl -p -i -0 -e 's/^old_line/new_line/m or (print and die)' some_file, or probably could use /mg to do all matching lines if many expected.
  • Freon Sandoz
    Freon Sandoz about 2 years
    File::Slurp isn't a portable solution. On Windows, it will barf on all non-ANSI filenames. I haven't tested Path::Tiny, but given Perl's general level of contempt for Windows, I'll bet that it's the same. It may be possible to open a file with Win32::LongPath and pass the filehandle instead of the filepath. (That's something that Perl really should support in all file I/O.)