Why doesn't the command "ls | file" work?


The fundamental issue is that file expects file names as command-line arguments, not on stdin. When you write ls | file the output of ls is being passed as input to file. Not as arguments, as input.

What's the difference?

  • Command-line arguments are when you write flags and file names after a command, as in cmd arg1 arg2 arg3. In shell scripts these arguments are available as the variables $1, $2, $3, etc. In C you'd access them via the char **argv and int argc arguments to main().

  • Standard input, stdin, is a stream of data. Some programs like cat or wc read from stdin when they're not given any command-line arguments. In a shell script you can use read to get a single line of input. In C you can use scanf() or getchar(), among various options.
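To make the distinction concrete, here is a minimal shell sketch. The function name show_input is made up for illustration. It reports how many arguments it received and how many lines arrived on stdin, so you can see that the same names end up in different places depending on how they are supplied.

```shell
# show_input is a hypothetical helper, not a standard command.
# It reports its argument count, then counts the lines arriving on stdin.
show_input() {
    echo "arguments: $#"
    lines=0
    while IFS= read -r line; do
        lines=$((lines + 1))
    done
    echo "stdin lines: $lines"
}

# Names passed as arguments; stdin is empty.
show_input file1 file2 < /dev/null

# The same names piped in; the argument list is empty.
printf '%s\n' file1 file2 | show_input
```

The first call reports 2 arguments and 0 stdin lines; the second reports 0 arguments and 2 stdin lines. The second case is the situation file finds itself in with ls | file.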

file does not normally read from stdin. It expects at least one file name to be passed as an argument. That's why it prints out usage when you write ls | file, because you didn't pass an argument.

You could use xargs to convert stdin into arguments, as in ls | xargs file. Still, as terdon mentions, parsing ls is a bad idea. The most direct way to do this is simply:

file *

Because, as you say, the input of file has to be filenames. The output of ls, however, is just text. That it happens to be a list of file names doesn't change the fact that it is simply text and not the location of files on the hard drive.

When you see output printed on the screen, what you see is text. Whether that text is a poem or a list of filenames makes no difference to the computer. All it knows is that it is text. This is why you can pass the output of ls to programs that take text as input (although you really, really shouldn't):

$ ls / | grep etc

So, to use the output of a command that lists file names as text (such as ls or find) as input for a command that takes filenames, you need to use some tricks. The typical tool for this is xargs:

$ ls
file1 file2

$ ls | xargs wc
 9  9 38 file1
 5  5 20 file2
14 14 58 total

As I said before, though, you really don't want to be parsing the output of ls. Something like find is better (-print0 prints a \0 instead of a newline after each file name, and the -0 option of xargs lets it deal with such input; this is a trick to make your commands work with filenames containing newlines):

$ find . -type f -print0 | xargs -0 wc
 9  9 38 ./file1
 5  5 20 ./file2
14 14 58 total

find also has its own way of doing this, without needing xargs at all:

$ find . -type f -exec wc {} +
 9  9 38 ./file1
 5  5 20 ./file2
14 14 58 total
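To see why the \0 delimiter matters, here is a rough sketch that counts how many names each approach hands to the command. The scratch directory and file names are made up for illustration, and sh -c 'echo $#' is just a stand-in for a real command such as wc. It assumes the temporary directory path itself contains no whitespace.

```shell
# Scratch directory with one awkward file name (it contains a newline).
dir=$(mktemp -d)
printf 'one\n' > "$dir/plain"
printf 'two\n' > "$dir/odd
name"

# Newline-delimited input: the awkward name is split into two bogus names.
split=$(find "$dir" -type f | xargs sh -c 'echo $#' sh)

# NUL-delimited input: each name arrives intact.
intact=$(find "$dir" -type f -print0 | xargs -0 sh -c 'echo $#' sh)

# find batching the names itself with -exec ... {} +
batched=$(find "$dir" -type f -exec sh -c 'echo $#' sh {} +)

echo "plain xargs saw $split names"
echo "xargs -0 saw $intact names"
echo "find -exec saw $batched names"

rm -r "$dir"
```

With two files in the directory, plain xargs sees 3 names, while the -print0/-0 pair and -exec each see 2.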

Finally, you can also use a shell loop. However, note that in most cases, xargs will be much faster and more efficient. For example:

$ for file in *; do wc "$file"; done
 9  9 38 file1
 5  5 20 file2
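The efficiency difference comes from how many times the command is started. The following sketch, using made-up file names in a scratch directory, counts the invocations. Again sh -c stands in for the real command.

```shell
# Scratch directory with three empty files; the names are arbitrary.
dir=$(mktemp -d)
touch "$dir/a" "$dir/b" "$dir/c"

# Loop: the command is started once per file.
loop_runs=0
for f in "$dir"/*; do
    sh -c 'echo "loop: got $# name(s)"' sh "$f"
    loop_runs=$((loop_runs + 1))
done

# xargs: the names are batched, so each invocation prints one line.
xargs_runs=$(printf '%s\0' "$dir"/* \
    | xargs -0 sh -c 'echo "xargs: got $# name(s)"' sh \
    | wc -l | tr -d ' ')

echo "loop started the command $loop_runs times, xargs $xargs_runs time(s)"

rm -r "$dir"
```

For three files the loop starts the command 3 times and xargs starts it once. This is also why the loop output above has no "total" line: each wc only ever sees a single file.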

learned that '|' (pipeline) is meant to redirect the output from a command to the input of another one.

It doesn't "redirect" the output; it takes the output of one program and uses it as the input of another. file, however, doesn't take its file names from input but as arguments, which it then tests. Neither redirection nor piping passes these file names as arguments, and piping is what you are doing.

What you can do is read the filenames from a file with the --files-from option if you have a file which lists all the files you want to test; otherwise just pass the paths to your files as arguments.
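A minimal sketch of the list-file approach, assuming a file implementation that supports the long option --files-from (the short form is -f, which appears in the usage message in the question). The scratch directory and file names are invented for the example.

```shell
# Scratch files for illustration; the names are arbitrary.
dir=$(mktemp -d)
printf 'hello\n' > "$dir/notes.txt"
printf '#!/bin/sh\n' > "$dir/run.sh"

# A list file holding one path per line.
printf '%s\n' "$dir/notes.txt" "$dir/run.sh" > "$dir/list"

# Read the names to test from the list file rather than from the arguments.
# Guarded so the sketch does nothing on systems where file is not installed.
if command -v file >/dev/null 2>&1; then
    file --files-from "$dir/list"
fi

rm -r "$dir"
```

Because the list is read one name per line, this approach cannot handle file names that themselves contain a newline.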

The accepted answer explains why the pipe command doesn't work straightaway, and with the file * command, it offers a simple, straightforward solution.

I'd like to suggest another alternative that might come in handy at some time. The trick is using the backtick (`) character. The backtick is explained in great detail here. In short, it takes the output of the command enclosed in the backticks and substitutes it as a string into the remaining command.

So, file `ls` will take the output of the ls command, and substitute it as arguments for the file command. This is longer and more complicated than the accepted solution, but variants of this may be helpful in other situations.
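One caveat with this approach: the shell splits the substituted text on whitespace, so a name containing a space turns into two arguments. The sketch below illustrates this. The helpers count_args and list_names are hypothetical, and list_names stands in for ls so the example does not depend on the current directory.

```shell
# count_args is a hypothetical helper that reports how many arguments it got.
count_args() {
    echo "$#"
}

# list_names stands in for ls: two names, one of them containing a space.
list_names() {
    printf '%s\n' 'report.txt' 'my notes.txt'
}

# Backticks: the output is split on whitespace into separate arguments.
backticks=$(count_args `list_names`)

# The $(...) form is equivalent and splits the same way when unquoted.
unquoted=$(count_args $(list_names))

# Quoted substitution: the whole output becomes a single argument.
quoted=$(count_args "$(list_names)")

echo "backticks: $backticks, unquoted: $unquoted, quoted: $quoted"
```

This prints "backticks: 3, unquoted: 3, quoted: 1". Neither result matches the two real names, which is one more reason to prefer file * when the names come from the current directory.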

The output of ls through a pipe is a solid block of data with 0x0a (i.e. a linefeed character) separating each line, and file gets this on its stdin rather than as parameters, whereas it expects file names as separate arguments to work on one at a time.

As a general rule, never use ls to generate a data source for other commands - one day it'll pipe .. into rm and then you're in trouble!

Better to use a loop, such as for i in *; do file "$i" ; done which will produce the output you want, predictably. The quotes are there in case of filenames with spaces.


  • IanC
    IanC almost 2 years

    I've been studying the command line and learned that | (pipeline) is meant to redirect the output from a command to the input of another one. So why doesn't the command ls | file work?

    file's input is one or more filenames, like file filename1 filename2

    ls's output is a list of the directories and files in a folder, so I thought ls | file was supposed to show the file type of every file in a folder.

    When I use it however, the output is:

        Usage: file [-bcEhikLlNnprsvz0] [--apple] [--mime-encoding] [--mime-type]
            [-e testname] [-F separator] [-f namefile] [-m magicfiles] file ...
        file -C [-m magicfiles]
        file [--help]

    As if there were some error in my usage of the file command.

