What is the difference between "du -sh *" and "du -sh ./*"?

Solution 1

$ touch ./-c $'a\n12\tb' foo
$ du -hs *
0       a
12      b
0       foo
0       total

As you can see, the -c file was taken as an option to du and is not reported (and you see the total line because of du -c). Also, the file called a\n12\tb is making us think that there are files called a and b.

$ du -hs -- *
0       a
12      b
0       -c
0       foo

That's better. At least this time -c is not taken as an option.

$ du -hs ./*
0       ./a
12      b
0       ./-c
0       ./foo

That's even better. The ./ prefix prevents -c from being taken as an option, and the absence of ./ before b in the output indicates that there's no file called b in there, but rather a file whose name contains a newline character (but see note 1 under "Further digressions" below for more on that).

It's good practice to use the ./ prefix when possible. When that's not possible, and for arbitrary data in general, you should always use:

cmd -- "$var"

or:

cmd -- $patterns

If cmd doesn't support -- to mark the end of options, you should report it as a bug to its author (except when it's by choice and documented like for echo).
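
To make that advice concrete, here is a minimal sketch of how a script might normalise an arbitrary file name held in a variable before handing it to a command. The helper name to_safe_path and the variable safe_path are made up for this illustration. The result is returned in a variable rather than printed, so that trailing newlines in a name would survive.

```shell
# Hypothetical helper (not a standard utility): prefix relative names with
# "./" so they cannot be mistaken for an option, for "-", or for an
# awk-style "var=value" assignment. The result is left in $safe_path.
to_safe_path() {
  case $1 in
    /*) safe_path=$1 ;;     # absolute paths already start with "/"
    *)  safe_path=./$1 ;;   # everything else gets the "./" prefix
  esac
}

to_safe_path '-c';          p1=$safe_path
to_safe_path 'a=b.txt';     p2=$safe_path
to_safe_path '/etc/passwd'; p3=$safe_path
printf '%s\n' "$p1" "$p2" "$p3"
```

Note that an empty name becomes ./ (the current directory), so a real script would probably want to reject empty names first.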

There are cases where ./* solves problems that -- doesn't. For instance:

awk -f file.awk -- *

fails if there is a file called a=b.txt in the current directory (it sets the awk variable a to b.txt instead of telling awk to process the file).

awk -f file.awk ./*

doesn't have the problem because ./a is not a valid awk variable name, so ./a=b.txt is not taken as a variable assignment.

cat -- * | wc -l

fails if there is a file called - in the current directory, as that tells cat to read from its stdin (- is special to most text processing utilities and to cd/pushd).

cat ./* | wc -l

is OK because ./- is not special to cat.
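
The same kind of scratch-directory sketch shows the effect of a file called -. Again, mktemp -d and the variable names are assumptions of the demo, and stdin is redirected from /dev/null so the outcome doesn't depend on the terminal.

```shell
tmp=$(mktemp -d)                     # scratch directory for the demo
printf 'one\ntwo\n' > "$tmp/-"       # a regular file literally named "-"

# "cat -- -" still means "read stdin" (empty here): the file is never opened.
with_dashes=$(( $(cd "$tmp" && cat -- * < /dev/null | wc -l) ))

# "cat ./-" opens the file.
with_prefix=$(( $(cd "$tmp" && cat ./* < /dev/null | wc -l) ))

printf 'with --: %d line(s); with ./: %d line(s)\n' \
  "$with_dashes" "$with_prefix"
rm -rf -- "$tmp"
```

The arithmetic expansion around wc -l is only there to strip the leading blanks that some wc implementations print.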

Things like:

grep -l -- foo *.txt | wc -l

to count the number of files that contain foo are wrong because they assume that file names don't contain newline characters (wc -l counts the newline characters, those output by grep for each file and those in the file names themselves). You should use instead:

grep -l foo ./*.txt | grep -c /

(counting the number of lines with a / character is more reliable as there can only be one per filename).
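
A small sketch of the difference, with two matching files, one of which has a newline in its name (mktemp -d and the variable names are assumptions of the demo):

```shell
tmp=$(mktemp -d)                                # scratch directory
printf 'foo\n' > "$tmp/a.txt"
printf 'foo\n' > "$tmp/$(printf 'b\nc.txt')"    # name contains a newline

# Counting lines: the embedded newline makes one file count twice.
by_lines=$(( $(cd "$tmp" && grep -l -- foo *.txt | wc -l) ))

# Counting lines that contain "/": exactly one per file name.
by_slash=$(cd "$tmp" && grep -l foo ./*.txt | grep -c /)

printf 'wc -l says %d, grep -c / says %d (there are 2 files)\n' \
  "$by_lines" "$by_slash"
rm -rf -- "$tmp"
```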

For recursive grep, the equivalent trick is to use:

grep -rl foo .//. | grep -c //

./* may have some unwanted side effects though.

cat ./*

adds two more characters per file, so it would make you reach the limit on the maximum size of arguments+environment sooner. And sometimes you don't want that ./ to be reported in the output. Like:

grep foo ./*

Would output:

./a.txt: foobar

instead of:

a.txt: foobar

Further digressions

1. I feel like I have to expand on that here, following the discussion in comments.

$ du -hs ./*
0       ./a
12      b
0       ./-c
0       ./foo

Above, that ./ marking the beginning of each file means we can clearly identify where each filename starts (at ./) and where it ends (at the newline before the next ./ or the end of the output).

What that means is that the output of du ./* (contrary to that of du -- *) can be parsed reliably, albeit not that easily in a script.
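
As a rough sketch of what such parsing could look like: since the operands come from ./*, a file name cannot itself contain a /, so any output line containing / starts a new record and any other line is the continuation of the previous name. The awk script below glues continuation lines back on with a visible \n. The variable names and the use of mktemp -d are assumptions of the demo, and du -s is used instead of du -hs only to keep the sizes plain numbers.

```shell
tmp=$(mktemp -d)                                # scratch directory
touch "$tmp/foo" "$tmp/-c" "$tmp/$(printf 'a\n12\tb')"

# A line containing "/" starts a new record; any other line continues the
# previous file name and is appended to it after a literal "\n".
records=$(cd "$tmp" && du -s ./* | awk '
  /\//  { if (rec != "") print rec; rec = $0; next }
        { rec = rec "\\n" $0 }
  END   { if (rec != "") print rec }')

raw_lines=$(( $(cd "$tmp" && du -s ./* | wc -l) ))
record_count=$(( $(printf '%s\n' "$records" | wc -l) ))

printf '%s\n' "$records"
printf '%d raw lines, %d records\n' "$raw_lines" "$record_count"
rm -rf -- "$tmp"
```

The three files produce four raw lines but three records. The same script applied to the output of du -- * would have no marker to rely on.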

When the output goes to a terminal though, there are plenty more ways a filename may fool you:

  • Control characters and escape sequences can affect the way things are displayed. For instance, \r moves the cursor to the beginning of the line, \b moves the cursor back, \e[C moves it forward (in most terminals)...

  • Many characters are invisible on a terminal, starting with the most obvious one: the space character.

  • There are Unicode characters that look just the same as the slash in most fonts:

     $ printf '\u002f \u2044 \u2215 \u2571 \u29F8\n'
     / ⁄ ∕ ╱ ⧸

    (see how it goes in your browser).

An example:

$ touch x 'x ' $'y\bx' $'x\n0\t.\u2215x' $'y\r0\t.\e[Cx'
$ ln x y
$ du -hs ./*
0       ./x
0       ./x
0       ./x
0       .∕x
0       ./x
0       ./x

Lots of x's but y is missing.

Some tools like GNU ls would replace the non-printable characters with a question mark (note that ∕ (U+2215) is printable though) when the output goes to a terminal. GNU du does not.

There are ways to make them reveal themselves:

$ ls
x  x   x?0?.∕x  y  y?0?.?[Cx  y?x
$ LC_ALL=C ls
x  x?0?.???x  x   y  y?x  y?0?.?[Cx

See how ∕ turned to ??? after we told ls that our character set was ASCII.

$ du -hs ./* | LC_ALL=C sed -n l
0\t./x$
0\t./x $
0\t./x$
0\t.\342\210\225x$
0\t./y\r0\t.\033[Cx$
0\t./y\bx$

$ marks the end of the line, so we can tell "x" from "x ". All non-printable and non-ASCII characters are represented by a backslash sequence (a backslash itself would be represented by two backslashes), which makes the output unambiguous. That was GNU sed; it should be the same in all POSIX-compliant sed implementations, but note that some old sed implementations are not nearly as helpful.

$ du -hs ./* | cat -vte
0^I./x$
0^I./x $
0^I./x$
0^I.M-bM-^HM-^Ux$

(cat -vte is not standard but pretty common; some implementations also offer cat -A). That one is helpful and uses a different representation, but it is ambiguous: a literal "^I" and a <TAB> are displayed the same, for instance.

$ du -hs ./* | od -vtc
0000000   0  \t   .   /   x  \n   0  \t   .   /   x      \n   0  \t   .
0000020   /   x  \n   0  \t   . 342 210 225   x  \n   0  \t   .   /   y
0000040  \r   0  \t   . 033   [   C   x  \n   0  \t   .   /   y  \b   x
0000060  \n
0000061

That one is standard and unambiguous (and consistent from implementation to implementation) but not as easy to read.

You'll notice that y never showed up above. That's a completely unrelated issue with du -hs * that has nothing to do with file names but should be noted: because du reports disk usage, it doesn't report other links to a file already listed (not all du implementations behave like that though when the hard links are listed on the command line).
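
A quick way to check how a given du treats hard links listed on the command line is sketched below. It compares the number of operands the shell passes with the number of entries du reports. The variable names and mktemp -d are assumptions of the demo, and, as said above, the result depends on the du implementation.

```shell
tmp=$(mktemp -d)                   # scratch directory
printf 'data\n' > "$tmp/x"
ln "$tmp/x" "$tmp/y"               # y is a second hard link to the same file

# Number of arguments the glob expands to.
operands=$(cd "$tmp" && set -- ./* && echo "$#")

# Number of entries du actually reports (one "/" per entry).
reported=$(cd "$tmp" && du -s ./* | grep -c /)

printf '%d operands passed, %d reported by du\n' "$operands" "$reported"
rm -rf -- "$tmp"
```

With a du that skips already-counted links, as GNU du does, fewer entries are reported than operands were passed.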

Solution 2

There is no difference between * and ./* in terms of which files either will list. The only difference is that with the second form, each file name is prefixed with a dot slash (./), which refers to the current directory.

Remember that the . directory is a shorthand notation for the current directory.

$ ls -la | head -4
total 28864
drwx------. 104 saml saml    12288 Jan 23 20:04 .
drwxr-xr-x.   4 root root     4096 Jul  8  2013 ..
-rw-rw-r--.   1 saml saml      972 Oct  6 20:26 abcdefg

You can convince yourself that these 2 lists are essentially the same thing by using echo to see what the shell would expand them to.

$ echo *
$ echo ./*

These 2 commands will list all the files in your current directory.

Examples

We can make some fake data like so:

$ touch file{1..5}
$ ll
total 0
-rw-rw-r--. 1 saml saml 0 Jan 24 07:14 file1
-rw-rw-r--. 1 saml saml 0 Jan 24 07:14 file2
-rw-rw-r--. 1 saml saml 0 Jan 24 07:14 file3
-rw-rw-r--. 1 saml saml 0 Jan 24 07:14 file4
-rw-rw-r--. 1 saml saml 0 Jan 24 07:14 file5

Now when we use the above echo commands we see the following output:

$ echo *
file1 file2 file3 file4 file5
$ echo ./*
./file1 ./file2 ./file3 ./file4 ./file5

This difference may seem unnecessary but there are situations where you want to guarantee to the various Unix command line tools that you are passing filenames to them via the command line, and nothing more!

So then why use ./*?

As @Stephane's answer points out, due to the nature of what characters are legal when naming files & directories in Unix, dangerous filenames can be constructed which have unexpected side effects when they're passed to various Unix commands at the command line.

So ./ is often used to help guarantee that expanded file names are treated as file names when passed as arguments to the various Unix commands.

Author: Biswanath

Updated on September 18, 2022

Comments

  • Biswanath
    Biswanath almost 2 years

    What's the difference between du -sh * and du -sh ./* ?

    Note: What interests me is the * and ./* parts.

    • S edwards
      S edwards over 10 years
      the output ? one will show you ./ in front of filename
  • Olivier Dulac
    Olivier Dulac over 10 years
    +1, Nice and thorough (as far as i can tell ^^). I especially love the "grep -c /" advantage. Also worth noting: the advantage of "./*" over "*" appears in one of the (many) good answers of the Unix FAQ (probably on faqs.org. iirc, it's in the question about rm-ing files starting with a "-").
  • user2071406
    user2071406 over 10 years
    …and it's not bad practice to have files with newlines and tabs in their names? I know I try to limit names to [a-z0-9.+-].
  • Stéphane Chazelas
    Stéphane Chazelas over 10 years
    @BlacklightShining, it's very bad to steal cars, but it's bad to leave your car unlocked (ignore newlines), especially when it's an expensive car (script running as a privileged user, on a server with sensitive data...) or when you park it in a rough area (/tmp) or an area with lots of expensive cars ($HOME) and it's even worse to go to a Q&A site and say that's always fine not to lock your car without specifying in which conditions (in a locked garage, script you wrote run by yourself only on a machine not connected to any network or removable storage...)
  • alexis
    alexis over 10 years
    Nice explication, but it misleadingly suggests that the ./ not only disambiguates filenames that begin with -, but addresses problems with whitespace. It does not. (I know you don't make inaccurate claims, but it's easy to miss the point without very careful reading).
  • Stéphane Chazelas
    Stéphane Chazelas over 10 years
    @alexis, I fail to see where you see a reference to whitespace in there. white space are neither a problem for du * nor du ./* (except maybe if you want to argue that a file called "b" or "b " (but then again "a\bb" as well) would look the same in the du output to a terminal)
  • user2071406
    user2071406 over 10 years
    @StephaneChazelas Wait, so now we're root on a server with sensitive data? I thought we were a regular user, interactively getting disk usage information. Obviously if one is letting anyone create files, then one should be sanitizing the filenames. I would expect that, in production, one wouldn't be running programs that create files with things like newlines in their names.
  • user2071406
    user2071406 over 10 years
    Alternately, this program running as root could be made to print \n literally rather than an actual newline, like Python's repr() does for strings. In case someone makes a file called a\n12\t\xe2\x80\xa4\xe2\x88\x95b.
  • Stéphane Chazelas
    Stéphane Chazelas over 10 years
    @BlacklightShining, yes though that one (like "b " or "a\bb") would fool a user on a terminal but not a script parsing the output of du ./*. I should probably add a note about that. Will do tomorrow. Note that earlier I meant privileged in the general sense, not root (though applies all the more to root of course). newlines are permitted, ignoring them is a bug. bugs have a habit of being exploited. You've got to measure the risk on a case by case basis. Good coding practice can avoid the problems in many cases. Certainly on SE, we should raise awareness.
  • user2071406
    user2071406 over 10 years
    @StephaneChazelas Ignoring newlines is a bug? Depends on what the program is doing. If it's fetching the file sizes itself and only handling the raw filenames internally, it doesn't have to do anything. The only problems would arise if the information was presented to the user or received from|passed to another process.
  • Wildcard
    Wildcard over 8 years
    @BlacklightShining, it sounds like you don't believe in making your scripts robust or in handling edge cases. If filenames can contain special characters (which they can, obviously) then your scripts and programs should be written to handle those special characters if/when they occur. If you disagree then I sure hope you are never let loose to write scripts for anyone else's use....
  • user2071406
    user2071406 over 8 years
    @Wildcard A robust script wouldn't invoke du and then try to parse its output, for exactly the reasons detailed previously in this comment thread.