What's a command line way to find large files/directories to remove and free up space?

Solution 1

If you just need to find large files, you can use find with the -size option. The next command will list all files larger than 10MiB (not to be confused with 10MB):

find / -size +10M -ls

If you want to find files within a certain size range, you can combine it with a "size smaller than" test. The next command finds files between 10MiB and 12MiB:

find / -size +10M -size -12M -ls
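
If you also want those matches sorted by size with human-readable output, a variant along these lines should work (a sketch assuming GNU find, du, and sort; adjust the starting path and size threshold to taste):

find / -type f -size +10M -exec du -h {} + 2>/dev/null | sort -rh | head -n 20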

apt-cache search 'disk usage' lists some programs available for disk usage analysis. One application that looks very promising is gt5.

From the package description:

Years have passed and disks have become larger and larger, but even on this incredibly huge harddisk era, the space seems to disappear over time. This small and effective programs provides more convenient listing than the default du(1). It displays what has happened since last run and displays dir size and the total percentage. It is possible to navigate and ascend to directories by using cursor-keys with text based browser (links, elinks, lynx etc.)

Screenshot of gt5

On the "related packages" section of gt5, I found ncdu. From its package description:

Ncdu is a ncurses-based du viewer. It provides a fast and easy-to-use interface through famous du utility. It allows to browse through the directories and show percentages of disk usage with ncurses library.

Screenshot of ncdu
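
Both tools are in the standard Ubuntu repositories. For example, to install ncdu and point it at a directory:

sudo apt-get install ncdu
ncdu /var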

Solution 2

My favorite solution combines several of these good answers.

du -aBM 2>/dev/null | sort -nr | head -n 50 | more

du arguments:

  • -a for "all" files and directories. Leave it off for just directories
  • -BM to output the sizes in megabyte (M) block sizes (B)
  • 2>/dev/null - exclude "permission denied" error messages (thanks @Oli)

sort arguments:

  • -n for "numeric"
  • -r for "reverse" (biggest to smallest)

head arguments:

  • -n 50 for just the top 50 results
  • Leave off the trailing | more if you use a smaller number

Note: Prefix with sudo to include directories that your account does not have permission to access.

Example showing top 10 biggest files and directories in /var (including grand total).

cd /var
sudo du -aBM 2>/dev/null | sort -nr | head -n 10
7555M   .
6794M   ./lib
5902M   ./lib/mysql
3987M   ./lib/mysql/my_database_dir
1825M   ./lib/mysql/my_database_dir/a_big_table.ibd
997M    ./lib/mysql/my_database_dir/another_big_table.ibd
657M    ./log
629M    ./log/apache2
587M    ./log/apache2/ssl_access.log
273M    ./cache

Solution 3

I just use a combination of du and sort.

sudo du -sx /* 2>/dev/null | sort -n

0   /cdrom
0   /initrd.img
0   /lib64
0   /proc
0   /sys
0   /vmlinuz
4   /lost+found
4   /mnt
4   /nonexistent
4   /selinux
8   /export
36  /media
56  /scratchbox
200 /srv
804 /dev
4884    /root
8052    /bin
8600    /tmp
9136    /sbin
11888   /lib32
23100   /etc
66480   /boot
501072  /web
514516  /lib
984492  /opt
3503984 /var
7956192 /usr
74235656    /home

Then it's a case of rinse and repeat. Target the subdirectories you think are too big, run the command for them and you find out what's causing the problem.

Note: I use du's -x flag to keep things limited to one filesystem (I have quite a complicated arrangement of cross-mounted things between SSD and RAID5).

Note 2: 2>/dev/null redirects any error messages into oblivion. If they don't bother you, it's not obligatory.
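
For example, to drill one level into /var from the listing above, point the same command at its contents:

sudo du -sx /var/* 2>/dev/null | sort -n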

Solution 4

To display the 20 biggest directories and files (recursively) in the current folder, use the following one-liner:

du -ah . | sort -rh | head -20

or, avoiding the GNU-specific -h flags (more portable):

du -a . | sort -rn | head -20
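
If you only care about the immediate subdirectories rather than every nested path, limiting the depth keeps the output short (assuming GNU du, which supports --max-depth):

du -h --max-depth=1 . | sort -rh | head -20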

For the top-20 biggest files in the current directory (recursively):

ls -1Rs | sed -e "s/^ *//" | grep "^[0-9]" | sort -nr | head -n20

or with human readable sizes:

ls -1Rhs | sed -e "s/^ *//" | grep "^[0-9]" | sort -hr | head -n20

Note that -h is available for GNU sort only, so to make this work properly on OSX/BSD you have to install GNU sort from coreutils and then add its folder to your PATH.
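
On macOS with Homebrew, for example, that looks roughly like this (the exact prefix depends on your Homebrew installation, e.g. /opt/homebrew on Apple Silicon):

brew install coreutils
export PATH="/usr/local/opt/coreutils/libexec/gnubin:$PATH"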

These aliases are useful to keep in your rc files so they are at hand whenever you need them:

alias big='du -ah . | sort -rh | head -20'
alias big-files='ls -1Rhs | sed -e "s/^ *//" | grep "^[0-9]" | sort -hr | head -n20'

Solution 5

qbi's answer is correct but it will be very slow when there are a lot of files since it will start a new ls process for each item.

A much faster version uses find's -printf action to print the size in bytes (%s) and the path (%p), without spawning a child process per file:

find "$directory" -type f -printf "%s - %p\n" | sort -n | tail -n $num_entries


Comments

  • Ryan Detzel
    Ryan Detzel over 1 year

    Looking for a series of commands that will show me the largest files on a drive.

    • Jason Southwell
      Jason Southwell about 13 years
      Would something graphical be fine?
    • Ryan Detzel
      Ryan Detzel about 13 years
      nope, running on command line over ssh.
    • Ryan Detzel
      Ryan Detzel about 13 years
      What's odd is I have two servers that are running the same thing. One is at 50% disk usage and the other is 99%. I can't find what's causing this.
  • Tonioooooo
    Tonioooooo over 11 years
    In this particular question, the OP prefers a command line method. See the comments to the question. I'll edit the question as well.
  • Cookie
    Cookie over 9 years
    Confirm that this is much faster
  • Jamie
    Jamie about 8 years
    When I run this command du descends into child directories. From the du man page: "Summarize disk usage of each FILE, recursively for directories."
  • Lukas Liesis
    Lukas Liesis about 7 years
    ncdu is very quick and just what i needed, thanks! I've tried gt5 too, but just canceled it because it was "thinking" too long w/o any feedback
  • While-E
    While-E over 6 years
    Holy crap, ncdu is amazing, thank you for sharing your findings!
  • Martin Thoma
    Martin Thoma over 5 years
    I would love if ncdu was pointed out stronger. I need it once in a while and I can't remember the name.
  • Mr Coder
    Mr Coder about 5 years
    Enough of remembering commands thanks to ncdu :)
  • matanster
    matanster over 4 years
    Can you add the -h option (or other ones) to the ls somehow?
  • Lekensteyn
    Lekensteyn over 4 years
    @matt Nope, the output format for the -ls output is hardcoded (see the source code for pred_fls and list_file functions). You could try to approximate the output using the -printf option, post-process the output with awk, or use something like find ... -type f -exec ls -ldh {} \; | column -t
  • janoside
    janoside over 4 years
    Well, starting at the root of the filesystem was exactly the opposite of pointless to me. It allowed me to identify the most important places to target for space saving across the whole filesystem. Regardless, you can use the mentioned tool to start at any directory. Check out the docs for that tool. But, as requested, here is the link to the source answer.
  • NeverMine17
    NeverMine17 over 3 years
    Amazing utility if you have access to GUI
  • Eyni Kave
    Eyni Kave over 2 years
    Thanks for this Enterprise solution. I just want to complete this by adding 'cd /' before running the command: 'cd /; sudo du -aBM 2>/dev/null | sort -nr | head -n 10 > sizelog.txt'