Find directories with lots of files in them
Solution 1
Check /lost+found
in case there was a disk problem and a lot of junk ended up being detected as separate files, possibly wrongly.
Check iostat
to see if some application is still producing files like crazy.
find / -xdev -type d -size +100k
will tell you if there's a directory that uses more than 100kB of disk space. That would be a directory that contains a lot of files, or contained a lot of files in the past. You may want to adjust the size figure.
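As a quick illustration of that heuristic, here is a sketch that prints each directory's actual entry count next to its path, demonstrated on a throwaway temp tree (on a real system you would start at / and keep the -size +100k test so only suspiciously large directories are visited):

```shell
# Sketch: print each directory's entry count next to its path.
# Demonstrated on a throwaway temp tree; on a real system start at /
# and add -size +100k so only suspiciously large directories are listed.
tmp=$(mktemp -d)
mkdir "$tmp/bushy"
touch "$tmp/bushy/f1" "$tmp/bushy/f2" "$tmp/bushy/f3"
out=$(find "$tmp" -xdev -type d -exec sh -c '
  for d; do printf "%s\t%s\n" "$(ls -A "$d" | wc -l)" "$d"; done
' sh {} +)
printf '%s\n' "$out"
rm -r "$tmp"
```

The -exec … {} + form batches directories into as few shell invocations as possible, which matters when the tree is large.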
I don't think there's a combination of options to GNU du
to make it count 1 per directory entry. You can do this by producing the list of files with find
and doing a little bit of counting in awk. Here is a du
for inodes. Minimally tested, doesn't try to cope with file names containing newlines.
#!/bin/sh
find "$@" -xdev -depth | awk '{
    depth = $0; gsub(/[^\/]/, "", depth); depth = length(depth);
    if (depth < previous_depth) {
        # A non-empty directory: its predecessor was one of its files
        total[depth] += total[previous_depth];
        print total[previous_depth] + 1, $0;
        total[previous_depth] = 0;
    }
    ++total[depth];
    previous_depth = depth;
}
END { print total[0], "total"; }'
Usage: du-inodes /. This prints a list of non-empty directories with the total count of entries in them and their subdirectories, recursively. Redirect the output to a file and review it at your leisure. sort -k1nr <root.du-inodes | head will tell you the biggest offenders.
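To see the counting in action without walking the whole filesystem, the same find | awk pipeline can be smoke-tested on a tiny throwaway tree (the paths below are temporary and made up for the demo):

```shell
# Smoke test of the du-inodes pipeline on a tiny throwaway tree:
# a directory "a" holding two files is reported with count 3
# (2 files, plus 1 for the directory's own entry).
tmp=$(mktemp -d)
mkdir "$tmp/a"
touch "$tmp/a/f1" "$tmp/a/f2"
out=$(find "$tmp" -xdev -depth | awk '{
    depth = $0; gsub(/[^\/]/, "", depth); depth = length(depth);
    if (depth < previous_depth) {
        total[depth] += total[previous_depth];
        print total[previous_depth] + 1, $0;
        total[previous_depth] = 0;
    }
    ++total[depth];
    previous_depth = depth;
}
END { print total[0], "total"; }')
printf '%s\n' "$out"
rm -r "$tmp"
```

Because find -depth emits a directory's contents before the directory itself, the awk script can fold each level's running total into its parent as soon as the depth decreases.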
Solution 2
You can check with this script:
#!/bin/bash
if [ $# -ne 1 ]; then
    echo "Usage: $(basename "$0") DIRECTORY"
    exit 1
fi
echo "Wait a moment if you want a good top of the bushy folders..."
find "$@" -type d -print0 2>/dev/null | while IFS= read -r -d '' file; do
    echo -e "$(ls -A "$file" 2>/dev/null | wc -l)" "files in:\t $file"
done | sort -nr | head | awk '{print NR".", "\t", $0}'
exit 0
This prints the top 10 subdirectories by file count. If you want a top x, replace head with head -n x, where x is a positive integer.
For 100% reliable results, run this script with root privileges.
Solution 3
Often faster than find, if your locate database is up to date:
# locate '' | sed 's|/[^/]*$|/|g' | sort | uniq -c | sort -n | tee filesperdirectory.txt | tail
This dumps the entire locate database, strips off everything past the last '/' in each path, and then sort and uniq -c give you the number of files and directories per directory. sort -n piped to tail gets you the ten directories with the most things in them.
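When the locate database is stale or unavailable, the same tally can be produced from a plain find listing instead. A sketch, demonstrated on a throwaway temp tree (in real use, point find at / or at the suspect subtree):

```shell
# Same per-directory tally as the locate pipeline, fed by find instead.
# -not -type d counts only non-directories, which is usually what you
# want when hunting runaway file creation. Throwaway demo tree below.
tmp=$(mktemp -d)
mkdir "$tmp/many"
touch "$tmp/many/a" "$tmp/many/b" "$tmp/top"
out=$(find "$tmp" -xdev -not -type d \
    | sed 's|/[^/]*$|/|' \
    | sort | uniq -c | sort -n | tail)
printf '%s\n' "$out"
rm -r "$tmp"
```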
Solution 4
A bit of an old thread, but still interesting, so here are my solutions.
The first uses a few piped commands and finds directories with over 1000 files inside:
find / -type d |awk '{print "echo -n "$0" ---- ; ls -1 "$0" |wc -l "}'|bash |awk -F "----" '{if ($2>1000) print $1}'
The second is simple. It just tries to find directories larger than 4096B. An empty directory normally occupies 4096B on the ext4 filesystem and 6B on xfs:
find / -type d -size +4096c
You can adjust the value, of course, but I believe it should work in most cases as is.
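Before trusting the 4096-byte figure, it is easy to probe what an empty directory actually occupies on the filesystem at hand, since the baseline genuinely differs between ext4, xfs, tmpfs, and others:

```shell
# Probe the baseline size of an empty directory on this filesystem,
# so the -size threshold can be set just above it.
probe=$(mktemp -d)
empty_size=$(stat -c '%s' "$probe")
echo "an empty directory here occupies ${empty_size} bytes"
rmdir "$probe"
```

Note that stat -c '%s' is GNU stat syntax; on BSD and macOS the equivalent is stat -f '%z'.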
Updated on September 18, 2022
Comments
-
Oli over 1 year
So a client of mine got an email from Linode today saying their server was causing Linode's backup service to blow up. Why? Too many files. I laughed and then ran:
# df -ih
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/xvda        2.5M  2.4M   91K   97% /
Crap. 2.4million inodes in use. What the hell has been going on?!
I've looked for the obvious suspects (
/var/{log,cache}
and the directory where all the sites are hosted from) but I'm not finding anything really suspicious. Somewhere on this beast I'm certain there's a directory that contains a couple of million files. For context, one of my busy servers uses 200k inodes and my desktop (an old install with over 4TB of used storage) is only just over a million. There is a problem.
So my question is, how do I find where the problem is? Is there a
du
for inodes? -
Radu Rădeanu almost 11 years
The script gives errors:
awk: line 2: regular expression compile failed (bad class -- [], [^] or [)
[^
awk: line 2: syntax error at or near ]
find: `/tmp/tmpw99dhs': Permission denied
-
h3. almost 11 years@RaduRădeanu Ah, I see, I used a gawk peculiarity that doesn't work in other versions. I've added a backslash which I think is necessary as per POSIX.
-
Max Beikirch about 9 years+1: using the locate database is a very nice idea!
-
Sandeep over 5 yearsWhen you can't use locate for whatever reason, run a
find /path/to/parent -xdev > filelist
first, then direct sed to read input from that list. -
Dipin almost 5 years
2019: raised 10: read: Illegal option -d ... scrubbed the -d flag from read hoping nothing bad will happen. Will let you know when it finishes running ... -
Guasqueño over 2 yearsIf the file system is close to 100% full, you don't want to create another temp file that makes the situation worse. That's why I like Gilles's answer better, as it does not create a new file.