Listing directories based on size from largest to smallest on single line
Solution 1
If you are confident that the directory names do not contain whitespace, then it is simple to get all the directory names on one line:
du -sk [a-z]*/ 2>/dev/null | sort -nr | awk '{printf $2" "}'
Getting the information into Python
To capture that output in a Python program and turn it into a list, with Python 2.7 or later:
import subprocess
dir_list = subprocess.check_output("du -sk [a-z]*/ 2>/dev/null | sort -nr | awk '{printf $2\" \"}'", shell=True).split()
In Python 2.6:
import subprocess
subprocess.Popen("du -sk [a-z]*/ 2>/dev/null | sort -nr | awk '{printf $2\" \"}'", shell=True, stdout=subprocess.PIPE).communicate()[0].split()
We can also take advantage of Python's features to reduce the amount of work done by the shell and, in particular, to eliminate the need for awk:
subprocess.Popen("du -sk [a-z]*/ | sort -nr", shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()[0].split()[1::2]
One could go further and read the du output directly into Python, convert the sizes to integers, and sort on size. It is simpler, though, just to do this with sort -nr in the shell.
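For comparison, here is a minimal sketch of what reading the du output into Python and sorting there could look like. The function names names_by_size and du_names_by_size are made up for this illustration, and the sketch assumes that du separates the size from the name with a tab (as GNU du does) and that no name contains a newline.

```python
import subprocess

def names_by_size(du_output):
    """Return the names found in `du -sk` output, largest first."""
    entries = []
    for line in du_output.splitlines():
        if not line.strip():
            continue
        # Assumption: size and name are separated by a tab. Splitting
        # only once keeps any spaces inside the name intact.
        size, name = line.split("\t", 1)
        entries.append((int(size), name))
    # Sort on the integer size, largest first.
    entries.sort(key=lambda entry: entry[0], reverse=True)
    return [name for size, name in entries]

def du_names_by_size(pattern="[a-z]*/"):
    """Hypothetical wrapper: run du through the shell and sort in Python."""
    out = subprocess.Popen("du -sk " + pattern, shell=True,
                           stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE).communicate()[0]
    return names_by_size(out.decode())
```

Unlike the split()[1::2] approach, splitting each line once on the tab tolerates spaces in directory names, at the cost of a few more lines of Python.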
Specifying a directory
If the directories whose size you want are not in the current directory, there are two possibilities:
du -sk /some/path/[a-z]*/ 2>/dev/null | sort -nr | awk '{printf $2" "}'
and also:
cd /some/path/ && du -sk [a-z]*/ 2>/dev/null | sort -nr | awk '{printf $2" "}'
The difference between these two is whether /some/path is included in the output or not.
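If the output of the first form has already been captured in Python, the leading path can also be removed afterwards. The helper below is only a sketch; strip_parent is a name invented for this example.

```python
import os

def strip_parent(paths):
    """Keep only the last component of each path, e.g. '/some/path/one/' -> 'one'."""
    # du prints the trailing slash that the glob added, so remove it
    # first; otherwise os.path.basename would return an empty string.
    return [os.path.basename(p.rstrip("/")) for p in paths]

# strip_parent(["/some/path/one/", "/some/path/two/"]) returns ["one", "two"]
```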
Solution 2
Using paste
du -sk [a-z]* 2>/dev/null | sort -nr| cut -f2- | paste -s -
Solution 3
zsh has the ability to sort its globs using glob qualifiers. You can also define your own glob qualifiers with functions. For instance:
zdu() REPLY=$(du -s -- "$REPLY")
print -r -- [[:alpha:]]*(/nO+zdu)
would print the directories (/) whose names start with a letter (by the way, [a-z] only makes sense in the C locale), sorted numerically (n) in reverse order (O) using the zdu function.
Note that when you do:
du -s a b
if a and b contain hard links to the same files, their disk usage will be counted for a but not for b. The zsh approach here avoids that.
If you're going to use Python, I'd do the same from there: call du -s for each of the files, and sort that list there. Remember that file names can contain any character, including space, tab and newline.
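A rough sketch of that suggestion, under a few assumptions: du_size and dirs_by_size are names invented here, the size is taken from the first field of the du -sk output, and only subdirectories whose names start with a letter are listed, to mirror the [[:alpha:]]* glob. Because du is called once per directory with the name passed as a single argument (no shell involved), names containing spaces, tabs or newlines are not split.

```python
import os
import subprocess

def du_size(path):
    """Disk usage of one path in kilobytes, as reported by `du -sk`."""
    out = subprocess.Popen(["du", "-sk", "--", path],
                           stdout=subprocess.PIPE).communicate()[0]
    # The size is the first whitespace-delimited field of du's output.
    return int(out.split(None, 1)[0])

def dirs_by_size(parent=".", size_of=du_size):
    """Names of the subdirectories of `parent`, largest first."""
    names = [name for name in os.listdir(parent)
             if name[:1].isalpha()
             and os.path.isdir(os.path.join(parent, name))]
    # Sort in Python on the size measured for each directory separately.
    return sorted(names,
                  key=lambda name: size_of(os.path.join(parent, name)),
                  reverse=True)
```

The size_of parameter is only there so that a different measuring function can be swapped in; calling dirs_by_size("/some/path") uses du.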
etho201
Updated on September 18, 2022
Comments
-
etho201 over 1 year
I can use the following command to get a list of directories and their sizes and sort them from largest to smallest (in the example I renamed the directories to numbers to make this easier to understand).
$: du -sk [a-z]* 2>/dev/null | sort -nr
413096 one
106572 two
97452 three
76428 four
55052 five
45068 six
33680 seven
23220 eight
17716 nine
I'm writing a program that requires these directories as input, from largest to smallest, but for convenience it needs them all on one line. Is there a command that will sort the directories from largest to smallest and print them on one line, without the sizes?
I would like the output to be like this:
one two three four five six seven eight nine
-
etho201 almost 10 years
Better yet... Since I will be pasting the one line into Python and splitting it into a list, is there any way I can just have Python check a specified directory for the directories within, sort them by the size of their contents, and produce a list like this: [one, two, three, four, five, six, seven, eight, nine]?
-
etho201 almost 10 years
Oh, I found a way... I can just use: cd dir && du -sk [a-z]*/ 2>/dev/null | sort -nr | awk '{printf $2" "}'
-
John1024 almost 10 years
@user2554129 Good. I added that and an alternative to the answer.
-
etho201 almost 10 years
AttributeError: 'module' object has no attribute 'check_output' --> I should have mentioned I'm constrained to using Python 2.6. It doesn't look like "check_output" works on 2.6. I read somewhere to use subprocess.Popen() but that doesn't seem to work as expected. Any ideas what I am doing wrong?
-
etho201 almost 10 years
I like the nice clean list that produces. Thank you!
-
etho201 almost 10 years
I like the alternate method, and that it is clean and easy to understand... Is this more efficient than using "du -sk [a-z]* 2>/dev/null | sort -nr | awk '{print $2}' | sed ':a;N;$!ba;s/\n/ /g'"?
-
Neven almost 10 years
No problem, glad to help. Could you please mark this as the answer then? If this is what you are looking for, of course. :)
-
Stéphane Chazelas almost 10 years
@user2554129, yes, and more portable and more reliable.
-
etho201 almost 10 years
I'm really new to Python so I'm having trouble understanding what you're trying to say, but it seems good because it is capable of accurately recognizing file names with spaces. If you had the folders located here:
$: cd $ORACLE_BASE/admin
$: du -sk [a-z]* 2>/dev/null | sort -n
14994 two words
12194 oneword
1692 one
1499 two
1432 this folder
1300 three
1129 four
How would you do what you're suggesting with Python to get a sorted list like this: [two words, oneword, one, two, this folder, three, four]
-
Bernhard almost 10 years
I would use tr instead of paste