What is the fastest way to create a list of directories specified in a file?
Solution 1
With GNU xargs:
xargs -d '\n' mkdir -p -- < foo.txt
xargs will run as few mkdir commands as possible.
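To see the batching in action, here is a small sketch that replaces mkdir with a shell one-liner printing how many arguments each invocation received. The some/dir/ prefix and the count_batches helper are made up for illustration, seq is assumed to be available, and the numbers printed depend on the system's argument-size limits.

```shell
# count_batches N: feed N made-up paths to xargs and print, one per line,
# how many arguments each invocation of the command received ("$#").
count_batches() {
  seq 1 "$1" | sed 's|^|some/dir/|' | xargs sh -c 'echo "$#"' sh
}

# One output line per invocation; far fewer lines than paths.
count_batches 100000
```

The number of output lines is the number of commands xargs started, which is what a real run with mkdir would cost in process creations.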
With standard syntax:
(export LC_ALL=C
sed 's/[[:blank:]"\'\'']/\\&/g' < foo.txt | xargs mkdir -p --)
Where it's not efficient is that mkdir -p a/b/c will attempt some mkdir("a") and possibly stat("a") and chdir("a"), and the same for "a/b", even if "a/b" existed beforehand.
If your foo.txt has:
a
a/b
a/b/c
in that order, that is, if every path is preceded by a line for each of its parent directories, then you can omit the -p and it will be significantly more efficient. Or alternatively:
perl -lne 'mkdir $_ or warn "$_: $!\n"' < foo.txt
Which avoids invoking any mkdir command at all, since perl's built-in mkdir is used instead.
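Whether -p can be dropped depends on the ordering condition described above (every parent listed before its children). A minimal sketch of a check for that, using a hypothetical helper named parents_listed_first; note that it keeps every path seen so far in awk's memory, which may be a problem for a list with millions of lines:

```shell
# parents_listed_first FILE (hypothetical helper)
# Succeeds only if every path in FILE is preceded by a line for its
# immediate parent directory. Reports each offending line otherwise.
parents_listed_first() {
  awk '
    {
      parent = $0
      # Strip the last path component; sub() returns 0 when there is no "/".
      if (sub(/\/[^\/]*$/, "", parent) && parent != "" && !(parent in seen)) {
        printf "line %d: parent of %s not listed earlier\n", NR, $0
        bad = 1
      }
      seen[$0] = 1
    }
    END { exit bad }
  ' "$1"
}

# Made-up sample list in a temporary file.
list=$(mktemp)
printf '%s\n' a a/b a/b/c > "$list"
parents_listed_first "$list" && echo "safe to omit -p"
rm -f "$list"
```

A list such as data/bar/foo with no earlier data and data/bar lines fails this check, so -p would still be needed there.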
Solution 2
I know there will be a lot of answers to this question, but you can still try this:
while IFS= read -r line; do mkdir -p -- "$line"; done < file.txt
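When reading lines in a shell loop like this one, IFS= read -r keeps each line verbatim, whereas a plain read strips leading and trailing blanks and treats backslashes as escape characters. A small sketch of the difference, using a made-up sample line:

```shell
# Made-up line with leading blanks, a backslash and trailing blanks.
sample='  a\b  '

# Plain read: blanks are trimmed and the backslash is consumed.
plain=$(printf '%s\n' "$sample" | { read line; printf '[%s]' "$line"; })

# IFS= read -r: the line is stored exactly as it was read.
verbatim=$(printf '%s\n' "$sample" | { IFS= read -r line; printf '[%s]' "$line"; })

printf '%s\n' "read line:         $plain"
printf '%s\n' "IFS= read -r line: $verbatim"
```

The first form yields [ab] and the second [  a\b  ]. Note that the loop still starts one mkdir process per line, so it remains much slower than the xargs approach.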
Kaizer Sozay
Updated on September 18, 2022
Comments
-
Kaizer Sozay almost 2 years
I have a text file, "foo.txt", that specifies a directory in each line:
data/bar/foo
data/bar/foo/chum
data/bar/chum/foo
...
There could be millions of directories and subdirectories. What is the quickest way to create all the directories in bulk, using a terminal command?
By quickest, I mean quickest to create all the directories. Since there are millions of directories there are many write operations.
I am using Ubuntu 12.04.
EDIT: Keep in mind, the list may not fit in memory, since there are MILLIONS of lines, each representing a directory.
EDIT: My file has 4.5 million lines, each representing a directory, composed of alphanumeric characters, the path separator "/" , and possibly "../"
When I ran
xargs -d '\n' mkdir -p < foo.txt
after a while it kept printing errors until I pressed Ctrl+C:
mkdir: cannot create directory `../myData/data/a/m/e/d': No space left on device
But running
df -h
gives the following output:
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda        48G   20G   28G  42% /
devtmpfs        2.0G  4.0K  2.0G   1% /dev
none            401M  164K  401M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            2.0G     0  2.0G   0% /run/shm
free -m
             total       used       free     shared    buffers     cached
Mem:          4002       3743        258          0       2870         13
-/+ buffers/cache:        859       3143
Swap:          255         26        229
EDIT: df -i
Filesystem      Inodes   IUsed  IFree IUse% Mounted on
/dev/xvda      2872640 1878464 994176   66% /
devtmpfs        512053    1388 510665    1% /dev
none            512347     775 511572    1% /run
none            512347       1 512346    1% /run/lock
none            512347       1 512346    1% /run/shm
df -T
Filesystem     Type     1K-blocks     Used Available Use% Mounted on
/dev/xvda      ext4      49315312 11447636  37350680  24% /
devtmpfs       devtmpfs   2048212        4   2048208   1% /dev
none           tmpfs       409880      164    409716   1% /run
none           tmpfs         5120        0      5120   0% /run/lock
none           tmpfs      2049388        0   2049388   0% /run/shm
EDIT: I increased the number of inodes and reduced the depth of my directories, and it seemed to work. It took 2 minutes 16 seconds this time round.
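The df -i figures above are consistent with that fix: 994176 free inodes on / cannot hold roughly 4.5 million new directories, since each directory needs an inode of its own. A rough sketch of that comparison, assuming GNU df (the -i option is not in POSIX) and assuming the df -i output was taken before the failed run; the helper names free_inodes and inodes_suffice are made up:

```shell
# free_inodes [DIR]: free inodes on the filesystem holding DIR (GNU df).
free_inodes() {
  df -Pi -- "${1:-.}" | awk 'NR == 2 { print $4 }'
}

# inodes_suffice NEEDED FREE: succeed if FREE inodes can hold NEEDED
# new directories (one inode per directory).
inodes_suffice() {
  [ "$1" -le "$2" ]
}

# Figures from the question: about 4.5 million lines, 994176 free inodes.
if inodes_suffice 4500000 994176; then
  echo "enough inodes"
else
  echo "not enough inodes"
fi
```

On a live system the two arguments could come from wc -l < foo.txt and free_inodes, run before creating anything.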
-
Sreeraj over 9 years
Is this a virtual machine? Does the main node have enough space?
-
Kaizer Sozay over 9 years
@Sree It is a Linode VPS. How can I tell if it has enough space? The directory I am running it in is in /home/myuser/, which should have a lot of free space.
-
Sreeraj over 9 years
Yes. You seem to have enough space in all the partitions, and there are free inodes. If it still says you don't have enough space, the hypervisor on which your VPS is located has probably run out of space. You might have to contact your VPS provider to check that.
-
PM 2Ring over 9 years
Is that output from df -i from before or after you tried to run xargs -d '\n' mkdir -p < foo.txt?
-
Stéphane Chazelas over 9 years
What FS type (df -T /)?
-
Kaizer Sozay over 9 years
@StéphaneChazelas updated question.
-
Kaizer Sozay over 9 years
@StéphaneChazelas I ignored the problem, and just increased the size of the disk image so that there are more inodes. I also reduced the depth of the directory structure and it seems to work. So now I could run your command without problem :)
-
Stéphane Chazelas over 9 years
That's running one mkdir per directory and is flawed because of that wrong usage of the split+glob operator. That also means storing that whole huge list in memory.
-
Stéphane Chazelas over 9 years
That's running one mkdir per directory and is flawed because of that wrong usage of read and the split+glob operator.
-
Sreeraj over 9 years
Cool. Going through the man page of xargs now after looking at your comment in the question. Always something new to learn every time I open SE :)
-
Thushi over 9 years
Yes it is, because of the dependency: to create the folder bar we must first have data, and likewise for the others. But I didn't find any flaws in read. Can you execute my command and check it once? I did and it's working for me.
-
Stéphane Chazelas over 9 years
To read a line, it's IFS= read -r line; read line does extra processing. Leaving $line unquoted means invoking the split+glob operator. mkdir can take several arguments.
-
Thushi over 9 years
OK, thank you. I will improve my answer. I just took the above example ;)
-
Thushi over 9 years
What about $i? Unquoted?? :P
-
Sreeraj over 9 years
But how is it holding a huge list in memory, since there is only one iteration variable? Wouldn't it hold only that one variable during each iteration?
-
Stéphane Chazelas over 9 years
@Sree, expanding $(cat...) means reading the output of cat into memory, applying split+glob to it, and iterating over the resulting huge list.
-
cuonglm over 9 years
By "standard syntax", do you mean POSIX?
-
yorkshiredev over 9 years
To be honest, since we don't know the entire list of directories, we cannot assume that this answer is correct. A single space in a name will cause the wrong directory structure to be built.
-
Stéphane Chazelas over 9 years
@John, xargs runs as many instances of the command as needed so as to avoid the limit on the maximum number of arguments. So it will probably invoke many mkdir commands, each of them passed a few thousand directories to create.
-
Kaizer Sozay over 9 years
There are no spaces. The directory paths contain only alphanumeric characters, "../" and the path separator "/".
-
Kaizer Sozay over 9 years
It repeats the error "mkdir: cannot create directory `../myData/data/a/m/e/d': No space left on device" many times for each file. Could there be a bug in your command? My file seems to have only unique entries. Or is this just how the error is displayed?
-
Pryftan almost 6 years
@KaizerSozay I know this is old, but the point is that the file could contain names with spaces; if you're saying that file names can't contain spaces, you're wrong (directory names can too, and a directory is a file in the end). They can also contain newlines, etc.
-
Stéphane Chazelas about 5 years
@KaizerSozay, you're running out of space or inodes; the errors are probably about creating a directory component leading to the files.