Efficient way to create multiple files

Solution 1

The limit is on the total size of the arguments passed when a command is executed. So the options are: execute a command with fewer arguments (for instance, use xargs to run smaller batches), raise the limit (ulimit -s 100000 on Linux), execute nothing at all (do everything in the shell), or build the list in the tool that creates the files.
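To see how far the request is from the limit on a given machine, a quick check along these lines may help. This is only a sketch: getconf ARG_MAX is the POSIX way to query the limit, seq is assumed to be available, and the variable names arg_max and needed are purely illustrative.

```shell
# Upper bound, in bytes, on the combined size of the arguments and
# environment that can be handed to a newly executed program.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX is $arg_max bytes"

# Rough size of the list of bare numbers 1..1391803: each number plus
# one separator byte. The test_N.txt names would need even more.
needed=$(seq 1391803 | wc -c)
echo "the number list alone needs about $needed bytes"
```

If the second figure is larger than the first, a single touch invocation with the whole list cannot work, whatever the shell.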

zsh, ksh93, bash:

printf '%s ' {1..1391803} | xargs touch

printf is a builtin, so there is no exec and the limit is not reached. xargs splits the list of arguments passed to touch so that no single invocation exceeds the limit. That's still not very efficient, as the shell first has to create the whole list (slow, especially with bash), store it in memory, and then print it.

seq 1391803 | xargs touch

(assuming you have a seq command) would be more efficient.
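To get a feel for how xargs splits the list, one can substitute echo for touch and count the output lines, since each echo invocation prints exactly one line. A sketch with a smaller, hypothetical count of 100000 so that it runs quickly; the resulting number depends on the xargs implementation and the system limits:

```shell
# Each echo invocation prints one line, so the line count equals the
# number of batches xargs would hand to touch. Nothing is created on disk.
batches=$(seq 100000 | xargs echo | wc -l)
echo "xargs would run the command $batches time(s)"
```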

for ((i=1; i<=1391803; i++)); do : >> "$i"; done

Everything is done in the shell, no big list stored in memory. Should be relatively efficient except maybe with bash.

POSIXly:

i=1; while [ "$i" -le 1391803 ]; do : >> "$i"; i=$(($i + 1)); done

echo 'for (i=1;i<=1391803;i++) i' | bc | xargs touch

awk 'BEGIN {for (i=1; i<=1391803; i++) {printf "" >> i; close(i)}}'
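Before running any of these on more than a million files, it may be worth trying one on a small count in a scratch directory. The following sketch does that for the POSIX loop; the count of 1000 and the variable names are hypothetical, and mktemp -d is assumed to be available (it is not required by POSIX).

```shell
n=1000                      # small hypothetical count for a trial run
dir=$(mktemp -d)            # scratch directory, removed at the end
(
  cd "$dir" || exit 1
  i=1
  while [ "$i" -le "$n" ]; do
    : >> "$i"               # open for append, then close: creates the file if
                            # missing and leaves any existing content alone
    i=$((i + 1))
  done
)
created=$(ls "$dir" | wc -l)
echo "created $created files in $dir"
rm -rf "$dir"
```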

Solution 2

In your example, Bash complains because expanding test_{1..1391803}.txt produces a command line that is too long. The maximum size of the arguments that can be passed to a command is fixed by the kernel: the exec system call, which is responsible for starting new programs (in fact, for replacing the program of an existing process with another), must put those arguments on the stack of the process, and the size of the stack is limited.
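The difference between a builtin and an executed command can be observed directly. This is a sketch under some assumptions: printf is a builtin in the shell being used (true of bash, zsh, ksh93 and dash), seq is available, and the external command is started through env so that no particular path to true is assumed. On systems with an unusually high limit the second command may succeed as well.

```shell
# Builtin: no new program is executed, so the list is limited only by memory.
if printf '%s ' $(seq 1391803) > /dev/null; then
  builtin_status=ok
else
  builtin_status=failed
fi

# External command: the arguments go through the exec system call,
# which typically rejects a list of this size.
if env true $(seq 1391803) 2> /dev/null; then
  external_status=ok
else
  external_status="failed (probably argument list too long)"
fi

echo "builtin printf: $builtin_status"
echo "external true:  $external_status"
```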

I think the most efficient way to do this would be not to start a new touch process each time you want a file.

You could do it in Ruby, for example:

ruby -e '1.upto(1391803) { |n| File.open("test_#{n}.txt", "w") {} }'

This way, you start only one process which will create all the files without the need to launch the touch program.

This command launches the ruby interpreter. Then ruby builds a loop over the range 1..1391803 and for each number, calls the function File.open which executes the open system call with a file name built with the number. As the block after File.open is empty, the file is immediately closed.

Solution 3

You are being limited by the maximum total size of the arguments that can be passed to touch when it is executed. The best bet would be to use a loop. You don't need Ruby for that, though:

for i in $(seq 1391803); do touch "test_${i}.txt"; done

An alternative approach is to split the range into chunks of, say, 100 numbers and pass one chunk to touch at a time:

n=1391803; i=1; while ((i<=n)); do e=$((i+99)); ((e>n)) && e=$n; touch $(seq "$i" "$e"); i=$((e+1)); done
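The manual chunking in the loop above is what xargs does on its own, so the two ideas can be combined to produce the test_N.txt names from the question. A minimal sketch, assuming seq, sed and mktemp -d are available; the count is reduced to a hypothetical 5000 and a scratch directory is used so that it is safe to try:

```shell
n=5000                      # hypothetical small count; the question uses 1391803
dir=$(mktemp -d)            # scratch directory, removed at the end
(
  cd "$dir" || exit 1
  # Turn each number N into the name test_N.txt, then let xargs pass the
  # names to touch in batches that fit within the argument-size limit.
  seq "$n" | sed 's/.*/test_&.txt/' | xargs touch
)
count=$(ls "$dir" | wc -l)
echo "created $count files"
rm -rf "$dir"
```

The generated names contain no whitespace or quotes, so plain xargs input parsing is safe here; names with such characters would need a different delimiter.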
Author by

Rahul Patil

Updated on September 18, 2022

Comments

  • Rahul Patil
    Rahul Patil over 1 year

    I have been testing how to find the directory that uses the most inodes, and while testing I ran:

    touch test_{1..1391803}.txt
    

    But it gives me the error "-bash: /usr/bin/touch: Argument list too long". I'm now running the command below, but it seems it will take a huge amount of time:

    ruby -e '1.upto(1391803) { |n| %x( touch "test_#{n}.txt" ) }'
    

    So the question is: is there any way to create many files in a short amount of time? Should I touch 100,000 (1 lakh) files per loop, or is there a better way?

    Test Result :

    No. 1

    [root@dc1 inode_test]# time seq 343409 | xargs touch
    
    real    0m7.760s
    user    0m0.525s
    sys     0m4.385s
    

    No. 2

    [root@test-server inode_test]# time echo 'for (i=1;i<=343409;i++) i' | bc | xargs touch
    
    real    0m8.781s
    user    0m0.722s
    sys     0m4.997s
    

    No. 3

    [root@test-server inode_test]# time printf '%s ' {1..343409} | xargs touch
    
    real    0m8.913s
    user    0m1.144s
    sys     0m4.541s
    

    No. 4

    [root@test-server inode_test]# time awk 'BEGIN {for (i=1; i<=343409; i++) {printf "" >> i; close(i)}}'
    
    real    0m12.185s
    user    0m2.005s
    sys     0m6.057s
    

    No. 5

    [root@test-server inode_test]# time ruby -e '1.upto(343409) { |n| File.open("#{n}", "w") {} }'
    
    real    0m12.650s
    user    0m3.017s
    sys     0m4.878s
    
  • Rahul Patil
    Rahul Patil about 10 years
    could you please explain how this works ?
  • lgeorget
    lgeorget about 10 years
    I updated my answer. Don't hesitate to ask if there is any doubt left. :)
  • Stéphane Chazelas
    Stéphane Chazelas about 10 years
    The limit is not with touch but with the execve() system call (on the cumulative size of the arguments and environment variables passed along with that call).
  • user1084563
    user1084563 about 8 years
    +1 because this is by far the fastest, even faster if you echo the commands then pipe to parallel. The accepted solution would have taken over 1 hour to create my 10 million files; this did it in less than 1 minute.