'Argument list too long' error while copying a large number of files

Solution 1

*.jpg expands to a list longer than the shell can handle. Try this instead:

find /home/ftpuser/public_html/ftparea/ -name "*.jpg" -exec cp -uf "{}" /your/destination \;
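
If your find supports terminating -exec with + and your cp supports -t (GNU find and coreutils do, but not every system), a sketch of a batched variant that hands many files to each cp call, which is usually faster than one cp per file:

find /home/ftpuser/public_html/ftparea/ -name "*.jpg" -exec cp -uf -t /your/destination {} +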

Solution 2

There is a maximum limit to how long an argument list can be for system commands. This limit is distro-specific, based on the value of MAX_ARG_PAGES when the kernel is compiled, and it cannot be changed without recompiling the kernel.
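
As a quick sanity check (a sketch assuming a system with getconf available), you can see the limit currently in effect without recompiling anything:

# print the maximum combined size of the argument list and environment, in bytes
getconf ARG_MAX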

Due to the way globbing is handled by the shell, this will affect most system commands when you use the same argument ("*.jpg"). Since the glob is processed by the shell first, and then sent to the command, the command:

cp -uf *.jpg /targetdir/

is essentially the same to the shell as if you wrote:

cp -uf 1.jpg 2.jpg ... n-1.jpg n.jpg /targetdir/

If you're dealing with a lot of jpegs, this can become unmanageable very quickly. Depending on your naming convention and the number of files you actually have to process, you can run the cp command on a different subset of the directory at a time:

cp -uf /sourcedir/[a-m]*.jpg /targetdir/
cp -uf /sourcedir/[n-z]*.jpg /targetdir/

This could work, but exactly how effective it would be depends on how well you can break your file list up into convenient globbable blocks.

Globbable. I like that word.
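
A rough sketch of that splitting idea (assuming bash, and assuming filenames are spread fairly evenly across leading characters so that no single batch still exceeds the limit):

shopt -s nullglob                        # empty globs expand to nothing instead of themselves
for prefix in {a..z} {A..Z} {0..9}; do
  files=( /sourcedir/"$prefix"*.jpg )    # each batch is one prefix's worth of files
  if (( ${#files[@]} > 0 )); then
    cp -uf "${files[@]}" /targetdir/
  fi
done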

Some commands, such as find and xargs, can handle large file lists without making painfully sized argument lists.

find /sourcedir/ -name '*.jpg' -exec cp -uf {} /targetdir/ \;

The -exec argument will run the remainder of the command line once for each file found by find, replacing the {} with each filename found. Since the cp command is only run on one file at a time, the argument list limit is not an issue.

This may be slow due to having to process each file individually. Using xargs could provide a more efficient solution:

find /sourcedir/ -name '*.jpg' -print0 | xargs -0 cp -uf -t /targetdir/

xargs can take the full file list provided by find, and break it down into argument lists of manageable sizes and run cp on each of those sublists.
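
If you want to cap the batch size explicitly, a sketch assuming GNU xargs and GNU cp (the -t option to cp is a GNU extension) that hands at most 1000 filenames to each cp invocation:

find /sourcedir/ -name '*.jpg' -print0 | xargs -0 -n 1000 cp -uf -t /targetdir/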

Of course, there's also the possibility of just recompiling your kernel, setting a larger value for MAX_ARG_PAGES. But recompiling a kernel is more work than I'm willing to explain in this answer.

Solution 3

As GoldPseudo commented, there is a limit to how many arguments you can pass to a process you're spawning. See his answer for a good description of that parameter.

You can avoid the problem by either not passing the process too many arguments or by reducing the number of arguments you're passing.

A for loop in the shell, a find command, and an ls | grep | while-read pipeline all do the same thing in this situation:

for file in /path/to/directory/*.jpg ; 
do
  rm "$file"
done

and

find /path/to/directory/ -name '*.jpg' -exec rm  {} \;

and

ls /path/to/directory/ |
  grep "\.jpg$" |
  while read file
  do
    rm "/path/to/directory/$file"
  done

all have one program that reads the directory (the shell itself, find, and ls) and a different program (rm here) that takes one argument per execution and is run once for every file in the list.

Now, this will be slow because the rm needs to be forked and execed for each file that matches the *.jpg pattern.

This is where xargs comes into play. xargs reads standard input and, for every N lines (on FreeBSD the default is 5000), spawns one program with those N arguments. xargs is an optimization of the above loops because you only need to fork 1/N as many programs to iterate over the whole set of files.
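
For the same *.jpg cleanup, a sketch of the xargs version (using -print0/-0 so filenames with odd characters survive the pipe):

find /path/to/directory/ -name '*.jpg' -print0 | xargs -0 rm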

Solution 4

That happens because your wildcard expression (*.jpg) exceeds the command line argument length limit when expanded (probably because you have lots of .jpg files under /home/ftpuser/public_html/ftparea).

There are several ways for circumventing that limitation, like using find or xargs. Have a look at this article for more details on how to do that.

Solution 5

There is a maximum number of arguments that can be specified to a program; bash expands *.jpg into too many arguments for cp. You can solve it by using find, xargs, rsync, etc.
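
As a sketch of the rsync route (exact flags depend on your rsync version; the paths are the ones from the question), a filter-based copy avoids the shell glob entirely:

# copy only the top-level .jpg files; --exclude='*' keeps everything else out,
# and -u skips files that are already newer at the destination (like cp -u)
rsync -avu --include='*.jpg' --exclude='*' \
  /home/ftpuser1/public_html/ftparea/ /home/ftpuser2/public_html/ftparea/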

Have a look here for more about xargs and find:

https://stackoverflow.com/questions/143171/how-can-i-use-xargs-to-copy-files-that-have-spaces-and-quotes-in-their-names

Comments

  • FriendlyFlashAmateur
    FriendlyFlashAmateur almost 2 years

    I am using the following command:

    \cp -uf /home/ftpuser1/public_html/ftparea/*.jpg /home/ftpuser2/public_html/ftparea/
    

    And I am getting the error:

    -bash: /bin/cp: Argument list too long
    

    I have also tried:

    ls /home/ftpuser1/public_html/ftparea/*.jpg | xargs -I {} cp -uf {} /home/ftpuser2/public_html/ftparea/
    

    Still got -bash: /bin/ls: Argument list too long

Any ideas?

  • FriendlyFlashAmateur
    FriendlyFlashAmateur almost 15 years
    I used find /home/ftpuser1/public_html/ftparea/ -name "*jpg" -exec cp -uf "{}" /home/ftpuser2/public_html/ftparea/ and got the following error find: missing argument to `-exec'
  • FriendlyFlashAmateur
    FriendlyFlashAmateur almost 15 years
    apologies these are two different directories should be ftpuser1 and ftpuser2
  • FriendlyFlashAmateur
    FriendlyFlashAmateur almost 15 years
    Just tried this: ls /home/ftpuser1/public_html/ftparea/*.jpg | xargs -I {} cp -uf {} /home/ftpuser2/public_html/ftparea/ Still got -bash: /bin/ls: Argument list too long
  • FriendlyFlashAmateur
    FriendlyFlashAmateur almost 15 years
    I am getting find: missing argument to `-exec' /home/ftpuser1/public_html/ftparea/ -name '*jpg' -exec cp -uf "{}" /home/ftpuser2/public_html/ftparea/ +
  • Sandokas
    Sandokas almost 15 years
You are missing the last argument of cp, the answerer told you right. Double check your implementation. Note that in this answer the dot in "*.jpg" is missing, which could lead to misbehavior (copying a dir named "myjpg", for example). It may be paranoid, but it's safer to specify closely what you are going to copy using -type f (preventing dirs, symlinks and so on from being affected).
  • FriendlyFlashAmateur
    FriendlyFlashAmateur almost 15 years
    After closer inspection i missed the “\;” to finish the command that -exec should execute. Silly me!
  • Dennis Williamson
    Dennis Williamson almost 15 years
    I rearranged the arguments to cp to fix that error.
  • Shawn Chin
    Shawn Chin almost 15 years
@AlberT: thanks for the heads up re the missing dot. That was a typo. Answer updated.
  • chris
    chris almost 15 years
Find and echo * result in the same output -- the key here is using xargs, not just passing all 1 billion command line arguments to the command the shell's trying to fork.
  • chris
    chris almost 15 years
    I have no idea why this was down-voted. It's the only answer that seems to be explaining why this is happening. Maybe because you didn't suggest using xargs as an optimization?
  • Greg Hewgill
    Greg Hewgill almost 15 years
    Oh, you're quite right, of course ls will have the same problem! I've changed to find which won't.
  • goldPseudo
    goldPseudo almost 15 years
added in the xargs solution, but i'm still worried the downvotes are because of something blatantly wrong in my details and no one wants to tell me what it is. :(
  • William Pursell
    William Pursell almost 15 years
    echo * will fail if there are too many files, but find will succeed. Also, using find -exec with + is equivalent to using xargs. (Not all find support +, though)
  • JasonK
    JasonK over 14 years
    +1 for the good external resource on subject.
  • Jan Vlcinsky
    Jan Vlcinsky over 10 years
xargs seems to be much more efficient, as the resulting number of command calls is much smaller. In my case, I see 6-12 times better performance when using xargs than when using the -exec solution, and the efficiency grows with the number of files.