Is there a way to move many files quickly in Python?
Solution 1
What platform are you on? And does it really have to be Python? If not, you can simply use system tools like mv (*nix) or move (Windows).
$ stat -c "%s" file
382849574
$ time python -c 'import shutil;shutil.move("file","/tmp")'
real 0m29.698s
user 0m0.349s
sys 0m1.862s
$ time mv file /tmp
real 0m29.149s
user 0m0.011s
sys 0m1.607s
$ time python -c 'import shutil;shutil.move("file","/tmp")'
real 0m30.349s
user 0m0.349s
sys 0m2.015s
$ time mv file /tmp
real 0m28.292s
user 0m0.015s
sys 0m1.702s
$ cat test.py
#!/usr/bin/env python
import shutil
shutil.move("file","/tmp")
shutil.move("/tmp/file",".")
$ cat test.sh
#!/bin/bash
mv file /tmp
mv /tmp/file .
$ time python test.py
real 1m1.175s
user 0m0.641s
sys 0m4.110s
$ time bash test.sh
real 1m1.040s
user 0m0.026s
sys 0m3.242s
$ time python test.py
real 1m3.348s
user 0m0.659s
sys 0m4.024s
$ time bash test.sh
real 1m1.740s
user 0m0.017s
sys 0m3.276s
Solution 2
Edit:
In my own state of confusion (which JoshD helpfully remedied), I forgot that shutil.move accepts directories, so you can (and should) just use that to move your directory as a batch.
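For illustration, here is a minimal sketch of moving a whole directory with a single shutil.move call. The directory and file names (photos, raw, a.ORF and so on) are made up for the demo, and it runs inside a temporary directory so it does not touch real files.

```python
import os
import shutil
import tempfile

# Build a throwaway source directory with a few files (hypothetical names).
base = tempfile.mkdtemp()
src_dir = os.path.join(base, "photos")
dst_parent = os.path.join(base, "raw")
os.mkdir(src_dir)
os.mkdir(dst_parent)
for name in ("a.ORF", "b.ORF", "c.ORF"):
    with open(os.path.join(src_dir, name), "w") as f:
        f.write("data")

# One call moves the directory and everything in it.
# Because dst_parent already exists, the result is dst_parent/photos.
moved_to = shutil.move(src_dir, dst_parent)

print(moved_to)
print(sorted(os.listdir(moved_to)))
```

Note that this only fits when everything in the directory should move. The script in the question moves only the *.ORF files out of directories shared with JPEGs, so it would still need a per-file loop.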
Solution 3
If you just want to move the directory, you can use shutil.move. It'll be pretty freakin' quick (if it's on the same filesystem) because it's just a rename operation.
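If you want to check up front whether a move is likely to be a cheap rename, one rough approach is to compare the device IDs reported by os.stat. This is only a sketch: same_filesystem is a hypothetical helper name, not part of shutil, and the paths are throwaway ones created for the demo.

```python
import os
import tempfile

def same_filesystem(src, dst_dir):
    """Return True if src and dst_dir report the same device ID."""
    return os.stat(src).st_dev == os.stat(dst_dir).st_dev

# Demo with a file and a destination directory under one temp dir.
base = tempfile.mkdtemp()
src = os.path.join(base, "file")
dst_dir = os.path.join(base, "dest")
os.mkdir(dst_dir)
with open(src, "w") as f:
    f.write("data")

print(same_filesystem(src, dst_dir))
```

When the check comes back False, expect the move to take as long as copying the data.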
allyourcode
Updated on June 19, 2022
Comments
-
allyourcode almost 2 years
I have a little script that moves files around in my photo collection, but it runs a bit slow.
I think it's because I'm doing one file move at a time. I'm guessing I can speed this up if I do all file moves from one dir to another at the same time. Is there a way to do that?
If that's not the reason for my slowness, how else can I speed this up?
Update:
I don't think my problem is being understood. Perhaps, listing my source code will help explain:
# ORF is the file extension of the files I want to move;
# These files live in dirs shared by JPEG files,
# which I do not want to move.

import os
import re
from glob import glob
import shutil

DIGITAL_NEGATIVES_DIR = ...
DATE_PATTERN = re.compile('\d{4}-\d\d-\d\d')

# Move a single ORF.
def move_orf(src):
    dir, fn = os.path.split(src)
    shutil.move(src, os.path.join('raw', dir))

# Move all ORFs in a single directory.
def move_orfs_from_dir(src):
    orfs = glob(os.path.join(src, '*.ORF'))
    if not orfs:
        return
    os.mkdir(os.path.join('raw', src))
    print 'Moving %3d ORF files from %s to raw dir.' % (len(orfs), src)
    for orf in orfs:
        move_orf(orf)

# Scan for dirs that contain ORFs that need to be moved, and move them.
def main():
    os.chdir(DIGITAL_NEGATIVES_DIR)
    src_dirs = filter(DATE_PATTERN.match, os.listdir(os.curdir))
    for dir in src_dirs:
        move_orfs_from_dir(dir)

if __name__ == '__main__':
    main()
-
JoshD over 13 years
I think he wants to move rather than copy... maybe. In that case a simple move is much faster than a copy then delete.
-
Srikar Appalaraju over 13 years
Technically 'copy' should be faster than 'move', since move does 'copy+delete'. That's one way to speed up your program.
-
JoshD over 13 years
@movieyoda: I take it you've not moved 20GB directories then copied the same 20GB directory, have you? Move (on the same disk) is simply a rename.
-
Glenn Maynard over 13 years
There's no particular reason that would be any faster than doing it in Python; it's generally going to be I/O-bound.
-
ghostdog74 over 13 years
In all the tests on my Linux box, using system mv is faster than Python shutil.move.
-
AndiDog over 13 years
On the same filesystem, to be exact.
-
AndiDog over 13 years
Oh and by the way, shutil.move does try: os.rename(...) except OSError: ...copy and delete... automatically, so there's no reason for using os.rename in 99% of the cases.
-
JoshD over 13 years
@AndiDog: Thanks for clarifying those details. I'll update the answer with the more accurate information.
-
JoshD over 13 years
@user131527: It sounds like he has a script that's locating particular files and moving them. In that case (since he's already in Python) shutil.move(stuff) is cleaner & safer to write than os.system('mv stuff'). Once you're already running the Python interpreter, the difference is moot since shutil.move just calls the system's move.
-
ghostdog74 over 13 years
That's why I ask whether using Python is a definite must, right? If not, use the shell's mv command instead of Python.
-
allyourcode almost 13 years
I'm sure this can be done in shell (I assume it's Turing complete), but I see no reason why this should be slow in Python.