What's faster? Moving files from one drive to another, or moving files in the same drive?

11,682

Solution 1

Moving a file inside the same partition (or same file-system) won't really move anything.

All it does is delete the file's entry in the file-table and create another one. The file itself is not physically moved, so the operation is almost instantaneous, no matter the size of the file.
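
One way to observe this is to rename a file between two directories of the same filesystem and compare its inode number before and after. The sketch below is only an illustration: the helper name `move_and_compare_inode` and the file names are made up for this example, and it assumes the temporary directory lives on a single filesystem where `st_ino` is meaningful.

```python
import os
import tempfile


def move_and_compare_inode(size_bytes=1_000_000):
    """Move a file between two directories of one filesystem.

    Returns (inode_unchanged, source_still_exists).
    """
    with tempfile.TemporaryDirectory() as root:
        src_dir = os.path.join(root, "a")
        dst_dir = os.path.join(root, "b")
        os.mkdir(src_dir)
        os.mkdir(dst_dir)

        src = os.path.join(src_dir, "data.bin")
        dst = os.path.join(dst_dir, "data.bin")
        with open(src, "wb") as f:
            f.write(b"\0" * size_bytes)

        inode_before = os.stat(src).st_ino
        # Within one filesystem, a rename only changes directory entries;
        # the file's data blocks are not rewritten.
        os.rename(src, dst)
        inode_after = os.stat(dst).st_ino

        return inode_before == inode_after, os.path.exists(src)
```

If the inode number is unchanged, the file was relinked under a new name rather than copied.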

Solution 2

It depends on a bit more than just the physical hardware layout. In general, there are four cases to consider:

  1. Moving a file within a single filesystem (IOW, within a single ‘drive’ by the Windows definition of the term ‘drive’).
  2. Copying a file within a single filesystem.
  3. Moving or copying a file between two filesystems that are on the same physical storage device.
  4. Moving or copying a file between two filesystems that are on separate physical storage devices.
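
To tell the first two cases from the last two programmatically, one rough heuristic is to compare the device identifiers that the OS reports for the source and the destination. This is a sketch, not a definitive test: `same_filesystem` is a hypothetical helper, and `st_dev` can differ within what a user thinks of as one filesystem (for example across subvolumes on some filesystems).

```python
import os


def same_filesystem(src_path, dst_dir):
    """Heuristic: True if both paths report the same device ID.

    When this is True, a move can usually be done as a plain rename
    (case 1). When it is False, the data has to be copied (case 3 or 4).
    """
    return os.stat(src_path).st_dev == os.stat(dst_dir).st_dev
```

Python's own `shutil.move` takes a similar approach in practice: it tries `os.rename` first and falls back to copy-then-delete when the rename fails.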

In general, the first case is always going to be the fastest, because short of some atypical situations, it just amounts to updating some of the filesystem metadata. The only two exceptions you would likely ever encounter are dealing with in-line data transformations (such as the in-line compression supported by NTFS) where the source and destination have different rules for such transformations, and dealing with certain networked filesystems (such as older versions of NFS), with both cases becoming equivalent to the third case.

The speed of the second case depends on the filesystem involved. If it supports reflinks (like ZFS and BTRFS do), then it can be just as fast as the first case (because it essentially becomes the first case). If it does not, then it will generally be equivalent to the third case instead.
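
As a rough illustration of a reflink copy, the sketch below asks Linux to clone a file with the `FICLONE` ioctl and falls back to an ordinary copy when the filesystem (or platform) does not support it. The function name `copy_with_reflink` is hypothetical, the ioctl number is the Linux value and does not apply to other operating systems, and whether cloning succeeds depends entirely on the filesystem in use.

```python
import shutil

try:
    import fcntl  # not available on Windows
except ImportError:
    fcntl = None

FICLONE = 0x40049409  # Linux ioctl request: share src's extents with dst


def copy_with_reflink(src, dst):
    """Copy src to dst, preferring a reflink clone.

    Returns True if the clone succeeded, False if a normal copy was used.
    """
    if fcntl is not None:
        with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
            try:
                # On a reflink-capable filesystem this only writes metadata.
                fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
                return True
            except OSError:
                pass  # unsupported here: fall through to a real copy
    shutil.copyfile(src, dst)  # reads and rewrites every byte
    return False
```

When the function returns `False`, the copy behaved like the third case: all the data was read and written again.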

The third case will usually be the slowest case, because the system has to read the data from the device, store it temporarily in RAM, and then write it back out to the device somewhere else. Some storage protocols may support ‘device-side copy’ functionality (some SCSI devices support this for example, as do most intelligent networked filesystem protocols), in which case this can potentially be rather fast, though usually still slower than the first case.
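
The read, buffer, write cycle described above can be sketched as a simple chunked copy loop. This is a minimal illustration, not how any particular OS implements copying; `copy_through_ram` and the 1 MiB chunk size are arbitrary choices for the example.

```python
def copy_through_ram(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst one chunk at a time; return the bytes copied."""
    total = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)  # device -> RAM
            if not chunk:
                break
            fout.write(chunk)             # RAM -> device
            total += len(chunk)
    return total
```

Every byte crosses the storage interface twice, which is why this case cannot compete with a metadata-only move.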

The fourth case is where things get really interesting, because its performance depends almost entirely on the specifics of the hardware setup of the physical storage devices. Some easy examples of this:

  • In a classic PATA setup with both storage devices on the same cable, the performance is actually marginally worse than the third case. This is because you have a single data path shared by both devices, and on top of the read/write cycle you would normally have for the same device, you end up with some extra overhead just for managing the two devices at the same time.
  • In a relatively standard SATA setup with both storage devices on the same AHCI SATA controller, performance will be significantly better than the third case, but still nowhere near peak device bandwidth. This is largely due to limitations in the AHCI spec that make it at best challenging to handle multiple devices simultaneously on a single controller (the implications are not bad enough to make it a problem for consumer usage, but are part of the reason that SCSI still reigns supreme in enterprise usage).
  • With a typical enterprise SAS setup, the performance will be relatively close to the peak bandwidth of the slower of the two devices, provided it’s the only thing running at the time. SAS is quite simply exponentially more efficient than SATA.
  • With a pair of very nice NVMe devices, just the right hardware layout on the mainboard, and proper support in the OS, 99% of the transfer can actually run at peak bandwidth of the slower of the two devices. This setup is hard to put together, but allows you to leverage a little-used feature of PCI-e that allows two devices to transfer data directly without needing to bother the host.

Solution 3

tldr: Moving from one drive to another drive can be faster, especially with rotational storage!

What really matters isn't if the move is staying within the same drive. What matters is if the move is staying within the same filesystem.

This is more correct than referring to partitions, as some comments do. (Take for example an LVM partition with two LVs in it. Moving between them carries the same penalty as moving between two partitions.)

This pedantry is rather important: Drive, Partition, Filesystem, are not interchangeable despite most people taking that for granted.

There is no spoon:

Moves aren't what you think they are, inside a filesystem. In fact, the notion of physical location, inside a filesystem, is false. The whole hierarchical structure is largely a facade put on to help us humans visualize things. You should think of all files within a filesystem as randomly numbered objects in a database table. One of the fields in that table is the random number, another is the data itself, another the permissions, and one is the full file path.

Moving a file then is just as simple as updating the tiny field that says the path to that file. And that's what it is in most filesystems.
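
The database-table picture can be written down as a toy model. This is only an illustration of the analogy in this answer, not of any real on-disk format; `FileRecord`, `move`, and the example ID are invented for the sketch.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    data: bytes  # the file contents, which never move
    mode: int    # permissions
    path: str    # the only field a move has to touch


def move(table, file_id, new_path):
    """Move a file by rewriting its path field; the data is untouched."""
    table[file_id].path = new_path


# A "filesystem" with one file, keyed by an arbitrary number.
table = {4711: FileRecord(data=b"x" * 1000, mode=0o644, path="/a/report.txt")}
move(table, 4711, "/b/report.txt")
```

The cost of `move` does not depend on the size of `data`, which is the point of the analogy.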

"Compressed Dirs" and "Encrypted Dirs" add more I/O to a move. As could BTRFS subvolumes, and some other trick features. But generally this is how a move works in most filesystems.

Once you move a file from one filesystem to another filesystem, however, the entirety of that file's contents has to be read and written to disk. And if you have two disks to share that I/O load, it will be much faster, especially considering all the head movement in a rotational drive.
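
A back-of-the-envelope model makes the difference concrete. The sketch below assumes a drive that cannot read and write at the same time, so a same-device transfer pays for the read and the write one after the other, while two devices can overlap them. It ignores seek time, caches, and filesystem overhead, so treat the numbers as a best-case bound rather than a measurement; `estimate_seconds` is a hypothetical helper, the 2 GB size comes from the question, and the 200 MB/s figure from the comments.

```python
def estimate_seconds(size_mb, read_mb_s, write_mb_s, same_device):
    """Idealized transfer time in seconds, ignoring seeks and caching."""
    if same_device:
        # One device does both jobs, so the times add up.
        return size_mb / read_mb_s + size_mb / write_mb_s
    # Two devices work in parallel; the slower one sets the pace.
    return size_mb / min(read_mb_s, write_mb_s)


one_disk = estimate_seconds(2000, 200, 200, same_device=True)    # 20.0
two_disks = estimate_seconds(2000, 200, 200, same_device=False)  # 10.0
```

On a rotational drive the real same-device figure is worse than this estimate, because the heads must seek back and forth between the source and destination areas.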

Author by

Santiago

Updated on September 18, 2022

Comments

  • Santiago
    Santiago over 1 year

    Let's suppose I have a 2GB file that I want to move, and I have two hard drives that are exactly the same. Would it be faster to move the file from one drive to another, or would it be faster to move the file inside the same drive?

    I'm asking this because I suppose it will be faster transferring from one drive to another even though they are exactly the same, as you are performing a read operation on one drive and a write operation on another, instead of making a read/write operation on the same drive. Can someone answer this, and, if I'm correct, explain to me why more technically?

    Thanks!

    • Moab
      Moab almost 3 years
      Moving is always faster on the same drive, copying is much slower on the same drive.
    • Jeff Zeitlin
      Jeff Zeitlin almost 3 years
      Moving on the same drive rarely, if ever, actually moves the file data; usually, it just moves the entry in the directory.
    • Thomas Weller
      Thomas Weller almost 3 years
      @Moab: moving from one drive letter to another drive letter will involve copying, even if it's the same physical drive.
    • Michael
      Michael almost 3 years
      @JeffZeitlin: Only if it’s the same volume/partition.
    • Jeff Zeitlin
      Jeff Zeitlin almost 3 years
      @Michael - Correct, if pedantic; most Windows users equate "drive" with "volume assigned a drive letter", even if there are multiple volumes/partitions on the same physical device.
    • phuclv
      phuclv almost 3 years
      @ThomasWeller Windows allows you to mount any volume to any folder in NTFS just like Unix. So just stop using "drive letter" and use "volume" or "partition" instead
    • Billy left SE for Codidact
      Billy left SE for Codidact almost 3 years
      @Moab Always? Make two partitions on one magnetic drive, and format to your preferences. move a file from one to the other....... Moving a file WITHIN THE SAME FILESYSTEM is fast because you don't move much. Moving to and from the same drive incurs an io contention penalty that would not exist from one drive to another.
    • Panzercrisis
      Panzercrisis almost 3 years
      @Moab Out of curiosity, why would copying be much slower on the same drive? Is that just when the computer's already pushed for resources, and having to do everything itself pushes it further, or are you referring to something else?
    • Moab
      Moab almost 3 years
      @Panzercrisis Copy requires read and write on the same drive partition, Move only modifies the Master File Table.
    • FeRD
      FeRD almost 3 years
      @Panzercrisis When copying you have to read into a memory buffer, then write that buffer back out to the destination drive. If that destination drive is the same physical device, you're going to encounter some degree of interface contention mixing reads and writes. Less if it's an SSD (on a spinning magnetic disk the seeks for every read/write context switch will destroy performance), but still more than you'd get with one device reading data at max interface bandwidth, and a different one just writing continuously.
    • Austin Hemmelgarn
      Austin Hemmelgarn almost 3 years
      @Moab Copying does not always require actually reading and writing all the file data, some filesystems allow copy-on-write references to blocks, which if used correctly makes a copy no more expensive than a move. Of course, Windows has no options for this, but almost all modern UNIX-like systems (including macOS) support at least one filesystem that provides such functionality.
  • agtoever
    agtoever almost 3 years
    I think this is true, except in some edge cases, such as moving a file on a ntfs formatted drive where one directory is compressed and/or encrypted and the other one isn’t. See for example here.
  • iBug
    iBug almost 3 years
    Suggested change: "the same partition" → "the same volume" since they are actually different.
  • manassehkatz-Moving 2 Codidact
    manassehkatz-Moving 2 Codidact almost 3 years
    Even more interesting: Moving an entire directory tree with thousands of files is (when done right, OS dependent) just one entry. Moving (copy/delete) thousands of files between drives takes a while, even if the files are tiny.
  • Thomas Weller
    Thomas Weller almost 3 years
    @pipe: in order to understand, you should know under which circumstances things will work the way you understood it and that there still might be circumstances when the answer might not apply. Sometimes you might think you have a drive and you're moving from that drive to the same drive, but in fact it's a WebDAV mapped network share or whatever and it suddenly behaves differently. This will cause more confusion than a correct statement right away.
  • Tom Yan
    Tom Yan almost 3 years
    @iBug the more universal way to put it is probably "the same filesystem"
  • Ross Presser
    Ross Presser almost 3 years
    Back in the late 80s, we had a group of DOS machines with 3.5" floppies and network cards (10Mbps) only. Copying/moving files from one network share to another -- by definition, this required using the host and its RAM -- was faster than copying to a floppy.
  • Austin Hemmelgarn
    Austin Hemmelgarn almost 3 years
    @RossPresser Yes, at that time I would have been surprised if a network share was slower than a floppy disk. 10Mbps, even factoring in protocol overhead, is blazing fast compared to the roughly 240 kbps transfer speeds that a 3.5 inch floppy drive could do. It’s possible in modern times to similarly have faster copies between network shares than local disks, but it’s atypical outside of an enterprise setting (because your home file server’s storage stack is almost certainly no faster than what you have on the client system).
  • jaskij
    jaskij almost 3 years
    With asset streaming being introduced in games, with hardware support in GPUs, one would hope that P2P DMA in PCIe would have relatively fast improvements in support, both in Windows and in NVMe drives. So what was little-used a few years back, has come to the fore.
  • Matija Nalis
    Matija Nalis almost 3 years
    Also note that the 3rd case will be much slower than the 4th case when we're talking about rotating media such as HDDs (as opposed to SSDs), due to having to seek all the time, killing performance.
  • Carsten S
    Carsten S almost 3 years
    -1 for misuse of “exponentially”. Just kidding, great, thorough answer.
  • Barmar
    Barmar almost 3 years
    @ThomasWeller Perhaps the best way to express this is that moving within the same filesystem can never be worse than moving between filesystems, and in most cases is much better.
  • pabouk - Ukraine stay strong
    pabouk - Ukraine stay strong almost 3 years
    Case 4 - two PATA drives on a single cable will normally be faster than case 3. Normal PATA drives use a write-back cache (receive the sector content and write it to the disk later) and read-ahead (read the following sectors ahead so their data will then be available immediately). --- Unfortunately I do not know if classic ATA (without TCQ) allows sending a command, freeing the bus and finishing the command later. This would further help optimize operations of two drives on a single bus.
  • Peter Cordes
    Peter Cordes almost 3 years
    "the file table"? What is this, FAT? In a standard Unix-like FS, the inode stays the same, but a reference to it is removed from the original directory, and a similar name->inode mapping is added to the new directory. (And the "change time" is updated in the file's inode, but the inode is still the same one holding the file's metadata (permissions / ownership, other timestamps, ref-count, and a pointer to the actual data.)) Moving to another directory is just like making a hard-link, except you unlink the original. Perhaps when you say "file table", you meant "directory"?
  • Peter Cordes
    Peter Cordes almost 3 years
    With modern rotational drives having ~200MB/s read/write bandwidth, even fully serialized competition for SATA-III ~600MB/s would keep up with the disks, only needing a total of 400MB/s of IO bandwidth. (With device caches allowing burst transfers). Maybe you meant with SSDs that are bottlenecked by the SATA interface?
  • Peter Cordes
    Peter Cordes almost 3 years
    (Similarly, with one NVMe and one SATA, or two NVMe, DRAM bandwidth should be sufficient to bounce the data through host memory and still achieve full speed for most of the copy. It adds some latency to the pipeline so the last write doesn't make it to the dst as early. Although you do end up with the data in pagecache in RAM, so if it's not too huge you could copy it again without having to re-read it from anywhere.)
  • harrymc
    harrymc almost 3 years
    @PeterCordes: File-table is used here in a generic manner, meaning a disk table that keeps the information about the file, no matter the file-system format.
  • Peter Cordes
    Peter Cordes almost 3 years
    @harrymc: In that case, that's not how it works in normal mainstream FSes. The same file-table entry stays allocated, only references to it (from different directories) are modified.
  • Austin Hemmelgarn
    Austin Hemmelgarn almost 3 years
    @PeterCordes The issue with two SATA devices isn’t the SATA link, it’s the controller itself. The AHCI spec has a number of performance issues (the really big one is that it’s not concurrent-access safe, meaning that on all modern x86 systems you need locking or some other form of synchronization between all the CPU cores to safely access it), and as a result inter-device transfers on a single AHCI controller run at measurably less than the theoretical peak bandwidth (though the effect is more pronounced with fast SSDs than traditional hard drives).
  • Peter Cordes
    Peter Cordes almost 3 years
    Sure, I understood your point as being that AHCI makes it hard / impossible to saturate the SATA links on multiple devices. But if you don't need to saturate the links to avoid having the device buffers empty or fill, then wouldn't device-side caches for read-prefetch / write-behind be able to absorb hiccups in transfers to/from the host and mostly sustain sequential transfers at the media bandwidth? Correct me if I'm wrong, but you'd only lose real throughput for a 2-device xfer when either the link is the bottleneck (so losing any xfer time hurts), or device buffers can't hide the bubbles.
  • Austin Hemmelgarn
    Austin Hemmelgarn almost 3 years
    @PeterCordes In general yes, that’s correct. And if you’re running a true bulk sequential transfer (like cloning a disk), you generally won’t have any issues with AHCI and SATA hard drives. Copying files is pretty consistently not like that though, unless you’re only copying a few very big files. There are a lot of associated metadata reads and writes, (hopefully) proper write barriers, and the files are generally not going to all be in one place in the right order, so you have latency in multiple places affecting your throughput simultaneously, which makes the AHCI overhead worse.
  • Peter Cordes
    Peter Cordes almost 3 years
    Ah yes, I was thinking of moving a single file, not many small to medium-sized files.
  • John B. Lambe
    John B. Lambe almost 3 years
    @iBug No, that would make the answer inaccurate. Moving files between partitions of the same volume would require reading and writing the file contents (unlike a move within the same filesystem), and would generally be no faster (and is likely to be slower) than moving/copying between volumes.
  • iBug
    iBug almost 3 years
    @JohnB.Lambe I'm afraid we aren't consistent about the concept of a "volume". For most common cases, a volume is simply a partition on a disk. Some uncommon examples include LVM logical volumes (which is technically equivalent to a partition) or Windows dynamic disks. More complex cases may be ZFS or CephFS where it's less of a "partition".
  • M. Y. Zuo
    M. Y. Zuo almost 3 years
    I think it'd be wise to preface that by excluding so called 'fusion drives' and the like that use hybrid systems, for example of SSD and hard disk.
  • gronostaj
    gronostaj almost 3 years
    Please read other answers before posting your own. That will not only let you avoid repeating what others have already said, but may also clear up some misconceptions of yours.
  • CHP
    CHP almost 3 years
    Well, but I wanted to point this way in another direction to help others.
  • briantist
    briantist almost 3 years
    This only applies to Windows XP; the very first sentence of that question says that Windows 7 and up are not affected: "When XP clients move files on the same volume, the permissions are moved with it. With Windows 7 clients and up, when a file is moved, the permissions are inherited."