Using DD for disk cloning

Solution 1

dd is most certainly the best cloning tool: it creates a 100% replica with the single command below. I've never once had any problems with it.

dd if=/dev/sda of=/dev/sdb bs=32M

Be aware that, because dd copies every byte, you should not use it on a drive or partition that is in use. Applications such as databases cope especially badly with this, and you might end up with corrupted data.
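
One way to check that a clone really is byte-identical is to compare source and target afterwards. The following is only a sketch: the scratch files source.img and target.img are made-up stand-ins for /dev/sda and /dev/sdb so that it can be run safely without root. With real devices you would substitute the device paths, and if the target is larger than the source you would have to limit the comparison to the source's size (GNU cmp has -n for that).

```shell
#!/bin/sh
# Scratch files stand in for the source and target disks (hypothetical names).
workdir=$(mktemp -d)
src="$workdir/source.img"
dst="$workdir/target.img"

# Create 4 MiB of random data to play the role of the source disk.
dd if=/dev/urandom of="$src" bs=1M count=4 2>/dev/null

# Clone it the same way the answer does for real devices.
dd if="$src" of="$dst" bs=1M 2>/dev/null

# cmp -s is silent and exits with status 0 only if both are byte-for-byte identical.
if cmp -s "$src" "$dst"; then
    clone_ok=yes
else
    clone_ok=no
fi
echo "clone identical: $clone_ok"

rm -rf "$workdir"
```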

Solution 2

To save space, you can compress data produced by dd with gzip, e.g.:

dd if=/dev/hdb | gzip -c  > /image.img.gz

You can restore your disk with:

gunzip -c /image.img.gz | dd of=/dev/hdb
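
Before trusting a compressed image, it may be worth trying the round trip on something harmless. This sketch assumes a scratch file (disk.img, a made-up name) in place of /dev/hdb; it images, compresses, restores, and then compares the result with the original. The scratch "disk" is all zeros, so it compresses far better than a used disk would.

```shell
#!/bin/sh
# A scratch file plays the role of /dev/hdb (hypothetical names throughout).
workdir=$(mktemp -d)
disk="$workdir/disk.img"
restored="$workdir/restored.img"

# 8 MiB of zeros: the best case for gzip.
dd if=/dev/zero of="$disk" bs=1M count=8 2>/dev/null

# Image and compress in one pipeline, as in the answer.
dd if="$disk" bs=1M 2>/dev/null | gzip -c > "$workdir/image.img.gz"

# Restore from the compressed image.
gunzip -c "$workdir/image.img.gz" | dd of="$restored" bs=1M 2>/dev/null

# Compare sizes and contents; tr strips the padding some wc versions print.
raw_bytes=$(wc -c < "$disk" | tr -d ' ')
compressed_bytes=$(wc -c < "$workdir/image.img.gz" | tr -d ' ')
if cmp -s "$disk" "$restored"; then
    roundtrip_ok=yes
else
    roundtrip_ok=no
fi
echo "raw: $raw_bytes bytes, compressed: $compressed_bytes bytes, identical: $roundtrip_ok"

rm -rf "$workdir"
```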

To save even more space, defragment the drive/partition you wish to clone beforehand (if appropriate), then zero-out all the remaining unused space, making it easier for gzip to compress:

mkdir /mnt/hdb
mount /dev/hdb /mnt/hdb
dd if=/dev/zero of=/mnt/hdb/zero

Wait a bit, dd will eventually fail with a "disk full" message, then:

rm /mnt/hdb/zero
umount /mnt/hdb
dd if=/dev/hdb | gzip -c  > /image.img.gz

Also, you can get a dd process running in the background to report status by sending it a signal with the kill command, e.g.:

dd if=/dev/hdb of=/image.img &
kill -SIGUSR1 1234

Check your system: the command above is for Linux. The dd commands on OSX and BSD differ in the signals they accept (OSX uses SIGINFO, which you can also send by pressing Ctrl+T to report the status).
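
Where it is available, the status=progress operand (GNU coreutils 8.24 or later, as noted in the comments) avoids signals altogether. The sketch below probes for it first and falls back to a plain copy; the scratch file names are invented for the example and the probe is just one possible way of testing for support.

```shell
#!/bin/sh
# Scratch files stand in for the disk and the image (hypothetical names).
workdir=$(mktemp -d)
src="$workdir/source.img"
dst="$workdir/copy.img"
dd if=/dev/zero of="$src" bs=1M count=4 2>/dev/null

# Probe whether this dd understands status=progress by copying nothing at all.
if dd if=/dev/null of=/dev/null status=progress 2>/dev/null; then
    progress_opt="status=progress"
else
    progress_opt=""
fi

# $progress_opt is deliberately unquoted so that it disappears when empty.
dd if="$src" of="$dst" bs=1M $progress_opt

copied_bytes=$(wc -c < "$dst" | tr -d ' ')
echo "copied $copied_bytes bytes"

rm -rf "$workdir"
```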

Solution 3

CAUTION: dd'ing a live filesystem can produce a corrupt copy. The reason is simple: dd has no understanding of the filesystem activity that may be going on, and makes no attempt to mitigate it. If a write is partially underway, you will get a partial write. This is usually not good for things, and generally fatal for databases. Moreover, if you mix up the typo-prone if and of parameters, woe unto you. In most cases, rsync is an equally effective tool written after the advent of multitasking, and will provide consistent views of individual files.

However, dd should accurately capture the bit state of an unmounted drive: bootloaders, LVM volumes, partition UUIDs and labels, and so on. Just make sure that you have a drive capable of holding a bit-for-bit mirror of the drive you are copying.

Solution 4

When using dd to clone a disk which may contain bad sectors, use conv=noerror,sync to ensure that it doesn't stop when it encounters an error, and fills in the missing sector(s) with null bytes. This is usually the first step I take if trying to recover from a failed or failing disk - get a copy before doing any recovery attempts, and then do recovery on the good (cloned) disk. I leave it to the recovery tool to cope with any blank sectors that couldn't be copied.
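
A failing disk cannot be simulated safely, so the sketch below only illustrates the padding half of conv=noerror,sync: any input block that comes up short is filled with null bytes up to the block size. On a real disk an unreadable block is replaced in the same way, which is why the comments suggest a small block size such as 4096 for recovery work. The file names are made up for the example.

```shell
#!/bin/sh
# A 5000-byte scratch file: one full 4096-byte block plus a short one.
workdir=$(mktemp -d)
src="$workdir/source.img"
dst="$workdir/padded.img"
dd if=/dev/urandom of="$src" bs=5000 count=1 2>/dev/null

# noerror: keep going after read errors; sync: pad short blocks with NULs.
dd if="$src" of="$dst" bs=4096 conv=noerror,sync 2>/dev/null

# The output is rounded up to a whole number of blocks (2 x 4096 bytes).
out_bytes=$(wc -c < "$dst" | tr -d ' ')

# The original data is intact at the start of the padded copy.
if head -c 5000 "$dst" | cmp -s - "$src"; then
    prefix_ok=yes
else
    prefix_ok=no
fi
echo "output size: $out_bytes bytes, original data preserved: $prefix_ok"

rm -rf "$workdir"
```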

Also, you may find dd's speed can be affected by the bs (block size) setting. I usually try bs=32768, but you might like to test it on your own systems to see what works the fastest for you. (This assumes that you don't need to use a specific block size for another reason, e.g. if you're writing to a tape.)
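
A rough way to compare block sizes is to run the same read with several values of bs and look at the statistics dd prints. This is only a sketch: it reads a scratch file (an invented name) that will largely be served from the page cache, so the numbers say little about a real disk. To get meaningful figures you would point if= at the actual device and limit the amount read with count.

```shell
#!/bin/sh
# A 16 MiB scratch file stands in for the device being read (hypothetical name).
workdir=$(mktemp -d)
src="$workdir/source.img"
dd if=/dev/zero of="$src" bs=1M count=16 2>/dev/null

runs=0
for bs in 512 4096 32768 1048576; do
    # dd writes its statistics to stderr; the last line is the summary.
    stats=$(dd if="$src" of=/dev/null bs="$bs" 2>&1 | tail -n 1)
    echo "bs=$bs: $stats"
    runs=$((runs + 1))
done

rm -rf "$workdir"
```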

Solution 5

To clone a disk, all you really need to do is specify the input and output to dd:

dd if=/dev/hdb of=/image.img

Of course, make sure that you have proper permissions to read directly from /dev/hdb (I'd recommend running as root), and that /dev/hdb isn't mounted (you don't want to copy while the disk is being changed - mounting as read-only is also acceptable). Once complete, image.img will be a byte-for-byte clone of the entire disk.

There are a few drawbacks to using dd to clone disks. First, dd will copy your entire disk, even empty space, and if done on a large disk can result in an extremely large image file. Second, dd provides absolutely no progress indications, which can be frustrating because the copy takes a long time. Third, if you copy this image to other drives (again, using dd), they must be as large or larger than the original disk, yet you won't be able to use any additional space you may have on the target disk until you resize your partitions.

You can also do a direct disk-to-disk copy:

dd if=/dev/hdb of=/dev/hdc

but you're still subject to the above limitations regarding free space.
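
Since the target has to be at least as large as the source, it can be worth checking sizes before starting. The helper functions size_in_bytes and fits below are hypothetical names written for this sketch, not part of dd. The block device branch assumes Linux, where blockdev --getsize64 from util-linux reports the size; the demonstration itself uses scratch files so it can run without root.

```shell
#!/bin/sh
# Print the size in bytes of a block device or a regular file.
size_in_bytes() {
    if [ -b "$1" ]; then
        blockdev --getsize64 "$1"      # Linux only; needs read access to the device
    else
        wc -c < "$1" | tr -d ' '
    fi
}

# Succeed if the target ($2) is at least as large as the source ($1).
fits() {
    [ "$(size_in_bytes "$2")" -ge "$(size_in_bytes "$1")" ]
}

# Demonstration with a 1 MiB and a 2 MiB scratch file.
workdir=$(mktemp -d)
dd if=/dev/zero of="$workdir/small.img" bs=1024 count=1024 2>/dev/null
dd if=/dev/zero of="$workdir/big.img" bs=1024 count=2048 2>/dev/null

if fits "$workdir/small.img" "$workdir/big.img"; then small_to_big=ok; else small_to_big=too_small; fi
if fits "$workdir/big.img" "$workdir/small.img"; then big_to_small=ok; else big_to_small=too_small; fi
echo "small -> big: $small_to_big"
echo "big -> small: $big_to_small"

rm -rf "$workdir"
```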

As far as issues or gotchas go, dd, for the most part, does an excellent job. However, a while ago I had a hard drive that was about to die, so I used dd to try to copy what information I could off it before it died completely. I then learned that dd doesn't handle read errors very well: there were several sectors on the disk that dd couldn't read, causing dd to give up and stop the copy. At the time I couldn't find a way to tell dd to continue despite encountering a read error (though it appears that it does have such a setting), so I spent quite a bit of time manually specifying skip and seek to hop over the unreadable sections.
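
For reference, hopping over a bad region with skip and seek looks roughly like the sketch below. It uses a scratch file and simply pretends that block 3 is unreadable. The block numbers and file names are assumptions for the example; on a real disk they would come from the position at which dd stopped. skip moves the read position in the input, seek moves the write position in the output by the same amount, and conv=notrunc keeps the part already copied.

```shell
#!/bin/sh
# Eight 4096-byte blocks of random data stand in for the failing disk.
workdir=$(mktemp -d)
src="$workdir/source.img"
dst="$workdir/rescued.img"
dd if=/dev/urandom of="$src" bs=4096 count=8 2>/dev/null

# Copy blocks 0-2, i.e. everything before the (pretend) bad block 3.
dd if="$src" of="$dst" bs=4096 count=3 2>/dev/null

# Resume at block 4 in both input and output, leaving a gap where block 3 was.
dd if="$src" of="$dst" bs=4096 skip=4 seek=4 conv=notrunc 2>/dev/null

# Blocks before and after the gap should match the source.
head_src=$(dd if="$src" bs=4096 count=3 2>/dev/null | cksum)
head_dst=$(dd if="$dst" bs=4096 count=3 2>/dev/null | cksum)
tail_src=$(dd if="$src" bs=4096 skip=4 2>/dev/null | cksum)
tail_dst=$(dd if="$dst" bs=4096 skip=4 2>/dev/null | cksum)

# The skipped block reads back as zeros; count its non-zero bytes.
gap_nonzero=$(dd if="$dst" bs=4096 skip=3 count=1 2>/dev/null | tr -d '\000' | wc -c | tr -d ' ')
total_bytes=$(wc -c < "$dst" | tr -d ' ')
echo "size: $total_bytes bytes, non-zero bytes in gap: $gap_nonzero"

rm -rf "$workdir"
```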

I spent some time researching solutions to this problem (after I had completed the task) and I found a program called ddrescue, which, according to the site, operates like dd but continues reading even if it encounters an error. I've never actually used the program, but it's worth considering, especially if the disk you're copying from is old, which can have bad sectors even if the system appears fine.

Author by

rclanan

Updated on September 17, 2022

Comments

  • rclanan
    rclanan almost 2 years

    There have been a number of questions regarding disk cloning tools, and dd has been suggested at least once. I've already considered using dd myself, mainly because of its ease of use and because it's readily available on pretty much all bootable Linux distributions.

    What is the best way to use dd for cloning a disk? I did a quick Google search, and the first result was an apparent failed attempt. Is there anything I need to do after using dd, i.e. is there anything that CAN'T be read using dd?

    • Kyle Cronin
      Kyle Cronin about 15 years
      Looks like you got the Spolsky Bump: joelonsoftware.com/items/2009/05/29.html
    • warren
      warren almost 15 years
      didn't see this on here when I asked (and answered) a similar question on superuser - superuser.com/questions/11453/…
    • Sam Watkins
      Sam Watkins over 9 years
      It's ironic that Joel linked to the question as a good example of server-fault, although none of the answers were good. There was not one answer among 25 (excluding comments) with the right dd options for skipping bad blocks - which is essential when cloning disks for recovery. I added a better answer, which can clone disks having bad blocks: dd if=/dev/sda of=/dev/sdb bs=4096 conv=sync,noerror
    • Marco
      Marco about 7 years
      I think dd restore might "fail" if talking about drive geometry dependent file systems and restore is done on non identical hard drives? I experienced some failures on dd restore, and I think this was the problem in my case.
    • Display Name
      Display Name about 6 years
      This method will also save the unallocated part of a disk, if any. If you don't need it, you can tell it where to stop (count parameter for dd, or pipe the data through head -c … for anything else; but I'm not sure how to find the exact number of bytes yet, or at least a good upper bound)
    • Display Name
      Display Name about 6 years
      Also note that most of the time, the dd program is unnecessary. You can use head or cat for reading from a block device, and write to a regular file using a redirect. And you can write to a block device with tee. Then you won't have to guess a good block size, and it's likely going to be faster too. And if the source disk is in bad condition, then dd isn't a really good option either; use ddrescue to save data from damaged disks instead.
  • Eddie
    Eddie about 15 years
    Of course, as long as /dev/sdb is at least as large as /dev/sda...
  • Tim Williscroft
    Tim Williscroft about 15 years
    add a "bs=100M conv=notrunc" and it's much faster in my experience.
  • Alnitak
    Alnitak about 15 years
    @Eddie - and of course the partition table will be copied too, so if sdb is larger you'll have unused space at the end.
  • Svish
    Svish about 15 years
    @Tim, What does conv=notrunc do exactly?
  • Emmanuel BERNAT
    Emmanuel BERNAT about 15 years
    notrunc means (according to the man page) : do not trunc the output file. I don't understand how it can be faster. You can follow the progression of the operation with : # dd if=/dev/sda of=/dev/sdb & pid=$! # kill -USR1 $pid; sleep 1; kill $pid
  • Alberto
    Alberto about 15 years
    Play with bs a little when you start. I've encountered systems where any bs > 4k slowed the process down for some reason. If both drives are internal (ide/sata) it's probably not an issue, but if there's a network share or a USB disk involved, take care on the block size.
  • bandi
    bandi about 15 years
    just be very careful with the 'i' and 'o' letters...
  • iSee
    iSee about 15 years
    If you're cloning the whole hard disk, you're also cloning the boot loader.
  • Paul de Vrieze
    Paul de Vrieze about 15 years
    You can actually also use a read-only mount. A filesystem can be remounted with: mount -o remount,ro /path/to/device
  • shylent
    shylent about 15 years
    Oh yes. I've used dd a great many times and it always gives me shivers, when I think just how will that feel when I'll eventually get those "i" and "o" wrong..
  • Kyle Cronin
    Kyle Cronin about 15 years
    Good point, I added a note in my answer about that.
  • sleske
    sleske about 15 years
    I used ddrescue to scrape data off a dying hard drive, and can confirm that it's awesome.
  • Steve Schnepp
    Steve Schnepp about 15 years
    Does this also work with "modern" fs such as BTRFS, NILFS, [whatever you can dream of] ?
  • andyhammar
    andyhammar about 15 years
    DD works on block devices, a level of abstraction lower than the file system, so it should, yes. I haven't actually tried it, though. Hmm, NILFS looks interesting, I'll have to take a look at that.
  • andyhammar
    andyhammar about 15 years
    Sorry, just checked out NILFS' homepage and realised what you might have meant - can you use DD to copy a snapshot from a NILFS filesystem? I don't know, but it'd be interesting to find out.
  • LiraNuna
    LiraNuna almost 15 years
    You can always use 'sync' to sync the file system to the hdd before running dd.
  • Deleted
    Deleted almost 15 years
    I suspect that sync is not the answer to file corruption problems. What happens if a daemon or something writes more files after the sync, during the dd operation?
  • Alex Bolotov
    Alex Bolotov almost 15 years
    It's a good idea to umount the drive first (or remount as read-only) but it's not always possible
  • jldugger
    jldugger almost 15 years
    In which case, you use rsync and let it do file handle magic to get a consistent file and let Copy On Write semantics handle the incoming writes.
  • davr
    davr over 14 years
    I'm not sure what having multiple CPUs/cores has to do with using rsync to copy files?
  • davr
    davr over 14 years
    This is not quite correct. The 'remote_machine' command is missing something, such as > disk_backup.img or |dd of=/dev/sdb or something else, depending on what you want to do. I'm guessing you don't want to dump a disk image to stdout.
  • davr
    davr over 14 years
    If you have a disk with bad sectors, you really should be using 'ddrescue' instead of dd. It's much more efficient, and has a much better chance of recovering more data. (Don't get it confused with dd_rescue, which is not as good)
  • davr
    davr over 14 years
    You already told the way to overcome the third drawback...resize the partitions. Enlarging a partition is generally a safe and fast operation (versus shrinking or moving, which is slow and more dangerous since it's moving data around).
  • jldugger
    jldugger over 14 years
    Apologies, I intended to refer to the concept of multitasking.
  • mistiry
    mistiry over 13 years
    Nobody seems to know this trick... dd is an asymmetrical copying program, meaning it will read first, then write, then back. You can pipe dd to itself and force it to perform the copy symmetrically, like this: dd if=/dev/sda | dd of=/dev/sdb. In my tests, running the command without the pipe gave me a throughput of ~112kb/s. With the pipe, I got ~235kb/s. I've never experienced any issues with this method. Good luck!
  • Michal Bernhard
    Michal Bernhard about 13 years
    ...dd provides absolutely no progress indications... - well this is not true - there is kinda tricky way how to show progress - you have to find out pid of dd process ('ps -a | grep dd') and then send signal USR1 to this process - 'kill -USR1 <dd_pid_here>'(without <>) which force dd to show progress information.
  • trijezdci
    trijezdci almost 13 years
    OS X doesn't have watch available and -USR1 kills dd. The following command works though: while [ true ]; do killall -INFO dd; sleep 30; done
  • sourcenouveau
    sourcenouveau almost 13 years
    +1 Piping through gzip can save a lot of time and bandwidth!
  • Gauthier
    Gauthier over 12 years
    "several sectors on the disk that dd couldn't read": I think that conv=sync,noerror would help.
  • Tozz
    Tozz over 12 years
    gzipping will not work with a disk that has been used for some time, as it will be filled with either current or deleted data. gzip will only work if the empty space is zero'ed, which is only the case with a brand new disk.
  • bbqchickenrobot
    bbqchickenrobot over 12 years
    Well, just a thought, but couldn't you just use gparted to resize the partition/disk being copied down to whatever is used, then run dd? Assuming it's a one-time image, it should mitigate this issue.
  • Hecter
    Hecter about 12 years
    Added bs=32M, to save future readers the agony of cloning a disk with the obsolete default block size.
  • psusi
    psusi about 12 years
    @Mistiry, that's not the meaning of the word symmetric.
  • Confusion
    Confusion about 12 years
    Upvoted for the trick to fill remaining space with zeroes. Smart!
  • Jesse
    Jesse almost 12 years
    I should also note that adding 'bs=1M' to the dd command will usually greatly improve speed.
  • kaoD
    kaoD over 11 years
    @MichalBernhard I just registered to upvote that comment. SO awesome.
  • stuartc
    stuartc over 11 years
    +1 for the kill -SIGUSR1 %1, and the OSX dd command happily accepts SIGUSR1... super useful, thanks!
  • John K. N.
    John K. N. over 11 years
    +1 for Kill -SIGUSR1 1234 I was looking for that.
  • Rag
    Rag over 11 years
    @bandi Time seems to slow to a crawl in the second before you hit enter..
  • SiXoS
    SiXoS about 11 years
    I'd like to add that running dd on a mounted filesystem WILL NOT CORRUPT the files on the mounted filesystem; what is meant here is that the copy of the filesystem will not necessarily be in a known good state.
  • SiXoS
    SiXoS about 11 years
    And throw in gzip on both ends to further minimize the sent data.
  • Marco
    Marco about 11 years
    What is the advantage of using dd compared to cat /dev/sda > /dev/sdb or pv /dev/sda > /dev/sdb?
  • Robbie Mckennie
    Robbie Mckennie almost 11 years
    While useful, i doubt its practicality for a beginner.
  • Robbie Mckennie
    Robbie Mckennie almost 11 years
    I don't think this is very practical for a novice, they may be better served with the pv command.
  • Ben
    Ben almost 11 years
    @Marco - dd allows you to specify a block size (stackoverflow.com/questions/150697/is-dd-better-than-cat)
  • rclanan
    rclanan almost 11 years
    Not really relevant to the question, but it's a neat shell trick using a sub-shell and high (higher than stderr) file descriptors to convey data out of it, +1
  • Edward Groenendaal
    Edward Groenendaal almost 11 years
    I was referring to this page myself for different dd options when cloning disks, so it seemed a suitable place at the time to put the end result of what I used for cloning, especially since I thought it was rather neat myself :)
  • dlyk1988
    dlyk1988 almost 11 years
    @bandi I have scripted dd so that before executing it displays a prompt saying "Are you absolutely sure about if and of?". It requires to type "yes" in order to continue. Of course this is after I got bummed out of 3Tb of perfectly good data.
  • trijezdci
    trijezdci over 10 years
    For status info on OS X, use kill -SIGINFO <dd_pid_here> since -USR1 on OS X will kill dd. Thanks for the tip @MichalBernhard!
  • daniele
    daniele almost 10 years
    If you're restoring a partition, don't forget to run partprobe -s, or the OS won't be informed about changes to the partition table.
  • Citizen Kepler
    Citizen Kepler almost 10 years
    I found that you can also send a SIGINFO using CTRL-T in dd. It's easier than the while loop, and the OSX 10.6 CD I have does not have killall. Learned this from en.wikipedia.org/wiki/Unix_signal#Sending_signals
  • Sam Watkins
    Sam Watkins over 9 years
    It should be conv=sync,noerror The sync option is needed or else blocks with errors will be removed rather than copied as zeros.
  • Sam Watkins
    Sam Watkins over 9 years
    it's very important to use the options conv=sync,noerror so that it skips bad blocks properly... unless you happen to be certain that your disk has no bad blocks, which is not true in general.
  • Michael Hampton
    Michael Hampton over 9 years
    Actually it was mentioned before, in one answer and at least two comments.
  • Sam Watkins
    Sam Watkins over 9 years
    The conv=sync,noerror options are essential, they allow dd to skip bad blocks and zero them out in the image so things are aligned correctly. Props to the very few people who commented something about that.
  • Sam Watkins
    Sam Watkins over 9 years
    should not use a large block size if attempting to skip bad blocks, or it will skip too much. 4096 is large enough.
  • Sam Watkins
    Sam Watkins over 9 years
    @Michael the answer you linked does not contain a full example command, and has other faults - it's not a good idea to use large block size with conv=sync,noerror as it will skip too much data for each bad block. These options are essential for "recovery cloning", and it's no good having to search the comments for them. The most popular answer is adequate if the disks have no bad blocks, e.g. for cloning a pristine disk, but not for recovery.
  • Sam Watkins
    Sam Watkins over 9 years
    @Michael the example I gave is a command I have used several times for professional disk recovery. While there are other tools which might do a slightly better job, the example I gave is better at disk recovery cloning than every other dd example here, while also being suitable for cloning an error-free disk. Therefore, I consider my answer to be the best here on how to "use DD for disk cloning". I did not add info on monitoring progress, compression, etc., because I wanted to keep it simple, and focus on providing a short answer which gets the basics right.
  • Martin Geisler
    Martin Geisler over 9 years
    Using rsync will ensure that the internal data in the destination filesystem is consistent. It will not ensure that the data in the files is consistent — to do that, you would need to lock the files and any programs that write to the files would need to respect these locks.
  • endolith
    endolith over 9 years
    GNU ddrescue provides progress indicator without any special options, and you can stop the copy and resume where you left off.
  • Mike Causer
    Mike Causer over 8 years
    Should it be: dd if=/dev/hdb | gzip -c > /image.img.gz ?
  • Yuji
    Yuji about 8 years
    This, I needed this! SD card was previously used for video capture and was full of crap, compression didn't help at all.
  • Johann
    Johann almost 7 years
    Some help choosing between dd_rescue and ddrescue: askubuntu.com/a/211579/50450
  • James
    James almost 7 years
    A less tricky way to get progress with dd is to add the option status=progress
  • haliphax
    haliphax almost 7 years
    You could also use status=progress to have it report its progress on stderr while it's cloning.
  • BonsaiOak
    BonsaiOak over 6 years
    This is not an answer.
  • drzymala
    drzymala over 4 years
    Or add a status=progress argument to the dd invocation.
  • drzymala
    drzymala over 4 years
    @Mistiry, try again with a block size bigger than the default 512 bytes. It will be even faster and no pipe needed. Add this argument: bs=1M. I tried the pipe method and it did not change the transfer rate.
  • mug896
    mug896 about 4 years
    echo $! > /tmp/pid
  • Cadoiz
    Cadoiz over 3 years
    Pay attention: Using that gzip in between (even with --fast) ramped my cpu to 100% and slowed down transmission speed by a factor of at least 7. For real HDD imaging, it could be worth considering to use at least a parallel implementation: superuser.com/questions/400634/fastest-gzip-utility
  • Cadoiz
    Cadoiz over 3 years
    Consider this for discussion of the fastest bs settings: serverfault.com/questions/147935/…
  • Cadoiz
    Cadoiz over 3 years
    @michael you can also change the dd command to directly obtain the pid: dd if=/dev/zero of=/dev/null& pid=$!.
  • Cadoiz
    Cadoiz over 3 years
    @james The option status=progress is not supported everywhere, but a good call if so.
  • fuero
    fuero about 3 years
    Nowadays, there's status=progress (GNU Coreutils version 8.24 or above)
  • Joe
    Joe over 2 years
    Add status=progress. This parameter will show you where dd is at in the cloning process. Like so: dd if=/dev/sda of=/dev/sdb bs=32M status=progress