Accidentally removed a libvirt image file, can I recreate it?

Since you haven't shut down the VM, the process using that image file still has it open, so the file hasn't actually been deleted yet; only its directory entry is gone. As long as the process keeps running, you should be able to recover it.

For this answer I have a kvm image called testdelete. The VM is up, but I have deleted the file.

First you need to find the process using the file. The easiest way is with lsof.

# lsof | grep /var/lib/libvirt/images/testdelete.img
qemu-kvm  29627      qemu    9u      REG                9,0  2147483648     399357 /var/lib/libvirt/images/testdelete.img (deleted)

This tells me it's process 29627 and file descriptor 9. Let's look at it:

# cd /proc/29627/fd
# ls -l 9
lrwx------ 1 qemu qemu 64 Jul 21 18:13 9 -> /var/lib/libvirt/images/testdelete.img (deleted)

OK, good. That matches. Now let's recover it! You need a disk with enough free space to hold the whole image.
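If lsof isn't installed, the same lookup can probably be done straight from /proc. This is only a sketch: the function name deleted_fds is made up for illustration, and it assumes a Linux /proc where the link target of a deleted file ends in " (deleted)", as in the ls output above.

```shell
# deleted_fds PID: print "fd<TAB>target" for every descriptor of PID
# whose target has been deleted. Hypothetical helper, not part of any tool.
# Run as root to inspect processes owned by other users (e.g. qemu).
deleted_fds() {
    pid=$1
    for fd in /proc/"$pid"/fd/*; do
        # readlink prints where the descriptor points; skip unreadable entries
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *' (deleted)') printf '%s\t%s\n' "${fd##*/}" "$target" ;;
        esac
    done
}
```

For the example in this answer, running deleted_fds 29627 should list descriptor 9 alongside the image path.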

Ideally your VM should be as quiescent as possible; because we're copying the raw disk image we do run a risk of corruption if some processes are writing to the disk. We can try to minimise this risk by sending a STOP signal.

# kill -STOP 29627

This effectively "freezes" the process. The backup we're now taking would be the equivalent of what happens after a hard crash; on reboot the OS will fsck (or equivalent) to recover.
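To see what STOP does without touching a real VM, here is a small sketch that uses a sleep process as a stand-in for qemu. The helper name proc_state is made up for illustration, and it assumes a Linux /proc/PID/status with a "State:" line. It also shows kill -CONT, which resumes the process if you decide to keep it running.

```shell
# proc_state PID: print the one-letter process state from /proc,
# e.g. S (sleeping) or T (stopped). Hypothetical helper.
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status"
}

sleep 60 &                 # harmless stand-in for the qemu process
pid=$!

kill -STOP "$pid"          # freeze it
sleep 1                    # give the kernel a moment to act on the signal
stopped_state=$(proc_state "$pid")

kill -CONT "$pid"          # let it run again
sleep 1
resumed_state=$(proc_state "$pid")

kill "$pid"                # clean up the stand-in
echo "while stopped: $stopped_state, after CONT: $resumed_state"
```

A frozen process keeps its file descriptors open, which is why the deleted image stays readable while the copy runs.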

Now we can copy the data:

# dd if=9 of=/home/sweh/recovered.img bs=1M
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 5.74931 s, 374 MB/s

That looks perfect; the disk image was 2 GiB (2147483648 bytes) and that's what it copied.
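While the process is still stopped, the copy can also be compared byte for byte against the open descriptor. A minimal sketch, with a made-up function name verify_copy; it assumes the source is still readable through /proc and that nothing is writing to it during the comparison.

```shell
# verify_copy SRC DEST: succeed only if sizes and contents match.
# Hypothetical helper; SRC may be a /proc/PID/fd/N path.
verify_copy() {
    src=$1
    dest=$2
    # Reading via redirection follows the /proc link to the deleted file
    src_size=$(wc -c < "$src")
    dest_size=$(wc -c < "$dest")
    if [ "$src_size" -ne "$dest_size" ]; then
        echo "size mismatch: $src_size vs $dest_size" >&2
        return 1
    fi
    if ! cmp -s "$src" "$dest"; then
        echo "content mismatch" >&2
        return 1
    fi
    echo "ok: $dest_size bytes match"
}
```

In this example that would be something like verify_copy /proc/29627/fd/9 /home/sweh/recovered.img.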

Does this image look good?

# cd /home/sweh
# sfdisk -l recovered.img 
Disk recovered.img: cannot get geometry

Disk recovered.img: 261 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start     End   #cyls    #blocks   Id  System
recovered.img1          0+     65-     66-    524288   82  Linux swap / Solaris
recovered.img2   *     65+    261-    196-   1571840   83  Linux
recovered.img3          0       -       0          0    0  Empty
recovered.img4          0       -       0          0    0  Empty

Yup, that looks like my partition table. At this point you can do other tests to verify the image looks good.
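One cheap extra test, assuming the image uses an MBR partition table like mine does: the last two bytes of the first sector should be 55 aa. The function name has_mbr_signature is made up for this sketch, and a matching signature only shows the first sector looks sane, not that the filesystems inside are intact.

```shell
# has_mbr_signature IMAGE: succeed if bytes 510-511 of IMAGE are 55 aa.
# Hypothetical helper; only meaningful for MBR-partitioned raw images.
has_mbr_signature() {
    sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "55aa" ]
}
```

Usage would be along the lines of: has_mbr_signature recovered.img && echo "boot signature present".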

And that's it! You have recovered your image file.

NOTE: In this example I'm going to kill the existing qemu process. That step is irrevocable: once the process exits, the last open reference to the deleted file is gone and its disk space is freed. If you want to do some "parallel run" testing first, you can create a new image file and virsh define a new VM to use that.

Let's get the VM restarted with this. Destroy the old VM, copy the datafile into place and restart it.

# virsh destroy testdelete
# cp -v recovered.img /var/lib/libvirt/images/testdelete.img
`recovered.img' -> `/var/lib/libvirt/images/testdelete.img'
# virsh start testdelete
Domain testdelete started

Can we connect to the console?

# virsh console testdelete
Connected to domain testdelete
Escape character is ^]

CentOS release 6.8 (Final)
Kernel 2.6.32-642.3.1.el6.x86_64 on an x86_64

dhcp226.spuddy.org login: 

Recovery complete :-)

Author by

PolkaRon

Updated on September 18, 2022

Comments

  • PolkaRon
    PolkaRon over 1 year

    I accidentally removed the wrong image file in my /var/lib/libvirt/images directory. I'm not sure how to recreate one or to undo my removal. Any hints?

    • thrig
      thrig almost 8 years
      Absent time travel, now might be a good time to look into backup options, though that can be tricky with big binary files that may have open filesystems within them.
    • Stephen Harris
      Stephen Harris almost 8 years
      Is the VM that was backed by that image still running? (If so, do not shut it down).
    • PolkaRon
      PolkaRon almost 8 years
      Yeah, I am not shutting it down. I want to be able to get it to export its image file while it is on
    • ctrl-alt-delor
      ctrl-alt-delor almost 8 years
      /var is where you put stuff that should not be backed up. Therefore I assume that it can be regenerated, or is in the wrong place.
    • Peter Cordes
      Peter Cordes almost 8 years
      Related: stackoverflow.com/questions/4171713/…. But that's talking about a file that's being appended only, not random-access.
  • roaima
    roaima almost 8 years
    You might want to include kill -STOP in the process so that the recovered image is from a paused guest rather than a running one. (The filesystem will have less chance of corruption through change while being copied).
  • roaima
    roaima almost 8 years
    Welcome to U&L. Part of the attraction of this site is that we try to provide useful solutions rather than just "maybe this might work" answers.
  • Stephen Harris
    Stephen Harris almost 8 years
    Good point; I meant to add a step about quiescence, but didn't think of using SIGSTOP. That's nice. I've updated the answer with that hint. Thanks!
  • Stephen Harris
    Stephen Harris almost 8 years
    Well, a data saver anyway ;-) Glad to help!
  • ctrl-alt-delor
    ctrl-alt-delor almost 8 years
    should be possible, in theory, to create a hard-link on the original file-system. Then no additional storage is needed, and no worries about corruption. (not sure in practice)
  • Peter Cordes
    Peter Cordes almost 8 years
    @StephenHarris: Last I looked, it seems to be intentional that you can't link open file descriptors back into the filesystem. linkat(2) will let you link a tmp file into the filesystem if it never had any links (i.e. opened with open("/some/dir", O_TMPFILE|..., 0666)), so it's possible but denied on purpose for security reasons. Interesting idea with debugfs. You might be able to use it on an ext4 without replaying the journal...
  • Peter Cordes
    Peter Cordes almost 8 years
    Also BTW, you could use dd conv=sparse to save disk space for the output. Or use cp --sparse=always. You could run fstrim inside the VM to issue discards for unused blocks of the disk image (which may result in the file having holes punched in it, depending on the VM host etc). This will make unused parts of it read as zero.
  • Alessio
    Alessio almost 8 years
    +1, very nice answer. On the VM, I'd stop any services that might write to files (especially binary files, e.g. mysql) and then run sync. I'd also try to make a tar.gz or rsync backup of the filesystems to another machine (the host if nothing else is available). On the host, I'd use virsh suspend rather than kill -STOP (and virsh resume to restart it).
  • Stephen Harris
    Stephen Harris almost 8 years
    Yup, virsh suspend may work as well. I just tested; it leaves the image file open so can be used on my machine, but I'm not sure that'll always be true in every version. At least kill -STOP should always work regardless of libvirt version.
  • Alessio
    Alessio almost 8 years
    virsh suspend will always leave the VM running but suspended. It probably uses SIGSTOP to do it. The advantage is that you don't have to look up the PID yourself and there may be other stuff that needs to be done to safely suspend a VM that the signal alone won't do (I'd have to look at the libvirt source to be sure, and it's too late for that right now).
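The sparse-copy suggestion from the comments above could look roughly like this. It is a sketch only: sparse_copy is a made-up wrapper name, and --sparse=always assumes GNU cp. Runs of zero bytes in the source become holes in the copy, so a mostly empty image takes less space on the recovery disk while reading back identically.

```shell
# sparse_copy SRC DEST: copy SRC, turning runs of zero bytes into holes.
# Hypothetical wrapper; --sparse=always is a GNU coreutils cp option.
sparse_copy() {
    cp --sparse=always "$1" "$2"
}

# To compare apparent size with space actually allocated (GNU du):
#   du -h --apparent-size DEST
#   du -h DEST
```

For the image in the answer this would be something like sparse_copy /proc/29627/fd/9 /home/sweh/recovered.img, in place of the dd step.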