How do I convert a Linux disk image into a sparse file?


Solution 1

First of all, a file only ends up sparse if you seek over a region instead of writing to it; writing zeroes allocates real blocks.

To make this clearer, the example from Wikipedia

dd if=/dev/zero of=sparse-file bs=1k count=0 seek=5120

does not write any zeroes: it opens the output file, seeks (jumps over) 5 MiB, and then writes zero zeroes (i.e. nothing at all). This command (not from Wikipedia)

dd if=/dev/zero of=sparse-file bs=1k count=5120

will write 5 MiB of zeroes and will not create a sparse file!

As a consequence, a file that is already non-sparse will not magically become sparse later.
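
The difference is easy to check by comparing each file's apparent size with the space actually allocated for it. The following is a minimal sketch, assuming GNU coreutils (for `du --apparent-size`) and a filesystem that supports sparse files; the scratch directory and the name `dense-file` are made up for illustration.

```shell
# Work in a throwaway directory (hypothetical; any writable directory will do).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Seek 5 MiB forward and write nothing: this creates a sparse file.
dd if=/dev/zero of=sparse-file bs=1k count=0 seek=5120 2>/dev/null

# Write 5 MiB of real zeroes: this creates a fully allocated file.
dd if=/dev/zero of=dense-file bs=1k count=5120 2>/dev/null

# Both files report the same apparent size (in KiB)...
du -k --apparent-size sparse-file dense-file

# ...but with sparse-file support, sparse-file should occupy far less space.
du -k sparse-file dense-file
```

The exact allocated sizes depend on the filesystem, so the numbers printed by the last command will vary from system to system.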

Second, to make a file with lots of zeroes sparse, you have to copy it with cp:

cp --sparse=always original sparsefile

or you can use tar's or rsync's --sparse option as well.
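
To try this out safely before touching a real image, the copy can be rehearsed on a small test file. This is only a sketch, assuming GNU cp and dd; the file names, the 4 MiB size and the marker bytes are arbitrary choices for illustration.

```shell
workdir=$(mktemp -d)
original="$workdir/original"
sparsefile="$workdir/sparsefile"

# Build a 4 MiB test file of real zeroes with a few bytes of data at the 2 MiB mark.
dd if=/dev/zero of="$original" bs=1M count=4 2>/dev/null
printf 'data' | dd of="$original" bs=1 seek=2097152 conv=notrunc 2>/dev/null

# Copy it, letting cp turn runs of zero bytes into holes.
cp --sparse=always "$original" "$sparsefile"

# The copy should be byte-for-byte identical to the original.
cmp "$original" "$sparsefile" && echo "contents identical"

# Compare the allocated space (in KiB) of the two files.
du -k "$original" "$sparsefile"
```

Once the copy has been verified with `cmp`, the original can be deleted to reclaim the space.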

Solution 2

Perhaps the easiest way to sparsify a file in place would be to use the fallocate utility as follows:

fallocate -v --dig-holes {file_name}

fallocate(1) is provided by the util-linux package on Debian.
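
Because this modifies the file in place, it may be worth rehearsing on a scratch file and confirming that the contents are unchanged. A sketch under assumptions: the test file name and size are invented, `sha256sum` from GNU coreutils is used for the comparison, and hole punching only works on filesystems that support it, so the failure case is handled explicitly.

```shell
workdir=$(mktemp -d)
image="$workdir/test.img"

# Build a 4 MiB test file that consists entirely of real zeroes.
dd if=/dev/zero of="$image" bs=1M count=4 2>/dev/null

# Record a checksum and the allocated size (in KiB) before the change.
before=$(sha256sum < "$image")
du -k "$image"

# Punch holes in place; this fails if the filesystem cannot deallocate ranges.
if fallocate --dig-holes "$image"; then
    du -k "$image"
else
    echo "hole punching is not supported here" >&2
fi

# The data read back from the file should be the same as before.
after=$(sha256sum < "$image")
[ "$before" = "$after" ] && echo "contents unchanged"
```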

Solution 3

Editing my answer for completeness:

  1. Balloon empty FS space with zeroes (WARNING: this changes your disk image):

losetup --partscan --find --show disk.img

Assume this prints /dev/loop1 as the loop device and the image contains only one partition; otherwise, repeat the following steps for every partition that holds a mountable filesystem (ignore swap partitions and the like).

mkdir -p /mnt/tmp
mount /dev/loop1p1 /mnt/tmp
dd if=/dev/zero of=/mnt/tmp/tempfile

Let it run until it fails with ENOSPC (no space left on device).

/bin/rm -f /mnt/tmp/tempfile
umount /mnt/tmp
losetup -d /dev/loop1

  2. Copy into a sparse image:

'dd' has an option to convert a file with zeroes to a sparse file:

dd if=disk.img of=disk-sparse.img conv=sparse
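
The copy step can be rehearsed without root on a stand-in file. The sketch below assumes GNU dd; the stand-in image (1 MiB of random data followed by 7 MiB of zeroes), its location and the 64 KiB block size are illustrative choices, not part of the original answer. With conv=sparse, GNU dd seeks over output blocks that are entirely zero, so the block size determines how small a zero run can become a hole.

```shell
workdir=$(mktemp -d)
src="$workdir/disk.img"
dst="$workdir/disk-sparse.img"

# Stand-in for a disk image: 1 MiB of random data followed by 7 MiB of zeroes.
dd if=/dev/urandom of="$src" bs=1M count=1 iflag=fullblock 2>/dev/null
dd if=/dev/zero of="$src" bs=1M count=7 seek=1 conv=notrunc 2>/dev/null

# Copy with conv=sparse: all-zero output blocks are skipped with a seek.
dd if="$src" of="$dst" bs=64k conv=sparse 2>/dev/null

# The copy should be identical in content; only the allocation should differ.
cmp "$src" "$dst" && echo "contents identical"
du -k "$src" "$dst"
```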

Solution 4

Do you mean that your ddrescue-created image is, say, 50 GB, when in reality something much smaller would suffice?

If that's the case, couldn't you just first create a new image with dd:

dd if=/dev/zero of=some_image.img bs=1M count=20000

and then create a filesystem in it:

mkfsofyourchoice some_image.img

then just mount the image, and copy everything from the old image to new one? Would that work for you?
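
Note that the dd command above writes real zeroes, so the new image starts out fully allocated. If the goal is a sparse result, one possible variation (an assumption on my part, not something this answer tested) is to create the empty image by seeking, as described in Solution 1. The scratch directory here is hypothetical:

```shell
workdir=$(mktemp -d)
image="$workdir/some_image.img"

# Seek 20000 MiB forward and write nothing: same apparent size as the
# command above, but no data blocks are written.
dd if=/dev/zero of="$image" bs=1M count=0 seek=20000 2>/dev/null

# Apparent size versus allocated size (both in KiB).
du -k --apparent-size "$image"
du -k "$image"
```

Creating the filesystem and copying the files would then proceed as described above; how much space the image ends up using depends on the filesystem created in it.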

Solution 5

PartImage can create disk images that store only the used blocks of a filesystem, drastically reducing the required space by ignoring unused blocks. I don't think you can directly mount the resulting images, but going:

image -> partimage -> image -> cp --sparse=always

should produce what you want (it might even be possible to skip the last step; I haven't tried).

Author: user2468807

Updated on September 17, 2022

Comments

  • hotei
    hotei almost 14 years
    Interesting - a downvote yet I notice there's no refutation of what I wrote. If it's accurate but unhelpful that's not a reason to downvote. If it's not accurate and not helpful then it does deserve it.
  • mihi
    mihi almost 14 years
    why reinvent the wheel? cp --sparse=always does the work fine
  • endolith
    endolith almost 14 years
    According to Wikipedia, writing zeros with dd will create a sparse file. Can you explain what "seeking" means?
  • hotei
    hotei almost 14 years
    @mihi: That's a good idea. I didn't know about the sparse option as it's not available in BSD flavors (freebsd.org/cgi/…) and I have never had the requirement to look at a Linux man page for cp (until today).
  • karthik
    karthik almost 14 years
    What about cat then? There is nothing in the man page about sparse files, so I assume cat /dev/zero > zero.file is perfectly OK to fill empty space with zeros?
  • mihi
    mihi almost 14 years
    @endolith: Updated my answer to make clear what the difference is to use dd for writing zeroes or for seeking.
  • mihi
    mihi almost 14 years
    @Ludwig Weinzierl: Yes, that cat command will fill your entire disk (or at least the amount not reserved for root or by quotas) with "real" zeroes, and create no sparse files.
  • hotei
    hotei almost 14 years
    @mihi: Any dd command with a count of 0 (zero) is guaranteed to do nothing by virtue of the count=0. Has nothing to do with sparse etc. I like where you're going with this but you need a better example.
  • mihi
    mihi almost 14 years
    @hotei: Even if you give count=0, it will still honour the seek option before writing zero bytes. And the example is from Wikipedia. A seek beyond the end of the file will create a sparse file, regardless of whether you write after the seek or not.
  • endolith
    endolith about 13 years
    tar or rsync with sparse is still making a copy of the file, right? so you need space for two copies of it.
  • mihi
    mihi about 13 years
    @endolith you will need extra space, yes. but since you can compress the tarball, you will only need space for the original file and a compressed version of the sparse file.
  • endolith
    endolith about 11 years
    Just used this to reduce a file from 21G to 86M. :D
  • Perkins
    Perkins over 7 years
    Unfortunately the images created by partimage are not mountable without expanding them out again, making them suitable only for archival purposes.
  • Perkins
    Perkins over 7 years
    One way to have your compressed images and mount them too is to simply store them on a filesystem that supports native compression. Makes data recovery awful if you have a drive crash, but that's what backups are for, right?
  • Ruslan
    Ruslan about 7 years
    For some reason, fallocate --dig-holes resulted in 103GiB file from 299GiB original, while cp --sparse=always gave me 93GiB — all with the same SHA1 sum (sizes checked via du -B1G vs du --apparent-size -B1G). So fallocate seems to give inferior results.
  • Lam Das
    Lam Das almost 6 years
    Yes, this option is not from the time when OP asked. This was more of "leave a bread crumb for other searchers"...:-)
  • mihi
    mihi almost 6 years
    Depending on the filesystem type, zerofree may be faster than mounting the filesystem and writing zeroes to it, and it makes the disk image grow less if it already contained lots of zeroes.
  • Soruk
    Soruk about 3 years
    Under newer kernels, you can now fstrim a filesystem in a file, which appears to make the file sparse too, and is likely to be much quicker than writing zeros then copying the file as sparse. (Again, a bread crumb for other searchers.)