File is mysteriously empty. Options to recover?

linux files data-recovery

10,676

Solution 1

If you are using the ext3 file system, try following Carlo Wood's HOWTO.

In short:

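Use ext3grep $IMAGE --ls --inode 2 | grep your_file to find the file you are looking for (where $IMAGE is your partition, for example /dev/sda2; you'll need ext3grep installed).
Find the file system block that contains the file's inode (ext3grep $IMAGE --inode-to-block).
Find all journal descriptors that reference that block (ext3grep $IMAGE --journal --block).
Working backwards from the newest journal copy, find a copy of the inode whose deletion time is 0 and note its direct blocks.
Copy those blocks with dd.
Edit the file to delete the trailing zeroes, or keep only as many bytes as the inode's recorded size.
cat the recovered file to check its contents.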
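
The journal descriptor lookup yields pairs of transaction sequence number and journal block, and the copies are best inspected newest first. Here is a minimal sketch of that ordering, using a handful of pairs copied from the HOWTO's example output; the variable and function names are made up for illustration. It sorts by sequence number rather than block number because the journal wraps around, so a newer copy can sit in a lower-numbered block.

```shell
# Sample pairs (sequence number, journal block) taken from the HOWTO example.
# Note the wrap-around: sequence 4381707 is newer than 4381374 but lives in
# a lower-numbered journal block.
pairs='4381374 32718
4381707 1465
4382098 6073
4382139 7984
4382140 8931'

# Sort numerically by sequence number, newest first, and print only the
# journal block numbers, i.e. the order in which to inspect the copies.
newest_first() {
    printf '%s\n' "$pairs" | sort -k1,1nr | awk '{ print $2 }'
}

newest_first
```

Each block printed can then be examined with ext3grep $IMAGE --print --block until a copy with a zero deletion time turns up.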

From the source, in the chapter "Manual recovery example":

"In the following example we will manually recover a small file. Only partial output is given in order to save space and to make the example more readable.

Using ext3grep $IMAGE --ls --inode we find the name of the file that we want to recover:

$ ext3grep $IMAGE --ls --inode 2 | grep carlo
3 end d 195457 D 1202352103 Thu Feb 7 03:41:43 2008 drwxr-xr-x carlo

$ ext3grep $IMAGE --ls --inode 195457 | grep ' bin$' | head -n 1
34 35 d 309540 D 1202352104 Thu Feb 7 03:41:44 2008 drwxr-xr-x bin

$ ext3grep $IMAGE --ls --inode 309540 | grep start_azureus
9 10 r 309631 D 1202351093 Thu Feb 7 03:24:53 2008 rrwxr-xr-x start_azureus

Obviously, inode 309631 is erased and we have no block numbers for this file:

$ ext3grep $IMAGE --print --inode 309631
[...]
Inode is Unallocated
Group: 19
Generation Id: 2771183319
uid / gid: 1000 / 1000
mode: rrwxr-xr-x
size: 0
num of links: 0
sectors: 0 (--> 0 indirect blocks).

Inode Times:
Accessed: 1202350961 = Thu Feb 7 03:22:41 2008
File Modified: 1202351093 = Thu Feb 7 03:24:53 2008
Inode Modified: 1202351093 = Thu Feb 7 03:24:53 2008
Deletion time: 1202351093 = Thu Feb 7 03:24:53 2008

Direct Blocks:

Therefore, we will try to look for an older copy of it in the journal. First, we find the file system block that contains this inode:

$ ext3grep $IMAGE --inode-to-block 309631 | grep resides
Inode 309631 resides in block 622598 at offset 0xf00.

Then we find all journal descriptors referencing block 622598:

$ ext3grep $IMAGE --journal --block 622598
[...]
Journal descriptors referencing block 622598:
4381294 26582
4381311 28693
4381313 28809
4381314 28814
4381321 29308
4381348 30676
4381349 30986
4381350 31299
4381374 32718
4381707 1465
4381709 2132
4381755 2945
4381961 4606
4382098 6073
4382137 6672
4382138 7536
4382139 7984
4382140 8931

This means that the transaction with sequence number 4381294 has a copy of block 622598 in block 26582, and so on. The largest sequence number, at the bottom, should be the last data written to disk and thus block 8931 should be the same as the current block 622598. In order to find the last non-deleted copy, one should start at the bottom and work upwards.

If you try to print such a block, ext3grep recognizes that it's a block from an inode table and will print the contents of all 32 inodes in it. We only wish to see inode 309631 however; so we use a smart grep:

$ ext3grep $IMAGE --print --block 8931 | grep -A15 'Inode 309631'
--------------Inode 309631-----------------------
Generation Id: 2771183319
uid / gid: 1000 / 1000
mode: rrwxr-xr-x
size: 0
num of links: 0
sectors: 0 (--> 0 indirect blocks).

Inode Times:
Accessed: 1202350961 = Thu Feb 7 03:22:41 2008
File Modified: 1202351093 = Thu Feb 7 03:24:53 2008
Inode Modified: 1202351093 = Thu Feb 7 03:24:53 2008
Deletion time: 1202351093 = Thu Feb 7 03:24:53 2008

Direct Blocks:

This is indeed the same as we saw in block 622598. Next we look at smaller sequence numbers until we find one with a 0 Deletion time. The first one that we find (bottom up) is block 6073:

$ ext3grep $IMAGE --print --block 6073 | grep -A15 'Inode 309631'
--------------Inode 309631-----------------------
Generation Id: 2771183319
uid / gid: 1000 / 1000
mode: rrwxr-xr-x
size: 40
num of links: 1
sectors: 8 (--> 0 indirect blocks).

Inode Times:
Accessed: 1202350961 = Thu Feb 7 03:22:41 2008
File Modified: 1189688692 = Thu Sep 13 15:04:52 2007
Inode Modified: 1189688692 = Thu Sep 13 15:04:52 2007
Deletion time: 0

Direct Blocks: 645627

The above is automated and can be done much faster with the command line option --show-journal-inodes. This option finds the block that the inode belongs to, then finds all copies of that block in the journal, and subsequently prints only the requested inode from each of these blocks (each of which contains 32 inodes, as you know), eliminating duplicates:

$ ext3grep $IMAGE --show-journal-inodes 309631
Number of groups: 75
Minimum / maximum journal block: 1115 / 35026
Loading journal descriptors... done
Journal transaction 4381435 wraps around, some data blocks might have been lost of this transaction.
Number of descriptors in journal: 30258; min / max sequence numbers: 4379495 / 4382264
Copies of inode 309631 found in the journal:

--------------Inode 309631-----------------------
Generation Id: 2771183319
uid / gid: 1000 / 1000
mode: rrwxr-xr-x
size: 0
num of links: 0
sectors: 0 (--> 0 indirect blocks).

Inode Times:
Accessed: 1202350961 = Thu Feb 7 03:22:41 2008
File Modified: 1202351093 = Thu Feb 7 03:24:53 2008
Inode Modified: 1202351093 = Thu Feb 7 03:24:53 2008
Deletion time: 1202351093 = Thu Feb 7 03:24:53 2008

Direct Blocks:

--------------Inode 309631-----------------------
Generation Id: 2771183319
uid / gid: 1000 / 1000
mode: rrwxr-xr-x
size: 40
num of links: 1
sectors: 8 (--> 0 indirect blocks).

Inode Times:
Accessed: 1202350961 = Thu Feb 7 03:22:41 2008
File Modified: 1189688692 = Thu Sep 13 15:04:52 2007
Inode Modified: 1189688692 = Thu Sep 13 15:04:52 2007
Deletion time: 0

Direct Blocks: 645627

The file is indeed small: only one block. We copy this block with dd as shown before:

$ dd if=$IMAGE bs=4096 count=1 skip=645627 of=block.645627
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.0166104 seconds, 247 kB/s

and then edit the file to delete the trailing zeroes, or copy the first 40 bytes (the given size of the file):

$ dd if=block.645627 bs=1 count=40 of=start_azureus
40+0 records in
40+0 records out
40 bytes (40 B) copied, 0.000105397 seconds, 380 kB/s

$ cat start_azureus
cd /usr/src/azureus/azureus
./azureus &

Recovered!"
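
The last two steps (copying a block with dd, then trimming the zero padding) can be rehearsed safely before touching the real partition. The following is a rough sketch on a synthetic three-block image; the demo.* file names and the BS variable are invented for illustration, and the 4096-byte block size simply mirrors the HOWTO's example.

```shell
# Build a tiny synthetic "image" of three 4096-byte blocks. Block 1 holds
# 40 bytes of text followed by zero padding, like the recovered block above.
BS=4096
{
    dd if=/dev/zero bs=$BS count=1
    printf 'cd /usr/src/azureus/azureus\n./azureus &\n'
    dd if=/dev/zero bs=1 count=$((BS - 40))
    dd if=/dev/zero bs=$BS count=1
} > demo.img 2>/dev/null

# Copy block 1 out of the image, the same way the HOWTO uses skip=645627.
dd if=demo.img bs=$BS skip=1 count=1 of=demo.block 2>/dev/null

# Size known (40 bytes, from the journal copy of the inode): keep that prefix.
dd if=demo.block bs=1 count=40 of=demo.by_size 2>/dev/null

# Size unknown, plain-text file only: delete every NUL byte instead.
tr -d '\000' < demo.block > demo.by_tr

cat demo.by_size
```

The tr variant is only suitable for plain text. A binary file such as an ODT document can legitimately contain zero bytes, so for those the size recorded in the inode is the safer guide.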

Solution 2

Try testdisk and photorec, but from the way I understand your description, this is probably the hard way to learn the value of regular backups. You might also want to boot from a CD to prevent the hard disk from being changed even further. I personally like System Rescue Disk for this, but it is largely command-line based.
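
Since ODT files are ZIP containers (as a comment below points out), a carving tool may turn up many ZIP files, and sorting the likely documents from the rest by hand is tedious. Here is a hedged sketch of a filter: the function name looks_like_odt is made up, and it assumes the ODT MIME type string appears within the first 100 bytes, which relies on the OpenDocument convention of storing an uncompressed mimetype entry first. The loop assumes PhotoRec's usual recup_dir.N output directories in the current directory.

```shell
# Succeeds if FILE starts with the ZIP local-header magic "PK\003\004" and
# mentions the ODT MIME type near the start of the file.
# Assumption: checking the first 100 bytes is enough to see the mimetype entry.
looks_like_odt() {
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "504b0304" ] || return 1
    head -c 100 "$1" | LC_ALL=C grep -q 'application/vnd.oasis.opendocument.text'
}

# Print every recovered file that passes the check. If no recup_dir.*
# directory exists, the glob stays unexpanded and nothing is printed.
for f in recup_dir.*/*; do
    if [ -f "$f" ] && looks_like_odt "$f"; then
        printf '%s\n' "$f"
    fi
done
```

A damaged or partially overwritten document may fail this check even though some of its content is still recoverable, so treat it as a first pass only.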

Solution 3

Use CAINE, a Linux distribution specialized for digital forensics. It comes with plenty of tools for file and hard disk recovery.


jcbwlkr

Software Engineer

Updated on September 18, 2022

Comments

jcbwlkr over 1 year

I have seen several posts about recovering deleted files, but this situation is different. My wife had a file called Journal.odt in which she kept a lot of important personal information such as special memories about our kids. The other day when she tried to open it in OpenOffice it complained about the format. I had her hit cancel and back out. When I cat the file it is completely empty. ls says the file is 0 bytes.

Had she accidentally selected all of the text in the file, hit backspace and saved it, there would still be the OpenOffice meta information in the file.

I immediately shut her laptop down to prevent making any more changes to disk until I can think of something to do.

I have done some complicated things in the past, such as using dd to recover raw text off the disk, but I have no idea what to do here. Since odt files aren't flat text, I can't just pipe the whole disk through grep.

Any suggestions would be greatly appreciated.

Also if anyone has any insight as to what might have gone wrong I would love to hear it.

Thanks
- Tim almost 12 years
  
  It would be different if the file had been accidentally deleted, but when a text editor or similar program saves a file, it often writes "in place", effectively wiping anything that forensic-strength recovery could have found. It would have been better not to shut the system down immediately; I bet a couple of presses of Ctrl+Z (the built-in "undo" function in OpenOffice) would have rectified the issue.
- jcbwlkr almost 12 years
  
  @Tim I see your point, but unfortunately the file had been emptied out days before. Last modified time on the file was a few days prior. In my description when she opened it in OO it was already empty. Thanks though.
- Tim almost 12 years
  
  Not trying to beat a dead horse, or kick a man when he is down, but I suspect this experience will get you looking into a backup solution. Take a look at "Areca Backup" for a simple, Linux compatible backup application.
- jippie almost 12 years
  
  Disk full perhaps? Check with df -h
- Gilles 'SO- stop being evil' almost 12 years
  
  @Tim If the file is 0 bytes, it isn't an OO document; Ctrl+Z would have done nothing, since the file wasn't saved as is by OO. @jacobwalker0814 ODT files are zip files, so recovery tools like testdisk have a chance of finding them; but there's no guarantee, and even if the data is still there you may have to wade through a lot of other zip files. And for the future, do back up!
- Tim almost 12 years
  
  @Gilles I'll jot this down, in the case that I ever forget what a zero byte file is.
- gokhan acar almost 12 years
  
  @Tim I tried areca about a year ago. It was lovely in theory, but when I ran into problems, there was very little support on the forum and I gave up. Is the user community more lively now?
- Tim almost 12 years
  
  Haven't paid much attention to the community, sorry.
jcbwlkr almost 12 years

Thanks. I will look into that distro and see if it has something. Do you have any recommendations on specific tools or ways to approach the issue? The problem here is that the file wasn't deleted, which is the case many tools seem to address; it just lost its contents.
PsyStyle almost 12 years

OpenOffice sometimes creates a hidden file which contains the previously saved document. If you are lucky, you can try to recover it using, for example, "extundelete" or "testdisk": cgsecurity.org/wiki/TestDisk
gokhan acar almost 12 years

Look in ~/.openoffice.org/3/user/backup/ or ~/.libreoffice.org/3/user/backup/. I wrote a script to clear these directories so that sensitive things I deleted weren't still there.
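
A quick way to check both locations is sketched below. The function name list_office_backups is made up, and the two paths are simply the ones named above; whether either exists depends on the office suite and version installed. When working from a rescue system, set HOME to the user's home directory on the mounted disk first.

```shell
# List the contents of the backup directories mentioned above, if present.
list_office_backups() {
    for dir in "$HOME/.openoffice.org/3/user/backup" \
               "$HOME/.libreoffice.org/3/user/backup"; do
        if [ -d "$dir" ]; then
            printf '== %s\n' "$dir"
            ls -la "$dir"
        else
            printf 'not present: %s\n' "$dir"
        fi
    done
}

list_office_backups
```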