Is there any other reason for "no space left on device"?

Solution 1

My suspicion (see EDIT3) apparently was right: adding ACL support to the file system made rsync/dirvish think that all the files had changed. So instead of making an incremental backup and simply creating hard links to the already existing files, it tried to create a full backup, which of course failed because the hard disk did not have enough space for that.

So the error message was actually correct.

After starting again with an empty backup disk, the incremental backups worked as before.
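If you suspect the same thing, a dry run can show whether rsync considers every file changed before it fills the disk. This is only a sketch: the paths are placeholders for the source and the previous dirvish snapshot, and it assumes an rsync built with ACL support (an "a" in the itemized attribute string marks files whose ACL information differs):

rsync -aA --dry-run --itemize-changes /mnt/md0/ /mnt/backupsys/shd/gesichert1/<previous>/tree/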

Solution 2

I see that dummzeuch found a solution to his problem, but there is one more case I came across where a disk can have enough inodes/free space and still show "no space left on device" while attempting to transfer certain directories.

This is caused by hash collisions on block devices formatted with the ext4 file system with directory indexing enabled, especially when a single directory holds more than 100k files and the file names are generated by the same algorithm (cache files, md5sum file names, etc.).

The solution is to try another directory indexing algorithm:

tune2fs -E "hash_alg=tea" /dev/blockdev_name

or to disable directory indexing completely for that block device (which may hurt performance):

tune2fs -O ^dir_index /dev/blockdev_name
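Before changing anything, it may help to confirm which hash algorithm and features the filesystem currently uses; and note that after switching hash_alg, existing directory indexes are not converted automatically but should be rebuilt by a forced fsck pass on the unmounted device. A sketch, reusing the placeholder device name from above:

tune2fs -l /dev/blockdev_name | grep -iE 'hash|features'
e2fsck -fD /dev/blockdev_name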

Another solution is to see what is filling the directory with such files and fix the software.

Another possible solution is to split the content of a folder holding a huge number of files into multiple separate subfolders, as sketched below.
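A minimal sketch of that idea, assuming bash and that whatever consumes the files can be pointed at the new layout; it buckets files by the first two characters of their names, which works well for generated names such as md5 sums (the path is a placeholder):

cd /path/to/huge_directory
for f in *; do
    [ -f "$f" ] || continue      # skip anything that is not a regular file
    mkdir -p "${f:0:2}"          # bucket named after the first two characters
    mv -- "$f" "${f:0:2}/"
done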

A full description of the problem is presented by Axel Wagner here:

http://blog.merovius.de/2013/10/20/ext4-mysterious-no-space-left-on.html

Cheers.

Solution 3

Looking at the inode usage in df -i made me think about the root reserve that the EXT filesystem imposes. You may want to check these out:

  1. "Reserved space for root on a filesystem - why?"
  2. "Reasonable size for “filesystem reserved blocks” for non-OS disks?"

I would try to .tar.gz some of the older backups hoping that it would reduce the number of inodes in use.
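To see how large the root reserve actually is, and to shrink it if the disk is used only for backups, tune2fs can be used. This is only a sketch with the device name taken from the question; lowering the reserve is a trade-off you should double-check for your setup (the default is usually 5%):

tune2fs -l /dev/sdg1 | grep -i 'reserved block'
tune2fs -m 1 /dev/sdg1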

Solution 4

There is a 2 GB size limit on the directory itself - i.e. if you have so many files that the directory's own size is > 2 GB (NOT the total size of the files IN the directory), you'll have an issue. Having said that, with only 2.8M inodes used, that shouldn't be the issue here; it usually happens around 15M inodes.

So this may not be much help - but try ext4 on your backup device?
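If you want to rule this out: the size reported for a directory by ls -ld is the size of the directory entry table itself, not of the files in it. Something like the following lists suspiciously large directories (the mount point is taken from the question, and the 50M threshold is an arbitrary choice):

find /mnt/backupsys/shd -type d -size +50M -exec ls -ldh {} \;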

Solution 5

Increase your inotify watches limit in sysctl:

fs.inotify.max_user_watches=100000 

Then reboot, or apply the same setting immediately with sysctl -w.
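For example (the value is just a common choice, and the sysctl.conf path assumes a standard Linux setup):

sysctl -w fs.inotify.max_user_watches=100000
echo 'fs.inotify.max_user_watches=100000' >> /etc/sysctl.conf
cat /proc/sys/fs/inotify/max_user_watches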

That'll usually do it. Something is registering too many inotify watches in the kernel, and the error message is completely misleading. Dropbox is a classic example of this.

Comments

  • dummzeuch over 1 year

    I am using Dirvish on an Ubuntu server system for backing up a hard disk to an external USB 3.0 drive. Until a few days ago everything worked fine, but now every backup fails with "no space left on device (28)" and "file system full". Unfortunately, it is not that simple: there is > 500 GB free on the device.

    Details:

    rsync_error:

    rsync: write "/mnt/backupsys/shd/gesichert1/20130223_213242/tree/<SomeFilename1>.eDJiD9": No space left on device (28)
    rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
    rsync: write "/mnt/backupsys/shd/gesichert1/20130223_213242/tree/<SomeFilename2>.RHuUAJ": No space left on device (28)
    rsync: write "/mnt/backupsys/shd/gesichert1/20130223_213242/tree/<SomeFilename3>.9tVK8Z": No space left on device (28)
    rsync: write "/mnt/backupsys/shd/gesichert1/20130223_213242/tree/<SomeFilename4>.t3ARSV": No space left on device (28)
    [... some more files ...]
    rsync: connection unexpectedly closed (2712185 bytes received so far) [sender]
    rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9]
    

    The log looks pretty much as usual until it hits:

    <SomeFilename1>
    <SomeFilename2>
    <SomeFilename3>
    <SomeFilename4>
    <PartOfAFilename>filesystem full
    write error, filesystem probably full
    broken pipe
    RESULTS: warnings = 0, errors = 1
    

    But, as said above, there is lots of space on the device:

    df -h
    /dev/sdg1       2.7T  2.0T  623G  77% /mnt/backupsys/shd
    

    and also there are lots of inodes left:

    df -i
    /dev/sdg1      183148544 2810146 180338398    2% /mnt/backupsys/shd
    

    The device is mounted as rw:

    mount
    /dev/sdg1 on /mnt/backupsys/shd type ext3 (rw)
    

    The process is running as root.

    I was about to say that I haven't changed anything but that's not quite true: I have switched on acl for the drive I am backing up:

    /dev/md0 on /mnt/md0 type ext4 (rw,acl)
    

    Could that be the problem? If yes, how? root still has full access to the files.

    EDIT:

    I just checked the temp directories:

    • /tmp contains only a .webmin folder that is empty
    • /var/tmp is empty

    The file system where these directories reside has plenty of free space and inodes:

    df -h
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda1       289G   55G  220G  20% /
    
    df -i
    Filesystem        Inodes   IUsed     IFree IUse% Mounted on
    /dev/sda1       19202048  167644  19034404    1% /
    

    EDIT2:

    The directories are quite large, but not > 2 GB. The one where the backup fails is not even one of the largest; it contains 7530 files.

    EDIT3:

    One piece of information which I did not consider relevant when posting this question:

    The day before the backups started to fail, I had activated ACLs on the file systems that were being backed up. I now assume that this triggered Dirvish (or rsync) into thinking all the files had changed, so the list of files to be copied rather than hard-linked was very large. This could possibly mean that some buffers were too small.

    Today a full backup to an empty disk worked flawlessly. I'll try an incremental backup next. This will show whether activating ACLs was the cause of the problem.

  • dummzeuch over 11 years
    Checked /tmp and /var/tmp. See edits.
  • dummzeuch over 11 years
    The directories are not that large. See edits.
  • Jenny D over 11 years
    Your edits don't show the actual size of the directories. Try this: find /mnt/backupsys/shd -type d -exec ls -ld {} \; to see the actual size of the directories.
  • dummzeuch over 11 years
    You might have been right. Unfortunately I had already rebooted the computer because of a kernel update before I read your suggestion. Afterwards I started the backup and it is still running happily. I'll see whether it finishes and also what happens with the next scheduled one.
  • Dennis over 11 years
    Also look at quota (user limits). Not sure why you're using rsync for a local backup, though. :~/
  • Steve almost 11 years
    This fixed a problem I was seeing - I have Dropbox, and it and anything else inotify-driven would fail with the "No space left on device" message.
  • Deve almost 8 years
    The second-to-last column of the df output is the percentage of inodes used, so 2% of the inodes are used and 98% are left.