NFSv4 "Too many levels of symbolic links" error

Solution 1

About the problem

This error can occur when two or more files in a directory have the same readdir cookie.

This problem is more common when using an NFS filesystem (v3 or v4) over an EXT4 backend with a lot of files in the same directory (more than 50000). The problem can also occur when using GlusterFS instead of NFS.

PS: This problem can also occur with only a few files in a single directory, but that case is very improbable.

In this case, you will see Too many levels of symbolic links errors even if there are no symlinks inside your directory. You can confirm this by verifying that the following command returns no output:

find /mnt/storage/aaaaaaa_aaa/bbbb/cccc_ccccc -type l

To check whether you are hitting this specific problem, run the following command:

$ ls /mnt/storage/aaaaaaa_aaa/bbbb/cccc_ccccc >/dev/null
ls: reading directory .: Too many levels of symbolic links

Afterwards, check your syslog (/var/log/syslog) for entries like:

[400000.200000] NFS: directory /mnt/storage/aaaaaaa_aaa/bbbb/cccc_ccccc
contains a readdir loop. Please contact your server vendor.
The file: DDDDDDDDDD has duplicate cookie COOKIE_NUMBER.
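
A minimal sketch for pulling the affected file names out of the log, assuming the message format shown above. The function name dup_cookie_files is hypothetical, and the log path may differ on your system.

```shell
# Hypothetical helper: print each file name that the NFS client reported
# as having a duplicate readdir cookie.
# Assumes the message text "The file: NAME has duplicate cookie N".
# $1 = log file to scan (defaults to /var/log/syslog)
dup_cookie_files() {
    log="${1:-/var/log/syslog}"
    sed -n 's/.*The file: \(.*\) has duplicate cookie [0-9]*.*/\1/p' "$log" | sort -u
}
```

Calling dup_cookie_files with no argument scans /var/log/syslog; pass another path if your distribution logs kernel messages elsewhere.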

The problem is related to the readdir API, which uses the readdir cookie to quickly locate an entry inside a directory. The NFS server uses this API when communicating with EXT4 backends.

A complete and excellent explanation about the duplicate cookie problem (actually, a hash collision problem) can be found at Widening ext4's readdir() cookie.

A related bug report can be found at NFS client reports a 'readdir loop' with a corrupt name.

If you can reboot your system, the good news is that, according to David Hedberg, this problem is already fixed in newer Ubuntu kernel versions (>= 3.2.0-60-generic). You may also need to update your NFS server (the fix only works if both the NFS server and the kernel are updated).
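
A rough way to check whether a machine already runs a new enough kernel is sketched below. The helper name kernel_at_least is hypothetical, it relies on the -V (version sort) option of GNU sort, and the comparison is only meaningful for Ubuntu kernel release strings like the one quoted above. Run it on both the client and the server.

```shell
# Hypothetical helper: succeed if the kernel release in $1 is at least $2.
# Uses version sort: if $2 sorts first (or the two are equal), $1 >= $2.
kernel_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compare the running kernel against the release named in the answer.
if kernel_at_least "$(uname -r)" "3.2.0-60-generic"; then
    echo "kernel is at or above 3.2.0-60-generic"
else
    echo "kernel is older than 3.2.0-60-generic"
fi
```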

PS: If you really love operating systems, you can check the kernel/NFS patches at http://comments.gmane.org - 32/64 bit llseek hashes.

Solution

Update your kernel and NFS kernel server and reboot the system:

apt-get -y dist-upgrade
reboot

If you can't reboot the system, you can instead identify the file with the duplicated readdir cookie (check your syslog) and move it to another directory (or rename it to change its cookie/hash).
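
The move could be sketched as follows. The helper name move_dup_file and its arguments are hypothetical; the file name is the one reported in your syslog. Running it on the NFS server is an assumption, based on the directory listing working there.

```shell
# Hypothetical helper: move one file out of the affected directory.
# $1 = affected directory
# $2 = file name reported in the syslog message
# $3 = destination directory (created if missing)
move_dup_file() {
    src="$1/$2"
    [ -e "$src" ] || { echo "not found: $src" >&2; return 1; }
    # -n: do not overwrite a file that already exists at the destination
    mkdir -p -- "$3" && mv -n -- "$src" "$3/"
}
```

Renaming the file in place instead only helps if the new name differs from the old one, since the cookie is derived from a hash of the name.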

Solution 2

Somewhere you have a symbolic link that points back to its parent. Use this to find it:

find /mnt/storage -type l -exec ls -l {} \;

Once you do, then perhaps you can figure out how to correct it.
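
To narrow the output down to links that actually point back at one of their own parent directories, something like the sketch below may help. The function name find_parent_links is hypothetical, it assumes a readlink that supports -f (GNU coreutils), and it does not handle file names containing newlines.

```shell
# Hypothetical helper: list symlinks under $1 whose resolved target is an
# ancestor of the directory that contains the link.
find_parent_links() {
    find "$1" -type l | while IFS= read -r link; do
        # Resolve the link target; skip links that cannot be resolved.
        target=$(readlink -f -- "$link") || continue
        # Resolve the directory that holds the link.
        parent=$(readlink -f -- "$(dirname -- "$link")") || continue
        case "$parent/" in
            "$target"/*) printf '%s -> %s\n' "$link" "$target" ;;
        esac
    done
}
```

For example, find_parent_links /mnt/storage prints one "link -> target" line per offending symlink and prints nothing if there are none.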

Author: user1434058

Updated on September 18, 2022

Comments

  • user1434058
    user1434058 almost 2 years

    Both machines are running Ubuntu 12.04

    Remote NFSv4 Client

    $ ls /mnt/storage/aaaaaaa_aaa/bbbb/cccc_ccccc gives this error:
    ls: reading directory .: Too many levels of symbolic links
    

    How can I fix this?

    When the error occurs, ls starts listing the files; however, PHP breaks.

    On the NFSv4 Server

    In /etc/fstab:

    /mnt/storage    /srv/storage    none    bind    0 0
    

    In /etc/exports

    /srv         192.168.1.0/24(rw,async,insecure,no_subtree_check,crossmnt,fsid=0,no_root_squash)
    /srv/storage   192.168.1.0/24(rw,async,nohide,insecure,no_subtree_check,no_root_squash)
    

    ERROR

    root@ds:/mnt/storage/foreign_dbs/imdb/imdb_htmls# ls -l | head
    ls: reading directory .: Too many levels of symbolic links
    total 10302840
    -rw-r--r-- 1 root root  10484 Jul  5 13:56 0019038.gz
    -rw-r--r-- 1 root root  16264 Mar 30 00:31 0259701.gz
    -rw-r--r-- 1 root root  13784 Mar 30 14:20 1000000.gz
    -rw-r--r-- 1 root root  12741 Mar 30 13:04 1000003.gz
    -rw-r--r-- 1 root root  12794 Mar 30 12:40 1000004.gz
    -rw-r--r-- 1 root root  13123 Mar 30 12:07 1000005.gz
    -rw-r--r-- 1 root root  13183 Mar 30 12:04 1000006.gz
    -rw-r--r-- 1 root root  13443 Jul  4 01:16 1000007.gz
    -rw-r--r-- 1 root root  12968 Mar 30 11:05 1000008.gz
    

    I came across it in PHP: scandir returns 1612577.gz and 1612579.gz but skips 1612578.gz, even though the file types and properties of all three are identical.

    This only happens on the NFS client; it works 100% on the server.

    • Jee Garin
      Jee Garin almost 12 years
      What's ls -l /mnt/storage/aaaaaaa_aaa/bbbb/?
    • user1434058
      user1434058 almost 12 years
      Works fine. There are a few folders there. cccc_ccccc has many files under it. The same command also works fine on the NFS server itself.
    • Jee Garin
      Jee Garin almost 12 years
      What's the output? cccc_ccccc is probably a relative symlink that is causing a loop.
    • user1434058
      user1434058 almost 12 years
      800k sequentially numbered files, example: -rw-r--r-- 1 root root 13459 Mar 30 06:26 9061.gz
    • Jee Garin
      Jee Garin almost 12 years
      Ah, then how about file /mnt/storage/aaaaaaa_aaa/cccc_ccccc?
    • MastaJeet
      MastaJeet almost 12 years
      Based on bugzilla.redhat.com/show_bug.cgi?id=790729 I'm thinking it is a kernel bug but it would be hard to figure out which patch in RHEL's version of the kernel would fix the problem under Ubuntu.
  • user1434058
    user1434058 almost 12 years
    No links found.