Why do the file-nr and lsof counts of open files differ?

There seem to be two questions in play here. First, full documentation of the file-nr and file-max structures can be found at

https://www.kernel.org/doc/Documentation/sysctl/fs.txt

This defines the fields in that file as:

The three values in file-nr denote the number of allocated file handles, the number of allocated but unused file handles, and the maximum number of file handles. Linux 2.6 always reports 0 as the number of free file handles -- this is not an error, it just means that the number of allocated file handles exactly matches the number of used file handles.
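As a quick illustration of those three fields, here is a minimal Python sketch that splits the contents of file-nr into named values. The names `FileNr`, `parse_file_nr` and `read_file_nr` are my own and not part of any kernel or library API.

```python
from typing import NamedTuple


class FileNr(NamedTuple):
    allocated: int  # file handles currently allocated by the kernel
    unused: int     # allocated but unused handles (Linux 2.6 reports 0 here)
    maximum: int    # system-wide maximum number of file handles


def parse_file_nr(text: str) -> FileNr:
    """Split the three whitespace-separated counters found in file-nr."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return FileNr(allocated, unused, maximum)


def read_file_nr(path: str = "/proc/sys/fs/file-nr") -> FileNr:
    """Read and parse the live counters (Linux only)."""
    with open(path) as handle:
        return parse_file_nr(handle.read())
```

Calling `parse_file_nr("44544 0 128000")` on the values from the question gives 44544 allocated handles against a system-wide maximum of 128000. Note that this maximum is a different limit from the per-process `ulimit -n` value.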

Hopefully that's clear enough. The second question has been answered in an earlier thread (https://serverfault.com/questions/485262/number-of-file-descriptors-different-between-proc-sys-fs-file-nr-and-proc-pi) and seems to come down to either

  1. "use lsof" and filter the output appropriately if you need to get a good approximation of file descriptors in use by a process or,
  2. traverse through the /proc filesystem (and still have to filter the output) in order to get a snapshot in time of the file descriptor use.

Accurate metrics are hard to obtain here, because the number of FDs in use at any given point can fluctuate very rapidly on a system.
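The /proc traversal could look roughly like the Python sketch below. It counts the entries in each `/proc/<pid>/fd` directory and returns the largest consumers. The function names and the `proc_root` parameter are my own choices, not an existing tool, and the result is only a snapshot: processes can exit or open files while the scan is running.

```python
import os
from typing import List, Tuple


def count_fds(pid: int, proc_root: str = "/proc") -> int:
    """Number of entries in <proc_root>/<pid>/fd, or -1 if it cannot be read."""
    try:
        return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))
    except OSError:
        # The process exited mid-scan, or we lack permission (not root).
        return -1


def top_fd_users(limit: int = 5, proc_root: str = "/proc") -> List[Tuple[int, int]]:
    """Return up to `limit` (fd_count, pid) pairs, largest count first."""
    counts = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as "sys" or "meminfo"
        n = count_fds(int(name), proc_root)
        if n >= 0:
            counts.append((n, int(name)))
    return sorted(counts, reverse=True)[:limit]
```

Run it as root, otherwise the fd directories of other users' processes are unreadable and get skipped. Even then the per-process counts will not add up to the first field of file-nr, since one kernel file handle can be shared by several descriptors (for example after fork or dup).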

The following thread suggests a filtering scheme for the 'lsof' approach:

https://serverfault.com/questions/396872/why-or-how-does-the-number-of-open-file-descriptors-in-use-by-root-exceed-ulim
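To give an idea of what such filtering can look like, here is a small Python sketch of my own (it is not the scheme from the linked thread). It assumes the text comes from `lsof -F pf`, where each line starting with `p` carries a PID and each line starting with `f` carries a descriptor. Only numeric descriptors are counted, so entries such as cwd, rtd, txt and mem, which are not real file descriptors, are left out.

```python
from collections import Counter


def count_numeric_fds(lsof_field_output: str) -> Counter:
    """Count numeric file descriptors per PID in `lsof -F pf` style output."""
    counts = Counter()
    pid = None
    for line in lsof_field_output.splitlines():
        tag, value = line[:1], line[1:]
        if tag == "p" and value.isdigit():
            pid = int(value)          # start of a new process section
        elif tag == "f" and value.isdigit() and pid is not None:
            counts[pid] += 1          # a real descriptor such as 0, 1, 2, ...
        # anything else (fcwd, ftxt, fmem, ...) is deliberately ignored
    return counts
```

`count_numeric_fds(text).most_common(5)` then gives the five PIDs with the most descriptors. Some lsof builds may also list threads separately, which this sketch does not deduplicate, so treat the numbers as an approximation.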

Author by

Vasanth Nag K V

Updated on September 18, 2022

Comments

  • Vasanth Nag K V
    Vasanth Nag K V over 1 year

    I have suddenly run into a problem: all my applications and the server were running fine, and then all of a sudden I saw the number of open files shoot up.

    I am checking it with this command:

    cat /proc/sys/fs/file-nr
    

    When I check with this it shows 44544 0 128000, so 44544 is the number of open files.

    But when I check with this command - lsof | wc -l it shows - 28384.

    So which one is correct?

    My max open files limit is 65535

    ulimit -a
    open files                      (-n) 65535

    I want to know the top 5 processes that are using the most open files. I can get this from lsof, but the count shown there is very different from the other command I mentioned above.

    Can I get the details of the processes counted by this command cat /proc/sys/fs/file-nr?

    According to the link below, we cannot: How to display open file descriptors but not using lsof command

    Is there a workaround for me? I need to find which process suddenly started using more open files.

    UPDATE: Sorry for the trouble, guys. I found my mistake: I was NOT running lsof | wc -l as root, which is why I was seeing such a huge difference.

    There is still a difference between the output of file-nr and lsof | wc -l (as root): the lsof count is higher than the file-nr count. The reason is that file-nr ignores some of the directories (which lsof counts as files). I found this with a little research on Google. Anyway, thanks for all the help!

    • Admin
      Admin over 9 years
      Is lsof | wc -l really showing a negative number?
    • Admin
      Admin over 9 years
      That was not a negative sign, it's a hyphen!
  • Vasanth Nag K V
    Vasanth Nag K V over 9 years
    Updated my question, but thanks for all the info up there.
  • Thomas N
    Thomas N about 7 years
    Please read the manual pages on lsof and ulimit for the answer to your question.