Process running out of open file handles

The best way to tell how many open file descriptors your process has is to use:

$ ls /proc/8301/fd/ | wc -l

(Assuming PID 8301, like in your log.)

Running lsof will traverse the whole /proc tree and try to resolve the names of all files (these are pseudo-symlinks, each of which needs a readlink call to resolve), so it will take a long time depending on how busy your machine is; by the time you look at the result, it's possible everything has changed already. Using ls /proc/${pid}/fd/ is quick (a single readdir call), so it's much more likely to capture something close to the current situation.
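
If you want to track how the count evolves over time, a minimal sketch (assuming PID 8301 as above, and that watch is available) is:

$ watch -n 1 'ls /proc/8301/fd/ | wc -l'

which refreshes the count every second, so you can see whether descriptors are steadily leaking or spiking under load.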

Regarding solving the problem, you may want to consider increasing the number of file descriptors allowed for your service, which you can do by setting the LimitNOFILE= directive in your systemd unit file, as sketched below.
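
For example (a sketch only: the unit name sender.service is an assumption taken from the "sender" prefix in your log, and 8192 is just a sample value), you could create a drop-in with:

$ sudo systemctl edit sender.service

and add:

[Service]
LimitNOFILE=8192

then restart the service:

$ sudo systemctl restart sender.service

You can confirm the new limit with systemctl show sender.service -p LimitNOFILE, or by re-checking /proc/${pid}/limits for the running process.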


Comments

  • Marged
    Marged over 1 year

    My application, which is based on two Java processes that exchange data over an HTTP connection, runs out of file handles and produces this error message:

    Aug 14 11:27:40 server sender[8301]: java.io.IOException: Too many open files
    Aug 14 11:27:40 server sender[8301]: at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
    Aug 14 11:27:40 server sender[8301]: at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
    Aug 14 11:27:40 server sender[8301]: at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
    Aug 14 11:27:40 server sender[8301]: at org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:455)
    Aug 14 11:27:40 server sender[8301]: at java.lang.Thread.run(Thread.java:748)
    

    Both processes are under the control of systemd. I checked the processes using cat /proc/5882/limits; the limits are defined like this:

    Limit                     Soft Limit           Hard Limit           Units
    Max cpu time              unlimited            unlimited            seconds
    Max file size             unlimited            unlimited            bytes
    Max data size             unlimited            unlimited            bytes
    Max stack size            8388608              unlimited            bytes
    Max core file size        0                    unlimited            bytes
    Max resident set          unlimited            unlimited            bytes
    Max processes             63434                63434                processes
    Max open files            4096                 4096                 files
    Max locked memory         65536                65536                bytes
    Max address space         unlimited            unlimited            bytes
    Max file locks            unlimited            unlimited            locks
    Max pending signals       63434                63434                signals
    Max msgqueue size         819200               819200               bytes
    Max nice priority         0                    0
    Max realtime priority     0                    0
    Max realtime timeout      unlimited            unlimited            us
    

    When I run lsof | grep pid | wc -l I get fewer than 2000 entries (I run lsof this way because of information from "Discrepancy with lsof command when trying to get the count of open files per process").

    I don't have the slightest idea what I could check or increase further.

    • thrig
      thrig over 5 years
      Linux has a variety of knobs (unix.stackexchange.com/questions/84227); have you tuned them all?
    • ajeh
      ajeh over 5 years
      Wrap your method in a try/catch and dump the number of file handles when an exception is trapped. This is a question for SO, not U&L.
    • Marged
      Marged over 5 years
      @ajeh I'm searching for a method to display the correct number of file handles at the Linux level, which is why I didn't post the question on SO.
    • Wildcard
      Wildcard over 5 years
      You linked to the post about the discrepancy with lsof when trying to get count of open files per process, but why not use the solution shown there? lsof -aKp "$pid"
    • Marged
      Marged over 5 years
      @Wildcard Both approaches effectively return the same result; one of them is slightly faster. My approach lets me grep for the path I know my files are stored in and possibly find other PIDs using them too.
  • Marged
    Marged over 5 years
    The limit is 4096 and lsof shows fewer than 2000, so I assume raising the limit would not solve the problem.
  • filbranden
    filbranden over 5 years
    @Marged It's hard to tell; depending on how busy your service is, files might be coming and going faster than you can monitor them... Try monitoring it with the ls /proc/${pid}/fd/ | wc -l command, which will give you a much better estimate of the current number of open files than running lsof on all processes on the machine. Furthermore, there's really not much of a downside to increasing the open file limit, so bumping it up to 8192 or maybe even something like 32768 might be just fine, regardless... Good luck!