Safest way to force close a file descriptor
Solution 1
Fiddling with a process with gdb is almost never safe, though it may be necessary if there is some emergency, the process needs to stay running, and all the risks and code involved are understood.
Most often I would simply terminate the process, though some cases may be different and could depend on the environment, who owns the relevant systems and processes, what the process is doing, whether there is documentation saying "okay to kill it" or "no, contact so-and-so first", etc. These details may need to be worked out in a post-mortem meeting once the dust settles.
If there is a planned migration, it would be good to check in advance whether any processes have problematic file descriptors open, so those can be dealt with in a non-emergency setting (cron jobs or other scheduled tasks that run only in the wee hours, when migrations may be done, are easily missed if you check only during daytime hours).
Write-only versus Read versus Read-Write
Your idea to reopen the file descriptor O_WRONLY is problematic, as not all file descriptors are write-only. John Viega and Matt Messier take a more nuanced approach in the "Secure Programming Cookbook for C and C++" book and handle standard input differently from standard output and standard error (p. 25, "Managing File Descriptors Safely"):
static int open_devnull(int fd) {
    FILE *f = 0;

    if (!fd) f = freopen(_PATH_DEVNULL, "rb", stdin);
    else if (fd == 1) f = freopen(_PATH_DEVNULL, "wb", stdout);
    else if (fd == 2) f = freopen(_PATH_DEVNULL, "wb", stderr);
    return (f && fileno(f) == fd);
}
In the gdb case, the descriptor (or the FILE * handle) would need to be checked for whether it is read-only, read-write, or write-only, and an appropriate replacement opened on /dev/null. Otherwise, a once read-only handle that is now write-only will cause needless errors should the process attempt to read from it.
What Could Go Wrong?
How exactly a process behaves when its file descriptors (and likely also FILE * handles) are fiddled with behind the scenes will depend on the process, and will vary from "no big deal" should that descriptor never be used again to "nightmare mode" where there is now a corrupt file somewhere due to unflushed data, a missing file-was-properly-closed indicator, or some other unanticipated problem.
For FILE * handles, the addition of a fflush(3) call before closing the handle may help, or may cause double buffering or some other issue; this is one of the several hazards of making random calls in gdb without knowing exactly what the source code does and expects. Software may also have additional layers of complexity built on top of the file descriptors or FILE * handles that may likewise need to be dealt with. Monkey patching the code could turn into a monkey wrench easily enough.
Summary
Sending a process a standard terminate signal should give it a chance to properly close out resources, just as when a system shuts down normally. Fiddling with a process via gdb will likely not close things out properly, and could make the situation very much worse.
Solution 2
Open /dev/null with O_WRONLY, then dup2 to close the offending file descriptor and reuse its descriptor number for /dev/null. This way any reads or writes to the file descriptor will fail.
If you dup a descriptor to /dev/null, any writes will not fail but succeed, and reads will succeed and return 0 (EOF). This may or may not be what you want.
On Linux, you can also open a file with flags = 3 (O_WRONLY|O_RDWR, aka O_NOACCESS), which will cause any read or write to fail with EBADF.
The file will only be available for ioctls -- which brings up a danger not talked about in the other answer and comments: reads and writes are not the only operations done on file descriptors (what about lseek or ftruncate?).
Update:
I found something better than the undocumented O_WRONLY|O_RDWR: O_PATH = 010000000 / 0x200000. According to the open(2) manpage:
O_PATH (since Linux 2.6.39)
    Obtain a file descriptor that can be used for two purposes: to indicate a location in the filesystem tree and to perform operations that act purely at the file descriptor level. The file itself is not opened, and other file operations (e.g., read(2), write(2), fchmod(2), fchown(2), fgetxattr(2), mmap(2)) fail with the error EBADF.
    The following operations can be performed on the resulting file descriptor:
    * close(2); fchdir(2) (since Linux 3.5); fstat(2) (since Linux 3.6).
    * Duplicating the file descriptor (dup(2), fcntl(2) F_DUPFD, etc.).
Reinstate Monica
Updated on September 18, 2022

Comments
-
Reinstate Monica over 1 year
Sometimes you need to unmount a filesystem or detach a loop device but it is busy because of open file descriptors, perhaps because of a smb server process. To force the unmount, you can kill the offending process (or try kill -SIGTERM), but that would close the smb connection (even though some of the files it has open do not need to be closed). A hacky way to force a process to close a given file descriptor is described here using gdb to call close(fd). This seems dangerous, however. What if the closed descriptor is recycled? The process might use the old stored descriptor, not realizing it now refers to a totally different file. I have an idea, but don't know what kind of flaws it has: using gdb, open /dev/null with O_WRONLY (edit: a comment suggested O_PATH as a better alternative), then dup2 to close the offending file descriptor and reuse its descriptor for /dev/null. This way any reads or writes to the file descriptor will fail. Like this:
sudo gdb -p 234532
(gdb) set $dummy_fd = open("/dev/null", 0x200000) // O_PATH
(gdb) p dup2($dummy_fd, offending_fd)
(gdb) p close($dummy_fd)
(gdb) detach
(gdb) quit
What could go wrong?
-
Sergiy Kolodyazhnyy over 5 years
One thing that could happen: you could lose buffered data that should have been written to that file descriptor. That of course depends on the application.
-
Mark Plotnick over 5 years
In general, I'd rather just terminate the program (and tell any service manager controlling it not to restart it automatically) instead of making it suddenly get an EOF on one of its files. What if it had read the first 10 characters of the line RunAsUid=12345 in a config file and you do the /dev/null thing right after that?
Reinstate Monica over 5 years
@MarkPlotnick if I could get it to return an access error or an I/O error, would that solve the issue you raised?
-
Reinstate Monica over 5 years
@MarkPlotnick - I haven't verified it, but the reason I open /dev/null with 2 (= O_WRONLY) is so that reads should fail with EACCESS instead of EOF.
-
Mark Plotnick over 5 years
Yes, opening it write-only would make reads fail, but that flag is 1. 2 is read-write.
-
Reinstate Monica over 5 years
Oops, I had meant O_WRONLY. I'll edit that.
pizdelect over 5 years
@afuna please update your question to use 0x200000 (O_PATH). That's the way to go if you want an "opaque" file descriptor (on which not only reads and writes, but also ioctl, lseek, ftruncate and fchmod will fail).
-
Reinstate Monica over 5 years
Thanks! The point of using O_WRONLY is specifically to cause errors - we don't want the process to get an incorrect EOF, like @MarkPlotnick commented above.
Andrew Henle over 5 years
"On linux, you can also open a file with flags = 3 (O_WRONLY|O_RDWR aka O_NOACCESS)" Ooof. That directly violates the POSIX standard: "Applications shall specify exactly one of the first five values (file access modes) below in the value of oflag"
" -
pizdelect over 5 years
I did not claim that was portable ;-) It's an obscure hack, only used in lilo if I'm not mistaken.
Admin about 2 years
But it sure is fun bleeding a process of its handles and seeing it slowly die off; I do that on Windows all the time with Process Explorer.
-
Admin about 2 years
I just wanted a simpler way of yanking a file handle from unix and CLOSING it, muahaha. (I'm almost sure Process Explorer is injecting a debugger somewhat similar to gdb to do that; I mean, you could kill the handle from the kernel side.)
-
Admin about 2 years
Linux has its own flavour of POSIX, it's fine.