How exactly do fopen() and fclose() work?

20,877

Solution 1

Disclaimer: I'm mostly unqualified to talk about this. It'd be great if someone more knowledgeable posted too.

Files

The details of how things like fopen() are implemented will depend a lot on the operating system (UNIX has fopen() too, for example). Even versions of Windows can differ a lot from each other.

I'll give you my idea of how it works, but it's basically speculation.

  • When called, fopen allocates a FILE object on the heap. Note that the data in a FILE object is undocumented - FILE is an opaque struct, you can only use pointers-to-FILE from your code.
  • The FILE object gets initialized. For example, something like fillLevel = 0 where fillLevel is the amount of buffered data that hasn't been flushed yet.
  • A call to the filesystem driver (FS driver) opens the file and provides a handle to it, which is put somewhere in the FILE struct.
    • To do this, the FS driver figures out the HDD address corresponding to the requested path, and internally remembers this HDD address, so it can later fulfill calls to fread etc.
      • The FS driver uses a sort of indexing table (stored on the HDD) to figure out the HDD address corresponding to the requested path. This will differ a lot depending on the filesystem type - FAT32, NTFS and so on.
      • The FS driver relies on the HDD driver to perform the actual reads and writes to the HDD.
  • A cache might be allocated in RAM for the file. This way, if the user requests 1 byte to be read, the C runtime library may read a whole block (say, a KB) in one go, so later reads can be served from RAM without another call to the OS.
  • A pointer to the allocated FILE gets returned from fopen.
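The steps above can be sketched with a toy stream type. This is only an illustration under assumptions: MyFile, my_fopen_read, my_fgetc and my_fclose are made-up names, the real FILE layout is private to each C library, and the sketch uses the POSIX calls open/read/close where a Windows runtime would end up in CreateFileW and friends. Here fill_level counts bytes buffered for reading.

```c
#include <fcntl.h>    /* open, O_RDONLY */
#include <stdio.h>    /* EOF */
#include <stdlib.h>   /* malloc, free */
#include <unistd.h>   /* read, close, ssize_t */

/* Hypothetical stand-in for FILE; the real layout is private to the C library. */
typedef struct {
    int fd;                      /* handle handed out by the OS */
    size_t fill_level;           /* number of valid bytes in buffer */
    size_t pos;                  /* index of the next unread byte */
    unsigned char buffer[1024];  /* read cache */
} MyFile;

/* Open path for reading; returns NULL on failure, like fopen. */
MyFile *my_fopen_read(const char *path)
{
    MyFile *f = malloc(sizeof *f);   /* the stream object lives on the heap */
    if (f == NULL)
        return NULL;
    f->fd = open(path, O_RDONLY);    /* ask the OS for a handle */
    if (f->fd < 0) {
        free(f);
        return NULL;
    }
    f->fill_level = 0;               /* nothing buffered yet */
    f->pos = 0;
    return f;
}

/* Return the next byte, or EOF at end of file or on error. */
int my_fgetc(MyFile *f)
{
    if (f->pos == f->fill_level) {   /* cache exhausted: refill it in one read */
        ssize_t n = read(f->fd, f->buffer, sizeof f->buffer);
        if (n <= 0)
            return EOF;
        f->fill_level = (size_t)n;
        f->pos = 0;
    }
    return f->buffer[f->pos++];
}

/* Release the OS handle and the heap object; returns 0 on success. */
int my_fclose(MyFile *f)
{
    int rc = close(f->fd);
    free(f);
    return rc == 0 ? 0 : EOF;
}
```

Reading a file one byte at a time with my_fgetc only reaches the OS once per 1024 bytes. Skipping my_fclose would leak both the heap block and the descriptor, which is the point made below.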

If you open a file and never close it, some things will leak, yes. The FILE struct will leak, the FS driver's internal data will leak, the cache (if any) will leak too.

But memory is not the only thing that will leak. The file itself stays open from the OS's point of view, even though your program no longer uses it. This can become a problem on Windows, for example, where a file that is open for writing may not be openable for writing again until it has been closed (depending on the sharing mode it was opened with).

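If your app exits without closing some file, most OSes will clean up after it. But that's not much use, because your app will probably run for a long time before exiting, and during that time, it will still need to properly close all files. Also, you can't fully rely on cleanup at exit: the C Standard only guarantees that open streams are flushed and closed on normal termination (returning from main or calling exit()). If the program calls abort() or crashes, buffered data that was never flushed may be lost.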
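To see why closing matters for data and not just for memory, here is a small sketch using only standard C. The helper names file_size and write_and_close are made up for this example. It writes a short string through a fully buffered stream and measures the file on disk before and after fclose.

```c
#include <stdio.h>

/* Size of the file at path in bytes, or -1 if it cannot be examined. */
long file_size(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    long size = -1;
    if (fseek(fp, 0, SEEK_END) == 0)
        size = ftell(fp);
    fclose(fp);
    return size;
}

/* Write text to path and report the on-disk size before and after fclose.
   Returns 0 on success, -1 on failure. */
int write_and_close(const char *path, const char *text,
                    long *size_before_close, long *size_after_close)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    /* Request full buffering so a small write stays in the stream's buffer. */
    if (setvbuf(fp, NULL, _IOFBF, 4096) != 0 || fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    *size_before_close = file_size(path);  /* data is likely still in RAM */
    if (fclose(fp) != 0)                    /* flushes, then releases the handle */
        return -1;
    *size_after_close = file_size(path);
    return 0;
}
```

On common implementations the size before fclose is smaller than the size after it (often zero), because the text is still sitting in the stream's buffer. Checking the return value of fclose is worthwhile for the same reason: a failed flush is reported there.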

Sockets

A socket's implementation will depend on the type of socket - network listen socket, network client socket, inter-process socket, etc.

A full discussion of all types of sockets and their possible implementations wouldn't fit here.

In short:

  • just like a file, a socket keeps some info in RAM, describing things relevant to its operation, such as the IP of the remote host.
  • it can also have caches in RAM for performance reasons
  • it can hold onto finite OS resources such as open ports, making them unavailable for use by other apps

All these things will leak if you don't close the socket.
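The point about ports can be demonstrated with Berkeley sockets. This is a sketch assuming a POSIX system; on Windows the same idea applies, but you would call WSAStartup first and use closesocket instead of close. The helper names bind_tcp_socket and bound_port are made up for this example.

```c
#include <arpa/inet.h>   /* htons, htonl, ntohs */
#include <netinet/in.h>  /* struct sockaddr_in, INADDR_ANY */
#include <string.h>      /* memset */
#include <sys/socket.h>  /* socket, bind, getsockname */
#include <unistd.h>      /* close */

/* Create a TCP socket bound to the given port on any local address.
   Port 0 lets the OS pick a free port. Returns the descriptor, or -1. */
int bind_tcp_socket(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) != 0) {
        close(fd);   /* do not leak the descriptor on failure */
        return -1;
    }
    return fd;
}

/* Port number a bound socket currently holds, or 0 on error. */
unsigned short bound_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    if (getsockname(fd, (struct sockaddr *)&addr, &len) != 0)
        return 0;
    return ntohs(addr.sin_port);
}
```

While the first socket is open, a second bind_tcp_socket call with the port reported by bound_port fails, typically with EADDRINUSE. After close on the first descriptor, the same port can be bound again.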

The role of the OS in sockets

The OS implements the TCP/IP standard, Ethernet and other protocols needed to schedule/dispatch/accept connections and to make them available to user code via an API like Berkeley Sockets.

The OS will delegate network I/O (communication with the network card) to the network driver.

Solution 2

With VS2017 on Windows 10, you can see the internals in the call stack:

ntdll.dll!NtCreateFile()   Unknown
KernelBase.dll!CreateFileInternal() Unknown
KernelBase.dll!CreateFileW()   Unknown
ucrtbased.dll!create_file(const wchar_t * const path, _SECURITY_ATTRIBUTES * const security_attributes, const `anonymous-namespace'::file_options options) Line 234 C++
ucrtbased.dll!_wsopen_nolock(int * punlock_flag, int * pfh, const wchar_t * path, int oflag, int shflag, int pmode, int secure) Line 702    C++
ucrtbased.dll!_sopen_nolock(int * punlock_flag, int * pfh, const char * path, int oflag, int shflag, int pmode, int secure) Line 852    C++
ucrtbased.dll!__crt_char_traits<char>::tsopen_nolock<int * __ptr64,int * __ptr64,char const * __ptr64 const & __ptr64,int const & __ptr64,int,int const & __ptr64,int>(int * && <args_0>, int * && <args_1>, const char * const & <args_2>, const int & <args_3>, int && <args_4>, const int & <args_5>, int && <args_6>) Line 109  C++
ucrtbased.dll!common_sopen_dispatch<char>(const char * const path, const int oflag, const int shflag, const int pmode, int * const pfh, const int secure) Line 172  C++
ucrtbased.dll!_sopen_dispatch(const char * path, int oflag, int shflag, int pmode, int * pfh, int secure) Line 204  C++
ucrtbased.dll!_sopen_s(int * pfh, const char * path, int oflag, int shflag, int pmode) Line 895 C++
ucrtbased.dll!__crt_char_traits<char>::tsopen_s<int * __ptr64,char const * __ptr64 const & __ptr64,int const & __ptr64,int const & __ptr64,int>(int * && <args_0>, const char * const & <args_1>, const int & <args_2>, const int & <args_3>, int && <args_4>) Line 109 C++
ucrtbased.dll!common_openfile<char>(const char * const file_name, const char * const mode, const int share_flag, const __crt_stdio_stream stream) Line 38   C++
ucrtbased.dll!_openfile(const char * file_name, const char * mode, int share_flag, _iobuf * public_stream) Line 67  C++
ucrtbased.dll!__crt_char_traits<char>::open_file<char const * __ptr64 const & __ptr64,char const * __ptr64 const & __ptr64,int const & __ptr64,_iobuf * __ptr64>(const char * const & <args_0>, const char * const & <args_1>, const int & <args_2>, _iobuf * && <args_3>) Line 109 C++
ucrtbased.dll!common_fsopen<char>(const char * const file_name, const char * const mode, const int share_flag) Line 54  C++
ucrtbased.dll!fopen(const char * file, const char * mode) Line 104  C++

Most of the code is in:

C:\Program Files (x86)\Windows Kits\10\Source\10.0.17763.0\ucrt\stdio\fopen.cpp
C:\Program Files (x86)\Windows Kits\10\Source\10.0.17763.0\ucrt\stdio\openfile.cpp
C:\Program Files (x86)\Windows Kits\10\Source\10.0.17763.0\ucrt\lowio\open.cpp

In _wsopen_nolock in open.cpp, there is:

// Allocate the CRT file handle.  Note that if a handle is allocated, it is
// locked when it is returned by the allocation function.  It is our caller's
// responsibility to unlock the file handle (we do not unlock it before
// returning).
*pfh = _alloc_osfhnd();

Finally, it calls the Windows API CreateFileW, which calls the lower-level native API NtCreateFile in ntdll.dll, whose assembly code is:

NtCreateFile:
00007FFFD81A0120 mov         r10,rcx  
00007FFFD81A0123 mov         eax,55h  
00007FFFD81A0128 test        byte ptr[7FFE0308h],1  
00007FFFD81A0130 jne         NtCreateFile+15h(07FFFD81A0135h)
00007FFFD81A0132 syscall
00007FFFD81A0134 ret
00007FFFD81A0135 int         2Eh  
00007FFFD81A0137 ret
00007FFFD81A0138 nop         dword ptr[rax + rax]

So in the end it executes the syscall instruction, which transfers control to kernel code.
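The same layering exists on other systems. As a rough Linux analogue (an assumption for illustration, not what the Windows runtime does), the sketch below skips the C library's fopen and open wrappers and asks the kernel directly through the generic syscall() function, which ends in the same kind of syscall instruction. The helper names raw_open_readonly and raw_close are made up.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE        /* makes glibc declare syscall() in <unistd.h> */
#endif
#include <fcntl.h>         /* AT_FDCWD, O_RDONLY */
#include <sys/syscall.h>   /* SYS_openat, SYS_close */
#include <unistd.h>        /* syscall */

/* Open path read-only by invoking the openat system call directly.
   Returns a file descriptor, or -1 with errno set on failure. */
long raw_open_readonly(const char *path)
{
    return syscall(SYS_openat, AT_FDCWD, path, O_RDONLY);
}

/* Close a descriptor through the close system call. Returns 0 on success. */
long raw_close(long fd)
{
    return syscall(SYS_close, (int)fd);
}
```

The descriptor returned by raw_open_readonly plays the role of the handle that the C library stores inside its stream object.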

Author by

Fabian

Updated on August 25, 2022

Comments

  • Fabian
    Fabian over 1 year

    I was just wondering about the functions fopen, fclose, socket and closesocket. When calling fopen or opening a socket, what exactly is happening (especially memory wise)?

    Can opening files/sockets without closing them cause memory leaks?

    And third, how are sockets created and what do they look like memory wise?

    I'm also interested in the role of the operating system (Windows) in reading the sockets and sending the data.