Understanding mmap


Answering things in order:

  1. It returns a pointer to a location in the process's virtual address space. Virtual address space is allocated, but the file is not locked in any way unless you explicitly lock it (also note that locking the memory is not the same as locking the region in the file). An efficient implementation of mmap() is only practical because of paging and virtual memory; without them, the whole region would have to be read into memory before the call completes.
  2. Not exactly. This ties into the next answer, so I'll cover it there.
  3. Kind of. In most cases, mmap() gives you access to that file's data in the page cache: direct access with MAP_SHARED, copy-on-write access with MAP_PRIVATE. As a result, the usual cache restrictions on data lifetime apply: if the system needs space, pages can be dropped from the cache (after being written out if they're dirty) and will need to be faulted in again on the next access.
  4. No, because of how virtual memory works. Each process has its own virtual address space with its own mappings, so an address taken from one process means nothing in another. Every program that wants to share the data has to call mmap() on the same file (or shared memory segment), and they all have to use the MAP_SHARED flag.
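
To make the MAP_SHARED versus MAP_PRIVATE distinction concrete, here is a minimal sketch, assuming Linux/glibc and a writable /tmp. The helper name file_sees_mapped_write and the scratch file path are made up for illustration. It writes one byte through a mapping and then checks with pread() whether the underlying file changed.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* glibc: expose mkstemp(), pread(), ftruncate() */
#endif
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a write made through the mapping shows up
 * in the file itself, 0 if it does not, and -1 on a setup error. */
int file_sees_mapped_write(int flag)
{
    assert(flag == MAP_SHARED || flag == MAP_PRIVATE);

    char path[] = "/tmp/mmap-demo-XXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                  /* the file lives on until fd is closed */

    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    int seen = -1;

    if (ftruncate(fd, (off_t)len) == 0) {
        /* No file data is read here; pages are faulted in on first access. */
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, flag, fd, 0);
        if (p != MAP_FAILED) {
            p[0] = 'X';            /* MAP_PRIVATE: this triggers a private copy */
            msync(p, len, MS_SYNC);    /* only has an effect for MAP_SHARED */

            char c = 0;
            if (pread(fd, &c, 1, 0) == 1)
                seen = (c == 'X');
            munmap(p, len);
        }
    }
    close(fd);
    return seen;
}
```

On Linux, file_sees_mapped_write(MAP_SHARED) should return 1 and file_sees_mapped_write(MAP_PRIVATE) should return 0, because the private write lands in a copied page that never reaches the file.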

It's worth noting that mmap() doesn't just work on files. You can also do other things with it, such as:

  • Directly map device memory (if you have sufficient privileges). Many embedded systems use this to avoid writing kernel-mode drivers for new hardware.
  • Map shared memory segments.
  • Explicitly map huge pages.
  • Allocate memory that you can then call madvise(2) on, which in turn lets you do useful things such as preventing data from being copied to a child process on fork(2) or marking data for KSM, Linux's memory deduplication feature.
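
As an illustration of the last bullet, here is a rough sketch, assuming Linux (MADV_DONTFORK is Linux-specific). The helper name child_fate_on_access is hypothetical. It allocates one anonymous page with mmap(), optionally marks it with madvise(MADV_DONTFORK), forks, and reports what happens when the child touches the page.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* glibc: expose MAP_ANONYMOUS, MADV_DONTFORK */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of the signal that killed the
 * child, 0 if the child read the page and exited normally, -1 on error. */
int child_fate_on_access(int use_dontfork)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    /* Anonymous mapping: plain memory with no file behind it. */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    p[0] = 42;

    if (use_dontfork && madvise(p, len, MADV_DONTFORK) != 0) {
        munmap(p, len);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(p, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: with MADV_DONTFORK the range is not part of its address
         * space, so this read faults. */
        volatile char *q = p;
        _exit(q[0] == 42 ? 0 : 1);
    }

    int status = 0;
    int fate = -1;
    if (waitpid(pid, &status, 0) == pid)
        fate = WIFSIGNALED(status) ? WTERMSIG(status) : 0;
    munmap(p, len);
    return fate;
}
```

On Linux, child_fate_on_access(0) should return 0 (the child inherits the page copy-on-write), while child_fate_on_access(1) should return SIGSEGV because the range was never copied into the child.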

Author: john

Updated on September 18, 2022

Comments

  • john
    john over 1 year

    I was going through documentation regarding mmap here and tried to implement it using this video.

    I have a few questions regarding its implementation.

    1. Does mmap provide a mapping of a file and return a pointer to that location in physical memory, or does it return the address of a mapping table? And does it allocate and lock space for that file too?

    2. Once the file is stored on that location in memory, does it stay there till munmap is called?

    3. Is the file even moved to memory, or is it just a mapping table that serves as a redirection while the file actually stays in virtual memory (on disk)?

    4. Assuming it is moved to memory, can other processes access that space to read data if they have the address?

    • Basile Starynkevitch
      Basile Starynkevitch over 6 years
      You don't implement mmap but you are using it
  • john
    john over 6 years
    Thanks for such a detailed answer. Just a clarification on point 1: if I try to access the returned virtual memory address, will it first pass through the address map created for the process and then be redirected to the actual location, which may be disk, cache, or memory? Secondly, if the MAP_SHARED flag is set and the tables for both processes return the same physical address, can the file then be shared?
  • Austin Hemmelgarn
    Austin Hemmelgarn over 6 years
    1. Yes, it will use the virtual memory mapping table. 2. The address in each process doesn't matter; what matters is that they have mapped the same region of the same file with MAP_SHARED.
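
The second point can be sketched roughly as follows, assuming Linux/glibc and a writable /tmp. The helper names map_shared and parent_sees_child_write are made up for illustration. Parent and child each make their own open() and mmap() call on the same file, and neither cares which address the other one got.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* glibc: expose mkstemp(), ftruncate() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: open `path` and map its first `len` bytes MAP_SHARED. */
static char *map_shared(const char *path, size_t len)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                     /* the mapping stays valid after close() */
    return p == MAP_FAILED ? NULL : p;
}

/* Returns 1 if the parent sees the child's write, 0 if not, -1 on error. */
int parent_sees_child_write(void)
{
    static const char msg[] = "hello from child";
    char path[] = "/tmp/mmap-ipc-XXXXXX";    /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    assert(len >= sizeof msg);
    int ok = ftruncate(fd, (off_t)len) == 0;
    close(fd);
    if (!ok) {
        unlink(path);
        return -1;
    }

    int result = -1;
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: creates its own mapping and writes through it. */
        char *c = map_shared(path, len);
        if (c == NULL)
            _exit(1);
        memcpy(c, msg, sizeof msg);
        _exit(0);
    }
    if (pid > 0) {
        /* Parent: a separate mmap() call, made after the fork, so nothing
         * is inherited from or handed to the child. */
        char *p = map_shared(path, len);
        int status = 0;
        if (waitpid(pid, &status, 0) == pid && p != NULL &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            result = (strcmp(p, msg) == 0);
        if (p != NULL)
            munmap(p, len);
    }
    unlink(path);
    return result;
}
```

On Linux, parent_sees_child_write() should return 1: both mappings are backed by the same page-cache pages, whatever virtual address each process happened to receive.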