How does remap_pfn_range remap kernel memory to user space?


Solution 1

It's simple really: kernel memory (usually) has a page table entry with an architecture-specific bit that says: "this page table entry is only valid while the CPU is in kernel mode".

What remap_pfn_range does is create another page table entry, at a different virtual address, that points to the same physical memory page but does not have that bit set.

Usually, it's a bad idea btw :-)
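The idea in this answer can be sketched with a toy model. This is not kernel code: the `toy_*` names, the flag values, and the 16-entry table are all made up for illustration, and real PTE layouts (including whether the bit means "user allowed" or "kernel only") are architecture specific. Here a set `TOY_PTE_USER` bit means user mode may use the entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: flag values and layout are hypothetical. */
#define TOY_PTE_PRESENT 0x1u
#define TOY_PTE_USER    0x4u   /* set: user mode may use this entry */
#define TOY_NPAGES      16

struct toy_pte {
    uint64_t pfn;     /* physical frame number the entry points at */
    unsigned flags;   /* TOY_PTE_* bits */
};

struct toy_page_table {
    struct toy_pte slot[TOY_NPAGES];   /* index = virtual page number */
};

/* Would an access to virtual page vpn succeed in the given CPU mode? */
bool toy_can_access(const struct toy_page_table *pt, size_t vpn, bool user_mode)
{
    if (vpn >= TOY_NPAGES)
        return false;
    const struct toy_pte *pte = &pt->slot[vpn];
    if (!(pte->flags & TOY_PTE_PRESENT))
        return false;
    if (user_mode && !(pte->flags & TOY_PTE_USER))
        return false;
    return true;
}

/* Install a second entry at user_vpn that points at the same frame as
 * kernel_vpn but is usable from user mode. The kernel entry is untouched. */
int toy_remap(struct toy_page_table *pt, size_t kernel_vpn, size_t user_vpn)
{
    if (kernel_vpn >= TOY_NPAGES || user_vpn >= TOY_NPAGES)
        return -1;
    if (!(pt->slot[kernel_vpn].flags & TOY_PTE_PRESENT))
        return -1;
    pt->slot[user_vpn].pfn = pt->slot[kernel_vpn].pfn;
    pt->slot[user_vpn].flags = TOY_PTE_PRESENT | TOY_PTE_USER;
    return 0;
}

/* Walk through the scenario; returns 0 if every step behaves as described. */
int toy_demo(void)
{
    struct toy_page_table pt = {0};

    pt.slot[3].pfn = 42;                    /* kernel-only mapping */
    pt.slot[3].flags = TOY_PTE_PRESENT;
    assert(toy_can_access(&pt, 3, false));  /* kernel mode: allowed */
    assert(!toy_can_access(&pt, 3, true));  /* user mode: refused */

    if (toy_remap(&pt, 3, 9) != 0)
        return -1;
    assert(toy_can_access(&pt, 9, true));   /* new entry works in user mode */
    assert(!toy_can_access(&pt, 3, true));  /* old entry is still kernel-only */
    return pt.slot[9].pfn == pt.slot[3].pfn ? 0 : -1;
}
```

Calling `toy_demo()` from a `main()` exercises the whole sequence. Note that the original kernel-only entry never changes; user space reaches the frame only through the second entry.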

Solution 2

The core of the mechanism is the MMU's page tables:

[Figure: http://windowsitpro.com/content/content/3686/figure_01.gif]

or this:

[Figure: image not preserved]

Both pictures above show characteristics of the x86 hardware MMU and have nothing to do with the Linux kernel.

The figures below show how the VMAs are linked to the process's task_struct:

[Figure: http://image9.360doc.com/DownloadImg/2010/05/0320/3083800_2.gif]

[Figure: image not preserved (source: slideplayer.com)]

And looking into the function itself here:

http://lxr.free-electrons.com/source/mm/memory.c#L1756

The data in physical memory can be accessed by the kernel through the kernel's PTE, as shown below:

[Figure: page protection flags in the Linux kernel (source: tldp.org)]

But after calling remap_pfn_range(), a new PTE is derived for the existing kernel memory, with different page protection flags, so that the memory can be accessed from userspace. The process's VMA is updated to use this PTE to access the same memory, which avoids wasting memory on a copy. The kernel and userspace PTEs have different attributes, which are used to control access to the physical memory, and the VMA also specifies attributes at the process level:

vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
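The "same physical memory, two mappings, different protection, no copy" effect can be observed from plain user space. The sketch below is only an analogy and does not call remap_pfn_range(): it maps one page of a temporary file twice with MAP_SHARED, once read-write and once read-only, and checks that a write through the first mapping is visible through the second. The function name `shared_page_demo` is made up for this example, and it assumes a POSIX system where `tmpfile()` gives a file that can be mapped.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 0 if data written through one mapping shows up in the other. */
int shared_page_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    FILE *f = tmpfile();                 /* temporary backing file */
    if (f == NULL)
        return -1;
    int fd = fileno(f);
    if (ftruncate(fd, page) != 0) {      /* make it exactly one page long */
        fclose(f);
        return -1;
    }

    /* Two virtual addresses, two sets of protection bits, one page. */
    char *rw = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *ro = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
    if (rw == MAP_FAILED || ro == MAP_FAILED) {
        if (rw != MAP_FAILED)
            munmap(rw, (size_t)page);
        if (ro != MAP_FAILED)
            munmap(ro, (size_t)page);
        fclose(f);
        return -1;
    }

    strcpy(rw, "hello");                 /* write through the RW mapping */
    int ok = (rw != ro) && strcmp(ro, "hello") == 0;  /* read through RO */

    munmap(rw, (size_t)page);
    munmap(ro, (size_t)page);
    fclose(f);
    return ok ? 0 : -1;
}
```

Writing through `ro` instead would fault, because the protection belongs to each mapping rather than to the underlying page. That is the same separation the answer describes between the kernel's PTE and the one created for the process.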

Author by

Admin

Updated on July 05, 2022

Comments

  • Admin
    Admin almost 2 years

    The remap_pfn_range function (used in the mmap call of a driver) can be used to map kernel memory to user space. How is it done? Can anyone explain the precise steps? Kernel mode is a privileged mode (PM) while user space is non-privileged (NPM). In PM the CPU can access all memory, while in NPM some memory is restricted and cannot be accessed by the CPU. When remap_pfn_range is called, how does a range of memory that was restricted to PM become accessible to user space?

    Looking at the remap_pfn_range code, there is a pgprot_t struct. This is a struct related to protection mapping. What is protection mapping? Is it the answer to the above question?

  • user31986
    user31986 almost 9 years
    "part of that coincide with that of kernel's page table, which is NOT duplicated for each process" when you say that do you mean there is only one page-table copy for the kernel mapping that is used by all processes? Could you please elaborate more on how that could be done?
  • Peter Teoh
    Peter Teoh almost 9 years
    Perhaps read this: turkeyland.net/projects/overflow/intro.php. From the picture you can see that one process has ONE set of page tables, whose base address is loaded into the CR3 register. All the virtual addresses (kernel addresses specifically) that are to be shared among different processes will have the same value, pointing to the same physical page. Hope that clears it up.
  • Adam Miller
    Adam Miller over 8 years
    How does one hold the "mm semaphore"?
  • Peter Teoh
    Peter Teoh over 7 years
    This semaphore is per-process, but multiple concurrent threads inside the process may acquire it, so locking is necessary: take it with down_read() and release it with up_read().