memory limit of the Linux kernel


Solution 1

The 1 GiB limit for Linux kernel memory in a 32-bit system is a consequence of 32-bit addressing, and it's a pretty stiff limit. It's not impossible to change, but it's there for a very good reason; changing it has consequences.

Let's take the wayback machine to the early 1990s, when Linux was being created. Back in those days, we'd have arguments about whether Linux could be made to run in 2 MiB of RAM or if it really needed 4 whole MiB. Of course, the high-end snobs were all sneering at us, with their 16 MiB monster servers.

What does that amusing little vignette have to do with anything? In that world, it's easy to make decisions about how to divide up the 4 GiB address space you get from simple 32-bit addressing. Some OSes just split it in half, treating the top bit of the address as the "kernel flag": addresses 0 through 2³¹−1 had the top bit cleared and were for user-space code, and addresses 2³¹ through 2³²−1 had the top bit set and were for the kernel. You could just look at the address and tell: 0x80000000 and up, it's kernel space; otherwise, it's user space.

As PC memory sizes ballooned toward that 4 GiB memory limit, this simple 2/2 split started to become a problem. User space and kernel space both had good claims on lots of RAM, but since our purpose in having a computer is generally to run user programs, rather than to run kernels, OSes started playing around with the user/kernel divide. The 3/1 split is a common compromise.

As to your question about physical vs virtual, it actually doesn't matter. Technically speaking, it's a virtual memory limit, but that's just because Linux is a VM-based OS. Installing 32 GiB of physical RAM won't change anything, nor will it help to swapon a 32 GiB swap partition. No matter what you do, a 32-bit Linux kernel will never be able to address more than 4 GiB simultaneously.

(Yes, I know about PAE. Now that 64-bit OSes are finally taking over, I hope we can start forgetting that nasty hack. I don't believe it can help you in this case anyway.)

The bottom line is that if you're running into the 1 GiB kernel VM limit, you can rebuild the kernel with a 2/2 split, but that directly impacts user space programs.
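For reference, on 32-bit x86 kernels of that era the split is a build-time choice. A sketch of the relevant Kconfig options (names from the 2.6-era x86 tree; check your own kernel's `make menuconfig` under "Processor type and features" before relying on them):

```
# 3/1 split (the usual default): 3 GiB user space, 1 GiB kernel
CONFIG_VMSPLIT_3G=y
# CONFIG_VMSPLIT_2G is not set   # 2/2 split: more kernel space, less user space
# CONFIG_VMSPLIT_1G is not set   # 1/3 split: rarely what you want
CONFIG_PAGE_OFFSET=0xC0000000    # where kernel space begins under the 3/1 split
```

Switching to `CONFIG_VMSPLIT_2G` moves `PAGE_OFFSET` down to 0x80000000, which is exactly what takes a gigabyte away from every user process.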

64-bit really is the right answer.

Solution 2

I want to add a little to Warren Young's excellent answer, because things are actually worse than he writes.

The 1 GB kernel address space is further divided into two parts: 128 MB for vmalloc and 896 MB for lowmem. Never mind what it actually means. When allocating memory, kernel code must choose which of these it wants. You can't just get memory from whichever pool has free space.

If you choose vmalloc, you're limited to 128 MB. Now 1 GB doesn't look so bad...

If you choose lowmem, you're limited to 896 MB. Not so far from 1 GB, but in this case, all allocations are rounded up to the next power of 2. So a 2.3 MB allocation actually consumes 4 MB. Also, you can't allocate more than 4 MB in one call when using lowmem.

64-bit really is the right answer.



Author: Andrew Falanga

Updated on September 18, 2022

Comments

  • Andrew Falanga
    Andrew Falanga almost 2 years

I have a perplexing problem. I have a library which uses sg for executing customized CDBs. There are a couple of systems which routinely have issues with memory allocation in sg. Usually, the sg driver has a hard limit of around 4 MB, but we're seeing it on these few systems with ~2.3 MB requests. That is, the CDBs are preparing to allocate for a 2.3 MB transfer. There shouldn't be any issue here: 2.3 < 4.0.

Now, the profile of the machine. It is a 64-bit CPU but runs CentOS 6.0 32-bit (I didn't build them, nor did I have anything to do with this decision). The kernel version for this CentOS distro is 2.6.32. They have 16 GB of RAM.

Here is what the memory usage looks like on the system (though, because this error occurs during automated testing, I have not yet verified whether this reflects the state of the system when this errno is returned from sg).

    top - 00:54:46 up 5 days, 22:05,  1 user,  load average: 0.00, 0.01, 0.21
    Tasks: 297 total,   1 running, 296 sleeping,   0 stopped,   0 zombie
    Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Mem:  15888480k total,  9460408k used,  6428072k free,   258280k buffers
    Swap:  4194296k total,        0k used,  4194296k free,  8497424k cached
    

I found this article from Linux Journal which is about allocating memory in the kernel. The article is dated but does seem to pertain to 2.6 (some comments about the author at the head). The article mentions that the kernel is limited to about 1 GB of memory (though it's not entirely clear from the text whether that 1 GB is for physical and virtual each, or in total). I'm wondering if this is an accurate statement for 2.6.32. Ultimately, I'm wondering if these systems are hitting this limit.

    Though this isn't really an answer to my problem, I'm wondering about the veracity of the claim for 2.6.32. So then, what is the actual limit of memory for the kernel? This may need to be a consideration for troubleshooting. Any other suggestions are welcome. What makes this so baffling is that these systems are identical to many others which do not show this same problem.

  • Andrew Falanga
    Andrew Falanga over 9 years
Thanks. This writeup is great. I have run into the 2/2 split commonly used in Windows. At that time, I learned that Linux used a 3/1 split. I wish I'd thought of that when reading the article; I think I would have connected the dots. So ... it sounds like I'll have to keep this in mind. It's probably not far out of reach to think these systems are hitting the limits, considering the nature of the tests. The big question is, why aren't the other systems also experiencing this? Thanks again.
  • Warren Young
    Warren Young over 9 years
    @AndrewFalanga: Actually, modern Windows uses a fuzzy 3/1 split, too.
  • dmckee --- ex-moderator kitten
    dmckee --- ex-moderator kitten over 9 years
    Some of us were able to combine the memory from three different machines inherited from the SSC to get a 12 MB server. So much memory we could do anything we wanted...
  • user
    user over 9 years
    "Yes, I know about the x86 segmented memory model. Now that 32-bit OSes are finally taking over, I hope we can start forgetting that nasty hack."
  • Warren Young
    Warren Young over 9 years
    There are twice as many doublings between 32- and 64-bits as between 16- and 32, which doubles the amount of time we have to put off such hacks, all else being equal. But all else is not equal, what with the sunsetting of Moore's Law. We got two decades out of 32-bit x86 computing. We might get centuries out of 64-bit. A single-pass read of 2⁶⁴ bytes of RAM at today's DRAM bandwidths would take about 30 years. Where is the bandwidth increase going to come from to enable us to approach the 64-bit limit?
  • Andrew Falanga
    Andrew Falanga over 9 years
    I have a question related to your answer. For this space of memory named lowmem, is this where memory from calls such as kmalloc and kzmalloc come from?
  • ugoren
    ugoren over 9 years
    @AndrewFalanga, yes, these functions use lowmem.
  • ogur
    ogur over 9 years
    @WarrenYoung The 3/1 split you are referring to is about physical address space, not virtual address space.
  • Warren Young
    Warren Young over 9 years
    @AndrewJ.Brehm: On a 32-bit system without PAE enabled, isn't that a distinction without a difference?
  • ogur
    ogur over 9 years
No. Virtual address space and physical address space are two completely distinct things. Changing the physical address space doesn't change that difference. Plain 32-bit maps 4 GB per process to 4 GB physical (plus swap); PAE maps 4 GB per process to 32 GB physical (or whatever it is, plus swap). The 3/1 split you were referring to affects plain 32-bit machines with 4 GB of physical address space, roughly 1 GB of which is needed for addressable devices other than main memory. That split exists whether or not the OS supports virtual memory.
  • Warren Young
    Warren Young over 9 years
    @AndrewJ.Brehm: First, we are talking about an OS that supports virtual memory: Linux. Anything else would be off-topic. Second, a program running on a 32-bit Linux machine with 4 GiB of physical RAM, no PAE, and a 3/1 split will not be able to allocate more than 3 GiB of memory, no matter whether you consider it in physical or virtual terms. Thus, the niggly distinction makes no practical difference.