64-bit Linux doesn't recognize my RAM between 3 and 32 GB


Solution 1

First, if your BIOS/UEFI does not detect your RAM correctly, your OS won't do any better. There's no need to go any further if your BIOS displays incorrect information about your setup.

=> You probably have at least a hardware problem.

EDIT: From your dmesg | grep memory output, it seems that you do in fact have a hardware problem, located in your BIOS. At least Linux has detected it and warns you about it: WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 13295MB of RAM. It also seems that one of your 4 RAM modules is incorrectly recognised or inserted.

You can report it to your manufacturer, upgrade your BIOS, or change your motherboard. There is a good chance that with less RAM installed you won't encounter this bug.

As a side note, you may agree with this famous quote from Linus Torvalds about BIOS makers:

BIOS writers are invariably totally incompetent crack-addicted monkeys

Second, once your BIOS agrees with what is really installed on your motherboard, you can take a look at /proc/meminfo on Linux. It is often very clear about what your Linux system knows about your memory and what it does with it. Here is what I have on my 64-bit system with 8 GB of RAM:

$ cat /proc/meminfo 
MemTotal:        8175652 kB
MemFree:         5476336 kB
Buffers:           63924 kB
Cached:          1943460 kB
SwapCached:            0 kB
[...]
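
As a rough illustration of reading those fields programmatically, the following Python sketch parses text in the /proc/meminfo format into a dictionary. The function name parse_meminfo and the embedded sample are only for illustration; they are not part of any standard tool.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Values stay in the unit the kernel prints (kB for most fields;
    the HugePages_* counters have no unit).
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "name: value" shape, e.g. "[...]"
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


# Sample taken from the output shown above.
sample = """\
MemTotal:        8175652 kB
MemFree:         5476336 kB
Buffers:           63924 kB
Cached:          1943460 kB
SwapCached:            0 kB
"""

mem = parse_meminfo(sample)
# MemTotal is a little below the installed RAM because the kernel and
# firmware keep part of it for themselves.
print(f"MemTotal: {mem['MemTotal'] / 1024**2:.2f} GiB")
```

On a live system you would pass it the contents of the real file instead, for example parse_meminfo(open("/proc/meminfo").read()).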

As for the boot process and what the Linux kernel uses and frees, you can grep it from dmesg:

$ dmesg | grep Memory
[    0.000000] Memory: 8157672k/8904704k available (6138k kernel code, 534168k absent, 212864k reserved, 6896k data, 988k init)
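
To make that line easier to read, here is a small Python sketch that splits it into its individual figures. It assumes the line format printed by the 2.6-era kernels quoted in this thread; other kernel versions may word the line differently, and parse_memory_line is a name made up for this example. In both samples on this page, available + absent + reserved adds up to the total after the slash, which is an observation about these outputs rather than a documented guarantee.

```python
import re

# Format assumed from the dmesg lines quoted in this thread.
MEMORY_LINE = re.compile(r"Memory: (\d+)k/(\d+)k available \((.*)\)")


def parse_memory_line(line):
    """Split the boot-time 'Memory:' line into its kB figures."""
    match = MEMORY_LINE.search(line)
    if match is None:
        raise ValueError("not a 'Memory:' line in the expected format")
    fields = {"available": int(match.group(1)), "total": int(match.group(2))}
    # The parenthesised part is a comma-separated list such as "534168k absent".
    for item in match.group(3).split(","):
        part = re.match(r"\s*(\d+)k (.+)", item)
        if part:
            fields[part.group(2).strip()] = int(part.group(1))
    return fields


line = ("[    0.000000] Memory: 8157672k/8904704k available "
        "(6138k kernel code, 534168k absent, 212864k reserved, "
        "6896k data, 988k init)")
fields = parse_memory_line(line)
accounted = fields["available"] + fields["absent"] + fields["reserved"]
print(accounted == fields["total"])
```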

EDIT: As Gilles said, dmidecode --type memory gives you details about your hardware configuration. It looks like this for a 4 x 2 GB system:

$ sudo dmidecode --type memory
# dmidecode 2.9
SMBIOS 2.6 present.

Handle 0x0020, DMI type 16, 15 bytes
Physical Memory Array
    Location: System Board Or Motherboard
    Use: System Memory
    Error Correction Type: None
    Maximum Capacity: 32 GB
    Error Information Handle: Not Provided
    Number Of Devices: 4

Handle 0x0022, DMI type 17, 28 bytes
Memory Device
    Array Handle: 0x0020
    Error Information Handle: Not Provided
    Total Width: 64 bits
    Data Width: 64 bits
    Size: 2048 MB
    [...]
[This block is repeated for each module]
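
To compare what the firmware lists against what you physically installed, you can add up the Size lines of the memory devices. The Python sketch below does that on saved dmidecode output. The function name installed_mb is hypothetical, and the handling of a GB unit is an assumption to cover dmidecode versions that may print sizes in GB rather than MB.

```python
import re

# Only lines that start with "Size:" count, so "Maximum Capacity: 32 GB"
# and "Size: No Module Installed" are both ignored.
SIZE_LINE = re.compile(r"^\s*Size:\s+(\d+)\s+(MB|GB)\s*$", re.MULTILINE)


def installed_mb(dmidecode_text):
    """Sum the sizes of all populated memory devices, in MB."""
    total = 0
    for value, unit in SIZE_LINE.findall(dmidecode_text):
        total += int(value) * (1024 if unit == "GB" else 1)
    return total


# Shortened sample: three populated slots and one empty slot.
sample = """\
Physical Memory Array
        Maximum Capacity: 32 GB
        Number Of Devices: 4
Memory Device
        Size: 8192 MB
Memory Device
        Size: 8192 MB
Memory Device
        Size: 8192 MB
Memory Device
        Size: No Module Installed
"""

print(installed_mb(sample), "MB reported by the firmware tables")
```

If the sum is lower than what is physically in the slots, the firmware is not seeing every module, and the operating system cannot do better.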

Solution 2

Search /var/log/dmesg for the memory map (grep for 'e820') and count how much memory is reported there as usable. This is what the BIOS tells the loaded OS about memory.

(This is correct only for legacy BIOS-style boot. I don't know how the memory is reported if EFI-style boot is used, but I guess there is a similar report.)

Also, the BIOS reporting 16 GB while 32 GB is installed points to something odd in the memory setup. Try reducing the installed memory to 4 or 8 GB and compare the results.
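
Counting the usable memory by hand is tedious, so here is a minimal Python sketch of the idea. It assumes the BIOS-e820 line format printed by older kernels, as quoted in the question; newer kernels print the ranges differently, so the regular expression would need adjusting there. The name usable_bytes is made up for this example, and the sample is a shortened copy of the question's map.

```python
import re

# Matches lines such as:
#   BIOS-e820: 0000000000100000 - 00000000bd94d000 (usable)
E820_LINE = re.compile(
    r"BIOS-e820:\s+([0-9a-fA-F]+)\s+-\s+([0-9a-fA-F]+)\s+\((.+?)\)"
)


def usable_bytes(dmesg_text):
    """Add up the sizes of all ranges the BIOS marked as usable."""
    total = 0
    for start, end, kind in E820_LINE.findall(dmesg_text):
        if kind == "usable":
            total += int(end, 16) - int(start, 16)
    return total


sample = """\
[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009d800 (usable)
[    0.000000]  BIOS-e820: 000000000009d800 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 00000000bd94d000 (usable)
[    0.000000]  BIOS-e820: 00000000bd94d000 - 00000000bd99c000 (ACPI NVS)
[    0.000000]  BIOS-e820: 00000000bdd78000 - 00000000bdf00000 (usable)
[    0.000000]  BIOS-e820: 0000000100001000 - 000000043f000000 (usable)
[    0.000000] e820 update range: 00000000bdf00000 - 000000043f000000 (usable) ==> (reserved)
"""

print(usable_bytes(sample) // 2**20, "MiB usable according to the BIOS")
```

For the map posted in the question this comes to about 16330 MiB, in line with the 16383 MB shown by the firmware setup and far from the 32 GB installed.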


user
Author by

user

Updated on September 18, 2022

Comments

  • user
    user over 1 year

    My problems were caused by a faulty memory module and quite possibly a broken kernel binary.


    I just now booted my PC with basically brand new hardware. I've been running Debian 6.0 AMD64 before, and no change there (literally; I just unplugged the hard disks from the old motherboard and reconnected them to the new one), but found something curious:

    • I have physically installed 4 x 8 GB of RAM
    • UEFI/BIOS setup reports 16383 MB of RAM
    • Linux free -m reports 2985 MB of RAM

    2985 MB seems too close to the magical 3 GB mark for it to be purely coincidence, but uname -r prints 2.6.32-5-amd64; clearly a 64-bit kernel, which is all that has ever been installed on the system drive I'm using. The new motherboard is an Asus M5A97 Pro, which has four DDR3 slots supposedly supporting 8 GB modules. The memory modules themselves are identical, four Corsair XMS3 PC12800 8 GB, purchased together.

    I haven't looked around the UEFI setup in detail, but did browse through it and saw nothing that seemed like it would need changing to enable large amounts of RAM.

    Edit: Further confirmation that I really am running 64-bit:

    # file `which free`
    /usr/bin/free: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped
    #
    

    What's up with this, and what can I do about it?

    Edit 2: dmesg, dmidecode and meminfo, as requested. I don't have physical access to the system right now, so will have to wait until tonight to pull out some modules and see what that does. (Note that dmidecode reports 3 x 8GB plus one empty DIMM slot. Also note the MTRR mismatch message from the kernel, leading to a loss of 13 GB, which at least adds up with what the motherboard itself is reporting.)

    # dmidecode --type memory
    # dmidecode 2.9
    SMBIOS 2.7 present.
    
    Handle 0x0026, DMI type 16, 23 bytes
    Physical Memory Array
            Location: System Board Or Motherboard
            Use: System Memory
            Error Correction Type: Multi-bit ECC
            Maximum Capacity: 32 GB
            Error Information Handle: Not Provided
            Number Of Devices: 4
    
    Handle 0x0028, DMI type 17, 34 bytes
    Memory Device
            Array Handle: 0x0026
            Error Information Handle: Not Provided
            Total Width: 64 bits
            Data Width: 64 bits
            Size: 8192 MB
            Form Factor: DIMM
            Set: None
            Locator: DIMM0
            Bank Locator: BANK0
            Type: <OUT OF SPEC>
            Type Detail: Synchronous
            Speed: 1333 MHz (0.8 ns)
            Manufacturer: Manufacturer0
            Serial Number: SerNum0
            Asset Tag: AssetTagNum0
            Part Number: Array1_PartNumber0
    
    Handle 0x002A, DMI type 17, 34 bytes
    Memory Device
            Array Handle: 0x0026
            Error Information Handle: Not Provided
            Total Width: 64 bits
            Data Width: 64 bits
            Size: 8192 MB
            Form Factor: DIMM
            Set: None
            Locator: DIMM1
            Bank Locator: BANK1
            Type: <OUT OF SPEC>
            Type Detail: Synchronous
            Speed: 1333 MHz (0.8 ns)
            Manufacturer: Manufacturer1
            Serial Number: SerNum1
            Asset Tag: AssetTagNum1
            Part Number: Array1_PartNumber1
    
    Handle 0x002C, DMI type 17, 34 bytes
    Memory Device
            Array Handle: 0x0026
            Error Information Handle: Not Provided
            Total Width: 64 bits
            Data Width: 64 bits
            Size: 8192 MB
            Form Factor: DIMM
            Set: None
            Locator: DIMM2
            Bank Locator: BANK2
            Type: <OUT OF SPEC>
            Type Detail: Synchronous
            Speed: 1333 MHz (0.8 ns)
            Manufacturer: Manufacturer2
            Serial Number: SerNum2
            Asset Tag: AssetTagNum2
            Part Number: Array1_PartNumber2
    
    Handle 0x002E, DMI type 17, 34 bytes
    Memory Device
            Array Handle: 0x0026
            Error Information Handle: Not Provided
            Total Width: Unknown
            Data Width: 64 bits
            Size: No Module Installed
            Form Factor: DIMM
            Set: None
            Locator: DIMM3
            Bank Locator: BANK3
            Type: Unknown
            Type Detail: Synchronous
            Speed: Unknown
            Manufacturer: Manufacturer3
            Serial Number: SerNum3
            Asset Tag: AssetTagNum3
            Part Number: Array1_PartNumber3
    #
    ======================================================================
    # cat /proc/meminfo
    MemTotal:        3056820 kB
    MemFree:         1470820 kB
    Buffers:          390204 kB
    Cached:           194660 kB
    SwapCached:            0 kB
    Active:           488024 kB
    Inactive:         419096 kB
    Active(anon):     231112 kB
    Inactive(anon):    96660 kB
    Active(file):     256912 kB
    Inactive(file):   322436 kB
    Unevictable:           0 kB
    Mlocked:               0 kB
    SwapTotal:             0 kB
    SwapFree:              0 kB
    Dirty:                 8 kB
    Writeback:             0 kB
    AnonPages:        322320 kB
    Mapped:            33012 kB
    Shmem:              5472 kB
    Slab:             613952 kB
    SReclaimable:     597404 kB
    SUnreclaim:        16548 kB
    KernelStack:        2384 kB
    PageTables:        19472 kB
    NFS_Unstable:          0 kB
    Bounce:                0 kB
    WritebackTmp:          0 kB
    CommitLimit:     1528408 kB
    Committed_AS:     621464 kB
    VmallocTotal:   34359738367 kB
    VmallocUsed:      294484 kB
    VmallocChunk:   34359429080 kB
    HardwareCorrupted:     0 kB
    HugePages_Total:       0
    HugePages_Free:        0
    HugePages_Rsvd:        0
    HugePages_Surp:        0
    Hugepagesize:       2048 kB
    DirectMap4k:        9216 kB
    DirectMap2M:     2054144 kB
    DirectMap1G:     1048576 kB
    #
    ======================================================================
    # dmesg | grep -i memory
    [    0.000000] WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 13295MB of RAM.
    [    0.000000] WARNING: at /tmp/buildd/linux-2.6-2.6.32/debian/build/source_amd64_none/arch/x86/kernel/cpu/mtrr/cleanup.c:1092 mtrr_trim_uncached_memory+0x2e6/0x311()
    [    0.000000]  [<ffffffff814f7f1e>] ? mtrr_trim_uncached_memory+0x2e6/0x311
    [    0.000000]  [<ffffffff814f7f1e>] ? mtrr_trim_uncached_memory+0x2e6/0x311
    [    0.000000]  [<ffffffff814f7f1e>] ? mtrr_trim_uncached_memory+0x2e6/0x311
    [    0.000000] initial memory mapped : 0 - 20000000
    [    0.000000] init_memory_mapping: 0000000000000000-00000000bdf00000
    [    0.000000] PM: Registered nosave memory: 000000000009d000 - 000000000009e000
    [    0.000000] PM: Registered nosave memory: 000000000009e000 - 00000000000a0000
    [    0.000000] PM: Registered nosave memory: 00000000000a0000 - 00000000000e0000
    [    0.000000] PM: Registered nosave memory: 00000000000e0000 - 0000000000100000
    [    0.000000] PM: Registered nosave memory: 00000000bd94d000 - 00000000bd99c000
    [    0.000000] PM: Registered nosave memory: 00000000bd99c000 - 00000000bd9a6000
    [    0.000000] PM: Registered nosave memory: 00000000bd9a6000 - 00000000bdade000
    [    0.000000] PM: Registered nosave memory: 00000000bdade000 - 00000000bdaef000
    [    0.000000] PM: Registered nosave memory: 00000000bdaef000 - 00000000bdb02000
    [    0.000000] PM: Registered nosave memory: 00000000bdb02000 - 00000000bdb04000
    [    0.000000] PM: Registered nosave memory: 00000000bdb04000 - 00000000bdb0d000
    [    0.000000] PM: Registered nosave memory: 00000000bdb0d000 - 00000000bdb13000
    [    0.000000] PM: Registered nosave memory: 00000000bdb13000 - 00000000bdb75000
    [    0.000000] PM: Registered nosave memory: 00000000bdb75000 - 00000000bdd78000
    [    0.000000] Memory: 3046732k/3111936k available (3075k kernel code, 4728k absent, 60476k reserved, 1879k data, 584k init)
    [    1.636730] Freeing initrd memory: 9501k freed
    [    1.647370] Freeing unused kernel memory: 584k freed
    [    4.876602] [TTM] Zone  kernel: Available graphics memory: 1528410 kiB.
    [    4.876615] [drm] radeon: 256M of VRAM memory ready
    [    4.876617] [drm] radeon: 512M of GTT memory ready.
    [   25.571018] VBoxDrv: dbg - g_abExecMemory=ffffffffa051d6c0
    #
    

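    Grepping for e820 shows a bunch of ranges, topping out with e820 update range: 00000000bdf00000 - 000000043f000000 (usable) ==> (reserved). 43f000000 is an address just under 17 GiB, which corresponds to roughly 16 GiB of RAM once the hole below 4 GiB is subtracted, and bdf00000 is 3039 MiB. I do not see that being coincidental.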

    # dmesg | grep -i e820
    [    0.000000]  BIOS-e820: 0000000000000000 - 000000000009d800 (usable)
    [    0.000000]  BIOS-e820: 000000000009d800 - 00000000000a0000 (reserved)
    [    0.000000]  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
    [    0.000000]  BIOS-e820: 0000000000100000 - 00000000bd94d000 (usable)
    [    0.000000]  BIOS-e820: 00000000bd94d000 - 00000000bd99c000 (ACPI NVS)
    [    0.000000]  BIOS-e820: 00000000bd99c000 - 00000000bd9a6000 (ACPI data)
    [    0.000000]  BIOS-e820: 00000000bd9a6000 - 00000000bdade000 (reserved)
    [    0.000000]  BIOS-e820: 00000000bdade000 - 00000000bdaef000 (ACPI NVS)
    [    0.000000]  BIOS-e820: 00000000bdaef000 - 00000000bdb02000 (reserved)
    [    0.000000]  BIOS-e820: 00000000bdb02000 - 00000000bdb04000 (ACPI NVS)
    [    0.000000]  BIOS-e820: 00000000bdb04000 - 00000000bdb0d000 (reserved)
    [    0.000000]  BIOS-e820: 00000000bdb0d000 - 00000000bdb13000 (ACPI NVS)
    [    0.000000]  BIOS-e820: 00000000bdb13000 - 00000000bdb75000 (reserved)
    [    0.000000]  BIOS-e820: 00000000bdb75000 - 00000000bdd78000 (ACPI NVS)
    [    0.000000]  BIOS-e820: 00000000bdd78000 - 00000000bdf00000 (usable)
    [    0.000000]  BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
    [    0.000000]  BIOS-e820: 00000000fec10000 - 00000000fec11000 (reserved)
    [    0.000000]  BIOS-e820: 00000000fec20000 - 00000000fec21000 (reserved)
    [    0.000000]  BIOS-e820: 00000000fed00000 - 00000000fed01000 (reserved)
    [    0.000000]  BIOS-e820: 00000000fed61000 - 00000000fed71000 (reserved)
    [    0.000000]  BIOS-e820: 00000000fed80000 - 00000000fed90000 (reserved)
    [    0.000000]  BIOS-e820: 00000000fef00000 - 0000000100000000 (reserved)
    [    0.000000]  BIOS-e820: 0000000100001000 - 000000043f000000 (usable)
    [    0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
    [    0.000000] e820 update range: 00000000bdf00000 - 000000043f000000 (usable) ==> (reserved)
    [    0.000000] update e820 for mtrr
    # 
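
The address arithmetic can be checked with a few lines of Python. This is only a quick sketch with illustrative variable names; the addresses are copied from the e820 output above. The range that the kernel switched from usable to reserved contains 13295 MiB of usable RAM above 4 GiB, which is the same figure as in the MTRR warning.

```python
MIB = 1024 * 1024

low_end = 0xbdf00000       # end of the last usable range below 4 GiB
high_start = 0x100001000   # start of the usable range above 4 GiB
high_end = 0x43f000000     # end of that range

print(low_end // MIB)                  # 3039, top of low memory in MiB
print((high_end - high_start) // MIB)  # 13295, usable RAM that was trimmed
print(high_end // MIB)                 # 17392, highest address in MiB
```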
    

    EDIT 3/4 -- partial success:

    • Upgrading the UEFI BIOS from version 0705 x64 08/23/2011 to 1007 02/10/2012 did not help: the exact same problem remained.
    • Removing one DIMM module (I took a lucky guess at which slot was #4: the one farthest from the CPU) allowed the BIOS to detect and use the remaining 24 GB, although a three-DIMM configuration is not "recommended" according to the diagram in the user's manual. Notably, seating one of the remaining DIMMs in slot #4 still allowed it to be used, so the slot is fine. Reseating the "original" DIMM into that slot dropped me back at my starting point.
    • Booting from the Debian 6.0.3 AMD64 installation CD into a rescue environment and checking its dmesg output shows no similar MTRR errors. Also, in that environment, with 3 x 8GB installed, 24 GB (plus or minus epsilon times pi or thereabouts; I didn't do the exact math) shows up as usable according to free.
    • Upgrading/reinstalling the kernel (there was a minor upgrade available) seems to have fixed the MTRR issues as well. dmesg now reports 26198016 KB total, and no MTRR errors, which is in line with what I would expect with 3 x 8GB installed. free -m now reports 24114 MB total RAM, which quite frankly is close enough for me.

    This smells like a barfed DIMM, plus a kernel that for whatever reason was damaged; the latter may have happened during the power outage (though I must say that's an odd way for the kernel to break!). The non-working DIMM will go back to the reseller as soon as I talk to them (hopefully tomorrow).

    (hopefully) FINAL EDIT

    I RMA'd one of the two pairs of DIMMs. It was accepted by the reseller as damaged and they sent me a new pair, which seems to work just fine. So I'm now basically where I originally intended to be nearly a month ago (although a large fraction of that time was not really due to the reseller), with 32 GB RAM usable; free -m reports 32194 MB total memory, and the kernel reports 34586624k RAM on initialization, both of which are well in line with my expectations.

    • Admin
      Admin about 12 years
      From your first statement it sounds like you moved hard disks with an installed OS to a new system board? A really good test would be to download a live distro and boot into it. Slax, DSL, Ubuntu, or whatever. If that recognizes the right amount of RAM then you will likely be encountering HAL / udev issues. At that point you will save much more time backing up and reinstalling than trying to fix it. Unless you are a geek like me and want to waste hours or days on it :}
    • Admin
      Admin about 12 years
      Please post the output of dmidecode --type memory and the first hundred lines or so of the output of dmesg (make sure to include anything that looks like it's about memory).
    • Admin
      Admin about 12 years
      WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 13295MB of RAM. Well, there's your missing 13G.
    • Admin
      Admin about 12 years
      @Mat, not the other missing 16G, however. Those will likely take a bit more looking around for.
    • Admin
      Admin about 12 years
      Your BIOS doesn't see one of the sticks, and it's not unusual to have to pair sticks (in specific slots too). You've got a buggy BIOS, possibly one mis-installed/damaged RAM stick, (guessing) one that's not useable because it's not paired, and two that are active but only partly reachable because of the BIOS problem.
    • Admin
      Admin about 12 years
      I would be interested in what a debian live (/ubuntu, since it's the next closest thing) boot would tell, since that can be used to easily distinguish between problems with your hardware and problems with your configuration.
    • Admin
      Admin about 12 years
      My current plan is to (tonight) flash the UEFI to whatever is most recent from Asus, and if that alone doesn't cause all the RAM to show up, physically remove all the RAM sticks and re-add them one by one (checking of course against pairing instructions in the motherboard manual, but I'm pretty sure it stated that an arbitrary number of modules could be installed) to see if/when I get unexpected readouts. If I'm lucky, maybe the firmware upgrade alone will fix this.
    • Admin
      Admin about 12 years
      Updated question with new results under "edit 3".
  • user
    user about 12 years
    See my edit for e820 data. Physically removing memory modules to see what that does will have to wait until tonight. The only DDR3 modules I have are 8 GB each.
  • Netch
    Netch about 12 years
    Well, it seems that's enough now: you have both hardware and software working properly. The last step is to install a working memory module in the empty slot so that dual-channel mode works. Congratulations.