ECC registered vs ECC unbuffered

Solution 1

Well, if you only use 16GB of RAM, which is not a server RAM range, you will be fine with pretty much any standard desktop RAM and system.

If it is only a storage server, you won't even need that much CPU performance.

Like you said, go with Sandy Bridge; it will give you a cool, fast and reliable system.

Speaking of 16GB RAM ranges, you don't have to worry about ECC stuff.

Solution 2

ECC seems to correct only single bits errors.

Correct. Correcting more errors would require more bits. As it is, a standard ECC DIMM already stores 72 bits for every 64 bits of information, spending one extra memory chip for every eight (12.5% on top of the data) to allow single-bit correction and detection of up to two bit errors.
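As a rough cross-check of what single-bit correction plus double-bit detection costs, here is a minimal sketch. It assumes a Hamming code extended with one overall parity bit (the textbook SECDED construction); the function name `secded_check_bits` is made up for illustration, and real memory controllers may use a different code with the same bit budget.

```python
def secded_check_bits(data_bits: int) -> int:
    """Check bits needed by an extended Hamming (SECDED) code. Illustrative only."""
    r = 0
    # Hamming condition: r check bits must be able to point at every bit
    # position (data + check) and also encode the "no error" case.
    while 2 ** r < data_bits + r + 1:
        r += 1
    # One extra overall parity bit upgrades "correct 1" to "correct 1, detect 2".
    return r + 1


for k in (8, 64):
    c = secded_check_bits(k)
    print(f"{k} data bits -> {c} check bits, {k + c} bits stored")
```

For a 64-bit word this gives 8 check bits, i.e. 72 bits in total, which matches the "9 chips instead of 8" layout of ECC DIMMs. Protecting each byte separately would be far more expensive, which is why the code works on wide words.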

It works as follows. Imagine a single stored bit, a 0 or a 1. When I read it, I just have to hope I read the right thing. If a 0 got flipped to a 1 by some cosmic radiation or by a bad chip, I will never know.

In the past we tried to solve that with parity. Parity added a ninth bit for every 8 bits stored. We counted how many 1s were in the byte, and the ninth bit was set so that the total number of 1s came out even (for even parity). If you ever read a byte and the count was odd, you knew something was wrong. You did not know which bit was wrong, though.
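A minimal sketch of even parity over one byte, to make the limitation concrete. The helper names `even_parity_bit` and `parity_ok` are made up for this illustration.

```python
def even_parity_bit(byte: int) -> int:
    """Ninth bit that makes the total number of 1s even."""
    return bin(byte & 0xFF).count("1") % 2


def parity_ok(byte: int, parity: int) -> bool:
    """True if the byte plus its parity bit still hold an even number of 1s."""
    return (bin(byte & 0xFF).count("1") + parity) % 2 == 0


stored = 0b10110010                  # four 1s, so the parity bit is 0
parity = even_parity_bit(stored)

one_flip = stored ^ 0b00000100       # a single bit flipped
two_flips = stored ^ 0b00000110      # two bits flipped

print(parity_ok(stored, parity))     # intact byte passes
print(parity_ok(one_flip, parity))   # single flip is detected (but not located)
print(parity_ok(two_flips, parity))  # double flip slips through unnoticed
```

Parity can only say "an odd number of bits changed"; it cannot say which bit, and an even number of flips cancels out.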

ECC expanded on that. It uses extra check bits (8 for every 64 data bits) and a more complex algorithm to discover when a single bit has flipped. It also knows which bit it was, and therefore what the original value was. A very simple way to explain how it does that would be this:

Replace all 0s with 000. Replace all 1s with 111.

Now you can read eight combinations:
000
001
010
011
100
101
110
111

We are never 100% sure what was originally stored. If we read 000 then that might have been just the 000 which we were expecting, or all three bits might have flipped. The latter is very unlikely. Bits rarely flip at random, though it does happen. Let's say a given bit flips one time in ten, for some easy calculations (in reality it is far rarer). That works out to the following chances of reading the correct value:

000 -> Either 000 (99.9% sure), or a triple flip (1/1000 chance)

001 -> We know something has gone wrong. But it either was 000 and one bit flipped (1:10 chance), or it was 111 and two bits have flipped (a 1:100 chance). So let's treat it as if we read 000 but log the error.

010 -> Same as above.

100 -> Same as above.

011 -> Same as above, but assuming it was a 111

101 -> Same as above, but assuming it was a 111

110 -> Same as above, but assuming it was a 111

111 -> Either 111 (99.9% sure), or a triple flip (1/1000 chance)

ECC does similar tricks but does them more efficiently. For every 64 bits (8 bytes) of data, it needs only 8 extra bits to detect and correct.
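The 000/111 scheme above is a triple-repetition code decoded by majority vote. A minimal sketch follows; the names `decode` and `p_wrong_after_decoding` are made up for illustration, and the 1-in-10 flip chance is the deliberately exaggerated figure used above, not a real-world error rate.

```python
from itertools import product


def decode(triple: str):
    """Majority vote over three copies. Returns (decoded_bit, error_seen)."""
    bit = "1" if triple.count("1") >= 2 else "0"
    error_seen = triple not in ("000", "111")
    return bit, error_seen


def p_wrong_after_decoding(p_flip: float) -> float:
    """Chance the vote picks the wrong value: two or three of the copies flipped."""
    return 3 * p_flip ** 2 * (1 - p_flip) + p_flip ** 3


# Walk through every pattern that can be read back.
for bits in product("01", repeat=3):
    pattern = "".join(bits)
    print(pattern, decode(pattern))

print(p_wrong_after_decoding(0.1))
```

With a 1-in-10 flip chance per bit, the decoded value is wrong roughly 2.8% of the time instead of 10%, at the cost of tripling the storage. Real ECC gets its protection for far less overhead.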


ECC registered RAM is only usable with workstation / server boards. ECC unbuffered is usable on Intel Xeon lga1155 or AMD AM3+ on Asus boards.

I already mentioned what the ECC part was, now the registered vs unbuffered part.

In modern CPUs the memory controller is on the CPU die, starting long ago for AMD Opteron chips and with the Core i series for Intel. Most desktop CPUs then talk directly to the DIMM sockets holding the RAM. It works and no extra logic is needed. That is cheap to build, and the speed is high because there's no delay going from the memory controller to the RAM.

But a memory controller can only drive a limited current at high speeds. This means that there is a limit to how many memory sockets can be added to a motherboard. (To make it more complex, there is also a limit on how much load each DIMM may put on the controller, which leads to memory ranks. I will skip that since this is already long.)

On server boards you often want to use more memory than in a desktop system. Therefore a "register" (buffer) is added to the DIMM. On a registered DIMM the command and address signals from the memory controller are first latched in this register; a clock cycle later the register passes them on to the memory chips. The controller then only has to drive one register per DIMM on those lines instead of every chip.

This buffer/register delays things, making memory slightly slower. That is undesirable, so it is only used on boards that have a lot of memory slots. Most consumer boards do not need this, and most consumer CPUs do not support it.

Directly connected, unbuffered RAM vs. buffered/registered RAM isn't a case where one is better or worse than the other. They just have different trade-offs in terms of how many memory slots you can have. Registered RAM allows more RAM at the cost of some speed (and possibly expense). In most cases where you need as much memory as possible, that extra memory more than compensates for the RAM running at a slightly slower speed.

The doubt I'm having is (mainly concerning asus am3+ board): is ECC-unbuffered RAM as good as ECC-registered RAM (from the point of view of safety and reliability)? Or is it a worse choice? I don't care much for the speed.

From the standpoint of safety and stability, ECC-unbuffered and ECC-registered are the same.


More details: server will use a server case with up to 24 x 3.5'' drives and should consume as little as possible.

24 drives are going to consume a lot of power. How much depends on the drives. My 140GB 15K RPM SAS drive draws a mere 10 watts at idle, the same as my 1TB 7.2K RPM SATA disk. In use, both draw more.

Multiply that by 24. 24 x 10 watts at idle means 240 watts just to keep the disk platters spinning and overcome air resistance. Roughly double that when in use.
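The same back-of-the-envelope estimate as a small sketch. The 10 W idle figure and the "roughly double in use" factor are the rough numbers from this answer, not datasheet values; `disk_power_watts` is a made-up helper name.

```python
def disk_power_watts(drives: int,
                     idle_watts_per_drive: float = 10.0,
                     active_factor: float = 2.0):
    """Return (idle_total, active_total) in watts for a set of spinning drives."""
    idle_total = drives * idle_watts_per_drive
    active_total = idle_total * active_factor
    return idle_total, active_total


print(disk_power_watts(24))  # fully populated case
print(disk_power_watts(10))  # the 10-of-24 configuration from the question
```

Under these assumptions, 10 drives alone idle at about 100 W, which leaves very little of a 120 W idle budget for the CPU, board, and controllers. Drives that idle at lower power, or that are allowed to spin down, change the picture considerably.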


LGA1155 seems to be in that sense a better bet (TDP ~ 20-95W) versus the others (>80W) for twice the price.

Intel is better at low-power CPUs, at the time of writing and for the CPUs you mentioned.

Any suggestion is welcome. Let's say less than 120W at idle (~ with 10 hard disks out of 24).

If you go for FreeBSD, look hard at ZFS. It can be great. Many of its more advanced features (e.g. deduplication and/or compression) use serious CPU power and want plenty of memory. ZFS for basic use with RAID-Z will do fine on both CPU sets you mentioned and with 16 GB, but if you turn on features like deduplication you should look carefully into the recommended memory needed for your disk capacity; up to 5GB per TB of storage is recommended by some guides.
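To see what the "up to 5GB per TB" rule of thumb implies, here is a rough sketch. The rule is the upper figure quoted by some guides, not a ZFS specification, and the 24 x 1 TB pool is an assumed example; `dedup_ram_gb` is a made-up helper name.

```python
def dedup_ram_gb(pool_tb: float, gb_per_tb: float = 5.0) -> float:
    """Upper-bound RAM estimate for deduplication under the rule of thumb."""
    return pool_tb * gb_per_tb


pool_tb = 24 * 1.0  # assumption: 24 drives of 1 TB each, ignoring redundancy overhead
print(dedup_ram_gb(pool_tb))
```

Under that assumption a fully populated chassis would want on the order of 120 GB of RAM for deduplication, far beyond 16 GB. That is one reason to leave deduplication off on a small-memory box.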

Two more things:

  1. I did not see anything about connecting the drives. Some boards may go up to 10 SATA ports. But for anything over that, you will need add-in cards. If you consider hardware RAID then it might be best to plan that from the beginning.
  2. Drive failure: Should you use SATA port multipliers then look carefully how they act if a SATA drive fails. It often is not pretty. Not a big problem for a home setup, but very much not enterprise grade. You may need to consider how individual drives handle errors too. The reason some drives are labeled as being for "NAS" or "RAID" use is that they handle errors differently than regular drives. With no RAID, you want the drive to retry as many times as possible. With RAID, you want the drive to fail quickly, so you can read from another copy.

Solution 3

Two separate issues.

ECC Vs non-ECC

  • use ECC wherever uptime is important
  • costs more -- need (multiples of) 9 chips instead of 8
  • motherboard must support it to use it

Registered Vs Unbuffered:

  • Can have (much) more total RAM installed with Registered DIMMs
    • Less electrical strain on the memory controller interface
  • But all installed DIMMs must be of the same type: all registered or all unbuffered
    • must remove unbuffered DIMMS if upgrading to Registered
  • Also is more expensive, and a cycle slower to access
    • Unbuffered is slightly lower latency, if that matters
    • all random accesses take many cycles anyway
    • Note absolute access latency (time in nanoseconds) hasn't improved much over the history of DRAM use in PCs
      • cost, capacity and bandwidth vastly improved instead
      • memory caches hide the latency for most memory accesses anyway
    • Longer latency hurts single-thread 'real-time' performance most
      • usually doesn't affect 'server' use cases much
    • No/minimal difference in bandwidth and overall performance
      • sequential access bandwidth unaffected
      • L2/L3 caches mean actual access patterns mostly replace rows at a time in the cache, so are usually 'burst' accesses anyway
Author by

user51166

Updated on September 18, 2022

Comments

  • user51166
    user51166 almost 2 years

    I would like to build a storage server (based on GNU/Linux or FreeBSD) which will be on all the time. To prevent data corruption (which is unlikely to happen as I never had such a problem, but better be safe than sorry) I would like to use ECC RAM.

    Although not as good as EDD (?) (which is way more expensive) and provides additional protection. ECC seems to correct only single bits errors.

    ECC registered RAM is only usable with workstation / server boards such as Intel Xeon or AMD interlagos/magny-cours/valencia g34 or c32.

    ECC unbuffered is usable on Intel Xeon lga1155 or AMD AM3+ on Asus boards.

    The second option will be way much cheaper on the processor and motherboard side, and I doubt I will need more than 16GB of RAM (4x4 GB ECC unbuffered are the largest affordable sticks).

    The doubt I'm having is (mainly concerning asus am3+ board): is ECC-unbuffered RAM as good as ECC-registered RAM (from the point of view of safety and reliability) ? Or is it a worse choice. I don't care much for the speed.

    More details: server will use a server case with up to 24 x 3.5'' drives and should consume as little as possible. LGA1155 seems to be in that sense a better bet (TDP ~ 20-95W) versus the others (>80W) for twice the price. Any suggestion is welcome. Let's say less than 120W at idle (~ with 10 hard disks out of 24).

    • Admin
      Admin over 12 years
      Asking on SuperUser will get you SuperUser answer. Ask on ServerFault will get you ServerFault answer. Get my drift?
    • Admin
      Admin over 12 years
      The FAQ states hardware questions can be made on superuser ...
    • Admin
      Admin over 12 years
      @hydroparadise Check the FAQ - we allow all hardware questions.
    • Admin
      Admin over 12 years
      Sorry, I thought that was assumed. I was only mentioning that from the OS side there could be different considerations in how ECC is addressed, because this will ultimately become a server application.
    • Admin
      Admin almost 11 years
      Barely. Most of the time the chipset will handle ECC correction (if any). You do not need to tap into those from the OS at all. (You can though, using DMI to get information on ECC or QPI errors.)
  • user51166
    user51166 over 12 years
    Thanks for your quick reply. I thought above 4GB RAM you would need ECC. It's sure that for 256GB RAM or half a TB of RAM ECC is a MUST. But I thought 16GB was kinda the limit ... Anyway it's strange ... a Xeon 4C/8T Sandy Bridge costs 100$ less (at my place) than an equivalent desktop CPU. Total cost is ~ the same. No drawbacks on ECC here. Are you sure that ECC is not needed ???
  • inf
    inf over 12 years
    @user51166 100% sure. 16GB is like the standard nowadays on mid/high end rigs.
  • user51166
    user51166 over 12 years
    The fact it is the defacto standard doesn't necessarily mean that it's reliable enough. Already read about cases on the internet where all data on disk became corrupted on the HDD not because of the SATA controller, but because of the bad (non-ecc) RAM.
  • inf
    inf over 12 years
    @user51166 Tell the guy who said that, that non-ECC definitely was not his problem.
  • user51166
    user51166 over 12 years
    That reassures me since I already have a home server to do other (less important) things. However I am going to use ECC RAM for the NAS. But: is REGISTERED > UNBUFFERED ? Will reliability be compromised using UNBUFFERED over REGISTERED ? Is REGISTERED safer ? I know I'm talking about a chance of one bit flipping once every week but still ... The problem of the whole drive being corrupted is surely due to a bad stick, not a random bit flip caused by cosmic rays ... where non-ECC RAM corrupts everything, ECC RAM will perform a shutdown or reboot ...
  • inf
    inf over 12 years
    @user51166 Registered is the same as buffered but not the same as ECC, and yes registered RAM is safer, that's why it is only used in high-end servers and is also more expensive, however just like with ECC, you just won't need it unless you run a data-center. If you can't live without ECC then just go with it. However I think your question is answered and we shouldn't discuss this in the comments here.
  • user51166
    user51166 over 12 years
    >> Registered is the same as buffered but not the same as ECC ? What do you mean ? I checked the store and see "Registered ECC" and "Unbuffered ECC".
  • inf
    inf over 12 years
    @user51166 yeah, like you said, it could also have been written "Registered ECC" and "Unregistered ECC".
  • user51166
    user51166 over 12 years
    Strangely enough at the stores Registered ECC RAM is even cheaper than Unbuffered ECC RAM ... However the mainboards and CPU supporting Registered ECC RAM cost twice or more than systems using UDIMM ECC. If I understood then Registered ECC > Unbuffered ECC > Unbuffered non-ECC, right ?
  • inf
    inf over 12 years
    @user51166 yeah, right.
  • Nullpointer42
    Nullpointer42 almost 11 years
    Upvoting as this actually answers the question, while the other is more practical/anecdotal advice. We'll ignore that it meanders a bit before getting to the register vs unbuffered part ;)
  • ganesh
    ganesh almost 11 years
    Aye, It does meander. I tried to be thorough but I really should not become a writer. (either of fiction or of manuals).
  • Nullpointer42
    Nullpointer42 almost 11 years
    Heh, we'll also ignore that I originally stopped reading when you started addressing power and other concerns . . . ;)
  • Greg Smith
    Greg Smith almost 11 years
    This is an excellent answer, it can't be any shorter and still address all the questions so well. I just did a moderate edit that cleaned your text up that will help once it's applied. Even removed your apology to trim two lines--the accepted answer here was not very helpful. I did expand briefly on ZFS deduplication concerns and drive error handling while I was in there. Decided not to get into vibration, even though that's going to be an issue with 24 drives too.
  • Mahmoud Al-Qudsi
    Mahmoud Al-Qudsi over 9 years
    Thank you for clarifying registered vs unbuffered, and the reasons to go with RDIMM.
  • ganesh
    ganesh over 9 years
    "Registered ECC > Unbuffered ECC" No/sometimes. Unbuffered/unregistered ECC will actually be faster. Buffered/registered ECC will be at least one register action slower but you can add more DIMMs to a memory channel. (And more memory CAN make your system faster, even if latency increases). So rather than a hard "Yes/no" the proper answer is "it depends".
  • Imran Juma
    Imran Juma about 7 years
    I think ECC is a need if you run a server 24/7, because there is no every day reboot with memory check, so hard memory errors (hardware failure) can stay undetected and in worst case corrupt entire databases. On the other hand ECC detects memory errors and reboots the server if they occur, so they can't affect data integrity and you will be notified about them immediately. I don't think this depends on server size, if you run your server 24/7 and you don't want to lose your data, then ECC is a must.
  • Metaxis
    Metaxis almost 3 years
    ECC is critical on a storage server, regardless of how much there is.
  • Metaxis
    Metaxis almost 3 years
    ECC has nothing at all to do with uptime. You want ECC on a storage server for data integrity.