ECC memory – what’s the deal?

I remember back in the nineties we were all trying to get ECC memory for the computers we built. ECC memory was expensive, and we argued over whether a particular configuration justified the expense or could survive without it. Memory at the time was measured in megabytes, not gigabytes like now. So we all thought that some time in the future, in five years or so, ECC memory would cost the same as non-ECC memory and all computers would finally come equipped with it by default, because the sheer amount of memory would simply require error correction.

What is ECC memory?
Error-correcting code (ECC) memory, sometimes expanded as Error Checking & Correction, is a type of computer memory that detects and corrects the most common kinds of data corruption as data passes in and out of memory. ECC modules carry extra memory chips that store check bits computed from the data, which is what lets the memory controller notice a flipped bit and repair it.
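To make the idea of check bits concrete, here is a minimal sketch of a Hamming(7,4) code in Python: 4 data bits are stored alongside 3 check bits, and any single flipped bit can be located and repaired. This is a toy illustration only. Real ECC modules work on much wider words (commonly 64 data bits plus 8 check bits) and do it in hardware, and the function names `encode` and `correct` are my own, not part of any library.

```python
def encode(data_bits):
    """Encode 4 data bits (a list of 0/1) into a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    # Codeword layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def correct(word):
    """Repair at most one flipped bit.

    Returns (data_bits, position), where position is the 1-based index of
    the bit that was repaired, or 0 if no error was detected.
    """
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s3 = w[3] ^ w[4] ^ w[5] ^ w[6]
    position = s1 + 2 * s2 + 4 * s3  # the syndrome spells out the bad position
    if position:
        w[position - 1] ^= 1  # flip the damaged bit back
    return [w[2], w[4], w[5], w[6]], position


stored = encode([1, 0, 1, 1])
stored[4] ^= 1  # simulate a single bit flip in memory (position 5)
recovered, position = correct(stored)
print(recovered, position)
```

The three check bits cost extra storage, which is exactly why ECC modules need the additional chips. Note that this plain 7-bit code cannot tell a double-bit error from a single-bit one; practical ECC adds one more overall parity bit so that double errors are at least detected.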

At the time, the calculations showed that on a “typical” desktop the memory error rate would be low enough not to present a danger. However, the amount of memory in a typical computer has increased by several orders of magnitude since then. Errors were negligible only while we were talking about a few hundred megabytes of memory. Once you step over the gigabyte threshold, memory errors become a statistical reality. Without ECC memory, errors accumulate in our data and code every single day.
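The scaling argument is simple arithmetic: if every bit has some small, roughly constant chance of flipping per hour, the expected number of errors grows in direct proportion to the amount of memory. Here is a rough sketch of that calculation. The per-bit rate below is a made-up placeholder, not a measured figure, and the function name is my own; only the ratio between the two results is meaningful.

```python
def expected_bit_errors(capacity_bytes, flips_per_bit_hour, hours):
    """Expected number of bit flips, assuming independent flips at a constant rate."""
    return capacity_bytes * 8 * flips_per_bit_hour * hours


PLACEHOLDER_RATE = 1e-15  # hypothetical per-bit, per-hour rate; NOT a measured value
MIB = 1024 ** 2
GIB = 1024 ** 3

then = expected_bit_errors(16 * MIB, PLACEHOLDER_RATE, 24)  # a megabyte-era machine
now = expected_bit_errors(16 * GIB, PLACEHOLDER_RATE, 24)   # a gigabyte-era machine
print(now / then)  # how many times more errors to expect per day
```

Whatever the real rate turns out to be for a given module, going from 16 MiB to 16 GiB multiplies the expected number of errors by 1024 under this model, so a rate that was safe to ignore in the nineties is not automatically safe now.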

It is surprising that with the current state of technology we are not using ECC memory everywhere, as I thought back in the nineties we would be. At the very least, for your own good, do get ECC memory for the computers you use.…
