Coauthor of the original Rowhammer exploit here. ECC remains a highly effective method for turning this from a security issue to a reliability issue, mostly. As an individual owner of a server, if that server has ECC and you expect to notice machine halts due to uncorrectable ECC errors, the security implications for you are modest.
Now, if you are a cloud provider that provides VMs on multitenant hosts, your threat model may be different.
Either way, avoid machines without ECC. TRR was a lame duck even when Rowhammer was still fresh, and bits flipping in DRAM will not go away unless the economics of DRAM manufacturing change (i.e., they won't).
AMD has now adopted Intel's market segmentation approach and is disabling ECC on most Ryzen CPUs. Only Pro and Threadripper parts have it guaranteed; beyond that, it works on some boards with some desktop Ryzens.
I thought the same as you, but the situation's less dire thankfully. Wrote a post about it a while back that was top of HN: https://sunshowers.io/posts/am5-ryzen-7000-ecc-ram/
Possibly interestingly, the motherboard (ASRock X670E Taichi) you mention there now has ECC listed on the ASRock page:
https://www.asrock.com/mb/AMD/X670E%20Taichi/index.asp#Speci...
"Supports" might mean you can run unbuffered ECC UDIMMs but without ECC? Even Intel can run ECC UDIMMs in non-ECC mode. Also some manufacturers don't distinguish between "on-die ECC" (DDR5) and real ECC.
No, the memory controller believes ECC is activated. See my post :)
Oh awesome. That's really fantastic.
The Riptide motherboard I use also has ECC listed now. https://pg.asrock.com/mb/AMD/B650E%20PG%20Riptide%20WiFi
Less dire for sure, but having to pay double the cost for ECC modules is still pretty dire.
That is not true. The last time I looked, ECC DDR5 UDIMM modules cost at most 50% more.
Nevertheless that is still excessively high. While in the beginning for DDR5 there were only 80-bit modules, which could claim a +25% higher price, now there are 72-bit modules, like in the previous generations, which can justify at most a +12.5% price increase.
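For reference, the percentages above are just the extra bus width relative to a 64-bit non-ECC module. A minimal sketch of that arithmetic (the function name is made up for illustration, and it says nothing about what modules actually cost):

```python
def ecc_width_overhead(total_bits, data_bits=64):
    """Fraction of extra bits an ECC module carries over a plain 64-bit one."""
    return total_bits / data_bits - 1

print(f"{ecc_width_overhead(80):.1%}")  # 25.0%
print(f"{ecc_width_overhead(72):.1%}")  # 12.5%
```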
I looked right when I posted that and found 2x32GB non-ECC 5600MHz for $165, exactly half the $330 price of the sticks listed in the post. I spent several minutes looking for cheaper ECC at the same specs and couldn't find any.
Trying again, I can find some Kingston sticks that are $120 each, so that's about 50% higher. Amazon's search is really bad, by the way. But that's not the "at most" price. And a month ago they were $140 each.
Also the spec sheet says they are 72 bit modules made out of 20 2GB x8 chips? That is baffling. Is 10% of the memory going unused? https://www.kingston.com/datasheets/KSM56E46BD8KM-32HA.pdf
Edit: Micron data sheets suggest that UDIMMs have 2x13 command/address pins and RDIMMs have 2x6, so that's one piece of the puzzle. Apparently UDIMMs can do x64 and x72, and RDIMMs can do x72 and x80.
When DDR5 was first introduced, there were only x8 chips.
Because the DDR5 channels must have a width of 32, 36 or 40 bits, with x8 chips one had to use 40-bit channels, even if only 36-bit channels are needed, so indeed 10% of the memory capacity remained unused.
Meanwhile, about a year ago, Micron at least also introduced x4 chips. There are now ECC UDIMMs that combine x8 and x4 chips and waste no memory.
On the market there are both modules made only with x8 chips, which do not use a part of the memory, and modules with a combination of x8 and x4 chips, without unused memory.
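To make the 10% figure concrete, here is a rough sketch of the chip arithmetic for one DDR5 channel, assuming (as described above) that an ECC UDIMM channel needs 36 bits and that chips can only be added whole. The function name is hypothetical.

```python
import math

def x8_only_channel(needed_bits, chip_width=8):
    """Chips, wired bits and unused fraction when a channel is built from one chip width."""
    chips = math.ceil(needed_bits / chip_width)   # whole chips only
    wired_bits = chips * chip_width
    unused = (wired_bits - needed_bits) / wired_bits
    return chips, wired_bits, unused

chips, wired, unused = x8_only_channel(36)
print(chips, wired, f"{unused:.0%}")  # 5 40 10%

# Mixing four x8 chips with one x4 chip gives exactly 36 bits, so nothing is unused.
print(4 * 8 + 1 * 4)  # 36
```

This matches the Kingston datasheet mentioned upthread: five x8 chips per channel, two channels and two ranks gives 20 chips.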
Nothing changed since first Ryzen launch:
- Desktop Ryzen CPUs support ECC, but implementation by motherboard vendors is not mandatory
- Laptop and G-series Ryzen CPUs only support ECC in pro variant
- Threadripper has ECC support
edit: not confirmed but supposedly laptop and APUs starting with 6000 series all support ECC.
7840HS laptop CPU definitely doesn't have ECC, only the PRO variant.
What is weird is that for almost 2 years AMD published specifications stating that all Ryzen 6000 series and all Ryzen 7000 series laptop CPUs (Rembrandt and Phoenix) support ECC, then suddenly and silently removed the statement about ECC support from all of those specifications.
For the current Ryzen 8000 laptop series (Hawk Point), the ECC support has been specified as missing from the beginning.
There are also a few oddball desktop SKUs that are actually a G-series processor with the GPU disabled (primarily ones below the "600" tier, e.g. the Ryzen 3 4100 or Ryzen 5 5500), which also lack ECC support.
Is it disabled or is it simply not certified?
Last I checked ECC was not certified to work on most "consumer" oriented hardware, but AMD didn't make any attempt to actually disable it.
Last I checked, DESKTOP AMD CPUs have working (not disabled, but not 'supported') ECC with DDR5 UDIMMs (5v source, not 12v server ram). Desktop BOARDS, depends on the HW + BIOS; initial firmware revisions didn't do ECC but for many boards on some brands it does work. I haven't checked recently.
No longer true.
At least since Zen 3 (2020), all AMD desktop Ryzen CPUs have ECC support clearly included in their specifications, so it is official support, not just a non-disabled ECC.
This change came around the same time that Intel began supporting ECC in some Alder Lake desktop CPUs (and in their successors), so it might have been a response to Intel's decision.
So now the ECC support depends strictly on the motherboard manufacturer. The best chance to find motherboards with ECC support is at ASUS and at ASRock (including ASRock Rack, which offers server boards for Ryzen CPUs).
Is that a change? I think what you described applies equally to every generation of Zen processors: Pro-branded chips have ECC capability officially, laptop chips don't have it, and consumer-branded chips have it unofficially with ECC capability optional for motherboards.
For a few generations now, at least since Zen 3, desktop Ryzen CPUs have had official ECC support, not unofficial ECC support like the first Ryzen.
This can be verified easily on the AMD site by reading the CPU specifications.
For laptop CPUs, there was a period between the beginning of 2022 and the autumn of 2023 when ECC support was specified for all mobile Rembrandt and Phoenix CPUs, but then ECC support was suddenly removed from the specifications of all non-Pro laptop CPUs.
All the Ryzen 7000 series desktop processors support ECC, I think. Check each model's specs to be sure.
Asus motherboards for those CPUs also support it, as stated in the BIOS manuals I looked over. It requires changing the BIOS ECC setting from Auto to Enable.
I have done this on one such system, and the appropriate EDAC messages showed up in the Linux boot log.
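If you would rather check the counters than grep the boot log, the Linux EDAC driver also exposes per-controller error counts in sysfs. A minimal sketch, assuming the usual `/sys/devices/system/edac/mc/mc*/ce_count` and `ue_count` layout; `read_edac_counts` is just a name picked for this example.

```python
from pathlib import Path

def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} for each EDAC memory controller."""
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text())
            ue = int((mc / "ue_count").read_text())
        except (OSError, ValueError):
            continue  # skip controllers whose counters are missing or unreadable
        counts[mc.name] = (ce, ue)
    return counts

if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        print("No EDAC memory controllers found; ECC reporting may be inactive.")
    for name, (ce, ue) in counts.items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

An empty result does not prove ECC is off (the EDAC module may simply not be loaded), but a populated `mc0` is a good sign.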
I would use ECC memory if I could. I used to use a TR 2920x with ECC but now I'm on a Ryzen 7950x with non-ECC. Unbuffered ECC memory is the only one supported by Ryzen, and it's slower or more expensive or both compared to the equivalent-capacity non-ECC memory. The latest Threadripper lineup supports Registered ECC, but Threadripper is overkill (cost, threads, PCIe lanes) for home users like myself.
Have you considered a 7950X3D? The larger CPU cache might make up for the RAM speed difference.
I didn't want to have to deal with the non-uniform CCDs. Of course the two on a 7950x aren't uniform either due to silicon lottery (eg on mine the first CCD clocks 100MHz higher than the other on all-core load and 200MHz higher on single-core load), but that is a small difference. It would presumably be more pronounced on the 7950x3d since only one has access to the extra cache. So I would be using it "sub-optimally" if I didn't `taskset` / cgroup everything to run on one CCD or the other.
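For anyone who does want to pin by CCD, a rough sketch of the `taskset` idea in Python on Linux. It assumes that cache `index3` in sysfs is the L3 and that CPUs sharing an L3 are on the same CCD, which is worth verifying on your own machine; the helper names are invented for this example.

```python
import os
from pathlib import Path

def parse_cpu_list(text):
    """Parse a kernel CPU list such as '0-7,16-23' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus

def cpus_sharing_l3(cpu=0):
    # Assumption: index3 is the L3 cache on this machine.
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/cache/index3/shared_cpu_list")
    return parse_cpu_list(path.read_text())

def pin_to_ccd_of(cpu=0):
    """Restrict the current process to the CPUs that share an L3 with `cpu`."""
    os.sched_setaffinity(0, cpus_sharing_l3(cpu))
```

Calling `pin_to_ccd_of(0)` before starting the workload would have roughly the same effect as launching it under `taskset` with the matching CPU list.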
I wonder what workload needs more than eight dual-threaded cores, but has trouble if the additional cores have more cache or the RAM isn't factory-overclocked, and doesn't care about data integrity.
Keep in mind that for my 5950X I had to buy Micron Rev E 16Gbit x8 die based DIMMs, rated 3200CL22, running 3600CL18. I.e., they just don't ship with XMP presets.
Overclock it yourself. It's not that hard.
The advantage of buying RAM with XMP presets is that the reseller who created the preset has tested the sticks with that overclock and binned the original chips accordingly. When you buy RAM that is only rated for the default speed (as ECC server RAM is), you have no guarantee that all sticks will overclock the same amount, so in the worst case one stick will bring all the other sticks down to its level.
That's not anything new though.
My local supplier (https://www.scorptec.com.au) has a fair amount of both, with ECC currently being about double the price (ugh).
When the AM4 generation was current, the difference in price was a lot less though. :/
Still, it's worth it for peace of mind. Especially if you're undervolting the CPU. ;)
How much of that is it being actually slower and how much is it just ECC memory not being sold at pre-overclocked speeds?
This seems like all the argument necessary to require all computers - and I do mean all computers - to have ECC memory. The security risk is simply too great, and everything is too integrated not to make this change. Even a "gamer" on a pure gaming computer will have some crucial information on that machine, so I simply do not see how we've gone this far without making this change.
The reality is that Rowhammer remains one of the hardest ways to compromise a machine, largely because the software stack most people are running isn't great.
Which belongs to the gamer user, so it can be extracted or encrypted by any malware running on that system. No need for esoteric attacks like Rowhammer at this point. Unless you are thinking about, for example, exploiting a user's machine via Rowhammer in JS running in a browser tab, but as far as I know that was never made practical.
I've heard that DDR2 is immune to rowhammer. Is that actually the case, or is it just because nobody's looked at it? Is SRAM the only thing that's truly immune?
I have a few questions I haven't found satisfactory answers for in the existing papers:
* Are modern patrol read engines guided by the memory access patterns to respond to RowHammer style attacks?
* How aggressively would a patrol read engine have to scan the DRAM to safely stay ahead of RowHammer induced bit-flips?
* Would larger ECC words than the traditional 64+8 with multi-bit error correction change the game and allow us to build more reliable systems from DRAM with pattern vulnerabilities?
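On the last question, one piece of background: the traditional 64+8 layout is what a Hamming code with an extra parity bit (SECDED) needs for 64 data bits. A small sketch of how the check-bit count grows with word size, covering only single-error correction with double-error detection; stronger multi-bit correcting codes need more check bits and are not modelled here.

```python
def secded_check_bits(data_bits):
    """Check bits for a Hamming SECDED code protecting `data_bits` bits."""
    r = 0
    while 2 ** r < data_bits + r + 1:   # Hamming bound for single-error correction
        r += 1
    return r + 1                        # one extra overall parity bit for double-error detection

for m in (32, 64, 128, 256):
    print(m, secded_check_bits(m))
# 32 7
# 64 8
# 128 9
# 256 10
```

So the relative overhead of SECDED falls as words get wider, which is the budget a wider word could spend on stronger correction instead.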
I would expect that increased crash rate of multi-tenant hosts would be something that would be detected and investigated by the cloud provider. At the same time targeting a specific tenant would require a lot of luck.