
ZenHammer: Rowhammer attacks on AMD Zen-based platforms

tdullien
34 replies
21h18m

Coauthor of the original Rowhammer exploit here. ECC remains a highly effective method for turning this from a security issue to a reliability issue, mostly. As an individual owner of a server, if that server has ECC and you expect to notice machine halts due to uncorrectable ECC errors, the security implications for you are modest.

Now, if you are a cloud provider that provides VMs on multitenant hosts, your threat model may be different.

Either way, avoid machines without ECC. TRR was a lame duck even when Rowhammer was still fresh, and bits flipping in DRAM will not go away unless the economics in DRAM manufacturing change (i.e. not).

treprinum
19 replies
20h41m

AMD now took Intel's market segmentation approach and is disabling ECC on most Ryzen CPUs. Only Pro and Threadrippers have it guaranteed, then some boards with some desktop Ryzens.

treprinum
1 replies
5h58m

"Supports" might mean you can run unbuffered ECC UDIMMs but without ECC? Even Intel can run ECC UDIMMs in non-ECC mode. Also some manufacturers don't distinguish between "on-die ECC" (DDR5) and real ECC.

sunshowers
0 replies
1h27m

No, the memory controller believes ECC is activated. See my post :)

Dylan16807
3 replies
15h48m

Less dire for sure, but having to pay double the cost for ECC modules is still pretty dire.

adrian_b
2 replies
12h22m

That is not true. The last time I looked, ECC DDR5 UDIMM modules were priced at most 50% higher.

Nevertheless that is still excessively high. While in the beginning for DDR5 there were only 80-bit modules, which could claim a +25% higher price, now there are 72-bit modules, like in the previous generations, which can justify at most a +12.5% price increase.
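
As a quick check of those percentages, here is a minimal sketch that computes how many extra bits an ECC module carries relative to a plain 64-bit module. The function name is made up for illustration, and it assumes price should scale with the number of DRAM bits, which is the argument made above rather than a market fact.

```python
def ecc_bit_overhead(total_bits, data_bits=64):
    """Fraction of extra bits an ECC module carries compared with its data width."""
    return (total_bits - data_bits) / data_bits


print(ecc_bit_overhead(80))  # 0.25  -> the +25% case (early 80-bit DDR5 modules)
print(ecc_bit_overhead(72))  # 0.125 -> the +12.5% case (72-bit modules)
```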

Dylan16807
1 replies
10h30m

I looked right when I posted that and found 2x32GB non-ECC 5600MHz for $165, exactly half the $330 price of the sticks listed in the post. I spent several minutes looking for cheaper ECC at the same specs and couldn't find any.

Trying again, I can find some Kingston sticks that are $120 each, so that's about 50% higher. Amazon's search is really bad, by the way. But that's not the "at most" price. And a month ago they were $140 each.

Also the spec sheet says they are 72 bit modules made out of 20 2GB x8 chips? That is baffling. Is 10% of the memory going unused? https://www.kingston.com/datasheets/KSM56E46BD8KM-32HA.pdf

Edit: Micron data sheets suggest that UDIMMs have 2x13 command/address pins and RDIMMs have 2x6, so that's one piece of the puzzle. Apparently UDIMMs can do x64 and x72, and RDIMMs can do x72 and x80.

adrian_b
0 replies
9h55m

When DDR5 was first introduced, there were only x8 chips.

Because the DDR5 channels must have a width of 32, 36 or 40 bits, with x8 chips one had to use 40-bit channels, even if only 36-bit channels are needed, so indeed 10% of the memory capacity remained unused.

Meanwhile, about a year ago, at least Micron also introduced x4 chips. There are now ECC UDIMMs, using both x8 and x4 chips, which waste no memory.

On the market there are both modules made only with x8 chips, which do not use a part of the memory, and modules with a combination of x8 and x4 chips, without unused memory.
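
The 10% figure can be reproduced with a few lines of arithmetic. This is only a sketch of the reasoning above: it takes the 36 bits needed per DDR5 channel (32 data plus 4 check bits) from the comment, and the function names are hypothetical.

```python
import math


def single_width_layout(needed_bits=36, chip_width=8):
    """Chips needed, bits provided, and unused fraction with one chip width only."""
    chips = math.ceil(needed_bits / chip_width)
    provided = chips * chip_width
    unused_fraction = (provided - needed_bits) / provided
    return chips, provided, unused_fraction


def mixed_layout_width(x8_chips=4, x4_chips=1):
    """Total channel width when x8 and x4 chips are combined."""
    return x8_chips * 8 + x4_chips * 4


chips, provided, unused = single_width_layout()
print(chips, provided, unused)   # 5 chips, 40 bits, about 10% unused
print(mixed_layout_width())      # 36 bits, nothing unused
```

The same ratio appears in the module linked earlier, assuming it is a 32 GB part: 20 chips of 2 GB give 40 GB raw, while 32 GB of data plus 4 GB of check bits need only 36 GB.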

c2h5oh
3 replies
20h15m

Nothing changed since first Ryzen launch:

- Desktop Ryzen CPUs support ECC, but implementation by motherboard vendors is not mandatory

- Laptop and G-series Ryzen CPUs only support ECC in pro variant

- Threadripper has ECC support

edit: not confirmed but supposedly laptop and APUs starting with 6000 series all support ECC.

treprinum
1 replies
18h51m

7840HS laptop CPU definitely doesn't have ECC, only the PRO variant.

adrian_b
0 replies
12h28m

What is weird is that for almost 2 years AMD published specifications stating that all the Ryzen 6000 series and all the Ryzen 7000 series laptop CPUs (Rembrandt and Phoenix) support ECC, then suddenly and silently removed the statement about ECC support from all those specifications.

For the current Ryzen 8000 laptop series (Hawk Point), the ECC support has been specified as missing from the beginning.

0xcde4c3db
0 replies
19h44m

There are also a few oddball desktop SKUs that are actually a G-series processor with the GPU disabled (primarily ones below the "600" tier, e.g. the Ryzen 3 4100 or Ryzen 5 5500), which also lack ECC support.

dralley
2 replies
20h15m

Is it disabled or is it simply not certified?

Last I checked ECC was not certified to work on most "consumer" oriented hardware, but AMD didn't make any attempt to actually disable it.

mjevans
1 replies
18h28m

Last I checked, DESKTOP AMD CPUs have working (not disabled, but not 'supported') ECC with DDR5 UDIMMs (5v source, not 12v server ram). Desktop BOARDS, depends on the HW + BIOS; initial firmware revisions didn't do ECC but for many boards on some brands it does work. I haven't checked recently.

adrian_b
0 replies
12h10m

No longer true.

At least since Zen 3 (2020), all AMD desktop Ryzen CPUs have ECC support clearly included in their specifications, so it is official support, not just a non-disabled ECC.

This change came around the same time that Intel began to support ECC in some Alder Lake desktop CPUs (and in their successors), so it might have been a response to Intel's decision.

So now the ECC support depends strictly on the motherboard manufacturer. The best chance to find motherboards with ECC support is at ASUS and at ASRock (including ASRock Rack, which offers server boards for Ryzen CPUs).

wtallis
1 replies
20h30m

Is that a change? I think what you described applies equally to every generation of Zen processors: Pro-branded chips have ECC capability officially, laptop chips don't have it, and consumer-branded chips have it unofficially with ECC capability optional for motherboards.

adrian_b
0 replies
12h16m

For a few generations already, at least since Zen 3, the desktop Ryzen CPUs have had official ECC support, not unofficial ECC support like the first Ryzen.

This can be verified easily on the AMD site by reading the CPU specifications.

For laptop CPUs, there was a period between the beginning of 2022 and the autumn of 2023 when ECC support was specified for all mobile Rembrandt and Phoenix CPUs, but then the ECC support was suddenly removed from the specifications of all non-Pro laptop CPUs.

foresto
0 replies
10h28m

All the Ryzen 7000 series desktop processors support ECC, I think. Check each model's specs to be sure.

Asus motherboards for those CPUs also support it, as stated in the BIOS manuals I looked over. It requires changing the BIOS ECC setting from Auto to Enable.

I have done this on one such system, and the appropriate EDAC messages showed up in the Linux boot log.
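
For anyone who wants to check more than the boot log, the Linux EDAC subsystem also exposes per-memory-controller error counters in sysfs. A minimal sketch, assuming the usual `/sys/devices/system/edac/mc/mc*/ce_count` and `ue_count` files are present (they only appear when an EDAC driver is loaded); the function name is made up for illustration.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return corrected/uncorrected error counts per memory controller."""
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        try:
            corrected = int((mc / "ce_count").read_text())
            uncorrected = int((mc / "ue_count").read_text())
        except (OSError, ValueError):
            # Controller directory without readable counters: skip it.
            continue
        counts[mc.name] = {"corrected": corrected, "uncorrected": uncorrected}
    return counts
```

An empty result from `read_edac_counts()` suggests that no EDAC driver is active, which is worth ruling out before trusting that ECC is really on.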

Arnavion
7 replies
19h53m

I would use ECC memory if I could. I used to use a TR 2920x with ECC but now I'm on a Ryzen 7950x with non-ECC. Unbuffered ECC memory is the only one supported by Ryzen, and it's slower or more expensive or both compared to the equivalent-capacity non-ECC memory. The latest Threadripper lineup supports Registered ECC, but Threadripper is overkill (cost, threads, PCIe lanes) for home users like myself.

foresto
2 replies
10h23m

Have you considered a 7950X3D? The larger CPU cache might make up for the RAM speed difference.

Arnavion
1 replies
2h52m

I didn't want to have to deal with the non-uniform CCDs. Of course the two on a 7950x aren't uniform either due to silicon lottery (eg on mine the first CCD clocks 100MHz higher than the other on all-core load and 200MHz higher on single-core load), but that is a small difference. It would presumably be more pronounced on the 7950x3d since only one has access to the extra cache. So I would be using it "sub-optimally" if I didn't `taskset` / cgroup everything to run on one CCD or the other.
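
For reference, the pinning described above can also be done from inside a process instead of via `taskset`. This is a sketch only: it assumes a 16-core, two-CCD part where Linux numbers the first hardware thread of each core 0-15 and the SMT siblings 16-31. That layout is an assumption, so check `lscpu -e` on the actual machine. Both function names are hypothetical.

```python
import os


def ccd_cpus(ccd, cores_per_ccd=8, total_cores=16, smt=True):
    """Logical CPU ids for one CCD under the assumed numbering scheme."""
    first = ccd * cores_per_ccd
    cpus = set(range(first, first + cores_per_ccd))
    if smt:
        # Assumed: the SMT sibling of core c is logical CPU c + total_cores.
        cpus |= {c + total_cores for c in cpus}
    return cpus


def pin_to_ccd(ccd, pid=0):
    """Restrict a process (0 = the caller) to one CCD. Linux only."""
    os.sched_setaffinity(pid, ccd_cpus(ccd))
```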

foresto
0 replies
1h20m

I wonder what workload needs more than eight dual-threaded cores, but has trouble if the additional cores have more cache or the RAM isn't factory-overclocked, and doesn't care about data integrity.

namibj
1 replies
16h43m

Keep in mind that for my 5950X I had to buy Micron Rev E 16Gbit x8 die based DIMMs, rated 3200CL22, running 3600CL18. I.e., they just don't ship with XMP presets.

Overclock it yourself. It's not that hard.

Arnavion
0 replies
1h26m

The advantage of buying RAM with XMP presets is that the reseller who created the preset has tested the sticks with that overclock and binned the original chips accordingly. When you buy RAM that is only rated for the default speed (as ECC server RAM is), you have no guarantee that all sticks will overclock the same amount, so in the worst case one stick will bring all the other sticks down to its level.

justinclift
0 replies
18h29m

and it's slower or more expensive or both compared to the equivalent-capacity non-ECC memory.

That's not anything new though.

My local supplier (https://www.scorptec.com.au) has a fair amount of both, with ECC currently being about double the price (ugh).

When the AM4 generation was current, the difference in price was a lot less though. :/

Still it's worth it for peace of mind. Especially if you're undervolting the cpu. ;)

account42
0 replies
1h49m

it's slower or more expensive or both compared to the equivalent-capacity non-ECC memory

How much of that is it being actually slower and how much is it just ECC memory not being sold at pre-overclocked speeds?

cbozeman
2 replies
13h39m

bits flipping in DRAM will not go away unless the economics in DRAM manufacturing change

This seems like all the argument necessary to require all computers - and I do mean all computers - to require ECC memory. The security risk is simply too great, and everything is too integrated not to make this change. Even a "gamer" on a pure gaming computer will have some crucial information on that machine, so I simply do not see how we've gone this far without making this change.

tdullien
0 replies
11h14m

The reality is that Rowhammer remains one of the hardest ways to compromise a machine, largely because the software stack most people are running isn't great.

maxcoder4
0 replies
6h43m

Even a "gamer" on a pure gaming computer will have some crucial information on that machine

Which belongs to the gamer user, so it can be extracted or encrypted by every malware running on that system. No need for esoteric attacks like rowhammer at this point. Unless you think about, for example, exploiting a user's machine via rowhammer in JS running in a browser tab, but as far as I know that was never made practical.

myself248
0 replies
13h40m

I've heard that DDR2 is immune to rowhammer. Is that actually the case, or is it just because nobody's looked at it? Is SRAM the only thing that's truly immune?

crest
0 replies
6h54m

I have a few questions I haven't found satisfactory answers for in the existing papers:

* Are modern patrol read engines guided by the memory access patterns to respond to RowHammer style attacks?

* How aggressively would a patrol read engine have to scan the DRAM to safely stay ahead of RowHammer induced bit-flips?

* Would larger ECC words than the traditional 64+8 with multi-bit error correction change the game and allow us to build more reliable systems from DRAM with pattern vulnerabilities?

c2h5oh
0 replies
21h2m

I would expect that increased crash rate of multi-tenant hosts would be something that would be detected and investigated by the cloud provider. At the same time targeting a specific tenant would require a lot of luck.

crotchfire
32 replies
21h57m

What about DIMMs with Error Correction Codes (ECC)? Previous work on DDR3 showed that ECC cannot provide protection against Rowhammer.

This is incredibly misleading. The paper they cite states:

When the ECC detection is used correctly 0.65%-7.42% of all bit flips still cause silent corruptions... On setup AMD-1, uncorrectable errors crash the system.

The attacker will need to cause dozens of machine halts in order to achieve even a single exploitable bitflip. Dozens of machine halts is not something that goes undetected.
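
To put rough numbers on that, here is a back-of-the-envelope sketch using only the 0.65%-7.42% range quoted above. It assumes each bit flip is an independent trial, which is a simplification and not something the paper claims. The thread also does not say how the detected flips split between corrected (logged) errors and uncorrectable ones (halts).

```python
def expected_flips_per_silent_corruption(p_silent):
    """Mean number of bit flips until the first silent one (geometric distribution)."""
    return 1.0 / p_silent


for p in (0.0742, 0.0065):
    n = expected_flips_per_silent_corruption(p)
    # All flips before the silent one were detected (corrected or fatal).
    print(f"p={p:.2%}: about {n:.0f} flips, of which about {n - 1:.0f} are detected")
```

With the quoted range this works out to somewhere between roughly 13 and 154 flips per silent corruption.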

Kudos for calling out JEDEC's terrible behavior on the rowhammer question, but we should not be downplaying ECC as a near-term solution.

reliabilityguy
12 replies
21h50m

--

crotchfire
11 replies
21h46m

It will detect (by crashing) enough to make exploitation impractical. That is the key point.

reliabilityguy
10 replies
21h39m

I would say that 60% success per trial is a good chance.

exmadscientist
9 replies
21h31m

In the process of generating one triple flip, many, many, many, many, many single and double flips will occur and will be caught. That is why ECC is still an effective defense. Attackers don't just get to go straight to their end game.
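
The single/double/triple distinction can be seen in a toy SECDED code. The sketch below uses an extended Hamming (8,4) code, which is much smaller than the 64+8 codes on real DIMMs and is not what any particular memory controller implements; it is here only to show the decoder's behavior for each flip count. All names are made up for illustration.

```python
from itertools import combinations

DATA_POSITIONS = (3, 5, 6, 7)  # 1, 2, 4 hold Hamming parity; 0 holds overall parity


def encode(nibble):
    """Encode a 4-bit value into an 8-bit extended Hamming (SECDED) codeword."""
    code = [0] * 8
    for i, pos in enumerate(DATA_POSITIONS):
        code[pos] = (nibble >> i) & 1
    for p in (1, 2, 4):
        # Parity bit p covers every other position whose index has bit p set.
        code[p] = sum(code[i] for i in range(1, 8) if i & p and i != p) % 2
    code[0] = sum(code[1:]) % 2
    return code


def decode(code):
    """Return (status, value); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i
    overall_odd = sum(code) % 2
    if syndrome == 0 and not overall_odd:
        status = "ok"
    elif overall_odd:
        # Looks like a single-bit error: fix it (syndrome 0 means bit 0 itself).
        if syndrome:
            code[syndrome] ^= 1
        status = "corrected"
    else:
        status = "uncorrectable"  # even number of flips, nonzero syndrome
    value = sum(code[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return status, value


def outcomes(n_flips):
    """Count decoder outcomes over every value and every n-bit flip pattern."""
    counts = {}
    for nibble in range(16):
        word = encode(nibble)
        for positions in combinations(range(8), n_flips):
            flipped = [b ^ 1 if i in positions else b for i, b in enumerate(word)]
            status, value = decode(flipped)
            if status != "uncorrectable" and value != nibble:
                status = "silent corruption"
            counts[status] = counts.get(status, 0) + 1
    return counts


for n in (1, 2, 3):
    print(n, outcomes(n))
# 1 {'corrected': 128}
# 2 {'uncorrectable': 448}
# 3 {'silent corruption': 896}
```

In this toy code every triple flip is miscorrected into wrong data while being reported as a corrected single-bit error, but an attacker who cannot place three flips in one word at once passes through the single and double cases first, and those are all reported.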

YetAnotherNick
5 replies
21h3m

You can cause any number of single and double flips without worry. It's not a defence as the attacker can retry till ECC labels it as uncorrectable. AFAIK there is no cost in retrying.

exmadscientist
4 replies
18h14m

That's true, but none of it is silent. Corrected errors get reported and it will be obvious that something is going wrong to anyone who's paying attention.

YetAnotherNick
3 replies
17h37m

Reported where? There is no reporting in Ryzen CPUs.

namibj
1 replies
16h38m

Got a reference? Because my Zen3 desktop has the driver loaded and information shown, just not the bitflips but that may be due to excessively early refresh configuration.

adrian_b
0 replies
1h30m

Normally you should not see any bit flips, because they happen at intervals of several months or even less frequently, depending on location.

Only for some old modules, e.g. 5 years old or older, the frequency of errors can increase a lot, even up to many bit flips per day, which means that the offending module must be replaced.

This feature of identifying the aged modules is one of the main benefits of ECC.

I have not looked again at the AMD EDAC driver, which has been updated during last year, but previously, a couple of years ago, its feature of injecting errors for testing was broken on Ryzen (because it had not been updated since Bulldozer, at that time), so the only method to verify that error reporting is working was to overclock the memory in the BIOS settings, to ensure that errors will be generated. Obviously, for the test one should boot from read-only media, to avoid the corruption of the storage in the case of excessive errors.

theevilsharpie
0 replies
12h58m

Ryzen CPUs report ECC errors like any other modern CPU -- it raises a Machine Check Exception which the operating system is expected to handle. Linux and Windows will both handle and log any ECC errors that the CPU raises. Presumably the various BSDs do as well.

reliabilityguy
2 replies
21h24m

--

rightbyte
0 replies
21h9m

That's true for encryption too.

exmadscientist
0 replies
21h8m

The ECCploit paper has extensive discussion of all the ways their work is detected, and how they even use detection to probe the correction structure. This is not a silent attack. This is a proof that ECC is a penetrable defense. Which we all know! The question is how difficult it is and how stealthily it can be done.

But regardless, ECC still sounds the alarm when it's being attacked. If no one listens, there's not much ECC can do about that.

wolpoli
5 replies
20h7m

The attacker will need to cause dozens of machine halts in order to achieve even a single exploitable bitflip. Dozens of machine halts is not something that goes undetected.

Is there a process for the operations team managing the system to figure out that it was an attack and not just flaky hardware?

adrian_b
1 replies
11h58m

Memory bit flips are very rare.

Normally a memory error does not happen more than a few times per year, unless you have a huge amount of memory.

Therefore when 2 memory errors, correctable or uncorrectable, happen in the same day, that should be enough to trigger an immediate report to the user or administrator of the computer that either there is an ongoing RowHammer attack that must be stopped or one of the memory modules is approaching its end-of-life due to aging and must be replaced before it begins to have very frequent memory errors.

At least on server computers it should be easy to configure their logging system so that a second memory error per day, even if it was correctable, should immediately send an e-mail message and/or an SMS to the administrator.
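
The rule above is simple enough to sketch. This version treats "per day" as a sliding 24-hour window, which is an interpretation, and leaves out where the timestamps come from (EDAC counters, the MCE log, the BMC) and how the message is sent. The function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta


def should_alert(error_times, window=timedelta(days=1), threshold=2):
    """True if at least `threshold` memory errors fall within one `window`."""
    times = sorted(error_times)
    for i in range(len(times) - threshold + 1):
        if times[i + threshold - 1] - times[i] <= window:
            return True
    return False


errors = [datetime(2024, 3, 25, 9, 0), datetime(2024, 3, 25, 12, 30)]
print(should_alert(errors))  # True: two errors within 24 hours
```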

wolpoli
0 replies
10h4m

If that's the case, then I guess they would take the physical server offline. And if other machines started showing similar signs of failure, then they would analyze the logs for a possible rowhammer attack?

pixl97
0 replies
17h47m

The same workload starts crashing after migrating to multiple machines?

justinclift
0 replies
18h27m

Sounds like a process thing that would need to be developed by each team. So probably a mix of results there.

crotchfire
0 replies
17h51m

Sure: you replace the hardware with brand new hardware and it keeps happening. Then you know it's not the hardware.

transpute
5 replies
21h51m

Any recommendations for client devices with ECC memory?

wtallis
3 replies
21h4m

If it has ECC memory, it's going to be branded as a workstation or server or industrial device, not marketed as a consumer device.

Among consumer products, some AMD desktop CPUs and motherboards support ECC memory, and that's about it.

justinclift
2 replies
18h23m

For desktops, ASRock motherboards seem to be the common choice for people wanting ECC memory.

It's specifically mentioned on the ASRock motherboard pages under "Specifications". Some random examples:

https://www.asrock.com/mb/AMD/B650%20Pro%20RS/index.asp#Spec...

https://pg.asrock.com/mb/AMD/B650%20PG%20Lightning%20WiFi/in...

https://www.asrock.com/mb/AMD/X670E%20Taichi/index.asp#Speci...

These all have:

    Supports DDR5 ECC/non-ECC, un-buffered memory up to 7200+(OC)

jeffbee
1 replies
17h35m

I think it's worth investigating the level of "support" these boards offer for ECC. The ASRock Taichi for example does not have any ECC DIMMs in its "qualified" list.

justinclift
0 replies
17h3m

Interesting. Might be good for someone (not me!) to investigate then write in-depth info about. :)

As a data point, I'm using a previous generation ASRock AM4 motherboard with ECC and that definitely works.

I'm undervolting my cpu and ram, and very occasionally (every 6 months or so?) one of those seems to be generating a correctable ECC error that gets propagated to warning messages on my terminal. Haven't bothered investigating any further though. ;)

adrian_b
0 replies
1h50m

The laptops with ECC memory are expensive and they are available for now only with Intel CPUs (while it should be possible to use mobile AMD CPUs I have never seen any such product). They are sold as "mobile workstations" by Dell, Lenovo and HP. I have a Dell Precision mobile workstation laptop with ECC memory bought in 2016 and it still works fine. However I had to pay for it EUR 3000 in 2016 and now something similar would be even more expensive (it had an NVIDIA Quadro GPU and 32 GB of ECC memory).

For desktops it is much easier to choose ECC memory, because the additional cost (the cost of the memory modules is 50% higher for DDR5-4800) remains a small fraction of the cost of an entire computer.

What is needed is to buy a motherboard with ECC support.

An example of a good motherboard with ECC support is ASUS PRIME X670E-PRO WIFI (for AMD Ryzen). I have been using a similar ASUS motherboard with ECC memory from the previous X570 generation for the last 5 years and it still works fine.

There are several other such MBs, mainly at ASUS and ASRock.

For Intel Raptor Lake there are fewer and more expensive such motherboards, but they can be found at ASUS (Pro WS W680M-ACE SE) and at Supermicro, as "workstation motherboards".

p1necone
5 replies
21h31m

The attacker will need to cause dozens of machine halts in order to achieve even a single exploitable bitflip. Dozens of machine halts is not something that goes undetected.

That holds if you're targeting a specific machine. If you're throwing the exploit at a few thousand machines shotgun style then you're still going to get your botnet - it'll just be smaller.

vlovich123
3 replies
21h29m

I think the point is that people with thousands of machines are probably going to notice if a meaningful chunk of them start halting.

SAI_Peregrinus
1 replies
20h35m

Yep, and desktop users will certainly notice. Only AMD has desktop (not workstation) ECC support.

riedel
0 replies
20h22m

If you are running windows 10 random halts and the CPU getting hot won't seem suspicious.

p1necone
0 replies
19h6m

Why do you need to target one person who has thousands of machines? What if I just want to pwn whatever random machines visit my dodgy website? Dismissing an exploit just because it only works some fraction of the time seems overly optimistic to me.

crotchfire
0 replies
21h28m

Can you point to any botnets which were built using rowhammer attacks?

Rowhammer and speculative execution attacks are incredibly labor-intensive and target-specific. They are targeted attacks for high-value targets.

jquery
0 replies
19h6m

Thanks for this. One reason I bought ECC for my home desktop was specifically for protection against Rowhammer (Zen2 TR platform), and that line made my heart race a bit. Very misleading.

VHRanger
31 replies
22h48m

Serious question: as an average person, are those hardware security issues (rowhammer, spectre, meltdown) an actual risk?

My understanding with spectre and meltdown was that it was an issue for escaping VMs and similar attacks - something AWS engineers should care about, but not me

gary_0
10 replies
22h29m

The solution is to disable JavaScript and not run any untrusted apps. And then move to a shack in the woods and live off the land, because you just cut yourself off from modern society.

rtehfm
3 replies
22h1m

Sounds like a recipe to become the next Ted Kaczynski.

bitwize
1 replies
21h50m

Ted Kaczynski's views are pretty popular on Hackernews.

rustcleaner
0 replies
20h48m

Too bad we won't see Uncle Ted give a TED Talk. :^(

bee_rider
0 replies
21h46m

Just install Firefox, then noscript, and skip the bit about the shack.

tycho-newman
1 replies
21h40m

The fact that JavaScript is essential to modern society makes me want to cry.

berkes
0 replies
10h30m

I'm still not sure what's worse.

The fact that I must run JavaScript written by just about anyone, in order to live in a modern society, or the fact that I keep having to write code in JavaScript in order to run a (completely non-JS related) business.

spxneo
1 replies
21h46m

The year is 2024. Solar panels you installed from Alibaba begin to search for cell towers. Your local instance of LLM voice bot you built to keep you company is using a malicious npm package that suddenly communicates with the solar panels and starts sending packets to a Chinese server.

malfist
0 replies
18h52m

Your solar panels are talking to China, your lightswitch is part of a massive botnet promoting bitcoin on X, your car is selling your data to your insurance company to have an excuse to raise your rates, your browser is protecting your privacy by routing all your sensitive information through their servers for them to inspect.

Your phone is selling your location for antiabortion fanatics to harass you, or help your stalker find you. Your ISP is selling your browsing history to anyone with a dollar.

That databroker that everyone was selling to just went bankrupt and the banks are selling your data to anyone with a penny.

We desperately need wholesale privacy regulation.

transpute
0 replies
22h15m

Some browsers (including Brave on iOS) can disable Javascript by default, to be enabled only on trusted sites where 3rd-party ads are blocked.

bee_rider
0 replies
21h50m

Noscript is annoying for like a week until you get the sites that you use frequently and basically trust whitelisted.

Sure, it isn’t perfectly safe. If HN or my employer goes evil, they can rowhammer me I guess. I’d expect it to cause a big todo, though, so I’m not that worried about it.

I don’t really understand why people seem to think disabling JS is a big hassle. Is this motivated reasoning by web devs or something?

It is not a big problem, and the sort of “ambient shittiness” of the internet greatly improved by doing it. Most sites work fine, they’ll default to some (better) less dynamic state, maybe some ads won’t load. For those sites that don’t work, you can make an exception or leave. Personally I’m now mostly visiting sites by people who don’t enjoy over complicating things, and who think about fallbacks. It is great!

ncann
2 replies
22h7m

The practical answer is that, if 99.9% of people out there have systems that mitigate these issues, no one will bother using these exploits in the wild and you can turn off these mitigations to get the perf benefit and be reasonably sure that you won't get exploited. Unless you're targeted of course.

magnoliakobus
0 replies
21h39m

If 99.9% of people can be exposed to the same malicious code and not even be aware that it was running in the background, it's all the more reason for a malicious actor to expose the largest amount of people to it with relatively minimal risk.

berkes
0 replies
10h23m

But "we", being the average tech expert, also has no way to know when that 1% will hit.

It takes only one creative genius to turn the next security issue into a thing that does affect us all. Some worm that eats all linuxes, a virus that spreads through all bsds or something that installs crypto miners on every second android or so. We cannot know.

And so we cannot defend ourselves against that. And so it's useless to worry about it. But it will happen. Our systems are way too monoculture, both soft- and hardware, to be protected against a digital potato famine.

ls612
2 replies
20h20m

No. I've run the Rowhammer test in memtest86 on my PC after building it (as part of the whole memtest package to verify my XMP was stable) and got zero errors on 64GB of DDR5 memory over all the passes. If Memtest couldn't do it when trying its hardest to brute force it nobody doing drive-by javascript has any chance to exploit it.

userbinator
1 replies
15h6m

Could you tell us what DIMMs you're using? I thought Rowhammer-free RAM was a thing of the past, but if some manufacturer has fixed theirs to be immune, they deserve the extra sales and publicity.

ls612
0 replies
14h42m

Corsair Vengeance 2x32GB 5200MHz. My understanding is that DDR5 in general is mostly immune to known rowhammer attacks because the on-chip ECC is good enough to fix any issues. This attack seems to work only with AMD Zen processors and not with Intel 12th-14th gen so I suspect DDR5 on Intel is still good.

bee_rider
2 replies
21h4m

Everyone should install some kind of script whitelisting add-on and only run JavaScript programs from websites that they really trust. I like noscript. I’m not sure what the Chrome pick is.

Other than that… we don’t often run random programs from the internet, right?

They’ve only scratched the surface for these sorts of bugs. Modern hardware is too complex to actually believe they’ll ever get them all.

sundvor
1 replies
20h2m

Definitely not a security expert here, but this is one of the reasons why I at least run ublock origin on just about everything - and recommend everyone do the same. The ad delivery networks are just such a huge vulnerability surface.

Noscript would be much better of course, I guess I'm just too lazy to go that extra step.

bee_rider
0 replies
3h44m

I’m not an expert either, but I think the experts are not really very useful in this context.

At least, I typically see things about the trade off between usability and security and the need to enable certain use-cases. I think most security experts work in industry where their job is to figure out what can be done to patch things up within the constraint that their job doesn’t exist unless the company can do the stuff it needs to do to stay in business.

I don't really care about any of that, I just want to be able to read text from the internet without my system getting messed up. It is a much easier use-case, because static content is usually pretty safe (although I do think there have been vulnerabilities in font and image rendering libraries). We don’t need an expert to intelligently analyze things and balance against the interests of competing parties because there’s no need to push in the “open things up” direction for the most part.

Tuna-Fish
2 replies
22h3m

Before browsers got patched, meltdown could be used to steal browser encryption keys using js. This absolutely would have affected normal people.

mik1998
1 replies
21h59m

"could", theoretically. In practice, there has never been an observed exploitation of the supposed vulnerability.

mitigations=off

oynqr
0 replies
19h34m

Just don't do that on modern AMD processors, you'll lose performance.
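
Before touching `mitigations=off`, it is easy to see what the running kernel reports for each known issue. A minimal sketch, assuming a Linux kernel recent enough to expose `/sys/devices/system/cpu/vulnerabilities` (one file per issue, each holding a one-line status); the function name is made up for illustration.

```python
from pathlib import Path


def mitigation_status(root="/sys/devices/system/cpu/vulnerabilities"):
    """Map each vulnerability name to the status line the kernel reports."""
    status = {}
    for entry in sorted(Path(root).glob("*")):
        try:
            status[entry.name] = entry.read_text().strip()
        except OSError:
            # Unreadable entry or unexpected directory: skip it.
            continue
    return status


for name, state in mitigation_status().items():
    print(f"{name}: {state}")
```

Comparing the output with and without the boot parameter shows exactly which mitigations a given CPU loses.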

transpute
1 replies
22h44m

First paragraph:

  This poses a significant risk as DRAM devices in the wild cannot easily be fixed, and previous work showed that Rowhammer attacks are practical, for example, in the browser, on smartphones, across VMs, and even over the network.

ngneer
0 replies
20h38m

That is just one view, namely the authors' view. You may wish to consider recent perfect 10 vulnerabilities for comparison, as these are far more likely to cause problems.

rgbrenner
0 replies
22h21m

You run untrusted code everyday inside a VM: your browser.

ngneer
0 replies
20h55m

No. As a sober hardware security researcher, most exploited vulnerabilities that would affect an average person are far more mundane and mostly software driven.

hathawsh
0 replies
22h32m

From a security perspective, a web browser is a kind of VM hypervisor, where each web site may have its own VMs. So yes, everyone can be affected.

Dalewyn
0 replies
20h59m

If you really are an average person, then no: Like most other supposed threats, you lose more to the fixes/mitigations than to the threat itself. They just make for great headlines and sensationalism, which is why you as an average person would hear about them at all.

Note that the average person wouldn't know WTF "DRAM" means, let alone "Rowhammer" or "Zen" or other esoteric industry terms.

AtNightWeCode
0 replies
7h48m

Some of these exploits can be used in a browser. They leave no trace. So it is hard to tell how much these exploits have been used in the past and how likely a wider attack will happen in the future.

Some of these exploits have been used in targeted attacks towards end users so the risk is not 0.

wmf
8 replies
23h46m

The real news appears to be that rowhammer is mostly fixed on DDR5.

dist-epoch
5 replies
23h23m

DDR5 is so fragile they had to include on-die ECC to make it work, even when ECC is not exposed externally.

samstave
2 replies
22h36m

May you please ELI5 why DDR5 is 'fragile' as you put it?

Was its design pushing material sciences such that the theory worked, but practical implementation required the 'crutch' of ECC?

adgjlsfhk1
0 replies
22h14m

basically. pushing the timing and sizes makes it likely that some of your bits will fail to be built correctly. rather than dropping the speed and sizes to get reliability, you just throw an extra chip on to give you redundancy.

RachelF
0 replies
13h20m

Take a look at the spec. The speeds are so high that they use some modem channel characterization features on the memory bus.

Linus was right about ECC being needed, with higher capacities and speeds and reduced feature size it's becoming a must.

jeffbee
1 replies
23h12m

That only brings DRAM into alignment with flash and magnetic storage, so it's not really a negative. Everything in your computer is converging on semiconductor with bounded probabilistic state + math.

titzer
0 replies
23h8m

It's always been that way; it's just a question of how many nines of reliability we're talking about. E.g. at Google scale, bitflips in memory from cosmic rays and general noise happen every day. Everything has checksums on it.
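As a rough illustration of the "everything has checksums" point, the sketch below stores a CRC-32 next to a payload and shows that any single flipped bit is detected. The payload and the `flip_bit` helper are invented for the example; this is not a description of any particular production system.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


payload = b"row 4711: balance=1000"
stored = zlib.crc32(payload)  # checksum kept alongside the data

corrupted = flip_bit(payload, 13)  # simulate a stray bitflip
if zlib.crc32(corrupted) != stored:
    print("corruption detected, re-fetch from a replica")
```

A checksum only detects the error. Recovering from it needs a second copy or an error-correcting code.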

samtheprogram
0 replies
23h44m

Not news, and per the article:

Furthermore, we show that ZenHammer can trigger Rowhammer bit flips on a DDR5 device for the first time.

merb
0 replies
23h36m

Well, they said that it needs further testing. If it were mostly fixed, that would mean ECC could help even more. I mean, the on-die ECC probably already helps.

DarkNova6
6 replies
23h17m

I know far too little about hardware security. Is this one of the many inevitable vulnerabilities that arise from CPU optimization and are of little feasibility in the real world?

rocqua
4 replies
20h13m

Arguably worse. This arises from the physics of DRAM. This occurs at a much lower level than an edge case of a feature that lets you leak info over a side channel. Instead this is just: the data is stored as a small charge in a grid, and by flipping nearby points on the grid a lot you can leak some charge into your target charge.

The smaller the charge, and the closer together the charges, the easier rowhammer attacks are. Also, the smaller and closer together the charges, the faster, cheaper, denser, and more efficient your RAM gets.

There are mitigations, but they are pushed to the limit.
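A toy model of that trade-off, assuming a victim cell that loses a fixed sliver of charge on every neighbouring-row activation and is restored to full charge at each refresh. Every number here (the leak per activation, the 0.5 read threshold, the activation counts) is invented for illustration and is not measured from real DRAM.

```python
def victim_flips(activations_per_window, leak_per_activation, threshold=0.5):
    """Return True if the victim cell drops below the read threshold
    before the next refresh restores it to full charge (1.0)."""
    charge = 1.0
    for _ in range(activations_per_window):
        charge -= leak_per_activation  # each aggressor access disturbs the victim
        if charge < threshold:
            return True  # the cell would now read back as the wrong bit
    return False


if __name__ == "__main__":
    # Ordinary access pattern: few activations between refreshes.
    print(victim_flips(10_000, 1e-5))
    # Hammering: many more activations inside the same refresh window.
    print(victim_flips(100_000, 1e-5))
    # Denser cells (more leak per activation) flip under the ordinary pattern.
    print(victim_flips(10_000, 1e-4))
```

In this model, a shorter refresh interval (fewer activations per window) and tracking hot rows are the two obvious mitigations, and both cost performance or area.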

KennyBlanken
3 replies
16h59m

From what I understand, it arises because DRAM manufacturers, interested in maximizing profits as much as possible, have been pushing the limits of how small they can make the RAM chip's features, and then backing off slightly until they felt RAM was reliable "enough", but Rowhammer et al. demonstrate it's very easy to cause bit flipping?

markhahn
1 replies
13h39m

"maximize profits" and "best product for customer" are dual. you specifically want small chip features - or don't you like speed, power efficiency, and low cost?

throwaway48476
0 replies
6h47m

The point of engineering is trade offs. No one is trying to make a worse DRAM.

rocqua
0 replies
11h47m

They push the size to the limit, and stop when random writing is unlikely to cause any bitflips. Stopping at the point where rowhammer would be unlikely would mean stopping earlier.

As others said, this isn't just about profits. It's about being able to compete on costs (i.e. being able to survive at all) and to compete on the best performance. This places the problem less at singular manufacturers and more at the whole industry.

dist-epoch
0 replies
23h11m

This is a RAM problem, not a CPU one.

anticensor
3 replies
22h12m

Yes, but you might end up in a huge loss in stability. Even a single bitflip might become a fatal error.

formerly_proven
2 replies
22h10m

If you get that many bitflips the system wasn't stable to begin with.

wtallis
1 replies
20h31m

I think the implication was that memory encryption could mean that a rowhammer-induced bitflip would be amplified into scrambling the entire word of memory, which is more likely to have catastrophic effects than a single bit flip. That would be true for any reasonable definition of "stable" that admits any susceptibility to rowhammer.
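A small sketch of that amplification effect. It uses a toy 4-round Feistel cipher built from SHA-256 purely to stay within the standard library. That is an assumption for illustration, not the AES-based scheme real memory encryption uses, and the key, plaintext, and helper names are all made up.

```python
import hashlib

BLOCK = 16          # bytes per encrypted block
HALF = BLOCK // 2


def _round(key: bytes, rnd: int, half: bytes) -> bytes:
    """Round function: keyed hash of one half-block, truncated to HALF bytes."""
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:HALF]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for rnd in range(4):
        left, right = right, _xor(left, _round(key, rnd, right))
    return left + right


def decrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _round(key, rnd, left)), left
    return left + right


def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bit positions where a and b differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


key = b"toy-key-for-demo"
plain = b"sixteen byte msg"
stored = bytearray(encrypt(key, plain))
stored[3] ^= 0x04  # a single Rowhammer-style flip in the stored ciphertext
garbled = decrypt(key, bytes(stored))

if __name__ == "__main__":
    print("bits changed after decryption:", bit_difference(plain, garbled))
```

One flipped bit in the ciphertext comes back as a block where roughly half the bits differ, so the attacker loses control over what the flip does, and the software reading that block is more likely to crash.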

indeyets
0 replies
10h19m

But that’s a good thing. Sane state would be synced to disk, and any successful bitflip will halt the system, telling you that something bad is going on. It would be “catastrophic” for the runtime but not for the data.

swozey
1 replies
19h24m

I have a very vague understanding of all of these DDR bitflip attacks, but I found the original Hammertime paper and it's actually very easy to read. I haven't gone through all of it, but it breaks things down very clearly.

I've heard bitflipping a million times and never really got it (not that I made serious effort) until this.

https://comsec.ethz.ch/wp-content/files/hammertime_raid18.pd...

I feel like I just went through a 101 EE course. I had NO idea any of this was related to the actual hardware manufacturing imperfections, etc.

That explains the name Rowhammer. I've probably been under a rock and everyone knows this stuff.

Due to the extreme density of modern DRAM arrays, small manufacturing imperfections can cause weak electrical coupling between neighboring cells. This, combined with the minuscule capacitance of such cells, means that every time a DRAM row is read from a bank, the memory cells in adjacent rows leak a small amount of charge. If this happens frequently enough between two refresh cycles, the affected cells can leak enough charge that their stored bit value will “flip”, a phenomenon known as “disturbance error” or more recently as Rowhammer.

KennyBlanken
0 replies
16h55m

Due to the extreme density of modern DRAM arrays, small manufacturing imperfections can cause weak electrical coupling between neighboring cells.

This makes it sound like it's unavoidable and inherent to making DRAM. It isn't.

DRAM manufacturers have been pushing the limits to an extreme. That's why. Pursuit of profit. This is no different from Ford deciding the cost of settling Pinto lawsuits (from injuries and deaths) was less than the cost of fixing the car's design.

oldge
1 replies
22h40m

Does this work when full memory encryption, poisoning, and address xor is turned on?

reliabilityguy
0 replies
22h14m

With memory encryption it won't lead to system exploitation, just to a system crash.

So, with memory encryption you are safer.

binkHN
1 replies
19h23m

...ZenHammer could not trigger flips on nine out of ten devices ... more research is necessary to find more effective patterns for DDR5 devices.

So I guess DDR5 still has a little bit of time here. Anyone know if this also affects LPDDR5x?

wtallis
0 replies
16h31m

The DRAM interface is pretty well decoupled from the memory array itself. So whether you're looking at DDR5 or LPDDR5(x) or GDDR6(x) or HBM3(e) isn't the right question. What matters are the implementation details up to the manufacturer's discretion, such as on-die ECC.

axytol
0 replies
22h12m

They mention Zen 2 and 3, any info on Zen 1? Would it simply apply as well?