It remains to be seen how the microcode patch affects performance, and how these CPUs, pushed to the point of instability by over-voltage, will have aged six months or a few years from now.
More voltage generally improves stability, because there is more slack to close timing. Instability with high voltage suggests dangerous levels. A software patch can lower the voltage from this point on, but it can't take back any accumulated fatigue.
I was recently looking at building and buying a couple systems. I've always liked Intel. I went AMD this time.
It seemed like the gap between base and boost frequencies was much wider on Intel than on most of the AMDs. This was especially true on laptops, where cooling is a larger concern. So I suspect they were pushing limits.
Also, the performance core vs efficiency core stuff seemed kind of gimmicky with so few performance cores and so many efficiency cores. Like look at this 20 core processor! Oh wait, it's really an 8 core when it comes to performance. Hard to compare that to a 12 core 3D cached Ryzen with even higher clock...
I will say, it seems Intel might still have some advantages. It seems AMD had an issue supporting ECC with the current chipsets; I almost went Intel because of it. I ended up deciding that DDR5's built-in on-die error correction was enough for me. The performance graphs also seem to indicate smoother throughput, suggesting more efficient or elegant execution (less blocking?). But on average the AMDs seem to be putting out similar end results, even if the graph is a bit more "spikey".
The E cores are about half as fast as the P cores depending on use case, at about 30% of the size. If you have a program that can use more than 8 cores, then that 8P+12E CPU should approach a 14P CPU in speed. (And if it can't use more than 8 cores then P versus E doesn't matter.) (Or if you meant 4P+16E then I don't think those exist.)
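The back-of-the-envelope math behind that comparison, as a quick sketch (the 0.5 ratio is the rough estimate from the comment above; actual scaling varies a lot by workload):

```python
# Rough throughput model: each E core counts as ~0.5 of a P core
# (an assumed ratio for illustration; real results vary by workload).
P_CORES = 8
E_CORES = 12
E_RATIO = 0.5

p_core_equivalents = P_CORES + E_CORES * E_RATIO
print(p_core_equivalents)  # 14.0 -- roughly a 14P CPU on embarrassingly parallel work
```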
Only half of those cores properly get the advantage of the 3D cache. And I doubt those cores have a higher clock.
AMD's doing quite well but I think you're exaggerating a good bit.
Only if you use work stealing queues or (this is ridiculously unlikely) run multithreaded algorithms that are aware of the different performance and split the work unevenly to compensate.
Or if you use a single queue... which I would expect to be the default.
Blindly dividing work units across cores sounds like a terrible strategy for a general program that's sharing those cores with who-knows-what.
It’s a common strategy for small tasks where the overhead of dispatching the task greatly exceeds the computation of it. It’s also a better way to maximize L1/L2 cache hit rates by improving memory locality.
E.g. you have 100M rows and you want to cluster them by a distance function (naively). Running dist(arr[i], arr[j]) is crazy fast; the problem is just that you have so many of them. It is faster to run them all on one core than to dispatch each call from one queue to multiple cores, but best to assign the work ahead of time to n cores and have them crunch the numbers.
It has always been a bad idea to dispatch that naively, splitting the work evenly across exactly as many threads as you have cores. What if a couple of cores are busy and you spend almost twice as long as necessary waiting for the calculation to finish? I don't know how much software does that, but most of it could easily be fixed to dispatch, say, half a million rows at a time and get better performance on all computers.
Also on current CPUs it'll be affected by hyperthreading and launch 28 threads, which would probably work out pretty well overall.
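A minimal sketch of that chunked approach (the function name, 4 workers, and 500k chunk size are all illustrative): chunks sit on one shared queue, so a worker stuck behind other load, or running on a slower E core, simply claims fewer of them instead of the whole job waiting on the slowest thread.

```python
import queue
import threading

def process_chunked(rows, work_fn, n_workers=4, chunk_size=500_000):
    """Put fixed-size chunks on one shared queue; each worker pulls
    the next chunk when it finishes its last, so uneven core speeds
    balance out automatically."""
    chunks = queue.Queue()
    for start in range(0, len(rows), chunk_size):
        chunks.put((start, min(start + chunk_size, len(rows))))

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                start, end = chunks.get_nowait()
            except queue.Empty:
                return  # no chunks left
            out = work_fn(rows[start:end])
            with lock:
                results.append(out)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# Example: sum 2,000,000 numbers in 500k-row chunks (4 chunks)
total = sum(process_chunked(list(range(2_000_000)), sum))
print(total)
```

(For CPU-bound Python you'd reach for processes rather than threads because of the GIL, but the queue-of-chunks structure is the same.)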
This is what the Intel Thread Director [0] solves.
For high-intensity workloads, it will prioritize assigning them to P-cores.
[0] https://www.intel.com/content/www/us/en/support/articles/000...
Then you no longer have 14 cores in this example, but only len(P) cores. Also most code written in the wild isn’t going to use an architecture-specific library for this.
The P cores being presented as two logical cores and E cores presented as a single logical core results in this kind of split already.
Yeah, the 20 core Intels are benchmarking about the same as the 12 core AMD X3Ds. But many people just see 20>12. Either one is more than fine for most people.
"Oh wait, it's really an 8 core when it comes to performance [cores]". So yes, should not be an 8 core all together, but like you said about 14 cores, or 12 with the 3D cache.
"And I doubt those cores have a higher clock."
I'm not sure what we're comparing them to. They should be capable of higher clock than the E cores. I thought all the AMD cores had the ability to hit the max frequency (but not necessarily at the same time). And some of the cores might not be able to take advantage of the 3D cache, but that doesn't limit their frequency, from my understanding.
Oh, just higher clocked than the E cores. Yeah that's true, but if you're using that many cores at once you probably only care about total speed.
You said 12 core with higher clock versus 8, so I thought you were comparing to the performance cores.
The cores under the 3D cache have a notable clock penalty on existing CPUs.
Right, but my point is it's misleading to call out higher core count and the advantages of 3D stacking. The 3D stacking mostly benefits the cores it's on top of, which is 6-8 of them on existing CPUs.
"The cores under the 3D cache have a notable clock penalty on existing CPUs."
Interesting. I can't find any info on that. It makes sense though, since the 7900X's TDP is 50 W higher than the 7900X3D's.
"Right, but my point is it's misleading to call out higher core count and the advantages of 3D stacking"
Yeah, that makes sense. I didn't realize there was a clock penalty on some of the cores with the 3D cache and that only some cores could use it.
It's due to the stacked cache being harder to cool and not supporting as high of a voltage. So the 3D CCD clocks lower, but for some workloads it's still faster (mainly ones dealing with large buffers, like games, most compute heavy benchmarks fit in normal caches and the non 3D V-Cache variants take the win).
It’s kind of funny and reminiscent of the AMD bulldozer days where they had a ton of cores compared to the contemporary Intel chips, especially at low/mid price points but the AMD chips were laughably underwhelming for single core performance which was even more important then.
I can’t speak to the Intel chips because I’ve been out of the Intel game for a long time but my 5700X3D does seem to happily run all cores at max clock speed.
AMD has the advantage with regards to ECC. Intel doesn't support ECC at all on consumer chips; you need to go Xeon. AMD supports it on all chips, but it is up to the motherboard vendor to (correctly) implement it. You can get consumer-class AM4/AM5 boards that have ECC support.
My understanding is that it's screwed up for multiple vendors and chipsets. The boards might say they support it, but there are some updates saying it's not. It seemed extremely hard to find any that actually supported it. It was actually easier to find new Intel boards supporting ECC.
Yeah, Wendell put out a video a few weeks ago exploring a bunch of problems with ASRock Rack-branded server-market B650 motherboards, and the ECC situation was basically exactly what everyone warns about: the various BIOS versions wandered between "works, but doesn't forward the errors", "doesn't work, and doesn't forward the errors", and (excitingly) "doesn't work and doesn't even POST". We are a year and a half after Zen 4 launched, there barely are any server-branded boards to begin with, and even those boards don't work right.
https://youtu.be/RdYToqy05pI?t=503
I don't know how many times it has to be said but "doesn't explicitly disable" is not the same thing as "support". There are lots of other enablement steps that are required to get ECC to work properly, and they really need to be explicitly tested with each release (which if it is "not explicitly disabled", it's not getting tested). Support means you can complain to someone when it doesn't work right.
AMD churns AGESA really, really hard and it breaks all the time. Partners have to try and chase the upstream and sometimes it works and sometimes it doesn't. Elmor (Asus's Bios Guy) talked about this on Overclock.net back around 2017-2018 when AMD was launching X399 and talked about some of the troubles there and with AM4.
That said, the current situation has seemingly lit a fire under the board partners, with Intel out of commission and all these customers desperate for an alternative to their W680/raptor lake systems (which do support ecc officially, btw) in these performance-sensitive niches or power-limited datacenter layouts, they are finally cleaning up the mess like, within the last 3 weeks or so. They've very quickly gone from not caring about these boards to seeing a big market opportunity.
https://www.youtube.com/watch?v=n1tXJ8HZcj4
can't believe how many times I've explained in the last month that yes, people do actually run 13700Ks in the datacenter... with ECC... and actually it's probably some pretty big names in fact. A previous video dropped the tidbit that one of the major affected customers is Citadel Capital - and yeah, those are the guys who used to get special EVEREST and BLACK OPS skus from intel for the same thing. Client platform is better at that, the very best sapphire rapids or epyc -F or -X3D sku is going to be like 75% of the performance at best. It's also the fastest thing available for serving NVMe flash storage (and Intel specifically targeted this, the Xeon E-2400 series with the C266 chipset can talk NVMe SAS natively on its chipset with up to 4 slimsas ports...)
it's somewhere in this one I think: https://www.youtube.com/watch?v=5KHCLBqRrnY
The new EPYC processors for AM5, though, look like they'll be OK for ECC RAM, at least in the coming months onwards.
Actually some of the 13th and 14th gen Intel Core processors support ECC.
Intel has always randomly supported ECC on desktop CPUs. Sometimes it was just a few low-end SKUs, sometimes higher-end SKUs. On 14th gen it appears the i9s and i7s do; I didn't check the i5s, but the i3s did not.
You need W680 boards (starting at around 500 bucks) for ECC on desktop intel chips.
I was seeing them around $400 (still expensive).
ECC support wasn't good initially on AM5, but there are now Epyc branded chips for the AM5 socket which officially support ECC DDR5. They come in the same flavors as the Ryzen 7xx0 chips, but are branded as Epyc.
Unfortunately not. I can't say for current gen, but the 5000 series APUs like the 5600G do not support ECC. I know, I tried...
But yes, most Ryzen CPUs do have ECC functionality, and have had it since the 1000 series, even if not officially supported. Official support for ECC is only on Ryzen PRO parts.
There was a strange happening with AMD laptop CPUs (“APUs”): the non-soldered DDR5 variants of the 7x40’s were advertised to support ECC RAM on AMD’s website up until a couple months before any actual laptops were sold, then that was silently changed and ECC is only on the PRO models now. I still don’t know if this is a straightforward manufacturing or chipset issue of some kind or a sign of market segmentation to come.
(I’m quite salty I couldn’t get my Framework 13 with ECC RAM because of this.)
More E-cores is reasonable for multithreaded application performance. They're efficient in power and die area, as the name indicates, so Intel can fit more E-cores than P-cores into the same power/area budget. It's not suitable for workloads that need many high single-threaded-performance cores, like a VM server, but I don't know of any major consumer usage that requires that.
I can sort of see that. The way I saw it explained, they're much lower clocked and have a pretty small shared cache. I could see E cores being great for running background processes and stuff. All the benchmarks seem to show the AMDs with 2/3rds the cores being around the same performance, with similar power draw. I'm not putting them down. I'm just saying it seems gimmicky to say "look at our 20 core!" with the implicit idea that people will compare that with an AMD 12 core, see 20>12, and miss the other factors like cost and benchmarks.
It's the megahertz wars all over again!
Computers have taught us the rubric of truth: Numbers go Up.
Gaming.
There are some games that will benefit from greater single-core performance.
I was baffled by this too, but what they don't make clear is that the performance cores have hyperthreading; the efficiency cores do not.
So what they call 2P+4E actually becomes an 8 core system as far as something like /proc/cpuinfo is concerned. They're also the same architecture so code compiled for a particular architecture will run on either core set and can be moved from one to the other as the scheduler dictates.
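Counting logical CPUs the way the OS reports them (a trivial sketch of the arithmetic above):

```python
def logical_cpus(p_cores: int, e_cores: int) -> int:
    # P cores expose two hardware threads (hyperthreading); E cores expose one.
    return p_cores * 2 + e_cores

print(logical_cpus(2, 4))   # 8  -- what /proc/cpuinfo shows for a 2P+4E chip
print(logical_cpus(8, 12))  # 28 -- the thread count mentioned upthread for 8P+12E
```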
I don't know if that has done more good than harm, since they ripped AVX-512 out for multiple generations to ensure parity.
A major differentiator is that Intel CPUs with E cores don’t allow the use of AVX-512, but all current AMD CPUs do. The new Zen 5 chips will run circles around Intel for any such workload. Video encoding, 3D rendering, and AI come to mind. For developers: many database engines can use AVX-512 automatically.
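On Linux you can check for this by scanning the feature flags in /proc/cpuinfo; `avx512f` is the "foundation" flag the kernel reports when AVX-512 is usable. A small sketch (the helper names are made up; the flag names are real):

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Extract the feature-flag set from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()

def has_avx512(cpuinfo_text: str) -> bool:
    # avx512f is the AVX-512 foundation subset; present on Zen 4/5,
    # absent on current consumer Intel parts with E cores.
    return "avx512f" in cpu_flags(cpuinfo_text)

# On a real system: has_avx512(open("/proc/cpuinfo").read())
sample = "flags\t\t: fpu sse sse2 avx avx2 avx512f avx512dq"
print(has_avx512(sample))  # True
```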
Intel is claiming a 4% performance hit on the final patch: https://youtu.be/wkrOYfmXhIc?t=308
Maybe a stretch - but this reminds me of blood sugar regulation for people with type 1 diabetes.
Too low is dangerous because you lose rational thought, and the ability to maintain consciousness or self-recover. However, despite not having the immediate dangers of being low, having high blood sugar over time is the condition which causes long-term organ damage.