A few notes, because a few pieces of information here aren't 100% up to date:
As most people know, Intel have lost the plot and are behind AMD on everything CPU- and GPU-related (with the tiny exception of Intel's N100-series CPUs, which are perfect for low-power and even fanless applications). As such, their CPUs are mostly bought by organisations updating previous environments with like-for-like replacements for whatever reason (like vSphere's EVC, which lets you run newer processors from the same manufacturer as if they were an older model, enabling hot migration between CPU generations and thus minimal disruption on hardware refreshes). Pretty much everyone else is buying the better and cheaper (per unit of processing power) AMD CPUs.
Intel NICs are pretty good overall (the X710, which spent a year+ with broken drivers causing silent network failures and outright crashes while remaining on the hardware compatibility lists of VMware and other "enterprise" vendors, notwithstanding), and they're getting more popular in consumer devices too. An Intel CPU in a non-low-cost PC/laptop almost always means an Intel NIC (WiFi and/or Ethernet).
For many organisations buying servers, Supermicro* is probably their best option. They're cheaper, offer much more flexibility in terms of form factors, chassis, components and slot counts, and are mostly reliable. However, their support is less reliable than the theoretical Dell/HPE support, so they work best in redundant setups.
Also, the IPMI specification is being replaced by Redfish, which is much more complete, secure and normalised, and presents normal, usable APIs. Any mainline server from the last few years should have Redfish alongside IPMI.
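(For a sense of what "normal, usable APIs" means here, a minimal sketch of walking a BMC's Redfish tree over plain HTTPS follows; the address and credentials are placeholders, and verify=False is only there because most BMCs ship with self-signed certificates:)

    import requests

    BMC = "https://10.0.0.42"      # placeholder BMC address
    AUTH = ("admin", "password")   # placeholder credentials

    # /redfish/v1/ is the standard DMTF service root; everything below it is
    # discovered by following @odata.id links rather than hardcoding vendor paths.
    root = requests.get(f"{BMC}/redfish/v1/", auth=AUTH, verify=False).json()
    systems = requests.get(BMC + root["Systems"]["@odata.id"], auth=AUTH, verify=False).json()

    for member in systems["Members"]:
        system = requests.get(BMC + member["@odata.id"], auth=AUTH, verify=False).json()
        print(system["Id"], system.get("PowerState"), system.get("Status", {}).get("Health"))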
* There's recent research from Hindenburg, a short-selling research firm, that exposes some shady stuff at Supermicro - but their hardware is still top-notch and used by the hyperscalers. https://hindenburgresearch.com/smci/
Having had a play with both Supermicro and ASRock Rack boards for workstations (admittedly only H12s, so not the most recent, but from a cursory glance at newer boards nothing has changed much), the Supermicro boards feel like they were made in 2005, not 2024. It's frankly ridiculous.
* No support for ACPI sleep. In 2024. Seriously.
* No support for 4- and 3-pin fans. 3-pin fans run at 100% speed all the time.
* IPMI web interface straight out of 2010.
* NVMe placement prevents you from using heatsinks.
* Tons of opaque jumpers on the board, with no labelling.
The ASRock Rack equivalent board is amazing in comparison.
I don't want ACPI sleep on a server.
Why not? Assuming your servers are not all at or near full load at all times, you can put some to sleep as warm spares.
Asking myself the same question, but IME Linux support for sleep states is (was?) generally horrible due to the absolute lack of standards enforcement, and that's on laptops, where it's a priority. I've only had one laptop (my most recent one) work 100% with sleep states, and only after replacing the NVMe drive that froze roughly 33% of wake-ups due to some obscure bug (but worked fine otherwise).
Maybe I've been lucky, but I haven't had trouble with Linux and sleep for about 15 years. Dell and Framework.
They support Wake-on-LAN (most likely), and you could power them up from the IPMI REST API.
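Both are easy to script. A rough sketch, with placeholder MAC/address/credentials (the "1" system ID in the Redfish path varies by vendor):

    import socket
    import requests

    def wake_on_lan(mac: str, broadcast: str = "255.255.255.255") -> None:
        # Standard WoL magic packet: 6 x 0xFF followed by the MAC repeated 16 times,
        # sent as a UDP broadcast (port 9, the "discard" port, is the usual target).
        payload = bytes.fromhex("FF" * 6 + mac.replace(":", "") * 16)
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
            s.sendto(payload, (broadcast, 9))

    wake_on_lan("aa:bb:cc:dd:ee:01")   # placeholder MAC

    # Or via Redfish: the standard ComputerSystem.Reset action with ResetType "On".
    requests.post(
        "https://10.0.0.42/redfish/v1/Systems/1/Actions/ComputerSystem.Reset",
        json={"ResetType": "On"},
        auth=("admin", "password"),
        verify=False,
    )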
To be fair, these are meant for datacenter applications where it would be absolutely normal for them to be either fully on or off all of the time. You could make an argument for warm spare servers, but there are other ways to accomplish that than ACPI sleep, and I'd probably consider warm spares a relatively niche need, considering I'd rather leave them hot-but-outside-production for system monitoring purposes. It would be more annoying to wake a sleeping system and find it has a failing drive/RAM/NIC, a bad switchport config, etc.
To be fair, these are meant for datacenter applications where it would be absolutely normal to run your fans at full speed all of the time.
Does it have an HTML5 remote console and basic component diagnostics? If so, what else would you like? I've been a datacenter tech, a Linux sysadmin, and a DevOps engineer, depending on the company, for a decade now, and really only ever used IPMI for those two things, so I'm curious where the other use cases are. I've also used Dell PowerEdge and HPE servers, which have slightly nicer-looking UIs but perform more or less the exact same functions.
Case fans for rack-mount chassis are very powerful, and can also draw a significant amount of power when running full-bore (not to mention the mechanical wear).
I haven't used a server chassis in over a decade where the fans ran at full speed beyond a few brief seconds at startup. I'm not sure if they used a four-pin header or some other mechanism, but fan speed control is a normal and expected feature in server hardware.
IPMI specifically is meant to be a standardized remote management interface. It (mostly) works for basic things, but more advanced functionality is hit-or-miss, or absent entirely, leaving you at the mercy of proprietary tools. Redfish is supposed to be better, although I personally haven't used it.
Web interfaces can be hit-or-miss in terms of functionality and UX. Additionally, I've often found these web interfaces to be unstable -- anything from being very slow or not loading at all, to certain features of the UI not loading data or hanging the interface.
Word. I borrowed a circa 2005-2010 server and the fan roared on start-up. I would not want that running full-time, for reasons of noise, power and wear.
If you think the latest SuperMicro IPMI web interface is from 2010, you haven't seen their 2010 interfaces. Would you like to download the Java Web Start file to start your Remote Console? Use a Java applet to see just a screenshot of the VGA output? How about getting RAID controllers with a (PS/2) mouse-based GUI inside the Option ROM (looking at you, LSI), only to find out the damn RAID controller manual lied about having an HBA passthrough mode? It just creates a single RAID volume per disk, but still stores controller-specific metadata on the disks.
If ASRock Rack got the SSH serial console redirection latency down from ~1 second to the low milliseconds like the other vendors (SuperMicro, Dell, HP, etc.), it would actually be usable without taking Valium before logging in.
</rant>
ASRock IPMI, at least on the X570D4U with Ryzen 5000, is unusable. You can't turn on the firewall to allow only specific IPs; it blocks everything. They said they have a beta firmware (> 01.39.00) to solve this, but it doesn't. Had to purchase a Spyder to add to the machine.
Supermicro's BMC UI on the X11 does feel like 2004, but it is decently reliable. On the H13 series it is even more stable and doesn't feel outdated.
Every vendor has their quirks. I have a pair of ASRock Rack ROMED8-2T/BCM boards, but my M.2 NVMe boot drive is no longer detected if I update the BIOS past v3.50. Unfortunately, that means no support for resizable BAR in my configuration.
I have a pair of Supermicro H12SSL-NT boards to use for my next couple builds. I might be trading one set of issues for another, but I'm optimistic they'll work well for my purposes.
I've had a 40%+ RMA rate (8 out of 20) with ASRock boards (from PCIe drives dropping to weird gremlins), while I have replaced two out of over 300 Supermicro boards, all of them running 10+ years.
The IPMI interface sure is nicer though
I agree! I love my ASRock Rack board (X570si). I had written off ASRock as "budget" but it's been rock solid and had loads of features like you said.
Agreed. If nothing else, the IPMI web UI on Supermicro randomly deciding you need to re-login constantly definitely feels very 2004.
It's been this way for at least a decade, too. It's like it just forgets all its session variables sometimes.
The problem with Supermicro isn't the short report; the problem is that the Secure Boot keys were leaked, so a huge amount of their hardware is impossible to secure because the root of trust is broken.
https://arstechnica.com/security/2024/07/secure-boot-is-comp...
Do you secure boot your systems?
I do, but with my own PK. The problem is that if the box has run untrusted code prior to that, you can't trust it anymore.
This is a bigger problem when you have a huge fleet of these already rather than my few servers in my home lab that I can manually enroll my own keys in.
If that's your threat model you shouldn't trust your computers anyway. You don't know what things have been inserted into the chips on the motherboard. You don't know what code is in your operating system and applications. You don't know what code the controllers on your storage devices are running. You don't know if there is a cellular chip on your motherboard or a chip that is waiting for a certain frequency radio wave to leak all your shit. You don't know that all the code on your system is RCE free or LPE free. You don't know that there aren't any insiders at your software vendors signing bad code to send to you. You haven't personally audited the binaries on your system or even their source code.
Did the X710 actually become stable? I got burned by them pretty early in their life and have ended up staying away from them ever since.
Yep, as soon as the patched drivers were released (1.8 IIRC) it became rock solid and we never had any issues with it in the 3-4 years afterwards that I spent at the same place.
Ah yes, RedFish.
I automate keeping IPMI SSL certs up to date using Redfish APIs. Importing the new certs requires slightly different magic (cert naming, encoding, etc.) on every single vendor's implementation of Redfish. I have a collection of Python modules to deal with each vendor's eccentricities just around uploading and swapping SSL certs, when it should be a set of standard PUT requests that work everywhere (that is what the Redfish API docs would have you believe!).
So I'm not sure I would agree that it is either normalized or usable, as the whole point of something like Redfish is that I shouldn't have to do this. It is just as annoying as when I had to drive their web interfaces directly.
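To illustrate the shape of the problem: the spec-level call is the DMTF CertificateService.ReplaceCertificate action, and everything else in this sketch (the per-vendor handler stubs, the certificate URI) is a placeholder standing in for the vendor-specific tweaks, not any real vendor's API:

    import requests

    def replace_cert_standard(bmc, auth, cert_pem):
        # What the Redfish docs suggest should work everywhere: one standard action.
        requests.post(
            f"{bmc}/redfish/v1/CertificateService/Actions/CertificateService.ReplaceCertificate",
            json={
                "CertificateString": cert_pem,
                "CertificateType": "PEM",
                # Which certificate resource this should point at already differs
                # between implementations.
                "CertificateUri": {"@odata.id": "/redfish/v1/Managers/1/NetworkProtocol/HTTPS/Certificates/1"},
            },
            auth=auth, verify=False,
        ).raise_for_status()

    def replace_cert_vendor_a(bmc, auth, cert_pem):
        raise NotImplementedError("vendor-specific naming/encoding tweaks go here")

    def replace_cert_vendor_b(bmc, auth, cert_pem):
        raise NotImplementedError("...and different ones go here")

    # The dispatch table that shouldn't need to exist.
    HANDLERS = {"VendorA": replace_cert_vendor_a, "VendorB": replace_cert_vendor_b}

    def replace_cert(vendor, bmc, auth, cert_pem):
        HANDLERS.get(vendor, replace_cert_standard)(bmc, auth, cert_pem)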
Yep.
I had to write code to mount an ISO image on a server, set the server to boot from it once, and reboot the server to start the install.
I had to do it for 4 vendors. Dell, Lenovo, Fujitsu and Supermicro.
4 different vendors, 4 different Redfish APIs.
And what's annoying is that Redfish always looks similar, so it fools you into thinking all the vendors do things the same way, but no, suddenly some endpoint just doesn't exist in this vendor's implementation and you have to scour the docs for this vendor's unique magic URI.
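For reference, the "happy path" version of that flow using only stock Redfish resources looks roughly like the sketch below (address, credentials and image URL are placeholders); the parts that vary between vendors are exactly the annoying bits: whether VirtualMedia hangs off Managers or Systems, what the media member is called, and which ResetTypes are accepted:

    import requests

    BMC = "https://10.0.0.42"                  # placeholder BMC address
    AUTH = ("admin", "password")               # placeholder credentials
    ISO = "http://deploy.example/install.iso"  # placeholder image URL

    # 1. Mount the ISO as virtual media (the "Cd" member id is vendor dependent).
    requests.post(
        f"{BMC}/redfish/v1/Managers/1/VirtualMedia/Cd/Actions/VirtualMedia.InsertMedia",
        json={"Image": ISO, "Inserted": True},
        auth=AUTH, verify=False,
    ).raise_for_status()

    # 2. Request a one-shot boot from that media.
    requests.patch(
        f"{BMC}/redfish/v1/Systems/1",
        json={"Boot": {"BootSourceOverrideTarget": "Cd", "BootSourceOverrideEnabled": "Once"}},
        auth=AUTH, verify=False,
    ).raise_for_status()

    # 3. Reboot so the installer starts.
    requests.post(
        f"{BMC}/redfish/v1/Systems/1/Actions/ComputerSystem.Reset",
        json={"ResetType": "ForceRestart"},
        auth=AUTH, verify=False,
    ).raise_for_status()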
I really wish Supermicro PSUs were accessible over PMBus without their proprietary IPMI utilities[1], but they're not. Moreover, the utility is x86-only, so there's no way to interface with it from ppc64el. Had it been open source, it could trivially have been built.
[1] https://www.supermicro.com/en/solutions/management-software/...
Recent Intel CPUs don't look too bad on paper, but in practice they burn up.
It's not too big of a surprise for me, because lifespan will eventually become an increasing problem as feature size goes down (e.g. a mobile defect can cause more trouble more easily). They say it is a problem of the microcode asking for too much voltage from the motherboard, and that's true, but it is also true that chips are becoming more sensitive to environmental upsets. Ever since the industry went past 10nm I figured there was a chance I'd buy something that was affected, but I guess I dodged the bullet by going with AMD.
I remember loving Intel NICs for performance and Linux compatibility back in the day, before you just used the NIC that came on the motherboard. For that matter, I loved Intel SSDs, but you had to read between the lines to see that: (1) you got a little better performance in the P95-P99 range than cheaper competitors, and (2) when you're getting frustrated that your computer is slow, you're experiencing P95-P99. That's part of the reason why I both loved and hated AnandTech: they so often missed the plot.