
The real realtime preemption end game

tyingq
76 replies
1d7h

I wonder if this being fixed will result in it displacing some notable amount of made-for-realtime hardware/software combos. Especially since there are now lots of cheap, relatively low-power, high-clock-rate ARM and x86 chips to choose from. With clock rates so high, perfect real-time becomes less important, as you would often have many cycles to spare for misses.

I understand it's less elegant, efficient, etc. But sometimes commodity wins over correctness.

binary132
53 replies
1d7h

I get the sense that applications with true realtime requirements generally have hard enough requirements that they cannot allow even the remote possibility of failure. Think avionics, medical devices, automotive, military applications.

If you really need realtime, then you really need it and "close enough" doesn't really exist.

This is just my perception as an outsider though.

calvinmorrison
15 replies
1d7h

If you really need realtime, and you really actually need it, should you be using a system like Linux at all?

refulgentis
5 replies
1d6h

...yes, after realtime support lands

lumb63
4 replies
1d6h

A lot of realtime systems don’t have sufficient resources to run Linux. Their hardware is much less powerful than Linux requires.

Even if a system can run (RT-)Linux, it doesn’t mean it’s suitable for real-time. Hardware for real-time projects needs much lower interrupt latency than a lot of hardware provides. Preemption isn’t the only thing necessary to support real-time requirements.

skyde
1 replies
1d5h

What kind of hardware is considered to have "lower interrupt latency"? Is there some kind of Arduino board I could get that fits the lower interrupt latency required for real-time but still supports things like Bluetooth?

lumb63
0 replies
1d2h

Take a look at the Cortex R series. The Cortex M series still has lower interrupt latency than the A series, but lower processing power. I imagine for something like Bluetooth that an M is more than sufficient.

refulgentis
0 replies
1d4h

Sure but that was already mentioned before the comment I was replying to. Standard hardware not being great for realtime has nothing to do with hypothetical realtime Linux.

rcxdude
0 replies
1d3h

Realtime just means execution time is bounded; it doesn't necessarily mean the latency is low. In that sense, RT-Linux should probably be thought of mostly as low-latency Linux, and the improvement in realtime guarantees is mostly in reducing the number of things that can cause you to miss a deadline, as opposed to allowing you to guarantee any particular deadline, even a long one.

synergy20
2 replies
1d6h

No you don't, you use a true RTOS instead.

Linux as an RTOS gets to microsecond granularity but it still cannot 100% guarantee it; anything cache-related (L2 cache misses, TLB misses) is hard for hard real time.

A dual kernel with Xenomai could improve it, but somehow it is not widely used, only in industrial controls I think.

Linux RT is also great for audio, multimedia, etc., where real-time is crucial but not a MUST.

froh
1 replies
1d2h

anything cache-related (L2 cache misses, TLB misses) is hard for hard real time

Yup, that's why you'd pin the memory and the core for the critical task. Which, alas, will affect performance of the other cores and all other tasks. And whoosh, there goes the BOM...

Which again, as we're both probably familiar with, leads to SoC designs with a real-time microcontroller core and an HPC microprocessor on the same package. Which leads to the question of how to architect the combined system of a real-time microcontroller plus a powerful but only soft-real-time microprocessor such that the overall system remains sufficiently reliable...

oh joy and fun!
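
(For the curious, here's a minimal sketch of the "pin the memory and the core" part on Linux. The CPU number is made up, and it assumes that core was reserved for the task, e.g. via the isolcpus= boot parameter.)

    /* Minimal sketch: pin this process to one reserved CPU and lock its
     * memory so page faults can't add latency later. CPU 3 is arbitrary. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(3, &set);                        /* run only on CPU 3 */
        if (sched_setaffinity(0, sizeof set, &set) != 0)
            perror("sched_setaffinity");

        /* Lock current and future pages into RAM. */
        if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
            perror("mlockall");

        /* ... the critical task's loop goes here ... */
        return 0;
    }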

synergy20
0 replies
7h7m

That's indeed the trend, i.e. put a small RTOS core alongside a normal CPU for the non-real-time tasks. In the past this was done with two boards, one an MCU and the other a typical CPU; now it's on one package. Very important where an RTOS is a must, e.g. robotics.

How the CPU and MCU communicate is a good question to tackle; typically chip vendors provide some solutions. I think OpenAMP is for this.

tyingq
1 replies
1d6h

I'm guessing it's not that technical experts will be choosing this path, but rather companies. Once it's "good enough", and much easier to hire for, etc...you hire non-experts because it works most of the time. I'm not saying it's good, just that it's a somewhat likely outcome. And not for everything, just the places where they can get away with it.

froh
0 replies
1d2h

Nah. When functional safety enters the room (as it does for hard real time), engineers go to jail if they sign off on something unsafe and people die because of it. Since the Challenger disaster there is an awareness that not listening to engineers can be expensive and cost lives.

snvzz
0 replies
9h56m

This is why the distinction between soft and hard realtime exists.

Linux-rt makes Linux actually decent at soft realtime. PREEMPT_RT usually results in measured peak latency for realtime tasks (SCHED_RR/SCHED_FIFO) on the order of a few hundred microseconds.

Standard Linux lets latency go to tens of milliseconds, easily verifiable by running cyclictest from rt-tests for a few hours while using the computer. Needless to say, this is unacceptable for many use cases, including pro audio, videoconferencing and even gaming.
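
(To make that concrete, here's a rough sketch of the kind of measurement cyclictest does: wake up on a fixed period under SCHED_FIFO and record how late the wakeups are. Period and priority are arbitrary; it needs root or rtprio rlimits, and mainline vs PREEMPT_RT will show very different maxima.)

    /* Rough sketch of a cyclictest-style measurement: SCHED_FIFO thread,
     * 1 ms period, track the worst observed wakeup latency. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <time.h>

    #define NSEC_PER_SEC 1000000000L

    static void ts_add(struct timespec *t, long ns)
    {
        t->tv_nsec += ns;
        while (t->tv_nsec >= NSEC_PER_SEC) { t->tv_nsec -= NSEC_PER_SEC; t->tv_sec++; }
    }

    int main(void)
    {
        struct sched_param sp = { .sched_priority = 80 };
        if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
            perror("sched_setscheduler");
        mlockall(MCL_CURRENT | MCL_FUTURE);      /* avoid page-fault latency */

        struct timespec next, now;
        long max_lat = 0;
        clock_gettime(CLOCK_MONOTONIC, &next);
        for (int i = 0; i < 100000; i++) {
            ts_add(&next, 1000000);              /* next wakeup: +1 ms */
            clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
            clock_gettime(CLOCK_MONOTONIC, &now);
            long lat = (now.tv_sec - next.tv_sec) * NSEC_PER_SEC
                     + (now.tv_nsec - next.tv_nsec);
            if (lat > max_lat) max_lat = lat;
        }
        printf("max wakeup latency: %ld us\n", max_lat / 1000);
        return 0;
    }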

In contrast, AmigaOS's exec.library had no trouble yielding solid sub-millisecond behaviour in 1985, on a relatively slow 7MHz 68000.

No amount of patching Linux can give you hard realtime, as it is about hard guarantees, backed up by proofs built from formal verification, which Linux is excluded from due to its sheer size.

There are a few RTOSes that are formally verified, but I only know of one that provides process isolation via the usual supervisor-vs-user CPU mode virtualization model: seL4.

snickerbockers
0 replies
1d5h

Pretty sure most people who think they need a real-time thread actually don't tbh.

rcxdude
0 replies
1d3h

really depends on your paranoia level and the consequences for failure. soft to hard realtime is a bit of a spectrum in terms of how hard of a failure missing a deadline actually is, and therefore how much you try to verify that you will actually meet that deadline.

eschneider
0 replies
23h38m

The beauty of multicore/multi-CPU systems is that you can dedicate cores to running realtime OSes and leave the non-hard-realtime stuff to an embedded Linux on its own core.

dripton
11 replies
1d6h

You can divide realtime applications into safety-critical and non-safety-critical ones. For safety-critical apps, you're totally right. For non-critical apps, if it's late and therefore buggy once in a while, that sucks but nobody dies.

Examples of the latter include audio and video playback and video games. Nobody wants pauses or glitches, but if you get one once in a while, nobody dies. So people deliver these on non-RT operating systems for cost reasons.

binary132
6 replies
1d6h

This kind of makes the same point I made though -- apps without hard realtime requirements aren't "really realtime" applications

tremon
2 replies
1d5h

No -- soft realtime applications are things like video conferencing, where you care mostly about low latency in the audio/video stream but it's ok to drop the occasional frame. These are still realtime requirements, different from what your typical browser does (for example): who cares if a webpage is rendered in 100ms or 2s? Hard realtime is more like professional audio/video recording where you want hard guarantees that each captured frame is stored and processed within the allotted time.

atq2119
1 replies
21h12m

who cares if a webpage is rendered in 100ms or 2s?

Do you really stand by the statement of this rhetorical question? Because if yes: this attitude is a big reason why web apps are so unpleasant to work with compared to locally running applications. Depending on the application, even 16ms vs 32ms can make a big difference.

tremon
0 replies
11h43m

Yes I do, because I don't think the attitude is the reason, the choice of technology is the reason. If you want to control for UI latency, you don't use a generic kitchen-sink layout engine, you write a custom interface. You can't eat your cake and have it too, even though most web developers want to disagree.

duped
1 replies
1d6h

The traditional language is "hard" vs "soft" realtime

binary132
0 replies
1d3h

RTOS means hard realtime.

pluto_modadic
0 replies
1d6h

I sense that people will insist on their requirements being hard unnecessarily... and that bugs will get blamed on running on a near-realtime system, rather than on code that would be faulty even on a realtime one.

jancsika
2 replies
1d2h

Your division puts audio performance applications in a grey area.

On the one hand they aren't safety critical.

On the other, I can imagine someone getting chewed out or even fired for a pause or a glitch in a professional performance.

Probably the same with live commercial video compositing.

eschneider
1 replies
23h41m

Audio is definitely hard realtime. The slightest delays are VERY noticeable.

jancsika
0 replies
20h53m

I mean, it should be.

But there are plenty of performers who apparently rely on Linux boxes and gumption.

lll-o-lll
0 replies
1d2h

You can divide realtime applications into safety-critical and non-safety-critical ones.

No. This is a common misconception. The distinction between a hard realtime system and a soft realtime system is simply whether missing a timing deadline leads to a) failure of the system or b) degradation of the system (but the system continues to operate). Safety is not part of it.

Interacting with the real physical world often imposes “hard realtime” constraints (think signal processing). Whether this has safety implications simply depends on the application.

cptaj
10 replies
1d6h

Unless it's just music

duped
5 replies
1d6h

Unless that music is being played through a multi kW amplifier into a stadium and an xrun causes damage to the drivers and/or audience (although, they should have hearing protection anyway).

beiller
3 replies
1d6h

Your talk of xrun is giving me anxiety. When I was younger I dreamed of having a linux audio effects stack with cheap hardware on stage and xruns brought my dreams crashing down.

robocat
2 replies
1d1h

xrun definition: https://unix.stackexchange.com/questions/199498/what-are-xru...

(I didn't know the term, trying to be helpful if others don't)

tinix
1 replies
20h12m

just a buffer under/overrun

snvzz
0 replies
6h36m

it's cross-run aka xrun because these buffers are circular.

Depending on implementation, it will either pause or play the old sample where the new one isn't yet but should be.

spacechild1
0 replies
13h28m

and an xrun causes damage to the drivers and/or audience

An xrun typically manifests itself as a (very short) discontinuity or gap in the audio signal. It might sound unpleasant, but there's nothing dangerous about it.

itishappy
2 replies
1d6h

It may not be safety critical, but remember that people can and will purchase $14k power chords to (ostensibly) improve the experience of listening to "just music".

https://www.audioadvice.com/audioquest-nrg-dragon-high-curre...

cwillu
1 replies
1d3h

FWIW, a power chord is a _very_ different thing than a power cord.

itishappy
0 replies
1d3h

LOL, what a typo! Good catch!

binary132
0 replies
1d6h

what if your analog sampler ruins the only good take you can get? What if it's recording a historically important speech? Starting to get philosophical here...

abe_m
6 replies
1d6h

Having worked on a number of "real time" machine control applications:

1) There is always a possibility that something fails to run by its due date. Planes crash sometimes. Cars won't start sometimes. Factory machinery makes scrap parts sometimes. In a great many applications, missing a real time deadline results in degraded quality, not loss of life or regional catastrophe. The care that must be taken to lower the probability of failure needs to be in proportion to the consequence of the failure. Airplanes have redundant systems to reduce (but not eliminate) the possibility of failure, while cars and trucks generally don't.

2) Even in properly working real time systems, there is a tolerance window on execution time. As machines change modes of operation, the amount of calculation effort to complete a cycle changes. If the machine is in a warm up phase, it may be doing minimal calculations, and the scan cycle is fast. Later it may be doing a quality control function that needs to do calculations on inputs from numerous sensors, and the scan cycle slows down. So long as the scan cycle doesn't exceed the limit for the process, the variation doesn't cause problems.
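
(A sketch of that scan-cycle idea, with made-up numbers: the work per cycle varies with machine mode, and the only thing checked is that each scan fits inside its fixed budget.)

    /* Sketch of a fixed scan cycle with a tolerance window: 10 ms cycles,
     * varying work, a warning if a scan overruns its budget. */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <time.h>

    #define NSEC_PER_SEC 1000000000L
    #define CYCLE_NS     10000000L              /* 10 ms scan cycle */

    static long elapsed_ns(struct timespec a, struct timespec b)
    {
        return (b.tv_sec - a.tv_sec) * NSEC_PER_SEC + (b.tv_nsec - a.tv_nsec);
    }

    static void do_scan(int mode) { (void)mode; /* read inputs, compute, write outputs */ }

    int main(void)
    {
        struct timespec start, end, next;
        clock_gettime(CLOCK_MONOTONIC, &next);
        for (int mode = 0;; mode++) {
            clock_gettime(CLOCK_MONOTONIC, &start);
            do_scan(mode);                       /* work varies with machine mode */
            clock_gettime(CLOCK_MONOTONIC, &end);

            if (elapsed_ns(start, end) > CYCLE_NS)
                fprintf(stderr, "scan overran its %ld ns budget\n", CYCLE_NS);

            /* Sleep until the start of the next cycle (absolute time, so
             * jitter doesn't accumulate). */
            next.tv_nsec += CYCLE_NS;
            while (next.tv_nsec >= NSEC_PER_SEC) { next.tv_nsec -= NSEC_PER_SEC; next.tv_sec++; }
            clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        }
    }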

mlsu
2 replies
1d6h

That is true, but generally not acceptable to a regulating body for these critical applications. You would need to design and implement a validation test to prove timing in your system.

Much easier to just use an RTOS and save the expensive testing.

vlovich123
1 replies
1d2h

But you still need to implement the validation test to prove that the RTOS has these requirements…

mlsu
0 replies
19m

You do not, if you use an RTOS that is already certified by the vendor. This saves not only a lot of time and effort for verification and validation, but also a lot of risk, since validation is unpredictable and extremely expensive.

Therefore it'd be remarkable not to see a certified RTOS in such industries and applications where that validation is required, like aerospace or medical.

blt
2 replies
1d2h

How is your point 2) a response to any of the earlier points? Hard realtime systems don't care about variation, only the worst case. If your code does a single multiply-add most of the time but calls `log` every now and then, the hard realtime requirement is perfectly satisfied if the bound on the worst-case runtime of `log` is small enough.

abe_m
1 replies
1d1h

I suppose it isn't, but I bristle when I see someone tossing around statements like "close enough doesn't really exist". In my experience when statements like that start up, there are people involved that don't understand variation is a part of every real process. My point is that if you're going to get into safety critical systems, there is always going to be some amount of variation, and there is always a "close enough", as there is never an "exact" in real systems.

jancsika
0 replies
20h58m

The point is to care about the worst case within that variation.

Most software cares about the average case, or, in the case of the Windows 10/11 start menu animation, the average across all supported machines apparently going 20 years into the future.

ajross
2 replies
1d5h

Think avionics, medical devices, automotive, military applications.

FWIW by-device/by-transistor-count, the bulk of "hard realtime systems" with millisecond-scale latency requirements are just audio.

The sexy stuff is all real applications too. But mostly we need this just so we don't hear pops and echoes in our video calls.

binary132
1 replies
1d3h

Nobody thinks Teams is a realtime application

ajross
0 replies
1d1h

No[1], but the people writing the audio drivers and DSP firmware absolutely do. Kernel preemption isn't a feature for top-level apps.

[1] Actually even that's wrong: for sure there are teams of people within MS (and Apple, and anyone else in this space) measuring latency behavior at the top-level app layer and doing tuning all the way through the stack. App latency excursions can impact streams too, though ideally you have some layer of insulation there.

moffkalast
1 replies
1d5h

I feel like at this point we have enough cores (or will soon, anyway) that you could dedicate one entirely to one process and have it run realtime.

KWxIUElW8Xt0tD9
0 replies
1d5h

That's one way to run DPDK processes under Linux -- you get the whole processor for doing whatever network processing you want to do -- no interruptions from anything.

wongarsu
0 replies
1d6h

There is some amount of realtime in factory control where infrequent misses will just increase your reject rate in QA.

lmm
0 replies
1d1h

Like many binary distinctions, when you zoom in on the details hard-versus-soft realtime is really more of a spectrum. There's "people will die if it's late". "The line will have to stop for a day if it's late". "If it's late, it'll wreck the part currently being made". Etc.

Even hard-realtime systems have a failure rate, in practice if not in theory - even a formally verified system might encounter a hardware bug. So it's always a case of tradeoffs between failure rate and other factors (like cost). If commodity operating systems can push their failure rate down a few orders of magnitude, that moves the needle, at least for some applications.

tuetuopay
17 replies
1d7h

The thing is, stuff that requires hard realtime cannot be satisfied with "many cycles to spare for misses". And CPU cycles are not the whole story. A badly made task could lock down the kernel without doing anything useful. The point of hard realtime is "nothing can prevent this critical task from running".

For automotive and aerospace, you really want the control systems to be able to run no matter what.

tyingq
10 replies
1d7h

Yes, there are parts of the space that can't be displaced with this.

I'm unclear on why you put "many cycles to spare for misses" in quotes, as if it's unimportant. If a linux/arm (or x86) solution is displacing a much lower speed "real real time" solution, that's the situation...the extra cycles mean you can tolerate some misses while still being as granular as what you're replacing. Not for every use case, but for many.

tuetuopay
4 replies
1d6h

Cycles per second won't save you from two tasks deadlocking; this is what hard realtime systems are about. However, I do agree that not all systems have real hard realtime requirements. But those usually can handle a non-RT kernel.

As for the quotes, it was a direct citation, not a way to dismiss what you said.

tremon
3 replies
1d6h

I don't think realtime anything has much to do with mutex deadlocks, those are pretty much orthogonal concepts. In fact, I would make a stronger claim: if your "realtime" system can deadlock, it's either not really realtime or it has a design flaw and should be sent back to the drawing board. It's not like you can say "oh, we have a realtime kernel now, so deadlocks are the kernel's problem".

Actual realtime systems are about workload scheduling that takes into account processing deadlines. Hard realtime systems can make guarantees about processing latencies, and can preemptively kill or skip tasks if the result would arrive too late. But this is not something that the Linux kernel can provide, because it is a system property rather than about just the kernel: you can't provide any hard guarantees if you have no time bounds for your data processing workload. So any discussion about -rt in the context of the Linux kernel will always be about soft realtime only.
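
(Tangentially: mainline Linux does ship a deadline-aware scheduling class, SCHED_DEADLINE, which admission-tests a declared runtime/deadline/period reservation and throttles tasks that overrun their budget. It doesn't make the system hard realtime, but it is more than priority scheduling. A minimal sketch, using the raw syscall with the struct layout described in sched_setattr(2) and arbitrary numbers; needs root or CAP_SYS_NICE.)

    /* Sketch: put the calling thread under SCHED_DEADLINE, declaring
     * "up to 2 ms of CPU every 10 ms, finished within 10 ms". */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    #ifndef SCHED_DEADLINE
    #define SCHED_DEADLINE 6                  /* from the kernel ABI */
    #endif

    struct dl_attr {                          /* mirrors sched_setattr(2)'s sched_attr */
        uint32_t size;
        uint32_t sched_policy;
        uint64_t sched_flags;
        int32_t  sched_nice;
        uint32_t sched_priority;
        uint64_t sched_runtime;               /* CPU budget per period, ns */
        uint64_t sched_deadline;              /* relative deadline, ns */
        uint64_t sched_period;                /* period, ns */
    };

    int main(void)
    {
        struct dl_attr attr;
        memset(&attr, 0, sizeof attr);
        attr.size           = sizeof attr;
        attr.sched_policy   = SCHED_DEADLINE;
        attr.sched_runtime  =  2 * 1000 * 1000;   /*  2 ms */
        attr.sched_deadline = 10 * 1000 * 1000;   /* 10 ms */
        attr.sched_period   = 10 * 1000 * 1000;   /* 10 ms */

        if (syscall(SYS_sched_setattr, 0, &attr, 0) != 0) {
            perror("sched_setattr");
            return 1;
        }
        /* ... periodic work goes here; sched_yield() ends the current
         * instance and suspends the task until its next period ... */
        sched_yield();
        return 0;
    }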

hamilyon2
1 replies
22h58m

I had an introductory course on operating systems and learned about hard real-time systems. I had the impression hard real-time is about memory, deadlocks, livelocks, starvation, and so on, and in general about how to design a system that keeps moving forward even in the presence of serious bugs and unplanned-for circumstances.

syntheweave
0 replies
21h16m

Bugs related to concurrency - which is where you get race conditions and deadlocks - tend to pop up wherever there's an implied sequence of dependencies to complete the computation, and the sequence is determined dynamically by an algorithm.

For example, if I have a video game where there's collision against the walls, I can understand this as potentially colliding against "multiple things simultaneously", since I'm likely to describe the scene as a composite of bounding boxes, polygons, etc.

But to get an answer for what to do in response when I contact a wall, I have to come up with an algorithm that tests all the relevant shapes or volumes.

The concurrency bug that appears when doing this in a naive way is that I test one, give an answer to that, then modify the answer when testing the others. That can lead to losing information and "popping through" a wall. And the direction in which I pop through depends on which one is tested first.

The conventional gamedev solution to that is to define down the solution set so that it no longer matters which order I test the walls in: with axis aligned boxes, I can say "move only the X axis first, then move only the Y axis". Now there is a fixed order, and a built-in bias to favor one or the other axis. But this is enough for the gameplay of your average platforming game.
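
(A tiny sketch of that axis-separated approach, with hypothetical types and a very naive push-out response, just to show how fixing the test order removes the ambiguity.)

    /* Axis-separated AABB movement: resolve X fully, then Y, so the
     * order walls are tested in no longer changes the outcome. */
    #include <stdio.h>

    typedef struct { float x, y, w, h; } Box;

    static int overlaps(Box a, Box b) {
        return a.x < b.x + b.w && b.x < a.x + a.w &&
               a.y < b.y + b.h && b.y < a.y + a.h;
    }

    /* Move `p` by (dx, dy) against `walls`, one axis at a time. */
    static void move(Box *p, float dx, float dy, const Box *walls, int n) {
        p->x += dx;
        for (int i = 0; i < n; i++)
            if (overlaps(*p, walls[i]))          /* push back out on X */
                p->x = dx > 0 ? walls[i].x - p->w : walls[i].x + walls[i].w;
        p->y += dy;
        for (int i = 0; i < n; i++)
            if (overlaps(*p, walls[i]))          /* then resolve Y */
                p->y = dy > 0 ? walls[i].y - p->h : walls[i].y + walls[i].h;
    }

    int main(void) {
        Box player = { 0, 0, 1, 1 };
        Box walls[] = { { 2, 0, 1, 4 } };
        move(&player, 1.5f, 0.5f, walls, 1);     /* ends flush against the wall, slides in Y */
        printf("player at (%.1f, %.1f)\n", player.x, player.y);
        return 0;
    }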

The generalization on that is to describe it as a constraint optimization problem: there are some number of potential solutions, and they can be ranked relative to the "unimpeded movement" heuristic, which is usually desirable when clipping around walls. That solution set is then filtered down through the collision tests, and the top ranked one becomes the answer for that timestep.

Problems of this nature come up with resource allocation, scheduling, etc. Some kind of coordinating mechanism is needed, and OS kernels tend to shoulder a lot of the burden for this.

It's different from real-time in that real-time is a specification of what kind of performance constraint you are solving for, vs allowing any kind of performance outcome that returns acceptable concurrent answers.

tuetuopay
0 replies
1d6h

Much agreed. I used deadlocks as an extreme example that's easy to reason about and straight to the point of "something independent of CPU cycles". Something more realistic would be I/O operations taking more time than expected; you would not want that to block execution of hard RT tasks.

In the case of the kernel, it is indeed too large to be considered hard realtime. Best case, we can make it into a firmer realtime than it currently is. But I would place it nowhere near avionics flight computers (like fly-by-wire systems).

nine_k
1 replies
1d3h

How much more expensive and power-hungry would an ARM core be, if it displaces a lower-specced core?

I bet there are hard-realtime (commercial) OSes running on ARM, and the ability to use a lower-specced (cheaper, simpler, lower-power) core may be seen as enough of an advantage to pay for the OS license.

lmm
0 replies
1d1h

How much more expensive and power-hungry would an ARM core be, if it displaces a lower-specced core?

The power issue is real, but it might well be the same price or cheaper - a standard ARM that gets stamped out by the million can cost less than a "simpler" microcontroller with a smaller production run.

bee_rider
1 replies
1d6h

It is sort of funny that language has changed to the point where quotes are assumed to be dismissive or sarcastic.

Maybe they used the quotes because they were quoting you, haha.

tuetuopay
0 replies
1d6h

it's precisely why I quoted the text, to quote :)

archgoon
0 replies
1d6h

I'm pretty sure they were just putting it in quotes because it was the expression you used, and they thus were referencing it.

zmgsabst
5 replies
1d6h

What’s an example of a system that requires hard real time and couldn’t cope with soft real time on a 3GHz system having 1000 cycle misses costing 0.3us?

LeifCarrotson
2 replies
1d5h

We've successfully used a Delta Tau real-time Linux motion controller to run a 24 kHz laser galvo system. It's ostensibly good for 25 microsecond loop rates, and pretty intolerant of jitter (you could delay a measurement by a full loop period if you're early). And the processor is a fixed frequency Arm industrial deal that only runs at 1.2 GHz.

Perhaps even that's not an example of such a system, 0.3 microseconds is close to the allowable real-time budget, and QC would probably not scrap a $20k part if you were off by that much once.

But in practice, every time I've heard "soft real time" suggested, the failure mode is not a sub-microsecond miss but a 100 millisecond plus deadlock, where a hardware watchdog would be needed to drop the whole system offline and probably crash the tool (hopefully fusing at the tool instead of destroying spindle bearings, axis ball screws, or motors and gearboxes) and scrap the part.

zmgsabst
1 replies
16h21m

Thanks for the detailed reply!

I'm trying to understand where a $50 rPi + small FPGA hybrid board fails at the task… and it sounds like the OS/firmware doesn't suffice. (Or an SoC, like a Zynq.)

E.g., if we could guarantee that the 1.5 GHz core won't "be off" by more than 1 µs in responding, and the FPGA can manage IO directly to buffer out (some of) the jitter, then the cost of many hobby systems with "(still not quite) hard" real-time requirements would come down to something reasonable.

rmu09
0 replies
14h35m

You can get pretty far nowadays with preempt rt and an FPGA. Maybe you even can get near 1µs max jitter. One problem with the older RPis was unpredictable (to me) behaviour of the hardware, i.e. "randomly" changing SPI clocks, and limited bandwidth.

Hobby systems like a small CNC mill or lathe usually don't need anything near 1µs (or better) max jitter. LinuxCNC (derived from NIST's Enhanced Machine Controller, name changed due to legal threats) runs fine on preempt-rt with control loops around 1kHz, with some systems you can also run a "fast" thread with say 20kHz and more to generate stepper motor signals, but that job is best left for the FPGA or an additional µC IMHO.

lelanthran
0 replies
1d6h

What’s an example of a system that requires hard real time and couldn’t cope with soft real time on a 3GHz system having 1000 cycle misses costing 0.3us?

Any system that deadlocks.

krylon
0 replies
23h16m

I suspect a fair amount of hard real time applications are not running on 3GHz CPUs. A 100MHz CPU (or lower) without an MMU or FPU is probably more representative.

But it's not really so much about being fast, it's about being able to guarantee that your system can respond to an event within a given amount of time every time. (At least that is how a friend who works in embedded/real time explained it to me.)

imtringued
0 replies
1d7h

Sure, but this won't magically remove the need for dedicated cores. What will probably happen is that people will tell the scheduler to exclusively put non-preemptible real-time tasks on one of the LITTLE cores.

foobarian
0 replies
1d7h

Ethernet nods in vigorous agreement

PaulDavisThe1st
0 replies
20h16m

CPU speed and clock rate have absolutely nothing to do with realtime anything.

JohnFen
0 replies
1d6h

When I'm doing realtime applications using cheap, low-power, high-clockrate ARM chips (I don't consider x86 chips for those sorts of applications), I'm not using an operating system at all. An OS interferes too much, even an RTOS. I don't see how this changes anything.

But it all depends on what your application is. There are a lot of applications that are "almost real-time" in need. For those, this might be useful.

loeg
58 replies
1d7h

Synchronous logging strikes again! We ran into this some at work with GLOG (Google's logging library), which can, e.g., block on disk IO if stdout is a file or whatever. GLOG was like, 90-99% of culprits when our service stalled for over 100ms.

cduzz
43 replies
1d7h

I have discussions with cow-orkers around logging;

"We have Best-Effort and Guaranteed-Delivery APIs"

"I want Guaranteed Delivery!!!"

"If the GD logging interface is offline or slow, you'll take downtime; is that okay?"

"NO NO Must not take downtime!"

"If you need it logged, and can't log it, what do you do?"

These days I just point to the CAP theorem and suggest that logging is the same as any other distributed system. Because there's a wikipedia article with a triangle and the word "theorem" people seem to accept that.

[edit: added "GD" to clarify that I was referring to the guaranteed delivery logging api, not the best effort logging API]

msm_
25 replies
1d6h

Interesting, I'd think logging is one of the clearest situations where you want best effort. Logging is, almost by definition, not the "core" of your application, so failure to log properly should not prevent the core of the program from working. Killing the whole program because the logging server is down is clearly throwing the baby out with the bathwater.

What people probably mean is "logging is important, let's avoid losing log messages if possible", which is what "best" in "best effort" stands for. For example it's often a good idea to have a local log queue, to avoid data loss in case of a temporary log server downtime.

fnordpiglet
11 replies
1d5h

It depends.

In some systems the logs are journaled records for the business or are discoverable artifacts for compliance. In highly secure environments logs are not only durable, but measures are taken to fingerprint them and their ordering (like ratchet hashing) to ensure integrity is invariant.

I would note that using disk-based logging is generally harmful in these situations IMO. Network-based logging is less likely to cause blocking at some OS level or other sorts of jitter that are harder to mask. Typically I develop logging as an in-memory thing that offloads to a remote service over the network. The durability of the memory store can be an issue in highly sensitive workloads, and you'll want to do synchronous disk IO in that case to ensure durability and consistent time budgets, but for almost all applications diskless logging is preferable.

shawnz
8 replies
1d5h

If you're not waiting for the remote log server to write the messages to its disk before proceeding, then it seems like that's not guaranteed to me? And if you are, then you suffer all the problems of local disk logging but also all the extra failure modes introduced by the network, too

lmm
6 replies
1d1h

If you're not waiting for the remote log server to write the messages to its disk before proceeding, then it seems like that's not guaranteed to me?

Depends on your failure model. I'd consider e.g. "received in memory by at least 3/5 remote servers in separate datacenters" to be safer than "committed to local disk".

cduzz
5 replies
22h37m

You're still on one side or another of the CAP triangle.

In a network partition, you are either offline or your data is not consistent.

If you're writing local to your system, you're losing data if there's a single device failure.

https://en.wikipedia.org/wiki/CAP_theorem

lmm
3 replies
19h56m

CAP is irrelevant, consistency does not matter for logs.

cduzz
2 replies
9h10m

Consistency is a synonym for "guaranteed", and means "written to 2 remote, reliable, append-only storage endpoints" (for any reasonable definition of reliability)

So -- a single system collecting a log event -- it is not reliable (guaranteed) if written just to some device on that system. Instances can be de-provisioned (and logs lost), filesystems or databases can be scrambled, bad guys can encrypt your data, etc.

In this context, a "network partition" prevents consistency (data not written to reliable media) or prevents availability (won't accept new requests until their activity can be logged reliably).

If you define "reliably" differently, you may have a different interpretation of log consistency.

lmm
0 replies
17m

Consistency is a synonym for "guaranteed", and means "written to 2 remote, reliable, append-only storage endpoints" (for any reasonable definition of reliability)

No it doesn't. Read your own wiki link.

In this context, a "network partition" prevents consistency (data not written to reliable media) or prevents availability (won't accept new requests until their activity can be logged reliably).

A network partition doesn't matter for a log system because there is no way to have consistency issues with logs. Even a single partitioned-off instance can accept writes without causing any problem.

Of course if you cannot connect to any instance of your log service then you cannot write logs. But that's got nothing to do with the CAP theorem.

fnordpiglet
0 replies
2h8m

I'm not sure I understand the way you're using the vocabulary. Consistency is a read-operation concept, not a write one. There are no online reads for logs.

Availability is achieved if at least one writer acknowledges a write. Consistency problems under a partition mean multiple parts of the system disagree about the write contents because of the partition in the network. But because logs are immutable and write-only, this doesn't happen in any situation. The only situation where it might occur is if you're maintaining a distributed ratchet with in-delivery-order semantics rather than eventually consistent temporal semantics, in which case you will never have CAP anyway. But that's an insanely rare edge case.

Note CAP doesn't ensure perfect durability. I feel like you're confusing consistency with durability. Consistency means that after I've durably written something, all nodes agree on read that it's been written. Since logs don't support reads on the online data plane, this is trivially not an issue. Any write acknowledgment is sufficient.

fnordpiglet
0 replies
22h32m

For logs, which are immutable time series journals, any copy is entirely sufficient. The first write is a quorum. Also from a systems POV reads are not a feature of logs.

fnordpiglet
0 replies
1d3h

The difference is that network IO can be more easily masked by the operating system than block-device IO. When you offload your logging to another thread, the story isn't over, because your disk logging can still interfere at a system level. Network IO isn't as noisy. If durability is important you might still need to wait for an ACK before freeing the buffer for the message, which might lead to more overall memory use, but all the operations play nicely in a preemptible scheduling system.

Also, the failure modes of systems are very tied to the durable storage devices attached to the system and very rarely to network devices. By reducing the number of things that need a disk (ideally to zero) you can remove disks from the system and its availability story. Once you get to fully diskless systems, the system failure modes are actually almost nothing. But even with disks attached, reducing the times you interact with the disk (especially for chatty things like logs!) reduces the likelihood that the entire system fails due to a disk issue.

ReactiveJelly
1 replies
1d5h

If it's a journaled record for the business then I think I'd write it to SQLite or something with good transactions and not mix it in the debug logs

fnordpiglet
0 replies
1d3h

There are more logs than debug logs, and using SQLite as the encoding store for your logs doesn’t make it not logging.

insanitybit
7 replies
1d6h

If you lose logs when your service crashes you're losing logs at the time they are most important.

tux1968
3 replies
1d5h

That's unavoidable if the logging service is down when your server crashes.

Having a local queue doesn't mean logging to the service is delayed, it can be sent immediately. All the local queue does is give you some resiliency, by being able to retry if the first logging attempt fails.

insanitybit
2 replies
1d3h

If your logging service is down all bets are off. But by buffering logs you're now accepting that problems not related to the logging service will also cause you to drop logs - as I mentioned, your service crashing, or being OOM'd, would be one example.

tux1968
1 replies
1d

What's more likely? An intermittent network issue, the logging service being momentarily down, or a local crash that only affects your buffering queue?

If an OOM happens, all bets are off anyway, since it has as much likelihood of taking out your application as it does your buffering code. The local buffering code might very well be part of the application in the first place, so the fate of the buffering code is the same as the application anyway.

It seems you're trying very hard to contrive a situation where doing nothing is better than taking reasonable steps to counter occasional network hiccups.

insanitybit
0 replies
21h59m

It seems you're trying very hard to contrive a situation where doing nothing is better than taking reasonable steps to counter occasional network hiccups.

I think you've completely misunderstood me then. I haven't taken a stance at all on what should be done. I'm only trying to agree with the grandparent poster about logging ultimately reflecting CAP Theorem.

tremon
1 replies
1d5h

But if your service has downtime because the logs could not be written, that seems strictly inferior. As someone else wrote upthread, you only want guaranteed delivery for logs if they're required under a strict audit regime and the cost of noncompliance is higher than the cost of a service outage.

insanitybit
0 replies
1d2h

FWIW I agree, I'm just trying to be clear that you are choosing one or the other, as the grandparent was stating.

andreasmetsala
0 replies
1d5h

No, you’re losing client logs when your logging service crashes. Your logging service should probably not be logging through calls to itself.

wolverine876
2 replies
1d5h

Logging can be essential to security (to auditing). It's your record of what happened. If an attacker can cause logging to fail, they can cover their tracks more easily.

deathanatos
1 replies
1d5h

To me audit logs aren't "logs" (in the normal sense), despite the name. They tend to have different requirements; e.g., in my industry, they must be retained, by law, and for far longer than our normal logs.

To me, those different requirements imply that they should be treated differently by the code, probably even under distinct flows: synchronously, and ideally to somewhere that I can later compress like hell and store in some very cheap long term storage.

Whereas the debug logs that I use for debugging? Rotate out after 30 to 90d, … and yeah, best effort is fine.

(The audit logs might also end up in one's normal logs too, for convenience.)

wolverine876
0 replies
1d5h

While I generally agree, I'll add that the debug logs can be useful in security incidents.

linuxdude314
0 replies
1d5h

It’s not the core of the application, but it can be the core of the business.

For companies that sell API access, logs in one form or another are how bills are reconciled and usage metered.

cduzz
0 replies
1d6h

People use logging (appropriately or inappropriately; not my bucket of monkeys) for a variety of things including audit and billing records, which are likely a good case for a guaranteed delivery API.

People often don't think precisely about what they say or want, and also often don't think through corner cases such as "what if XYZ breaks or gets slow?"

And don't get me started on "log" messages that are 300mb events. Per log. Sigh.

supriyo-biswas
9 replies
1d6h

The better way to do this is to write the logs to a file or an in-memory ring buffer and have a separate thread/process push logs from the file/ring-buffer to the logging service, allowing for retries if the logging service is down or slow (for moderately short values of down/slow).

Promtail[1] can do this if you're using Loki for logging.

[1] https://grafana.com/docs/loki/latest/send-data/promtail/
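
(A minimal sketch of that pattern with an in-memory ring buffer: callers never block, a background thread drains and retries. ship_to_log_service() is a hypothetical stand-in for the real sender; when the buffer is full, new messages are dropped. Build with -pthread.)

    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #define RING_SIZE 1024
    #define MSG_LEN   256

    static char ring[RING_SIZE][MSG_LEN];
    static int head, tail, count;                 /* guarded by lock */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

    /* Stand-in for the real network sender; returns 0 on success. */
    static int ship_to_log_service(const char *msg) { return fputs(msg, stderr) >= 0 ? 0 : -1; }

    /* Called by application code: never blocks on the network or disk.
     * If the buffer is full, the message is dropped (best effort). */
    void log_line(const char *msg)
    {
        pthread_mutex_lock(&lock);
        if (count < RING_SIZE) {
            snprintf(ring[head], MSG_LEN, "%s\n", msg);
            head = (head + 1) % RING_SIZE;
            count++;
            pthread_cond_signal(&nonempty);
        }                                         /* else: drop and move on */
        pthread_mutex_unlock(&lock);
    }

    /* Background thread: drains the ring and retries slow/failed sends
     * without ever stalling the callers of log_line(). */
    static void *flusher(void *arg)
    {
        (void)arg;
        char msg[MSG_LEN];
        for (;;) {
            pthread_mutex_lock(&lock);
            while (count == 0)
                pthread_cond_wait(&nonempty, &lock);
            memcpy(msg, ring[tail], MSG_LEN);
            tail = (tail + 1) % RING_SIZE;
            count--;
            pthread_mutex_unlock(&lock);

            while (ship_to_log_service(msg) != 0)
                sleep(1);                         /* crude retry/backoff */
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, flusher, NULL);
        log_line("service started");
        log_line("hello from the ring buffer");
        sleep(1);                                 /* give the flusher time */
        return 0;
    }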

insanitybit
5 replies
1d6h

But that's still not guaranteed delivery. You're doing what the OP presented - choosing to drop logs under some circumstances when the system is down.

a) If your service crashes and it's in-memory, you lose logs

b) If your service can't push logs off (upstream service is down or slow) you either drop logs, run out of memory, or block

hgfghui7
2 replies
1d5h

You are thinking too much in terms of the stated requirements instead of what people actually want: good uptime and good debuggability. Falling back to local logging means a blip in logging availability doesn't turn into an all-hands-on-deck, everything-is-on-fire situation. And it means that logs will very likely be available for any failures.

In other words it's good enough.

mort96
0 replies
1d5h

"Good uptime and good reliability but no guarantees" is just a good best effort system.

insanitybit
0 replies
1d3h

Good enough is literally "best effort delivery", you're just agreeing with them that this is ultimately a distributed systems problem and you either choose CP or AP.

o11c
0 replies
1d3h

Logging to `mmap`ed files is resilient to service crashes, just not hardware crashes.
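
(Sketch of the idea: with a MAP_SHARED file mapping, the written bytes live in dirty page-cache pages backed by the file, so the kernel writes them out even if the process dies before flushing. A kernel panic or power loss can still lose them unless msync() has run. Filename and size are arbitrary.)

    #include <fcntl.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        const char line[] = "something happened\n";
        int fd = open("app.log", O_RDWR | O_CREAT, 0644);
        if (fd < 0) return 1;
        if (ftruncate(fd, 1 << 20) != 0) return 1;      /* 1 MiB log file */

        char *log = mmap(NULL, 1 << 20, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (log == MAP_FAILED) return 1;

        memcpy(log, line, sizeof line - 1);             /* "write" by memcpy */
        /* Even if the process crashes right here, the kernel still flushes
         * the dirty page to app.log eventually. msync(log, 1 << 20, MS_SYNC)
         * would force it out for power-failure durability. */
        return 0;
    }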

kbenson
0 replies
1d5h

Yeah, what "best effort" actually means in practice is usually a result of how many resources you want to throw at the problem. Those give you runway on how much of a problem you can withstand and perhaps recover from without any loss of data (logs), but in the end you're usually still just buying time. That's usually enough though.

sroussey
2 replies
1d6h

We did something like this at Weebly for stats. The app sent the stats to a local service via UDP, so shoot and forget. That service aggregated for 1s and then sent it off-server.
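
(Something like the following, as a hedged sketch; the port and payload format are made up, loosely statsd-flavored.)

    /* Fire-and-forget UDP send to a local aggregator: no connection,
     * no retry, no blocking on a dead peer. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        if (fd < 0) return 1;

        struct sockaddr_in dst = { 0 };
        dst.sin_family = AF_INET;
        dst.sin_port   = htons(8125);                   /* local aggregator */
        inet_pton(AF_INET, "127.0.0.1", &dst.sin_addr);

        const char stat[] = "page_views:1|c";
        /* If the aggregator is gone, the datagram is simply dropped. */
        sendto(fd, stat, sizeof stat - 1, 0, (struct sockaddr *)&dst, sizeof dst);
        close(fd);
        return 0;
    }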

laurencerowe
1 replies
1d5h

Why UDP for a local service rather than a unix socket?

sroussey
0 replies
3h32m

Send and forget. Did not want to wait on an ack from a broken process.

Zondartul
2 replies
1d4h

I have some wishful-thinking ideas on this, but it should be possible to have both, at least in an imaginary, theoretical scenario.

You can have both guaranteed delivery and no downtime if your whole system is so deterministic that anything that normally would result in blocking just will not, cannot happen. In other words it should be a hard real-time system that is formally verified top to bottom, down to the last transistor. Does anyone actually do that? Verify the program and the hardware to prove that it will never run out of memory for logs and such?

Continuing this thought, logs are probably generated endlessly, so either whoever wants them has to also guarantee that they are processed and disposed of right after being logged... or there is a finite amount of log messages that can be stored (an arbitrary number like 10,000), but the user (of the logs) has to guarantee that they will take the "mail" out of the box sooner than it overfills (at some predictable, deterministic rate). So really that means even if OUR system is mathematically perfect, we're just making the downtime someone else's problem - namely, the consumer of the infinite logs.

That, or we guarantee that the final resources of our self-contained, verified system will last longer than the finite shelf life of the system as a whole (like maybe 5 years for another arbitrary number)

morelisp
0 replies
1d3h

PACELC says you get blocking or unavailability or inconsistency.

ElectricalUnion
0 replies
2h46m

From a hardware point of view, this system is unlikely to exist, because you need a system with components that never have any reliability issues ever to have a totally deterministic system.

From a software point of view, this system is unlikely to exist, as it doesn't matter that the cause of your downtime is "something else that isn't our system". As a result, you're gonna end up requiring infinite reliable storage to uphold your promises.

rezonant
1 replies
1d6h

"If the GD logging interface is offline or slow, you'll take downtime; is that okay?"

[edit: added "GD" to clarify that I was referring to the guaranteed delivery logging api, not the best effort logging API]

i read GD as god-damned :-)

salamanderman
0 replies
1d5h

me too [EDIT: and I totally empathized]

loeg
1 replies
1d6h

I read GD as “god damn,” which also seems to fit.

rezonant
0 replies
1d5h

aw you beat me to it

lopkeny12ko
6 replies
1d5h

I would posit that if your product's availability hinges on +/- 100ms, you are doing something deeply wrong, and it's not your logging library's fault. Users are not going to care if a button press takes 100 more ms to complete.

fnordpiglet
1 replies
1d5h

100ms for something like, say, API authorization on a high-volume data plane service would be unacceptable. Exceeding latencies like that can degrade bandwidth and cause workers to exhaust connection counts. Likewise, even in human response space, 100ms is an enormous part of a budget for responsiveness. Taking authorization again: if you spend 100ms, you're exhausting the perceptible threshold for a human's sense of responsiveness to do something that's of no practical value but is entirely necessary. Your UI developers will be literally camped outside your Zoom room with virtual pitchforks night and day.

loeg
0 replies
1d3h

Yes, and in fact the service I am talking about is a high volume data plane service.

saagarjha
0 replies
17h56m

Core libraries at, say, Google, are supposed to be reliable to several nines. If they go down for long enough for a human to notice, they’re failing SLA.

loeg
0 replies
6h48m

Our service is expected to respond to small reads at under 1ms at the 99th percentile. >100ms stalls (which can go into many seconds) are absolutely unacceptable.

kccqzy
0 replies
1d3h

Add some fan out and 100ms could suddenly become 1s, 10s…

hamandcheese
0 replies
1d5h

Not every API is a simple CRUD app with a user at the other end.

tuetuopay
3 replies
1d7h

We had prod halt once when the syslog server hung. Logs were pushed over TCP, which propagated the blocking to the whole of prod. We switched to UDP transport since then: better to lose some logs than the whole of prod.

deathanatos
1 replies
1d5h

TCP vs. UDP and async best-effort vs. synchronous are completely orthogonal…

E.g., a service I wrote wrote logs to an ELK setup; we logged over TCP. But the logging was async: we didn't wait for logs to make it to ELK, and if the logging services went down, we just queued up logs locally. (To a point; at some point, the buffer fills up, and logs were discarded. The process would make a note of this if it happened, locally.)

tuetuopay
0 replies
1d4h

TCP vs. UDP and async best-effort vs. synchronous are completely orthogonal…

I agree, when stuff is properly written. I don't remember the exact details, but at least with UDP the asyncness is built in: there is no backpressure whatsoever. So poorly written software can just send UDP to its heart's content.

tetha
0 replies
1d7h

Especially if some system is unhappy enough to log enough volume to blow up the local log disk... you'll usually have enough messages and clues in the bazillion other messages that have been logged.

oneepic
1 replies
1d5h

Oh, we had this type of issue ("logging lib breaks everything") with a $MSFT logging library. Imagine having 100 threads each with their own logging buffer of 300MB. Needless to say it annihilated our memory and our server crashed, even on the most expensive sku of Azure App Service.

pests
0 replies
1d3h

Brilliant strategy.

Reminds me a little of the old-timers' trick of adding a sleep(1000) somewhere so they could come back later and have some resources in reserve, or a quick win with the client if they needed one.

Now cloud companies are using malloc(300000000) to fake resource usage. /s

RobertVDB
0 replies
1d5h

Ah, the classic GLOG-induced service stall - brings back memories! I've seen similar scenarios where logging, meant to be a safety net, turns into a trap. Your 90-99% figure resonates with my experience. It's like opening a small window for fresh air and having a storm barrel in. We eventually had to balance between logging verbosity and system performance, kind of like a tightrope walk over a sea of unpredictable IO delays. Makes one appreciate the delicate art of designing logging systems that don't end up hogging the spotlight (and resources) themselves, doesn't it?

Animats
53 replies
1d3h

QNX had this right decades ago. The microkernel has upper bounds on everything it does. There are only a few tens of thousands of lines of microkernel code. All the microkernel does is allocate memory, dispatch the CPU, and pass messages between processes. Everything else, including drivers and loggers, is in user space and can be preempted by higher priority threads.

The QNX kernel doesn't do anything with strings. No parsing, no formatting, no messages.

Linux suffers from being too bloated for real time. Millions of lines of kernel, all of which have to be made preemptable. It's the wrong architecture for real time. So it took two decades to try to fix this.

vacuity
24 replies
1d3h

For a modern example, there's seL4. I believe it does no dynamic memory allocation. It's also formally verified for various properties. (Arguably?) its biggest contribution to kernel design is the pervasive usage of capabilities to securely but flexibly export control to userspace.

adastra22
19 replies
1d3h

Capabilities are important, but I don’t think that was introduced by seL4. Mach (which underlies macOS) has the same capability-based system.

vacuity
18 replies
1d1h

I didn't say seL4 introduced capabilities. However, to my knowledge, seL4 was the first kernel to show that pervasive usage of capabilities is both feasible and beneficial.

adastra22
13 replies
22h20m

There's quite a history of capability-based research OSes that culminated in, but did not start with, L4 (of which seL4 is a later variant).

vacuity
12 replies
22h2m

Yes, but I believe seL4 took it to the max. I may be wrong on that count, but I think seL4 is unique in that it leverages capabilities for pretty much everything except the scheduler. (There was work in that area, but it's incomplete.)

mananaysiempre
8 replies
20h5m

IIRC the KeyKOS/EROS/CapROS tradition used capabilities for everything including the scheduler. Of course, pervasive persistence makes those systems somewhat esoteric (barring fresh builds, they never shut down or boot up, only go to sleep and wake up in new bodies; compare Smalltalk, etc.).

vacuity
4 replies
19h59m

Guess I'm too ignorant. I need to read up on these. I did know about the persistence feature. I think it's not terrible but also not great, and systems should be designed for being shut down and apps being closed.

naasking
3 replies
7h20m

I think it's not terrible but also not great, and systems should be designed for being shut down and apps being closed.

The problem with shutdowns and restarts is the secure bootstrapping problem. The boot process must be within the trusted computing base, so how do you minimize the chance of introducing vulnerabilities? With checkpointing, if you start in a secure state, you're guaranteed to have a secure state after a reboot. This is not the case with any other form of reboot, particularly ones that are highly configurable and thus make it easy for the user to introduce an insecure configuration.

In any case, many apps are now designed to restore their state on restart, so they are effectively checkpointing themselves, so there's clearly value to checkpointing. In systems with OS-provided checkpointing it's a central shared service and doesn't have to be replicated in every program. That's a significant reduction in overall system code that can go wrong.

vacuity
2 replies
6h40m

It's fallacious to assume that the persistence model of the system can't enter an invalid state and thus cause issues similar to bootstrapping. The threat model also doesn't make sense to me: if an attacker can manipulate the boot process, I feel like they would be able to attack the overall system just fine. Also, there's the bandwidth usage, latency, and whatnot. I think persistence is a strictly less powerful, although certainly convenient, design for an OS.

naasking
1 replies
6h5m

The threat model also doesn't make sense to me: if an attacker can manipulate the boot process, I feel like they would be able to attack the overall system just fine.

That's not true actually. These capability systems have the principle of least privilege right down to their core. The checkpointing code is in the kernel which only calls out to the disk driver in user space. The checkpointing code itself is basically just "flush these cached pages to their corresponding locations on disk, then update a boot sector pointer to the new checkpoint", and booting a system is "read these pages pointed to by this disk pointer sequentially into memory and resume".

The attack surface in this system is incomparably small compared to the boot process of a typical OS, which run user-defined scripts and scripts written by completely unknown people from software you downloaded from the internet, often with root or other broad sets of privileges.

I really don't think you can appreciate how this system works without digging into it a little. EROS was built from the design of KeyKOS that ran transactional bank systems back in the 80s. KeyKOS pioneered this kind of checkpointing system, so it saw real industry use in secure systems for years. I recommend at least reading an overview:

https://flint.cs.yale.edu/cs428/doc/eros-ieee.pdf

EROS is kind of like what you'd get it if you took Smalltalk and tried to push it into the hardware as an operating system, while removing all sources of ambient authority. It lives on as CapROS:

https://www.capros.org/

vacuity
0 replies
5h11m

I don't deny that bootstrapping in current systems is ridiculous, but I don't see why it can't be improved. It's not like EROS is a typical OS either. In any case, I'll read up on those OSes.

adastra22
2 replies
16h28m

Amoeba was my favorite, as it was a homogeneous, decentralized operating system. Different CPU architectures spread across different data centers, and it was all homogenized together into a single system image. You had a shell prompt where you typed commands and the OS could decide to spawn your process on your local device, in the server room rack, or in some connected datacenter in Amsterdam, it didn't make a difference. From the perspective of you, your program, or the shell, it's just a giant many-core machine with weird memory and peripheral access latencies that the OS manages.

Oh, and anytime as needed the OS could serialize out your process, pipe it across the network to another machine, and resume. Useful for load balancing, or relocating a program to be near the data it is accessing. Unless your program pays special attention to the clock, it wouldn't notice.

I still think about Amoeba from time to time, and imagine what could have been if we had gone down that route instead.

vacuity
1 replies
15h45m

Wouldn't there be issues following from distributed systems and CAP? Admittedly, I know nothing about Amoeba.

E.g. You spawn a process on another computer and then the connection drops.

adastra22
0 replies
3h3m

There's no free lunch of course, so you would have circumstances where a network partition at a bad time would result in a clone instead of a copy. I don't know what, if anything, Amoeba did about this.

In practice it might not be an issue. The reason you'd typically do something like move processes across a WAN is because you want it to operate next to data it is making heavy use of. The copy that booted up local to the data would continue operating, while the copy at the point of origin would suddenly see the data source go offline.

Now of course more complex schemes can be devised, like if the data source is replicated and so both copies continue operating. Maybe a metric could be devised for detecting these instances when the partition is healed, and one or both processes are suspended for manual resolution? Or maybe programs just have to be written with the expectation that their capabilities might suddenly become invalid at any time, because the capability sides with the partition that includes the resource? Or maybe go down the route of making the entire system transactional, so that partition healing can occur, and only throw away transaction deltas once receipts are received for all nodes ratcheting state forward?

It'd be an interesting research area for sure.

adastra22
2 replies
21h16m

L4 was developed in the 90's. Operating Systems like Amoeba, which were fundamentally capability-based to a degree that even exceeds L4, were a hot research topic in the 80's.

L4's contribution was speed. It was assumed that microkernels, and especially capability-based microkernels were fundamentally slower than monolithic kernels. This is why Linux (1991) is monolithic. Yet L4 (1994) was the fastest operating system in existence at the time, despite being a microkernel and capability based. It's too bad those dates aren't reversed, or we might have had a fast, capability-based, microkernel Linux :(

josephg
1 replies
10h25m

How did it achieve its speed? My understanding was that microkernel architectures were fundamentally slower than monolithic kernels because context switching is slower than function calls. How did L4 manage to be the fastest?

vacuity
0 replies
8h13m

Two factors are the small working set (fit the code and data in cache) and heavily optimized IPC. The IPC in L4 kernels is "a context switch with benefits": in the best case, the arguments are placed in registers and the context is switched. Under real workloads, microkernels probably will be slower, but not by much.

monocasa
2 replies
22h57m

The other L4s before it showed that caps are useful and can be implemented efficiently.

vacuity
1 replies
21h50m

https://dl.acm.org/doi/pdf/10.1145/2517349.2522720

" We took a substantially different approach with seL4; its model for managing kernel memory is seL4’s main contribution to OS design. Motivated by the desire to reason about resource usage and isolation, we subject all kernel memory to authority conveyed by capabili- ties (except for the fixed amount used by the kernel to boot up, including its strictly bounded stack). "

I guess I should've said seL4 took capabilities to the extreme.

naasking
0 replies
7h25m

seL4 was heavily inspired by prior capability based operating systems like EROS (now CapROS) and Coyotos. Tying all storage to capabilities was core to those designs.

Animats
0 replies
3h45m

No, that was KeyKOS, which was way ahead of its time.[1] Norm Hardy was brilliant but had a terrible time getting his ideas across.

[1] https://en.wikipedia.org/wiki/KeyKOS

_kb
3 replies
23h10m

And unfortunately had its funding dumped because it wasn’t shiny AI.

snvzz
2 replies
10h48m

Its old source of funding. And it was much more complex[0] than that.

seL4 is now a healthy non-profit, seL4 foundation[1].

0. https://microkerneldude.org/2022/02/17/a-story-of-betrayal-c...

1. https://microkerneldude.org/2022/03/22/ts-in-2022-were-back/

Animats
1 replies
5h20m

The trouble with L4 is that it's so low-level you have to put another OS on top of it to do anything. Which usually means a bloated Linux. QNX offers a basic POSIX interface, implemented mostly as libraries.

snvzz
0 replies
4h8m

Note that L4 and seL4 are very different kernels. They represent the 2nd generation and 3rd generation of microkernels respectively.

With that out of the way, you're right in that the microkernel doesn't present a posix interface.

But, like QNX, there are libraries for that, seL4 foundation itself maintains some.

They have a major ongoing effort on system servers, driver APIs and ways to deploy system scenarios. Some of them were talked about in a recent seL4 conference.

And then there's third party efforts like the amazing Genode[0], which supports dynamic scenarios with the same drivers and userspace binaries across multiple microkernels.

They even have a modern web browser, 3D acceleration, and a VirtualBox port that runs inside Genode, so the dogfooding developers are able to run e.g. Linux inside VirtualBox to bridge the gap.

0. https://www.genode.org/

gigatexal
8 replies
1d2h

QNX is used in vehicle infotainment systems no? Where else?

I'm not bothered by the kernel bloat. There's a lot of dev time being invested in Linux, and while the desktop is not as much of a priority as, say, the server space, a performant kernel on handhelds and other such devices, and the dev work to get it there, will benefit desktop users like myself.

SubjectToChange
1 replies
21h39m

Railroads/Positive Train Control, emergency call centers, etc. QNX is used all over the place. If you want an even more impressive Microkernel RTOS, then Green Hills INTEGRITY is a great example. It's the RTOS behind the B-2, F-{16,22,35}, Boeing 787, Airbus A380, Sikorsky S-92, etc.

yosefk
0 replies
10h15m

"Even more impressive" in what way? I haven't used INTEGRITY but used the Green Hills compiler and debugger extensively for years and they're easily the most buggy development tools I've ever had the misfortune to use. To me the "impressive" thing is their ability to lock safety critical software developers into using this garbage.

tyfon
0 replies
1d1h

It was used in my old Toyota Avensis from 2012. The infotainment was so slow you could measure performance in seconds per frame instead of frames per second :)

In the end, all I could practically use it for was as a Bluetooth audio connector.

notrom
0 replies
23h51m

I've worked with it in industrial automation systems in large-scale manufacturing plants, where it was pretty rock solid. And I'm aware of its use in TV production and transmission systems.

lmm
0 replies
1d1h

QNX is used in vehicle infotainment systems no? Where else?

A bunch of similar embedded systems. And blackberry, if anyone's still using them.

dilyevsky
0 replies
21h5m

Routers, airplanes, satellites, nuclear power stations, lots of good stuff

bkallus
0 replies
1d1h

I went to a conference at GE Research where I spoke to some QNX reps from Blackberry for a while. Seemed like they were hinting that some embedded computers in some of GE's aerospace and energy stuff rely on QNX.

Cyph0n
0 replies
23h32m

Cisco routers running IOS-XR, until relatively recently.

gigatexal
7 replies
1d2h

QNX had this right decades ago. The microkernel has upper bounds on everything it does. There are only a few tens of thousands of lines of microkernel code. All the microkernel does is allocate memory, dispatch the CPU, and pass messages between processes. Everything else, including drivers and loggers, is in user space and can be preempted by higher priority threads.

So, much like a well-structured main function in a C program (or other C-like language) where main just orchestrates the calling of other functions. In this case main might initialize different things where the QNX kernel doesn't, but the general concept remains.

I'm no kernel dev but this sounds good to me. Keeps things simple.
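
For illustration, a tiny C sketch of that orchestrator-style main; the function names are hypothetical stand-ins, not QNX APIs:

    /* main only wires things together and dispatches; the real work
     * lives in the called functions (stand-ins for services here). */
    #include <stdio.h>

    static void init_memory(void)           { puts("memory manager ready"); }
    static void start_scheduler(void)       { puts("scheduler running"); }
    static void pass_message(const char *m) { printf("msg: %s\n", m); }

    int main(void)
    {
        init_memory();                      /* like the kernel's allocator */
        start_scheduler();                  /* like CPU dispatch           */
        pass_message("driver -> logger");   /* like IPC between servers    */
        return 0;
    }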

vacuity
6 replies
1d1h

Recently, I've been thinking that we need a microkernel design in applications. You have the core and then services that can integrate amongst each other and the core that provide flexibility. Like the "browser as an OS" kind of things but applied more generally.

blackth0rn
2 replies
23h47m

ECS systems in the gaming world are somewhat like this. There is the core ECS framework, and then the systems and entities integrate with each other.
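
For readers unfamiliar with the pattern, here is a toy sketch of the core idea in C, with illustrative component/system names: entities are just indices, components are plain data owned by the framework, and systems are functions that iterate over whichever components they care about.

    /* Toy ECS core: entities are indices, components live in parallel
     * arrays, and a "system" is a function that walks that storage. */
    #include <stdio.h>

    #define MAX_ENTITIES 8

    typedef struct { float x, y; }   Position;
    typedef struct { float dx, dy; } Velocity;

    static Position positions[MAX_ENTITIES];
    static Velocity velocities[MAX_ENTITIES];
    static unsigned char has_pos[MAX_ENTITIES], has_vel[MAX_ENTITIES];

    /* Operates on every entity that has both components. */
    static void movement_system(float dt)
    {
        for (int e = 0; e < MAX_ENTITIES; e++) {
            if (has_pos[e] && has_vel[e]) {
                positions[e].x += velocities[e].dx * dt;
                positions[e].y += velocities[e].dy * dt;
            }
        }
    }

    int main(void)
    {
        /* entity 0 moves, entity 1 is static */
        positions[0]  = (Position){0, 0};  has_pos[0] = 1;
        velocities[0] = (Velocity){1, 2};  has_vel[0] = 1;
        positions[1]  = (Position){5, 5};  has_pos[1] = 1;

        movement_system(0.016f);            /* one 16 ms frame */
        printf("entity 0 at (%.3f, %.3f)\n", positions[0].x, positions[0].y);
        return 0;
    }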

spookie
1 replies
23h35m

ECS is incredible. Other areas should take notice

whstl
0 replies
23h6m

Agreed. I find that we're going in this direction in many areas, games just got there much faster.

Pretty much everywhere there is some undercurrent of "use this ultra-small generic interface for everything and life will be easier". With games and ECS, microkernels and IPC-for-everything, with frontend frameworks and components that only communicate between themselves via props and events, with event sourcing and CQRS backends, Actors in Erlang, with microservices only communicating via the network to enforce encapsulation... Perhaps even Haskell's functional-core-imperative-shell could count as that?

I feel like OOP _tried_ to get to this point, with dependency injection and interface segregation, but didn't quite get there due to bad ergonomics, verbosity and because it was still too easy to break the rules. But it was definitely an attempt at improving things.

vbezhenar
0 replies
17h50m

COM, OSGi, service architecture, microservice architecture, and countless other approaches. This is the correct way to build applications, as shown by the fact that it gets reinvented over and over again.

galdosdi
0 replies
1d

Yes! This reminds me strongly of the core/modules architecture of the apache httpd, as described by the excellent O'Reilly book on it.

The process of serving an HTTP request is broken into a large number of fine grained stages and plugin modules may hook into any or all of these to modify the input and output to each stage.

The same basic idea makes it easy to turn any application concept into a modules-and-core architecture. From the day I read (skimmed) that book a decade or two ago this pattern has been burned into my brain
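
Not Apache's actual module API, just a hedged sketch of the stages-and-hooks idea in C: the core defines the request stages, modules register callbacks, and the core simply walks whatever is registered for each stage. All names here are made up for illustration.

    #include <stdio.h>

    typedef struct { const char *url; int status; } Request;
    typedef void (*hook_fn)(Request *);

    enum stage { AUTH, HANDLER, LOGGING, NUM_STAGES };

    #define MAX_HOOKS 8
    static hook_fn hooks[NUM_STAGES][MAX_HOOKS];
    static int     hook_count[NUM_STAGES];

    /* Modules call this at startup to plug into a stage. */
    static void register_hook(enum stage s, hook_fn fn)
    {
        if (hook_count[s] < MAX_HOOKS)
            hooks[s][hook_count[s]++] = fn;
    }

    /* The core: walk the stages and call whatever is registered. */
    static void serve(Request *r)
    {
        for (int s = 0; s < NUM_STAGES; s++)
            for (int i = 0; i < hook_count[s]; i++)
                hooks[s][i](r);
    }

    /* Two example "modules". */
    static void handler_module(Request *r) { r->status = 200; }
    static void log_module(Request *r)     { printf("%s -> %d\n", r->url, r->status); }

    int main(void)
    {
        register_hook(HANDLER, handler_module);
        register_hook(LOGGING, log_module);
        Request r = { "/index.html", 0 };
        serve(&r);
        return 0;
    }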

elcritch
0 replies
14h23m

That’s pretty much what Erlang/OTP is, and it’s like a whole OS. Though it lacks capabilities.

creshal
4 replies
12h6m

Millions of lines of kernel, all of which have to be made preemptable.

~90% of those are device drivers, which you'd still need with a microkernel if you want it to run on arbitrary hardware.

dontlaugh
3 replies
11h12m

But crucially, drivers in a microkernel run in user space and are thus pre-emptible by default. Then the driver itself only has to worry about dealing with hardware timing when pre-empted.

creshal
2 replies
10h36m

Sure, but who's going to write the driver in the first place? Linux's "millions of lines of code" are a really underappreciated asset, there's tons of obscure hardware that is no longer supported by any other actively maintained OS.

dontlaugh
1 replies
10h20m

I also don't see how we could transition to a microkernel, indeed.

naasking
0 replies
7h18m

The very first "hypervisors" were actually microkernels that ran Linux as a guest. This was done with Mach on PowerPC/Mac systems, and also the L4 microkernel. That's one way.

The only downside of course, is that you don't get the isolation benefits of the microkernel for anything depending on the Linux kernel process.

js2
1 replies
23h22m

VxWorks is what's used on Mars and it's a monolithic kernel, so there's more than one way to do it. :-)

dilyevsky
0 replies
21h2m

I think the RT build also had to disable the MMU.

signa11
0 replies
12h27m

this feels like the Tanenbaum-Torvalds debate once again.

matheusmoreira
0 replies
9h38m

And yet it's getting done! It's very impressive work.

bregma
0 replies
1d2h

The current (SDP 8) kernel has 15331 lines of code, including comments and Makefiles.

AndyMcConachie
0 replies
6h26m

Linus Torvalds and Andrew Tanenbaum called. They want their argument back!

eisbaw
16 replies
1d7h

Great to hear. However, even if Linux the kernel is real-time, the hardware likely won't be, due to caches and internal magic CPU trickery.

Big complex hardware is a no-no for true real-time.

That's why AbsInt and other WCET tools mainly target simple CPU architectures. 8051 will truly live forever.

btw, Zephyr RTOS.

wholesomepotato
7 replies
1d6h

Features of modern CPUs don't really prevent them from real-time usage, afaik. As long as something is bounded and can be reasoned about, it can be used to build a real-time system. You can always assume no cache hits and the like, maximum load, etc., and as long as you can put a bound on the time it will take, you're good to go.

synergy20
2 replies
1d6h

mlocking your memory, testing with cache-miss and cache-invalidation scenarios, and using no heap for memory allocation will help, but it's a bit hard
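
A minimal sketch of that preparation in user space, using only standard POSIX calls; the arena size is an arbitrary example, and mlockall() may need root or a raised RLIMIT_MEMLOCK:

    /* Lock memory and pre-fault it so the time-critical path takes no
     * page faults later. Error handling trimmed for brevity. */
    #include <sys/mman.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define PREFAULT_HEAP (8 * 1024 * 1024)   /* arbitrary example size */

    int main(void)
    {
        /* Lock current and future pages into RAM. */
        if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
            perror("mlockall");
            return 1;
        }

        /* Pre-fault an arena up front; afterwards, avoid malloc() in the
         * critical path and carve allocations out of this arena instead. */
        char *arena = malloc(PREFAULT_HEAP);
        if (!arena)
            return 1;
        memset(arena, 1, PREFAULT_HEAP);      /* touch every page now */

        /* ... time-critical work here, using only pre-faulted memory ... */

        free(arena);
        return 0;
    }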

jeffreygoesto
0 replies
1d2h
eschneider
0 replies
23h36m

Does anyone use paged memory in hard realtime systems?

bloak
1 replies
1d6h

So the things that might prevent you are:

1. Suppliers have not given you sufficient information for you to be able to prove an upper bound on the time taken. (That must happen a lot.)

2. The system is so complicated that you are not totally confident of the correctness of your proof of the upper bound.

3. The only upper bound that you can prove with reasonable confidence is so amazingly bad that you'd be better off with cheaper, simpler hardware.

4. There really isn't a worst case. There might, for example, be a situation equivalent to "roll the dice until you don't get snake eyes". In networking, for example, sometimes after a collision both parties try again after a random delay so the situation is resolved eventually with probability one but there's no actual upper bound. A complex CPU and memory system might have something like that? Perhaps you'd be happy with "the probability of this operation taking more than 2000 clock cycles is less than 10^-13" but perhaps not.
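
A worked example of point 4, with numbers chosen purely for illustration: if each retry independently fails with probability p, then

    P(still unresolved after k attempts) = p^k

which never reaches zero, so there is no hard upper bound. But with p = 1/2, (1/2)^44 ≈ 5.7 × 10^-14, so a probabilistic bound like "less than 10^-13" is still provable after 44 attempts.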

formerly_proven
0 replies
10h40m

You're probably thinking about bus arbiters in 4.), which are generally fast but have no bounded settling time.

dooglius
0 replies
1d4h

System management mode is one example of a feature on modern CPUs that prevents real-time usage https://wiki.linuxfoundation.org/realtime/documentation/howt...

SAI_Peregrinus
0 replies
1d6h

Exactly. "Real-time" is a misnomer, it should be called "bounded-time". As long as the bound is deterministic, known in advance, and guaranteed, it's "real-time". For it to be useful it also must be under some application-specific duration.

The bounds are usually in CPU cycles, so a faster CPU can sometimes be used even if it takes more cycles. CPUs capable of running Linux usually have higher latency (in cycles) than microcontrollers, but as long as that can be kept under the (wall clock) duration limits with bounded-time it's fine. There will still be cases where the worst-case latency to fetch from DRAM in an RT-Linux system will be higher than a slower MCU fetching from internal SRAM, so RT-Linux won't take over all these systems.

0xDEF
3 replies
1d6h

Big complex hardware is a no-no for true real-time.

SpaceX uses x86 processors for their rockets. That small drone copter NASA put on Mars uses "big-ish" ARM cores that can probably run older versions of Android.

ska
2 replies
1d5h

Does everything run on those CPUs though? Hard realtime control is often done on a much simpler MCU at the lowest level, with oversight/planning handled by a higher-level system....

zokier
1 replies
1d4h

In short, no. For Ingenuity (the Mars 2020 helicopter) the flight computer runs on a pair of hard-realtime Cortex-R5 MCUs paired with an FPGA. The non-realtime Snapdragon SoC handles navigation/image processing duties.

https://news.ycombinator.com/item?id=26907669

ska
0 replies
1d3h

That's basically what I expected, thanks.

nraynaud
1 replies
1d6h

I think it's really useful on "big" MCUs, like the Raspberry Pi. There exists an entire real-time spirit there, where you don't really use the CPU to do any bit banging but everything is on time as seen from the outside. You have timers that receive the quadrature encoder inputs, and they just send an interrupt when they wrap; the GPIO system can be plugged into the DMA, so you can stream memory to the output pins without involving the CPU (again, interrupts at mid-buffer and empty buffer). You can stream to a DAC, or stream from an ADC to memory with the DMA. A lot of that stuff bypasses the caches to get predictable latency.

stefan_
0 replies
1d6h

Nice idea but big chip design strikes again: on the latest Raspberry Pi, GPIO pins are handled by the separate IO chip connected over PCI Express. So now all your GPIO stuff needs to traverse a shared serial bus (that is also doing bulk stuff like say raw camera images).

And already on many bigger MCUs, GPIOs are just separate blocks on a shared internal bus like AHB/APB that connects together all the chip IP, causing unpredictable latencies.

snvzz
0 replies
6h36m

8051 will truly live forever.

68000 is the true king of realtime.

SubjectToChange
0 replies
1d4h

Big complex hardware is a no-no for true real-time.

There are advanced real-time cores like the Arm Cortex-R82. In fact many real-time systems are becoming quite powerful due to the need to process and aggregate ever increasing amounts of sensor data.

Aaargh20318
13 replies
1d6h

What does this mean for the common user? Is this something you would only enable in very specific circumstances or can it also bring a more responsive system to the general public?

stavros
4 replies
1d6h

As far as I can understand, this is for Linux becoming an option when you need an RTOS, so for critical things like aviation, medical devices, and other such systems. It doesn't do anything for the common user.

ska
2 replies
1d5h

For the parts of such systems that you would need an RTOS for, this isn't really a likely replacement, because the OS is way too complex.

The sort of thing it could help with is servicing hardware that does run hard realtime. For example, you have an RTOS doing direct control of a robot or medical device or whatever, and you have a UI pendant or the like that a user is interacting with. If Linux on that pendant can make some realtime latency guarantees, you may be able to simplify communication between the two without risking dropping bits on the floor.

Conversely, for the common user it could in theory improve things like audio/video streaming, but I haven't looked into the details or how much trouble there is currently.

elcritch
1 replies
13h45m

It depends on the field. I know of one robot control software company planning to switch to an RT Linux stack. Their current one is a *BSD-derived RTOS that runs, kid you not, alongside Windows.

RT Linux might not pass some certifications, but there are likely many systems where it would be sufficient.

ska
0 replies
5h36m

With robots a lot depends on the scope of movement and speed, and whether or not it interacts with the environment/people. For some applications the controller is already dedicated hardware on the joint module anyway, with some sophistication, connected to a CAN (or EtherCAT) bus or something like that, so no OS is in the tightest loop. I could see the high-level control working on RT Linux or whatever if you wanted to; lots of tradeoffs. Mainly though it's the same argument: you probably don't want a complex OS involved in the lowest-level/finest time tick updates. Hell, some of the encoders are spewing enough data that you probably end up with the first thing it hits being an ASIC anyway, then an MCU dealing with control updates/fusion etc., then a higher-level system for planning.

SubjectToChange
0 replies
1d6h

The Linux kernel, real-time or not, is simply too large and too complex to realistically certify for anything safety critical.

fbdab103
2 replies
1d6h

My understanding is that real-time makes a system slower. To be real-time, you have to put a time allocation on everything. Each operation is allowed X budget, and will not deviate. This means if the best-case operation is fast, but the worst case is slow, the system has to always assume worst case.

snvzz
0 replies
10h41m

real-time makes a system slower.

linux-rt's PREEMPT_RT has a negligible impact. It is there, but it is negligible. It does, however, enable a lot of use cases where Linux fails otherwise, such as pro audio.

In modern usage, it even helps reduce input jitter with videogames and enable lower latency videoconference.

I am hopeful most distributions will turn it on by default, as it benefits most users, and causes negligible impact on throughput-centric workloads.

sesm
0 replies
13h28m

It’s a classic latency-throughput trade-off: lower latency means lower throughput. Doing certain operations in bulk (like GC) increases latency, but is also more efficient and increases throughput.

dist-epoch
1 replies
1d5h

It could allow very low latency audio (1-2 ms). Not a huge thing, but nice for some audio people.

snvzz
0 replies
6h34m

s/nice/needed/g

rcxdude
0 replies
1d3h

the most common desktop end-user that might benefit from this is those doing audio work: latency and especially jitter can be quite a pain there.

ravingraven
0 replies
1d6h

If by "common" user you mean the desktop user, not much. But this is a huge deal for embedded devices like industrial control and communication equipment, as their devs will be able to use the latest mainline kernel if they need real-time scheduling.

andrewaylett
0 replies
1d5h

RT doesn't necessarily improve latency, it gives it a fixed upper bound for some operations. But the work needed to allow RT can definitely improve latency in the general case -- the example of avoiding synchronous printk() calls is a case in point. It should improve latency under load even when RT isn't even enabled.

I think I'm right in asserting that a fully-upstreamed RT kernel won't actually do anything different from a normal one unless you're actually running RT processes on it. The reason it's taken so long to upstream has been the trade-offs that have been needed to enable RT, and (per the article) there aren't many of those left.

andy_ppp
8 replies
1d7h

What do other realtime OS kernels do when printing from various places? It almost seems like this should be done in hardware because it's such a difficult problem to not lose messages but also have them on a different OS thread in most cases.

EdSchouten
3 replies
1d7h

Another option is simply to print less, but expose more events in the form of counters.

Unfortunately, within a kernel that’s as big as Linux, that would leave you with many, many, many counters. All of which need to be exported and monitored somehow.

taeric
2 replies
1d7h

This seems to imply you would have more counters than messages? Why would that be?

That is, I would expect moving to counters to be less information, period. Is that not the case?

nraynaud
1 replies
1d7h

My guess is that each counter would need to have a discovery point, a regular update mechanism and documentation, while you can send obscure messages willy-nilly in the log? And also they become an Application Interface with a life cycle, while (hopefully) not too many people will go parse the log as an API.

taeric
0 replies
1d6h

I think that makes sense, though I would still expect counters to be more dense than logs. I'm definitely interested in any case studies on this.

ajross
2 replies
1d7h

It's just hard, and there's no single answer.

In Zephyr, we have a synchronous printk() too, as for low-level debugging and platform bringup that's usually desirable (i.e. I'd like to see the dump from just before the panic please!).

For production logging use, though, there is a fancier log system[1] designed around latency boundaries that essentially logs a minimally processed stream to a buffer that then gets flushed from a low priority thread. And this works, and avoids the kinds of problems detailed in the linked article. But it's fiddly to configure, expensive in an RTOS environment (you need RAM for that thread stack and the buffer), depends on having an I/O backend that is itself async/low-latency, and has the mentioned misfeature where when things blow up, it's usually failed to flush the information you need out of its buffer.

[1] Somewhat but not completely orthogonal with printk. Both can be implemented in terms of each others, mostly. Sometimes.

vlovich123
1 replies
1d5h

What if the lower priority thread is starved and the buffer is full? Do you start dropping messages? Or overwrite the oldest ones and skip messages?

ajross
0 replies
1d4h

It drops messages. That's almost always the desired behavior: you never want your logging system to be doing work when the system is productively tasked with other things.

I know there was some level of argument about whether it's best to overwrite older content (ring-buffer-style, probably keeps the most important stuff) or drop messages at input time (faster, probably fewer messages dropped overall). But logging isn't my area of expertise and I forget the details.

But again, the general point being that this is a complicated problem with tradeoffs, where most developers up the stack tend to think of it as a fixed facility that shouldn't ever fail or require developer bandwidth. And it's not, it's hard.

xenadu02
0 replies
1d

In many problem spaces you can optimize for the common success and failure paths if you accept certain losses on long-tail failure scenarios.

A common logging strategy is to use a ring buffer with a separate isolated process reading from the ring. The vast majority of the time the ring buffer handles temporary disruptions (eg slow disk I/O to write messages to disk) but in the rare failure scenarios you simply overwrite events in the buffer and increment an atomic overwritten event counter. Events do not get silently dropped but you prioritize forward progress at the cost of data loss in rare scenarios.

Microkernels and pushing everything to userspace just moves the tradeoffs around. If your driver is in userspace and blocks writing a log message because the log daemon is blocked or the I/O device it is writing the log to is overloaded it does the same thing. Your realtime thread won't get what it needs from the driver within your time limit.

It all comes down to CAP theorem stuff. If you always want the kernel (or any other software) to be able to make forward progress within specific time limits then you must be willing to tolerate some data loss in failure scenarios. How much and how often it happens depends on specific design factors, memory usage, etc.
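
A rough sketch of the pattern described in this subthread, in generic C11/pthreads rather than Zephyr's or any other kernel's actual API, with a single producer assumed for brevity: writers never block, messages are dropped and counted when the ring is full, and a low-priority thread drains the ring whenever it gets CPU time.

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #define RING_SIZE 64              /* must be a power of two */
    #define MSG_LEN   80

    static char msgs[RING_SIZE][MSG_LEN];
    static atomic_uint head, tail;    /* head: next write, tail: next read */
    static atomic_uint dropped;       /* lost because the ring was full */

    /* Called from time-critical code: never blocks, never allocates. */
    static void rt_log(const char *text)
    {
        unsigned h = atomic_load(&head);
        unsigned t = atomic_load(&tail);
        if (h - t >= RING_SIZE) {     /* full: drop and count */
            atomic_fetch_add(&dropped, 1);
            return;
        }
        strncpy(msgs[h % RING_SIZE], text, MSG_LEN - 1);
        msgs[h % RING_SIZE][MSG_LEN - 1] = '\0';
        atomic_store(&head, h + 1);
    }

    /* Low-priority consumer: flushes whenever it gets scheduled. */
    static void *flusher(void *arg)
    {
        (void)arg;
        for (;;) {
            unsigned t = atomic_load(&tail);
            if (t != atomic_load(&head)) {
                puts(msgs[t % RING_SIZE]);
                atomic_store(&tail, t + 1);
            } else {
                usleep(1000);         /* idle; a condvar would also work */
            }
        }
        return NULL;
    }

    int main(void)                    /* build with -pthread */
    {
        pthread_t th;
        pthread_create(&th, NULL, flusher, NULL);
        for (int i = 0; i < 10; i++)
            rt_log("something happened");
        sleep(1);
        printf("dropped: %u\n", atomic_load(&dropped));
        return 0;
    }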

dataflow
6 replies
22h46m

I feel like focusing on the kernel side misses CPU level issues.

Is there any known upper bound on, say, how long a memory access instruction takes on x86?

rsaxvc
3 replies
20h5m

I don't know for x86.

But for things that really matter, I've tested by configuring the MMU to disable caching for the memory that the realtime code lives in and uses to emulate 0% hitrate. And there's usually still a fair amount of variance on top of that depending on if the memory controller has a small cache, and where the memory controller is in its refresh cycle.

dataflow
2 replies
19h57m

Yeah. And I'm not sure that even that would give you the worst case as far as the cache is concerned. Of course I don't know how these implementations work, but it seems plausible that code that directly uses memory could run faster than code that encounters a cache miss beforehand (or contention, if you're using multiple cores). Moreover there's also the instruction cache, and I'm not sure if you can disable caching for that in a meaningful way?

For soft real time, I don't see a problem. But for hard real time, it seems a bit scary.

rsaxvc
0 replies
40m

You're right! I can think of two cases I've run into where bypassing the cache can be faster compared to a miss.

On some caches the line must be filled before allowing a write (ignoring any write buffer at the interface above the cache) - those basically halve the memory bandwidth when writing to a lot of cache lines. Some systems now have instructions for filling a cache line directly to avoid this. And some CPUs have bit-per-byte validity tracking to avoid this too.

Even on caches with hit-during-fill, a direct read from an address near the last-to-be-filled end of a cacheline can sometimes be a little faster than a cache miss, since the miss will fill the rest of the line first.

rsaxvc
0 replies
19m

Moreover there's also the instruction cache, and I'm not sure if you can disable caching for that in a meaningful way?

Intels used to boot with their caches disabled, but I haven't worked with them in forever, and never multicore.

I worked with a lot of microcontrollers, and it's not uncommon to be able to disable the instruction cache there.

There are a few things that require the data caches too, like atomic accesses on ARM. Usually we were doing something fairly short though in our realtime code, so it was easy enough to map just the memory it needed as uncacheable.

saagarjha
1 replies
17h55m

You can continually take page faults in a Turing complete way without executing any code, so I would guess this is unbounded?

dataflow
0 replies
17h53m

I almost mentioned page faults, but that's something the kernel has control over. It could just make sure everything is in memory so there aren't any faults. So it's not really an issue I think.

TeeMassive
4 replies
1d7h

It's kind of crazy that a feature needed 20 years of active development to be more or less called complete.

I hope it will be ready soon. I'm working on a project that has strict serial communication requirements, and it has caused us a lot of headaches.

worthless-trash
2 replies
1d6h

Can you expand on this, as I'm a little naive in this area: say you isolated the CPUs (isolcpus parameter) and then taskset your task onto the isolated CPU, would the scheduler no longer be involved, and would your task be the only thing serviced by that CPU?

Is it other interrupts on the CPU that break your process out of the "real time" requirement? I find this all so interesting.

TeeMassive
1 replies
1d6h

It's an embedded system with two logical cores with at least 4 other critical processes running. Doing that will only displace the problem.

worthless-trash
0 replies
16h4m

I (incorrectly) assumed that serial port control was the highly sensitive time problem that was being dealt with here.
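
Coming back to the isolcpus + taskset question above: the same pinning can be done from inside the process. A minimal sketch follows (the CPU number and priority are arbitrary; isolcpus=3 on the kernel command line is assumed, and it needs root or CAP_SYS_NICE). Note that even then, IRQs routed to that CPU and per-CPU kernel threads can still preempt the task, which is part of what PREEMPT_RT and IRQ-affinity tuning address.

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        /* Pin the calling thread to the isolated core (CPU 3 here). */
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(3, &set);
        if (sched_setaffinity(0, sizeof(set), &set) != 0)
            perror("sched_setaffinity");

        /* Give it a real-time FIFO priority (80 is an arbitrary choice). */
        struct sched_param sp = { .sched_priority = 80 };
        if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
            perror("sched_setscheduler");

        /* ... time-critical work would run here ... */
        return 0;
    }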

eisbaw
0 replies
1d7h

Zephyr RTOS.

deepsquirrelnet
3 replies
1d6h

What a blast from the past. I compiled a kernel for Debian with RT_PREEMPT about 17-18 years ago to use with scientific equipment that needed tighter timings. I was very impressed at the latencies and jitter.

I haven’t really thought about it since then, but I can imagine lots of use cases for something like an embedded application with a Raspberry Pi where you don’t quite want to make the leap into a microcontroller running an RTOS.

HankB99
2 replies
22h6m

Interesting to mention the Raspberry Pi. I saw an article just a day or two ago that claimed that RpiOS was started by, and ran on top of, an RTOS. That's particularly interesting because at one time years ago, I saw suggestions that Linux could run as a task on an RTOS. Things that required hard real-time deadlines could run on the RTOS and not be subject to the delays that a virtual memory system could entail.

I don't recall if this was just an idea or was actually implemented. I also have seen only the one mention of RpiOS on an RTOS so I'm curious about that.

rsaxvc
1 replies
20h24m

That's particularly interesting because at one time years ago, I saw suggestions that Linux could run as a task on an RTOS.

I've worked with systems that ran Linux as a task of uITRON as well as threadX, both on somewhat obscure ARM hardware. Linux managed the MMU but had a large carveout for the RTOS code. They had some strange interrupt management so that Linux could 'disable interrupts' but while Linux IRQs were disabled, an RTOS IRQ could still fire and context switch back to an RTOS task. I haven't seen anything like this on RPi though, but it's totally doable.

HankB99
0 replies
17h53m

Interesting to know that it was more than just an idea - thanks!

0xDEF
3 replies
1d6h

What do embedded real-time Linux people use for bootloader, init system, utilities, and C standard library implementation? Even Android that does not have real-time constraints ended up using Toybox for utilities and rolling their own C standard library (Bionic).

shandor
0 replies
2h33m

I guess U-Boot, uClibc, and BusyBox are quite a common starting point.

Of course, this varies immensely between different use cases, as ”embedded Linux” spans such a huge swath of different kinds of systems from very cheap and simple to complex and powerful.

rcxdude
0 replies
1d3h

You aren't likely to need to change a lot of these: the whole point is basically making it so that all that can run as normal but won't really get in the way of your high-priority process. It's just that your high-priority process needs to be careful not to block on anything that might take too long due to some other stuff running. In which case you may need to avoid certain C standard library calls, but not replace it entirely.

jovial_cavalier
0 replies
23h38m

I use u-boot for a boot loader. As for init and libc, I just use systemd and glibc.

Boot time is not a bottleneck for my application (however long it takes, the client will take longer…), and I’m sure there’s some more optimal libc to use, but I’m not sure the juice is worth the squeeze.

I’m also interested in what others are doing.

salamanderman
2 replies
1d5h

I had a frustrating number of job interviews in my early career where the interviewers didn't know what realtime actually was. That "and predictable delay" concept from the article frequently seemed to be lost on many folks, who seemed to think realtime just meant fast, whatever that means.

ska
0 replies
1d5h

The old saying: "real time" /= "real fast". Hard vs "soft" realtime muddies things a bit, but I think the majority of software developers probably don't really understand what realtime actually is either.

mort96
0 replies
1d5h

I would even remove the "minimum" part altogether; the point of realtime is that operations have predictable upper bounds. That might even mean slower average cases than in non-realtime systems. If you're controlling a car's braking system, "the average delay is 50ms but might take up to 80ms" might be acceptable, whereas "the average delay is 1ms but it might take arbitrarily long, possibly multiple seconds" isn't.

NalNezumi
1 replies
1d5h

Slightly tangential, but does anyone know good learning material to understand real-time (Linux) kernel more? For someone with rudimentary Linux knowledge.

I've had to compile&install real-time kernel as a requirement for a robot arm (franka) control computer. It would be nice to know a bit more than just how to install the kernel.

ActorNightly
0 replies
1d5h

https://www.freertos.org/implementation/a00002.html

Generally, having experience with Green Hills in a previous job, for personal projects like robotics or control systems I would recommend programming a microcontroller directly rather than dealing with an SoC running an RTOS. Modern STM32s with Cortex chips have enough processing power to run pretty much anything.

w10-1
0 replies
1d2h

There is no end game until there are end users beating on the system. That would put the 'real' in 'real-time'.

But who using an RTOS now would take the systems-integration cost/risk of switching? Would this put Android closer to bare-metal performance?

the8472
0 replies
1d3h

For an example how far the kernel goes to get log messages out even on a dying system and how that's used in real deployments:

https://netflixtechblog.com/kubernetes-and-kernel-panics-ed6...

sesm
0 replies
23h43m

IMO if you really care about a certain process being responsive, you should allocate dedicated CPU cores and a contiguous region of memory to it that shouldn’t be touched by the rest of the OS. Oh, and also give it direct access to a separate network card. I’m not sure if Linux supports this.

rwmj
0 replies
1d3h

About printk, the backported RT implementation of printk added to the RHEL 9.3 kernel has deadlocks ... https://issues.redhat.com/browse/RHEL-15897 & https://issues.redhat.com/browse/RHEL-9380

pardoned_turkey
0 replies
23h29m

The conversation here focuses on a distinction between "hard" real-time applications, where you probably don't want a general-purpose OS like Linux no matter what; and "soft" real-time applications like videoconferencing or audio playback, where nothing terrible happens if you get a bit of stuttering or drop a couple of frames every now and then. The argument is that RT Linux would be a killer solution for that.

But you can do all these proposed "soft" use cases with embedded Linux today. It's not like low-latency software video or audio playback is not possible, or wasn't possible twenty years ago. You only run into problems on busy systems where non-preemptible I/O could regularly get in the way. That's seldom a concern in embedded environments.

I think there are compelling reasons for making the kernel fully-preemptible, giving people more control over scheduling, and so forth. But these reasons have relatively little to do with wanting Linux to supersede minimalistic realtime OSes or bare-metal code. It's just good hygiene that will result in an OS that, even in non-RT applications, behaves better under load.

knorker
0 replies
1d6h

I just want SCHED_IDLEPRIO to actually do what it says.

jovial_cavalier
0 replies
23h12m

does HN have any thoughts on Xenomai[1]? I've been using it for years without issue.

On a BeagleBone Black, it typically gives jitter on the order of hundreds of nanoseconds. I would consider it "hard" real-time (as do they). I'm able to schedule tasks periodically on the scale of tens of microseconds, and they never get missed.

It differs from this in that Real-Time Linux attempts to make Linux itself preemptive, whereas Xenomai is essentially its own kernel, running Linux as a task on top. It provides an ABI which allows you to run your own tasks alongside or at higher prio than Linux. This sidesteps the `printk()` issue, for instance, since Xenomai doesn't care. It will gladly context switch out of printk in order to run your tasks.

The downside is that you can't make normal syscalls while inside of the Xenomai context. Well... you can, but obviously this invalidates the realtime model. For example, calling `printf()` or `malloc()` inside of a xenomai task is not preemptable. The Xenomai ABI does its best to replicate everything you may need as far as syscalls, which works great as long as you're happy doing your own heap allocations.

[1]: https://xenomai.org/

alangibson
0 replies
1d5h

Very exciting news for those of us building CNC machines with LinuxCNC. The end of kernel patches is nigh!

Tomte
0 replies
1d3h