> ‘But REPLY_FAULT also provides a way to define and implement new kinds of errors — application-specific errors — such as access control rules. For instance, the Hubris IP stack assigns IP ports to tasks statically. If a task tries to mess with another task’s IP port, the IP stack faults them. This gets us the same sort of “fail fast” developer experience, with the smaller and simpler code that results from not handling “theoretical” errors that can’t occur in practice.’
This sounds good when the system is small and tight and applications are written mostly by people who designed the whole system.
But as an application developer, I’d be somewhat scared to interface with third-party code over an IPC model where the other service can at any time send back an instant death pill to my process.
I guess I just don’t trust other app developers that much. The world is full of terrible drivers and background processes written by stressed-out developers harassed by management. They’ll drop in a bunch of potentially unsuitable default REPLY_FAULTs if it means they get to go home before 8pm.
I think that's intentional because that's what Hubris is aimed at.
...and in that circumstance, the author reports finding, apparently serendipitously, that it helped with development: "Initially I was concerned that I’d made the kernel too aggressive, but in practice, this has meant that errors are caught very early in development. A fault is hard to miss, and literally cannot be ignored the way an error code might be."
Swift death to deviance is a way to keep the system tight. The designed scope probably keeps it small anyway. Scopes have a way of creeping, but I don't think people will want to force tasks into Hubris that would be better on the host rather than in its embedded controllers.
To be fair, this is how abort works in any library you call into that’s in your process.
The Dennis Nedry approach to counting dinosaurs in Jurassic Park.
Indeed, this happened with Symbian. An IPC server could panic the client. As an application developer without access to the OS source code, this was pretty terrible. Not all preconditions were easy to understand, and they could vary between devices and OS versions.
For "service", think "OS interface". If you make a bogus kernel call on a monolithic kernel, it would be reasonable for the OS to kill you. Also note that when you say "process", it might mean something different from what you think, because threads all share the same address space on Hubris.
It seems like in an embedded environment, it's good to resolve these misunderstandings immediately when they occur, regardless of whose fault it is.
The server says "that client is bad!" so the kernel kills it. The problem is really that the two didn't understand each other.
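To make that concrete, here is a toy model in Rust of the port-ownership example from the quoted article. It is only a sketch: `TaskId`, `Reply`, `TaskState`, `IpStack`, and `deliver` are made-up names for illustration, not Hubris APIs, and the choice to return a plain error for an unknown port is my own assumption. The point is just the shape of the idea: the server picks between an ordinary error and a fault, and the "kernel" acts on that choice.

```rust
use std::collections::HashMap;

/// Hypothetical task identifier (not a real Hubris type).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct TaskId(pub u8);

/// What the server can send back in this toy model.
#[derive(Debug, PartialEq, Eq)]
pub enum Reply {
    Ok,
    /// Recoverable error: the client keeps running and has to check it.
    Err(&'static str),
    /// Stand-in for REPLY_FAULT: the client is faulted instead of resumed.
    Fault(&'static str),
}

/// State of the client after the reply has been delivered.
#[derive(Debug, PartialEq, Eq)]
pub enum TaskState {
    Running,
    Faulted(&'static str),
}

/// Toy "IP stack" with a static port-to-owner table.
pub struct IpStack {
    owners: HashMap<u16, TaskId>,
}

impl IpStack {
    pub fn new(assignments: &[(u16, TaskId)]) -> Self {
        Self {
            owners: assignments.iter().copied().collect(),
        }
    }

    /// Handle a send request from `caller` on `port`.
    pub fn send(&self, caller: TaskId, port: u16) -> Reply {
        match self.owners.get(&port) {
            // The caller owns the port: normal success.
            Some(owner) if *owner == caller => Reply::Ok,
            // Someone else owns it: treated as a programming error, so fault.
            Some(_) => Reply::Fault("port owned by another task"),
            // Unknown port: assumed here to be an ordinary, recoverable error.
            None => Reply::Err("no such port"),
        }
    }
}

/// Toy "kernel" step: deliver the reply and decide the client's fate.
pub fn deliver(reply: Reply) -> TaskState {
    match reply {
        Reply::Ok | Reply::Err(_) => TaskState::Running,
        Reply::Fault(why) => TaskState::Faulted(why),
    }
}
```

In this model, a task that sends on its own port keeps running, while a task that touches another task's port ends up `Faulted`. That also shows where the worry upthread comes from: whether a given mistake is an `Err` or a `Fault` is decided entirely by whoever wrote the server.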