In general, verifying programs is extremely hard, not just because of rice's theorem but because it's so easy to miss a spot, especially for non-trivial bytecode languages like lua's. wasm has no concepts of for loops for example.
It's strange that after upstream has given up on the problem as it was too hard, factorio devs have chosen to try to fix the verifier/write their own (not sure which of the two they did).
Minetest's loadstring function forbids bytecode entirely: https://github.com/minetest/minetest/blob/9a1501ae89ffe79c38...
I wonder why factorio mods need the ability to execute raw lua bytecode. If they don't have it, there would be no need for a verifier.
It's quite dangerous in the first place to execute lua code downloaded over the network. JS execution environments have gone through decades of cycles of discoveries of exploits and fixes. Lua gets those as well but on a smaller scale, and with less staffing to improve security. The main protection is I guess that there is fewer people running malicious game servers.
Factorio disabled bytecode loading in response to this. Bytecode did allow for some cool stuff like writing mods in a preprocessor language that spits out Lua bytecode, but ultimately the security issues were more important to address.
Almost all of the debug library was made unavailable to mods as well, for similar security reasons.
Citation?
Factorio 1.1.101 (which the blog post says included the fix) does not list any changes regarding the disabling of bytecode or restricting the debug library. This would have been notable news, even without admitting the security risk. Factorio 1.1.107 does mention disabling the debug library, but it doesn’t seem this article had anything to do with that.
I work on the game. The debug library was disabled for other security holes that were brought to our attention, so it wouldn‘t be related to this, but I thought it was interesting to mention.
I believe the change was not mentioned in the changelog as an attempt at 'security through obscurity', trying to avoid people getting any ideas before the update is wide-spread. Not sure that helps any, but still.
Sorry, but thats just a perfect example why 'security through obscurity' is wrong. I have zero idea about security risks, but if fix does not mentioned anywhere, then for people that use previous version there no rush to upgrade.
I suspect the overwhelming majority of Factorio players are using Steam, which auto updates.
Due to the need for perfect synchronization all users need to be using the exact same version. Mods can also break between versions. It is therefore very common for public servers to stick on one version for extended periods of time. It is common for people to use the Steam "betas" functionality to pick an exact version or download an exact version from the Factorio website.
I would say that servers only tend to update when large features are released. So announcing a security vulnerability would likely push some servers to update.
Without metrics of some kind from Wube I guess we aren't likely to know for sure, but I doubt very much it is common to run old versions of the game on Steam. I bet you that most people are simply running on the latest version at all times. That solves the MP issue, and plenty of mods don't need to be updated for each game version.
Factorio is special though, because it actively uses the beta version functionality in Steam to not only provide betas but also older stable versions. This allows the devs to move fast and break things.
I know I've held back my copy of Factorio due to some concern over changes in newer versions, preferring to letting the dust settle before upgrading to the latest stable version.
I don‘t disagree.
Arguments either way. Generic "security vulnerabilities addressed" in release notes is a nice balance.
Loading raw bytecode is known to be unsafe, and iirc that is mentioned in lua_load/luaL_load* documentation.
A preprocessor could spit out Lua code with the same effect and less complexity. Really interesting why and how these decision were made.
For what it’s worth, Metalua also generated PUC-Lua bytecode directly instead of source code, making it incompatible with LuaJIT (which might have been part of the reason why it died).
Factorio has stuff like this:
https://mods.factorio.com/mod/Moon_Logic
Besides, it's quite limiting to create software that can't just execute in a Turing complete environment.
Anyway, we really need interpreters that include a strong capability system.
“If I ran the business” (TM), I would just put it in Factorio settings as a toggle switch called “Reduced Security Mode - Allow Lua Bytecode.” By default, it’s turned off. People who really want those mods can turn it on, as long as they are informed (UAC prompt style) that they better trust the mod authors. I’d also add an API for mod authors to detect if bytecode access is enabled; so they can make their mods compatible with either environment.
Or maybe, down the road, Factorio could enable mods with greater privileges, as long as they are open source, and do an App Store-style review process with the community. Approved mods get not just bytecode, but perhaps even some of the typically forbidden modules like filesystem access. Unapproved mods using those enhanced privileges won’t run without a startup flag.
This works until a popular mod requires it (for legit reasons) then enabling the option becomes the de-facto standard for people who install mods.
It's possible to make it so you enable it on a per mod basis, like app permissions
When you're talking about security, adding a bunch of config flags for users is never a good idea. Most users aren't going to understand what it does, and like others mentioned, there's too many reasons to turn it on.
Because that's never ever been a total usability disaster that just encourages people to enable every one because they don't feel like fighting it. It's also not effective, given how tightly tied mods are to the core API.
I see no problem. It's those people's choice to do so, and they accept the risk.
OP only refers to bytecode. There's nothing wrong with executing Lua when provided to the VM via source code. The only reason to allow the VM to load bytecode directly is 1) a very minor improvement in loading time, as the interpreter then doesn't have to parse Lua code into bytecode itself 2) allowing obfuscation of logic running within Factorio. Both seem rather irrelevant, so I'm not sure why they allow loading bytecode directly.
I don't think Rice is relevant at all. I guess Rice is a useful first screen. Do you believe you can "just" Decide this correctly? If so, Henry Rice got his PhD half a century ago for proving you can't do that, stop.
But assuming you've made your peace with accepting only a subset of the inputs that would actually meet your requirements, Rice is done. And you're right - now instead of an impossible task you've just got an extremely hard task. This means when you fail (which you will) at least nobody will tell you it was impossible, if that helps?
Factorio should not have done this.
This idea of accepting some kind of subset is exactly what JVM does. There is a set of rules (IIRC 29 of them) that the JVM bytecode have to follow to be accepted, the one important rule is “stack entries should always be used as a same type”, the rest of the rules are there so that this can be statically checked.
As I said, because of Rice accepting a subset is the only thing which could work. But it only could work, it probably won't because now you've gone from "Impossible" to merely "Incredibly hard" and that's not as a big a change as you'd hope.
Perhaps a dumb question then. Java has famously had a bytecode verifier for decades. Is it the case that:
a) bytecode verification is fundamentally easier in statically typed languages
or
b) it's just as hard for Java, but Java has had decades to work on it and it's still taken a long time to fix all the bugs/security issues.
c) Java‘s VM uses a different architecture, and the bytecode is also designed differently.
Eventually every game developer learns the hard way that they must remove the bytecode ability from lua's loadstring() function.
E.g. here's a 12 year old blogpost on the topic from the ROBLOX developers: https://archive.is/oXPyM
To be honest, it would probably be better off disabled by default. Its legitimate uses are pretty niche.
Yep, its place is in luaconf.h really.
That doesn't apply here. By the broad definition of "syntax" that Rice's theorem takes, the things you want to verify on the bytecode are syntax.