WASM by example

Another interesting use case for WASM is making cross-language libraries. Write a library in one language, compile to WASM, and import into any other language.

I'm not that familiar with WASM, but isn't that pretty damn awesome? I feel like I must be missing something, because that seems like the bigger deal, yet it seems like an uncommon use case.

Your post got me wondering, what advantage might it provide over an FFI? Does WASM ABI define higher level common primitives than C?

If it ends up becoming more similar to Lua then a big advantage is that the WASM code won't randomly read your hard drive and send your bitcoins to North Korea unless you explicitly gave the WASM code disk/network permissions.

Lua allows full control over which APIs you want to expose to a script you embed in your application. With some effort you can even expose only a constrained version of the `os` module, for example one which only lets you access a few resources. Why do you believe WASM can do better here? In fact, as far as I know, there's nothing in WASM that lets you sandbox it yet once you've given it WASI access (unless you're talking about host-specific features, which are NOT part of the WASM spec itself).

That depends entirely on the runtime, and its WASI implementation.

wazero [1], which I'm most familiar with, allows you to decide in a relatively fine-grained way what capabilities your WASI module will have: command line arguments, environment variables, stdin/out/err, monotonic/wall clock, sleeping, even yielding CPU… Maybe more importantly, filesystem access can be fully emulated, or sandboxed to a specific folder, or have some directories mounted read-only, etc; it's very much up to you.

I've used it to wrap command line utilities, and package them as Go libraries.

For one example, dcraw [2]. WASM makes this memory safe, and I can sandbox it to access only the single file I want it to process (which can be a memory buffer, or something in blob storage, if I want it to).

Notice in [3] how you provide an io.ReadSeeker which could be anything from a file, a buffer in memory, or an HTTP resource. The spaghetti C that dcraw is made of won't be able to access any other file, bring your server down, etc.

1:https://wazero.io/

2:https://dechifro.org/dcraw/

3:https://pkg.go.dev/github.com/ncruces/rethinkraw/pkg/dcraw

Wazero looks super cool. I saw somewhere that programs can be run with a timeout, which sounds great for sandboxing. The program input is just a slice of bytes [1], so an interesting use case would be to use something like Nats [2] to distribute programs to different servers. Super simple distributed computing!

1:https://github.com/tetratelabs/wazero/blob/main/examples/bas...

2:https://natsbyexample.com/examples/messaging/pub-sub/go

Yes: if so configured, wazero respects context cancelation (including, but not limited to, timeouts).

This has a slight toll on performance: a call back from WASM AOT-compiled-assembly into Go is introduced regularly (on every backwards jump?) to give the Go runtime the opportunity to yield the goroutine and update the context (and break infinite loops), even when GOMAXPROCS=1.
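A rough sketch of the general idea, polling for cancellation on backward jumps. This is a toy interpreter with an instruction set invented for the example (`dec`, `jnz`, `halt`), not wazero's actual implementation; the names `runToy` and `isCancelled` are hypothetical.

```typescript
// Toy instruction set, invented for this sketch (it is not WASM bytecode).
type Instr =
  | { op: "dec" } // counter -= 1
  | { op: "jnz"; target: number } // jump to target when counter != 0
  | { op: "halt" };

type Outcome = "done" | "cancelled";

function runToy(program: Instr[], counter: number, isCancelled: () => boolean): Outcome {
  let pc = 0;
  for (;;) {
    const instr = program[pc];
    if (instr === undefined || instr.op === "halt") return "done";
    if (instr.op === "dec") {
      counter -= 1;
      pc += 1;
    } else if (counter !== 0) {
      // A backward jump is the only way a loop can spin forever, so poll here.
      if (instr.target <= pc && isCancelled()) return "cancelled";
      pc = instr.target;
    } else {
      pc += 1;
    }
  }
}

// An infinite loop that the host gives up on after 1000 polls.
let polls = 0;
const spinForever: Instr[] = [{ op: "jnz", target: 0 }];
const spinOutcome = runToy(spinForever, 1, () => ++polls > 1000);

// A well-behaved countdown that finishes on its own.
const countdown: Instr[] = [{ op: "dec" }, { op: "jnz", target: 0 }, { op: "halt" }];
const countdownOutcome = runToy(countdown, 3, () => false);

console.log(spinOutcome, polls, countdownOutcome);
```

Straight-line code never pays for the check; only loops do, which is where the performance toll comes from.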

Coordinating with the Go scheduler might be an area where there's some room for improvement, in fact.

Yeah lua is a weird example because it’s actually amazing at this. Lua gives you a massive amount of control over what scripts can access.

It is much safer than pulling in an opaque C library that works via ffi. Eg a nodejs native module. Those are written in C and can indeed sell your data to North Korea. (Just like any other package in npm.)

I’m excited by the idea of being able to depend on 3rd party code without it having access to my entire OS.

I had to look up FFI (Foreign Function Interface). I'm not sure if WASM is better. I'm aware there are language bindings (maybe synonymous or overlapping with FFI?) but I'm not that familiar with them either.

I wondered if perhaps this WASM use case for a cross-language library was already just as possible and ergonomic using language bindings and maybe that's why this use case doesn't seem like a big deal to people. It does seem possible that the allure of running in the browser might prompt deeper support for WASM compilation than language bindings. The WASM case is also a many-to-one relationship (all languages to WASM) whereas language bindings are a many-to-many relationship (all languages to all languages), so it would take a lot more effort for the same level of support.

I wondered if perhaps this WASM use case for a cross-language library was already just as possible and ergonomic using language bindings and maybe that's why this use case doesn't seem like a big deal to people.

Yeah that’s the reason. You don’t notice it a lot of the time, but FFIs are everywhere already. The most common foreign function interface is basically the ability to call C code, or have functions made available to C code. C is used because everyone knows it and it’s simple. And most languages either compile to native code (eg rust) - which makes linking to C code easy. Or the runtime is implemented in C or C++ (eg V8, Ruby). In languages like that, the standard library is already basically implemented via a FFI to C/C++ code.

I’ve got an iOS app I’m working on that’s half rust and half swift, with a touch of C in the middle. The bindings work great - the whole thing links together into one binary, even with link time optimizations. But the glue code is gross, and when I want to fiddle with the rust to Swift API I need to change my code in about 4 different places.

Most FFIs are a one to many relationship in that if you write a clean C API, you can probably write bindings in every language. But you don’t actually want to call naked C code from Ruby or Javascript. Good bindings will make you forget everything is done via ffi. Eg numpy. I haven’t looked at the wasm component model proposal - I assume it’s trying to make this process cleaner, which sounds lovely.

I maintain the nodejs bindings for foundationdb. Foundationdb bindings are all done via ffi linking to their C code. And the API is complex, using promises and things. I find it really interesting browsing their official bindings to go, Java, Python and Ruby. Same bindings. Same wrapped api. Same team of authors. Just different languages. And that’s enough to make the wrapper wildly different in every language. From memory the Java ffi wrapper is 4x as much code as it is in Ruby.

https://github.com/apple/foundationdb/tree/main/bindings

You don’t notice it a lot of the time, but FFIs are everywhere already

That’s true, but I sort of find it a negative in a platform if it relies too much on C libs, unless absolutely necessary. FFI is the prime reason why a given piece of software fails to work on another OS, e.g. if your python/js project doesn’t build elsewhere, 90% of the time the trouble is a C lib.

There are various reasons why this is not the case with JVM languages (it historically didn’t have great FFI options, and the Java:C speed ratio is much smaller than Python:C, so it didn’t make that much sense), but that platform has grown to be almost 100% pure JVM byte code. The only place where they use native parts is stuff like OpenGL, where you pretty much have to. I think this makes for a more ideal starting point.

Yeah - it's a problem with nodejs as well. Native C libraries in npm regularly break when you change OS, and given that most javascript packages pull in a small country's worth of dependencies it's very common to have some native code in there somewhere.

I really hope most native javascript modules get rewritten / repackaged into wasm. As well as solving any cross-OS compatibility problems, that'll also make them work transparently in the browser.

Well, for starters, wasm is sandboxed. So, if a wasm library needs an import (eg: read/write filesystem), it has to be explicitly provided. It cannot do anything except math by default. This gives the host a high degree of control.
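A minimal sketch of what "has to be explicitly provided" looks like from a JS/TS host. The module bytes below are hand-assembled for this example and the names `host.report` and `run` are made up; it assumes a runtime with the standard `WebAssembly` global (a browser or Node.js).

```typescript
// Hypothetical module: imports host.report(i32) and exports run(a, b),
// which computes a + b and hands the result to the imported function.
const sandboxedBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic "\0asm", version 1
  // type section: type 0 = (i32) -> (), type 1 = (i32, i32) -> ()
  0x01, 0x0a, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x02, 0x7f, 0x7f, 0x00,
  // import section: "host"."report" is a function of type 0
  0x02, 0x0f, 0x01, 0x04, 0x68, 0x6f, 0x73, 0x74,
  0x06, 0x72, 0x65, 0x70, 0x6f, 0x72, 0x74, 0x00, 0x00,
  // function section: one local function of type 1
  0x03, 0x02, 0x01, 0x01,
  // export section: "run" = function 1 (function 0 is the import)
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,
  // code section: local.get 0, local.get 1, i32.add, call 0, end
  0x0a, 0x0b, 0x01, 0x09, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x10, 0x00, 0x0b,
]);

const sandboxedModule = new WebAssembly.Module(sandboxedBytes);

// The host can inspect what the module asks for before granting anything.
const requested = WebAssembly.Module.imports(sandboxedModule);

// Without the import, instantiation fails: the module cannot reach the host.
let refused = false;
try {
  new WebAssembly.Instance(sandboxedModule, {});
} catch {
  refused = true;
}

// With the import, the module can do exactly what the host function allows.
const reported: number[] = [];
const sandboxedInstance = new WebAssembly.Instance(sandboxedModule, {
  host: { report: (value: number) => { reported.push(value); } },
});
const run = sandboxedInstance.exports.run as unknown as (a: number, b: number) => void;
run(40, 2);

console.log(requested, refused, reported);
```

The only channel out of the module is the `report` function the host chose to pass in; a filesystem or network capability would have to be handed over the same way.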

different wasm libraries can have separate memories. So, if a library X depends on a jpeg decoder library, the host has to provide that as import. The jpeg decoder library might export a fn called "decode" which takes an array of bytes of a jpeg, and returns an Image struct with rgba pixels. This allows the "memory" of the two libraries to be separate. the jpeg decoder cannot "write" to the X's memory, cleanly separating the two of them.

The Wasm component model recognizes higher level objects called resources, which can be shared between libraries (components). This allows X to simply pass a file descriptor to the jpeg decode fn, and the sandbox model makes sure that the jpeg library can read from that file only and the rest of the filesystem is still off limits. wasm is already getting support for a garbage collector. So, a high level language can just rely on wasm's GC and avoid shipping its entire runtime. Or the host can guarantee that all the imports a language needs will be provided, so that the language libraries can shed as much runtime weight as possible.

Finally, the Component model is designed from the ground up to be modular, which allows imports/exports/namespaces and other such modern features. C.. well, only has headers and usually brings a lot of backwards compatibility baggage. The tooling (eg: wit-bindgen) will provide higher level support like generating code for different language bindings by taking a wit (header for wasm) declaration file. If you are familiar with rust, then https://github.com/bytecodealliance/cargo-component#getting-... shows how easy it is to safely create (or bind to) wasm bindings.

That's currently impossible. It requires the component model[1] to be figured out and that's taking a ridiculous amount of time because, as anyone who has tried this before (and many have) knows, this is really hard (languages have different semantics, different rules, different representations etc.).

If they do manage to get it working somehow, it will indeed be very exciting... but I've been waiting for this for several years :D (I was misled to believe this sort of modular thing was possible, but that's false unless you generate lots and lots of intermediate JS to glue different modules - and then all "communication" goes through JS, nothing goes directly from one module to another - which completely defeats any possible performance advantage over just doing pure JS) so don't hold your breath.

[1]https://github.com/WebAssembly/component-model

The component model is already shipping in Wasmtime, and will be stable for use in Node.js and in browsers via jco (https://github.com/bytecodealliance/jco) soon. WASI Preview 2 will be done in December or January, giving component model users a stable set of interfaces to use for scheduling, streams, and higher level functionality like stdio, filesystem, sockets, and http on an opt-in basis. You should look at wit-bindgen (https://github.com/bytecodealliance/wit-bindgen) to see some of the languages currently supported, and more that will be mature enough to use very soon (https://github.com/bytecodealliance/componentize-py)

Right now jco will automatically generate the JS glue code which implements a Component Model runtime on top of the JS engine's existing WebAssembly implementation. So, yes, Components are a composition of Wasm Modules and JS code is handling passing values from one module/instance to another. You still get the performance benefits of running computation in Wasm.

One day further down the standardization road, we would like to see Web engines ship a native implementation of the Component Model, which might be able to make certain optimizations that the JS implementation cannot. Until then you can consider jco a polyfill for a native implementation, and it still gives you the power to compose isolated programs written in many languages and run them in many different contexts, including the Web.

(Disclosure: I am co-chair of WASI, Wasmtime maintainer, implemented many parts of WASI/CM)

Hm, ok so it seems there has finally been progress since I stopped looking (around a year ago, I was very actively following developments for a couple of years before that but got tired). I will check wit-bindgen and jco and see if I can finally make my little compiler emit code that can be called from other languages and vice versa without myself generating any JS glue code.

That's currently impossible. It requires the component model[1] to be figured out...

Not really if you use C-APIs with 'primitive-type args' at the language boundaries, which is the same as in any other mixed-language scenario. Some languages make it harder to interact with C APIs than others, but that's a problem that needs to be fixed in those languages.

This. And as long as you provide memory allocation primitives, you can pass arbitrarily complex arguments in linear memory. It's just a matter of “ABI.”
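To make the "(pointer, length) in linear memory" convention concrete, here is a small sketch. The module is hand-assembled for this example and exports a hypothetical `sum(ptr, len)` plus its memory; for brevity the host writes at a fixed offset, whereas a real library would export an allocator and the host would ask it for a pointer first.

```typescript
// Hypothetical module exporting "memory" (1 page) and sum(ptr, len) -> i32,
// which adds up `len` bytes starting at `ptr` in its own linear memory.
const sumBytesModule = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic "\0asm", version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x05, 0x03, 0x01, 0x00, 0x01, // memory section: one memory, min 1 page
  // export section: "memory" = memory 0, "sum" = function 0
  0x07, 0x10, 0x02,
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00,
  0x03, 0x73, 0x75, 0x6d, 0x00, 0x00,
  // code section: one body of 43 bytes with one extra i32 local (the accumulator)
  0x0a, 0x2d, 0x01, 0x2b, 0x01, 0x01, 0x7f,
  0x02, 0x40, 0x03, 0x40, // block, loop
  0x20, 0x01, 0x45, 0x0d, 0x01, // if len == 0, break out of the block
  0x20, 0x02, 0x20, 0x00, 0x2d, 0x00, 0x00, 0x6a, 0x21, 0x02, // acc += load8_u(ptr)
  0x20, 0x00, 0x41, 0x01, 0x6a, 0x21, 0x00, // ptr += 1
  0x20, 0x01, 0x41, 0x01, 0x6b, 0x21, 0x01, // len -= 1
  0x0c, 0x00, // continue the loop
  0x0b, 0x0b, // end loop, end block
  0x20, 0x02, 0x0b, // return acc
]);

const sumInstance = new WebAssembly.Instance(new WebAssembly.Module(sumBytesModule), {});
const guestMemory = sumInstance.exports.memory as unknown as WebAssembly.Memory;
const sumBytes = sumInstance.exports.sum as unknown as (ptr: number, len: number) => number;

// The host copies the argument into the guest's memory, then passes (ptr, len).
const payload = new Uint8Array([1, 2, 3, 250]);
const ptr = 0; // assumption: fixed offset instead of calling a guest allocator
new Uint8Array(guestMemory.buffer).set(payload, ptr);
const total = sumBytes(ptr, payload.length);

console.log(total); // 256
```

Only two integers cross the boundary; anything more structured (strings, structs, arrays of structs) is a matter of agreeing on the byte layout, which is the "ABI" part.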

This isn't really true at a base level. You can do things like JS/Python interop now without this, eg: https://til.simonwillison.net/deno/pyodide-sandbox

It's possible all right, I've already done it. Imported Tesseract into Go with WASM.

https://news.ycombinator.com/item?id=38146154

It's not trivial because you have to figure out intricacies of the language and whatever compiled it into WASM, and Emscripten compiled WASM does expect some JS glue code. But WASM with WASI doesn't inherently require JS. And since Emscripten's JS glue is called via WASM host imports, you can implement them in whatever the host language is.

Yes, and go upvote this .NET feature so we can make portable .NET WASM libraries: https://github.com/dotnet/runtime/issues/86162

.NET WASM performance is actually very impressive, especially with AOT enabled.

You're in luck because this is already possible: https://github.com/extism/dotnet-pdk

It's awesome, but also funny that we're using WASM to reinvent COM 30 years later.

And that's not a knock on WASM. It's just that COM was pretty neat for its time, even if it was sometimes painful in practice.

But I find it pretty nifty that I can still take a COM library written in one language and then import and use it in C++, C#, Python, Ruby, JavaScript, or Racket (and plenty of others - those are just the ones I've used COM libraries with.)

COM is pretty much present tense, it being the main Windows ABI since Windows Vista adopted Longhorn ideas into COM/C++ instead of .NET.

The tooling could have been improved, but I guess WinDev really loves their baroque tools.

Most of the 3rd-party libraries I use, I use for their side effects.

Qt opens GUI windows and sockets and such. libusb touches USB devices. OpenCV can capture video frames from a camera and sometimes use GPU acceleration. Sqlite manipulates files on disk.

So unfortunately with wasm in a sandbox, the easiest libraries to work with are only pure functions. ffmpeg would work, but HW encoding or decoding would be difficult, and I need to either enable some file system access in the wasm runtime or feed it file chunks on demand.

This! Who really needs to tie together business logic components written in different languages?!

WASM is the new DLL. Will we have to deal with WASM-Hell eventually? But that's not catchy enough. Maybe we should call circular dependency and incompatible versions "WASM-WTF"

Yeah the component model the bytecode alliance is pushing defines a canonical ABI and codegen tools to make this easier (also separating memory from these components so a bug in some random C library doesn’t have a blast radius outside the processing it does in its library boundary)

As proven by the JVM, CLR, TIMI and many others that predated WASM, that is much harder in practice than people think.

Turns out there always needs to exist a common subset that most languages are capable of understanding.

Extism handles this really well across 16 or so different languages - and you don’t need to write a whole IDL / schema.

https://github.com/extism/extism

It’s a general purpose framework for building with WebAssembly and sharing code across languages is a great way to put it to work.

VAX/VMS used to do that.

Also anything that compiles to a C shared library does that.

With supply chain attacks becoming more of an issue, the strong sandboxing of library permissions would be a huge benefit also. A thought on how this might be workable would be to have a wasm registry that, when pushed to, would auto-build packages for each ecosystem, then push upstream to npm/maven/etc.

Of course the "component model" or some agreed upon structure of data shared among modules and the mappings to each language is the missing piece.

As mentioned by others, it is not particularly new in any way.

I find GraalVM’s polyglot abilities far more impressive, where the VM can actually optimize across your JS code calling into Python calling into C, while providing more granular sandboxing abilities as well (you can run certain so-called isolates with, say, file access only, some others without even that, all under the same runtime/setup).

We sort of do this with WASM for just in time pipelines. We write pipeline rules in WASM...for things like detecting/masking fields...then we import and execute those wasm rules in a variety of language SDKs. As a sibling comment indicates, it's pretty difficult getting data in and out, but it's doable. See here for an example: https://github.com/streamdal/node-sdk/blob/main/src/internal.... We do this sort of thing in node, go & python and are adding other languages.

This is exactly one of the use-cases for the Scale Framework[1]. (Full disclosure: I work on this project)

You can absolutely take a library from one language and run it in another. In a sense, you could kind of see this ability as drastically reducing the need for rewriting sdks, middlewares, etc. across languages, as you could just reuse code from one language across many others. We played around with some fun ideas here, like taking a Rust regex library and using it in a Golang program via a scale function plugin (compiled to Wasm), to the effect of the performance being ~4x faster than native code that uses Go's regex library[2].

[1]https://github.com/loopholelabs/scale

[2]https://twitter.com/confusedqubit/status/1628409282462093312

Great guide to getting started. I wonder if one day WASM will replace Javascript in the browser.

That's the dream. Render to WebGL/Canvas. Bypass CSS + HTML.

Make web actually open to other languages by putting them on a very common ground.

Nah, that is a nightmare. Don't forget your WebGL/Canvas based UI has to be responsive to screens ranging from a 3" phone up to a 8K ultrawide monitor. Nothing beats html/css for achieving that. Also, good luck with ADA compliance when your page is a pixel drawing on canvas.

Nothing beats html/css for achieving that

Many people would argue that the native iOS and Android UI toolkits beat HTML/css.

They don’t run on the web, though.

ChromeOS you mean.

My understanding is that devs are "supposed" to use WASM as a complement to a11y-friendly tech.

From https://webassembly.org/docs/faq/#is-webassembly-trying-to-r...

HTML/CSS/JavaScript UI around a main WASM-controlled center canvas, allowing developers to build accessible experiences

I can easily see that not happening, obviously (plenty of non-WASM sites with poor a11y exist already). If anyone has any useful articles regarding accessibility and WASM please share.

Well, there is Flutter, which is no small project, that renders to the web (always promising to go dom-based) — and it’s still just a canvas with zero accessibility. So if google doesn’t have the resources/priorities for that, I doubt that 2-person rust library will.

Why must all apps support all scenarios? Is the latest triple-A big-budget video game ADA compliant? Does it run on a 3" phone?

It is the revenge of plugins, after we got a 10 year delay with WebAssembly re-inventing the stack.

I would rather not have JS replaced. It would really hurt the ability of users to poke through the code and tweak stuff as they like.

There isn't anything actually wrong with JavaScript in 2023, either. The idea that it needs to be replaced stems from countless failed attempts to shove a bunch of crap into the client with a bunch of frameworks and without a shred of actual engineering discipline. Trust me when I say that, if web apps become commonly written mainly in other "scalable" or "type-safe" languages, your average software developers will find a way to screw that up as much as they have with today's web, if not worse than that.

WASM is cool, but I don't see any reason to get rid of JS. There's value to having a lingua franca of JIT interpreted code with a standard UI toolkit, and I think those who point out how "insufficient" the DOM is and how "slow" and "unsafe" JavaScript is are simply wrong. They both do their jobs exceedingly well when idiotic and theological ideas are not simply thrown at them by software developers obsessed by convenience over soundness of design.

There isn't anything actually wrong with JavaScript in 2023, either.

The semantics of the language can be quite complex and it took decades for browsers to agree on them for most use cases. WASM arose out of a failure of browsers to figure out ways to deprecate this—mostly unnecessary—complexity.

The idea that it needs to be replaced stems from countless failed attempts to shove a bunch of crap into the client with a bunch of frameworks and without a shred of actual engineering discipline.

The same can be said about the implementation of javascript in browsers as well.

We're stuck with it regardless, but our reliance on javascript and its myriad interactions with html and css functions much the same way for large browser vendors as regulatory capture does for large corporations at the state level.

The semantics of the language can be quite complex and it took decades for browsers to agree on them for most use cases.

That's ancient history. JavaScript has its quirks, but it's not a difficult language to learn or use. Frankly, I don't know where you get this idea that the semantics of the language are hard. In contrast to what? Maybe if you shared some examples I could understand what you're talking about. JavaScript was challenging in decades past not because it was complex but because it was way too simple. Using it on a webpage to do more than very rudimentary things with the browser API meant doing a lot of whacky stuff and using libraries for operations we take for granted today.

WASM arose out of a failure of browsers to figure out ways to deprecate this—mostly unnecessary—complexity.

As someone else mentioned, no it didn't. WASM came from the same desire as Java applets and browser plugins for Shockwave and Flash, which was to develop applications that run in the browser using entirely different languages and authoring tools.

The same can be said about the implementation of javascript in browsers as well.

No idea what you are basing this on. Nobody (as in the vast majority) thinks that modern web development is a failure because JavaScript the language is too complex. Everyone is complaining about web development because of all the tools that have been added between the keyboard and the code running in the client, and said tools failing to live up to their promise while encouraging patterns that commonly backfire.

our reliance on javascript and its myriad interactions with html and css functions

What does that even mean? JavaScript only has as much interaction with the DOM as is demanded of it. If there's a myriad of ways that JavaScript can interact with the DOM, well, that's by design... how else would you have it? CSS functions have nothing to do with JavaScript, if that's what you're actually referring to. At most, JavaScript can listen for some events that are emitted by things like CSS animations.

WASM arose out of a failure of browsers to figure out ways to deprecate this—mostly unnecessary—complexity.

No, WASM arose out of the work done by Alon Zakai on asm.js at Mozilla which was in good part motivated to show that the web didn't need Google's PNaCl.

That seems to remove the advantage of using the browser in the first place: leveraging native controls and integrations, giving the user control over how things are rendered, and accessibility concerns.

Of course, I can see how this is mostly irrelevant for some applications like games, but that's still rather niche compared to, you know, useful and wide-spread apps that people actually want to use.

You forgot about the most important advantages. True write once run anywhere cross platform support, and one click zero install distribution with no gatekeepers. The latter in particular is key and impossible to replicate any other way because device gatekeepers like Apple or Nintendo will never allow any other app platform to bypass their distribution monopoly.

True write once run anywhere cross platform support, and one click zero install distribution with no gatekeepers.

We had (have) that, and the apps were miserable to use for the same reasons enumerated above. The place where it has come closest to succeeding? Video games.

Unfortunately, platforms are too diverse to target as a generic platform without making major sacrifices as to either consistency or usability.

Make web actually open to other languages

And closed for humans? I am lucky enough that I haven’t yet met an unblockable canvas ad, but they are coming. You also won’t get any support from the browser: no text select, no right click, no accessibility whatsoever, no reader mode, probably not even proper mobile support, because it just happened to react to clicks, and not touches.

Unless it is some specific simulation/game/ultra-complex gui (think photoshop), I explicitly don’t want to see any canvas rendering, even though I really dislike JS as a language/platform.

Maybe for web games that's the dream. But for user interfaces that will mean dodgy homegrown text rendering, text selection, caret movement, etc on a site-by-site (or at least framework-by-framework) basis.

I highly doubt it. Reading this I honestly struggle to understand the value. In the article it says it is not faster than JavaScript except for intense computations. So, I could understand the existence of WASM libraries for certain computations that are actually faster, but for UI stuff and simple calculations (most of what frontend needs to do) you might as well stick to JavaScript as it's WAAAY faster in dev time and equal in performance (or at least equal enough that the difference is imperceptible to the end user), and in some cases WASM might even be slower since you have to take the time to serialize the data to send it between JS and WASM.

Bit of a chicken-and-egg scenario here. Frontends focus on simple, cheap UI's (<form>'s) because that's what they're good at. That doesn't mean they should be limited to just that.

Where are our cloud-based DAWs? Where's CAD?

There's a lot of OS-based tooling that never made the leap to web-based because of how limiting the web environment is and WASM is a great step towards solving that.

Additionally, the sandboxing is suuuuper useful in today's world. Docker either wouldn't exist, or wouldn't have the clout it has today, if WASM had been fully fleshed out when Docker took off (https://thenewstack.io/when-webassembly-replaces-docker/)

Also, it's nice being able to opt-out of garbage collection by compiling a non-GC'ed language to WASM. Latency-sensitive applications (real-time gaming) don't want to aggressively manage object lifetimes in JavaScript. Like, yeah, you could, in theory, avoid excessively allocating strings and strongly prefer mutating existing objects vs immutability, but those decisions lead to more fragile code and ultimately limit the depth/quality of tooling you can create.

JavaScript is pretty powerful these days and not really the limiting factor for most use cases.

Tinker CAD feels great and it is built with JS/HTML with the 3D view done in WebGL/Canvas: https://www.tinkercad.com/

I could maybe see it for gaming at some point, but it is a long way to go. All screen drawing (canvas, DOM manipulation, etc) is still going through JavaScript. So none of that is any faster.

Hell, one could write a full 3D FPS game with online interactivity almost a decade ago with WebGL (someone at my uni had such a thesis work). It definitely wouldn’t beat some ray-traced AAA game, but it is surely enough for most things.

Tinker CAD seems fine, but is quite clearly not the full package. Contrast this with the ability to compile CAD to WASM and drop it in. We will see a lot more of that going forward - legacy packages that have had a lot of work put into them becoming web-accessible.

It's not necessarily about being faster, but about predictability. You're correct in identifying gaming as one use case.

Personally, I am trying to build a basic 2D sim game (think RimWorld) that runs in the browser. I prototyped a demo using TypeScript. React isn't viable because the reconciler isn't intended for handling <canvas /> writes and becomes the perf bottleneck. After that, the GC becomes a bottleneck because of all the allocations.

So, my choices were to either write very special JavaScript that intentionally minimized allocations by pooling objects and favoring mutation over immutability, or choose a better language more suited for the task. JavaScript is absolute garbage if you care about micromanaging performance. You have no idea what you're going to get because V8 optimizations constantly change in subtle ways.
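For anyone who hasn't seen it, the "very special JavaScript" option looks roughly like the sketch below: a pool that recycles objects instead of allocating fresh ones every frame. The `Particle` shape and `ParticlePool` class are hypothetical, not code from my project.

```typescript
// Hypothetical entity type for the sketch.
interface Particle {
  x: number;
  y: number;
  alive: boolean;
}

class ParticlePool {
  private readonly free: Particle[] = [];
  public created = 0; // how many objects were actually allocated

  acquire(x: number, y: number): Particle {
    const recycled = this.free.pop();
    if (recycled !== undefined) {
      // Mutate in place instead of allocating: less work for the GC.
      recycled.x = x;
      recycled.y = y;
      recycled.alive = true;
      return recycled;
    }
    this.created += 1;
    return { x, y, alive: true };
  }

  release(p: Particle): void {
    p.alive = false;
    this.free.push(p);
  }
}

// Simulate 1000 frames that each need two short-lived particles.
const pool = new ParticlePool();
for (let frame = 0; frame < 1000; frame++) {
  const a = pool.acquire(frame, 0);
  const b = pool.acquire(0, frame);
  pool.release(a);
  pool.release(b);
}

console.log(pool.created); // 2, instead of 2000 fresh allocations
```

It works, but every caller has to remember to release, and nothing stops someone from holding on to a recycled object, which is the fragility mentioned above.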

I've since rewritten my approach in Rust + WASM to address those concerns.

It would be nice to be able to have a shared memory buffer that <canvas /> reads from to eliminate the need to call back into JS. I doubt we'll ever get there due to WASM sandboxing, but that has yet to be the limiting factor for me.

FWIW, others smarter than me have come to similar conclusions. https://maxbittker.com/making-sandspiel documents someone creating an app three times: in JS, in Lua, and then in Rust to get it back into the web. They aren't advocating for UI to be written in Rust, but Rust + WASM did solve a need for them, too.

Easy, by adopting the timesharing model, they will run server side with browser only showing the GUI.

CAD is already taken care of.

https://www.infoq.com/presentations/autocad-webassembly/

if WASM had been fully fleshed out when Docker took off

It was already there, as Java and .NET application servers, which Kubernetes + WASM containers are now re-inventing as something "revolutionary", now with YAML spaghetti instead of XML config files.

First we need access to dom without javascript

100%

An instruction set that is supported by all major browsers sounds enticing. I have tried the hello_world demo with Emscripten a couple years ago and was stumped that the generated page had multiple megabytes. In the first example in this page, I read

    This will output a pkg/ directory containing our wasm module, wrapped in a js object.

So I'm guessing that the result is the same. Why is it so? Hello world requires a print function, which I suppose needs a small subset of some implementation of libc. Why so much space? Why the need for a .js object? Shouldn't we be bypassing the JS engine?

You need (a little) JS to run Wasm in the same way you need (a little) HTML to run JS; it's a hosted platform. JS handles loading and invoking the wasm code, and because it's close to a pure instruction set there's very little you can do without calling JS APIs, which in turn requires support code to translate across the boundary.

The WASI project specifies wasm-native APIs (modelled on posix) for running locally without JS, so you could imagine something similar for the browser. But the complexity of the DOM is forbidding.

I've not tried Emscripten hello world for a while, but I imagine it depends on things like optimisation level, dead code elim etc. In general to compile C code you'll need a malloc, string support and so on as you say. You can make the wasm file tiny if you lean on JS strings, but that increases the amount of support code again. Languages other than C will have an easier time reusing parts of the JS runtime (like strings or GC).

Yeah. And hello world is (thankfully) much smaller now than it used to be. Bigger than you think if you use printf, which is a quite complex C function. But at a guess, 10kb or something including support files. There are some great guides and tooling around to help shrink wasm size. Eg for rust:

https://rustwasm.github.io/docs/book/reference/code-size.htm...

Hello world just needs to call console.log, so doesn't need libc. Here's an example that builds without libc / emscripten to produce a very small wasm hello world: https://github.com/nikki93/cxx-wasm-freestanding There's actually some other stuff in there right now but console.cc is the main thing -- it calls consoleLog, which is exposed to wasm from the js code at https://github.com/nikki93/cxx-wasm-freestanding/blob/master....

You do need some JS code that asks the browser to run the wasm blob. You can't eg. just have a script tag that refers to a wasm blob yet.

libc does help with things like having an allocator or string operations etc., or for using C libraries that use libc. And that's where emscripten becomes helpful.

Browser functionality like the console or making html elements is exposed through JS interfaces, and the wasm needs to call out to those. But they may be directly exposed to wasm later (or they may already be at this point in new / experimental browser features).

The hello world in this guide doesn't actually use console.log at all. It adds 2 numbers and sets the page content to the result. All it does is expose an add function from rust and call it from the javascript side.
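To see that add example with no toolchain in the way, here is a minimal sketch. It assumes Node.js or a browser with the standard `WebAssembly` JS API. The byte array is a hand-assembled module, not the output of the guide's Rust build, and the names `addModuleBytes`, `addInstance` and `add` are made up for illustration.

```typescript
// Hand-assembled WebAssembly module, equivalent to this text format:
//   (module
//     (func (export "add") (param i32 i32) (result i32)
//       local.get 0
//       local.get 1
//       i32.add))
const addModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function using type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // code: local.get 0, local.get 1, i32.add, end
]);

// The host (JS) compiles and instantiates the module; it declares no imports.
const addInstance = new WebAssembly.Instance(new WebAssembly.Module(addModuleBytes));
const add = addInstance.exports.add as (a: number, b: number) => number;

console.log(add(2, 3)); // prints 5
```

The whole module is 41 bytes, so whatever makes real builds large comes from the language runtime and glue code rather than from the format itself.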

You can get a simple WebGL2 WASM app down to a couple of kBytes, for instance this downloads 30 KBytes in Chrome:

https://floooh.github.io/sokol-html5/clear-sapp.html

(1.4 KB for the .html, 8.8 KB for the .js, 14.5 KB for the .wasm - and a whopping 5.5 KB for the 404 page returned by Github pages for the missing favicon - wtf...)

The .js file is basically glue code between the WASM code and the browser runtime environment.

Without the Emscripten "convenience runtime" you can also go smaller, but at a few dozen KBytes more or less it's pretty deep in diminishing returns territory.

The C library is provided via MUSL these days (in Emscripten). But there's a small 'syscall layer' which is implemented in Javascript (basically if you want to run in the browser and access web APIs, at some point you need to go through Javascript).

Can't speak to the size issue as I don't use emscripten, but I agree that most WASM output is waaay too large. Regarding JS, you need a small amount of JS to bootstrap your WASM program. In the browser, JS is the host environment and WASM is the guest. The host instantiates the WASM module and passes in capabilities that map to the module's declared imports.

https://developer.mozilla.org/en-US/docs/WebAssembly/Using_t...

To really understand WASM, you should try to write it by hand! That's right, it's possible, it even has a text format that all WASM runtimes let you run directly.

Check this out: https://blog.scottlogic.com/2018/04/26/webassembly-by-hand.h...

Once you know what WASM really does, it's obvious why you need JS (or whatever the host language is in your runtime, which could be anything) to provide anything related to IO and why there's zero chance you'll get DOM access as it currently stands... until they finally finish off specifying a whole lot of APIs (they did GC already which was pretty fundamental for that to work, but there's many more things needed for complete access to IO and DOM).

If you use a compiler like Rust, it's going to include WASI, which is a sort of POSIX for WASM (and currently completely underdefined, just try to find the spec and you'll see it's mostly being made up as implementers go), which is why you'll get ridiculous amounts of code in your WASM file. If you write it by hand, with the host providing some made-up function like `console_log` which you then import from WASM, then your WASM should be just a few bytes! Literally. I wrote a little compiler to do this as it's not hard to generate WASM bytecode, but to make anything useful is very painful due to the complete lack of specifications of how things should work (e.g. to print a string is actually very hard as you'll just be able to give JS or your host a pointer to linear memory, and then you need to write the actual bytes - not UTF or even ASCII characters in WASM itself - to linear memory, which the JS then has to decipher to be able to call its own `console.log()`)... so I am waiting (a few years by now) until this improves to continue.
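The pointer-into-linear-memory dance described above can be sketched roughly like this. It assumes the standard `WebAssembly` JS API plus `TextDecoder` (Node.js or a browser). The module is hand-assembled for illustration, and the import names `env.console_log` and `env.memory`, the `run` export, and all host-side variable names are made up, not any standard ABI.

```typescript
// Hand-assembled module, equivalent to this text format:
//   (module
//     (import "env" "console_log" (func $log (param i32 i32)))
//     (import "env" "memory" (memory 1))
//     (func (export "run") i32.const 0 i32.const 5 call $log)
//     (data (i32.const 0) "hello"))
const helloModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x09, 0x02, 0x60, 0x02, 0x7f, 0x7f, 0x00, 0x60, 0x00, 0x00, // types: (i32, i32) -> (), () -> ()
  0x02, 0x21, 0x02, // import section, two imports
  0x03, 0x65, 0x6e, 0x76, 0x0b, 0x63, 0x6f, 0x6e, 0x73, 0x6f, 0x6c, 0x65,
  0x5f, 0x6c, 0x6f, 0x67, 0x00, 0x00, // "env" "console_log": function of type 0
  0x03, 0x65, 0x6e, 0x76, 0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79,
  0x02, 0x00, 0x01, // "env" "memory": memory with at least 1 page
  0x03, 0x02, 0x01, 0x01, // function section: one function of type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01, // export "run" = function 1
  0x0a, 0x0a, 0x01, 0x08, 0x00, 0x41, 0x00, 0x41, 0x05, 0x10, 0x00, 0x0b, // code: i32.const 0, i32.const 5, call 0
  0x0b, 0x0b, 0x01, 0x00, 0x41, 0x00, 0x0b, 0x05, 0x68, 0x65, 0x6c, 0x6c, 0x6f, // data: "hello" at offset 0
]);

// The host owns the memory and hands it to the guest as an import.
const helloMemory = new WebAssembly.Memory({ initial: 1 }); // one 64 KiB page
const loggedLines: string[] = [];

const helloImports = {
  env: {
    memory: helloMemory,
    // The guest can only pass numbers: a byte offset and a byte length.
    // The host has to turn those raw bytes back into a string.
    console_log: (ptr: number, len: number): void => {
      const rawBytes = new Uint8Array(helloMemory.buffer, ptr, len);
      const text = new TextDecoder("utf-8").decode(rawBytes);
      loggedLines.push(text);
      console.log(text);
    },
  },
};

const helloInstance = new WebAssembly.Instance(
  new WebAssembly.Module(helloModuleBytes),
  helloImports,
);
(helloInstance.exports.run as () => void)(); // prints "hello"
```

Treating the bytes as UTF-8 is a convention chosen by this sketch; nothing in the module says what encoding the data segment uses, which is the missing-specification problem the comment is complaining about.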

Wasm will never take off. It is too hard to get started for it to be meaningful. Maybe for games or things like figma it will work but having to use rust means it will never ever be mainstream.

You can use Kotlin, C# or a bunch of other languages

Wasm will never take off.

Already has taken off.

having to use rust

You don't have to use Rust. There are wasm compilers for dozens of languages. C, C++, D, Java, C#, F#, Python, Go, Lua, PHP, etc.

Maybe a compiled-to-wasm Typescript lookalike is more your thing?

https://www.assemblyscript.org/

I think you're projecting a little. WASM is already in heavy use. It's obviously not intended for casual scripting. So if you find it too hard, you're not the target. But you're not representative of all developers in general.

Also, it's not intended, usually, for you to actually write WASM directly. It's a compilation target. You can import C++ libraries, Rust libraries, and so on, and use them from JavaScript. Or you can write your own C++ code and run it in a browser. You write WASM directly mostly when you write compilers. Have you written compilers? Well let me tell you it's definitely not simpler than WASM.

I'm under the impression you may think WASM is trying to be a replacement for JavaScript. It's not. They play together. In fact, you can think of WASM as a new JavaScript API, like we have WebGL or WebGPU. It's just that. WASM can't do anything by itself. It's just math in a block of memory. But... it's fast math.

I was wondering why my burst of middle-clicking links into tabs didn't actually open those pages.

       <a
            href="#"
            onclick="goToExample('examples&#x2F;hello-world', 'hello-world', 'assemblyscript', 'en-us')"
            >Hello World!</a
          >

Looks like the code is here if you want to send a PR: https://github.com/torch2424/wasm-by-example

Yes, this killed navigation and pressing back. I was on example 4 in rust, and changed it to c++ to look at the differences. Instead, it showed the intro page in c++ and pressing back does nothing, as the history is completely broken.

There was no easy way to jump back to example 4 in rust. I had to change it to rust and click through each example again.

Everyone: Please stop using ‘#’ as an href placeholder. Don’t make your ‘<a>’ elements rely on events. If you want to replace their events with others, go for it. But please make them work at the base level.

That simple add example, huh... makes sense but idk why it's surprising to me. The imported non-js language is interactive (can take parameters at runtime). neat, thought it was just a static thing you embed/can't change.

If it couldn't take parameters what point would there be in it?

It reminds me of how you can export a 3D model and inject it into a webpage via ThreeJS/.glb file. Granted this one too, you can individually import the parts/move them for animation.

(it's not passive/just working like an iframe)

Shameless plug: I wrote a similar from-scratch guide a couple of days ago about how to get WASM debugging working in VSCode both for "command line" and browser apps using the new WASM DWARF Debugging extension for VSCode:

https://floooh.github.io/2023/11/11/emscripten-ide.html

very nice! Can this be done with Java and/or Rust? Or does it only work with C ATM?

Rust should work if the Rust compiler can generate DWARF info for WASM targets. Java most likely not.

I'm definitely going to read through it. But one thing immediately made an impression on me: CTRL+CLICK doesn't open links in a new tab, but the input event is hijacked and it opens the page in the same tab. The same with other modifiers like SHIFT+CLICK, ALT+CLICK etc.

The feeling is one of losing control of your computer, like the site is trying to keep you trapped in a tab. Very unpleasant. I know you don't mean it this way. And I appreciate it's loading the pages inline without reloading, but it's absolutely not worth it, if people have to constantly fight years, decades of muscle memory in order to use the site.

Please fix this! Links should act like links.
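One possible shape for a fix, as a sketch only: give each link a real `href` and intercept the click only when it is a plain left click. `ClickLike` and `shouldHandleInPage` are hypothetical names, and the wiring in the trailing comments assumes the site keeps its existing `goToExample` loader.

```typescript
// Minimal shape of the mouse event fields needed, so this also runs outside a browser.
interface ClickLike {
  button: number;
  ctrlKey: boolean;
  metaKey: boolean;
  shiftKey: boolean;
  altKey: boolean;
}

// Only a plain left click should be handled in-page. Anything else
// (non-primary button, Ctrl/Cmd/Shift/Alt + click) is left to the browser,
// which then uses the link's real href to open a tab or window.
function shouldHandleInPage(event: ClickLike): boolean {
  const plainLeftClick = event.button === 0;
  const hasModifier =
    event.ctrlKey || event.metaKey || event.shiftKey || event.altKey;
  return plainLeftClick && !hasModifier;
}

// Browser wiring, shown as comments because it needs a DOM:
//   <a href="/real/url/of/this/example">Hello World!</a>
//   link.addEventListener("click", (event) => {
//     if (!shouldHandleInPage(event)) return; // browser handles it natively
//     event.preventDefault();
//     goToExample(/* same arguments as today */);
//   });

const plainClick: ClickLike = { button: 0, ctrlKey: false, metaKey: false, shiftKey: false, altKey: false };
const ctrlClick: ClickLike = { ...plainClick, ctrlKey: true };
console.log(shouldHandleInPage(plainClick)); // prints true
console.log(shouldHandleInPage(ctrlClick)); // prints false
```

With a real `href` in place, "open in new tab" from the context menu also works without any script involvement.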

Also, the back button doesn't work properly for navigations on the site.

Indeed. There are bugs in the history management, for example it pushes false entries like index.html# next to index.html which were not navigated to.

So if you click back/next it sometimes works, sometimes it doesn't. And when it works, it doesn't show the page where you had scrolled it to (the native behavior) but scrolls you always to the top, so you lose context.

Also if you right-click a link to open it in a new tab from the menu, it opens a new tab... but at the home page, instead of the link.

I have to say... pretty bad experience start to end :(

The examples seemed clear enough to read (I did not test them), but I felt that even when teaching by example there needs to be more overview and explanation. I.e., I would prefer an overview of WASM structure and use with examples, rather than just the examples. (I have some (but limited) experience using WASM.)

As for the utility of wasm, note also that Cloudflare workers can run WASM on edge servers [1], and that the Swift community has some support for compiling to wasm [2].

I've never really understood how wasm could do better than java bytecode, but I've been impressed with how much people are using lua and BPF. More generally, in a world of federated programming, we need languages that clients can submit and providers can run safely, without obviously leaking any secret sauce -- perhaps e.g., for model refinement or augmented lookup.

[1] https://github.com/cloudflare/workers-wasi

[2] https://github.com/swiftwasm

I tried WASM in an FPS and I have to say, I'm not a fan of reaching across half the keyboard when I need to slide right. ;)

Edit: looks like someone's sense of humor was taken out to a shed and shot. You enjoy that.

There is also another book in the works that has you grapple with WASM itself directly throughout by building your own compiler. It's called WASM from the Ground Up [1]. It's still in progress, but I have really found it informative so far

[1] https://wasmgroundup.com/

When WASM has direct access to the dom, everything will change.

Here's an example of a Unity demo running using WASM and WebGPU, posted to HN:

https://news.ycombinator.com/item?id=38281040

So annoying. I can't find anything more about a wasm assembler/compiler.

Can you add a category for raw WebAssembly in the S-expr syntax? I think it would be helpful to understand how things work under the hood. WebAssembly is a very high level assembly language and somewhat different from your usual x64/ARM/RISC-V since it's a stack machine.
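To make the stack-machine point concrete, here is a toy sketch: an interpreter for a four-instruction subset, with the corresponding text format in a comment. It is only meant to show how operands flow through the stack, not how real engines execute code, and `Instr`, `runStackProgram` and `scaledSum` are hypothetical names.

```typescript
// A tiny subset of WebAssembly instructions, modelled as plain objects.
type Instr =
  | { op: "local.get"; index: number }
  | { op: "i32.const"; value: number }
  | { op: "i32.add" }
  | { op: "i32.mul" };

// Runs the instructions in order against an operand stack and returns
// whatever value is left on top at the end.
function runStackProgram(code: Instr[], locals: number[]): number {
  const stack: number[] = [];
  const pop = (): number => {
    const value = stack.pop();
    if (value === undefined) throw new Error("stack underflow");
    return value;
  };
  for (const instr of code) {
    switch (instr.op) {
      case "local.get": {
        const value = locals[instr.index];
        if (value === undefined) throw new Error(`no local ${instr.index}`);
        stack.push(value | 0);
        break;
      }
      case "i32.const":
        stack.push(instr.value | 0);
        break;
      case "i32.add": {
        const rhs = pop();
        const lhs = pop();
        stack.push((lhs + rhs) | 0); // wrap to 32 bits
        break;
      }
      case "i32.mul": {
        const rhs = pop();
        const lhs = pop();
        stack.push(Math.imul(lhs, rhs)); // 32-bit wrapping multiply
        break;
      }
    }
  }
  return pop();
}

// Text format for the same body:
//   (func (param i32 i32) (result i32)
//     local.get 0
//     local.get 1
//     i32.add
//     i32.const 10
//     i32.mul)
const scaledSum: Instr[] = [
  { op: "local.get", index: 0 },
  { op: "local.get", index: 1 },
  { op: "i32.add" },
  { op: "i32.const", value: 10 },
  { op: "i32.mul" },
];

console.log(runStackProgram(scaledSum, [2, 3])); // prints 50
```

There are no named registers: each instruction takes its operands from the stack and pushes its result back, which is the main difference from x64/ARM/RISC-V that the comment points at.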