Maybe I'm missing it, but it says it was originally twice as slow as JS, then it says they did optimizations, and then there's no comparison of the final outcome?
This is a bit of a tangent but it always annoys me when content has no published date next to it. There's a tiny "last updated" date at the bottom. But how do we know if this is a new article or one from a few years back? With technology moving so fast it can significantly change the context in which you read an article.
It’s an old engagement trick. The classic advice is to hide publish dates because you might lose some people who don’t want to read old posts.
Which isn’t helpful if you’re a person who doesn’t want to read old posts.
I'm actually biased towards older articles, given the rise of SEO-oriented writing, and now, LLM-generated content. Standing the test of time should be a badge of honor. But readers like me may be the minority.
That's a good point, though it doesn't help if you're researching a new technology unfortunately.
Yes, unfortunately. I hope though that LLMs' reliance on training data would at least give human writers an advantage when it comes to novel content such as new tech.
Last updated 2024-06-26 UTC
Located at the bottom of the page.
Typically if a page is no longer relevant but recently updated (like this page was), it would be denoted as such right up top.
How do I know the last updated time isn't just new Date()? After all it would be true the last time the page was updated is today. What was updated? The last updated time.
They could have rebuilt the page with a new template or something. Knowing the original pub date is kind of important for placing the content in context.
I did mention that, but it doesn't mean it was first published then. It could have been first published years ago, and then recently edited to fix a typo.
Yes, this does seem to be more common: what are basically blog posts with no date to be found at all. At least this one has a "last updated".
I can only assume "old" pages are down-ranked more by Google than those with no date at all.
Maybe the content date is in a metadata tag somewhere, I've not bothered to check yet. But if it is why not make it visible on the page as well?
Yes, we had to remove the publication date from our archives because Google was heavily down ranking it. We left an “updated” date but are probably going to remove that as well. I fought hard against this until it became clear there really was no other option.
Generate the date dynamically client side so it can't be parsed. Or use an image/svg
There are content creator "experts" going around telling their clients to take dates off of everything to make their content "evergreen" to keep engagement up. It's infuriating.
That and the randomly updated dates with no changes
Even if they had mentioned a published date, how can you be REALLY sure that actually IS the published date? There is always that dicey nature of the internet which makes it untrustworthy.
This is one of the indicators of whether a source is credible or not. Traditional newspapers always include an original publication date and a "last updated" date, along with notes—such as an "Editor's note" or "Update"—to indicate factual corrections or major changes to the article.
What qualifies as a major change can be subjective at times: there have been controversies where newspapers change an inaccurate headline without notice and larger controversies where there is a "stealth edit" that many readers believe should have been publicized. But this expectation that there should be transparency with edits should be the norm.
I believe that articles on personal websites and blogs can only gain by following the practice, too, as it's an indicator to the reader that the author cares about transparency and accuracy.
Not sure how Google actually indexes sites, but it would be great if you could see a "last changed" indicator in Chrome. That would be super useful.
My heuristic (especially for bigger, better-indexed websites) is to go by when archive.org first saw it.
I didn't know WasmGC was a thing yet!
https://developer.chrome.com/blog/wasmgc/
Something that is at first glance surprising is that GC'ed languages might ship a smaller binary, because they don't need to include the code that manages memory. (The example here was Java.)
When WasmGC is widely adopted, we might hopefully see some languages adopting it as a target instead of JS. I'm thinking of Clojure and Dart specifically but there are plenty of other languages that might benefit here.
This could also have an impact on "edge computing" (isolates etc.) down the line.
I work on a Scheme to Wasm GC compiler called Hoot. https://spritely.institute/hoot/
Languages like Kotlin, Scala, and OCaml also target Wasm GC now.
I'm waiting for the "you should run the same code on the front-end and back-end" argument to get thrown on its head.
golang on the back-end - golang in the browser.
That would be my dream. Another comment highlights that Go's memory management may not be a good fit for WasmGC, so we may not get near-native performance, which would be a bummer.
Go also has a runtime that does non-trivial things. If it can dump the GC out of it it'll shrink, but "hello world" would still be fairly large compared to many other languages. Having no tree shaking or similar optimizations also means that using a function from a library pulls the whole thing in.
While I look forward to being able to use things other than Javascript or languages seriously confined to Javascript's basic semantics in the frontend, I expect Go to be a laggard because it's going to be hard to get it to not generate relatively large "executables". I expect other languages to be the "winners" in the not-JS-frontend race.
That's already happening with j2cl
https://github.com/google/j2cl
Java everywhere.
Also j2objc
https://github.com/google/j2objc
Write the logic in java and get it on iOS, web and android.
And the dream of Java everywhere is finally realized ... not by shipping a JRE everywhere, but by using Java-the-language and parts of the JRE as a high-level specification and ahead of time compiling it to other languages. The solution wasn't bytecode VMs but multiple implementations of a high-level language.
Is this a case of worse-is-better? I think so.
This is an extremely cool project, thank you for sharing!
Edit:
I'm not familiar with scheme/guile (only dabbled with Racket and later Clojure).
Are these real bytes that behave like actual bytes when you manipulate their bits (unlike Java etc.) and are stored as actual bytes?
https://www.gnu.org/software/guile/manual/html_node/Bytevect...
Yeah, bytevectors in Scheme are just some linear chunk of memory that you can get/set however you'd like. Funny that you mention this because this is actually another area where Wasm GC needs improvement. Hoot uses a (ref (array i8)) to store bytevector contents. However, Wasm GC doesn't provide any instructions for, say, interpreting 4 bytes within as a 32-bit integer. This is in contrast to instructions like i32.load that work with linear memory. The workaround for now is that you read 4 bytes, bit shift appropriately, and then OR them together. Wasm GC is fantastic overall but these little issues add up and implementers targeting GC feel like they're not on even footing with linear memory.
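For anyone curious what that workaround amounts to, here is the same read-four-bytes-and-OR dance sketched in Java rather than Wasm instructions (purely illustrative; Hoot emits Wasm, not Java):

    final class Bytes {
        // Combine four bytes into a 32-bit little-endian integer by masking,
        // shifting, and OR-ing -- essentially what code targeting a Wasm GC
        // (ref (array i8)) has to do by hand today, since there is no
        // i32.load-style instruction for packed GC arrays.
        static int readInt32LE(byte[] bytes, int offset) {
            return (bytes[offset] & 0xFF)
                 | ((bytes[offset + 1] & 0xFF) << 8)
                 | ((bytes[offset + 2] & 0xFF) << 16)
                 | ((bytes[offset + 3] & 0xFF) << 24);
        }
    }

With linear memory that whole method collapses into a single aligned load, which is the unevenness being described.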
As someone who jumps back and forth between Dart and Typescript fairly regularly I can’t tell you how quickly and enthusiastically I’m planning on moving more and more of my stuff to Dart. It’s a huge quality of life improvement overnight.
Do you use Flutter Web, or some other framework for web UI?
As somebody who is not familiar with how garbage collection is implemented at the low level, can somebody explain why WasmGC is needed on top of Wasm?
For example, isn't CPython a C program and hence can just be compiled to Wasm, including its garbage collection part? Does garbage collection usually depend on OS specific calls, which are not part of C standard?
WasmGC allows you to reuse the native V8 garbage collector instead of having to bundle and run a virtualized garbage collector.
Is Wasm performance that far off of native that the difference between bundled GC and native GC is noticeable?
Performance is less of a concern than binary size. Without WasmGC, you need to ship a compiled garbage collector to the user along with every WASM module written in a GC'd language. That's a lot of wasted bandwidth to duplicate built-in functionality, so avoiding it is a big win! And performance will always be a bit better with a native GC, plus you can share its scheduling with the rest of the browser process.
I think it’s also that the V8 garbage collector already sets such a stupidly high bar when it comes to optimizations that shipping your own, even without any consideration of WASM performance, would be a step backwards for most languages running on the web.
There are several problems with bringing your own GC. Some that come to mind:
* Significantly increased binary size
* No easy way to trace heap objects shared amongst many modules
* Efficient GC needs parallelism, currently limited in Wasm
For a more thorough explanation, see https://wingolog.org/archives/2023/03/20/a-world-to-win-weba...
It's a big help in interop if everyone uses the same GC. Otherwise it becomes a huge headache to do memory management across every module boundary, with different custom strategies in each.
Definitely! Vanessa Freudenberg's SqueakJS Smalltalk VM written in JavaScript took a hybrid approach of using the JavaScript GC instead of the pure Smalltalk GC. WasmGC should make it easier to implement Smalltalk and other VMs in WebAssembly, without resorting to such tricky hybrid garbage collection schemes.
https://news.ycombinator.com/item?id=29019992
One thing that's amazing about SqueakJS (and one reason this VM inside another VM runs so fast) is the way Vanessa Freudenberg elegantly and efficiently created a hybrid Smalltalk garbage collector that works with the JavaScript garbage collector.
SqueakJS: A Modern and Practical Smalltalk That Runs in Any Browser
http://www.freudenbergs.de/bert/publications/Freudenberg-201...
The fact that SqueakJS represents Squeak objects as plain JavaScript objects and integrates with the JavaScript garbage collection (GC) allows existing JavaScript code to interact with Squeak objects. This has proven useful during development as we could re-use existing JavaScript tools to inspect and manipulate Squeak objects as they appear in the VM. This means that SqueakJS is not only a “Squeak in the browser”, but also that it provides practical support for using Smalltalk in a JavaScript environment.
[...] a hybrid garbage collection scheme to allow Squeak object enumeration without a dedicated object table, while delegating as much work as possible to the JavaScript GC, [...]
2.3 Cleaning up Garbage
Many core functions in Squeak depend on the ability to enumerate objects of a specific class using the firstInstance and nextInstance primitive methods. In Squeak, this is easily implemented since all objects are contiguous in memory, so one can simply scan from the beginning and return the next available instance. This is not possible in a hosted implementation where the host does not provide enumeration, as is the case for Java and JavaScript. Potato used a weak-key object table to keep track of objects to enumerate them. Other implementations, like the R/SqueakVM, use the host garbage collector to trigger a full GC and yield all objects of a certain type. These are then temporarily kept in a list for enumeration. In JavaScript, neither weak references, nor access to the GC is generally available, so neither option was possible for SqueakJS. Instead, we designed a hybrid GC scheme that provides enumeration while not requiring weak pointer support, and still retaining the benefit of the native host GC.
SqueakJS manages objects in an old and new space, akin to a semi-space GC. When an image is loaded, all objects are created in the old space. Because an image is just a snapshot of the object memory when it was saved, all objects are consecutive in the image. When we convert them into JavaScript objects, we create a linked list of all objects. This means, that as long as an object is in the SqueakJS old-space, it cannot be garbage collected by the JavaScript VM. New objects are created in a virtual new space. However, this space does not really exist for the SqueakJS VM, because it simply consists of Squeak objects that are not part of the old-space linked list. New objects that are dereferenced are simply collected by the JavaScript GC.
When full GC is triggered in SqueakJS (for example because the nextInstance primitive has been called on an object that does not have a next link) a two-phase collection is started. In the first pass, any new objects that are referenced from surviving objects are added to the end of the linked list, and thus become part of the old space. In a second pass, any objects that are already in the linked list, but were not referenced from surviving objects are removed from the list, and thus become eligible for ordinary JavaScript GC. Note also, that we append objects to the old list in the order of their creation, simply by ordering them by their object identifiers (IDs). In Squeak, these are the memory offsets of the object. To be able to save images that can again be opened with the standard Squeak VM, we generate object IDs that correspond to the offset the object would have in an image. This way, we can serialize our old object space and thus save binary compatible Squeak images from SqueakJS.
To implement Squeak’s weak references, a similar scheme can be employed: any weak container is simply added to a special list of root objects that do not let their references survive. If, during a full GC, a Squeak object is found to be only referenced from one of those weak roots, that reference is removed, and the Squeak object is again garbage collected by the JavaScript GC.
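As a rough sketch of the algorithm described above, here is the two-pass full GC expressed in Java (my own simplification with invented names, not SqueakJS code, which is JavaScript and orders old space by object ID):

    import java.util.*;

    // Old-space objects are kept alive by an explicit list (standing in for
    // the linked list), so the host GC cannot collect them. A full GC
    // promotes reachable new objects into old space and drops unreachable
    // old-space objects, handing them back to the host GC.
    final class HybridGcSketch {
        static final class SqObj {
            boolean inOldSpace;
            final List<SqObj> references = new ArrayList<>();
        }

        private final List<SqObj> oldSpace = new ArrayList<>();

        void fullGc(Collection<SqObj> roots) {
            // Mark everything reachable from the roots.
            Set<SqObj> reachable = Collections.newSetFromMap(new IdentityHashMap<>());
            Deque<SqObj> work = new ArrayDeque<>(roots);
            while (!work.isEmpty()) {
                SqObj o = work.pop();
                if (reachable.add(o)) work.addAll(o.references);
            }
            // Pass 1: surviving new objects are appended to old space.
            for (SqObj o : reachable) {
                if (!o.inOldSpace) { o.inOldSpace = true; oldSpace.add(o); }
            }
            // Pass 2: unreachable old-space objects are unlinked so the host
            // (JavaScript) GC becomes free to reclaim them.
            oldSpace.removeIf(o -> {
                if (reachable.contains(o)) return false;
                o.inOldSpace = false;
                return true;
            });
        }
    }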
Yes you can compile CPython and utilize its GC.
The idea of WasmGC is to make objects which are available in the browser environment (like window) available in the wasm module. It is a spec which allows you to pass objects which are actually managed by the browser's GC to your wasm code.
Python has two garbage collectors - reference counting, and tracing
Reference counting does not handle circular references, and tracing does
Tracing collectors have to be able to read all the references of objects to work. The difficulty is that some of those objects or references to objects are on the stack - or in simpler terms, these are objects and references that only one function can read and understand. In C there are some non-standard extensions that let you do this with varying degrees of success. In WASM, this is prohibited by design, because it’s a safety issue
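When a language does bring its own tracing GC to Wasm anyway, the usual workaround is for the compiler to spill references into an explicit structure the collector can walk instead of scanning the native stack. A toy sketch of that idea in Java (hypothetical names, just to show the shape):

    import java.util.ArrayDeque;
    import java.util.Deque;

    // A "shadow stack" of GC roots: because the collector cannot inspect the
    // real Wasm stack, compiled code explicitly registers the references each
    // frame holds, and the collector traces from this structure instead.
    final class ShadowStack {
        private final Deque<Object[]> frames = new ArrayDeque<>();

        Object[] pushFrame(int slots) {   // emitted at function entry
            Object[] frame = new Object[slots];
            frames.push(frame);
            return frame;
        }

        void popFrame() {                 // emitted at function exit
            frames.pop();
        }

        Iterable<Object[]> roots() {      // the collector starts marking here
            return frames;
        }
    }

That bookkeeping is part of why bring-your-own-GC modules are bigger and slower, and why WasmGC hands the job to the host instead.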
To quote the summary of wasmgc feature from Chrome:
Managed languages do not fit the model of "linear" Wasm with memories residing in one ArrayBuffer. In order to enable such functionality, they need to ship their own runtime which has several drawbacks: (1) it substantially increases binary size of these modules and (2) it's unable to properly deal with links crossing the boundary between the module, other modules and the embedder that all implement their own GC who is not aware of the others.
WasmGC aims at providing a managed heap that Wasm modules can use to store their own data models while all garbage collection is handled by the embedder.
Maybe they can finally make Google Sheets etc. an offline, stand-alone app.
Work on Google Docs, Sheets, & Slides offline https://support.google.com/docs/answer/6388102?hl=en
You must use the Google Chrome or Microsoft Edge browser.
Thanks Google.
PS: I think it's basically ancient wisdom at this point to keep a Chrome instance around to best handle most Google properties, but I don't remember seeing it much spelled out in official documentation.
How much effort should Google put into supporting browsers that don't have the feature sets to support the use case?
Depends, are they serious about avoiding antitrust cases?
I think Microsoft Office's mere existence precludes Google ever being considered a monopoly in the office apps space.
Mere existence of a competitor doesn't preclude an anti-trust case for monopolization or abuse of a dominant market position.
Lots of office suites exist, but Microsoft has to tread lightly. In Recent News, the EU is dinging them for bundling/tying Teams with Office. Because it's a dominant market position, regardless of competitors.
If at some future time, Google Sheets has a significant market share, they might get dinged, too.
What do you want them to do? Firefox categorically refuses to support things like PWAs.
I wish Google wouldn't force us to use Chrome. Hell, they have Google Play for games on Windows; they could just offer the complete app store this way and get a close-to-native app.
Switching to WasmGC doesn't really change the equation here.
Anyone here actually bothered by the speed of the core calculation engine of Google sheets?
I'm bothered by the sluggishness of the UI, but I've never typed '=A1+A2' in a box and found it takes too long to get an answer.
Even if it did, I'd kinda like million row X million column sheets to somehow be server side and my machine simply views a local window, and calculations happen locally if feasible with locally available data, or server side otherwise.
Would you pay for that?
No - all web apps should be designed with the idea that the client only has a little CPU, ram and storage, and if the user tries to do something outside those capabilities, the server should do the work.
Nobody expects a Google search to involve the client crawling the whole web and picking out your search keywords...
How is that an answer to the question? Your explanation shows how the service is pretty much doing everything, which seems to justify why a paid service would be acceptable. If they provided free code but you had to pay the expense of running it locally, that would be more of a justifiable reason for not paying.
Probably not bothered because it's fast thanks to this work. I'd probably be bothered if it was slow.
I mean, I've definitely been bothered by things not changing when I change them; but I never thought that was the core calculation engine.
I don't need to do a lot of MMO spreadsheets though, so I generally try to use Apache OpenLibreOffice.org locally, and it never stalls because of client-server communication.
I don't have experience with Google Sheets, but it's definitely an issue in Excel with complex sheets.
It's more of an issue when you have a few 100k lines and lots of layers of different lookups etc.
Oh, I know that one. To hinder Firefox. Am I right?
Has Firefox announced any intentions to not support WASM?
From what I can tell there is a bug with the current implementation, but I have a hard time believing that had anything to do with Google's decision. If that's what they wanted, it'd be a lot easier to just check the user agent. Most people aren't savvy enough to fake their user agent.
Firefox supports WASM. The GC spec though is very much a work in progress. https://bugzilla.mozilla.org/show_bug.cgi?id=1774825
Eyeballing that list it looks like they have a lot to do yet.
Wasm GC works great in Firefox. It's Safari/WebKit that need to catch up! It's been over 6 months since Chrome and Firefox enabled Wasm GC in stable releases and we're still waiting for Safari.
Ah. My mistake. I saw a ton of open issues.
WasmGC has recently been completed in Firefox, the feature tracking page says since Firefox 120:
https://webassembly.org/features/
(hover over the cell for Firefox and Garbage collection).
The Firefox people have also been very active in WasmGC's development.
They also mention that “there are cases where browser APIs are backed by optimized native implementations that are difficult to compete with using Wasm … the team saw nearly a 100 times speedup of regular expression operations when switching from re2j to the RegExp browser API in Chrome”. If so, how do they call RegExp from WasmGC, or vice versa? A 100 times speedup for native web APIs is not something you can ignore; that’s the important lesson here: not everything compiled to Wasm will result in a speed gain.
I'd be really curious to see the same benchmark applied to re2j on a native jvm, contrasted with those same optimized chrome internals. 100x is in the right ballpark for just a core algorithmic difference (e.g., worst-case linear instead of exponential but with a big constant factor), VM overhead (excessive boxing, virtual function calls for single-op workloads, ...), or questionable serialization choices masking the benchmarking (getting the data into and out of WASM).
The difference is primarily algorithmic. Java on WasmGC is about 2x slower than Java on the JVM. The remaining 50x is just Chrome's regex impl being awesome.
You can call from wasm back into js, I believe some webgl games do that
Yep it's a bi-directional communication layer. You can even have shared memory.
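On the Java/J2CL side specifically, the usual way to reach a browser API like RegExp is a JsInterop binding, roughly like this (a hedged sketch; the actual bindings Sheets uses aren't public):

    import jsinterop.annotations.JsPackage;
    import jsinterop.annotations.JsType;

    // Minimal JsInterop binding for the browser's built-in RegExp. Calls made
    // through this type cross the Wasm/JS boundary and run the engine's own
    // heavily optimized regex code instead of a regex engine (like re2j)
    // compiled into the module.
    @JsType(isNative = true, namespace = JsPackage.GLOBAL, name = "RegExp")
    class NativeRegExp {
        NativeRegExp(String pattern, String flags) {}

        native boolean test(String input);
    }

Calling new NativeRegExp("\\d+", "g").test(cellText) from Java then dispatches to the browser's RegExp, which is presumably where the ~100x came from.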
Perhaps we should have more optimized web APIs to use? Maybe for creating optimized data structures with strong typing? Not just the typed array types, but also typed maps.
It's great to see Java Web Start finally being used in anger 24 years after its initial release.
I'm familiar with applets, but hadn't heard of Java Web Start. Could Java Web Start interact with the DOM?
Not really, Java Web Start essentially provided a way to download a fully-capable Java application running on normal JVM, from a click on website.
It's been used in place of applets a lot, but mostly because you could provide a complex application that opened in a new window with a click (assuming you dealt with the CA shenanigans).
It couldn't interact with DOM unless it accessed a browser over something like COM.
I worked on an app that was using Java Web Start still, as of January 2023. It was a problem, because JWS was not included in the parts of Java that were open-sourced, and was not available as part of recent OpenJDK versions of Java. Some open source JWS implementations exist, when I left that job though, the situation had still not been resolved in terms of finding a JWS replacement. It was imperative that we get off Oracle Java because of their expensive upcoming licensing changes. I wonder what ever happened..........
It's now possible to run Java Web Start applications fully in the browser without any plugins: https://cheerpj.com/cheerpj-jnlp-runner/
It's based on CheerpJ, a JVM that runs in the browser (using JS and Wasm).
I still long for a world where we got ASM.js and SIMD directly in JavaScript.
you can still use ASM.js it's just a coding convention
No it's not just a convention. It was also an optimization compiler.
While we're at it, let's have CUDA.js and let me map CUDA objects to WebGL Texture and Vertex Buffers. So many cool visualization to be built.
It's got Urbit energy
This is really cool, I wonder what .NET to wasmGC would be like.
Do you mean in relation to Blazor? C#, F#, and VB already compile to wasm. They have their own GC.
Yeah. Googling it led me to an old comment of mine about it, and Dan Roth mentioned this in their discussion on it https://github.com/dotnet/runtime/issues/94420
But not having to ship the GC could be a great benefit.
It is not entirely clear from the article, but apparently they still use Java for their calculation engine and then transpile it into JavaScript. Which makes me wonder whether, instead of porting this code base to WasmGC, a partial rewrite would have helped the project’s maintainability in the long run. Rust seems like a potentially good candidate due to its existing WebAssembly backend.
WasmGC is useful for many other projects of course. But I wonder how painful it is to maintain a Java code base, which is not a great choice for client-side web apps to begin with. I remember using GWT back in the day - and that never felt like a good fit. GWT supported a whitelisted subset of the Java standard library. But the emitted JavaScript code was nigh on impossible to read. I don’t remember if Chrome’s developer tools already had source map support back in those days. But I doubt it. Other core Java concepts like class loaders are equally unsuited for JavaScript. Not to mention that primitive data types are different in Java and JavaScript. The same is true for collections, where many popular Java classes do not have direct counterparts in JavaScript.
Somehow Java always ends up being the best tool for the job, even transpiled. Java is easy to build upon in the long run and can carry itself and the codebase into new times.
The code wasn't really ported to WasmGC. They now compile the same code base to WasmGC with J2CL where before they used J2CL to transpile it to JavaScript.
Rewriting a large codebase with many years of work behind it to a totally new language is a pretty big task, and often ends up introducing new bugs.
But I do agree Java has downsides. A more practical conversion would be to Kotlin. That's a much closer language to Java, and it can interop, which means you can rewrite it incrementally, like adopting TypeScript in a JavaScript project. Also, Kotlin uses WasmGC, which avoids the downsides of linear memory - it can use the browser's memory management instead of shipping its own, etc., which is an improvement over C++ and Rust.
It’s perhaps telling that they had an initial prototype which was 2x slower, then talk about specific optimization strategies but never share the updated overall speed comparison.
In a different blog:
https://workspace.google.com/blog/sheets/new-innovations-in-...
"Building on improvements like smooth scrolling and expanded cell limits in Sheets, today we’re announcing that we’ve doubled the speed of calculation in Sheets on Google Chrome and Microsoft Edge browsers,"...
I assume this is 2x the speed of their current JavaScript code.
Yes, that is correct. It began 2x slower than JS, and ended up 2x faster than JS.
They forgot to mention the final speedup:
The initial version of Sheets Wasm showed calculation performance roughly two times slower than JavaScript.
[...]
Implementing speculative inlining and devirtualization—two very common optimizations—sped up calculation time by roughly 40% in Chrome.
So the only numbers we get are the 2x slowdown and the 1.4x speedup, which makes it sound like it's still slower. I'm sure that's probably not the case, but it is a strange way to write an article advertising wasm.
Also, I'm a bit confused about which language this was actually written in before the wasm switch. Was it always written in Java and transpiled to JS? It doesn't seem that way from the article:
Removing JavaScript-specific coding patterns.
they had a core data structure in Sheets which was blurring the lines between arrays and maps. This is efficient in JavaScript
which doesn't make sense if they were writing it in Java and transpiling to JS.
Was it transpiled to JS only once in 2013 and then developed in JS from there? In which case why go back to Java?
My understanding is that Sheets was and remains written in Java. I interpreted "JS-specific coding patterns" to mean that they were writing some of their Java code in a particular way to please the compiler so that it generated more efficient JS.
Ok, That makes sense. Now that I think of it if they switched to JS they probably would have rewritten it by hand instead of using a tool like J2CL.
Why is javascript slower than java?
uhhhh I don't think there is actually a real answer in that paragraph that compares PERFORMANCE of JS to JAVA.
that explanation is completely unhinged?
I mean, I thought it was a pretty good explanation.
JavaScript plays fast and loose with types. Which means that when JIT compilers generate code for it, they have to make all sorts of conservative decisions. This makes generated code more complex and, in turn, slower.
Does that make sense?
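A tiny Java example of what that difference buys (illustrative only): because the types below are fixed at compile time, the ahead-of-time Wasm compiler can emit plain unguarded loads and float adds for the loop.

    // `values` is always a double[] and `total` is always a double, so the
    // compiled loop is straight numeric code. The equivalent JavaScript JIT
    // has to first observe the types at runtime and keep guards around in
    // case the array later contains strings, holes, or objects, which makes
    // the generated code more conservative.
    final class Accumulator {
        private final double[] values;

        Accumulator(double[] values) {
            this.values = values;
        }

        double sum() {
            double total = 0.0;
            for (double v : values) {
                total += v;
            }
            return total;
        }
    }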
When will Safari have WasmGC?
AFAIK Igalia is/was working on WasmGC support for WebKit/JavaScriptCore [0]. Not sure about the status though; it's tracked here, I think [1]. It says it was updated earlier this month, but I don't know what progress has been made.
[0]: https://docs.webkit.org/Other/Contributor%20Meetings/Slides2... [1]: https://bugs.webkit.org/show_bug.cgi?id=247394
With infinite effort, a company with infinite resources made something that is already objectively fast enough twice as fast!
They are right though that wasmgc might have a bigger impact than wasm alone. Lots of attention is deservedly paid to Rust but people use garbage collected languages more.
From the article, it seems like the big picture is that Google wants to move WasmGC forward. They got the Workspace/Sheets and Chrome/V8 teams together to start with Sheets "as an ideal testbed".
Presumably they think this investment will pay off in various ways like ensuring Chrome has a good implementation of WasmGC and building expertise and tools for using WasmGC in other Google products.
WasmGC: Revenge of the JVM
I'll need to look into this but if anyone knows: is this available (easily) as a standalone package somewhere; think using the engine from a CLI without needing or contacting Google Sheets at all.
I love how we are slowly back to Applets and related plugins.
WebGL | WebGPU + WasmGC => Flash!
Google finally got NaCl/Pepper into the browser.
Nice to hear from teams building with Wasm GC. They bring up the "shared everything" proposal [0] for enabling heap objects to be shared across the threads. In addition to this, I'd like to be able to share the contents of a packed array, such as a (ref (array i8)), with the host. Right now, only Wasm memory objects have ArrayBuffer access in JS. Providing the same shared access to packed GC arrays would unlock things like WebGL, WebGPU, and Web Audio for Wasm GC applications. I made a "hello triangle" WebGL program with Wasm GC awhile back, but there's no way that I know of to get acceptable throughput when working with GPU buffers because they have to be copied byte-by-byte at the Wasm/JS boundary.
[0] https://github.com/WebAssembly/shared-everything-threads/blo...
is this why all my gsheets scripts started failing out of nowhere today
This is the kind of cool cutting edge web development that google excels at. Hopefully there are some down stream effects that help more tools move to the web. Good work!
Maybe the numbers didn't look very good and they thought it was better to leave them out
But then what is the point of telling us they are doing it with WASM now? Seems more like a KPI / work-progress report.
Edit: In a different blog post :
https://workspace.google.com/blog/sheets/new-innovations-in-...
"Building on improvements like smooth scrolling and expanded cell limits in Sheets, today we’re announcing that we’ve doubled the speed of calculation in Sheets on Google Chrome and Microsoft Edge browsers,"...
I don't use Google Sheets but I wonder how far apart their new implementation is from native Microsoft Excel.
I don't know if native Excel should be the benchmark. It seems to me that it has gotten slower with every Excel version.
One trouble with Excel is that it does an awful lot of different things but doesn’t do them well. If you try to use it to do what Pandas does, it is shocking how small of a data file will break it.
The idea of incremental calculation of formulas that don’t need to be put in a particular order is still genius, but the grid is so Apple ][. Maybe the world doesn’t know it because it spurned XBRL, but accounting is fundamentally hyper-dimensional (sales of items > $100 in the afternoon in the third week of February in stores in neighborhoods that have > 20% black people, broken down by department…), not 2-d or 3-d.
Having the wrong data structures is another form of “garbage in, garbage out”. I have thought many times about Excel-killers; the trouble is they have to be something new, different, and probably specialized. People know how to get answers with Excel even if they are wrong, plus Excel is bundled with the less offensive Word and PowerPoint, so a lot of people are paying for it either way.
Personally I want a spreadsheet that does decimal math even if there is no hardware support except on mainframes: I think a lot of people see 0.1 + 0.2 != 0.3 and decide computers aren’t for them.
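To make that concrete, here's the same sum with binary floats versus decimal arithmetic, sketched in Java:

    import java.math.BigDecimal;

    public class DecimalDemo {
        public static void main(String[] args) {
            // Binary floating point: prints 0.30000000000000004
            System.out.println(0.1 + 0.2);

            // Decimal arithmetic: prints 0.3
            System.out.println(new BigDecimal("0.1").add(new BigDecimal("0.2")));
        }
    }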
Google sheets does do decimal math so I'm not sure what you mean
It doesn’t use ordinary floats?
I don't know absolute values but Excel still has several advantages which make it faster:
- Excel is written in C++ and compiled natively, which will be a bit faster than Java running on the JVM. And Java running as WasmGC is about 2x slower than Java on the JVM.
- Sheets is limited to 4GB of RAM by the browser; Excel isn't.
- C++ can do shared-memory multi-threading; WasmGC cannot.
Not sure why you’re comparing the C++ Excel to the browser version of Sheets. It would make more sense to compare the native version of Excel to the native versions of Sheets, i.e. Android, iOS and ChromeOS, and the browser Sheets to the browser Excel.
Because they're replying to someone who wondered about the performance "compared to Native Microsoft Excel".
I'll add that browser Google Sheets and native Microsoft Excel are the fastest versions available (of each product).
The two posts make the most sense together, yeah.
I was involved in this work (happy to answer any questions). Overall we started from a large slowdown compared to JS, worked hard, and ended up with a large speedup over JS of around 2x. So right now it is a big improvement over JS.
That improvement required work across the codebase, the toolchain, and the VM. Most of it is not specific to Java and also helps other WasmGC projects too (like Dart and Kotlin). We are also working on further improvements right now that should make things even faster.
Well done on the accomplishment -- no small feat to improve runtime 50% in a mature codebase.
I'm interested more in learning how to work within wasmgc. Do you have any resources you'd point to for someone looking to pick it up?
I'm surprised it was twice as slow. Just this past week I was playing around with WASM, running some pure math calculations and comparing them against a JS version. I was seeing a 10x perf increase, though I was writing WAT directly, not compiling C or anything like that.
Well, they give some reasons:
"For example, they had a core data structure in Sheets which was blurring the lines between arrays and maps. This is efficient in JavaScript, which automatically models sparse arrays as maps, but slow on other platforms. ""
"Of the optimizations they found, a few categories emerged:
Basically it seems, they tried to copy their js code. And this unsurprisingly did not work out well. They had to reimplement some critical parts.Note -- AIUI the js and wasmgc are both produced from the same Java codebase. The problem here is that the developers on the Java codebase had started making changes based on their performance when transpiled to javascript (which is only natural -- it seems that they had been targeting js output for a decade).
I don't think that was the only issue.
For example, they point out that they got 40% improvement by adding devirtualization to the compiler. Java by its nature likes to add a whole bunch of virtual method calls. Java devs primarily rely on the JIT to fix that up (and it works pretty well). Javascript relies similarly on that sort of optimization.
WASM, on the other hand, was built first to compile C/C++/Rust code which frequently avoids (or compiles away when it can) virtual method calls.
I imagine this isn't the only issue. For example, I'll guess that dealing with boxing/unboxing of things also introduces a headache that wouldn't be present in similar C/C++ code.
In short, it just so happens that a lot of the optimizations which benefit JS also benefit Java. The one example where they did a JavaScript-specific optimization was preferring Maps and Lists over POJOs.
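To make the devirtualization point concrete, here's a hedged Java sketch of the kind of call site the compiler can rewrite (invented names, not Sheets code):

    // A virtual call site: `eval` dispatches through the Formula interface.
    interface Formula {
        double eval();
    }

    final class SumFormula implements Formula {
        private final double a, b;
        SumFormula(double a, double b) { this.a = a; this.b = b; }
        @Override public double eval() { return a + b; }
    }

    final class Engine {
        // If whole-program analysis can prove (or guess and then guard) that
        // the only Formula reaching this call site is SumFormula, the virtual
        // dispatch can be replaced with a direct call and then inlined -- the
        // "speculative inlining and devirtualization" the article credits
        // with a ~40% speedup. A JIT (JVM or JS engine) does this at runtime;
        // an ahead-of-time Wasm compiler has to do it statically.
        static double compute(Formula f) {
            return f.eval();
        }
    }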
It was all of the above.
We had a pretty awesome team of people working across the entire stack optimizing absolutely everything. The blog post only talks about the three optimizations that made the biggest difference.
Likely because they are compiling Java with WasmGC extensions and stuff. If you try C, Rust, Zig etc they tend to run extremely fast in WASM.
Go also has a WASM target which runs pretty slow compared to its native binaries. GC extensions might help but as far as I can remember, Go's memory model does not fit the WasmGC so it might never be implemented.
I'm somewhat curious why they even chose Java over Rust or Zig, considering the spreadsheet data paths should be relatively straightforward even with a lot of clone() activity.
The calculation engine was written in Java before Rust or Zig were invented.
Gotcha, I made the mistaken assumption this was a rewrite.
For larger applications things like method dispatch performance start to dominate and you land closer to 2x overall performance uplift.
I have tried using Google Sheets for large calculations; with even a few thousand rows it is much slower than, say, LibreOffice (instant vs. progress bar). Although maybe the WASM thing was not working.
Not surprising, google apps regularly seem to be slow to respond when I just type stuff.
They only made these changes yesterday.
Yeah, I did not enjoy that part; it was basically an abrupt stop. It also sounded like they were compiling Java to JS and afterwards to Wasm; I would like to know what engine they are using.
The compiler from Java to JS is J2CL:
https://github.com/google/j2cl/
And that now includes a Java to WasmGC compiler, called J2Wasm:
https://github.com/google/j2cl/blob/master/docs/getting-star...
Ah same thought exactly
The WasmGC version is twice as fast as the JS version.
Oh why so negative? Can't you just be happy for them and their ability to release propaganda to have their team in the news? It's no different than all of these new AI releases showing a very early version of something that produces absolute crap output, but hey, at least it compiles successfully now! /s