I have thought about doing this and I just can't get around the fact that you can't get much better performance in JS. The best you could probably do is transpile the JS into V8 C++ calls.
The really cool optimizations come from compiling TypeScript, or something close to it. You could use types to get enormous gains. Anything without typing gets the default slow JS calls. Interfaces could be reduced to vtables or maybe even direct calls, possibly on structs instead of maps. You could have Int and Float types that degrade into Number but otherwise sit directly in registers.
The main problem is that both TS and V8 are fast-moving, non-standard targets. You could only really do such a project with a big team. Maintaining compatibility would be a job by itself.
At least without additional extensions, TypeScript would help less than you think. It just wasn’t designed for the job.
As a simple example: TypeScript doesn't distinguish between integers and floats; they're all just numbers. So every array index access needs a cast or guard. A TypeScript designed to aid static compilation would likely have that distinction.
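A quick sketch of the issue; tsc happily accepts a fractional index because both values are just `number`:

```typescript
// Both i and f are plain `number`s to TypeScript; a static compiler
// can't know whether an index is already an integer.
const i: number = 2;
const f: number = 2.5;
const arr = [10, 20, 30];

const hit = arr[i];  // 30
const miss = arr[f]; // type-checks, but undefined at runtime
```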
But the big elephant in the room is TypeScript’s structural subtyping. The nature of this makes it effectively impossible for the compiler to statically determine the physical structure of any non-primitive argument passed into a function. This gives you worse-than-JIT performance on all field access, since JITs can perform dynamic shape analysis.
I think the even bigger elephant in the room is that TypeScript's type system is unsound. You can have a function whose parameter type is annotated to be String and there's absolutely no guarantee that every call to that function will pass it a string.
This isn't because of `any` either. The type system itself deliberately has holes in it. So any language that uses TypeScript type annotations to generate faster/smaller code is opening itself to miscompiling code and segfaults, etc.
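One such hole that needs no `any` at all is array covariance. A sketch (the names here are mine); this compiles cleanly even under `--strict`:

```typescript
class Animal { name = "generic"; }
class Dog extends Animal { bark() { return "woof"; } }

const dogs: Dog[] = [new Dog()];
const animals: Animal[] = dogs;  // allowed: arrays are covariant in TS
animals.push(new Animal());      // fine: it's an Animal[] after all

// Statically typed as Dog, but actually a plain Animal:
const impostor = dogs[dogs.length - 1];
// impostor.bark(); // TypeError at runtime: impostor.bark is not a function
```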
Can you name a single language that is used for high-performance software and whose type system is sound? To speed up the process, note that none of the obvious candidates have sound type systems.
JVM bytecode is a "language" and is proven to be sound. The languages that compile to that language, on the other hand, are a different kettle of fish.
This is specifically about type systems. It's easy to have a sound type system when you have no type system.
Also, I'm not too familiar with JVM bytecode, but if I load i64 in two registers and then perform floating point addition on these registers, does the type system prevent me from compiling/executing the program?
Can you say more about "proven to be sound"? Are you talking about a sound type system?
The type checker is specified in Prolog and rejects the above scenario:
Fun fact: said type system has a 'top' type that is both the top type of the type system and the type of the top half of a long or double, since those two take two slots while everything else, including references, takes only one. That made some sense when everything was 32-bit, less so today.

It does have a type system.
https://docs.oracle.com/javase/specs/jvms/se22/html/jvms-2.h...
JVM is a stack not register machine and yes the type system will prevent that from running. It will fail verification.
Maybe OCaml, but I haven't studied it much.
I doubt it's been proved to be sound. It shows up a lot on https://counterexamples.org/, although if I skim the issues seem to have been fixed since then.
I'm a little behind times on Haskell (haven't used it for some years) – there always were extensions that made it unsound, but the core language was pretty solid.
Look at the last example in there: https://counterexamples.org/polymorphic-references.html
Scala 3 has aimed to get sound but I’m not sure how far they got?
https://www.scala-lang.org/api/3.x/docs/blog/2016/02/17/scal...
Java, C#, Scala, Haskell, and Dart are all sound as far as I know.
Soundness in all of those languages involves a mixture of compile-time and runtime checks. Most of the safety comes from the static checking, but there are a few places where the compiler defers checking to runtime and inserts checks to ensure that it's not possible to have an expression of type T successfully evaluate to a value that isn't a T.
TypeScript doesn't insert any runtime checks in the places where there are holes in the static checker, so it isn't sound. If it wasn't running on top of a JavaScript VM which is dynamically typed and inserts checks everywhere, it would be entirely possible to segfault, violate memory safety, etc.
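Method parameter bivariance is one of those deliberate holes. In a sound language this override would be rejected, or the call guarded at runtime; TypeScript accepts it under `--strict` and lets it crash (a sketch, names are mine):

```typescript
class Pet { name = "pet"; }
class Parrot extends Pet { speak() { return "hello"; } }

class Shelter {
  greet(p: Pet): string { return p.name; }
}
class ParrotShelter extends Shelter {
  // Narrowing the parameter is accepted: method parameters are
  // checked bivariantly, even with strictFunctionTypes enabled.
  greet(p: Parrot): string { return p.speak(); }
}

const shelter: Shelter = new ParrotShelter();
// shelter.greet(new Pet()); // type-checks, TypeError at runtime
```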
So - I know this in theory, but avoided mentioning it because I couldn't immediately think of any persuasive examples that didn't involve casts or any/unknown or other things people might make excuses for (whereas subtype polymorphism is a core, widely used, wholly unrestricted property of the language).
Do you have any examples off the top of your head?
Here's an example I constructed after reading the TS docs [1] about flow-based type inference and thinking "that can't be right...".
It yields no warnings or errors at compile time but gives a runtime error based on a wrong flow-based type inference. The crux of it is that something can be a Bird (with a "fly" function) but can also have any other members, like "swim", because of structural typing (flying is the minimum expected of a Bird). The presence of a spurious "swim" member on the bird causes tsc, in a conditional that checks for a "swim" member, to infer that the animal must be a Fish or Human when it is not (it's just a Bird with an unrelated, non-function "swim" member).
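A reconstruction of that scenario (hedged: the original code was lost, so this is modeled on the handbook's Fish/Bird narrowing sample and the names are mine):

```typescript
type Fish = { swim: () => void };
type Bird = { fly: () => void };

function move(animal: Fish | Bird) {
  if ("swim" in animal) {
    // Narrowed to Fish here -- but the narrowing only checked for
    // the *presence* of a swim member, not that it's a function.
    return animal.swim();
  }
  return animal.fly();
}

// A perfectly valid Bird structurally (it has fly); the extra
// non-function swim member is invisible to the checker.
const oddBird = { fly: () => {}, swim: "not a function" };
// move(oddBird); // type-checks, TypeError: animal.swim is not a function
```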
[1] https://www.typescriptlang.org/docs/handbook/2/narrowing.htm...

This narrowing is probably not the best. I'm not sure why the TS docs suggest this approach. You should really check the type of the key to be safer, though it's still not perfect.
https://counterexamples.org/ is a good collection of unsoundness examples in various languages.
For TypeScript, they list an example with `instanceof`:
https://counterexamples.org/polymorphic-union-refinement.htm...
In the playground:
https://www.typescriptlang.org/play/?#code/GYVwdgxgLglg9mABA...
It might be useful for an interpreter though. I believe that in V8 there's a probabilistic mechanism by which, if the interpreter "learns" that an array consistently contains e.g. numbers, it will optimize for numbers and start accessing the array in a more performant way. TypeScript could be used to inform the interpreter even before execution. (My supposition; I'm not an interpreter expert.)
Outside of really funky code, especially code originally written in TS, you can assume the interface is the actual underlying object. You could easily flag non-recognized-member accesses to interfaces and then degrade them back to object accesses.
You’re misunderstanding me, I think.
Suppose you have some interface with fields a and c. If your function takes in an object with that interface and operates on the c field, what you want is to be able to do is compile that function to access c at “the address pointed to by the pointer to the object, plus 8” (assuming 64-bit fields). Your CPU supports such addressing directly.
Because of structural subtyping, you can't do that. It's not an unrecognized member: your caller might pass in an object with fields a, b, and c. This is entirely idiomatic. Now c is at offset 16, not 8. Because the physical layout of the object is different, you no longer have a statically known offset to the known field.
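A sketch of the two call sites; both are idiomatic TypeScript, so no single fixed offset for `c` works:

```typescript
interface AC { a: number; c: number; }

// A static compiler would like this to be "load at base + 8":
function readC(obj: AC): number {
  return obj.c;
}

const twoFields = { a: 1, c: 2 };          // c would sit in the second slot
const threeFields = { a: 1, b: 9, c: 3 };  // b shifts c to the third slot

readC(twoFields);   // both calls are legal, with different layouts
readC(threeFields);
```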
In practice v8 does exactly what you're saying can't be done, virtually all the time for any hot function. What you mean to say is that typescript type declarations alone don't give you enough information to safely do it during a static compile step. But modern JS engines, that track object maps and dynamically recompile, do what you described.
I mentioned this in my original comment:
We're talking about using types to guide static compilation. Dynamic recompilation is moot.
Oh, I thought JIT in your comment meant a single compilation. Either way, having TS type guarantees would obviously make optimizing compilers like v8's stronger, right? You seem to be arguing there's no value to it, and I don't follow that.
My claim is that the guarantees TS provides aren't strong enough to help a compiler produce stronger optimizations. Types don't just magically make code faster; there are specific reasons why they can, and TypeScript's type system wasn't designed around those reasons.
A compiler might be able to wring some things out of it (I'm skeptical about obviouslynotme's suggestions in a cousin comment, but they seem insistent) or suppress some checks if you're happy with a segfault when someone did a cast...but it's just not a type system like, say, C's, which is more rigid and thus gives the compiler more to work with.
I would bet that, especially outside of library code, 95+% of the typed objects are only interacted with using a single interface. These could be turned into structs with direct calls.
Outside of this, you can unify the types: take every interface used to access the object and create a new type that has the members of all of them. You can then either create vtables or monomorphize the calls where it is used.
At any point that analysis cannot determine the actual underlying shape, you drop to the default any.
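In TypeScript terms the unified type is just an intersection of the interfaces used to access the object; a sketch of the idea (names hypothetical):

```typescript
interface Named { name: string; }
interface Aged { age: number; }

// The unified shape a compiler could lay out as one struct,
// with both accessor functions monomorphized against it:
type NamedAndAged = Named & Aged;

function greet(p: Named): string { return "hi " + p.name; }
function birthday(p: Aged): number { return p.age + 1; }

const person: NamedAndAged = { name: "Ada", age: 36 };
greet(person);    // "hi Ada"
birthday(person); // 37
```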
Which is exactly the kind of optimization JIT compilers are able to perform; an AOT compiler can't do it safely without PGO data, and even then it can't re-optimize if the PGO happens to miss a critical path that breaks all the assumptions.
Such a TypeScript already exists: Static TypeScript.
https://makecode.com/language
Microsoft's AOT compiler for MakeCode, via C++.
The 2019 paper[1] says: “STS primitive types are treated according to JavaScript semantics. In particular, all numbers are logically IEEE 64-bit floating point, but 31-bit signed tagged integers are used where possible for performance. Implementation of operators, like addition or comparison, branch on the dynamic types of values to follow JavaScript semantics[.]”
[1]: https://www.microsoft.com/en-us/research/publication/static-...
AssemblyScript (https://www.assemblyscript.org/) is a TypeScript dialect with that distinction
It’s advertised as that, and it’s a cool project, but while it’s definitely a statically typed language that reuses TypeScript syntax, it’s not clear to me just what subset of the actual TypeScript type system is supported. That’s not necessarily bad: TypeScript itself is very unclear about what its type system actually is. I just think the tagline is misleading.
Ecmascript 4 was an attempt to add better types to the language, which sadly failed a long time ago.
It'd be nice if TS at least allowed specifying types like integer, so that some of the newer TS-aware runtimes could take advantage of the additional info, even if the main TS->JS compilation just treated `const val: int` the same as `const val: number`.
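Today the closest you can get is a branded type, which erases to a plain number but could in principle carry the hint to a TS-aware runtime (a sketch; the `int` name is hypothetical, not real TypeScript):

```typescript
// A brand the main TS->JS compilation erases entirely:
type int = number & { readonly __intBrand?: never };

function asInt(n: number): int {
  return (n | 0) as int; // truncate toward zero to a 32-bit integer
}

const val: int = asInt(5.7);     // 5
const doubled: number = val * 2; // an int is still a number: 10
```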
I wonder if a syntax like
would be acceptable.

Yeah, that is why I said TS (or something similar). TS made some decisions that made sense at the time but do not help compilation. The complexity of its type system is another problem: I'm pretty sure it is Turing-complete. That doesn't remove feasibility, but it increases the complexity of compiling it by a whole lot. When you add the fact that "the compiler is the spec," you really get bogged down. It would be much easier to recognize a sensible subset of TS. You could probably even have the type checker throw a WTFisThisGuyDoing flag and just immediately downgrade it to an any.
Because JS code can arbitrarily modify a type, any language trying to specify what the outputs of a function can be also has to be Turing complete.
There are of course still plenty of types that TS doesn't bother trying to model, but it does try to cover even funny cases like field names going from kebab-case to camelCase.
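The kebab-to-camel mapping, for instance, is expressible with a recursive template literal type, which hints at how much computation the type system can do (a sketch):

```typescript
// Recursively rewrite "a-b-c" into "aBC" at the type level:
type CamelCase<S extends string> =
  S extends `${infer Head}-${infer Tail}`
    ? `${Head}${Capitalize<CamelCase<Tail>>}`
    : S;

// These assignments only compile if the type-level computation is right:
const fontSize: CamelCase<"font-size"> = "fontSize";
const borderTopWidth: CamelCase<"border-top-width"> = "borderTopWidth";
```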
With Extractors [1] (currently at Stage 1), you could define something like this to work:
[1] https://github.com/tc39/proposal-extractors

Number is not semantically compatible with a raw 64-bit integer, so you might as well wish for a native 64-bit integer type.
The current state of the art is

Contributor to Porffor here! I actually disagree; there's quite a lot that can be improved in JS at compile time. There's been a lot of work creating static type analysis tools for JS that can do very thorough analysis; an example that comes to mind is [TAJS](https://www.brics.dk/TAJS/), although it's somewhat old.
I wonder how much performance gain you expect to achieve. For simple CPU-bound tasks, C/Rust/etc. are roughly three times as fast as V8, and Julia, which compiles full scripts and has good type analysis, is about twice as fast. There is not much room left. C/Rust/etc. can be much faster with SIMD, multi-threading, and fine control of memory layout, but an AOT JS compiler might not gain much from those.
Honestly, I’m fine with only some speed up compared to V8, it’s already pretty fast… My issue with desktop/mobile apps using web tech (JS) is mostly the install size and RAM hunger.
[raises hand] I'd be fine with no speedup at all if I can get more reasonable RAM usage and an easily linkable .so out of the deal.
In my mind, the big room for improvement is eliminating the cost to call from JS into other native languages. In node/V8 you pay a memcopy when you pass or return a string from C++ land. If an ahead-of-time compiler for JS can use escape analysis or other lifetime analysis for string or byte array data, you could make I/O, or at least writes from JavaScript to, for example, sqlite, about twice as fast.
Somewhat related to this idea is AssemblyScript https://www.assemblyscript.org
Yea, I came here to say this. Actually, I was able to transpile a few TypeScript files from my project into AssemblyScript using GPT just for fun, and it worked pretty well. If someone implemented a strict TypeScript-like linter for a subset of JavaScript and TypeScript that transpiles into AssemblyScript, I think that would work better for AOT: you could have the more critical portions of the application AOT-compiled and the non-critical parts JIT-compiled, and get the best of both worlds. Making JS both backwards compatible and AOT sounds way too complicated.
You can do inference and only fall back to Dynamic/any when something more specific can't be globally inferred in the program. For an optimization pass this is an option.
You say you have "thought about doing this"..."[but] you can't get much better performance", then describe the approach requiring things that are described first-thing, above the fold, on the site.
Did the site change? Or am I missing something? :)
Dart, maybe, but it lost