LSP is pretty ok. Better than the before times I suppose. Although C++ has long had good IDE support so it hasn't affected me too much.
I have a maybe wrong and bad opinion that LSP is actually at the wrong level. Right now every language needs to implement its LSP server from scratch. These implementations are HUGE and take YEARS to develop. rust-analyzer is over 365,000 lines of code. And every language has its own massive, independent implementation.
When it comes to debugging, all native languages support common debug symbol formats: PDB for Windows and DWARF for *nix-y things. Any compiled language that uses LLVM gets debug symbols and rich debugging "for free".
I think there should be a common Intellisense Database file format for providing LSP or LSP-like capabilities. Ok sure, there will still be per-language work to be done to implement the IDB format. But you'd get like 95% of the implementation for free for any LLVM language. And generating a common IDB format should be a lot simpler than implementing a kajillion LSP servers.
My dream world has a support file that contains: full debug symbols, full source code, and full intellisense data. It should be trivial to debug old binaries with full debugging, source, and intellisense. This world could exist and is within reach!
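To make that concrete, here's a rough sketch of what one record in such a hypothetical IDB might hold. None of this exists; the names and fields are just mine for illustration:

```rust
// Purely illustrative: an "IDB" format like this doesn't exist yet,
// and these names are made up.
struct IdbSymbol {
    name: String,                // fully qualified symbol name
    kind: SymbolKind,            // function, type, field, ...
    type_signature: String,      // the resolved type, as the compiler saw it
    definition: SourceSpan,      // where goto-definition should land
    references: Vec<SourceSpan>, // every use site, for find-references
}

enum SymbolKind {
    Function,
    Type,
    Field,
    Variable,
}

struct SourceSpan {
    file: String, // path into the embedded source snapshot
    line: u32,
    column: u32,
}
```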
That has nothing to do with LSP.
Rust Analyzer is similar in scope to a second implementation of the Rust compiler.
I know. That’s really bad!
I disagree. A compiler for batch building programs and a compiler for providing as much semantic information about incomplete/incorrect/constantly changing programs are completely different tasks that require completely different architectures and design considerations.
I don’t think that’s true at all.
First of all, a compiler for a 100% correct program definitely has all the necessary information for robust intellisense. Compilers don't currently save all of that data, but it exists.
So the only real question is whether they can support the 0.01% of files that are incomplete and changing?
I’ll readily admit I am not a compiler expert. So I’m open to being wrong. But I certainly don’t see why not. Compilers already need to support incorrect code so they can print helpful error messages, including different errors spread throughout a single file.
It may be that current compilers are badly architected for incremental intellisense generation. But I don’t think that’s an intrinsic difference. I see no reason that the tasks require “completely different architectures”.
It doesn't. Intellisense is supposed to work on 100% incorrect and incomplete programs. To the point that it should work in syntactically invalid code.
Correct. I literally discussed this scenario in my comment!
If the program compiles successfully then the compiler has all the information it needs for intellisense. If the program does NOT fully compile then the compiler may or may not be able to emit sufficient intellisense information. I assert that compilers should be able to support this common scenario. It is not particularly different from needing to support good, clear error messages in the face of syntactically invalid code.
Not necessarily. These are two very different tasks quite at odds with each other.
Are they? I feel like intellisense is largely a subset of what a compiler already has to do.
I’d say the key features of an LSP are knowing the exact type of all symbols, goto definition, and auto-complete. The compiler has all of that information.
Compilers produce debug symbols which include some of the information you need for intellisense. I wrote a PDB-based LSP server that can goto definition on any function call for any language. Worked surprisingly well.
If you wanted to argue that intellisense is a subset of compiling and it can be done faster and more efficiently I could buy that argument. But if you’re going to declare the tasks are at odds with one another I’d love to hear specific details!
On the efficiency angle, I think a big difficulty here that isn’t often discussed is that many optimization strategies relevant to incremental compilation slow down batch compilation, and vice versa.
For example, arena allocation strategies (i.e. interning of identifiers and strings, as well as allocating AST nodes, etc.) are a very effective optimization in batch compilers, as the arenas can live until the end of execution and therefore don’t need “hands on” memory management.
However, this doesn’t work in an incremental environment, as you would quickly fill up the arenas with intermediary data and never be deleting anything from them. This is one reason rust-analyzer reimplements such a vast amount of the rust compiler, which makes heavy use of arenas throughout.
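A toy version of the pattern (not rustc's actual interner, just the shape): interned strings go into storage that is never freed, which is perfect for a batch compiler and exactly what you can't afford in a long-running language server.

```rust
use std::collections::HashMap;

/// Toy batch-compiler-style string interner: interned strings live in
/// `storage` until the process exits. In a long-lived IDE process, every
/// edit mints new identifiers, so this table only ever grows.
#[derive(Default)]
struct Interner {
    map: HashMap<String, u32>,
    storage: Vec<String>,
}

impl Interner {
    fn intern(&mut self, s: &str) -> u32 {
        if let Some(&id) = self.map.get(s) {
            return id;
        }
        let id = self.storage.len() as u32;
        self.storage.push(s.to_owned());
        self.map.insert(s.to_owned(), id);
        id
    }

    fn resolve(&self, id: u32) -> &str {
        &self.storage[id as usize]
    }
}
```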
As essentially every programming language developer writes their batch compiler first without worrying about incremental compilation, they can wind up stuck in a situation where there’s simply no way to reuse their existing compiler code for an IDE efficiently. This effect tends to scale with how clever/well-optimized the batch compiler implementation is.
I think the future definitely lies in compilers written to be “incremental first,” but this requires a major shift in mindset, as well as accepting significantly worse performance for batch compilation. It also further complicates the already very complicated task of writing compilers, especially for first-time language designers.
That's a great point about allocation/memory management. As an example, rust-analyzer needs to free memory, but rustc's `free` is simply `std::process::exit`.
If I remember correctly, the new trait solver's interner is a trait (https://doc.rust-lang.org/nightly/nightly-rustc/rustc_trait_...) that should allow rust-analyzer's implementation of it to free memory over time and not OOM people's machines.
I'm in strong agreement with you, but I will say: I've really grown to love query-based approaches to compiler-shaped problems. Makes some really tricky cache/state issues go away.
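A stripped-down sketch of the shape I mean (nothing like salsa's real machinery, and the query here is invented): every derived fact is a memoized query over inputs, and changing an input invalidates the cached results that depend on it.

```rust
use std::collections::HashMap;

/// Minimal, hypothetical query database: one input (file text) and one
/// derived query (line count). Real systems track dependencies automatically;
/// here invalidation is done by hand just to show the idea.
#[derive(Default)]
struct Db {
    files: HashMap<String, String>,      // input: path -> source text
    line_counts: HashMap<String, usize>, // memoized query results
}

impl Db {
    fn set_file(&mut self, path: &str, text: String) {
        self.files.insert(path.to_owned(), text);
        self.line_counts.remove(path); // input changed: drop the stale result
    }

    fn line_count(&mut self, path: &str) -> usize {
        if let Some(&n) = self.line_counts.get(path) {
            return n; // input unchanged: reuse the cached answer
        }
        let n = self.files.get(path).map_or(0, |t| t.lines().count());
        self.line_counts.insert(path.to_owned(), n);
        n
    }
}
```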
I thought that rust's compiler was indeed written to be incremental first. Check a sibling comment of mine for reasons why I thought so.
They are distinct! Well, not just intellisense, but pretty much everything. I'll paraphrase this blog post, but the best way to think about the difference between a traditional compiler and an IDE is that compilers are top-down (e.g., you start compiling a program from a compilation unit's entrypoint, a `lib.rs` or `main.rs` in Rust), but IDEs are cursor-centric: they're trying to compile/analyze the minimal amount of code necessary to understand the program. After all, the best way to go fast is to avoid unnecessary work!
Beyond the philosophical/architectural difference I mentioned above, compilers typically have a one-way mapping from syntax to semantics, but to support things like refactors or assists, you often need to go the opposite way: from semantics to syntax. For instance, if you want to refactor a struct into an enum, you often need to find all instances of said struct, make the semantic change, then construct the new syntax tree from the semantics. For simple transformations like a struct to an enum, a purely syntax-based approach might work (albeit at the cost of accuracy if you have two structs with the same name), but you start to run into issues when you consider traits, interfaces (for example: think about how a type implements an interface in Go!), or generics.
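As a tiny, made-up illustration of that struct-to-enum case (the "before" and "after" an assist would have to produce):

```rust
// Before: a plain struct and one use site.
struct Shape {
    radius: f64,
}

fn area(s: &Shape) -> f64 {
    std::f64::consts::PI * s.radius * s.radius
}

// After "turn Shape into an enum": the assist has to go from the semantic
// change back to syntax, rewriting the definition *and* every use site.
enum ShapeV2 {
    Circle { radius: f64 },
    Square { side: f64 },
}

fn area_v2(s: &ShapeV2) -> f64 {
    match s {
        ShapeV2::Circle { radius } => std::f64::consts::PI * radius * radius,
        ShapeV2::Square { side } => side * side,
    }
}
```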
It doesn't really make sense for a compiler to support the above use cases, but they are _foundational_ to an IDE. However, if a compiler is query-centric (as rustc is), then it's pretty feasible for rustc and rust-analyzer to share, for instance, the trait solver or the borrow checker (we're planning/scoping work on the former right now).
Nonsense. Given that the end user of both is a human, you want the compiler that builds the program to know as much as possible about semantics, to aid in fixing buggy/incomplete/incorrect programs.
Other comments have addressed many aspects of your comment. The "constantly changing" part is also important because it makes recompilation more efficient than recompiling from scratch each time. You can read about it here: https://rustc-dev-guide.rust-lang.org/queries/query-evaluati...
There is a recording of a talk on YouTube from Niko Matsakis that goes into the motivation.
In conclusion, you don't really want to optimize for the batch use case, even outside of IDE support.
No. Actually, "interactive" frontends generally have better error messages in batch compilation mode too. Yes, it may make batch compilation (the frontend part) slightly slower, but it won't turn Go into Rust (or Haskell or C++).
And there is always the possibility to stop in batch mode when the first error occurs.
I would blame Rust though. For example, Rust has macros which are way too powerful and make it very hard to write an LSP server (https://rust-analyzer.github.io/blog/2021/11/21/ides-and-mac...)
Very interesting is how Roslyn/Typescript does it: https://www.youtube.com/watch?v=qnyOHY7AiZk
Rust-analyzer is an example of what not to do, which is reimplementing a compiler frontend. Ideally it should be the same as what the "real" compiler is using. Of course this has its own problems, as the Haskell LSP this post is about shows, since compilers aren't written to be used "interactively".
That doesn't hold for C++ and much less for any language even "less C" than C++. Like languages using a GC, e.g. Roc https://www.roc-lang.org/
What do you mean? Why not? Clang PDBs for C++ work great. A GC isn’t particularly disruptive to debug symbols afaik.
You need support for each language in the debugger, as symbols do not contain semantics. As users we, for example, know that `foo::bar` and `foo::baz` are methods of the same class `foo`; the debugger doesn't.
The problem with GCs is that pointers must contain some additional information (like an extra bit for mark and sweep); they are not "just" pointing to some memory. Without "knowing" that, the debugger cannot follow the pointer to its target. Or tricks with unboxed ints, like making them 1 bit smaller and using the freed-up bit as a tag for "this is not a pointer, but an integer".
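Roughly like this (a made-up tagging scheme, just to illustrate the unboxed-int trick): the low bit says whether a word is an integer or a pointer, and a debugger that doesn't know the convention can't tell which one it's looking at.

```rust
/// Hypothetical tagged value as a GC'd runtime might represent it:
/// low bit 1 = unboxed integer (shifted left by one),
/// low bit 0 = heap pointer (fine, since heap objects are at least 2-byte aligned).
#[derive(Clone, Copy)]
struct Value(usize);

impl Value {
    fn from_int(n: isize) -> Value {
        Value(((n << 1) as usize) | 1) // costs one bit of integer range
    }

    fn from_ptr(p: *const u8) -> Value {
        Value(p as usize) // low bit is already 0 for aligned pointers
    }

    fn as_int(self) -> Option<isize> {
        if self.0 & 1 == 1 {
            Some((self.0 as isize) >> 1) // arithmetic shift restores the sign
        } else {
            None
        }
    }

    fn as_ptr(self) -> Option<*const u8> {
        if self.0 & 1 == 0 {
            Some(self.0 as *const u8)
        } else {
            None
        }
    }
}
```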
1. I use zero languages that use either PDB or DWARF that are not named "C".
2. You are either overestimating the level of detail available in PDB/DWARF or underestimating the massive amount of language-specific work needed for even basic features (e.g. methods, which lack any cross-language ABI) given just what PDB/DWARF give you.
3. What LSP provides and what PDB/DWARF offer are only very loosely related. Consider the case of writing function1, then (without compiling) writing function2 that calls function1. It is typical for an LSP to offer completion and argument information when writing out the call for function1. That's not something you get "for free" with PDB/DWARF.
Uhhh. I didn’t say PDB/DWARF already have the necessary information. In fact I even proposed a new file format! I suggest you re-read what I said.
What do you think LSP servers do in the background? They’re effectively compilers that are CONSTANTLY compiling the code.
Amusingly, rust-analyzer takes longer to bootstrap than a full clean-and-build. Maybe it’s not as parallel, I’m not sure.
I've responded on reddit before (https://www.reddit.com/r/rust/comments/1eqqwa7/comment/lhwwn...), but I'll restate and cover some other things here.
rust-analyzer is a big codebase, but it's also less problematic than the raw numbers would make you think. rust-analyzer has a bunch of advanced functionality (term search https://github.com/rust-lang/rust-analyzer/pull/16092 and refactors), assists (nearly 20% of rust-analyzer!), and tests.
I think you might be describing formats like LSIF (https://code.visualstudio.com/blogs/2019/02/19/lsif) and SCIP (https://github.com/sourcegraph/scip). I personally like SCIP a bit more than LSIF because SCIP's design makes it substantially easier to incrementally update a large index. We use SCIP with Glean (https://glean.software/) at work; it's pretty nice.
I wouldn't say 95%. SCIP/LSIF can do the job for navigation, but that's only a subset of what you want from an IDE. For example:
- Intellisense/autocomplete is extremely latency sensitive, where milliseconds count. If you have features like Rust/Haskell's traits/typeclasses that allow writing blanket implementations like `impl<T> SomeTrait for T`, it's often faster to solve that trait bound on-the-fly than to store/persist that data.
- It'd be nice to handle features like refactors/assists/lightbulbs. That's going to result in a bunch of de novo code that needs to exist outside of a standard compiler, not counting all the supporting infrastructure.
Rust tried something similar in 2017 with the Rust Language Server (RLS, https://github.com/rust-lang/rls). It worked, but most people found it too slow because it was invoking a batch compiler on every keystroke.
That sounds similar to LSIF https://microsoft.github.io/language-server-protocol/specifi...
"any LLVM language" is a lot but also not that much. You're missing Python, JS, Go, Ruby, etc.