
JIT WireGuard

akira2501
16 replies
10h51m

I always felt the disappointment of wireguard was wrapping it up into an opinionated network interface. It really should have been a generic "filter" that you could attach to any type of file handle. Then the configuration would be far less strongly coupled, less weirdly communicated to the kernel, and the status of your connection more immediately obvious.

Plus, you could have wireguard files on your local or remote filesystems, or any character device, or named pipes if you felt like it. You could use a "jit" daemon to build tap or other interfaces for you, or just do it individually at the application layer. You could have pre registered keys with the kernel, or you could manage that directly, or generate them randomly.

It's always been a weird smelling underspecified IPSEC clone to me, when it could have been so much more.

d-z-m
10 replies
10h0m

I always felt the disappointment of wireguard was wrapping it up into an opinionated network interface.

Why? WireGuard is a VPN, it's pretty normal for VPN solutions to expose themselves as a network interface.

It really should have been a generic "filter" that you could attach to any type of file handle.

What's the use-case you had in mind here? I'm not sure how generifying it to a "filter" on any type of file descriptor looks for an interactive protocol like wireguard.

It's always been a weird smelling underspecified IPSEC clone to me[...]

Just because there isn't an RFC? I've always found the wireguard paper[0] to be quite readable and thorough in its specification of the protocol.

[0]:https://www.wireguard.com/papers/wireguard.pdf

justsomehnguy
9 replies
9h47m

Not the OP, but the main problem with WireGuard[0] is not the protocol[1], it's good, but the opinionated tooling around it, be it .INI style configurations (god I hate it), mutual incompatibility of wg and wg-quick or just outright stupid decisions around storing config files and interacting with the user in the Windows client.

Though there are some nuances with the routing selection/filtering too, which gets troublesome when you just need a pipe and run a proper routing protocols over it. ::/0 solves most of it but still there are some rough edges.

[0] well, for *me*

[1] One of the amusing things I discovered is that I get a full 10MB/s+ to the SMB server in the DC over the WireGuard tunnel (and that's because it's a 100Mbit/s uplink), while the Synology, which sits on the same router on a 1Gbit port, only makes 3-5MB/s.

vbezhenar
7 replies
7h33m

wg is a low-level binary, wg-quick is a high-level script wrapper. They're not supposed to have any compatibility at all. You can build any kind of high-level wrapper for wg. One of those wrappers is NetworkManager, for example.

My issue with wireguard is that it's not enough for a full-fledged VPN solution; for example, there's no way to push routes to the client or DNS configuration or something like that. Those are very basic needs. If you have 100 users and you decide to change the routing scheme, well, you're in trouble. It is supposed to be solved by higher-level protocols, but I'm not aware of any open de facto standard ones with quality implementations.

tuetuopay
2 replies
6h58m

nothing prevents you from running bgp over wireguard for route exchange. and there are many quality bgp daemons available*

(* for linux and bsd)

vbezhenar
1 replies
4h34m

How do I configure my iPhone with BGP routes? Write my own app for VPN? Android, Windows? Linux users who have no idea what BGP is? That won't work, if you're small.

tuetuopay
0 replies
1h51m

huh yeah, too focused on my infra side of things; mobile vpn is another can of worms.

though I don't know what would prevent any bgp daemon from running in e.g. the wireguard iOS app? there are bgp daemons like gobgp that can be easily integrated in other software.

but this was more meant to be a joke than anything. wireguard is batteries definitely not included, and is why tailscale and the like do exist.

LeBit
2 replies
6h34m

You mean "dynamically" push routes and DNS configurations while the tunnel is already up?

Because you can definitely configure routes and DNS at connection time.

vbezhenar
0 replies
4h35m

I'm generally comparing it with OpenVPN, which allows you to do all that.

ta1243
0 replies
4h39m

What you'd normally think of as Wireguard allows routes at connection time sure, however OP wants a VPN which allows peer B ("server") to define a route and advertise that route to peer A ("client"). So one day the client would route 10.1.0.0/24 down the wireguard tunnel, but not 10.2.0.0/24, the next day however from changing peer B, the config on peer A would change.
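The 10.1.0.0/24 versus 10.2.0.0/24 example comes down to which prefixes peer A lists as allowed IPs for peer B. Here is a minimal sketch of that lookup, assuming longest-prefix matching over a per-peer prefix list; the `DAY1`/`DAY2` tables, the `peerB` label and `select_peer` are hypothetical names for illustration, not part of any WireGuard tool.

```python
import ipaddress

# Hypothetical view of peer A's config on two different days:
# a label for peer B's public key mapped to its AllowedIPs.
DAY1 = {"peerB": ["10.1.0.0/24"]}
DAY2 = {"peerB": ["10.1.0.0/24", "10.2.0.0/24"]}

def select_peer(dest, peers):
    """Return the peer whose allowed prefixes contain dest (longest prefix wins)."""
    addr = ipaddress.ip_address(dest)
    best_key, best_len = None, -1
    for key, cidrs in peers.items():
        for cidr in cidrs:
            net = ipaddress.ip_network(cidr)
            # Only compare addresses and networks of the same IP version.
            if addr.version == net.version and addr in net and net.prefixlen > best_len:
                best_key, best_len = key, net.prefixlen
    return best_key  # None means: not sent down the tunnel at all
```

With `DAY1`, a packet for 10.2.0.7 has no matching peer; with `DAY2` it goes to peer B. The point of the comment stands: something outside WireGuard has to change that table on peer A.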

Obviously there are many things you could do to allow this (run a routing protocol, build a custom client which gets route information, etc), but the "out of the box" wireguard is a kernel interface, a wg command, and a utility script (wg-quick). I think there are some gui based clients for non-linux based OSes, but it's the same principle.

DNS is nothing to do with the wireguard kernel or userspace, it's configured in the "wg-quick script" (there's a bash function called set_dns), but you can do that however you want.

Wireguard alone isn't what an enterprise would consider to be a "VPN solution", it doesn't push configs from a central location, it's very much a peer-to-peer tool. You can build "enterprise" features like centrally defined routes or DNS on top of that, or not, it's not opinionated.

justsomehnguy
0 replies
6h33m

there's no way to push routes to the client or DNS configuration

Yep, this one too.

aragilar
0 replies
8h14m

You don't ever need to use the configuration files: netlink on Linux, or the cross-platform interface documented at https://www.wireguard.com/xplatform/, means you can write your own tooling around an existing interface.

The cryptokey routing is pretty fundamental to wireguard, I'm not sure you could have one without the other.

mellutussa
2 replies
10h46m

But why not use the nice smelling IPSEC if that ticks your boxes?

akira2501
1 replies
10h36m

It doesn't. It just foresaw the need to be able to dynamically configure tunnels on first connection and specified all of that. Which seems to me is a lot of what fly io has just mostly reimplemented here.

In any case the point is I would prefer to just have the basic components available and let me piece them together however I want. Mostly to allow using the underlying technology in more contexts than it is currently available in.

medstrom
0 replies
7h24m

Heh, it really sounds like your needs would be better served with IPSec or something. WireGuard was born precisely because they saw that the whole problem making other existing solutions difficult to audit and insecure-in-practice was their thousand ways to configure. So they did the opposite. Low lines of code, few possibilities.

In software you often choose between a small monolith and a big kitchen sink. Once you have 1 more need than the monolith covers, you have to go over to the kitchen sink.

api
1 replies
8h23m

Wireguard is more or less just Noise (IK I think) for IP packets. It wouldn’t make much sense for files since there is no counterparty. File encryption is done differently.

Edit: http://noiseprotocol.org/

actionfromafar
0 replies
7h4m

Aren’t encrypted files like noise?

irjustin
15 replies
11h38m

For everyone else, I'll shamelessly use this to plug Netmaker[0].

Not affiliated, just a satisfied guy that needs access to private AWS VPCs across multiple accounts and would love to see them be more widely adopted.

[0] https://www.netmaker.io/

denkmoon
7 replies
11h16m

Can’t you do that “AWS native” with private link or vpc peering? I’m a noob with these so I don’t understand the benefit of netmaker

irjustin
6 replies
11h4m

The goal isn't to make the networks seem like one and connect resources across accounts, which is what those products do.

My goal is to access private resources via SSH bastion/jump machines in a specific account. There's a few ways to do this in AWS, but all of them are more costly by a pretty wide margin.

seany
2 replies
10h45m

You can forward ssh through ssm, and dump that into your ssh config file. Works pretty nicely with some of the sso automation for the cli that's around these days.

rmccue
1 replies
8h34m

Is this using SSM’s raw port forwarding support? From what I’ve seen, their protocols seem to lack binary safety (we get weird encoding issues).

zbentley
0 replies
2h54m

Without knowing the specifics of your situation, I would slightly suspect client/configuration if you’re encountering encoding issues. In my experience, integrating ssh and ssm is quite stable (provided you’re using OpenSSH and not a specific language client’s own implementation of the protocol).

poxrud
0 replies
26m

This is the best way to connect to your instances. However you still need the SSM agent installed and the right IAM permissions.

vasco
0 replies
10h52m

AWS VPN is pretty cost effective, we used it for a few years with a multi account setup. And it's pretty much zero work.

porker
4 replies
11h26m

Is Netmaker like Tailscale? From their site I'm unclear what the distinguishing factor is.

thunfischbrot
3 replies
11h19m

Roughly, yes. Netmaker has a self-hostable server though. With Tailscale, of course, the 3rd-party headscale is available. Netbird also seems promising. See https://github.com/cedrickchee/awesome-wireguard for more alternatives.

computeinbrain
2 replies
7h24m

Yes. Netbird is very recommendable: https://netbird.io/

AnarchismIsCool
0 replies
3h55m

Using it in prod right now, not a fan. It will gleefully create a root ssh tunnel for you through its daemon if the box on the website is checked.

remram
0 replies
2h33m

What is it, a generic VPN platform? Similar to Tailscale etc?

Their site is extremely vague.

mtmk
0 replies
9h10m

Thanks, didn't know about this. I guess Netmaker (or similar) manage the keys for you which would make the admin a lot easier. In a previous job we setup and managed wg across a few Windows and Linux boxes using Ansible. It was OK but was getting a little messy in the end.

dpeckett
9 replies
8h22m

Might as well take the opportunity to shill one of my recent experimental projects, If you are interested in building Go apps that act as userspace WireGuard peers take a look at https://github.com/dpeckett/noisysockets

Based on the excellent work done by wireguard-go, but I've attempted to simplify and make things a lot more idiomatic for library use.

I reckon building a service mesh out of it would be interesting, obviously supporting multiple languages would be hard but maybe you could implement a sockets API. Though it might be hard to compete performance wise with mTLS as I've not seen HW acceleration for WireGuard's crypto yet.

FWIW, I'm currently on the market for freelancing roles, so if you're interested in Golang freelancers in the high-speed/secure networking space, please reach out (email is in my profile).

bscphil
4 replies
5h36m

This looks great, nice work!

I have a dream of taking a userspace Wireguard project like this and gluing PAKE over a relay in front of it to exchange Wireguard keys, followed by holepunching and establishing a direct tunnel. Basically Magic Wormhole for arbitrary tunnels - and hopefully vastly improved performance for the file transfer case as well, rather than crapping out at 20-30 MB/s over long fat networks.

xyzzy_plugh
3 replies
4h29m

You're basically describing Tailscale.

mbreese
1 replies
1h14m

Maybe, but I read it as wormholes at the application level. Which would be awesome for me.

I do a lot of work on remote servers for heavy data processing (large files). Sometimes I need to get these files back to my local computer, or just view figures I’ve generated using remote data. I have a small reverse tunnel daemon setup where I can (in my remote terminal) send arbitrary files or data back to my local computer.

    $ rtun send file.txt
    $ rtun view data.pdf

It is a daemon that listens locally and when I setup SSH, a reverse tunnel is configured from one Unix socket to another. These are multiuser servers and Unix sockets handle authorization easily. The same setup could be handled similarly with a zmodem like process and aware terminal.

However, this setup is a bit cumbersome and I can’t just run `ssh host`. If I could have the same setup with an in-app wireguard tunnel, with no other setup, it would be amazing.

dpeckett
0 replies
49m

Hmm interesting usecase, I know a lot of datascience folks are fans of SSHFS but it sounds like you have kind of got an inversion of control thing going on. Is there a reason you aren't mounting the remote filesystem on your local machine and manipulating files/data via that?

I've been looking for interesting applications for noisysockets, and I'm obsessed with network attached storage so this is a fun one for me.

ikiris
0 replies
45m

I for one would appreciate a tailscale that actually does proper networking instead of their weird static routes but not thing.

gtirloni
3 replies
5h35m

Pretty good. Is Noisy Transport something similar to Slack's Nebula [0] in a way or am I mixing things up?

0 - https://github.com/slackhq/nebula

dpeckett
2 replies
5h2m

There's some similarities and differences, instead of going through a TUN/TAP device each application communicates directly to the mesh with its own userspace TCP/IP stack (and also retains compatibility with wireguard).

johnmaguire
1 replies
31m

(I am a Nebula maintainer.) We recently merged support for gVisor-based services, although it's very new, and I don't know of much experimentation that's been done with it yet: https://github.com/slackhq/nebula/pull/965

dpeckett
0 replies
1m

Really cool! I guess that there's little differences other than noisysockets is explicitly WireGuard compatible (but I actually prefer Nebula's tweaks to the wire protocol).

I actually had some folks from OpenZiti reach out and they have another similar tool in this space.

protoman3000
7 replies
10h8m

The problem you quickly run into to build this design is that Linux kernel WireGuard doesn’t have a feature for installing peers on demand.

I don't seem to understand. You can add peers at runtime, e.g.

https://serverfault.com/questions/1101002/wireguard-client-a...

Can somebody clarify?

EDIT: If I understand correctly, that step is already too late. They want to authenticate a peer before adding it to the interface in order to prevent stale entries on the interface.

They thus put a eBPF filter in front of the interface and do the cryptokey-routing based association to an authorized counterpart by themselves. If it checks out, then they add the peer to the interface and remove it after a timeout.
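The "add the peer, then remove it after a timeout" part can be sketched as a small table with a reaper. This is only an illustration under assumed names (`PeerTable`, `touch`, `reap`) and an assumed TTL; the actual bookkeeping on the gateways is not described here.

```python
class PeerTable:
    """Tracks which peers are currently installed and when they were last seen."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.installed = {}  # public key -> last-seen timestamp (seconds)

    def touch(self, pubkey, now):
        """Install the peer if needed and record activity at time `now`."""
        self.installed[pubkey] = now

    def reap(self, now):
        """Remove peers idle for longer than the TTL and return their keys."""
        stale = [k for k, seen in self.installed.items() if now - seen > self.ttl]
        for k in stale:
            del self.installed[k]
        return stale
```

Passing `now` in explicitly keeps the sketch deterministic; a real daemon would read a clock and also remove the peer from the interface when reaping.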

api
5 replies
8h16m

Seems to me they did this to avoid the alternative of running WG in user space. They wanted a feature the Linux kernel didn’t have to route by cryptographic address first but without leaving the kernel so they hacked it in.???

JIT Wireguard is a weird way to frame this. My mind went to “why? The performance bottleneck is the crypto and per client JIT won’t help with that.”

I would have just gone user space. Use something like tokio-uring or glommio to get the performance. If you keep going in the kernel you are going to keep hitting limitations because Linux is not built to serve millions and millions of active tunnels. Even doing millions of TCP connections per kernel gets hairy sometimes.

Every limitation will require a hack. Every hack will be some system config that has to be applied and managed. The tool chains for provisioning Linux metal boxes are vastly inferior to the tooling for developing apps and services and managing their config.

Or am I stupid and misunderstanding?

vidarh
4 replies
7h1m

It does not seem like they need huge numbers of active tunnels per gateway.

And JIT just as in "just in time" configuration of WireGuard. Once the configuration has been done, their stack stays out of it.

api
3 replies
6h44m

Ahh. In that case they are using the term JIT weirdly. Usually that means just in time compilation of script or byte code to machine code.

NoahKAndrews
1 replies
6h7m

The phrase "just-in-time" can be used for other things besides compilation (it's often used for manufacturing, for instance). I think it's a helpful way to describe lots of things, and that we shouldn't try to limit its usage in tech to just compilation.

gtirloni
0 replies
5h38m

Exactly. The very first time I heard about "JIT" was in the context of manufacturing. The Toyota Production System [0].

I think JIT compilation wasn't popular in ancient times, so I never associated JIT with compilation by default.

0 - https://en.wikipedia.org/wiki/Toyota_Production_System

cchance
0 replies
2h55m

JIT compiling is what you're most used to the term being used with, but JIT has been around in other fields for longer, and just means what it says... Just In Time :)

tptacek
0 replies
3h48m

What you want, and what in the medium term I think Jason plans to provide, is a Netlink API from kernel WireGuard that just gives you a feed of all the public keys the kernel has seen from initiator messages. With that feed, you wouldn't have to install a single WireGuard peer in advance. They could all just live in a SQLite database (or something), and get installed on-demand as clients try to connect with them.
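A rough sketch of that flow, assuming a hypothetical `peers` table and hypothetical helper names: given a public key seen in an initiation and the packet's source address, look the peer up and build the `wg set` arguments that would install it. The keys and addresses are placeholders, and nothing here runs `wg`; it only builds the argument list.

```python
import sqlite3

def open_peer_db():
    """Create an in-memory peer store (a stand-in for a real database file)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE peers (pubkey TEXT PRIMARY KEY, allowed_ip TEXT NOT NULL)")
    return db

def format_endpoint(ip, port):
    """wg expects IPv6 endpoints in brackets, e.g. [2001:db8::1]:51820."""
    return f"[{ip}]:{port}" if ":" in ip else f"{ip}:{port}"

def on_seen_pubkey(db, interface, pubkey, src_ip, src_port):
    """Return the `wg set` argv that installs the peer, or None if it is unknown."""
    row = db.execute(
        "SELECT allowed_ip FROM peers WHERE pubkey = ?", (pubkey,)
    ).fetchone()
    if row is None:
        return None  # unknown key: install nothing
    return [
        "wg", "set", interface,
        "peer", pubkey,
        "allowed-ips", row[0],
        "endpoint", format_endpoint(src_ip, src_port),
    ]
```

Setting `endpoint` to the source address of the initiation is what would let the gateway start the handshake back toward the client once the peer exists.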

If you're a VPN provider (for instance), the current API is a little clunky. It's not just that at any given time only a small fraction of your peers are actually in use, though that is probably true. It's also that as the number of peers you handle scales, from hundreds of thousands to tens of millions, you lose the ability to store them all in a single instance of the kernel at all; there are just too many. If peers have to be pre-installed, a consequence of that is that they get locked to specific server machines.

But, as the post points out, you can get a facsimile of the interface you need today with simple packet capture. And Jason set the API up so that you can --- very easily --- flip the initiation from server to client so the connection experience is seamless, even though the kernel dropped the first initiation message (because the peer hadn't been installed).

So that's the idea here.

Jann Horn pointed out that we could have taken this a step farther: we could have held on to the initiation packet that we captured, and, once the peer was installed, replayed the packet into the kernel. Which is also a neat idea.

I don't think there's much in this post that is going to change your life. It's just a couple of neat tricks that we thought people would like to know about.

(Though: the next step for us is to build on this to get "floating peers", de-regionalizing them completely, so users no longer have to think about what region their peers are configured in, which I think will actually have a product benefit for users, unlike this, which has primarily nerd benefit.)

mdavid626
7 replies
7h4m

What does this mean?

“We’ve gone a step beyond that: every time you run flyctl, our lovable, sprawling CLI, it conjures a TCP/IP stack out of thin air, with its own IPv6 address, and speaks directly to Fly Machines running on our networks.”

makeworld
2 replies
6h10m

They're saying Wireguard is used.

Spivak
1 replies
5h3m

More than that. They're saying that they're running a userspace implementation of both the tcp/ip stack and wireguard. Your machine isn't a peer to the wireguard tunnel, only flyctl is.

dementik
0 replies
26m

Of course, your machine can be a peer also. You can create peers to your organization with `flyctl wireguard create`.

freedomben
1 replies
4h25m

This basically means they are using userspace Wireguard, such as the go implementation. That is as opposed to the in-kernel wireguard.

The reason to mention conjuring a TCP IP stack out of thin air, is that most of the time the operating system is providing the tcpip stack as part of the kernel. With wireguard go, the tcpip stack is running in user space, meaning it can be created in a normal user space process such as the flyctl command line interface. This is indeed quite magical for people who have been around for many years as a usable in-process userspace tcpip stack is a relatively new and a novel thing.

cchance
0 replies
2h52m

I always felt weird about this, i still cant really wrap my head around userspace tcpip stacks, it just feels... weird

ape4
0 replies
4h39m

It's hard for me to think of a sprawling CLI that I love.

tinco
6 replies
9h42m

While I generally agree with the idea that a direct HTTP request for a single point to point message would be more reliable than routing through a message queue, I'm a little bit surprised that there would be so many messages lost by NATS it had significant impact on their services.

Wouldn't a lost message just mean NATS would retry delivery until successful? Anyone know why they would experience noticeable unreliability?

dilyevsky
3 replies
9h31m

Wouldn't a lost message just mean NATS would retry delivery until successful? Anyone know why they would experience noticeable unreliability?

I believe if you're using core nats (not JetStream) there's no option for re-delivering like at all.
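To make the distinction concrete, here is a generic sketch of fire-and-forget versus acknowledge-and-retry delivery. It does not use the NATS client API; `send` is a hypothetical callable that returns True when the receiver acknowledged the message.

```python
def deliver_at_most_once(message, send):
    """Fire and forget: one attempt, the result is ignored, no retry."""
    send(message)

def deliver_at_least_once(message, send, max_attempts=5):
    """Retry until `send` reports an acknowledgement or attempts run out.

    Returns the attempt number that succeeded, or None if all attempts failed.
    """
    for attempt in range(1, max_attempts + 1):
        if send(message):
            return attempt
    return None
```

In the first mode a dropped message is simply gone and the sender never finds out; the second mode needs acknowledgements and some state on the sending side.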

tinco
2 replies
9h23m

Well if that's what they did, then using NATS over HTTP in that scenario is just switching a single point of failure for two single points of failure with no feedback on the second point. Did they just pick it for the convenience of the NATS interface and only later realize their mistake?

emmanueloga_
1 replies
7h32m

I’m guessing they realized late that core nats is a message broker without features like durability and at least once delivery. Jetstream was released less than a year ago (in nats 2.2) so perhaps by the time they made the switch it was not an option.

caleblloyd
0 replies
7h7m

NATS 2.2 with initial JetStream support was actually released 3 years ago.

tptacek
0 replies
1h27m

We're not dunking on NATS. We were probably holding it wrong. But it turns out, we didn't need it; a message layer wasn't really making things any more expressive, just harder to test and monitor.

klabb3
0 replies
6h19m

I am also very curious to know more. I’m sure the NATS maintainers are as well. Their architecture is extremely intuitive and appealing to me, so I wonder where things went south. Nats has a lot of tunable parameters with Jetstream. For instance, an in-memory stream with a time-based duplicate detection window, both push and pull semantics, and configurable re-delivery and ack policies.

The one thing where I can see an impedance mismatch is ephemeral single-message connections which I don’t think it’s built for. In either case, more details would be invaluable. (Fly folks, please consider sharing!)

sneak
6 replies
6h30m

Fun fact: the WireGuard macOS client application cannot work on macOS unless it's distributed via the App Store - Apple simply will not provide the required VPN entitlements for web/self distributed apps. You can use the commandline wg tools (which use a different OS API) but not the GUI ones.

This means that WireGuard-the-org can't distribute the mac GUI app directly, and if you want to use the official WireGuard app (eg for accessing a VPN for privacy), somewhat ironically you have to ID yourself (which requires a phone number) and your computer's hardware serial to Apple first to download it via the App Store. It smacks of the totalitarian rms "right to read" essay - provide strong ID and hardware unique identifiers to be allowed to download and use privacy software on your own computer.

https://www.wireguard.com/install/

While not directly relevant to fly.io, I figured this might be relevant in the context of Apple's other anticompetitive actions this week related to web-based native app distribution on iOS in the EU for the DMA. This issue with VPN apps has been true on macOS for years.

I personally didn't know why WireGuard didn't offer direct downloads, but I emailed Jason Donenfeld a couple years ago and he let me know that Apple has been restricting these APIs in non-AppStore apps, which was news to me, as I didn't know that Apple had started any of their AppStore-only bullshit on macOS.

traceroute66
3 replies
6h11m

the WireGuard macOS client application cannot work on macOS unless it's distributed via the App Store

That's bullshit and I'm pretty sure you know it. This is a technical forum and you should know better than to spread unsubstantiated Apple-bashing FUD.

MullvadVPN and ProtonVPN are two very well known VPN providers whose VPN apps you will not find on the Apple App Store; instead you are invited to download them from their respective websites. It has been that way basically forever (or at least a very long time, now that we're in 2024), so it's not like it's a reflection of a recent Apple change either.

I actually quite like the fact that WireGuard is distributed via the Apple App Store and I thank them for doing that. It makes updating much easier, rather than having a dozen apps each with their own update process.

traceroute66
0 replies
6h5m

Those use different (less efficient, older, and perhaps deprecated-in-the-future) APIs.

Your original statement is still bullshit.

You said "Apple simply will not provide the required VPN entitlements for web/self distributed apps".

I've clearly demonstrated that statement is just plain wrong. MullvadVPN and ProtonVPN are not distributed via the AppStore and they work and are not blocked by MacOS.

bananapub
0 replies
5h31m

Those use different (less efficient, older, and perhaps deprecated-in-the-future) APIs.

then correct your initial post, which says something entirely different:

Fun fact: the WireGuard macOS client application cannot work on macOS unless it's distributed via the App Store - Apple simply will not provide the required VPN entitlements for web/self distributed apps. You can use the commandline wg tools (which use a different OS API) but not the GUI ones.

it really is increasingly annoying that people just post nonsense on HN to back up their dumb prejudices.

Corrado
1 replies
6h21m

This is not entirely accurate. macOS now has System Extensions[0], which allow you to tie into the network to provide VPN services. This is what Tailscale uses in their non-App Store downloaded app. [0] https://developer.apple.com/system-extensions/

sneak
0 replies
6h11m

https://developer.apple.com/documentation/bundleresources/en...

It appears that they will now give out these entitlements for non-MAS apps, but only to members of their developer program.

This means that you can't build one that will work on macOS without IDing yourself to Apple, so you still can't build it from scratch from source (as a non-developer-program person) and expect it to work (without disabling SIP).

rubatuga
5 replies
11h35m

What's stopping the initial handshake packet from being replayed into the network stack? Seems like there would be no packets lost that way. Also, what is the purpose of checking for "udp[8] = 1" in the eBPF filter?

yencabulator
2 replies
3h2m

Yeah, sounds like an NFQUEUE helper that releases the packet after it has added the keys.

tptacek
1 replies
2h59m

Say more!

yencabulator
0 replies
2h35m

So NFQUEUE is an nftables verdict that puts the packet into a numbered queue going to userspace. A userspace process gets packets from the queue, and at some later point issues a verdict on each packet, which can be drop or allow the packet to pass. You can also pcap-style ask to see just first N bytes of the packet, to decrease overhead.

Out of that, you can construct a userspace process that reads a packet from the queue, decrypts the noise initiation, requests configuration, adds wireguard config via netlink, and then releases the packet. And you can do that with multiple packets in flight concurrently.

It also allows things like fail-open if the userspace process is broken or overwhelmed, which would be useful here (already in-kernel wg peers would keep working), and spreading load over multiple workers.
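The hold, configure, release loop described above can be simulated in a few lines. This is not the NFQUEUE API: the queue is a plain `deque`, the packet is reduced to an id plus an already-extracted public key, and `lookup_peer` / `install_peer` are hypothetical callables standing in for the configuration request and the netlink call.

```python
from collections import deque

ACCEPT, DROP = "accept", "drop"

def process_queue(queue, lookup_peer, install_peer):
    """Issue a verdict for each held packet, installing its peer first.

    queue: deque of (packet_id, pubkey) pairs held back by the simulated kernel.
    lookup_peer: returns the peer config for a pubkey, or None if unknown.
    install_peer: adds the peer to the interface (a side effect).
    """
    verdicts = {}
    while queue:
        packet_id, pubkey = queue.popleft()
        config = lookup_peer(pubkey)
        if config is None:
            verdicts[packet_id] = DROP
            continue
        install_peer(pubkey, config)
        # The packet is released only after the peer exists, so the kernel
        # can process the original initiation instead of dropping it.
        verdicts[packet_id] = ACCEPT
    return verdicts
```

Because each packet gets its own verdict, several workers could drain the same queue concurrently, which is the load-spreading property mentioned above.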

https://netfilter.org/projects/nftables/manpage.html (search for queue)

https://wiki.nftables.org/wiki-nftables/index.php/Queueing_t...

Here's a nice & simple Rust library for the userspace part, to give an idea of what the shape of the API is: https://docs.rs/nfq/latest/nfq/

Also, I'm available for contract work ;) Say hi to Ben from Tv.

zekica
0 replies
11h9m

udp[8] = 1 filters only handshake packets. Without it, data packets would also be sent to the userspace daemon.

I'm not sure if initial handshakes can be replayed, but since WireGuard ignores unknown clients, it might be possible.
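For reference, `udp[8]` is the byte at offset 8 from the start of the UDP header, which is the first payload byte because the UDP header is 8 bytes long. WireGuard puts the message type there, and type 1 is a handshake initiation. A small sketch of the same test in Python; the length check is an extra assumption of this sketch (initiation messages are 148 bytes), not part of the filter discussed above, and `make_datagram` is a hypothetical test helper.

```python
import struct

UDP_HEADER_LEN = 8          # source port, destination port, length, checksum
HANDSHAKE_INITIATION = 1    # WireGuard message type for handshake initiation
INITIATION_LEN = 148        # size of a handshake initiation message in bytes

def wireguard_message_type(udp_datagram):
    """Return the byte that BPF's udp[8] would read, or None if there is no payload."""
    if len(udp_datagram) <= UDP_HEADER_LEN:
        return None
    return udp_datagram[UDP_HEADER_LEN]

def is_initiation(udp_datagram):
    """True for datagrams that look like a WireGuard handshake initiation."""
    payload = udp_datagram[UDP_HEADER_LEN:]
    return (
        wireguard_message_type(udp_datagram) == HANDSHAKE_INITIATION
        and len(payload) == INITIATION_LEN
    )

def make_datagram(payload, src_port=23456, dst_port=51820):
    """Prepend a UDP header (checksum left as zero) to a payload, for testing."""
    header = struct.pack("!HHHH", src_port, dst_port, UDP_HEADER_LEN + len(payload), 0)
    return header + payload
```

Transport data packets carry type 4 in the same position, so they fail the check and never reach the userspace daemon.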

tptacek
0 replies
3h37m

Nothing. It's a good idea. (As the sibling comment points out: the BPF filter snags just initiation packets, which is what we want; it's the WireGuard equivalent of sniffing for TCP SYNs to see connections starting).

tschumacher
4 replies
5h48m

Sounds like a whole lot of effort to avoid a GraphQL request each time a flyctl client wants to connect.

sudhirj
3 replies
5h46m

Huh? They do make one to set it up. More a way to avoid having the public keys of every single client ever loaded up into the wireguard kernel module on the gateway all the time.

tschumacher
2 replies
5h40m

My implicit suggestion was that clients make a GraphQL request not only before the first connection but before every connection. The gateway server can insert the keys into the kernel in response to an explicit GraphQL request instead of in response to some complicated packet sniffing.

rudasn
0 replies
4h3m

What would the payload of the GraphQL request to fetch the wg config for that peer look like, when they don't know which peer the request is coming from?

mrkurt
0 replies
1h40m

This needs to support any ol' wireguard client. We use it in `flyctl` but people also use it to create gateways so they can, eg, peer with VPCs.

niz4ts
4 replies
10h12m

When I read this, I got a little too excited and thought they managed to get wireguard connections happening in the browser with webassembly (this isn't impossible, but the only attempt[0] I know of so far only works because of extra things tailscale has). It's an idea I've had for a project, but one I haven't had time to dedicate to (yet).

In any case, really cool write-up! I wonder if they thought about making `flyctl` do a check with their API for any command that requires talking over wireguard to ensure the keys would be installed in the gateway. Since `flyctl` knows when the last command was run with it, it could do this only after some inactivity. And on the gateway machines, they'd just clean up any inactive peers with a cron (which they seem to be doing already).

Not a solution as elegant as the one they reached (which is super cool), but I'm assuming the considerably lower effort would make it appealing.

[0]: https://labs.leaningtech.com/blog/webvm-virtual-machine-with...

tptacek
0 replies
3h59m

Tailscale already got all this stuff working in WebAssembly in the browser. :)

The way we think about things, if we were going to try to provide a browser experience of doing something with WireGuard, we'd probably just fork off a tiny Fly Machine VM to run it on. Just a different vibe here.

gz5
0 replies
6h37m

For Chromium-based browsers, an option is to use BrowZer (built on OpenZiti, Apache 2.0). Enables you to connect into a full mesh private network (mTLS, e2e encrypted, no TLS man in middle inspection). 3 examples below with well known apps. Disclosure, I work on the project.

MSFT RDP (video): https://youtu.be/1NMrxRIowog

Private network for Grafana (video): https://youtu.be/l5ktiI-j3eg

Private network for Plex (blog post): https://blog.openziti.io/its-a-zitiful-life

Basically you decide what 'app' you want to deliver via the overlay, e.g. Grafana, Plex, RDP. For those destinations, a (one time) bootstrapping process (invisible to end user) results in your browser receiving a <script> tag which includes some configuration when the browser attempts to connect to the destination (Grafana etc). This ultimately results in the browser downloading some JavaScript and WA, and registering a service worker (the wasm contains the PKI bits).

After successful auth, your browser can then open a websocket to your private OpenZiti overlay network (distributed, OpenZiti overlay network software routers, deployed where you want them), and ultimately hit the app (which no longer needs to listen to anything other than the overlay network; becomes private).

Desktop Chrome is the most tested, followed by Android Chrome.

apignotti
0 replies
9h41m

WebAssembly is not magic, the simple reality is that browsers do not expose low level socket interfaces, so they cannot connect to arbitrary services on the wider internet.

We chose to use Tailscale since they allow WebSocket-based connections via their DERPs.

It is interesting that, originally, DERPs were intended to be a solution for machines in extremely limited networking environments where nothing but HTTP is allowed. Turns out browsers are exactly one of those extremely limited networking environments.

justsomehnguy
4 replies
9h57m

and gateways with hundreds of thousands of peers that will never be used again

My thoughts exactly as I was reading the first paragraphs.

Note that there’s no API call to subscribe for “incoming connection attempt” events. That’s OK! We can just make our own events. WireGuard connection requests are packets, and they’re easily identifiable, so we can efficiently snatch them with a BPF filter and a packet socket.

Nice idea.
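For anyone wondering what "easily identifiable" means in practice: per the WireGuard paper, a handshake initiation is a fixed 148-byte message whose first byte is the type (1), followed by three reserved zero bytes. The sketch below does that check in userspace Python. It is not the BPF filter from the article (which isn't reproduced here), and the function name is made up for illustration.

```python
import struct

# Constants from the WireGuard protocol description.
HANDSHAKE_INITIATION = 1   # message type of a handshake initiation
INITIATION_LEN = 148       # fixed size of a handshake initiation in bytes


def is_handshake_initiation(udp_payload: bytes) -> bool:
    """Return True if a UDP payload looks like a WireGuard handshake initiation.

    Only the fixed length and the 4-byte header are inspected; nothing is
    decrypted or authenticated here.
    """
    if len(udp_payload) != INITIATION_LEN:
        return False
    # The header is a 1-byte type plus 3 reserved zero bytes, so reading it
    # as a little-endian 32-bit integer yields the type directly.
    (msg_type,) = struct.unpack_from("<I", udp_payload, 0)
    return msg_type == HANDSHAKE_INITIATION


# A fake initiation: correct header and length, zeroed body.
fake_initiation = b"\x01\x00\x00\x00" + bytes(144)
print(is_handshake_initiation(fake_initiation))
```

A real filter would run in the kernel against the whole packet, so it would also need to skip the IP and UDP headers before looking at these bytes.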

When we get an incoming initiation message, we have the 4-tuple address of the desired connection, including the ephemeral source port flyctl is using. We can install the peer as if we’re the initiator, and flyctl is the responder

And this works behind NAT?

TheDong
2 replies
8h59m

And this works behind NAT?

Sure, UDP NAT only knows the 4-tuple (say {wggwd.fly.io, 12345, clientIP, 23456}).

Any UDP packet, whether it's a new "initiator" udp packet, or a response to the outgoing initiation message, will look the exact same to any UDP NAT in the way since it only has the 4-tuple to go on, and the 4-tuple is the same.
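A toy model of that, in case it helps. This is a deliberately simplified sketch: the `UdpNat` class is invented for illustration, it ignores address translation, timeouts, and the different NAT flavours, and the IP addresses are documentation-range placeholders. The only point is that the lookup never touches the payload.

```python
class UdpNat:
    """Toy UDP NAT that admits inbound packets purely by 4-tuple."""

    def __init__(self):
        # Each entry is (inside_ip, inside_port, outside_ip, outside_port).
        self.mappings = set()

    def outbound(self, src_ip, src_port, dst_ip, dst_port, payload):
        # Any outbound packet creates (or refreshes) a mapping.
        self.mappings.add((src_ip, src_port, dst_ip, dst_port))

    def inbound_allowed(self, src_ip, src_port, dst_ip, dst_port, payload):
        # An inbound packet matches when it is the mirror image of a mapping.
        # The payload is accepted as an argument but never inspected.
        return (dst_ip, dst_port, src_ip, src_port) in self.mappings


nat = UdpNat()
client = ("203.0.113.7", 23456)    # placeholder client address
gateway = ("198.51.100.1", 12345)  # placeholder gateway address

# The client sends its handshake initiation out through the NAT.
nat.outbound(*client, *gateway, payload=b"initiation from client")

# Either kind of packet coming back on the same 4-tuple gets through.
print(nat.inbound_allowed(*gateway, *client, payload=b"handshake response"))
print(nat.inbound_allowed(*gateway, *client, payload=b"new initiation"))

# The same gateway replying from a different port does not match.
print(nat.inbound_allowed(gateway[0], 9999, *client, payload=b"new initiation"))
```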

ta1243
0 replies
4h33m

Presumably you could have a situation where there's deep-packet inspection on the traffic which would only allow a "handshake response" to come back through, and drop your attempt at an initialisation.

I doubt that happens much, and I assume you'll fall back happily enough to the second time the "client" sends an init packet, at which point you simply respond with the response packet.

justsomehnguy
0 replies
6h26m

The way it's written made me think it's flyctl sending its own ip:port, which would be private ones behind the NAT. If it's the received packet's source ip:port then it's, obviously, already translated.

chgs
0 replies
9h5m

If the packet goes back to the same ip/port and is sent from the same ip/port, it will work through NAT.

chairmanwow1
3 replies
4h13m

My startup used Fly for almost a year. The core feature of code to deployed code in less than a minute is beautiful. Spinning up / down new nodes for backfills takes seconds.

But the company feels a little immature. Once our API server became unreachable in Fly for 48 hours. I'm not sure if it was my fault for getting config wrong or if they just had another "silent" failure. They have a "db" product, but it's "not managed postgres". Would get consistent disconnections from that. Just feels weird for them to add a top level noun in their cli for postgres and then limit the extent it's a feature they support.

API access to their core service would frequently go down and leave us waiting to deploy new service fixes.

I miss the deployment experience, but I'm frankly happier with Cloud Run on GCP. Just way fewer "surprises" and much more complete documentation.

ZeroCool2u
2 replies
4h0m

Fly looks great to me, though I've never had a chance to use it. For what it's worth though, Cloud Run on GCP is one of my top 3 favorite infrastructure/deployment tools, so you're setting the bar pretty high.

icedchai
0 replies
2h57m

Cloud Run is pretty slick... services, batch jobs, easy to deploy, very flexible.

WuxiFingerHold
0 replies
2h41m

so you're setting the bar pretty high

Why would anyone choose a provider that is inferior (apart from toying)?

I mainly read two kinds of stories about fly.io: their promotional but well-written and interesting technical blog posts like this one, and stories about issues with their services and miscommunication. So, despite liking their blog, I'm not considering using it.

d-z-m
1 replies
10h39m

We can install the peer as if we’re the initiator, and flyctl is the responder. The Linux kernel will initiate a WireGuard connection back to flyctl. This works; the protocol doesn’t care a whole lot who’s the server and who’s the client. We get new connections established about as fast as they can possibly be installed.

Is this (effectively) adding a half round-trip to the handshake? i.e.

  1. ->flyctl sends Initiation
  2. <-peer is added via netlink (which causes new Initiation to be sent)
  3. ->Response from flyctl

OJFord
0 replies
7h50m

My reading was that both peers end up 'thinking' they initiated, but it doesn't matter. i.e. (3) either doesn't happen or just doesn't need to be waited for, or that they could even block (2-new initiation) and then it definitely wouldn't.