How does this compare to IPFS?
I personally found IPFS very disappointing in practice, so I'm very hopeful for a successor.
(The promise of IPFS is great, but it is excruciatingly slow, clunky, and buggy. IPFS has a lot of big ideas but suffers from a lack of polish that would make the Augean stables look clean. And as soon as you scale to larger collections of files, it quickly crumbles under its own weight. You can throw more resources at it, but past some point it just falls over. It just doesn't work outside of small-scale tests.)
If you are looking for something similar to IPFS but a bit more minimalistic and performance-oriented, check out iroh: https://github.com/n0-computer/iroh
It is a set of open source libraries for peer-to-peer networking and content-addressed storage. It is written in Rust, but we have bindings for many languages.
One part of iroh is a work-in-progress implementation of the Willow spec. The lower layers include a networking library similar to libp2p and a library for content-addressed storage and replication based on blake3 verified streaming.
Most iroh developers have been active in the ipfs community for many years and have shared similar frustrations... See this talk from me in 2019 :-)
https://youtu.be/Qzu0xtCT-R0?t=169
https://veilid.com/ should also be a great alternative. I haven't had time to use it yet, but it was built to address performance issues with IPFS and to allow both DHT-style content discovery and direct websocket connections for streaming (and to do that in an anonymous fashion).
This looks very interesting. They made choices very similar to ours (iroh): Rust, ed keys, blake3.
They seem to do their own streams, while we are adapting QUIC to a more p2p approach. The holepunching approach also seems to be different. But I would love to get more details.
https://yewtu.be/watch?v=Kb1lKscAMDQ
this was the presentation at DC'31. i will also check out iroh! thanks for working on building something in this space, it is much needed!
Thanks. This is awesome. I think they are doing more work themselves in terms of crypto, whereas we rely on QUIC+TLS more.
Regarding holepunching, our approach is a bit less pure p2p, but has quite good success rates. We copy the DERP protocol from tailscale.
I am confident that we have a better story regarding handling of large blobs. We don't just use blake3, but blake3 verified streaming to allow for range requests.
Also I wrote my own rust library for blake3 verified streaming that reduces the overhead of the verification data. https://crates.io/crates/bao-tree
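To make the idea concrete, here is a toy sketch of what verified streaming buys you. It uses plain SHA-256 and a naive binary Merkle tree purely for illustration; blake3 and bao-tree use a different hash, chunk size, and tree layout, so none of the specifics below reflect the real implementation:

```python
import hashlib

CHUNK = 4  # tiny chunk size for the demo only; real systems use KiB-sized chunks

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(chunks):
    """Hash each chunk, then pair up hashes level by level until one root remains."""
    level = [h(c) for c in chunks]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            pair = level[i] + (level[i + 1] if i + 1 < len(level) else b"")
            nxt.append(h(pair))
        level = nxt
    return level[0]

data = b"hello verified streaming!"
chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
root = merkle_root(chunks)

# A client that holds only `root` can verify any chunk as it streams in,
# given the sibling hashes on its path - this is what makes range requests
# safe: a tampered chunk changes the recomputed root.
tampered = chunks[:]
tampered[2] = b"EVIL"
assert merkle_root(chunks) == root
assert merkle_root(tampered) != root
```

The point of the tree structure is that verifying one chunk needs only O(log n) extra hashes, so you can serve an arbitrary byte range without shipping or hashing the whole blob.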
I tried to get on their discord at https://veilid.com/discord, but I get an invalid invite. You know a better way to get in touch?
hmm this is strange, i tried the invite and it worked for me. If you are on fedi, @thegibson@hackers.town is part of the team.
thanks for the links, i will get in touch personally when i try iroh :)
Interesting — does the Rust crate export a C API then?
Not officially. We currently have bindings for Rust, Python, Golang and Swift.
These were the most requested bindings (Python for ML, Golang for networking and Swift for iOS apps).
We are using uniffi https://mozilla.github.io/uniffi-rs/
Would you need C or C++ bindings?
Ah, I see. Hm. I might be interested in a C API, since that could be used equally well from C, C++, and Lua. I really was just wondering what the common implementation between the bindings was, since it struck me as unusual that there would be a number of bindings but not C (which, as I understand it, is the only interface besides Rust itself that Rust can really export).
So, I am not the one doing the bindings, so take this with a grain of salt.
It seems uniffi does create C-compatible bindings in order to generate the bindings for all these other languages. But these are internal bindings that are ugly and not intended to be used externally.
Perhaps if an issue were opened describing the necessary steps to provide a fluent and stable C API, staying consistent with the uniffi approach you're using for the other wrappers, then someone enterprising could pick up the ball and run with it. :)
iroh seems to have a couple of "killer tools" already, known as dumbpipe[0] and sendme[1].
Although I am concerned that while dumbpipe's page does mention cryptography, sendme's page makes no mention of it (?).
0. https://www.dumbpipe.dev/
1. https://iroh.computer/sendme
It's using the same transport. Basically sendme is like dumbpipe, but adds blake3 verified streaming from the iroh-bytes crate for content-addressed data transport.
I imagined as much, but the website still does not mention encryption.
Which works against it. E2EE is a requirement today.
Thanks for letting us know. We will add a section about encryption.
These tiny tools are basically one-week projects to show off the tech, but they try to be useful on their own as well.
Hi, I'm super intrigued by Willow and your work on iroh. Do you have any kind of documentation on how iroh deviates from Willow, or what parts of Willow are planned to be implemented vs omitted?
Not yet. We have been busy with other stuff, and the willow spec has also been a bit of a moving target until now.
We would like to take our rust willow impl and separate it a bit more from our code base, so that iroh documents are just users of the willow crate.
That makes sense. I think I might try to really jam through the Willow docs and get a good understanding. If it all looks good, I might be able to help out splitting these things out =].
Willow solves the biggest problem I have always had with IPFS: it's content-addressable, which is nice for specific things but not generic enough to make the protocol actually practical and usable in the real world outside of specific use cases. (Namely, you can't update resources or even track related updates to resources.)
(Mathematically, a name-addressable system is actually a superset of content-addressable systems, since you can always use the hash of the content as the name itself.)
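The superset claim in the parenthetical can be sketched in a few lines. This is just an illustration, using SHA-256 as a stand-in hash and a plain dict as the store:

```python
import hashlib

# A name-addressed store is just a mutable mapping from names to bytes.
store = {}

def put_named(name: str, content: bytes) -> None:
    store[name] = content

def put_content_addressed(content: bytes) -> str:
    # Content addressing falls out for free: use the hash as the name.
    name = hashlib.sha256(content).hexdigest()
    store[name] = content
    return name

put_named("blog/latest", b"v1")
put_named("blog/latest", b"v2")  # names can be updated in place

cid = put_content_addressed(b"immutable article")
assert store[cid] == b"immutable article"
assert store["blog/latest"] == b"v2"
```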
To be fair, IPFS does offer not just content addressing but also a mechanism for mutability with IPNS. You can think of a willow namespace (or iroh document) as a key value store of IPNS entries.
The problem with IPNS is that the performance is... not great... to put it politely, so it is not really a useful primitive to build mutability on.
You end up building your own thing using gossip, at which point you are not really getting a giant benefit anymore.
Is the performance a critical design flaw or just an implementation issue?
Imagine if DNS supplied every URL and not just domain names. You need some mechanism to propagate resource changes. IPNS has two practical mechanisms: a global DHT that takes time to propagate, and a pub/sub that requires peers to actively subscribe to your changes.
btlink does DNS per domain name, which you could argue is a sweet spot between too many queries and being too broad. At least in the case of the web, it works nicely.
Difficult to answer.
IPNS uses the IPFS kademlia DHT, which has some performance problems that you can argue are fundamental.
For solving a similar problem with iroh, we would use the bittorrent mainline DHT, which is the largest DHT in existence and has stood the test of time - it still exists despite lots of powerful entities wanting it to go away.
It also generally has very good performance and a very minimalist design.
There is a rust crate to interact with the mainline DHT, https://crates.io/crates/mainline , and a more high level idea to use DHTs as a kind of p2p DNS, https://github.com/nuhvi/pkarr
It's a design flaw.
It's a superset in that sense but not a superset in another sense.
In a content-addressable system, if I post a link to another piece of content by hash, then no one can ever substitute a different piece of content. Like, if I reference the hash of a news article, no one can edit that article after the fact without being detected. This is a super-useful feature of CAS that is not a feature of NAS. Other implications:
* I can review a piece of software, deem that it's not malware in my opinion, and link to the software by hash. No one can substitute a piece of malware without detection.
* Suppose you get a link from a trusted source. Now you can download a copy of the underlying content from any untrusted source, without a care about authentication or trusted identities. This describes BitTorrent.
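That BitTorrent-style property is easy to sketch; SHA-256 here is just a stand-in for whatever hash the real system uses:

```python
import hashlib

def trusted_link(content: bytes) -> str:
    """What a trusted source publishes: just the hash of the content."""
    return hashlib.sha256(content).hexdigest()

def fetch_and_verify(link: str, blob: bytes) -> bytes:
    """Accept a blob from ANY untrusted peer; the hash check IS the authentication."""
    if hashlib.sha256(blob).hexdigest() != link:
        raise ValueError("blob does not match link; reject this peer")
    return blob

article = b"original news article"
link = trusted_link(article)

# An honest peer's copy verifies; a silently edited copy is detected.
assert fetch_and_verify(link, article) == article
try:
    fetch_and_verify(link, b"silently edited article")
except ValueError:
    pass  # substitution detected
```

Note that nothing here involves identities or signatures: trust in the link alone is enough to download from anyone.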
Can you quantify the breaking point of IPFS in terms of the number of files? I was considering it for a project that has fewer than 200,000 entries.
I'd be very curious what project you have for which IPFS is a good solution.
I'm not sold on IPFS, but the idea of using a file system as a top-level global index is attractive to me. I find the two best references for human information are global location and time. I think an operating system structured around those constants could be a winner.
I'm not sold on IPFS and will look at Willow and iroh.
A global hash-based index is literally an undergraduate project to do well. You could even ride atop bittorrent if you really had to.
It depends on how many files you have, but also on file size. My understanding is that IPFS splits files into 256kB chunks, each with a content ID (CID), and then when you expose your project, it tries to advertise every CID of every file to peers.
200,000 files could take a while to advertise, but from memory it should work: it should hang for less than 15 minutes, depending on your hardware, file size, quality of your connection to peers, alignment of the planets, etc.
If you add one order of magnitude above that, it starts to become tricky. It is manageable if you shard over several nodes and look for workarounds for perf issues. But if you keep growing a bit past that point, it can't keep up with publishing every small chunk of every file fast enough.
But it's also very possible perf has improved since the last time I tried it, so definitely take this with a grain of salt, you might want to try installing and running the publish command and see what happens.
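For a rough feel for the numbers, here is a back-of-envelope sketch. The 256 KiB chunk size matches the chunking described above; the 1 MiB average file size is a made-up assumption you would replace with your own data:

```python
# Rough provider-record arithmetic: one advertisement per chunk CID.
CHUNK = 256 * 1024  # 256 KiB chunks, as described above

def chunk_count(file_size: int) -> int:
    """Number of chunks (and thus CIDs to advertise) for one file."""
    return max(1, -(-file_size // CHUNK))  # ceiling division

files = 200_000
avg_size = 1 * 1024 * 1024  # ASSUMPTION: 1 MiB average file size

total_records = files * chunk_count(avg_size)
print(total_records)  # 800,000 provider records to (re)publish
```

Even at a modest average file size, the per-chunk advertisement model multiplies 200,000 files into hundreds of thousands of records, which is why publish time is so sensitive to collection size.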
"hang" sounds pretty bad.
Even if there's a lot of sharding and propagating and whatever to do, it should happen in the background, and never interfere with user experience.
From your description, it seems their implementation has serious issues.
Being written in Go may have made development of the reference client fast (that was the creators' contention when I asked, anyway) but killed its growth as a standard. The inability to have a portable lib-ipfs that could quickly, easily, and completely give almost any language ecosystem or daemon IPFS capabilities is a real drag.
https://willowprotocol.org/more/compare/index.html#willow_co...
Consider https://github.com/anacrolix/btlink. It's a proof of concept, but it has all the basics. I designed it, I worked for IPFS, and I am the maintainer of a popular DHT and BitTorrent client implementation.