This looks like a fine project for its purpose, but I think git is already open-source and p2p. You don't need to sh <(curl) a bunch of binaries; instead, simply connect to another git server and use git commands to pull or merge code directly.
What's missing in git is code issues, wikis, discussions, github pages and most importantly, a developer profile network.
We need a way to embed project metadata into .git itself, so that source code commits don't get mixed up with wikis and issues. Perhaps some independent refs like git notes?
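To make the "independent refs" idea concrete, here is a minimal sketch of how metadata could live in its own ref namespace, separate from branches and tags. `refs/notes/commits` is the default ref used by `git notes`; the `refs/meta/issues/...` namespace and the `split_refs` helper are hypothetical, invented only for illustration.

```python
# Ref namespaces that hold source history in a stock Git repository.
CODE_PREFIXES = ("refs/heads/", "refs/tags/")


def split_refs(ref_names):
    """Partition ref names into source-code refs and metadata refs."""
    code, metadata = [], []
    for name in ref_names:
        if name.startswith(CODE_PREFIXES):
            code.append(name)
        else:
            metadata.append(name)
    return code, metadata


refs = [
    "refs/heads/main",
    "refs/tags/v1.0",
    "refs/notes/commits",   # default ref used by `git notes`
    "refs/meta/issues/42",  # hypothetical namespace for an issue tracker
]
code_refs, meta_refs = split_refs(refs)
```

Because the metadata refs never share a namespace with branches, a tool that only fetches `refs/heads/*` would not see them, and commits to the source tree stay free of issue or wiki changes.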
While Git is designed in some way for peer-to-peer interactions, there is no deployment of it that works that way. All deployments use the client-server model because Git lacks functionality to be deployed as-is in a peer-to-peer network.
For one, it has no way of verifying that the repository you downloaded after a `git clone` is the one you asked for, which means you need to clone from a trusted source (i.e. a known server). This isn't compatible with p2p in any useful way.
Radicle solves this by assigning stable identities[0] to repositories that can be verified locally, allowing repositories to be served by untrusted parties.
[0]: https://docs.radicle.xyz/guides/protocol#trust-through-self-...
Respectfully disagree here. A repository is one (or multiple) chain(s) of commits; if each commit is signed, you know that the clone you got is the one you asked for. You're right that nobody exposes a UI around this feature, but the capability is there if anyone has a workflow that requires pulling from random repositories instead of well-established/known ones.
Here's the problem: how do you know that the commit signers are the current maintainers of the repo?
That problem is social; you can never be sure of that, even with hardware signing of commits. No tech can ever solve that. Just get "pull requests" from contributors you know and pull from maintainers you trust. That is the social model.
That's not quite right, we solved this in Radicle. Each change in ownership (adding/removing maintainers) is signed by the previous set of owners. You can therefore trace the changes in ownership starting from the original set, which is bound to the Repository ID.
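The chain-of-ownership idea described here can be sketched roughly as follows. This is not Radicle's actual data model: the `Revision` structure, the `threshold` field, and the `is_valid_signature` callback are all assumptions made for illustration, and the callback stands in for a real cryptographic signature check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    """One version of the maintainer set (hypothetical structure)."""
    maintainers: frozenset  # key identifiers in force after this revision
    threshold: int          # approvals needed to accept the next revision
    approvals: frozenset    # keys that signed this revision


def verify_chain(revisions, is_valid_signature):
    """Walk the chain and return the current maintainer set, or raise ValueError.

    revisions[0] is the root; the repository ID is assumed to commit to it,
    so it is taken as given. is_valid_signature(key, revision) is a stand-in
    for a real signature verification routine.
    """
    if not revisions:
        raise ValueError("empty chain")
    current = revisions[0]
    for index, nxt in enumerate(revisions[1:], start=1):
        # Only approvals from the *previous* maintainer set count.
        valid = {
            key for key in nxt.approvals
            if key in current.maintainers and is_valid_signature(key, nxt)
        }
        if len(valid) < current.threshold:
            raise ValueError(f"revision {index} lacks approval from the previous maintainers")
        current = nxt
    return current.maintainers


root = Revision(frozenset({"alice"}), 1, frozenset({"alice"}))
add_bob = Revision(frozenset({"alice", "bob"}), 1, frozenset({"alice"}))
hijack = Revision(frozenset({"mallory"}), 1, frozenset({"mallory"}))


def always_valid(key, revision):
    """Toy check that accepts every signature; a real one would verify it."""
    return True
```

With these toy values, `verify_chain([root, add_bob], always_valid)` accepts the change because alice approved it, while `verify_chain([root, hijack], always_valid)` raises, since mallory was never in the previous maintainer set.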
Sure, but again, you've added convenience - or what you feel is convenience - for something that can probably be achieved right now with open source tools. A "CONTRIBUTORS" file with sign-offs by maintainers is an example of a solution for the same thing.
I don't deny that your improvements can benefit certain teams/developers but I feel like there are very few people that would actually care about them and they're not making use of alternatives.
A CONTRIBUTORS file is easy to change by anyone hosting the repository - it's useless for the purpose of verification, unless you have a toolchain to verify each change to said file. "Sign-offs by maintainers" is not useful either unless you already know who the maintainers are, and you are kept up to date (by a trusted source) when the maintainers change. This is what Radicle does, for free, when you clone a repo.
All good points, but now you moved the trust requirement from me having to trust the people working on the code, to me having to trust the tool that hosts the code. I'm not convinced your model is better. :P
Can’t debate that :)
How do you fork an abandoned repo?
When you fork an abandoned repo, you are essentially giving it a new repository identity, which is a new root of trust, with a new maintainer set. You'll then have to communicate the new repository identifier and explain that this is a fork of the old repo.
How do I verify the “original set”, or the Repository ID, if not out-of-band communication (like a project’s official website)? And then what advantage does this have over the project maintainer signing commits with their SSH key and publishing the public key out-of-band?
I think there’s room for improvements in distributed or self-hosted git, but I think they exist more in the realm of usability than any technological limitations with the protocol. Most people don’t sign git commits because they don’t know it’s possible—not because it’s insecure.
The repository id can be derived via a hash function from the initial set of maintainers, so all you need to know is that you have the correct repository id.
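As a rough illustration of deriving an identifier from the initial maintainer set, the sketch below hashes a canonical serialization of that set. The document layout, the choice of SHA-256, and the hex encoding are assumptions for the example, not Radicle's actual scheme.

```python
import hashlib
import json


def repository_id(initial_maintainers):
    """Derive an illustrative ID by hashing a canonical form of the initial set."""
    # Sorting makes the result independent of the order the keys were listed in.
    document = {"maintainers": sorted(initial_maintainers)}
    canonical = json.dumps(document, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


rid = repository_id({"key-alice", "key-bob"})
same = repository_id({"key-bob", "key-alice"})
other = repository_id({"key-mallory"})
```

Here `rid` and `same` are equal, while `other` differs: anyone who holds the ID can recompute it from the claimed initial set and reject a repository whose root does not match.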
The advantage of this is that (a) it verifies that the code is properly signed by the maintainer keys, and (b) it allows for the maintainer key(s) to evolve. Otherwise you’d have to constantly check the official website for the current key set (which has its own risks as well)
Does that matter if the signatures are valid?
Yeah, because, for example, I can publish the given repository from my server with an additional signed commit (signed by me) on top of the original history, and that commit could include a backdoor. You have no way of knowing whether this additional commit is "authorized" by the project leads/owners or not.
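The distinction between a valid signature and an authorized signer can be shown with a small sketch. The history is modeled as hypothetical `(commit_id, signer)` pairs whose signatures are assumed to have already been verified; the names are made up.

```python
def unauthorized_commits(history, authorized_signers):
    """Return IDs of commits whose (already verified) signer is not authorized."""
    return [
        commit_id
        for commit_id, signer in history
        if signer not in authorized_signers
    ]


history = [("c1", "alice"), ("c2", "bob"), ("c3", "mallory")]
flagged = unauthorized_commits(history, {"alice", "bob"})
```

Every signature in `history` may check out cryptographically, yet `c3` is still flagged, because the check that matters is against the set of keys allowed to publish for this repository.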
That is in fact the point, it's decentralized by nature. The entire idea behind git's decentralization is that your version with an additional backdoor is no lesser of a version than any other. You handle that at the pointer or address level i.e. deciding to trust your server.
The same way I know that the commit signers are who they say they are in "regular" usage of GPG: I have verified the key belongs to them, or their keys are signed by people I trust to have verified, etc, etc. Like a sibling said, the problem is social rather than technical.
Perhaps, but none of that commit history is related to the invocation of git clone. To acquire and verify you need both a URL and a hash for each branch head you want to verify.
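A minimal sketch of the "URL plus a hash per branch head" check: it parses text in the format printed by `git ls-remote` and compares it against hashes obtained out of band. The helper names and the sample object IDs are made up for illustration; no network access is involved.

```python
def parse_ls_remote(output):
    """Parse `git ls-remote` output (object ID, TAB, ref name per line) into a dict."""
    heads = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        object_id, ref_name = line.split("\t", 1)
        heads[ref_name] = object_id
    return heads


def mismatched_heads(advertised, pinned):
    """Return pinned refs whose advertised object ID is missing or different."""
    return sorted(ref for ref, oid in pinned.items() if advertised.get(ref) != oid)


# Fake 40-character object IDs standing in for real commit hashes.
sample = ("a" * 40) + "\trefs/heads/main\n" + ("b" * 40) + "\trefs/heads/dev\n"
advertised = parse_ls_remote(sample)
pinned = {"refs/heads/main": "a" * 40, "refs/heads/dev": "c" * 40}
bad = mismatched_heads(advertised, pinned)
```

In this toy run `bad` contains only `refs/heads/dev`, the branch whose advertised head differs from the pinned value. The catch, as noted above, is that the pinned hashes have to come from somewhere you already trust.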
How do you handle the SHA1 breaks in an untrusted p2p setting?
If you mean collision attacks, this shouldn't be a problem with Git, since it uses Hardened SHA-1. Eventually, when Git fully migrates to SHA-2, we will offer that option as well.
From https://shattered.io/
So you use hardened sha1 in radicle? It would be great to see this in the docs.
Everything that is replicated on the network is stored as a Git object, using the libgit2[0] library. This library uses hardened SHA-1 internally, which is called sha1dc (for "detect collision"). Will add to the docs, good idea!
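For context on what "stored as a Git object" means for hashing, here is a sketch of how a SHA-1 blob ID is computed in Git: the hash covers a short header plus the content. Note that Python's `hashlib.sha1` is plain SHA-1 without the collision detection that sha1dc adds, so this only illustrates content addressing, not the hardening itself.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes "blob <size in bytes>\0" followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


empty_id = git_blob_id(b"")  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

The result for empty content matches the well-known ID of Git's empty blob, which is a quick way to check the header format.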
[0]: https://github.com/libgit2/libgit2/blob/ac0f2245510f6c75db1b...
The problem I'd like to see solved is source of truth. It'd be nice if there were a way to sign a repo with an ENS name or domain without knowing the hash.
Another thing is knowing if the commit history has been tampered with without knowing the hash.
The reason for needing to not know the hash is for cases like Tornado Cash. The site and repo were taken down. There are a bunch of people sharing a codebase with differing hashes, and you have no idea which is real or altered.
This is also important for cases where the domain is hacked.
The entire Linux kernel development team would beg to differ…
Radicle adds issue tracking and pull requests. Probably some of those other features as well.
On mobile there are buttons on the bottom of the screen in the op link, click those and you get to the issue tracking tab and the pull request tabs etc
But that’s not what parent meant. Those things should be embedded in the git repository itself, in some kind of structure below the .git/ directory. That would indeed make the entire OSS ecosystem more resilient. We don’t need a myriad of incompatible git web GUIs, but a standard way of storing project management metadata alongside version control data. GitHub, Gitea, Gitlab, and this project could all store their data in there instead of proprietary databases, making it easy to migrate projects.
Yes, this is how radicle stores this data. ; )
https://app.radicle.xyz/nodes/seed.radicle.xyz/rad:z3trNYnLW...
https://docs.radicle.xyz/guides/protocol is probably a better resource (but this guide is still Work In Progress)
This looks like an interesting approach. I have a question: to avoid copying a large .git project, we have partial cloning and clone depth. If `cobs` grows too large, how can we partially clone it? For example, by selecting issues by time range?
The COB types are located in the Stored Copy, you would still be able to partial clone the working copy repo without the issues and patches, with your current git commands. There is a better explainer here: https://docs.radicle.xyz/guides/protocol#storage-layout
Emphasis mine. Doesn't seem to be it, seeing as this is yet another home-grown issue storage.
Yeah, exactly. Radicle doing it this way, Fossil another - see here why that is a problem: https://xkcd.com/927/
Radicle does store such data in git - issues, patches (PRs) etc. Also, the entire project (protocol, cli, web ui etc) is fully open source.
What has the world come to where that is the most important part?
--
I think gerrit used to store code reviews in git.
Repositories and code-sharing are inherently about trust. Even if you personally audit every line of code, you still need to trust that the owner isn't trying to slip one past you. Identity is a key component of trust.
What you say makes sense. But that trust needs to extend to the hosting platform itself, because the platform can manipulate all non-signed data. I don't see how a GitHub profile by itself is trustworthy. You need some additional, external and independent verification that that GitHub profile is really authentic and doesn't contain compromised code.
There is nothing stopping me from creating the accounts IggleSniggle or Iggle5n1ggle on github.
I mean... yeah, you obviously have to trust someone to vouch for the authenticity of an identity. In the case of Github, that's the platform owner. In the case of a digital signature, that's the root certificate authority.
With that being said, your example feels pretty far off the mark. You might be able to phish using a similar looking identity, but that's completely unrelated to the trustworthiness of the platform. It's not as though you'll manage to somehow phish Github into showing someone else's trustworthy work history on a spoofed identity.
Welcome to the era of self-promoters and narcissists.
Classic git does not evade censorship, such as the extremely recent news concerning Nintendo. An idea like this has been rolling around in my head, and I'm overjoyed that someone has done the hard work.
Git evades censorship just fine, since it is properly decentralized and doesn't care about where you got the repository from. Plain HTTP transport however does not and most Git repositories are referred to by HTTP URL.
If you simply host Git on IPFS you have it properly decentralized without the limits of HTTP. IPNS (DNS of IPFS), which you need to point people to the latest version of your repository, however wasn't working all that reliably last time I tried.
Yeah that’s been my experience with IPFS. Very cool idea, practically doesn’t work very well. Haven’t tried recently though, maybe it’s improved.
But with Git you still need to locate an up-to-date source for the repo. If the author is signing commits or you know a desired commit ID then you can verify once you have found a source, but finding the source is the hard part.
IIUC with Radicle you can just request the repository by signature and get the latest released version from the network without needing to track down a source yourself. A trusted publisher (probably the original author/maintainer) can continue to publish commits without a centralization point that can be shut down (like the recent Yuzu case).
You're missing the discovery part. You want to get the repository X from user Y cloned - how do you find it? Especially if you don't know Y and their computer is off?
Also radicle does want to tackle the issues / prs and other elements you mentioned as well.
How do you find a website ?
And presumably the person hosting it will make sure that the computer hosting it is often on, for instance ISP routers and TV boxes are a good way to popularize it, since they often come with NAS capabilities :
https://en.wikipedia.org/wiki/Freebox
(Notably, it also supports torrents and creating timed links to share files via FTP.)
Depends on what you mean by finding :
- finding what the domain name is ?
- resolving the DNS to an IP address ?
Radicle solves both problems in theory, but more the latter than the former right now:
- there is some basic functionality to search for projects hosted on Radicle, to find the right repo id (I expect this area will see a lot more activity and improvements in the near future),
- given a repo id, actually getting the code onto your laptop. This is where the p2p network comes in, so that the person hosting it doesn't always need to keep their computer/router/tv box on, etc.
Fossil (https://fossil-scm.org) embeds issues, wiki etc. into project repository.
Also Radicle, evidently
Fossil has a few of these.