Hey HN, the engineering team at Bluesky is especially excited to get to this point! We're happy to help answer questions and help anyone trying to run their own PDS host. Things should work pretty well for self-hosters right now, but we're standing by to help if there are any problems.
Technical details and the installer are in the GitHub repo https://github.com/bluesky-social/pds
And we're on Discord available to help: https://discord.com/invite/UWS6FFdhMe
Hi, what is the status of integration with the ActivityPub protocol? It's currently the most popular protocol in federated social media.
There's a bridge nowadays, but… see https://pleromanonx86.wordpress.com/2024/02/17/mastodon-date... which also links to https://wedistribute.org/2024/02/tear-down-walls-not-bridges... https://techcrunch.com/2024/02/14/bluesky-and-mastodon-users... and https://news.itsfoss.com/bluesky-mastodon-bridge/
That was quite the mess. Ryan Barrett is a smart guy and seems quite nice, but it was very ill-advised to unilaterally decide to build an opt-out bridge. In general, if users on platform A want their stuff to be on platform B, they'll find a way to make that happen. If someone else takes it upon themselves to copy everything from A to B, people understandably get pretty bent about it.
If it had been an opt-in system, the response would probably have been far different.
Public is public.
And someone else will just go build an opt-out (or maybe even no opt-out!) bridge.
Nah. Consent is a thing and this wasn't consensual. Yes, the posts were publicly accessible, but the intent of posting to Mastodon isn't to have it show up automatically on another network. It's technically possible, yes. It's still a dick thing to do and it pissed people off.
And again, it wasn't about Bluesky in particular. If Google announced that they were going to ingest all Mastodon content and post it in a new Google Groups kind of thing, Mastodon users would be pretty understandably upset about that, too.
In general, "if I wanted my stuff on Bluesky, I would have put it there". It wasn't the bridge creator's decision to make.
exactly like they did with usenet without any issue?
Well, at least they paid money for Deja. Slight difference no?
I'm completely confused under what moral framework the fact that Google paid to buy the Dejanews archive makes any difference.
To make it clear, for people who don't know:
Google Groups was originally Dejanews, which was a web based archive and front end to Usenet. Google started searching Usenet, but didn't have historical archives so they bought Dejanews.
Obviously no one who posted on Usenet got paid under this transaction.
It's like if Google bought a Mastodon archive off someone now: this argument seems to indicate that would be better somehow than Google archiving Mastodon posts themselves.
I don't understand why at all?
Public = consent for the public to see it. That includes the public on Bluesky. It was consensual. And the ruckus was in fact about Bluesky in particular. That's why the same project already supported other protocols without a big ruckus.
In general, "I want my stuff on Bluesky but don't want to deal with cross-posting to multiple different platforms and keeping up with responses on all of them"
And, "I want my stuff on whatever platform people want to read it on without having to individually approve each one" (which is quite literally the entire point of public posts on Mastodon).
OH - and it wasn't the bridge creator's decision anyway; it was the decision of people on Bluesky to follow you that would trigger your posts to be federated, so...
It was meant for the public to see, not to bulk copy it en masse to somewhere else.
Similarly, I don't want my blog posts used to train LLMs. I know they're likely to be since they're published right there on the Internet for anyone to see and read. But my intent was for other humans to see and read them, not for someone to feed them into a regurgitator. There aren't technical means that let me allow humans to read my stuff without allowing LLMs to ingest it, and someone could make the (bad) case that if I didn't want my work to be used to train an LLM, I shouldn't have made it public. Maybe. However, I reserve the right to think someone's an ass for doing it.
Well, no technical hurdles kept the person from copying data out of the network people meant to post it to. It's probably not illegal. It's not a nice thing to do, though.
Except literally the entire design is for other Mastodon servers to bulk copy it en masse to somewhere else.
Yes there are. Don't make it public.
Of course! You can think anyone is an ass. You can think anything you want. That doesn't mean that person did anything wrong.
What thing is consent?
Mastodon is an odd sort of network, there's more blocking than I expected and it somehow seems as if blocking is an intrinsic part of the design. In Mastodon, blocking looks like a choice one makes for whatever reasons, not an unloved measure needed for fighting abuse.
As if the design doesn't tell users "you can follow people in the fediverse" but rather "your ability to follow people in the fediverse is limited by you and three other parties and the software isn't among the three".
So… if the mastodonish idea of consent doesn't extend to all of the fediverse, what makes bluesky different from some unvetted mastodon site run by weird people? If the poster's/follower's/would-be follower's consent isn't taken for granted in one case and isn't taken for granted in the other, what makes the two cases different? There obviously is a technical difference, but what is the difference wrt. consent?
Absolutely nothing! Fediverse admins block unmoderated sites all the time, for being unmoderated. Bluesky is just, effectively, one unmoderated instance that everyone will block by default.
How about "If I wanted my stuff on your Mastodon server, I would have put it there"?
"If I wanted my Mastodon content on your RSS feed, I would have put it there".
How about "If I wanted my stuff on the Internet, a publicly available internet, I would have put it there".
This tribalism around network/brands/protocols is beyond stupid. The thing that is killing Twitter is its closedness and the assumption that the means of communication is what matters. It's not. Let open protocols be open.
If people want privacy, then they should use a secure communication protocol and not a social media network.
I thought that was the point of activitypub.
The whole point of a fediverse is it's a federation. Therefore there is implied consent to copying from one instance to another.
Mastodon isn't a network, the network is the fediverse. Mastodon is some software that runs on the network.
I’m a sucker for a particular mix of condescending plus wrong.
And Fediverse admins will block that bridge, just like they would any other site with bad/nonexistent moderation, and will advise each other to block it. That's just how moderation works in the Fediverse. I guess it's sad that, unlike the admin of an instance with bad moderation, the bridge operator can't do anything to fix the problem, but in the end, that's their problem.
I'm surprised that the tool in question is Bridgy Fed. Bridgy Fed has existed for a long time and is a very useful tool. Its alternative, Bridgy, has also been used to bridge between closed social networks and the open IndieWeb.
Why are Fediverse people only angry about it now? It's an open protocol. If you want privacy, don't publish something for the entire world to see. That's just basic common sense. At the very least, use Mastodon's privacy controls. The Fediverse is not special here, it doesn't get to destroy the open Web for everyone else.
Well first not everyone on the fediverse is opposed to the bridge. I agree that public is public. But there are concerns about moderation being incompatible, it’s normal to voice them.
As for the fediverse destroying the open web for everyone else, I think you’re exaggerating quite a bit. The fediverse has done mountains to make social media more open, probably more than everyone else.
Yeah you're right, I think I did overgeneralise there. I was meaning more of the culture of "Mastodon users"; Mastodon itself has done a lot to help the open Web too.
Though I think "voicing concerns" is a bit of an understatement. I feel really bad for the developer of Bridgy Fed, working on their passion project and just getting caught up in all this heat and harassment.
I'm one of the people who blocks the bridge, not because of privacy concerns but because it's Bluesky, specifically. I do not like the idea of for-profit, VC-backed entities being given data, or any kind of decision. We all know exactly where that leads: a term has been coined, mountains have been written about it, and yet it still happens.
It's the same situation with Threads.
As for privacy I disagree with you. There's nothing because nothing has been discussed, but the technical feasibility should never dictate what we want as a society. When a family member dies, even though the news is known you know how to behave, who to share that information with, what to say. Would you be okay with a company coming up to you and saying "hey we learned your mother died, would you like to tweet it? It is free!"
In general, people on the Fediverse want to be able to make local moderation decisions; the way that extends to other federated sites is by not federating with them. Most Fediverse sites will not federate with sites that have bad or nonexistent moderation (or simply incompatible moderation policies). Bluesky's architecture basically means that it is one big unmoderated site. The normal reaction of Fediverse admins is then to block it.
As a controversy, it's been blown out of proportion. It's just Fediverse admins setting the moderation policies for their own sites, as always.
Awesome! Why did you choose Caddy as a proxy for PDS? (Caddy creator here.)
Thanks for Caddy, Matt! Some of us on the team have been using Caddy for years, for many of our projects. Because it's so simple, sufficiently high performance, and has lots of nice features.
The on-demand TLS certificates with an "ask" endpoint is especially useful for the PDS use-case. Because there's generally a wildcard DNS name that is used to give each new user a domain handle (@alice.example.com) but we don't want to be vulnerable to a TLS certificate DoS/rate limit situation.
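To make the "ask" flow concrete, here is a minimal sketch of what such an endpoint could look like. It relies on Caddy's documented behavior of sending a GET request with a `domain` query parameter and treating a 2xx response as permission to issue a certificate. Everything else is an assumption for illustration: `BASE_DOMAIN`, `KNOWN_HANDLES`, and `is_allowed` are hypothetical names, and this is not how the PDS itself implements the check.

```python
from http.server import BaseHTTPRequestHandler
from urllib.parse import urlparse, parse_qs

# Hypothetical data: the wildcard base domain and the handles hosted here.
BASE_DOMAIN = "example.com"
KNOWN_HANDLES = {"alice", "bob"}


def is_allowed(domain: str) -> bool:
    """Allow the base domain itself or a known single-label handle under it."""
    domain = domain.lower().rstrip(".")
    if domain == BASE_DOMAIN:
        return True
    suffix = "." + BASE_DOMAIN
    if not domain.endswith(suffix):
        return False
    label = domain[: -len(suffix)]
    # Nested names such as "a.alice" are not in the set, so they are rejected.
    return label in KNOWN_HANDLES


class AskHandler(BaseHTTPRequestHandler):
    """Answers the proxy's question: may a certificate be issued for ?domain=..."""

    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        domain = query.get("domain", [""])[0]
        # 200 permits issuance; any non-2xx status cancels it.
        self.send_response(200 if is_allowed(domain) else 403)
        self.end_headers()
```

Served with something like `http.server.HTTPServer(("127.0.0.1", 8000), AskHandler).serve_forever()`, this would refuse certificates for arbitrary subdomains an attacker requests, which is the rate-limit situation described above.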
Even if it may be simple in some areas, it doesn't handle edge cases such as https://github.com/caddyserver/caddy/issues/1632 in other areas out of the box unlike other server software.
you have been repeatedly posting this incredibly niche complaint for years at this point
I have only brought this up once before on HN and it was over 2 years ago. Not adopting a new project because it is missing something niche is an extremely common reason why people stick with tried and true, mature software. I do not see anything wrong with pointing out niche issues because to some people these issues are important. Because it's broken out of the box, it is allowing people who aren't aware of this problem to continue to set up broken sites. Even caddyserver.com. is broken.
Curious. What is the use case here? I’ve spent tens of thousands of hours of my life on the Internet and a lot of that as a sysadm and I’ve not once heard of people accessing or linking to sites this way.
Personally, I just want to properly handle this edge case and it would bother me if my sites didn't handle it correctly. There are advantages to using FQDNs: since they are not ambiguous, there are extra optimizations that can be done. I don't want my sites to be problematic for people who want to use them, so I make sure my sites properly handle them. Usually handling FQDNs is easy, as it just works out of the box on most server software.
Can you provide a URL to a page containing a link of this style? I’ll concede that it’s useful to someone if I see it! The term ‘FQDN’ does not always imply the dot at the end?
Do you just want a link?
https://news.ycombinator.com./
If you want a link to a page with a link like this you can click on the github issue I referenced in my original comment as there are links there like that.
If you want a more natural page that is less meta with such a link
https://jameswillia.ms/posts/shortest-urls.html
Yes, but if there is not a . at the end then there is ambiguity of if it is a FQDN or not.
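For anyone wondering what "handling" the trailing dot might mean in practice, one possible approach is to normalize the Host header before matching it against configured site names. The sketch below is only an illustration under that assumption: `normalize_host` is a hypothetical helper, not something Caddy or any other server ships, and whether to treat the dotted and undotted forms as the same site (or redirect one to the other) is a design choice.

```python
def normalize_host(host_header: str) -> str:
    """Lowercase a Host header value and drop one trailing root dot, keeping any port."""
    value = host_header.strip()
    if value.startswith("["):
        # Bracketed IPv6 literals contain colons; leave them alone in this sketch.
        return value.lower()
    host, sep, port = value.partition(":")
    host = host.lower()
    if host.endswith("."):
        host = host[:-1]
    return host + sep + port
```

With this, `news.ycombinator.com.` and `news.ycombinator.com` would map to the same site entry.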
Is it possible you're confusing that user with me? I used to be relatively vocal about this issue on HN. For reference, that's not me.
And it's probably not niche if dozens of users are posting about it for years.
That is a bit unfair, as it is intentionally not doing so. You may disagree with it, sure, but as it stands I think your comment implies oversight or immaturity, which is evidently not the case reading the discussion on the issue you linked.
That doesn't change my point. I am pointing out an easy pitfall Caddy users can fall into, since it is not automatically handled for them as it is with other server software, nor is it pointed out in the documentation for Caddy. Simple server software would avoid these pitfalls automatically for users. So while it may now be simple to get https working, properly configuring the server is now more complex to get right.
An intentional pitfall doesn't mean it isn't a pitfall.
Not for nothing, but when accessed from this HN app on an iPhone, Apple’s website with a trailing dot does not render correctly.
Great reasons -- glad to hear that! Let me know if you encounter any hiccups or have feedback.
Love the fresh federated model btw!
Unrelated to engineering but the recent rebrand to a dead butterfly logo[1][2][3] may be off brand for a platform wishing to communicate a more open, social Internet built on first principles and scientific rigor.
[1]https://www.emilydamstra.com/please-enough-dead-butterflies/
[2]https://news.ycombinator.com/item?id=14460013
[3]https://bsky.social/about/blog/12-21-2023-butterfly
I didn't know this (as most of us I'd guess). It was an interesting read though, thanks.
Pedantic lepidopterists of the world, unite!
Are you joking? This is private enterprise we're talking about. We'll all die before this company or anything similar is built on "scientific rigor" unless it directly relates to their profit margins.
Hey! Congrats on the release.
Does the AT Protocol only optimize for Twitter-like flows, or does it allow for other types of social applications to be built like Activitypub? For example a reddit-like social media.
Currently, atproto works probably best for public social apps, like microblogging, forums, etc. So yes, it's definitely possible to build a reddit-like social app on atproto.
Part of the change today is that the PDS and Relay[1] now support non-app.bsky record types. This is quite new, so there could be issues, but we're prepared to fix any issues that crop up.
1. https://bsky.social/about/blog/5-5-2023-federation-architect...
Would it be possible to use it for macroblogging, i.e. long posts with markdown markup, embedded images, etc.? If so, is there a Python library that implements atproto?
Yes, it should be totally possible to build a blogging system on atproto. And the "app.bsky" API should serve as an example for almost all of the functionality required.
Another really neat aspect of atproto, is that apps can interact theoretically. So you might create a blog system but use "app.bsky" (Bluesky) for comments.
OAuth support is coming soon as well, which is a big step in simplifying auth.
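As a rough illustration of what a non-"app.bsky" record for a blog might look like, here is a sketch that builds the JSON body for the `com.atproto.repo.createRecord` call (its `repo`, `collection`, and `record` fields are part of the published lexicon). The collection name `com.example.blog.post` and the `title`/`content` fields are hypothetical and made up for this example; a real app would define its own lexicon. The sketch only builds the payload and does not send it or handle auth.

```python
import json
from datetime import datetime, timezone


def build_create_record_body(repo_did: str, title: str, markdown: str) -> dict:
    """Build a request body for com.atproto.repo.createRecord with a made-up blog record."""
    collection = "com.example.blog.post"  # hypothetical NSID, not a published lexicon
    record = {
        "$type": collection,  # records carry their type as an NSID
        "title": title,       # hypothetical field
        "content": markdown,  # hypothetical field holding the markdown source
        "createdAt": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
    }
    return {"repo": repo_did, "collection": collection, "record": record}


if __name__ == "__main__":
    body = build_create_record_body("did:example:alice", "Hello", "# Long post\n\nText...")
    print(json.dumps(body, indent=2))
```

The body would then be POSTed to the account's PDS with an authenticated session; the DID shown is a placeholder.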
Are there any independent projects implementing the AT protocol?
There are a number of independent projects using atproto in various ways.
There's an (incomplete) list here: https://docs.bsky.app/showcase
And the protocol is documented here: https://atproto.com
Thank you, I might be searching for the wrong things, but I don't see any independent servers. There's clients, libraries, bots, but no servers, am I missing something?
My question was motivated by the fact that from the outside the AT proto ecosystem looks pretty monocultural, and personally I don't trust that. :)
Federation was just opened up with this announcement. I don't think there was a lot of energy for working on independent PDSes until after this has happened. In the past day, a bunch of copies of this reference PDS have been deployed. We'll see how things change in the future.
Basically, you're right, but just because you're asking early on. This is about to change real quick.
Congratulations on the release! If I may ask a question: is it possible to register an account without a phone number on a third-party server?
Thanks!
Yes, it's totally up to a PDS operator to decide how they create user accounts. It's also not required on the Bluesky PDS service any longer, in most cases.
By default the self-hosted PDS requires an invite code, to prevent random people from creating an account. Later other options will exist, including OAuth support which is coming soon.
That's great, thanks!
That's also nice to hear. The last time I tried to register an account (shortly after the free registration launch), the phone number field in the registration form was marked as required, if I am not mistaken.
Yeah, you're right, it was. That was a temporary measure during the public launch to prevent spam/abuse. We've made some improvements here recently.
I'm a little confused why the PDS server is both dockerized and has an installation exclusive to Ubuntu/Debian.
Yeah, there's nothing preventing someone from running the PDS server on other distributions. The installer just does a few convenient things for you (like installing Docker, opening ports 80/443 using ufw, etc.) and we haven't added and tested support for other distributions.
There is a Docker compose file in the repo, and advanced users shouldn't have any problems running the code on another distribution or even without Docker if they prefer.
Advanced users can just view the installer script as documentation.
Why do you need to open ufw if it runs in Docker? Docker does its own routing magic and will happily blast right through any ufw rules.
Very cool to see this available though, I might have to try it out later this week!
Hi. If the protocol is open, the software is free and the main instance openly federates with self-hosters, what's the monetization strategy here? Clearly it's not "harvest all the data and figure it out later" as that avenue seems to be shut down internationally by strengthened privacy laws and ads don't work well with federation and third party clients. Is "grow first, figure out how to make money later" still a viable strategy in this economy?
managed hosting perhaps? It works in the email industry at least (Google and Microsoft nearly dominate the email biz)
Yeah but that assumes ATP reaches anything even remotely approximating the ubiquity of email rather than ending up like Google Wave (not literally by being handed off to Apache - which took Wave behind the barn in 2018 in case you're wondering what happened to it).
Will this work for bare metal?
I use BSD, and all I see is an installer for Debian/Ubuntu.
No guide in sight for bare metal nor telling you what services/software are required.
yeah it works fine on bare metal, you'll just have to do a bit more set up work yourself (https terminating and such). The installer script should be instructive in how to run it but you'll have to figure out the BSD specific stuff
Look into the service folder in the repo—this repo is just a very thin packaging wrapper for a JS library, which you should be able to run anywhere you can run Node.
Gonna be that guy!
Any chance the team could create a Home Assistant add-on for this? https://www.home-assistant.io/addons/
I think the Home Assistant community would go WILD for being able to self-host their Bluesky data straight from home with just a few clicks.
It's a pretty big crowd of people. https://analytics.home-assistant.io/ 327k willing to opt-in to analytics.
we need a new version of Zawinski's Law: every system capable of deploying plugins will eventually expand until it is a full hosting solution.
I know if there's one thing I'm eager to do it's to host even more stuff in that clunky piece of shit that has half a dozen main menu items for nonsense and buries everything of interest or value under "Settings"
The add-ons are just docker containers?
It's wasteful to get an entire second machine for something that can use the resources available on the machine running Home Assistant OS
If I wanted to create a consumer hardware product that packages the PDS host in a user-friendly interface, does the software license permit that?
Also, services like Twitter started off with a developer friendly open API, and then it got closed off when the business needed to make money off the platform. What's the difference with Bluesky?
(I don't work at bluesky)
It's MIT/Apache 2.0 licensed, so yes. However, because it's also an open protocol, even if it wasn't, you could write your own under whatever license you want.
BlueSky is built off of an open protocol, called AT. https://atproto.com/ BlueSky is a particular app built on the protocol. As such, there's no way to "turn off the API," as BlueSky itself is a participant in the open protocol.
They could like, re-write everything to be a central service, port the user data over to it, and then pull out from the network, but then two things would happen:
1. stuff would break, as it's no longer part of the network.
2. since there is true account portability, users could simply swap to a different PDS and client, and re-route around the damage.
Also given that it's against their entire stated mission and goals, it would be social suicide.
It would probably be worth clarifying in that repo what the license is for both the code in that repo and the code that it's actually running. It looks like it's just a very thin wrapper around @atproto/pds, which is MIT/Apache 2.0 [0], but the repo you link to has no license.
Edit: now it has one! Thanks!
[0] https://www.npmjs.com/package/@atproto/pds
Yup, it's MIT/Apache 2.0. We'll fix that. Thanks for the heads up.
Given the PDS server works on ports 80/443 and I'd like to use a domain (@nytimes.com in the documentation, but say @example.com), how does it interoperate with existing services that already operate on @example.com , for example a website, blog, cloud.
I'd imagine this use case is quite common for self hosters. If it can't operate alongside an existing, say, nginx on this port, are there recommended alternate practices?
I'm excited about separating identity from hosting, and self-hosting identity gets us closer to that.
Now that individual posts can be viewed without logging in, is there a way to view/load a feed without authentication?
I'm working on a client and there's a specific scenario where I want to be able to show a feed like "Top 20 - Past 3 Hours" before a user has logged in to their Bluesky account.