So, can ATProto scale down? Have Bluesky's "scaling towards decentralization" issues been fixed?
Not fundamentally. There have been advancements in self-hosting efforts, and they're good, but my fundamental analysis that Bluesky and ATProto scale quadratically has not changed.
Context on my previous blogposts:
How decentralized is Bluesky really? https://dustycloud.org/blog/how-decentralized-is-bluesky/
Re: Re: Bluesky and Decentralization https://dustycloud.org/blog/re-re-bluesky-decentralization/
Every now and then someone posts about an advancement from someone experimenting with self-hosting ATProto infrastructure. I think those experiments are good, but they don't change my fundamental analysis. If anything, the recent changes reinforce it: Bluesky/ATProto/the shared heap pattern still scales quadratically, and is expensive to run.
The first post, which a bunch of people asked me to comment on, is "Can atproto scale down?" https://bsky.bad-example.com/can-atproto-scale-down/
It's a positive effort, and the author is working on testing self-hosting, but the conclusion given at the start is too strong and leaves out the middle.
I like this post! I want to make that clear. And it's also very positive about my writing. I'm glad the author of the post is experimenting with things and I'm happy to see people try to host more of ATProto's tech pieces. That's great!
But! It's still a stripped down AppView. Very stripped down.
But I think the key paragraphs are towards the end:
> I'd like to think of this as a bottom-up approach to scaling down. Can it get us to decentralization? If we scale to millions of copies of micro-AppViews, it will burden the relays.
This highlights point one: lots of micro-AppViews still burden the relays.
Point two:
> If we approach content hydration heavy fetching against PDSs, it could overload them. If you self-host a viral skeet will you get a surprise bandwidth bill?
Point three:
> That's a silly thought experiment, because operating and orchestrating all the little services is not within reach for many people even if it is cheap. It will probably be a tiny number. So then is this a meaningful contribution to decentralization?
(I don't think it's silly)
One of the big hopes at the end is pinned on Free Our Feeds: that there will be a *second* big player hosting relays, and that this will approach decentralization.
See my original blogposts critiquing the idea that just a few instances run by big players counts as "decentralization".
But the other suggestion for saving things is, maybe there will be a different relay approach?
Well, there's been some work on that too, so let's examine if that changes the game: https://bsky.app/profile/bnewbold.net/post/3lkpdjgj5pk2i
What's being looked at is lightweight relays that are effectively relay *mirrors*.
But what's making them avoid the world-sized growing database is that they're dropping data older than 72 hours.
Given that Bluesky's major argument is "no missed replies!", this isn't a fundamental solution to the shared heap challenge: a mirror that has dropped older posts can't, on its own, tie a new reply back to them.
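To make the retention point concrete, here's a minimal sketch of a mirror that only keeps a rolling 72-hour window. This is not the actual relay code: `RollingMirror`, `ingest`, and `has` are hypothetical names, and the sketch assumes events arrive in timestamp order.

```python
from collections import deque
from datetime import datetime, timedelta, timezone

# The 72-hour window comes from the thread; everything else is illustrative.
RETENTION = timedelta(hours=72)


class RollingMirror:
    """Hypothetical mirror that forgets events older than the retention window."""

    def __init__(self, retention: timedelta = RETENTION) -> None:
        self.retention = retention
        self.events: deque = deque()  # (timestamp, event_id), oldest first

    def ingest(self, timestamp: datetime, event_id: str) -> None:
        # Assumes events arrive in timestamp order.
        self.events.append((timestamp, event_id))
        self._prune(now=timestamp)

    def _prune(self, now: datetime) -> None:
        cutoff = now - self.retention
        while self.events and self.events[0][0] < cutoff:
            self.events.popleft()

    def has(self, event_id: str) -> bool:
        return any(eid == event_id for _, eid in self.events)


mirror = RollingMirror()
start = datetime(2025, 1, 1, tzinfo=timezone.utc)
mirror.ingest(start, "old-post")
mirror.ingest(start + timedelta(hours=73), "reply-to-old-post")
print(mirror.has("old-post"), mirror.has("reply-to-old-post"))
```

The storage stays bounded, which is the whole point, but the reply that arrives 73 hours later lands on a mirror that no longer holds the post it replies to. Something else still has to keep the full history.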
And none of this even remotely challenges my primary original argument: if there is no central source of authority and things are really decentralized, the network's costs scale *quadratically* as it moves towards decentralization.
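A back-of-the-envelope sketch of that argument, under loud assumptions: every node ingests every post (the shared heap pattern), and every node's users post about the same amount. The function names and numbers are illustrative only. `addressed_deliveries` is a simplified contrast in which each post goes to a fixed number of recipient nodes; it is not a model of any particular protocol.

```python
def full_copy_deliveries(nodes: int, posts_per_node: int = 1) -> int:
    """Shared heap: every node receives every post from every node."""
    total_posts = nodes * posts_per_node
    return nodes * total_posts  # grows with nodes squared


def addressed_deliveries(
    nodes: int, posts_per_node: int = 1, recipients_per_post: int = 50
) -> int:
    """Assumed contrast: each post is delivered only to its recipient nodes."""
    total_posts = nodes * posts_per_node
    return total_posts * min(recipients_per_post, nodes)


for n in (10, 100, 1_000, 10_000):
    print(n, full_copy_deliveries(n), addressed_deliveries(n))
```

In the full-copy model, doubling the number of nodes quadruples the total deliveries. In the addressed model, once the network is bigger than the typical audience, doubling the nodes only doubles them.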
These efforts are still good. It's good to see them!
But every now and then someone points me to them and says "see! all those decentralization concerns are over." I still don't think they will be without an architecture change to message addressing.
But if you want me to answer the question of "Can atproto scale down?" my answer is still: no. :)
ATProto still cannot scale down and scale wide at the same time, at least not towards meaningful decentralization.
Thanks for listening!
@cwebber I really appreciate this thread and loved your prior blog post. Especially because it’s very much just looking at the facts, and your respect for the people behind ATProto shines through just as strongly as your realistic criticism of the protocol.
@cwebber@social.coop I always find it funny that bsky argues that this model is good because it leads to no missed replies, when it only avoids missed replies if it's centralised.