Worried about the dominance of big instances? No, really, this is quite natural.

As an emergent and self-governing system, it could be expected that the size distribution of instances roughly follows Zipf's law.

Does it?

At first you see the top 6 instances, and then the rest. But on a log-log scale the size distribution is close to a straight line, which would be expected from an emergent system.
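The straight line on the log-log plot follows from the power-law form of Zipf's law. As a rough sketch, with n_k the user count of the instance at rank k, and C and s as fit constants (the symbols are chosen here for illustration; s is the exponent quoted later in the thread):

```latex
n_k \approx \frac{C}{k^{s}}
\quad\Longrightarrow\quad
\log n_k \approx \log C - s \log k
```

So on log-log axes the slope of the line is roughly -s, and the intercept is the log of the largest instance's size.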



The deviations are perhaps due to the still young age of the fediverse. Expect it to smooth out. But still, expect that the big instances will always dominate.

Data: 200 biggest instances from instances.social.



Oh, one bonus toot. Based on the top 200 instances, the s factor of Zipf's law on the fediverse is approximately 1.3. If all the instances were taken into account, the factor could change. But I didn't find a quick way to grab the table other than manually, so I only used the top 200 instances.


@mayel Thank you! Now if I only knew an easy way to extract just the user counts from that... :D

@mayel more like this:

cat list.json | tr ',' '\n' | grep users | tr -d '"users":' | sort -nr > usercount.txt

Seems to do what I want.
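For comparison, here is a minimal Python sketch of the same extraction that parses the JSON instead of splitting on commas. The layout of list.json is not shown in the thread, so the function simply walks the whole structure and collects every integer stored under a "users" key; the sample data and the other field names are made up for illustration.

```python
import json

def extract_user_counts(payload):
    """Return every integer found under a "users" key, largest first.

    Walks the whole parsed JSON structure, so it does not depend on
    where the "users" keys sit (the real layout is an unknown here).
    """
    counts = []
    stack = [payload]
    while stack:
        node = stack.pop()
        if isinstance(node, dict):
            for key, value in node.items():
                is_count = isinstance(value, int) and not isinstance(value, bool)
                if key == "users" and is_count:
                    counts.append(value)
                else:
                    stack.append(value)
        elif isinstance(node, list):
            stack.extend(node)
    return sorted(counts, reverse=True)

# Hypothetical sample; only the "users" key comes from the thread.
sample = json.loads(
    '{"instances": ['
    '{"name": "a.example", "users": 120},'
    '{"name": "b.example", "users": 4500},'
    '{"name": "c.example", "users": null}]}'
)
print(extract_user_counts(sample))
```

One thing to keep in mind about the shell version: `tr -d '"users":'` deletes every occurrence of the characters `"`, `u`, `s`, `e`, `r` and `:` rather than the literal string, which works here only because the digits are what is left over.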

@mayel Ok, now that I got the data for 1746 instances, I can say that the s factor is about 1.33777, so pretty close to the original approximation of 1.3.
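The thread does not say how the s factor was computed. One common approach, shown here only as an assumption, is an ordinary least-squares fit of log(size) against log(rank), where s is the negative of the slope. The function name and the synthetic data below are illustrative.

```python
import math

def fit_zipf_exponent(sizes):
    """Least-squares slope of log(size) against log(rank); returns s = -slope."""
    ranked = sorted((x for x in sizes if x > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive sizes to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(size) for size in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx

# Synthetic data that follows size = 100000 * rank**-1.3 exactly,
# used only to check that the fit recovers the exponent.
synthetic = [100000 * rank ** -1.3 for rank in range(1, 201)]
print(round(fit_zipf_exponent(synthetic), 3))
```

A plain log-log regression is sensitive to the noisy tail of small instances, so a fit over the top 200 and a fit over all 1746 can reasonably give slightly different values.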

@mayel @Stoori
So the total number of #Mastodon user accounts is

(slurp "instances.social/list.json?q%5")))))))


One and a quarter million.

@mayel @Stoori By that methodology we're just over 1.5 million now, but more analysis shows that roughly one in six accounts are 'active', so the number of people participating is much less.
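The arithmetic behind that estimate can be sketched as follows. The per-instance counts are made-up placeholders that happen to sum to 1.5 million, and the one-in-six ratio is taken from the toot above, not measured here.

```python
def total_and_active(user_counts, active_ratio=1 / 6):
    """Return (total accounts, estimated active accounts)."""
    total = sum(user_counts)
    return total, total * active_ratio

# Placeholder counts, not real instance data.
total, active = total_and_active([900_000, 400_000, 200_000])
print(total, round(active))
```

With 1.5 million accounts and a one-in-six ratio, the estimate comes out at about a quarter of a million active accounts.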

Still, the key point here is that the trend is currently sharply up.

@Stoori In the context of technology though it's important to keep in mind that:

Zipf's law is often driven by unknown or unexamined variables, and it is not inherently the 'natural' case for every social system unless closer inquiry has ruled out other potential causes, and

It is a mistake to consider a natural property of emergent systems as a *desirable* property of any technology, or one that promotes the best conditions for its proliferation and use.

@Ashrand Yeah, sure. The point is, if the fediverse is not centrally governed (that is, it emerges by itself as a laissez-faire system), it will end up approximating Zipf's law.

Of course, the question now is whether there should be some kind of central governance of the fediverse to counter this development.

@Stoori I think that is why I consider it something to worry about.
If the point is decentralization, then you have already lost once people either have to treat the central instances that dictate the spec and the content standards as 'in charge', or accept a stewardship of some kind that manages things in the same way. If the point *isn't* decentralization, then you first need a broader conversation about the goals the project has or should have, and how it is doing right now.

@Ashrand The easiest way would be to implement a hardcoded maximum number of users per instance (e.g. 10,000).

Of course the limit could be forked away, but then, in any case, how would you stop instances from growing too much? Stop federating with oversized instances? That would in practice split the fediverse into different sub-fediverses that follow different rules.

@Stoori Well, aside from the fact that splitting in that way is only a bad thing if you consider the point to be a de facto service 'for everyone' with a single agreed set of rules for conduct, it also assumes that the issue is technical. The whole reason silos like Facebook and Twitter are so toxic is that they try to provide technical solutions to the social problem they created by trying to provide what amounts to a single instance for the whole world.

@Ashrand Hmm. I see it more like this: If there's a hardcoded user maximum, then every instance going rogue against it would condemn itself to be on a road to a silo, an outcast of the wider fediverse.

And there's always the other end of the distribution. Yes, there are a few massive instances, but there are thousands of smaller instances. That's what will always be missing from siloed networks.

@Stoori @Ashrand

why would smaller instances be missing from 'siloed' networks? a protective garden is sometimes walled, specifically to protect the fragile flowers that won't grow anywhere else

cellular, distributed growth and encapsulation would, I'd think, foster the growth of tons of small instances that benefit from not being squashed?

@sydneyfalk @Ashrand It's more a question of terminological definition: 'silo' or 'walled garden' is, by definition, a monolith.

Of course a nebula of tiny instances around one giant is a possible sub-fediverse topology, but I wouldn't call it a silo.

@Stoori @Ashrand

Neither would I -- and just as different organisms have smaller communities of cells and organisms that make them up physically, I wouldn't see a reason multiple feds might not exist, independent of each other, perhaps with population balancing of some sort.

Once there's a 'largest' of some sort, in social media, there tends to be problematic power effects down the line :(

I worry about it here, but I suppose only time will tell if that's how things will go.

@Stoori @Ashrand I'm curious what your graphs look like if you only consider recently active users - anecdotally I've seen a lot of people serially hop between instances b/c, say, the local timeline is too busy to be useful, or a new, smaller, more targeted instance feels homier to them

I suspect some combination of making it dead easy to start instances / move instances w/o losing followers/history + individuals wanting to be on moderately sized (or moderateable!) instances will give us a fat tail

@Lioness @Ashrand That would be interesting. However, the data for this is not readily available.

Another measure (available from the same source) could be the number of statuses, which of course grows only when users are active on an instance.

@Stoori Imposing an interconnect cost based on accounts within the protocol would be more robust than code. Though that could be defeated through spinning up more instances.


@Stoori @Ashrand

like how the largest neighborhood on earth doesn't contain 1/3 of all of us humans, it may be that this isn't a dataset that will follow Zipf's law inexorably, but instead will distribute like skin cells, some larger and some smaller based on function, but none 1/3 of all the mass

or not, don't know, strange morning

@sydneyfalk @Ashrand 1/3 or 1/4 of the mass is indeed a lot for one conglomeration. It could be more, or it could be less. The exact shape of the distribution depends on the parameters, while the distribution itself is a good fit.

So in the future, when the fediverse population is 10m or 100m, the biggest instance may have a smaller weight, and yet the distribution would be Zipfian.
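How the biggest instance's weight depends on the parameters can be checked numerically. The sketch below assumes an exact Zipf law, where the instance at rank k has a size proportional to k to the power of -s; the function name is illustrative, and the exponent is the 1.33777 quoted earlier in the thread.

```python
def top_share(num_instances, s):
    """Share of all users held by the rank-1 instance under an exact Zipf law."""
    weights = [rank ** -s for rank in range(1, num_instances + 1)]
    return weights[0] / sum(weights)

# Compare the top instance's share for different numbers of instances.
for n in (200, 1746, 100_000):
    print(n, round(top_share(n, 1.33777), 3))
```

For s above 1 the sum of the weights converges as more instances are added, so under this idealized model the top share shrinks only slowly with the number of instances; a noticeably smaller share would mostly have to come from a lower s.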

We'll see. This is an interesting phenomenon about emergence.

@Stoori @Ashrand

I'm personally hoping for more of a cell-style growth approach, because I'm increasingly convinced that hierarchical structures built to sink control of resources in a 1/3 Zipfian are kind of why there's bottlenecks of power and those are the source of a lot of unpleasantries

(but it's not like I'm really the person who would know, it's probably paranoid rambling)

@sydneyfalk @Ashrand Mere specialization doesn't prevent a Zipfian distribution, as is evident from company size data, for example.

So really having an atomic distribution, where even the largest instance would be a tiny fraction of the whole, would require a concerted effort to counteract the natural size distribution.

In a subset of instances this might happen, but on the whole there are always those who don't share the same goal.

@Stoori @Ashrand

while I understand this, the fact that a subfed or even a second or third fed can be created may help flatten things

I hope, anyway, because power consolidation is community poison, and user base size is power in social media

@Stoori @Ashrand what's Zipf's law again? also i think kenneth arrow might be useful!

@Stoori i tend to be more worried about the tail getting cut off than it disappearing naturally

We’ve had a lot of problems with spam lately. What if, in the future, the, say, ten biggest instances decided it was too much of a problem and that they’d only federate with each other? Sure, technically you could still run a small instance, but with so many of the newcomers at least starting out on m.s these days, it’s possible 99% would never even think to check if you were out there

@Satsuma This makes me think it would be interesting to see how the spam accounts are distributed around the fediverse. Do they concentrate on big, medium or small instances?

I guess that data is almost impossible to gather.

@Stoori i know my instance (under 15,000) got a fair amount while we had open registrations, so they’re definitely not /just/ targeting huge instances

@Stoori this is obviously a more extreme example, and also something that’d be pretty shocking if it happened in the next several years

But large email servers' spam filters are notoriously aggressive towards small email servers so like, it’s not completely implausible
