Worried about the dominance of big instances? No, really, this is quite natural.

Since the fediverse is an emergent, self-governing system, the size distribution of instances could be expected to roughly follow Zipf's law.

Does it?

At first you see the top 6 instances, and then the rest. But on a log-log scale the size distribution is close to a straight line, which would be expected from an emergent system.

1/
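
For reference, the straight line on a log-log plot follows directly from the usual rank-size form of Zipf's law. A short derivation, where the symbols n(r) (user count of the instance at rank r), C (a constant) and s (the exponent) are notation chosen here for illustration, not taken from the plot:

```latex
n(r) = C \, r^{-s}
\quad\Longrightarrow\quad
\log n(r) = \log C - s \log r
```

So under this form the slope of the line on the log-log plot is -s.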

The deviations are perhaps due to the still young age of the fediverse. Expect it to smooth out. But still, expect that the big instances will always dominate.

Data: 200 biggest instances from instances.social.

en.wikipedia.org/wiki/Zipf's_l

2/END


Oh, one bonus toot. Based on the top 200 instances, the s factor of Zipf's law on the fediverse is approximately 1.3. If all the instances were taken into account, the factor could change. But I didn't find a quick way to grab the table other than manually, so I only used the top 200 instances.

@mayel Thank you! Now if I only knew an easy way to extract just the user counts from that... :D

@mayel more like this:

grep -oE '"users": *"?[0-9]+' list.json | grep -oE '[0-9]+$' | sort -nr > usercount.txt

Seems to do what I want.
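
The same extraction can also be sketched with a real JSON parser, which avoids depending on how the file happens to be formatted. This is a minimal sketch, assuming the file has the shape `{"instances": [{"users": ...}, ...]}` (inferred from the Clojure snippet later in this thread, not from any API documentation); the function name `user_counts` and the sample data are made up for illustration.

```python
import json


def user_counts(raw_json: str) -> list[int]:
    """Return the non-null user counts, largest first.

    Assumes a top-level "instances" list whose items may carry a
    "users" field (an assumption about the layout, see lead-in).
    """
    data = json.loads(raw_json)
    counts = [inst.get("users") for inst in data.get("instances", [])]
    # Instances that do not report a count come through as None; skip them.
    return sorted((c for c in counts if c is not None), reverse=True)


# Tiny made-up sample in the assumed layout, not real data.
sample = (
    '{"instances": ['
    '{"name": "a.example", "users": 120}, '
    '{"name": "b.example", "users": null}, '
    '{"name": "c.example", "users": 4500}]}'
)
print(user_counts(sample))
```

Skipping the null counts mirrors what `(remove nil? ...)` does in the Clojure version.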

@mayel Ok, now that I got the data for 1746 instances, I can say that the s factor is about 1.33777, so pretty close to the original approximation of 1.3.
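
For anyone wanting to reproduce an estimate like this: one common way to get s is an ordinary least-squares line through log(size) against log(rank). This is only a sketch. The thread does not say which fitting method was actually used, and the function name `zipf_exponent` and the synthetic data below are made up for illustration.

```python
import math


def zipf_exponent(sizes):
    """Estimate s as the negated least-squares slope of log(size) vs log(rank).

    Needs at least two positive sizes; zero or negative counts are dropped
    because their logarithm is undefined.
    """
    ranked = sorted((s for s in sizes if s > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(size) for size in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx


# Synthetic check: sizes generated from an exact power law with s = 1.3,
# so the fit should recover that exponent.
synthetic = [100000 * rank ** -1.3 for rank in range(1, 201)]
print(round(zipf_exponent(synthetic), 3))
```

To run it on real data, one could read the counts with something like `[int(line) for line in open("usercount.txt") if line.strip()]`, assuming one count per line as the shell pipeline above produces. A plain log-log regression is simple but can be sensitive to the noisy tail, so a different estimator may give a somewhat different s.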

@mayel @Stoori
So the total number of #Mastodon user accounts is

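;; Assumes org.clojure/data.json is on the classpath.
(require '[clojure.data.json :as json]
         '[clojure.walk :refer [keywordize-keys]])

(reduce +
        (remove nil?
                (map :users
                     (:instances
                      (keywordize-keys
                       (json/read-str
                        (slurp "instances.social/list.json?q%5")))))))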

1295348

One and a quarter million.

@mayel @Stoori By that methodology we're just over 1.5 million now, but more analysis shows that roughly one in six accounts are 'active', so the number of people participating is much less.

Still, the key point here is that the trend is currently sharply up.

@Stoori In the context of technology though it's important to keep in mind that:

Zipf's law is often driven by unknown or unexamined variables and is not inherently the 'natural' case for all social systems without closer inquiry to eliminate potential causes as the culprit and -

It is a mistake to consider a natural property of emergent systems as a *desirable* property of any technology, or one that promotes the best conditions for its proliferation and use.

@Ashrand Yeah, sure. The point is, if the fediverse is not centrally governed (that is, it emerges by itself as a laissez-faire system), it will end up approximating Zipf's law.

Of course, now the question is: should there be some kind of central government of the fediverse to counter this development?

@Stoori I think that is why I consider it something to worry about.
If the point is decentralization, then you have already lost once people must either treat the central instances that dictate the spec and the content standard as 'in charge' or accept a stewardship of some kind to manage things in the same way. If the point *isn't* decentralization, then you first need a broader conversation about the goals the project has or should have, and how it is doing right now.

@Ashrand The easiest way would be to implement a hardcoded maximum number of users per instance (e.g. 10,000).

Of course it could be forked away, but then, in any case, how to stop instances growing too much? Stop federating with oversized instances? That would in practice split the fediverse into different sub-fediverses that are following different rules.

@Stoori Well, aside from the fact that splitting in that way is only a bad thing if you consider the point to be a de facto service 'for everyone' with a single agreed set of rules for conduct, it also assumes that the issue is technical. The whole reason silos like Facebook and Twitter are so toxic is that they try to provide technical solutions to the social problem they created by trying to provide what amounts to a single instance for the whole world.

@Ashrand Hmm. I see it more like this: If there's a hardcoded user maximum, then every instance going rogue against it would condemn itself to be on a road to a silo, an outcast of the wider fediverse.

And there's always the other end of the distribution. Yes, there are a few massive instances, but there are thousands of smaller instances. That's what will always be missing from siloed networks.

@Stoori @Ashrand

why would smaller instances be missing from 'siloed' networks? a protective garden is sometimes walled, specifically to protect the fragile flowers that won't grow anywhere else

cellular, distributed growth and encapsulation would, I'd think, foster the growth of tons of small instances that benefit from not being squashed?

@sydneyfalk @Ashrand It's more a question of terminological definition: 'silo' or 'walled garden' is, by definition, a monolith.

Of course a nebula of tiny instances around one giant is a possible sub-fediverse topology, but I wouldn't call it a silo.

@Stoori @Ashrand

Neither would I -- and just as different organisms have smaller communities of cells and organisms that make them up physically, I wouldn't see a reason multiple feds might not exist, independent of each other, perhaps with population balancing of some sort.

Once there's a 'largest' of some sort, in social media, there tend to be problematic power effects down the line :(

I worry about it here, but I suppose only time will tell if that's how things will go.

@Stoori @Ashrand I'm curious what your graphs look like if you only consider recently active users - anecdotally I've seen a lot of people serially hop between instances b/c, say, the local timeline is too busy to be useful, or a new, smaller, more targeted instance feels homier to them

I suspect some combination of making it dead easy to start instances / move instances w/o losing followers/history + individuals wanting to be on moderately sized (or moderateable!) instances will give us a fat tail

@Lioness @Ashrand That would be interesting. However, the data for this is not readily available.

Another measure (available from the same source) could be the number of statuses, which of course grows only when users are active on an instance.

@Stoori Imposing an interconnect cost based on accounts within the protocol would be more robust than code. Though that could be defeated through spinning up more instances.

@Ashrand

@Stoori @Ashrand

like how the largest neighborhood on earth doesn't contain 1/3 of all us humans, it may be that this isn't a dataset that will follow Zipf's law inexorably, but instead will distribute like skin cells, some larger and smaller based on function, but none 1/3 of all the mass

or not, don't know, strange morning

@sydneyfalk @Ashrand 1/3 or 1/4 of the mass is indeed a lot for one conglomeration. It could be more, or it could be less. The exact shape of the distribution depends on the parameters, while the distribution itself is a good fit.

So in the future, when the fediverse population is 10m or 100m, the biggest instance may have a smaller weight, and yet the distribution would be Zipfian.
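
To make the dependence on the parameters concrete: under an idealized Zipf law with exponent s over N instances, the rank-1 instance holds 1 / (1^-s + 2^-s + ... + N^-s) of all accounts. The sketch below just evaluates that sum. It assumes a perfect power law, which the real data only approximates, and the function name `top_share` and the choice of N values are illustrative.

```python
def top_share(n_instances: int, s: float) -> float:
    """Share of all accounts held by the rank-1 instance under an ideal Zipf law."""
    # Generalized harmonic number: sum of rank^-s over all ranks.
    harmonic = sum(rank ** -s for rank in range(1, n_instances + 1))
    return 1.0 / harmonic


# Exponent taken from the estimate earlier in the thread; the instance
# counts are arbitrary illustration points.
for n in (200, 1746, 100_000):
    print(n, round(top_share(n, 1.33777), 3))
```

With s = 1.33777 and 1746 instances this comes out at roughly 0.3, and the share shrinks only slowly as more instances are added while s stays fixed. A change in s would move it much more than a change in N.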

We'll see. This is an interesting phenomenon about emergence.

@Stoori @Ashrand

I'm personally hoping for more of a cell-style growth approach, because I'm increasingly convinced that hierarchical structures built to sink control of resources in a 1/3 Zipfian are kind of why there's bottlenecks of power and those are the source of a lot of unpleasantries

(but it's not like I'm really the person who would know, it's probably paranoid rambling)

@sydneyfalk @Ashrand Mere specialization doesn't prevent a Zipfian distribution, as is evident from company size data, for example.

So really having an atomic distribution, where even the largest instance would be a tiny fraction of the whole, would require a concerted effort to counteract the natural size distribution.

In a subset of instances this might happen, but on the whole there are always those who don't share the same goal.

@Stoori @Ashrand

while I understand this, the fact that a subfed or even a second or third fed can be created may help flatten things

I hope, anyway, because power consolidation is community poison, and user base size is power in social media

@Stoori @Ashrand what's Zipf's law again? also i think kenneth arrow might be useful!
