All of my comments are licensed under the following license
Maybe we should look for ways of tracking coordinated behaviour. One definition I’ve heard for social media propaganda is “coordinated inauthentic behaviour”, and while I don’t think it’s possible to determine whether a user is being authentic, it should be possible to see whether there is consistent behaviour across different kinds of users and what they are coordinating on.
Edit: Because all bots have a purpose eventually, and that should be visible.
Edit2: Eww realized the term came from Meta. If someone has a better term I will use that instead.
Kagi doesn’t really have its own index either. It mainly relies on other search engines as well, and the indexes of its own that focus on small-web stuff are better done by marginalia.nu, which is also open source.
It is a meta-search engine so it takes results from other search engines and shows the results. Usually you can decide which search engines to use in preferences. You can host it yourself or find an online instance to use.
I think the observer shows daily and monthly stats for active users per month and active users per half year, so the active users per month wouldn’t change as fast.
Also, about it being a botfarm, I do think that is a possibility. Actually there is more evidence for it when you extend the graph to 120 days and see a huge uptick in users and servers at the same time. Edit: 2024-7-29
Edit: wording
I was talking about the fediverse observer. It wouldn’t show up immediately there.
Not immediately though, right? Since the active users are counted per month or half-year. Or does it automatically update that too?
Most searxng instances have a similar lens for lemmy comments so you can do that too if you want an open source alternative.
Probably but which instance has over 70,000 users?
Long distances actually don’t really mean much; it can’t be guaranteed that they correlate to anything. It is mostly the local groups that are preserved, plus a bit of the global structure.
I had to try scraping the websites multiple times because of stupid bugs I put in the code beforehand, so I might have put more strain on the instances than I meant to. If I did this again it would hopefully be much less taxing on the servers.
As for the cost of scraping, it actually isn’t that hard. I just had it running in the background most of the time.
Yeah that sounds like a good idea so you can see how connected local communities are. Probably makes more sense to use original dimensions so no extra information is lost.
Total communities: 2986
Total users: 21934
So the dimensions were reduced from (2986, 21934) to (2986, 2)
Edit: Also yeah, it is using UMAP for the algorithm, and it does do something pretty similar to what you described.
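For anyone curious what "use original dimensions" could look like: below is a minimal sketch, not the code behind the map. It assumes each community is a row of per-user comment counts and uses cosine similarity as the closeness measure, which is my own pick for illustration. The community names, the numbers, and the helper `nearest_communities` are all made up.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a community with no comments is similar to nothing
    return dot / (norm_a * norm_b)

def nearest_communities(name, rows, k=3):
    """Rank the other communities by similarity to `name`, highest first."""
    target = rows[name]
    scored = [
        (cosine_similarity(target, vec), other)
        for other, vec in rows.items()
        if other != name
    ]
    scored.sort(reverse=True)
    return [(other, score) for score, other in scored[:k]]

# Hypothetical data: one row per community, one column per user.
rows = {
    "steamdeck": [3, 1, 0, 0],
    "linux_gaming": [2, 1, 0, 0],
    "cooking": [0, 0, 4, 1],
}

neighbours = nearest_communities("steamdeck", rows, k=2)
for other, score in neighbours:
    print(f"{other}: {score:.3f}")
```

Since this works on the full user columns, nothing gets thrown away by the 2D projection, but with around 22k users you would probably want sparse vectors instead of plain lists.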
I was somehow able to get both a picture and url added and it looks much better. Thx.
Either the people in !steamdeck@lemmy.world are pretty horny or it’s an artifact of the dimensionality reduction and means nothing.
Edit: Actually it could also be that it just didn’t collect enough data on that community and the most recent person was also active in nsfw communities. I was only able to get back 14ish days in the data for lemmy.world. They produce way too many comments and I got kicked out early.
Yeah pretty much. I wanted to see communities that had similar people commenting, because I thought that would be a good way to see if similar kinds of discussions were happening in those communities.
For example most of the red dots to the top right are nsfw communities and it was able to clump like that because the people that comment in those communities tend to comment in the other nsfw communities as well.
edit: left -> right
I didn’t measure activity for this map. Each dot represents a community. I only used the communities that were on the top 35 instances (except lemmings.world, which it couldn’t grab any comments for).
Well I used dimensionality reduction to make it 2D so the axes are how the algorithm chose to compress it.
The original data had each data point as a community, and each feature was how often a given user posted in that community.
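Roughly, that layout can be built like this. It is only a sketch with made-up sample data, not my actual scraping code; the `(community, user)` pair format and the `build_matrix` helper are assumptions for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical sample: one (community, user) pair per scraped comment.
comments = [
    ("steamdeck", "alice"),
    ("steamdeck", "bob"),
    ("linux", "alice"),
    ("linux", "alice"),
    ("linux", "carol"),
]

def build_matrix(comments):
    """Return (communities, users, matrix), where matrix[i][j] is the
    number of comments users[j] made in communities[i]."""
    counts = defaultdict(Counter)
    for community, user in comments:
        counts[community][user] += 1
    communities = sorted(counts)
    users = sorted({user for _, user in comments})
    # Counter returns 0 for users who never commented in a community.
    matrix = [[counts[c][u] for u in users] for c in communities]
    return communities, users, matrix

communities, users, matrix = build_matrix(comments)
print(communities)  # row labels
print(users)        # column labels
print(matrix)       # one row of per-user counts for each community
```

Each row is then one data point for the dimensionality reduction, so the real thing ends up with one row per community and one column per user.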
There is actually already a website where people just recreated the Bee Movie by hand, so idk, it might actually work as a legal argument.
Agreed. No one needs to answer this thread. Actually don’t upvote either. Just think and share with others.
Anti Commercial-AI license (CC BY-NC-SA 4.0)