Separating fact from fiction regarding the recent drama.
May 13, 2024
NOTE: This post is currently a draft. It's unfinished.
(This was made rather quickly and will probably be edited a lot in the next few days or so: there might be an occasional spelling error or something like that in this revision.)
So, as some people know, there has been recent drama about me on the fediverse. A LOT of drama.
For once in my life I’m going to create an entire blog post explaining things, in detail, with honesty. I never saw the need to do this before… but this is big.
Since basically nobody knows (which is entirely my fault, by the way - I never made proper documentation), here are the definitions for the terms thrown around, from the person who actually came up with them (me):
And yes, like many people say, I did have to create a bunch of terms. But that’s because existing terms for what I was trying to describe didn’t (and to my knowledge still don’t) exist.
It’s not. If you take a look at what I’ve said about it, as well as the relevant code, you will quickly realize what it is - it’s an automated filter.
So why do I have a dataset then, and what is it even called?
The why part is pretty clear - it’s for testing accuracy. I didn’t go very far with that idea though. I ended up just having a dataset for future reference.
As for the name, it doesn’t really have one. I have, however, referred to it as “the AutoDNI dataset” before.
The file path on my main computer for it is also /media/bob/external/git/autodni/dataset, so, yeah, I can definitely see where this confusion comes from.
It’s not. I don’t add bias to it, and I don’t specifically choose to include people because of who they are and turn a blind eye to people who aren’t in a targeted group. It doesn’t work that way. Never has.
Often I don’t even bother checking the rest of the profile. I only care about the category when carefully reading profiles. And even then I never include it in the dataset. I only include the part that’s relevant.
I don’t. I only copy-paste a few limited things into a CSV. That’s literally all there is to it. Seriously.
And, as I’ve stated before, only a few things are stored in the dataset:
@example@example.com
I don’t include display names, which is something people have joked about regarding this!
This is not a public dataset either - I only shared the dataset with one person, who I won’t reveal, because I don’t want to get hate thrown towards them.
Could I scrape stuff if I wanted to? Probably. But while it is obviously an idea that I’ve thought of, I’ve never actually done it because it’s by far one of the worst ways to approach this.
I’m not. I’m just mildly annoyed and decided to take a stance.
There are definitely things I would do differently though. And looking back at February, I did a really bad job of doing this, which led to problems later.
It can’t.
How do I know this? I do use Mastodon filters. Extensively!
It just doesn’t work though. It’s a game of whack-a-mole.
I probably should not have even bothered joining the fediverse to begin with. This was a bad idea.
And to think not wanting NSFW accounts sneaking into my timelines would lead to this…