I recently came across a torrent that seems to be an archive of Reddit. It got me wondering whether it would be possible to make it locally browsable. However, I also considered the possibility that someone might have already addressed this by creating a public Lemmy instance, enabling the content to be accessible from any federated instance.
Honestly it upsets me enough when I see people or bots mirroring new Reddit posts to fedi without the original author’s permission. A full archive - whether in the form of a torrent or a fedi instance - also makes me feel icky.
I know it’s not possible and it’s entirely against reddit’s interests, but I wish there were a way for subreddits or people or posts to be marked somehow as not for copying or use elsewhere.
It has always weirded me out when I found /r/relationships posts copy-pasted to like BuzzFeed knock-off sites. Then yesterday I saw and blocked a Lemmy bot mirroring like a dozen reddit subs (including gonewild) to its instance.
It may be fine, good, and useful to archive like how-to content or technical support questions and stuff like that as there is a clear utility there. But seeing the more personal stuff that people might not want to see copied around or searchable makes me feel bad.
Yes, yes I know it’s the internet and these people should know better and if they really want to opt out they should submit a request to the Wayback Machine and set a robots.txt plus there’s no way to stop it and we really really need all of this valuable information preserved for historical purposes and as we all know information wants to be free and you can’t stop the signal. And all the myriad excuses that the less well behaved digital preservationists will lean on.
But at some point and in a lot of circumstances you’re copying people’s personal information and using it in ways they didn’t intend when they posted it. I don’t know your personal opinion on the reports of reddit admins undeleting posts people have been deleting before they delete their accounts, but people who are upset about that should consider that “preserving” reddit data also takes away people’s agency over their data and their right to be forgotten in much the same way.
Are you a reddit employee?
Lmao yep that’s me. Couldn’t be that I just feel weird about copying around threads where people share pictures of their buttholes, ask for help escaping from abusive partners, or seek support on embracing their genders and sexualities without the permission of the original author. Reddit is more than just memes, video game tips, in-depth analyses on a variety of questions by historians and scientists, and celebrities advertising their newest ventures under the guise of AMAs. And therefore commensurate thought should be put into it before saying “yes this is all my information to do whatever I want with.”
On a separate note, I stopped using Twitter in 2018 partially because I was tired of every single conversation jumping right to “Russian bot” accusations. Didn’t matter what your opinion was, if someone disagreed with it, you were a Russian bot sent to sow chaos and division. I’ve been happily on the fediverse since then and sincerely hope I’m not now in for a wave of seeing “are you employed by or otherwise affiliated with reddit or meta you have to answer or this is entrapment” bs.