A High Priority for Moving Away from Lemmy

Chris Remington ( @remington@beehaw.org ) · 2 years ago

A High Priority for Moving Away from Lemmy

Intelligence_Gap ( @Intelligence_Gap@beehaw.org ) · 2 years ago

I’m not sure that’s possible with images being allowed. If Google, Facebook, Instagram, and YouTube all struggle with it I think it will be an issue anywhere images are allowed. Maybe there’s an opening for an AI to handle the task these days but any dataset for something like that could obviously be incredibly problematic

thanevim ( @thanevim@kbin.social ) · 2 years ago

Yeah, the key problem here is that any open forum, of any considerable popularity, since the dawn of the Internet has had to deal with shit like CSAM. You don’t see it elsewhere because of moderators. Doing the very job Op does. It’s just now, Op, you’re in the position. Some people can, and have decided to, deal with moderating the horrors. It may very well not be something you, Op, can do.

d3Xt3r ( @d3Xt3r@beehaw.org ) · edit-2 2 years ago

The thing is though, with traditional forums you get a LOT of controls for filtering out the kind of users who post such content. For instance, most forums won’t even let you post until you complete an interactive tutorial first (reading the rules and replying to a bot indicating you’ve understood them etc).

And then, you can have various levels of restrictions, eg, someone with less than 100 posts, or an account less than a month old may not be able to post any links or images etc. Also, you can have a trust system on some forums, where a mod can mark your account as trusted or verified, granting you further rights. You can even make it so that a manual moderator approval is required, before image posting rights are granted. In this instance, a mod would review your posting history and ensure that your posts genuinely contributed to the community and you’re unlikely to be a troll/karma farmer account etc.

So, short of accounts getting compromised/hacked, it’s very difficult to have this sort of stuff happen on a traditional forum.
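For anyone wondering what those restrictions amount to in practice, here is a minimal sketch of the gating described above. The thresholds (100 posts, one month) are the examples from this comment; the `Account` fields and function names are hypothetical and not taken from any particular forum software.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Example thresholds from the comment above; real forums make these configurable.
MIN_POSTS = 100
MIN_ACCOUNT_AGE = timedelta(days=30)


@dataclass
class Account:
    """Hypothetical account record; field names are illustrative only."""
    created_at: datetime
    post_count: int
    completed_tutorial: bool = False
    mod_approved: bool = False  # set by a moderator after reviewing post history


def can_post_text(account: Account) -> bool:
    # Nothing can be posted until the rules tutorial has been completed.
    return account.completed_tutorial


def can_post_media(account: Account, now: datetime,
                   require_mod_approval: bool = True) -> bool:
    """Links and images need the tutorial, enough age, enough posts,
    and (optionally) a manual sign-off from a moderator."""
    if not account.completed_tutorial:
        return False
    old_enough = now - account.created_at >= MIN_ACCOUNT_AGE
    active_enough = account.post_count >= MIN_POSTS
    approved = account.mod_approved or not require_mod_approval
    return old_enough and active_enough and approved


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
newcomer = Account(created_at=now - timedelta(days=2), post_count=3,
                   completed_tutorial=True)
veteran = Account(created_at=now - timedelta(days=90), post_count=250,
                  completed_tutorial=True, mod_approved=True)
print(can_post_media(newcomer, now), can_post_media(veteran, now))
```

None of this stops a patient bad actor, but it raises the cost of a throwaway account considerably.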

I used to be a mod on a couple of popular forums back in the day, and I even ran my own community for a few years (using Invision Power Board), and never once have I had to deal with such content.

The fact is Lemmy is woefully inadequate in its current state to deal with such content, and there are definitely better options out there. My heart goes out to @Chris and the staff for having to deal with this stuff, and I really hope that this drives the Beehaw team to move away from Lemmy ASAP.

In the meantime, I reckon some drastic actions would need to be taken, such as disabling new user registrations and stopping all federation completely, until the new community is ready.

Thevenin ( @Thevenin@beehaw.org ) · 2 years ago

So this just got posted on lemmy.dbzer0. They’ve got an AI-based CSAM screen up and running with promising initial results. The model was trained using CLIP, which as far as I understand it means they used written descriptions of what CSAM is or is not.

Could something like this work for Beehaw?

Intelligence_Gap ( @Intelligence_Gap@beehaw.org ) · 2 years ago

I’m sure the mods saw that, and it’s really more of a question for them tbh, but if it works for other Lemmy instances I’m not sure why it wouldn’t work here.

apis ( @apis@beehaw.org ) · 2 years ago

Wonder whether in theory one could use a dataset of… everything else, have the AI exclude what it does not recognise, then run the exclusions against a dataset to see whether or not they contain children. There could be an additional layer of running the exclusions against a dataset of regular sexual content.

One issue is that admin of any site would still want to report any CSAM to authorities. That could be automated by an AI checker, but one would have to have a lot of faith that the AI was decently accurate and not generating many false reports. The workaround I described to avoid using datasets of abuse is unlikely to be particularly accurate - ok for the purposes of protecting admin, but leaves them in an odd spot when it comes to banning a user, especially where a user’s livelihood could be impacted, or things like paid online courses. I guess specialist police departments probably would have to use highly relevant datasets, along with review by humans, but still - nobody wants to inadvertently clog up that system with false reports.

bermuda ( @bermuda@beehaw.org ) · 2 years ago

I’d be fine with not hosting images entirely. I don’t think people come to beehaw primarily to look at pictures

Chobbes ( @Chobbes@beehaw.org ) · 2 years ago

I’ve been thinking lately that I kind of miss things like IRC where you couldn’t really post pictures in chat. With things like Discord and Slack the off topic channels often devolve into people just sharing random memes they found funny at the time, and not really talking to each other. I’m sure there’s value in that too, but I think it can take up a lot of oxygen in the social space, so I’m not sure it’s always a win. Different formats encourage different ways of interacting with each other, I guess, and it’s interesting!

liv ( @liv@beehaw.org ) · edit-2 2 years ago

I just want to say, I am so so so sorry you had to see that.

I accidentally saw some CSAM in the 1990s and you are right, it is burnt into your mind. It’s the real limit case of “what has been seen cannot be unseen” - all I could do was learn to avoid accessing those memories.

If you can access counselling for this, that might be a good option. Vicarious trauma is a real phenomenon.

Chris Remington ( @remington@beehaw.org ) · 2 years ago

> If you can access counselling for this, that might be a good option. Vicarious trauma is a real phenomenon.

Thank you for the advice. I’m not sure that I’ll need counseling but I’m open to it if need be. Time will tell.

loops ( @loops@beehaw.org ) · 2 years ago

Be sure to keep tabs on yourself, sometimes these things can really sneak up on you.

flatbield ( @furrowsofar@beehaw.org ) · edit-2 2 years ago

People keep talking about going to another platform. Personally I think a better idea would be to develop lemmy to deal with these issues. This must be a fediverse wide problem. So some discussion with other admins and the developers is probably the way to go on many of these things. Moreover you work with https://opencollective.com/, can they help? Beyond this, especially CSAM, there must be large funding agencies where one could get a grant to get some real professional programming put into this problem. Perhaps we could raise funds ourselves to help with this too.

So frankly I would like to see Beehaw solve the issues with lemmy, rather than just move to some other platform that will have its own issues. The exception may be if the Beehaw people think that being a safe space creates too big a target that you have to leave the Threadiverse to be safe. That to me seems like letting the haters win. It is exactly what they want. My vote will always be to solve the threadiverse issues rather than run away.

Just my feeling. There may be more short term practical issues that take precedence and frankly it is all up to you guys where you want to take this project.

snowe ( @snowe@programming.dev ) · 2 years ago

The solution is to use an already existing software product that solves this, like CloudFlare’s CSAM Detection. I know people on the fediverse hate big companies, but they’ve solved this problem already numerous times before. They’re the only ones allowed access to CSAM hashes, lemmy devs and platforms will never get access to the hashes (for good reason).

flatbield ( @furrowsofar@beehaw.org ) · edit-2 2 years ago

They will still need to have a developer set this up and presumably it should be added as an option to the main code base. I thought I heard the beehaw admins were not developers.

There are a number of other issues that are driving the admins to dump lemmy. Same applies there.

snowe ( @snowe@programming.dev ) · 2 years ago

Not sure what you mean. You do not need to be a developer to set up CloudFlare’s CSAM detection. You simply have to email the NCMEC, get an account, then check a box in CF, input some information about your NCMEC account, and then you’re good to go.

flatbield ( @furrowsofar@beehaw.org ) · 2 years ago

How does the scan happen? It has to be linked in somehow. Are you saying that if you choose Cloudflare as your CDN, it will flag at distribution time? Or at upload time?

snowe ( @snowe@programming.dev ) · 2 years ago

If you use CloudFlare as your proxy then all your instance’s traffic gets routed through CF before ever making it to your server. If someone tries to upload CSAM it will immediately be flagged (before ever making it to your server). CloudFlare then quarantines it and automatically files a report with the National Center for Missing and Exploited Children. There’s more to the process, but the point is that putting it in the lemmy software is not a good solution, especially when industry standard proven solutions already exist. You don’t have to use CF. You can also use solutions from Google, FB, Microsoft, Thorn, etc.

flatbield ( @furrowsofar@beehaw.org ) · 2 years ago

Interesting. Thanks.

thySatannic ( @thySatannic@beehaw.org ) · 2 years ago

Wait… why is no access to csam hashes a good thing? Wouldn’t it make it easier to detect if hashes were public?! I feel like I’m missing something here…

snowe ( @snowe@programming.dev ) · 2 years ago

Giving access to CSAM hashes means anyone wanting to avoid detection simply has to check what they’re about to upload against the db. If it matches then they simply modify the image until it doesn’t. It’s literally guaranteed to make the problem worse, not better.

thySatannic ( @thySatannic@beehaw.org ) · 2 years ago

Ah thanks, hadn’t thought of that!

sarmale ( @sarmale@lemmy.zip ) · 1 year ago

Question, from what I saw it seems like every CSAM image ever is assigned a new hash. Isn’t it unscalable to assign a separate hash for everything? Does that mean that most CSAM images were detected before?

2 years ago

I’m sure those repugnant assholes do it “for the lulz” and if they want to mess with you they’ll do it anywhere.

There’s this study that says playing Tetris helps ease recently acquired trauma https://www.ox.ac.uk/news/2017-03-28-tetris-used-prevent-post-traumatic-stress-symptoms

And the admin of his eponymous instance, dbzer0, created an interesting script to get rid of CSAM without having to review it manually, take a look -> https://github.com/db0/lemmy-safety

renard_roux ( @renard_roux@beehaw.org ) · edit-2 2 years ago

Just tagging @admin in case they don’t see this ❤️

Edit: aaand I did it wrong 🙄 @admin@beehaw.org 👈 Better?

lerba ( @lerba@beehaw.org ) · 2 years ago

This post seems highly reactive to me. I’m sorry to hear of you being exposed to such disturbing material, but I fail to see a true connection between that happening and using Lemmy as the platform. I absolutely agree that nobody should have to experience what you did, but I disagree with the platform change proposition.

potterman28wxcv ( @potterman28wxcv@beehaw.org ) · 2 years ago

I don’t know of any software platform where that would not happen.

Even with a text-only platform people can still post URLs to unsafe content.

I think OP is referring to some kind of automated scanner but I’m not sure there are publicly available ones. I guess using them would come at a cost - either computational or $$. And even so, there can be false positives so you would probably still have to check the report anyway someday.

edit-2 2 years ago

Sadly, the only 100% way to never have that kind of material ever touch your servers is to not allow image uploads from the public. Whether it’s on Lemmy or another social site, or something you control entirely on your own. Maybe sooner than we think, AI could deal with the moderation of it so a human never has to witness that filth, but it’s not quite there yet.

AndreTelevise ( @AndreTelevise@beehaw.org ) · edit-2 2 years ago

Lemm.ee, another instance I am in, isn’t hosting images anymore or letting people upload images directly due to this issue. When your platform is supposed to be 100% open source and decentralized, there are bound to be issues like this, and they should be dealt with, even if proprietary tech is necessary for it. I’m sorry to hear about this.

PenguinCoder ( @Penguincoder@beehaw.org ) · 2 years ago

Does that mean a platform that does not allow any images to be uploaded? Or a platform that has better access control and remediation controls?

Chris Remington ( @remington@beehaw.org ) · 2 years ago

I’d be willing to consider either and would love your, particular, feedback on this as well.

flatbield ( @furrowsofar@beehaw.org ) · edit-2 2 years ago

By the way, I have always been surprised that Beehaw did host images, given the extra cost (they are large and costly in both storage and bandwidth), the added security and attack vector possibilities, IP issues, CSAM issues, etc.

flatbield ( @furrowsofar@beehaw.org ) · 2 years ago

Also, I do not think this is a Lemmy specific issue. It is an image availability, and scale issue. Federation of course increases the scale a lot too.

Scary le Poo ( @Scary_le_Poo@beehaw.org ) · 2 years ago

Did you forget to log into your alts or are you unaware of how the edit button functions?

Storage is super cheap, fwiw.

flatbield ( @furrowsofar@beehaw.org ) · 2 years ago

Now be nice. Of course I know about the edit button. The comments were not posted at the same time and generally later editing is discouraged. Nor are long comments or one comment on different topics great.

Why on earth would I have multiple accounts? I am sure people do, but that too is kind of strange behavior and perhaps abusive depending on how they are used.

Scary le Poo ( @Scary_le_Poo@beehaw.org ) · 2 years ago

Edits are not frowned upon unless you’re just editing a post to make someone look bad

PenguinCoder ( @Penguincoder@beehaw.org ) · 2 years ago

The RATE, both of storage growth and of bandwidth transfer, is the expensive part.

flatbield ( @furrowsofar@beehaw.org ) · 2 years ago

Not as cheap as you think at scale, and you’re renting the bandwidth and space from a hosting company, and most of the users are probably freeloading. The whole challenge of FOSS and services is that there is no one to pay operating costs.

Rentlar ( @Rentlar@lemmy.ca ) · 2 years ago

May I gently suggest for next time that you reply to yourself in a chain, if you’d like to add something on, and if you are against editing your post? I have trouble reading the order of your posts from the default sorting method.

flatbield ( @furrowsofar@beehaw.org ) · edit-2 2 years ago

I think if a platform has image capabilities this is to be expected. I guess the only exception is if there are filters that can be used, but this seems unlikely. So I think it is an image vs. no image decision. The other problem with images is they can be attack vectors from a security point of view. Any complex file format can be an attack vector as interpreters of complex file formats often have bugs.

Can you imagine that the large platforms have whole teams of people that have to look at this stuff all day and filter it out? Not sure how that works, but it is probably the reality. Notice R$ never hosted images.

Kangie ( @Kangie@lemmy.srcfiles.zip ) · 2 years ago

> A software platform that makes it nearly impossible for Beehaw to host, in any way, CSAM.

I hate to say it, but you’ll need to find a text-only platform. Allowing any image uploads opens the door to things like this.

Besides that, if your concern is that no moderator should be exposed to anything like that, well on a text-only site you might have to deal with disguised spam links to gore, scam, etc. You’ll still have to click on links to effectively moderate.

Maybe you should consider if this is a position that you want to put yourself in again. It sounds like this may just not be for you.

Chobbes ( @Chobbes@beehaw.org ) · edit-2 2 years ago

This was my immediate thought as well. It’s unfortunate, but there will probably always be people who abuse online platforms like this. It’s totally okay if you’re not up to the task of moderating disturbing content like that - it sounds like it can be a really brutal job. I don’t know what the moderation tools on Lemmy are like, but maybe there’s a way to flag different kinds of moderation concerns for different moderators (so not everybody has to be exposed to this kind of stuff if they’re not comfortable with it). And maybe there could also be a system where if users flag the post it can be automatically marked as NSFW and images can be hidden by default so moderators and other users don’t have to be exposed to it without warning (though of course such a system could potentially be abused as well). But beyond that I’m not sure what else you can do, aside from maybe limiting federation.
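A rough sketch of the flag-to-hide idea above, purely as an illustration. This is not how Lemmy works today as far as this thread establishes; the `Post` fields, the `flag_post` helper and the threshold of three flags are all assumptions.

```python
from dataclasses import dataclass, field

FLAG_THRESHOLD = 3  # hypothetical: distinct users needed before auto-hiding


@dataclass
class Post:
    """Hypothetical post record; field names are illustrative only."""
    post_id: int
    nsfw: bool = False
    images_hidden: bool = False
    flagged_by: set[str] = field(default_factory=set)


def flag_post(post: Post, user_id: str, threshold: int = FLAG_THRESHOLD) -> bool:
    """Record a flag and return True if the post's images are now hidden."""
    # A set means repeated flags from the same account only count once,
    # which blunts (but does not prevent) abuse of the mechanism.
    post.flagged_by.add(user_id)
    if len(post.flagged_by) >= threshold:
        post.nsfw = True
        post.images_hidden = True
    return post.images_hidden


post = Post(post_id=1)
for user in ["alice", "alice", "bob", "carol"]:
    hidden = flag_post(post, user)
print(hidden, len(post.flagged_by))
```

The trade-off mentioned above still applies: a few coordinated accounts could hide legitimate posts, so a moderator would still need to review and be able to undo the automatic hiding.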

Storksforlegs ( @storksforlegs@beehaw.org ) · 2 years ago

As others have suggested, I think temporarily suspending images until you guys can settle on a safe alternative to lemmy is a good idea.

I’m sorry you had to see something like this. I hope you are able to seek out some counselling ASAP and talk to someone about it. Even something like https://www.7cups.com/ might be helpful.

Sina ( @Sina@beehaw.org ) · 2 years ago

> I think temporarily suspending images until you guys can settle on a safe alternative to lemmy is a good idea.

There is no such thing as a safer alternative to Lemmy. It’s very easy to say things like “use tools” to filter these things, but in actuality it’s anything but, it’s way beyond a foss project. (or Reddit for that matter, though they are trying and good gawd, I just remembered something I saw on reddit and have not thought of for years, damn it)

Storksforlegs ( @storksforlegs@beehaw.org ) · edit-2 2 years ago

Well true, but I meant more like a forum with limited access (no images or links) until you meet certain requirements etc. So not totally safe, but a bit safer than the current setup

👁️👄👁️ ( @mojo@lemm.ee ) · 2 years ago

As long as you can post links or upload images, there is an avenue for CSAM to be spammed. Beehaw should probably start with a whitelist and slowly expand. Refuse to federate with anyone that has open registration.

forestG ( @forestG@beehaw.org ) · edit-2 2 years ago

I don’t think there is a way to have both the option to host images and have zero risk of getting such image uploads. You either completely disable image hosting, or you mitigate the risk by the way image uploads are handled. Even if you completely disable the image uploads, someone might still link to such content. The way I see this there are two different aspects. One is the legal danger you place yourself in when you open your instance to host images uploaded by users. The other is the obvious (and not so obvious) and undeniable harmful effects contact with such material has for most of us. The second is practically impossible to guarantee 100% on the internet. The first you can achieve by simply not allowing image uploads (and I guess de-federating with other instances to avoid content replication).

The thing is, hosting an instance of a technology that allows for better moderation (i.e. allowing certain kinds of content, such as images, only after a user reaches a certain threshold of activity) actually helps in a less obvious manner. CSAM is not only illegal to exist on the server-side. It’s also illegal and has serious consequences for the people who actually upload it. The more activity history you have on a potential uploader, the easier it becomes to actually track them. Requiring more time for an account before allowing it to post images makes concealing the identity harder and raises the potential risk for the uploader to the extent that it will be very difficult to go through the process only to cause problems to the community.

Let me also state this clearly: I don’t have an issue with disabling image uploads here, or changing the default setting of instance federation to a more limiting one. Or both. I don’t mind linked images to external sites.

I am sorry you had to see such content. No, it doesn’t seem to go away. At least it hasn’t for me, after almost 2 decades :-/

apis ( @apis@beehaw.org ) · 2 years ago

So, so sorry you had to see that, and thank you for protecting the rest of us from seeing it.

On traditional forums, you’d have a lot of control over the posting of images.

If you don’t wish to block images entirely, you could block new members from uploading images, or even from sharing links. You could set things up so they’d have to earn the right to post by being active for a randomised amount of time, and have made a randomised number of posts/comments. You could add manual review to that, so that once a member has ostensibly been around long enough and participated enough, admin look at their activity pattern as well as their words to assess if they should be taken off probation or not… Members who have been inactive for a while could have image posting abilities revoked and be put through a similar probation if they return. You could totally block all members from sharing images & links via DM, and admin email accounts could be set to reject images.

It is probably possible to obtain the means to reject images which could contain any sexual content (checked against a database of sexual material which does not involve minors), and you could probably also reject images which could contain children and which might not be wholesome (checked against a database of normal images of children).

Aside from the topic in hand, a forum might decide to block all images of children, because children aren’t really in a position to consent to their images being shared online. That gets tricky when it comes to late teens & early 20s, but if you’ve successfully filtered out infants, young children, pre-teens & early teens as well as all sexual content, it is very unlikely that images of teenagers being abused would get through.

Insisting that images are not uploaded directly, but via links to image hosting sites, might give admin an extra layer of protection, as the hosting sites have their own anti-CSAM mechanisms. You’d probably want to whitelist permitted sites. You might also want a slight delay between the posting of an image link and the image appearing on Beehaw - this would allow time for the image hosting site to find & remove any problem images before they could appear on Beehaw (though I’d imagine these things are pretty damn fast by now).
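To make the whitelist-plus-delay idea concrete, here is a small sketch. The host names and the 15-minute delay are made-up placeholders, and the helper names are hypothetical; it only shows the shape of the check, not a vetted policy.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse

# Hypothetical allowlist of image hosts that run their own screening.
ALLOWED_IMAGE_HOSTS = {"images.example.org", "media.example.net"}
# Hypothetical grace period giving the host time to remove problem images.
DISPLAY_DELAY = timedelta(minutes=15)


def is_allowed_image_link(url: str) -> bool:
    """Accept only https links whose exact hostname is on the allowlist."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_IMAGE_HOSTS


def is_visible(url: str, posted_at: datetime, now: datetime) -> bool:
    """Show an image link only if it is allowed and the delay has passed."""
    return is_allowed_image_link(url) and now - posted_at >= DISPLAY_DELAY


posted = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
link = "https://images.example.org/bee.png"
print(is_visible(link, posted, posted + timedelta(minutes=5)))
print(is_visible(link, posted, posted + timedelta(minutes=20)))
```

Matching the exact hostname rather than a substring matters here, otherwise something like `images.example.org.evil.example` would slip through.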

You could also insist that members who wish to post images or links to images can only do so if they have their VPN and other privacy preserving methods disabled. Most members wouldn’t be super-enthused about this, until they’ve developed trust in the admin of the site, but anyone hoping to share images of children being abused or other illegal content will just go elsewhere.

Admin would probably need to be able to receive images of screenshots from members trying to report technical issues, but those should be relatively easy to whitelist with a bot of some sort? Or maybe there’s some nifty plugin for this?

Really though, blocking all images is going to be your best bet. I like the idea of just having the Beehaw bee drawings. You could possibly let us have access to a selection of avatars to pick, or have a little draw plugin so members can draw their own. On that note, those collaborative drawing plugin things can be a fun addition to a site… If someone is very keen for others to see a particular image, they can explain how to find it, or they can organise to connect with each other off Beehaw.

jarfil ( @jarfil@beehaw.org ) · edit-2 2 years ago

> block new members from uploading images

I’ve tried those methods something like 10 years ago. It didn’t work; people would pose as decent users, then suddenly switch to posting shit when allowed. I’m thinking nowadays, with the use of ChatGPT and similar, those methods would fail even more.

Modern filtering methods for images may be fine(-ish), but won’t stop NSFL and text based stuff.

Blocking VPN access, to a site intended as a safe space, seems contradictory.

> anyone hoping to share […] illegal content will just go elsewhere

Like someone else’s free WiFi. Wardriving is still a thing.

> draw plugin so members can draw their own

That can be easily abused, either manually or through a bot. Reddit has the right idea there, where they have an avatar generator with pre-approved elements. Too bad they’re pretty stifling (and sell the interesting ones as NFTs).

apis ( @apis@beehaw.org ) · 2 years ago

Yup, as it gets ever easier to overwhelm systems, there are no good solutions to the matter, aside from keeping it text only + Beehaw’s own drawings.

jarfil ( @jarfil@beehaw.org ) · edit-2 2 years ago

Some text-only creepypastas are equally disturbing and illegal in some places. IIRC some Lemmy instance in Ireland had to close shop because their legislation applies to both “images” and “descriptions of images”.

apis ( @apis@beehaw.org ) · 2 years ago

True, but this is assuming one wishes to have a place to communicate online at all.

And though text can be intensely disturbing, it is inherently different to images/footage of actual children actually being harmed.

jarfil ( @jarfil@beehaw.org ) · 2 years ago

Yeah… you’ll have to excuse me, because while I’d love to delve deeper into the philosophy of perception, the art of rhetoric, or how the AIs can upend it all… I’ll have to leave it here, since I’ve been told in no uncertain terms that this is not the place to discuss this kind of stuff.

Maybe we could meet in some other safe space, focused on pure intellectual discussions, if such existed.

apis ( @apis@beehaw.org ) · 2 years ago

That’s fair.

Not currently using other spaces, nor aware of any suited to the topic (gladly, I suspect).

jarfil ( @jarfil@beehaw.org ) · 2 years ago

There are some… just not safe, and/or not intellectual. I’d start one, but seeing the shitstorms going over here, and my current IRL drama, I kind of don’t feel like it ATM.

Storksforlegs ( @storksforlegs@beehaw.org ) · edit-2 2 years ago

I second everything you said here

jarfil ( @jarfil@beehaw.org ) · edit-2 2 years ago

> Those images are burnt into my mind and I would love to get rid of them but I don’t know how or if it is possible

I’m very sorry this happened to you, and I wish I could offer you some advice… but that’s the main reason I stopped hosting open community stuff many years ago. I thought I was hardened enough, but nope; between the spam, the “shock imagery” (NSFL gore, CSAM), the doxxing, and toxic users in general… even having some ads was far from making it all worthwhile. There is a reason why “the big ones” like Facebook or Google churn through 3rd world mods who can’t take it for more than a few months before getting burnt out.

I wish I could tell you that you’ll eventually forget what you’ve seen… but I still remember stuff from 30 years ago. Also don’t want to scare you, but it’s not limited to images… some “fanfiction” with text imagery is evil shit that I still can’t forget either.

Nowadays, you can find automated CSAM identification services, like the one run by Microsoft, so if you integrated that, you could err on the side of caution and block any image it marks as even suspicious. This may or may not work in your jurisdiction, with some requiring you to “preserve the proof” and submit it to authorities (plus different jurisdictions having different definitions of what is and what isn’t breaking the law, and laws against swamping them with false positives… so you basically can’t win). This will also do nothing for the NSFL or text based imagery.

A way to “shield yourself” from all of this as an admin, is to go to an encrypted platform where you can’t even see what’s getting posted, so you never run the risk of seeing that kind of content… but then you end up with zero moderation tools, pushing all the burden onto your users, so not suitable for a safe space.

Honestly, I don’t think there is an effective solution for this yet. It’s been a great time ~~abusing the good will of the admins and mods~~ staying on Beehaw, but if you can’t find a reasonable compromise… oh well.