See THIS POST

Notice the 2,000 upvotes?

https://gist.github.com/XtremeOwnageDotCom/19422927a5225228c53517652847a76b

It’s mostly bot traffic.

Important Note

The OP of that post did admit to purposely using bots for that demonstration.

I am not making this post specifically about that post. Rather, we need to collectively organize and find a method to address the underlying problem.

Defederation is a nuke-from-orbit approach, which WILL cause more harm than good over the long run.

Having admins proactively monitor their content and communities helps, as does enabling new-user approvals, captchas, email verification, etc. But this does not solve the problem.

The REAL problem

But the real problem is this: the fediverse is so open that there is NOTHING stopping dedicated bot owners and spammers from…

  1. Creating new instances for hosting bots, and then federating with other servers. (Everything can be fully automated to completely spin up a new instance in UNDER 15 seconds.)
  2. Hiring kids in Africa and India to create accounts for 2 cents an hour. NEWS POST 1 POST TWO
  3. Lemmy is EXTREMELY trusting. For example, go look at the stats for my instance (lemmyonline.com) online… I can assure you, I don’t have 30k users and 1.2 million comments.
  4. There are no built-in real-time methods for admins to identify suspicious activity from their users via the UI; I am only able to fetch this data directly from the database. I don’t think it is even exposed through the REST API. (A sketch of the kind of query required follows this list.)
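
For illustration, here is a minimal sketch of the kind of query admins currently have to run by hand against the database. The table and column names (comment, person, creator_id, published) are assumptions based on one Lemmy install and may differ between versions; the threshold is arbitrary.

```python
# Hypothetical sketch: flag accounts by comment bursts, queried straight from
# Lemmy's Postgres database. Table/column names are assumptions and may vary.
import psycopg2

QUERY = """
SELECT p.name, count(*) AS comments_last_hour
FROM   comment c
JOIN   person p ON p.id = c.creator_id
WHERE  c.published > now() - interval '1 hour'
GROUP  BY p.name
HAVING count(*) > %s   -- tune the threshold to your instance's baseline
ORDER  BY comments_last_hour DESC;
"""

def suspicious_commenters(conn_str: str, threshold: int = 30):
    """Return accounts with more than `threshold` comments in the last hour."""
    with psycopg2.connect(conn_str) as conn, conn.cursor() as cur:
        cur.execute(QUERY, (threshold,))
        return cur.fetchall()

if __name__ == "__main__":
    for name, count in suspicious_commenters("dbname=lemmy user=lemmy"):
        print(f"{name}: {count} comments in the last hour")
```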

What can happen if we don’t identify a solution

We know Meta wants to infiltrate the fediverse. We know Reddit wants the fediverse to fail.

If a single user with limited technical resources can manipulate content, as was proven above,

what is going to happen when big-corpo wants to swing their fist around?

Edits

  1. Removed most of the images identifying specific instances. Some of those issues have already been taken care of, and I don’t want to distract from the ACTUAL problem.
  2. Cleaned up post.
  • What corrective courses of action shall we seek?
    I sent messages to these users, notifying them to come to this thread.
    1. https://startrek.website/u/ValueSubtracted (startrek.website)
    2. https://oceanbreeze.earth/u/windocean (oceanbreeze.earth)
    3. https://normalcity.life/u/EuphoricPenguin22 (normalcity.life)
    I blocked / defederated these instances:
    1. https://lemmy.dekay.se/ (appears to just be a spambot server)
  •  Tugg ( @tugg@lemmyverse.org ), 1 year ago:

    I don’t have much to add, other than that I am an experienced admin and was dismayed at how vulnerable Lemmy is. Having an option to allow open registrations with no checks is not great. No serious platform would allow that.

    I don’t know of a bulletproof way to weed out the bad actors, but a voting system that Lemmy can leverage, with a minimum reputation required to stay federated, might work. This would require some changes that I’m not sure the devs can or would make. Without any protection in place, people will get frustrated and abandon Lemmy. I would.

  • The place feels different today than it did just a couple of days ago, and it positively reeks of bots.

    I’m seeing far fewer original posts and far more links to karma-farmer quality pabulum, all of which pretty much instantly somehow get hundreds of upvotes.

    The bots are here. And they’re circlejerking.

  •  o_o ( @o_o@programming.dev ), 1 year ago:

    Honestly, I’m interested to see how the federation handles this problem. Thank you for all the attention you’re bringing to it.

    My fear is that we might overcorrect by becoming too defederation-happy, which is a fear it seems you share. However, I disagree with your assertion that the federation model is riskier than conventional Reddit-like models. Instance owners have at least as many tools as Reddit does (more, in fact) to combat bots on their instances. Plus, we have the nuke-from-orbit defederation option.

    Since it seems like most of these bots are coming from established instances (rather than spoofing their own), I agree with you that the right approach seems to be for instance mods to maintain stricter signups (captcha, email verification, application, or other original methods). My hope is that federation will naturally lead to a “survival of the fittest” where more bot-ridden instances will copy the methods of the less bot-ridden instances.

    I think an instance should only consider defederation if it’s already being plagued by bot interference from a particular instance. I don’t think defederation should be a pre-emptive action.

  • There are no built-in real-time methods for admins to identify suspicious activity from their users via the UI; I am only able to fetch this data directly from the database. I don’t think it is even exposed through the REST API.

    The people doing the development seem to have zero concern that all the major servers are crashing with nginx 500 errors on their front pages under routine moderate loads, nothing close to major-website traffic. There is no mechanism to alert operators of internal federation failures, etc.

    I am only able to fetch this data directly from the database.

    I too had to resort to this, and published an open-source tool (primitive and inelegant) to try to get something out there for server operators: !lemmy_helper@lemmy.ml

    • We need a better solution for this than mass defederation.

      In my opinion, that is going to greatly slow down the spread and influence of this platform. Also, IMO, I think these bots are purposely TRYING to get instances to defederate from each other.

      Meta is pushing its “fediverse” thing. Reddit is trying to squash the fediverse. Honestly, it makes perfect sense that we have bots trying to upvote the idea of getting instances to defederate from each other.

      Once everything is defederated, lots of communities will start to fall apart.

      •  db0 ( @db0@lemmy.dbzer0.com ), 1 year ago:

        I agree. This is why I started the Fediseer, which makes it easy for any instance to be marked as safe through human review. If people cooperate on this, we can add all good instances, no matter how small, while spammers won’t be able to easily spin up new instances and just spam.

          •  db0 ( @db0@lemmy.dbzer0.com ), 1 year ago:

            First we need to populate it. Once we have a few good people who are guaranteeing for new instances regularly, we can extend it to most known good servers and create a “request for guarantee” pipeline. The instance admins can then leverage it by either using it as a straight whitelist, or more lightly by monitoring traffic coming from non-guaranteed instances more closely.

            The fediseer just provides a list of guaranteed servers. It’s open ended after that so I’m sure we can find a proper use for this that doesn’t disrupt federation too much.
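
            As an illustration of how instance admins might consume such a guarantee list, here is a minimal sketch (not the actual Fediseer implementation; all names are hypothetical): treat guarantees as a directed graph and accept an instance only if a chain of guarantees leads back to a trusted root.

```python
# Illustrative sketch, not the actual Fediseer implementation. An instance is
# accepted only if a chain of guarantees leads back to a trusted root, so a
# spam instance with no guarantor never enters the whitelist.
from collections import deque

def guaranteed_instances(roots: set[str], guarantees: dict[str, set[str]]) -> set[str]:
    """Breadth-first walk of the guarantee graph starting from trusted roots."""
    accepted = set(roots)
    queue = deque(roots)
    while queue:
        guarantor = queue.popleft()
        for instance in guarantees.get(guarantor, set()):
            if instance not in accepted:
                accepted.add(instance)
                queue.append(instance)
    return accepted

# Hypothetical example: an admin guarantees for a smaller instance, which in
# turn guarantees for a newcomer; "spam.example" has no chain and stays out.
roots = {"lemmy.ml", "lemmy.dbzer0.com"}
guarantees = {
    "lemmy.dbzer0.com": {"lemmyonline.com"},
    "lemmyonline.com": {"new-instance.example"},
}
whitelist = guaranteed_instances(roots, guarantees)
print("spam.example" in whitelist)  # False
```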

                • One concern: how do we prevent it from being brigaded?

                  Someone vouches for a bad actor, the bad actor vouches for more bad actors, and then they can circlejerk their own reputation up.

                  Edit:

                  Also, what prevents actors from “downvoting” instances hosting content they just don’t like?

                  i.e., yesterday half of Lemmy wanted to defederate sh.itjust.works due to a community called “the_donald” containing a single troll shit-posting. (The admins have since banned the troll and removed that problem.) But still, everyone’s knee-jerk reaction was to just defederate. Nuke from orbit.

      • The solution is to choose servers with admins who are enabling bot protections.

        If admins are not using methods to dissuade bot signups, then they’re not keeping their site clean for their users. They’re being a bad admin.

        If they’re not protecting their site against bots, they’re also not protecting the network against bots. That makes them bad denizens of the Fediverse, and the rest of us should take action to protect the network.

        And that means cutting ties with those who endanger it.

        • See the original post (it may have changed since you read it).

          I can spin up a fresh instance in UNDER 15 seconds, and be federated with your server in under a minute.

          There is literally nothing that can be done to stop this currently, unless servers completely wall themselves off from the outside world and follow a whitelisting approach. However, this ruins one of the massive benefits of the fediverse.

          • I can spin up a fresh instance in UNDER 15 seconds, and be federated with your server in under a minute.

            And I can blacklist your instance in less than 5 seconds. We already have the answer: instance administrators have the power to apply whatever disposition they want.

              • No. You don’t.

                Yes, I do. Because I actually understand how servers work. If you’re just running Lemmy with no understanding of how the internet works… then you’re doing yourself a disservice.

                Edit: Oh I missed this the first time I read it…

                Quit being a twerp, and work with us.

                Yeah, no. I have no interest in working with leeches that don’t understand how to run services, let alone ones that jump straight to ad hominem.

          • Yeah, setting up new instances is a different issue, of course. And there is definitely a lack of tools to help with that as of yet. We need things like rate limiting on new federations or on unusual traffic spikes, mod queues for posts that get caught up in them, plus the ability to purge all posts and comments from users on defederated sites. (A sketch of such a rate limiter follows below.)

            Among other things.
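
            A minimal sketch of that rate-limiting idea, assuming a token bucket per remote instance so a brand-new or suddenly chatty instance gets throttled instead of flooding the local server. The numbers are illustrative, not anything Lemmy ships.

```python
# Sketch: one token bucket per remote instance. New instances start with an
# empty bucket, so their activity trickles in slowly instead of flooding us.
import time
from dataclasses import dataclass, field

@dataclass
class TokenBucket:
    rate: float = 5.0        # tokens refilled per second
    capacity: float = 50.0   # maximum burst size
    tokens: float = 0.0
    last: float = field(default_factory=time.monotonic)

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

buckets: dict[str, TokenBucket] = {}

def accept_activity(remote_instance: str) -> bool:
    """False means: drop the activity, or hold it in a mod queue for review."""
    bucket = buckets.setdefault(remote_instance, TokenBucket())
    return bucket.allow()
```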

          •  o_o ( @o_o@programming.dev ), 1 year ago:

            There are two worries here:

            1. Bots on established and valid instances. (Should be handled by mods and instance admins, just like on conventional non-federated forums. Perhaps more tooling is required for this; do you have any suggestions? However, I think it’s a little premature to say that federation is inherently more susceptible or that corrective action is desperately needed right now.)

            2. Bots on bot-created instances. (Could be handled by adding some conditions before federating with instances, such as a unique domain requirement; see the sketch after this list. Not sure what we have in this space yet. This would limit the ability to bulk-create instances. After that, individual bot-run instances can be defederated if they become annoyances.)
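
            A naive sketch of that unique-domain idea from point 2. The two-label heuristic below is wrong for hosts like example.co.uk; a real implementation should use the Public Suffix List (e.g. the publicsuffix2 package) to get registrable domains right.

```python
# Naive sketch of a "unique domain requirement": before federating, check
# whether the candidate shares a registrable domain with several instances we
# already federate with (a sign of bulk-created subdomains).
def registrable_domain(host: str) -> str:
    parts = host.lower().rstrip(".").split(".")
    return ".".join(parts[-2:]) if len(parts) >= 2 else host

def looks_bulk_created(candidate: str, federated: list[str], limit: int = 3) -> bool:
    base = registrable_domain(candidate)
    siblings = sum(1 for h in federated if registrable_domain(h) == base)
    return siblings >= limit

federated = ["bots1.example.com", "bots2.example.com", "bots3.example.com"]
print(looks_bulk_created("bots4.example.com", federated))  # True: hold for review
```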

          • I can think of a way to help with the problem, but I don’t know how hard it would be to implement.

            Create some sort of trust score, where instance owners rate other instances they federate with.
            Then the score gets shared in the network. Like some sort of federated whitelisting.
            You would have to be prudent at first, but you would not have to do the whole task yourself.

            You could even add an “adventurousness” slider, to widen or restrict the network based on this score.
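
            A minimal sketch of that shared trust score plus “adventurousness” slider, with made-up numbers; the averaging and threshold rules below are assumptions, not a worked-out design.

```python
# Sketch of a shared trust score with an "adventurousness" slider. Every
# instance publishes ratings in [0, 1] for peers it knows; we average them,
# and the slider moves the federation threshold.
def trust_score(ratings: dict[str, list[float]], instance: str) -> float:
    """Mean of peer ratings; instances nobody has rated default to 0."""
    scores = ratings.get(instance, [])
    return sum(scores) / len(scores) if scores else 0.0

def should_federate(ratings: dict[str, list[float]], instance: str,
                    adventurousness: float) -> bool:
    # adventurousness 0.0 = only near-unanimously trusted instances,
    # adventurousness 1.0 = federate with essentially anyone.
    threshold = 1.0 - adventurousness
    return trust_score(ratings, instance) >= threshold

ratings = {"lemmyonline.com": [0.9, 0.8, 1.0], "spam.example": [0.1]}
print(should_federate(ratings, "lemmyonline.com", adventurousness=0.3))  # True
print(should_federate(ratings, "spam.example", adventurousness=0.3))     # False
```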

          • Which is awesome.
            I actually have no idea where blockchain tech could fit in.
            Reputation could be an excellent example. But if it can be manipulated or gamed, it kinda makes it pointless.
            At which point a centralised registry makes sense.
            As long as the central registrar can be trusted.
            But I don’t think blockchain solves that point of trust.

            So, once again, turns out Blockchain tech is pretty useless.

            • The blockchain would just add the ability to verify that somebody said what it says they said.

              i.e., if I say, “hey, towerful is a great person,” a blockchain could be leveraged to verify that that was said by me.

              It does have a use, but there is a big price to pay for using it, in terms of complexity, performance, and storage used.

              In this case, I would call it unnecessary overhead, unless we determine there is foul play occurring at the point of centralization.

              Edit: Although, it is still possible for users to sign messages while still using a centralized location. That gives the best of both worlds, without the needless added complexity.
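
              A minimal sketch of that edit’s idea: plain Ed25519 signatures give the “verify who said it” property without a blockchain. It assumes the `cryptography` package; how the public key gets published (well-known URL, central registry) is hand-waved here.

```python
# Sketch of "sign messages, keep the registry centralized": an Ed25519
# signature lets anyone verify who made a vouch, even if the vouch itself
# is fetched from an untrusted central registry.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The vouching admin generates a keypair once and publishes the public key.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

vouch = b"xtremeownage vouches for: towerful is a great person"
signature = private_key.sign(vouch)

# Anyone holding the public key can check the vouch was not forged or altered.
try:
    public_key.verify(signature, vouch)
    print("vouch is authentic")
except InvalidSignature:
    print("vouch was forged or tampered with")
```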

  • Hello. The post you mentioned was made as a warning, to prove a point: that the fediverse is currently extremely vulnerable to bots.

    The user ‘alert’ made the post, then upvoted it with his bots, to prove how easy it is to manipulate traffic, even without funding.

    see:
    https://kbin.social/m/lemmy@lemmy.ml/t/79888/Protect-Moderate-Purge-Your-Sever

    It’s proof that anyone could easily manipulate content unless instance owners take the bot issue seriously.

  • @xtremeownage

    I think that one of the most difficult things in dealing with the more common bots, spamming, reposting, etc., is that parsing all the commentary and handling it on a service-wide level is really hard to do, in terms of computing power and sheer volume of content. It seems to me that doing this on an instance level, with user numbers in the tens of thousands, is a heck of a lot more reasonable than doing it on a service with tens of millions of users.

    What I’m getting at is that this really seems like something that could (maybe even should) be built into the instance moderation tools, at least some method of marking user activity as suspicious for further investigation by human admins/mods.

    We’re really operating on the assumption that people spinning up instances are acting in good faith until they prove that they aren’t. I think the first step is giving good-faith actors the tools to moderate effectively, then worrying about bad-faith admins.

  • Reposting this comment from a reply elsewhere in the thread.

    If anything, there should be SOME centralization that allows other (known, somehow-verified) instances to vote to disallow spammy instances from federating, in some way that couldn’t be abused. This may lead to a fork down the road (think BTC vs. BCH) due to community disagreements, but I don’t really see any other way this doesn’t become an absolute spamfest. As it stands now, one server admin could fill their own server with their own spam, and once it starts federating, EVERYONE gets flooded. This also easily creates a DoS of the system.

    Asking instance admins to require CAPTCHA or whatever to defeat spam doesn’t work when the instance admins are the ones creating spam servers to spam the federation.

  •  Sibbo ( @Sibbo@sopuli.xyz ), 1 year ago:

    I really hope that some researchers will get interested in this and develop some cool solutions. Maybe we will be lucky and they will even implement them in Lemmy.

    • This isn’t a problem specific to ActivityPub, Lemmy, or any individual platform.

      Reddit faces this problem every day. Facebook faces this problem. Twitter faces this problem.

      They all do.

      And each platform has to determine the best method for dealing with this issue.

      •  monobot ( @monobot@lemmy.ml ), 1 year ago:

        There are data scientists around, and we are monitoring where this goes.

        Biggest problem I currently see is how to effectively share data while preserving privacy. Can this be solved without sharing emails and IP addresses, or would that be necessary? Maybe securely hashing emails and IP addresses is enough, but that would hide some important data.
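
        A minimal sketch of that hashing idea, using a keyed hash (HMAC) so instances could share stable pseudonymous identifiers without revealing raw values. Standard library only; the key handling is hand-waved, and note that low-entropy values like IPv4 addresses are brute-forceable by anyone who obtains the key.

```python
# Sketch: keyed hashing (HMAC) so instances can share stable pseudonymous
# identifiers for emails/IPs without revealing the raw values.
import hashlib
import hmac

def pseudonymize(value: str, key: bytes) -> str:
    """Same input + same key -> same token, so patterns remain detectable."""
    return hmac.new(key, value.strip().lower().encode(), hashlib.sha256).hexdigest()

shared_key = b"agree-on-and-rotate-this-out-of-band"  # illustrative only
print(pseudonymize("bot123@example.com", shared_key))
print(pseudonymize("203.0.113.7", shared_key))
```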

        Should that be shared only with trusted users?

        Can we create a dataset where humans identify bots and then share it with the larger community (like Kaggle), to help us with ideas?

        There are options, and they will be built; it just cannot happen in a few days. People are working non-stop to fix (currently) more important issues.

        Be patient, collect the data, and let’s work on a solution.

        And let’s be nice to each other; we all have similar goals here.

        • Biggest problem I currently see is how to effectively share data while preserving privacy. Can this be solved without sharing emails and IP addresses, or would that be necessary? Maybe securely hashing emails and IP addresses is enough, but that would hide some important data.

          So, email addresses and IP addresses are actually only known by the instance hosting the user. That data is not even included in the person table; it’s stored in the local_user table, away from the data in question. As such, it wouldn’t be needed, nor included, in the dataset.

          Regarding privacy: that actually isn’t a problem. On Lemmy, EVERYTHING is shared with all federated instances: votes, comments, posts, etc. As such, there isn’t anything I can share from my data that isn’t already known by many other individuals.

          Can we create a dataset where humans identify bots and then share it with the larger community (like Kaggle), to help us with ideas?

          Absolutely. We can even completely automate the process of aggregating and displaying this data.

          db0 also had an idea posted in this thread, and is working on a project to help humans vet instances. I think that might be a start too.

          •  monobot ( @monobot@lemmy.ml ), 1 year ago:

            That sounds great, and at least we can try something and learn what can or cannot be done. I am totally interested in working on bot detection.

            I know that emails remain local; they could also be an important part of pattern detection, but it has to be done without them.

            Fediseer sounds great, at least for building some trust in instances.

            I am thinking more of vote, comment, and post detection for individual accounts, where Fediseer would carry quite an important weight.

      • That feels like pretty much the only way you can easily filter bots out.

        The best ID to check would be a government ID or a bank account ID. The gov/bank are absolutely crazy about making sure that someone is really who they say they are.

        Unfortunately, this is incompatible with anonymity, unless we trust the instance admin.

        I really like the SMS thing.