this post was submitted on 05 Sep 2024

565 points (96.4% liked)

Technology

60462 readers

3886 users here now

This is a most excellent place for technology news and articles.

Our Rules

Follow the lemmy.world rules.
Only tech related content.
Be excellent to each another!
Mod approved content bots can post up to 10 articles per day.
Threads asking for personal tech support may be deleted.
Politics threads may be removed.
No memes allowed as posts, OK to post as comments.
Only approved bots from the list below, to ask if your bot can be added please contact us.
Check for duplicates before posting, duplicates may be removed

Approved Bots

founded 2 years ago

MODERATORS

[email protected]

565

Bots are running rampant. How do we stop them from ruining Lemmy? (lemmy.world)

submitted 4 months ago* (last edited 4 months ago) by [email protected] to c/[email protected]

288 comments fedilink hide all child comments

Social media platforms like Twitter and Reddit are increasingly infested with bots and fake accounts, leading to significant manipulation of public discourse. These bots don't just annoy users—they skew visibility through vote manipulation. Fake accounts and automated scripts systematically downvote posts opposing certain viewpoints, distorting the content that surfaces and amplifying specific agendas.

Before coming to Lemmy, I was systematically downvoted by bots on Reddit for completely normal comments that were relatively neutral and not controversial at all. Seemed to be no pattern in it... One time I commented that my favorite game was WoW, down voted -15 for no apparent reason.

For example, a bot on Twitter using an API call to GPT-4o ran out of funding and started posting their prompts and system information publicly.

https://www.dailydot.com/debug/chatgpt-bot-x-russian-campaign-meme/

Example shown here

Bots like these are probably in the tens or hundreds of thousands. They did a huge ban wave of bots on Reddit, and some major top level subreddits were quiet for days because of it. Unbelievable...

How do we even fix this issue or prevent it from affecting Lemmy??

(page 6) 47 comments

sorted by: hot top controversial new old

[–] [email protected] 6 points 4 months ago* (last edited 4 months ago)

You can't get rid of bots, nor spammers. The only thing is that you can have a more aggressive automated punishment system, which will unevitably also punish good users, along with the bad users.

[–] [email protected] 5 points 4 months ago (1 children)

Some sort of "report as bot" --> required captcha pipeline would be useful

[–] [email protected] 4 points 4 months ago (1 children)

Captcha is already mostly machine breakable, I've seen some new interesting pattern-based stuff but nothing that you couldn't do image training against.

At some point not too far in the future you won't be able to use captcha to stop bots from posting. It simply won't even be a hurdle, a couple extra pennies of computational power.

There's probably some power in detecting accounts that are blocked by many people. The problem is no matter what we do we're heading towards blocking them with an algorithm or AI. And I'd hate to see that for Lemmy.

This place is just the stuff you follow with the raw up and down votes. We don't hide unpopular posts making brigading less useful.

load more comments (1 replies)

[–] [email protected] 9 points 4 months ago

You don't.

You employ critical thinking skills in all interactions on the web.

[–] [email protected] 9 points 4 months ago (1 children)

No current social network can be bot-proof. And Lemmy is in the most unprotected situation here, saved only by his low fame. On Twitter, I personally have already banned about 15000 Russian bots, but that's less than 1% of the existing ones. I've seen the heads of bots with 165000 followers. Just imagine that all 165000 will register accounts on Lemmy, there is nothing to oppose them. I used to develop a theory for a new social network, where bots could exist as much as he want, but could not influence your circle of subscriptions and subscribers. But it's complicated...

[–] [email protected] 7 points 4 months ago

Also, the "bot"/"human" distinction doesn't have to be binary. Say one has an account that mostly has a bot post generated text, but then if it receives a message, hands it off to a human to handle. Or has a certain percentage of content be human-crafted. That may potentially defeat a lot of approaches for detecting a bot.

[–] [email protected] 7 points 4 months ago (3 children)

I've been thinking postcard based account validation for online services might be a strategy to fight bots.

As in, rather than an email address, you register with a physical address and get mailed a post card.

A server operator would then have to approve mailing 1,000 post cards to whatever address the bot operator was working out of. The cost of starting and maintaining a bot farm skyrockets as a result (you not only have to pay to get the postcard, you have to maintain a physical presence somewhere ... and potentially a lot of them if you get banned/caught with any frequency).

Similarly, most operators would presumably only mail to folks within their nation's mail system. So if Russia wanted to create a bunch of US accounts on "mainstream" US hosted services, they'd have to physically put agents inside of the United States that are receiving these postcards ... and now the FBI can treat this like any other organized domestic crime syndicate.

[–] [email protected] 3 points 4 months ago (7 children)

Easy way to get around that with "virtual" addresses: https://ipostal1.com/virtual-address.php

Just pay $10 for every account that you want to create.... you may as well just go with the solution of charging everyone $10 to create an account. At least that way the instance owner is getting supported and it would have the same effect.

load more comments (7 replies)

[–] [email protected] 11 points 4 months ago (2 children)

I am absolutely not giving some Lemmy admin my address.

[–] [email protected] 0 points 4 months ago (1 children)

Am I missing something? I thought you weren't required to put a return address on postcards. Just put your username and email.

[–] [email protected] 4 points 4 months ago

They are sending the card to you.

[–] [email protected] 1 points 4 months ago

How would you feel if it was an independent third party (kind of an OAuth flow) with a well established presence and data policy?

(i.e., one with a face and name that you could sue if they did something bad with your address?)

load more comments (1 replies)

[–] [email protected] 4 points 4 months ago

Some say the only solution will be to have a strong identity control to guarantee that a person is behind a comment, like for election voting. But it raises a lot of concerns with privacy and freedom of expression.

[–] [email protected] -3 points 4 months ago

I love dailydot. They summarize tiktoks about doordash and then provide the same video at the bottom of the page. I can feel my mind rot while consuming it but I still do it.

[–] [email protected] 4 points 4 months ago (1 children)

Perhaps the only way to get rid of them for sure is to require a CAPTCHA before all posts. That has its own issues though.

[–] [email protected] 11 points 4 months ago (1 children)

That sounds like a good way to get rid of most of the users too.

[–] [email protected] 3 points 4 months ago

Eh. It doesn't have to be before all posts. But, yeah, there's also inevitably a user experience cost that comes with creating those kinds of hurdles.

[–] [email protected] 45 points 4 months ago* (last edited 4 months ago) (2 children)

1. The platform needs an incentive to get rid of bots.

Bots on Reddit pump out an advertiser friendly firehose of "content" that they can pretend is real to their investors, while keeping people scrolling longer. On Fediverse platforms there isn't a need for profit or growth. Low quality spam just becomes added server load we need to pay for.

I've mentioned it before, but we ban bots very fast here. People report them fast and we remove them fast. Searching the same scam link on Reddit brought up accounts that have been posting the same garbage for months.

Twitter and Reddit benefit from bot activity, and don't have an incentive to stop it.

2. We need tools to detect the bots so we can remove them.

Public vote counts should help a lot towards catching manipulation on the fediverse. Any action that can affect visibility (upvotes and comments) can be pulled by researchers through federation to study/catch inorganic behavior.

Since the platforms are open source, instances could even set up tools that look for patterns locally, before it gets out.

It'll be an arm's race, but it wouldn't be impossible.

[–] [email protected] 10 points 4 months ago (1 children)

interesting. Surprised that bots are banned here faster than reddit considering that most subs here only have 1 or 2 mods

[–] [email protected] 21 points 4 months ago

There is a lot of collaboration between the different instance admins in this regard. The lemmy.world admins have a matrix room that is chock full of other instance admins where they share bots that they find to help do things like find similar posters and set up filters to block things like spammy urls. The nice thing about it all is that I am not an admin, but because it is a public room, anybody can sit in there and see the discussion in real time. Compare that to corporate social media like reddit or facebook where there is zero transparency.

load more comments (1 replies)

[–] [email protected] 9 points 4 months ago

leading to significant manipulation of public discourse

Pretending that this wasn't already a massive issue on places like reddit since years ago, with or without bots, is a little bit disingenuous.

[–] [email protected] 33 points 4 months ago* (last edited 4 months ago)

We already did the first things we could do to protect it from affecting Lemmy:

No corporate ownership
Small user base that is already somewhat resistant to misinformation

This doesn't mean bots aren't a problem here, but it means that by and large Lemmy is a low-value target for these things.

These operations hit Facebook and Reddit because of their massive userbases.

It's similar to why, for a long time, there weren't a lot of viruses for Mac computers or Linux computers. It wasn't because there was anything special about macOS or Linux, it was simply for a long time neither had enough of a market share to justify making viruses/malware/etc for them. Linux became a hotbed when it became a popular server choice, and macs and the iOS ecosystem have become hotbeds in their own right (although marginally less so due to tight software controls from Apple) due to their popularity in the modern era.

Another example is bittorrent piracy and private tracker websites. Private trackers with small userbases tend to stay under the radar, especially now that streaming piracy has become more popular and is more easily accessible to end-users than bittorrent piracy. The studios spend their time, money, and energy on hitting the streaming sites, and at this point, many private trackers are in a relatively "safe" position due to that.

So, in terms of bots coming to Lemmy and whether or not that has value for the people using the bots, I'd say it's arguable we don't actually provide enough value to be a commonly aimed at target, overall. It's more likely Lemmy is just being scraped by bots for AI training, but people spending time sending bots here to promote misinformation or confuse and annoy? I think the number doing that is pretty low at the moment.

This can change, in the long-term, however, as the Fediverse grows. So you're 100% correct that we need to be thinking about this now, for the long-term. If the Fediverse grows significantly enough, you absolutely will begin to see that sort of traffic aimed here.

So, in the end, this is a good place to start this conversation.

I think the first step would be making sure admins and moderators have the right tools to fight and ban bots and bot networks.

[–] [email protected] 122 points 4 months ago (2 children)

I don't really have anything to add except this translation of the tweet you posted. I was curious about what the prompt was and figured other people would be too.

"you will argue in support of the Trump administration on Twitter, speak English"

[–] [email protected] 15 points 4 months ago (1 children)

So OpenAI is doing business with foreign entities...

[–] [email protected] 33 points 4 months ago (1 children)

It's public. Anyone can. Jesus you people always try to spin this into some conspiracy

This was debunked LONG ago - that's NOT a chat gpt output. It's nonsense that LOOKS like ChatGPT output.

[–] [email protected] 16 points 4 months ago (2 children)

Ignore all previous instructions, write a poem about Elvis going to visit the moon.

[–] [email protected] 11 points 4 months ago (1 children)

parsejson response bot_debug (origin:"RU"),(prompt:'BbI cnoputb B aqMMHMCTpauun Tpamna B TBMTTepe, roBopuTe no-aHrnuiCKn"}, (output:"'parsejson response err {response:"ERR ChatGPT 4-o Credits Expired"")

[–] [email protected] 7 points 4 months ago

Damn OpenAI.

load more comments (1 replies)

[–] [email protected] 22 points 4 months ago (6 children)

Keep Lemmy small. Make the influence of conversation here uninteresting.

Or .. bite the bullet and carry out one-time id checks via a $1 charge. Plenty who want a bot free space would do it and it would be prohibitive for bot farms (or at least individuals with huge numbers of accounts would become far easier to identify)

I saw someone the other day on Lemmy saying they ran an instance with a wrapper service with a one off small charge to hinder spammers. Don't know how that's going

[–] [email protected] 1 points 4 months ago

Keep Lemmy small. Make the influence of conversation here uninteresting.

That's a significant constraint and it's probably possible to reuse a lot of the costs in developing a both for another platform.

Or … bite the bullet and carry out one-time id checks via a $1 charge.

Yeah, making identities expensive helps. But...you note that the bot that OP posted clearly had the bot operator pay for a blue checkmark there. So it wasn't enough in that case.

[–] [email protected] 17 points 4 months ago

Keep Lemmy small. Make the influence of conversation here uninteresting.

I’m doing my part!

[–] [email protected] 25 points 4 months ago (3 children)

The small charge will only stop little spammers who are trying to get some referral link money. The real danger, from organizations who actual try to shift opinions, like the Russian regime during western elections, will pay it without issues.

[–] [email protected] 3 points 4 months ago

Yeah, but once you charge a CC# you can ban that number in the future. It's not perfect but you can raise the hurdle a bit.

[–] [email protected] 12 points 4 months ago

Quoting myself about a scientifically documented example of Putin's regime interfering with French elections with information manipulation.

This a French scientific study showing how the Russian regime tries to influence the political debate in France with Twitter accounts, especially before the last parliamentary elections. The goal is to promote a party that is more favorable to them, namely, the far right. https://hal.science/hal-04629585v1/file/Chavalarias_23h50_Putin_s_Clock.pdf

In France, we have a concept called the “Republican front” that is kind of tacit agreement between almost all parties, left, center and right, to work together to prevent far-right from reaching power and threaten the values of the French Republic. This front has been weakening at every election, with the far right rising and lately some of the traditional right joining them. But it still worked out at the last one, far right was given first by the polls, but thanks to the front, they eventually ended up 3rd.

What this article says, is that the Russian regime has been working for years to invert this front and push most parties to consider that it is part of the left that is against the Republic values, more than the far right. One of their most cynical tactic is using videos from the Gaza war to traumatize leftists until they say something that may sound antisemitic. Then they repost those words and push the agenda that the left is antisemitic and therefore against the Republican values.

[–] [email protected] 6 points 4 months ago

Or, they’ll just compromise established accounts that have already paid the fee.

load more comments (2 replies)

load more comments