Lemmy Support

Support / questions about Lemmy.

Matrix Space: #lemmy-space

founded 5 years ago

Censorship bot being a pain (lemmy.ml)

submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]

There is a thread in another community regarding some controversies happening in women's chess. I posted to that thread, recommending a book written by WGM Jennifer Shahade, who is a multi-time US women's chess champion. I also linked to a review of the book, the URL of which contained the book title.

The Open Library page about the book is here: https://openlibrary.org/works/OL5849601W

It seems that the title, as chosen by the female author with considerable self-awareness, contains a word that is sometimes used as a sexist slur. You can see the title by clicking the link above. Unfortunately some kind of bot censored the title from both the post and the review link (to chessbase.com) that I had posted. I was able to fool the bot by changing a few characters, but the bot's very existence is imho in poor taste.

We are adults here, we shouldn't have robots filtering our language. If we act sexist or abusive then humans should intervene, but not bots. Otherwise we are in an annoying semi-dystopia. The particular post I made, as far as I can tell, is completely legitimate.

[–] [email protected] 0 points 1 year ago* (last edited 1 year ago) (1 children)

Let’s see if Lemmy has that too.

I'm aremovedatty today, so why not? :^) [EDIT: yes, it has. I wrote "a bit chatty" without spaces.]

The Scunthorpe problem is an additional issue, caused by failure to identify unit ("word") boundaries correctly. It can be solved with existing techniques, or at least tuned to err toward false negatives (e.g. missing "fuckingcunt") rather than false positives (e.g. flagging "Scunthorpe").
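To make the trade-off concrete, here's a rough sketch of the two matching strategies. Everything in it is an assumption for illustration: I don't know how Lemmy's filter is actually implemented, "grape" is just a harmless stand-in for a filtered word, and the function names are made up. The "removed" replacement string is the only part taken from what the filter did to my text above.

```python
import re

# Hypothetical blocklist; "grape" is a harmless stand-in for a filtered word.
BLOCKLIST = ["grape"]
REPLACEMENT = "removed"

# One alternation of all blocklisted words, escaped for regex safety.
_WORDS = "|".join(re.escape(word) for word in BLOCKLIST)

# Strategy 1: match the word anywhere, even inside a longer word.
SUBSTRING_PATTERN = re.compile(_WORDS, re.IGNORECASE)

# Strategy 2: match only when the word stands alone between word boundaries.
BOUNDARY_PATTERN = re.compile(r"\b(?:" + _WORDS + r")\b", re.IGNORECASE)


def filter_substring(text: str) -> str:
    """Replace every occurrence, which risks Scunthorpe-style false positives."""
    return SUBSTRING_PATTERN.sub(REPLACEMENT, text)


def filter_word_boundary(text: str) -> str:
    """Replace whole words only, which lets glued-together words slip through."""
    return BOUNDARY_PATTERN.sub(REPLACEMENT, text)


if __name__ == "__main__":
    for sample in ["I like grapefruit", "a grape", "sourgrapes"]:
        print(repr(sample), "->",
              repr(filter_substring(sample)), "|",
              repr(filter_word_boundary(sample)))
```

The substring version mangles "grapefruit" (a false positive), while the boundary version leaves it alone but also lets "sourgrapes" through (a false negative). Neither version looks at anything beyond the characters themselves.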

The problem that I'm highlighting is on another level, one that even LLMs have a really hard time with: each unit can be used to convey [at least in theory] an infinite number of concepts. They usually come "bundled" with a few of them, but as we humans use them, we either add or remove some. For slurs this has the following two effects:

- It's possible to pick a word often used as a slur and cancel its slur value in a certain context, or even make it stop being taken as a slur by default.
- It's possible to pick any common word and use it as a slur.

I'll post the example that I was thinking about. It doesn't use a slur but it's the same mechanism.

My cat is odd. He whimpers for food when we're dining, chases and fetches toys, and when the doorbell rings he runs to the door, meowing nonstop. It's like I got a really weird, meowing dog instead. My sister even walks this weird dog on a leash once in a while.

In that utterance the word "dog" is being associated not with 🐶 but with an odd example of 🐱, as the meaning of the word has been negotiated through the utterance. It's the same deal with slurs: it's possible to cancel their value as a slur in a certain utterance, depending on the rest of the utterance and external context. Black English speakers often do this with the "n" word* (used to convey "mate, bro, kin" among them), and slur reclamation is basically this on a higher level.

*another IMO legitimate situation is metalinguistic - using the word to refer to the word itself. I'm not using it here but I don't see a problem with it.

[–] [email protected] 0 points 1 year ago

I don't care much about any of these technical intricacies regarding word matching. I want Lemmy to be a human institution, which means no bots editing people's posts beyond possible spam control. If there is a serious trolling problem featuring specific keywords in a community, I'm fine with a moderator manually kicking off some automatic action to remove a bunch of posts at the same time. But we don't need robot nannies surveilling and messing with all of our posts.