thoughtpile i write things here

Ideas on a better content warning system for the Fediverse

Content warnings are a more general form of trigger warnings. They are notices that precede potentially sensitive content, so readers can prepare themselves to adequately engage or, if necessary, disengage for their own well-being.

How content warnings are handled by current implementations

Mastodon and Pleroma abuse the “summary” field in ActivityPub for content warnings.[1][2] The field returned by the Mastodon API is called “spoiler_text”. It’s a simple text field. There is no mechanism to make content warnings predictable, and they clash with real summaries.

There is no consensus on how to name these warnings. Some use abbreviations like “mh”, “ec”, “alc” and so on, others use “mental health”, “eye contact” and “alcohol”. Negativity is expressed with “negative”, “neg”, “-” or “--”. Words get misspelled. All of this makes filtering unnecessarily hard.

How to make content warnings better

The goal

The purpose of content warnings is to allow people to decide if they want to see the content. Maybe you don’t want to look at cats under any circumstances, only want to look at nudity sometimes and want to see as many pictures of squirrels as possible. You should be able to filter out all cat pictures, have all posts with nudity collapsed and all posts with squirrels expanded.
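A minimal sketch of what such a per-user filter could look like. Everything here is an assumption for illustration: the names Action, PREFERENCES and decide are made up, and so are the rules that the most restrictive preference wins, that unknown warnings collapse the post, and that posts without warnings are shown expanded.

```python
from enum import Enum


class Action(Enum):
    HIDE = "hide"          # filter the post out entirely
    COLLAPSE = "collapse"  # show the warning, hide the content behind a click
    EXPAND = "expand"      # show the content right away


# Hypothetical preferences of one user, matching the example above.
PREFERENCES = {
    "cat": Action.HIDE,
    "nudity": Action.COLLAPSE,
    "squirrel": Action.EXPAND,
}

# Ordered from least to most restrictive.
_ORDER = [Action.EXPAND, Action.COLLAPSE, Action.HIDE]


def decide(warnings, preferences, default=Action.COLLAPSE):
    """Pick the most restrictive action among the post's content warnings.

    Warnings the user has no preference for fall back to `default`
    (an assumption: unknown warnings collapse the post).
    """
    if not warnings:
        return Action.EXPAND  # assumption: no warnings means nothing to hide
    actions = [preferences.get(name, default) for name in warnings]
    return max(actions, key=_ORDER.index)
```

With these preferences, a post tagged “nudity” and “squirrel” would be collapsed, because collapsing is the more restrictive of the two.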

Problems and solutions

Different people name content warnings differently

Content warnings should auto-complete. Each server could have a database with all previously used content warnings. If you type “cat”, a little list pops up showing that “cat” was used 17 times and “cats” was used 2 times. You see that most people expect cat pictures to be labeled with “cat”. It could be useful to be able to synchronize the content warning database between servers.
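The auto-complete idea could be sketched roughly like this. The class WarningIndex and its methods are hypothetical names, and the in-memory counter stands in for whatever database a real server would use. Normalising to lowercase is also an assumption.

```python
from collections import Counter


class WarningIndex:
    """Counts how often each content warning was used on this server."""

    def __init__(self):
        self._counts = Counter()

    def record(self, name):
        # Assumption: warnings are compared case-insensitively.
        self._counts[name.strip().lower()] += 1

    def suggest(self, prefix, limit=5):
        """Return (name, count) pairs starting with `prefix`, most used first."""
        prefix = prefix.strip().lower()
        matches = [
            (name, count)
            for name, count in self._counts.items()
            if name.startswith(prefix)
        ]
        # Sort by descending count, then alphabetically for stable output.
        matches.sort(key=lambda item: (-item[1], item[0]))
        return matches[:limit]


index = WarningIndex()
for _ in range(17):
    index.record("cat")
for _ in range(2):
    index.record("cats")
index.record("alcohol")

suggestions = index.suggest("cat")  # [("cat", 17), ("cats", 2)]
```

Synchronising between servers would then amount to exchanging and merging these counts, which this sketch does not attempt.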

How does the server know where one content warning ends and another begins?

Content warnings should be arrays, not one text field. They could be implemented as their own type or as special tags.

Example of ActivityPub tags with type “content_warning”:
"tag": [
  {
    "type": "content_warning",
    "name": "nudity",
    "id": "https://example.com/content_warnings/nudity"
  },
  {
    "type": "content_warning",
    "name": "eye contact",
    "id": "https://example.com/content_warnings/eye%20contact"
  }
]
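A client or server could then read the warnings back out of a post along these lines. This is only a sketch: the function name content_warnings is made up, and the “content_warning” type is the proposal from this post, not something ActivityPub defines today.

```python
def content_warnings(obj, types=("content_warning",)):
    """Return the names of all tags on `obj` whose type is in `types`."""
    tags = obj.get("tag", [])
    # A property with a single value may be a bare object instead of a list.
    if isinstance(tags, dict):
        tags = [tags]
    return [
        tag["name"]
        for tag in tags
        if isinstance(tag, dict) and tag.get("type") in types and "name" in tag
    ]


post = {
    "type": "Note",
    "content": "...",
    "tag": [
        {"type": "Hashtag", "name": "#art"},
        {
            "type": "content_warning",
            "name": "nudity",
            "id": "https://example.com/content_warnings/nudity",
        },
        {
            "type": "content_warning",
            "name": "eye contact",
            "id": "https://example.com/content_warnings/eye%20contact",
        },
    ],
}

names = content_warnings(post)  # ["nudity", "eye contact"]
```

Because each warning is its own entry, nothing has to guess where one warning ends and the next begins.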

What about posts without content warnings?

This is a social problem that cannot be solved technically. However, there are ways to improve the situation somewhat. Some posts can be understood well enough by machines to add automatic content warnings. Say, for example, you write a post with the text “Look at these cute squirrels! 😍” and an image attached. The server could have an algorithm like this: If “squirrel” is in the text and an attachment is present, add the content warning “squirrel”. Automatic content warnings should have a different type than user-selected content warnings, so that users can choose to ignore them.
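The squirrel rule above, written out as a rough sketch. The function name auto_warnings and the keyword list are assumptions, and a plain substring match like this is deliberately naive: it would also fire on unrelated words that happen to contain a keyword.

```python
def auto_warnings(text, has_attachment, keywords=("squirrel",)):
    """Return automatic content warning tags for a post.

    A keyword only triggers a warning if the post has an attachment,
    as in the rule described above.
    """
    if not has_attachment:
        return []
    lowered = text.lower()
    return [
        # A separate type lets users ignore machine-generated warnings.
        {"type": "content_warning_auto", "name": keyword}
        for keyword in keywords
        if keyword in lowered
    ]


tags = auto_warnings("Look at these cute squirrels! 😍", has_attachment=True)
```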

What about compatibility with existing implementations?

If a post (“status” in Mastodon terms) does not have a content warning field, the “summary” could be translated, either by looking for words that match content warnings in our database or by simply splitting the text on “,” and “;”[3]. These should also use the type for automatic content warnings.

Example of an automatically translated status with summary:
"summary": "squirrels, eye contact",
"tag": [
  {
    "type": "content_warning_auto",
    "name": "squirrels",
    "id": "https://example.com/content_warnings/squirrels"
  },
  {
    "type": "content_warning_auto",
    "name": "eye contact",
    "id": "https://example.com/content_warnings/eye%20contact"
  }
]
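The simple splitting variant could be sketched like this. The function name translate_summary and the base URL are assumptions for illustration. The database lookup variant is not shown.

```python
import re
from urllib.parse import quote

# Hypothetical base URL, taken from the example above.
BASE = "https://example.com/content_warnings/"


def translate_summary(summary, base=BASE):
    """Split a free-text summary on "," and ";" into automatic warning tags."""
    names = [part.strip() for part in re.split(r"[,;]", summary)]
    return [
        {
            "type": "content_warning_auto",
            "name": name,
            # Percent-encode the name so spaces become %20 in the id.
            "id": base + quote(name, safe=""),
        }
        for name in names
        if name  # skip empty parts such as a trailing comma
    ]


tags = translate_summary("squirrels, eye contact")
```

For the summary “squirrels, eye contact” this produces the two tags shown in the example above.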

How likely is it that existing Fediverse servers change their implementation?

ActivityPub seems to specify whatever Mastodon does[2] and Mastodon does not seem interested in improving the situation.[4][5] Pleroma seems to copy whatever Mastodon does. I don’t know about the other server implementations. The best route right now appears to be to try to get a proper content warning feature into ActivityPub and then hope the servers implement it. Or to develop something better than ActivityPub.