Broken by Design: Facebook’s Failing Automated Moderation System
From misfiring AI to failed appeals and total opacity, Meta’s automated moderation machine is punishing the very people it claims to protect, including me.
For years, Facebook has insisted that its automated content moderation systems are the key to making the platform safer: smarter algorithms, faster enforcement, fewer harmful posts slipping through the cracks. The problem is, none of it really works. At least, not in any way that actually makes the platform more fair, more transparent, or more accountable.
If anything, it’s broken. Badly.
The AI can’t understand what it’s looking at
Let’s start with the basics. Facebook’s moderation is powered by AI that supposedly scans posts, images, and videos for content that breaks the rules: hate speech, misinformation, incitement, that sort of thing. It sounds impressive in theory. In practice? It’s a car crash.
Literally: one leaked internal document showed Facebook’s system mistaking a livestream of a mass shooting for someone driving through a carwash. Cockfighting videos were mislabelled as road accidents. It’s the kind of failure that would be almost funny if it weren’t so horrifying.
The tech doesn’t understand satire or context either. In one example, a meme mocking genocide denial was removed as hate speech because the AI, and initially even the human reviewer, couldn’t grasp that it was condemning hate, not promoting it. Only after an external appeal was it reinstated. But imagine how many similar posts never got that second look.
This sort of mistake happens all the time, especially outside English-speaking regions, where Facebook’s systems rely on automated translation. In Arabic, content is wrongly removed at an astonishing rate: one study put the over-removal rate at 77%, often targeting political speech or reporting. In 2021, Facebook even removed posts about the Al-Aqsa Mosque because its system thought “Al-Aqsa” referred to a terrorist organisation. You can’t make this stuff up.
The irony is that while good content gets wrongly removed, genuinely harmful material often slips through. Facebook’s own engineers have admitted there’s no reliable database of slurs for many languages, meaning hate speech can spread unchecked in non-English communities. It’s not just that the AI gets it wrong; it gets it wrong in both directions.
The appeal process is a black hole
Here’s where it gets worse. If Facebook wrongly flags your content, you’re meant to be able to appeal. There’s a form. There’s a process. There’s a promise that a human will take a look.
That’s the theory.
In reality, the appeal system is vague, slow, and mostly automated. You’ll often get no response, or a generic rejection with no explanation at all. During the pandemic, Facebook paused most appeals entirely, and just let the AI run wild. In Q2 of 2020, they removed hundreds of thousands of posts for “terrorist content” on Instagram, but reinstated just 70. The rest? Gone, even if wrongly flagged.
Even now, appealing feels like screaming into the void. You’re rarely told what you did wrong, just given a blanket phrase like “Community Standards violation.” For most users, there’s no real path to escalate. Facebook’s Oversight Board only looks at a tiny number of cases; the rest of us are stuck with the automated process, hoping it doesn’t glitch out.
My own run-in with the system (and how it spiralled)
I’ve dealt with Facebook’s moderation mess before, but this latest incident genuinely floored me.
It started with a post I made in March, responding to a hateful message I’d received. I didn’t name anyone. I didn’t swear. I simply wrote: “Why are these people so full of hatred?” That’s it. That’s what triggered everything.
Facebook took it down for “hateful conduct.” Not the original abuse — my response to it. The irony is staggering.
I appealed. I thought, surely a human being will look at this and realise I’m clearly condemning hate, not promoting it. But no. A few days later, I got the usual canned response: “We did not restore your post.” No real explanation. Just a blanket confirmation that my appeal had failed.
Then, bizarrely, I got another notification — telling me I could escalate the decision to the Oversight Board. For a brief moment, I thought, okay, maybe there’s still a route to justice here. But when I clicked it?
Error page.
“Sorry, something went wrong.”
Classic Facebook. They dangle accountability in front of you like a carrot on a stick, only to yank it away when you try to grab it. It’s infuriating, and completely opaque.
But that wasn’t the end of it. Off the back of this single post, which, again, was me condemning hate speech, Facebook imposed fresh restrictions across my entire account. Monetisation tools were disabled. The page status was downgraded. Every single thing I run through Facebook is now flagged with warnings and limitations. And no, they haven’t told me what content triggered this, or why the restrictions are still in place.
To make matters worse, the system keeps insisting I violated their Community Standards, even though I’ve combed through every post, and none of it comes close to crossing the line. The message is clear: You can lose access, reach, income, and reputation without ever being told what you actually did wrong.
I’ve been through this before. Two years ago, the exact same thing happened. It took months of dead ends, and two whole years, before the restrictions eventually lifted. And now I’m right back there again. The only Meta contact I had, the one person who seemed to even vaguely understand the system, went on maternity leave and never returned.
And when I post publicly to explain what’s happened? Facebook suppresses the reach of that too.
At this point, trying to fix anything on Facebook feels like being trapped in a Kafka novel. The appeals process doesn’t work. The decisions make no sense. The tools are broken. And the people, when you can reach one, have no idea how the system even functions.
That’s why I’m writing this here. Because if I don’t document it myself, no one else will.
User stories that mirror mine
Palestinian journalists have had their accounts deleted for reporting on conflict. Human rights activists have had livestreams pulled mid-broadcast. Artists have seen paintings removed for “nudity” the algorithm hallucinated. Feminists have been penalised for slogans like “men are trash” while far more hateful, coordinated campaigns were left untouched.
Even worse, disinformation has sailed straight through Facebook’s moderation filters. One investigation showed the platform approving ads full of blatant election lies, despite promising otherwise.
There’s a pattern here: Facebook’s system is both overzealous and undercooked. The wrong people get censored. The wrong people get through. And no one’s held accountable.
What Facebook says, and what actually happens
To hear Meta tell it, they’re doing a great job. They publish regular enforcement reports, claiming high takedown rates and huge volumes of harmful content removed “proactively.” They talk a lot about balance between free expression and safety.
They’ve even added a “satire exception” to their hate speech policy, acknowledging they’ve messed up in the past. In 2025, they admitted that their enforcement had become too automated, promising to scale it back and be more careful.
It’s a welcome admission, but far too late, and far too limited. The company still refuses to publish its list of “dangerous individuals and organisations,” so we have no idea what content might randomly trigger a ban. Even the Oversight Board asked for more transparency, and Facebook said no.
What we’re left with is a platform that says one thing and does another. The public narrative of steady progress, better AI, and fairer systems just doesn’t match what users actually experience.
Where this leaves us
This isn’t just about a few glitches. It’s about a platform that governs the speech of nearly three billion users through opaque, error-prone systems that routinely silence the wrong people.
Moderation at this scale is incredibly hard. No one’s denying that. But when the process is broken, the burden falls on the user. Content disappears. Pages vanish. Appeals go unanswered. And you’re left wondering what you did wrong.
If we’re going to have platforms this powerful, they need to be held to a higher standard. That means not just better algorithms, but meaningful transparency, real human oversight, and some basic respect for users.
Because right now, Facebook’s system doesn’t just feel broken. It feels deliberately designed to be unaccountable.