Auditing the audit: what 1.58B moderation actions can and can't tell us
Spent most of this week with a paper that on paper (heh) sounds boring, an audit of platform moderation logs around the 2024 European Parliament elections, and ended up rewriting a chunk of how I want to frame Chapter 4. It's Tessa, Shahi, Trujillo and Cresci's When Transparency Falls Short (arXiv 2604.19285), and it pulls 1.58 billion moderation actions out of the EU's Digital Services Act Transparency Database (DSA-TDB) across 8 months and 8 platforms. The headline finding is a null result: nothing in the data looks like platforms adapting their moderation around the highest-stakes democratic event in their year. Five orthogonal time-series methods, all returning "no signal." That's a much stronger statement than any single test would be.
But the thing I keep turning over is the X anomaly inside the table. TikTok reported 646 million Statements of Reasons over the window. Instagram reported 300 million. Facebook reported 260 million. X reported 628 thousand: roughly three orders of magnitude below TikTok, and several hundred times below Instagram and Facebook, all platforms of vaguely comparable scale. And on top of that low volume, X reports a moderation delay of essentially zero across the board, while also claiming 99% of its moderation is manual. Pause on that for a second. Zero-delay, 99%-manual, at any non-trivial volume, is operationally incoherent. You cannot have humans reviewing content with effectively no latency unless the volume is so small that the manual path is genuinely feasible, in which case the threat being moderated must also be vanishingly small.
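To make the scale gap concrete, here is a back-of-the-envelope sketch in Python using only the rounded counts quoted above. The 240-day window is my own approximation of "8 months", not a figure from the paper, and the per-day number is an average that says nothing about how the actions were actually distributed over time.

```python
import math

# Reported Statements of Reasons (SoR), rounded as quoted in the text above.
sor_counts = {
    "TikTok": 646_000_000,
    "Instagram": 300_000_000,
    "Facebook": 260_000_000,
    "X": 628_000,
}

# Assumption: 8 months approximated as 8 * 30 days.
WINDOW_DAYS = 8 * 30


def ratios_to_x(counts):
    """How many times larger each platform's SoR count is than X's."""
    x_count = counts["X"]
    return {name: n / x_count for name, n in counts.items() if name != "X"}


ratios = ratios_to_x(sor_counts)
x_per_day = sor_counts["X"] / WINDOW_DAYS

for name, ratio in ratios.items():
    # log10 of the ratio is the gap in orders of magnitude.
    print(f"{name}: {ratio:,.0f}x X's volume ({math.log10(ratio):.2f} orders of magnitude)")
print(f"X average: {x_per_day:,.0f} SoRs per day")
```

On these numbers the gap is a bit over 1,000x for TikTok and somewhere between 400x and 500x for Instagram and Facebook, while X averages on the order of a few thousand reported actions per day. That daily figure is small enough for manual review to be plausible in principle, which is exactly why the near-zero delay is the part that does not fit.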
The kicker: X's own stated moderation focus inside the DSA-TDB filings is deepfakes and synthetic content. So the public claim is "deepfakes are the priority" and the reported behavior is "we manually moderate them in essentially zero time, almost never." Those two things cannot both be true at the platform's actual scale, and the paper flags this with admirable understatement. I keep thinking the gap between stated moderation focus and reported moderation behavior, audited at billion-action scale, is itself a research object. Maybe a side paper. Maybe just a weapon to keep in the back pocket every time someone argues self-regulation is working.
The reason this lands so hard for me right now is that it sits next to the CONVEX paper from last week: 150K AI-generated posts on X, with the headline finding that synthetic content is going passively viral (lots of reshares, very few replies or quote tweets). Stack the two: AI content gets disproportionate spread on X specifically, and the same platform reports almost no moderation activity, all of it instantaneously resolved, while declaring deepfakes its priority. The picture that emerges isn't even controversial. It's just arithmetic. And the European Commission has, in fact, opened formal proceedings against X for electoral integrity failures during this exact window. The audit didn't surface the failure, but it did point to it: the database showed nothing happening, and now we know "nothing happening" was the actual state of affairs.
What I'm taking from this for the thesis: my Chapter 4 argument has been "platform-level moderation is insufficient, detection systems should be deployable independently of platform cooperation." I had the position; I didn't have the empirical receipt. Now I do, at 1.58 billion rows. The harder lesson is methodological: when the database that's supposed to enable accountability is itself partially blind (the DSA-TDB has no dedicated synthetic-media category, and the optional schema fields where granular moderation context would live are mostly empty), the audit has to be honest about what it can and can't see. Five methods all returning "no signal" doesn't prove nothing changed. It proves nothing visible-to-the-database changed. Which is, I'm slowly realizing, the more politically important finding of the two.
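One way to be honest about what the database can see is to report field coverage next to every result. Below is a minimal Python sketch of that check. The rows are toy records I made up for illustration, and the field names (`category`, `extra_context`) are hypothetical stand-ins, not the real DSA-TDB schema or anything taken from the paper.

```python
def fill_rate(rows, fields):
    """Share of rows in which each field is present and non-empty."""
    total = len(rows)
    rates = {}
    for field in fields:
        filled = sum(1 for row in rows if row.get(field) not in (None, ""))
        rates[field] = filled / total if total else 0.0
    return rates


# Toy records, invented for illustration only (not real database rows).
toy_rows = [
    {"platform": "A", "category": "scam", "extra_context": ""},
    {"platform": "A", "category": "hate", "extra_context": None},
    {"platform": "B", "category": "scam", "extra_context": "synthetic video"},
    {"platform": "B", "category": "scam"},  # optional field missing entirely
]

rates = fill_rate(toy_rows, ["category", "extra_context"])
for field, rate in rates.items():
    print(f"{field}: {rate:.0%} filled")
```

A mandatory field sitting near 100% next to an optional one sitting near zero is the pattern to watch for: any "no signal" result that depends on the sparse field is really a statement about the filled-in fraction only.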













