The surprising truth about data-driven dictatorships
Here’s the “dictator’s dilemma”: they want to block their country’s frustrated elites from mobilizing against them, so they censor public communications; but they also want to know what their people truly believe, so they can head off simmering resentments before they boil over into regime-toppling revolutions.
These two strategies are in tension: the more you censor, the less you know about the true feelings of your citizens and the easier it will be to miss serious problems until they spill over into the streets (think: the fall of the Berlin Wall or Tunisia before the Arab Spring). Dictators try to square this circle with things like private opinion polling or petition systems, but these capture a small slice of the potentially destabilizing moods circulating in the body politic.
Enter AI: back in 2018, Yuval Harari proposed that AI would supercharge dictatorships by mining and summarizing the public mood (as captured on social media), allowing dictators to tack into serious discontent and defuse it before it erupted into unquenchable wildfire:
Harari wrote that “the desire to concentrate all information and power in one place may become [dictators’] decisive advantage in the 21st century.” But other political scientists sharply disagreed. Last year, Henry Farrell, Jeremy Wallace and Abraham Newman published a thoroughgoing rebuttal to Harari in Foreign Affairs:
They argued that — like everyone who gets excited about AI, only to have their hopes dashed — dictators seeking to use AI to understand the public mood would run into serious training data bias problems. After all, people living under dictatorships know that spouting off about their discontent and desire for change is a risky business, so they will self-censor on social media. That’s true even if a person isn’t afraid of retaliation: if you know that using certain words or phrases in a post will get it autoblocked by a censorbot, what’s the point of trying to use those words?
The phrase “Garbage In, Garbage Out” dates back to 1957. That’s how long we’ve known that a computer that operates on bad data will barf up bad conclusions. But this is a very inconvenient truth for AI weirdos: having given up on manually assembling training data based on careful human judgment with multiple review steps, the AI industry “pivoted” to mass ingestion of scraped data from the whole internet.
But adding more unreliable data to an unreliable dataset doesn’t improve its reliability. GIGO is the iron law of computing, and you can’t repeal it by shoveling more garbage into the top of the training funnel:
When it comes to “AI” that’s used for decision support — that is, when an algorithm tells humans what to do and they do it — then you get something worse than Garbage In, Garbage Out — you get Garbage In, Garbage Out, Garbage Back In Again. That’s when the AI spits out something wrong, and then another AI sucks up that wrong conclusion and uses it to generate more conclusions.
To see this in action, consider the deeply flawed predictive policing systems that cities around the world rely on. These systems suck up crime data from the cops, then predict where crime is going to be, and send cops to those “hotspots” to do things like throw Black kids up against a wall and make them turn out their pockets, or pull over drivers and search their cars after pretending to have smelled cannabis.
The problem here is that “crime the police detected” isn’t the same as “crime.” You only find crime where you look for it. For example, there are far more incidents of domestic abuse reported in apartment buildings than in fully detached homes. That’s not because apartment dwellers are more likely to be wife-beaters: it’s because domestic abuse is most often reported by a neighbor who hears it through the walls.
So if your cops practice racially biased policing (I know, this is hard to imagine, but stay with me /s), then the crime they detect will already be a function of bias. If you only ever throw Black kids up against a wall and turn out their pockets, then every knife and dime-bag you find in someone’s pockets will come from some Black kid the cops decided to harass.
That’s life without AI. But now let’s throw in predictive policing: feed your “knives found in pockets” data to an algorithm and ask it to predict where there are more knives in pockets, and it will send you back to that Black neighborhood and tell you to throw even more Black kids up against a wall and search their pockets. The more you do this, the more knives you’ll find, and the more you’ll go back and do it again.
This is what Patrick Ball from the Human Rights Data Analysis Group calls “empiricism washing”: take a biased procedure and feed it to an algorithm, and then you get to go and do more biased procedures, and whenever anyone accuses you of bias, you can insist that you’re just following an empirical conclusion of a neutral algorithm, because “math can’t be racist.”
HRDAG has done excellent work on this, finding a natural experiment that makes the problem of GIGOGBI crystal clear. The National Survey On Drug Use and Health produces the gold standard snapshot of drug use in America. Kristian Lum and William Isaac took Oakland’s drug arrest data from 2010 and asked Predpol, a leading predictive policing product, to predict where Oakland’s 2011 drug use would take place.
[Image ID: (a) Number of drug arrests made by Oakland police department, 2010. (1) West Oakland, (2) International Boulevard. (b) Estimated number of drug users, based on 2011 National Survey on Drug Use and Health]
Then, they compared those predictions to the outcomes of the 2011 survey, which shows where actual drug use took place. The two maps couldn’t be more different:
Predpol told cops to go and look for drug use in a predominantly Black, working class neighborhood. Meanwhile the NSDUH survey showed the actual drug use took place all over Oakland, with a higher concentration in the Berkeley-neighboring student neighborhood.
What’s even more vivid is what happens when you simulate running Predpol on the new arrest data that would be generated by cops following its recommendations. If the cops went to that Black neighborhood and found more drugs there and told Predpol about it, the recommendation gets stronger and more confident.
In other words, GIGOGBI is a system for concentrating bias. Even trace amounts of bias in the original training data get refined and magnified when they are output through a decision support system that directs humans to go and act on that output. Algorithms are to bias what centrifuges are to radioactive ore: a way to turn minute amounts of bias into pluripotent, indestructible toxic waste.
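The concentrating effect can be sketched with a toy simulation. This is a minimal illustration, not Predpol's algorithm or Lum and Isaac's method: the two-neighborhood setup, the `hotspot_fraction` parameter and every number below are assumptions chosen for clarity. Both neighborhoods have the same true offense rate; the only difference is a small skew in the starting records.

```python
def simulate_hotspot_feedback(initial_records=(55.0, 45.0), true_rate=0.1,
                              patrols=100, hotspot_fraction=0.8, rounds=10):
    """Return neighborhood 0's share of all recorded incidents after each round.

    Assumptions (hypothetical, for illustration only):
    - both neighborhoods share the same true offense rate per patrol stop;
    - each round, the neighborhood with more recorded incidents is labeled
      the hotspot and receives `hotspot_fraction` of the patrols;
    - recorded incidents grow in proportion to patrols sent.
    """
    records = list(initial_records)
    shares = []
    for _ in range(rounds):
        # The "prediction": wherever the records are thickest is the hotspot.
        hotspot = 0 if records[0] >= records[1] else 1
        for area in (0, 1):
            fraction = hotspot_fraction if area == hotspot else 1 - hotspot_fraction
            # Expected new records: you only find what you look for.
            records[area] += patrols * fraction * true_rate
        shares.append(records[0] / sum(records))
    return shares
```

With these defaults, neighborhood 0 starts with 55% of the records and its share climbs every round toward the 80% patrol share, even though the underlying offense rate is identical in both places.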
There’s a great name for an AI that’s trained on an AI’s output, courtesy of Jathan Sadowski: “Habsburg AI.”
And that brings me back to the Dictator’s Dilemma. If your citizens are self-censoring in order to avoid retaliation or algorithmic shadowbanning, then the AI you train on their posts in order to find out what they’re really thinking will steer you in the opposite direction, so you make bad policies that make people angrier and destabilize things more.
Or at least, that was Farrell et al.’s theory. And for many years, that’s where the debate over AI and dictatorship has stalled: theory vs theory. But now, there’s some empirical data on this, thanks to “The Digital Dictator’s Dilemma,” a new paper from UCSD PhD candidate Eddie Yang:
https://www.eddieyang.net/research/DDD.pdf
Yang figured out a way to test these dueling hypotheses. He got 10 million Chinese social media posts from the start of the pandemic, before companies like Weibo were required to censor certain pandemic-related posts as politically sensitive. Yang treats these posts as a robust snapshot of public opinion: because there was no censorship of pandemic-related chatter, Chinese users were free to post anything they wanted without having to self-censor for fear of retaliation or deletion.
Next, Yang acquired the censorship model used by a real Chinese social media company to decide which posts should be blocked. Using this, he was able to determine which of the posts in the original set would be censored today in China.
That means that Yang knows what the “real” sentiment in the Chinese social media snapshot is, and what Chinese authorities would believe it to be if Chinese users were self-censoring all the posts that would be flagged by censorware today.
From here, Yang was able to play with the knobs, and determine how “preference-falsification” (when users lie about their feelings) and self-censorship would give a dictatorship a misleading view of public sentiment. What he finds is that the more repressive a regime is — the more people are incentivized to falsify or censor their views — the worse the system gets at uncovering the true public mood.
What’s more, adding additional (bad) data to the system doesn’t fix this “missing data” problem. GIGO remains an iron law of computing in this context, too.
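A back-of-the-envelope model shows why more data of the same kind cannot help. This is an illustrative sketch, not Yang's model: it assumes a fraction `true_share` of people hold critical views and that a fraction `self_censor_rate` of them stay silent. Both names, and the `population` parameter, are hypothetical.

```python
def observed_critical_share(true_share, self_censor_rate, population=1_000_000):
    """Share of visible posts that are critical, under a toy self-censorship model.

    Assumptions (hypothetical): every person writes one post, critics withhold
    their post with probability `self_censor_rate`, and everyone else posts freely.
    """
    critical_posts = population * true_share * (1 - self_censor_rate)
    other_posts = population * (1 - true_share)
    return critical_posts / (critical_posts + other_posts)


def estimation_error(true_share, self_censor_rate, population=1_000_000):
    """How far the visible picture falls below the true critical share."""
    return true_share - observed_critical_share(true_share, self_censor_rate, population)
```

In this sketch the error grows as `self_censor_rate` rises, and `population` cancels out of the ratio entirely: a billion self-censored posts give the same skewed estimate as a thousand.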
But it gets better (or worse, I guess): Yang models a “crisis” scenario in which users stop self-censoring and start articulating their true views (because they’ve run out of fucks to give). This is the most dangerous moment for a dictator, and depending on how the dictatorship handles it, they either get another decade of rule, or they wake up with guillotines on their lawns.
But “crisis” is where AI performs the worst. Trained on the “status quo” data where users are continuously self-censoring and preference-falsifying, AI has no clue how to handle the unvarnished truth. Both its recommendations about what to censor and its summaries of public sentiment are the least accurate when crisis erupts.
But here’s an interesting wrinkle: Yang scraped a bunch of Chinese users’ posts from Twitter — which the Chinese government doesn’t get to censor (yet) or spy on (yet) — and fed them to the model. He hypothesized that when Chinese users post to American social media, they don’t self-censor or preference-falsify, so this data should help the model improve its accuracy.
He was right: the model was significantly more accurate once it ingested data from Twitter than when it was working solely from Weibo posts. And Yang notes that dictatorships all over the world are widely understood to be scraping western/northern social media.
But even though Twitter data improved the model’s accuracy, it was still wildly inaccurate, compared to the same model trained on a full set of un-self-censored, un-falsified data. GIGO is not an option, it’s the law (of computing).
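The partial improvement can be illustrated with some toy arithmetic. This is a hypothetical sketch rather than Yang's procedure: it assumes the outside source reflects true opinion and makes up a fraction `uncensored_weight` of the combined data, while the rest comes from a self-censored platform. All function and parameter names are invented for illustration.

```python
def censored_share(true_share, self_censor_rate):
    """Critical share visible on a platform where critics partly stay silent."""
    visible_critical = true_share * (1 - self_censor_rate)
    return visible_critical / (visible_critical + (1 - true_share))


def blended_estimate(true_share, self_censor_rate, uncensored_weight):
    """Critical share in a dataset mixing uncensored and self-censored posts.

    Assumption (hypothetical): posts from the uncensored source show the true
    share, and they account for `uncensored_weight` of the combined dataset.
    """
    return (uncensored_weight * true_share
            + (1 - uncensored_weight) * censored_share(true_share, self_censor_rate))
```

For any `uncensored_weight` below 1, the blended estimate sits between the censored figure and the truth: better than the censored data alone, but still short of the real public mood.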
Writing about the study on Crooked Timber, Farrell notes that as the world fills up with “garbage and noise” (he invokes Philip K Dick’s delighted coinage “gubbish”), “approximately correct knowledge becomes the scarce and valuable resource.”
https://crookedtimber.org/2023/07/25/51610/
This “probably approximately correct knowledge” comes from humans, not LLMs or AI, and so “the social applications of machine learning in non-authoritarian societies are just as parasitic on these forms of human knowledge production as authoritarian governments.”
[Image ID: An altered image of the Nuremberg rally, with ranked lines of soldiers facing a towering figure in a many-ribboned soldier's coat. He wears a high-peaked cap with a microchip in place of insignia. His head has been replaced with the menacing red eye of HAL9000 from Stanley Kubrick's '2001: A Space Odyssey.' The sky behind him is filled with a 'code waterfall' from 'The Matrix.']
Deleted My YouTube Channel(s). Why? Because Google is Evil
After over 15 years of being on YouTube, I deleted both of my channels. This included my Sims 3 channel, The Sims 1 Depot, which had videos going back at least a decade. I finally had enough.
This was not a rash decision. It's something I've been weighing for a long time but didn't do because I was hoping and praying that with a changeover in management, things would get better. However, as of 2025, things became worse. There are so many things that have become evil about YouTube but for the sake of brevity, I will drill my issues down to the following:
#1. YouTube Doubles Down on Hate and Divisive Content for the "Engagement"
Google has no interest in promoting wholesome content that's of interest to people in any community. It wholeheartedly promotes and supports incendiary, inflammatory content--as well as "drama" videos--over all other types of videos.
If you've ever been on YouTube for any subject--not just The Sims, but travel, sports, fashion, DIY, movies, TV shows, etc.--you'll notice that the more videos you watch and the longer you've been on the platform, the more bigoted content, inflammatory ragebait and drama gets pushed into your feed. No matter what the shills say, it has nothing to do with your viewing habits and there's absolutely nothing you can do to stop getting this content. It's all algorithms. You can click "Not Interested" a million times, downvote a million times, hide channels, unsubscribe and try to subscribe to the types of content you want to watch. Doesn't matter. You will get served this content if you're on YouTube long enough:
None of this is conspiracy theory. It's something that thousands have observed and discussed over the years ad nauseam. Hatred and division are Google's "bread and butter" (they make the most money), so that's what it promotes and encourages to be produced on its platform:
#2. Google's Algorithm Invites Trolls to Target YouTubers
Google recruits trolls, via algorithms, to target specific YouTubers based on their race, gender or nationality. One of the worst examples--and a major factor in me deciding to quit YouTube--was what happened to FakeGamerGirl. A simmer of Arab descent, she started getting targeted by a lot of racists on YouTube. The hate campaigns got so bad that she decided to quit.
Keep in mind that there aren't rabid anti-Arab racists in The Sims community. What YouTube did was deliberately serve her content to bigots via the side feed and home page, regardless of whether they played The Sims or not. Those racists then flocked to her channel like bees to honey, pretending to be Sims 3 players when they never were.
This leads me to the other factor that played a major part in me quitting YouTube:
#3. The "Toxic Community" Lie
You know how conventional wisdom says that every single community has a toxic fanbase, which can explain all the virulent racism, sexism and homophobia? As long as I've been on the internet (since 1999), I know that this is a lie. What Big Tech does is help and in some cases encourage trolls from various extremist camps to raid communities. It then covers its tracks by claiming that these racists, trolls and bigots were an intrinsic part of these "communities" from the start and that it's just a natural thing for every fanbase on the planet to turn into a toxic hate-filled stew.
How do I know the whole "toxic fandom" thing is a lie? For one, when it's very obvious that a fandom has been raided, Big Tech does nothing about it until a major sponsor pulls out or the whole thing threatens to turn into a PR nightmare. For instance, Amazon and Rotten Tomatoes kept silent when various factions from Stormfront, 4chan and other platforms raided user reviews and the IMDB forums, and openly bragged about it, until their behavior hit mainstream media. But until then, you could've pressed the "report" button until the cows came home and nothing would happen.
Another reason why I know that Big Tech companies like Google invite raids is that when you start comparing notes from every so-called "toxic fandom", you notice a pattern. Whether it's The Sims, Disney, Star Wars, Star Trek, Rick and Morty, comic book movies, etc., every toxic fandom sounds exactly the same and speaks in the same voice. It doesn't make sense, considering that so many of these fandoms cater to radically different demographics with particular worldviews. But go figure, Star Wars and Star Trek--two progressive IPs that had minority captains and female badasses--have a rabid misogynistic and racist fanbase that sounds exactly like the misogynistic and racist fanbase for She Ra: Princess of Power, Disney Films and Velma. Does every fanbase have its toxic fandom, or is it just that Big Tech deliberately opens them up to extremist camps?
#4. YouTube Algorithm's Toxic Influence on Creator Ecosystem
Another factor that made me decide to quit YouTube is the corrosive impact that the algorithm has on the ecosystem. I've seen this happen time and time again: channels that started off so wholesome, helpful and informative over time going down the rabbit hole of dog whistles, racism, racebaiting and conspiracy theories when the YouTube algorithm punishes them. The algorithm has such allure that even people who have no business going down it will jump on the bandwagon.
Meanwhile, the YouTubers trying to avoid all of this have been getting increasingly buried under tons of Alt Right, Redpill and drama-based content, to where it's a chore to find them using search. I can't remember the last time I saw a legit video game or movie review in my feed, but there'll be tons of Clownfish TV, Critical Drinker and other assholes being recommended all the time.
#5. No Escaping the Evil, Toxic Algorithm No Matter What You Watch
By 2025, I was pretty much done with YouTube. Yet I was still wavering in my decision to close my channels. However, in June, I finally experienced the straw that broke the camel's back.
After a death in the family, I needed to just take some time off to decompress. So, I decided to take a cruise. Because I'm a first timer, I watched a bunch of cruise videos from YouTube to get a feel for things.
Harmless, right? So what do you think happened after a few months of watching videos about cruising? Literally, in the week leading up to my cruise, I started to get a large batch of thinly veiled racist videos in my feed about Carnival Cruises. Then, almost as if YouTube could tell I was avoiding them, the videos I got served became even more racist. It was as if the algorithm went from, "Look at all of these fights on Carnival (wink wink, nudge nudge, look at what race they are, wink, wink, nudge, nudge)..." to, "Oh? These videos aren't capturing your attention enough, so let's just push blatantly racist videos, with titles about 'ghetto' and 'ratchet' behavior ruining Carnival. That'll really catch your attention!"
When I came back from my cruise, I unsubscribed from every cruise channel so I wouldn't keep getting racist Carnival Cruise videos in my feed anymore. Then this happened. A few days ago, I watched a local news story about a hit-and-run in NYC, and 99% of the comments were racist comments about blacks and Carnival Cruise. It was almost as if YouTube went, "Oh, how cute. I see what you did there. First you avoided this racist meme by not clicking on the videos in your feed. Then you unsubscribed to all the cruise channels. Okay, you're from NYC. Here's a totally unrelated news story about a hit and run in NYC, but where almost all of the comments are raging about blacks on Carnival Cruise."
Like I said: no escaping the algorithm, no escaping the evil that is YouTube. So, I called it quits. I closed my channels and am walking away for good. It's so obvious that Google is using YouTube as a front to serve incendiary content and that contributors posting in good faith have been nothing but dupes whose content gets used to grease the algorithms that serve this content.
I refuse to be a dupe or become a party to this, so I've pulled out. It's been a great run, but as the great Danny Glover once said:
Apologies to anyone who enjoyed my channel and subscribed to me all these years. Again, quitting YouTube wasn't a rash decision, and I hope that someday soon we can meet again under a different, much better video sharing platform, where the emphasis is on fun, creativity and positivity, not the toxic wasteland that is YouTube now.
Last summer, as a spike in violent crime hit New Orleans, the city council voted to allow police to use facial-recognition software to track down suspects. It was billed as an effective, fair tool to ID criminals quickly.
A year after the system went online, data show the results have been almost exactly the opposite. Records obtained and analyzed by POLITICO show the practice failed to ID suspects a majority of the time and is disproportionately used on Black people.
We reviewed nearly a year’s worth of New Orleans facial recognition requests, sent for serious felony crimes including murder and armed robbery. In that time, New Orleans PD sent 19 requests. Of the 15 that went through:
14 were for Black suspects
9 failed to make a match
Half of the 6 matches were wrong
1 arrest was made
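To put those counts in proportion, a few lines of Python arithmetic convert them to percentages of the 15 requests that went through. Only the counts come from the records described above; the variable names and the `pct` helper are illustrative.

```python
# Counts as reported above for New Orleans PD facial recognition requests.
requests_sent = 19
requests_processed = 15
black_suspects = 14
no_match = 9
matches = requests_processed - no_match   # 6 matches
wrong_matches = matches // 2              # "half of the 6 matches were wrong"
correct_matches = matches - wrong_matches
arrests = 1


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Every rate uses the 15 processed requests as its denominator.
rates = {
    "black_suspects": pct(black_suspects, requests_processed),
    "no_match": pct(no_match, requests_processed),
    "correct_matches": pct(correct_matches, requests_processed),
    "arrests": pct(arrests, requests_processed),
}
```

On these figures, roughly 93% of processed requests targeted Black suspects, 60% produced no match, and only 20% produced a correct one.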
While it hasn’t led to any false arrests, police facial identification in New Orleans appears to confirm what civil rights advocates have argued for years: that it amplifies, rather than corrects, the underlying human biases of the authorities that use it.
U.S. lawmakers of both parties have tried for years to limit how police can use facial recognition, but have yet to enact any laws. Some states have passed limited rules, like those preventing its use on body cameras in California or banning its use in schools in New York.
A few left-leaning cities have fully banned law enforcement use of the technology. For two years, in the wake of the George Floyd protests, New Orleans was one of them.
“This department hung their hat on this,” said New Orleans Councilmember At-Large JP Morrell, a Democrat who voted against lifting the ban and has seen the NOPD data. Its use of the system, he says, has been “wholly ineffective and pretty obviously racist.” (NOPD denies that its usage of facial recognition is racially biased).
Politically, New Orleans’ City Council is split on facial recognition, but a slim majority of its members — alongside the police, mayor and local businesses — still support its use, despite the results of the past year.