How a Weibo post gets censored: what keywords trigger the automatic review filters
I wrote a report for The Citizen Lab's blog on the various ways a post can be censored on Weibo (see image above), with a particular emphasis on the "automatic review" filters. The preliminary results identify 66 keywords that cannot be posted to Weibo and 133 keywords that cause posts to be invisible and camouflaged (still shown in the author's own timeline but hidden from other users).
Below is my summary and links to the data/screenshots (which include lists of the keywords which trigger various kinds of censorship). Hopefully more research will be done into these other paths of censorship.
(For more of my recent writing, see: Guernica interview with Evan Osnos, Technology Review essay on what memes can and can't do in the fight against censorship, World Policy Journal piece on the challenges Chinese tech companies face, Wall Street Journal article on WeChat, and LA Review of Books review of a book on Internet activism.)
Summary:
Part 1: Weibo has removed its conventional censorship notice from searches on the site. This may be a bellwether for enhanced censorship during a particularly sensitive period in Chinese politics, or it may mark a shift toward more obscured censorship. Or it may simply be Weibo testing out new tactics.
Part 2: Automatic review of content refers to moderating messages before they circulate widely. A number of studies have described these mechanisms, which include keywords that cause a post to be hidden and keywords that prevent it from being posted in the first place. We sought to outline more clearly the pathways a Weibo post might take, from submission to potential censorship.
Part 3: We accomplished the above by posting sensitive keywords and tracking whether and how they were censored, based on a number of factors. This is only a preliminary set of tests, and it will hopefully serve as a basic methodology for others interested in more rigorous testing.
Data and screenshots: We’ve posted to GitHub the data used for this test, lists of suspected keywords which trigger automatic review, and screenshots of the various censorship messages.
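To make the pathways in Parts 2 and 3 concrete, here is a minimal sketch of how a single test post could be sorted into the categories used in this summary. It is only an illustration: the field names, the order of the checks, and the placeholder keywords are my assumptions and are not taken from the report or the published data.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    EXPLICIT_FILTER = "explicit filtering: post refused with a notice"
    IMPLICIT_FILTER = "implicit filtering: data error returned"
    DELETED = "post deleted after publication"
    CAMOUFLAGED = "camouflaged: visible to the author only"
    VISIBLE = "no censorship observed"


@dataclass
class Observation:
    # Every field name here is hypothetical, chosen for illustration only.
    keyword: str
    refused_with_notice: bool = False   # posting blocked with a message
    data_error: bool = False            # posting failed with a data error
    later_deleted: bool = False         # post was removed after going up
    in_own_timeline: bool = True        # author still sees the post
    in_other_users_view: bool = True    # a second account sees the post


def classify(obs: Observation) -> Outcome:
    """Map one observation to an outcome (the check order is an assumption)."""
    if obs.refused_with_notice:
        return Outcome.EXPLICIT_FILTER
    if obs.data_error:
        return Outcome.IMPLICIT_FILTER
    if obs.later_deleted:
        return Outcome.DELETED
    if obs.in_own_timeline and not obs.in_other_users_view:
        return Outcome.CAMOUFLAGED
    return Outcome.VISIBLE


if __name__ == "__main__":
    # Placeholder keywords, not real test data.
    samples = [
        Observation("keyword_a", refused_with_notice=True),
        Observation("keyword_b", in_other_users_view=False),
        Observation("keyword_c"),
    ]
    for obs in samples:
        print(obs.keyword, "->", classify(obs).name)
```

The camouflaged case is the one that needs a second account: the post looks normal in the author's own timeline, so it can only be detected by comparing that view with what another user sees.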
Data:
Part 1: Weibo search data of CDT keywords, tested May 2014 and Nov 2014 (CSV)
Parts 2 & 3: Weibo censorship testing data of probable automatic review keywords, tested Nov 8, 2014 and Nov 10, 2014
66 keywords which cannot be posted to Weibo according to our preliminary test (explicit filtering)
14 keywords which return a data error (implicit filtering)
133 keywords which cause posts to be invisible and camouflaged
Screenshots:
Explicit filtering message (box 1 in Figure 1): Chinese | English
Implicit filtering message (box 3 in Figure 1): Chinese | English
Comparison of missing messages in timeline due to “camouflaged”/invisible posts (box 2A1 in Figure 1): own timeline vs when viewed by another user
Message when visiting a deleted Weibo post (box 4a in Figure 1): screenshot
Post deletion message in inbox (box 4a in Figure 1): screenshot
Account warning, 48 hour ban (box 4C in Figure 1): screenshot
Account abnormal notice (box 4C in Figure 1): screenshot








