How Math Can Be Racist: Giraffing
You may have heard about AOC catching a lot of flak from conservatives for claiming that computer algorithms can be biased, in the sense of being racist, sexist, et cetera. How, these people asked, can something made of math be biased? It's math, so it must be objectively correct, right?
Well, any computer scientist or experienced programmer knows right away that being "made of math" does not demonstrate anything about the accuracy or utility of a program. Math is a lot more of a social construct than most people think. But we don't need to spend years taking classes in algorithms to understand how and why the types of algorithms used in artificial intelligence systems today can be tremendously biased. Here, look at these four photos. What do they have in common?
You're probably thinking "they're all outdoors, I guess...?" But they have something much more profound in common than that. They're all photos of giraffes!
At least, that's what Microsoft's world-class, state-of-the-art artificial intelligence claimed when shown each of these pictures. You don't see any giraffes? Well, the computer said so. It used math to come to this conclusion. Lots of math. And data! This AI learns from photographs, which of course depict the hard truth of reality. Right?
It turns out that mistaking things for giraffes is a very common issue with computer vision systems. How? Why? It's quite simple. Humans universally find giraffes very interesting. How many depictions of a giraffe have you seen in your life? And how many actual giraffes have you seen? Many people have seen one or two, if they're lucky. But can you imagine seeing a real giraffe and not stopping to take a photo? Everyone takes a photo if they see a giraffe. It's a giraffe!
The end result is that giraffes are vastly overrepresented in photo databases compared to the real world. Artificial intelligence systems are trained on massive amounts of "real world data" such as labeled photos. This means the learning algorithms see a lot of giraffes... and they come to the conclusion that is mathematically correct for that data: giraffes are everywhere. One should reasonably expect there might be a giraffe in any random image.
Look at the four photos again. Each of them contains a strong vertical element. The computer vision system has incorrectly come to believe that long, near-vertical lines in general are very likely to be a giraffe's neck. This might be a "correct" adaptation if the vision system's only task was sorting pictures of zoo animals. But since its goal is to recognize everything in the real world, it's a very bad adaptation. In a random real-world image, a giraffe is actually very unlikely.
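If you want to see the arithmetic behind this, here is a minimal sketch using Bayes' rule. Every number in it is made up for illustration, and the function and variable names are hypothetical; it is not how Microsoft's system (or any real vision model) is implemented. It only shows that the same evidence, a strong vertical line, leads to very different conclusions depending on how common giraffes were in the data the system learned from.

```python
def posterior_giraffe(prior, p_line_given_giraffe, p_line_given_other):
    """Bayes' rule: P(giraffe | image has a strong vertical line)."""
    evidence = (prior * p_line_given_giraffe
                + (1 - prior) * p_line_given_other)
    return prior * p_line_given_giraffe / evidence


# All values below are invented assumptions, not measurements.
P_LINE_GIVEN_GIRAFFE = 0.9   # giraffe photos nearly always show a long neck
P_LINE_GIVEN_OTHER = 0.02    # other photos sometimes have poles, trees, etc.

dataset_prior = 0.05         # hypothetical share of giraffes in a photo dataset
real_world_prior = 0.00001   # hypothetical share of giraffes in everyday scenes

in_dataset = posterior_giraffe(
    dataset_prior, P_LINE_GIVEN_GIRAFFE, P_LINE_GIVEN_OTHER)
in_world = posterior_giraffe(
    real_world_prior, P_LINE_GIVEN_GIRAFFE, P_LINE_GIVEN_OTHER)

print(f"P(giraffe | vertical line), learned from the dataset: {in_dataset:.3f}")
print(f"P(giraffe | vertical line), in the real world:        {in_world:.5f}")
```

With these invented numbers, a system that trusts the dataset's giraffe rate decides a vertical line is more likely a giraffe than not, while the same rule fed a realistic rate says a giraffe is almost certainly not there. The math is identical in both cases; only the data changed.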
Now, here's the clincher: there are thousands and thousands of things that are over-represented or under-represented in photo databases. The AI is thoroughly giraffed in more ways than we could possibly guess or anticipate. How do you even measure such a thing? You only have the data you have: the dataset you trained the AI with in the first place.
This is how computer algorithms "made of math" can be sexist, racist, or prejudiced in any other way that a human can be. Face photo datasets are highly biased towards certain types of appearances. Datasets about what demographics are most likely to commit crimes were assembled by humans who may have made fundamentally racist decisions about who did and didn't commit a crime. All datasets have their giraffes. Here's a real-world example where the giraffe was the name "Jared."
Any time "a computer" or "math" is involved in making decisions, you need to ask yourself: what's been giraffed up this time?
Thanks to Janelle Shane, whose tweet showing her asking an AI how many giraffes are in the photograph of The Dress prompted this post.
Please note that Microsoft does try to take steps to correct its computer vision system's errors, so the detections for the above photos may have improved since they were first evaluated by @picdescbot. (They did all still register as giraffes on 31 Jan 2019.)