See How AI Racially Stereotypes You

Screenshot: ImageNet-Roulette

Computers think they know who you are. Artificial intelligence algorithms can recognise objects from images, even faces. But we rarely get a peek under the hood of facial recognition algorithms. Now, with ImageNet Roulette, we can watch an AI jump to conclusions. Some of its guesses are funny, others… racist.

ImageNet Roulette was designed as part of an art and technology museum exhibit called Training Humans to show us the messy insides of the facial recognition algorithms that we might otherwise assume are straightforward and unbiased. It uses data from one of the large, standard databases used in AI research. Upload a photo, and the algorithm will show you what it thinks you are. My first selfie was labelled “nonsmoker.” Another was just labelled “face.” Our editor-in-chief was labelled a “psycholinguist.” Our social editor was tagged “swot, grind, nerd, wonk, dweeb.” Harmless fun, right?

But then I tried a photo of myself in darker lighting and it came back tagged “Black, Black person, blackamoor, Negro, Negroid.” In fact, that seems to be the AI’s label for anyone with dark skin. It gets worse: in Twitter threads discussing the tool, people of colour are consistently getting that tag along with others like “mulatto”, “orphan” and “rape suspect.”

These categories are in the original ImageNet/WordNet database, not added by the makers of the ImageNet Roulette tool. Here’s the note from the latter:

ImageNet Roulette regularly classifies people in dubious and cruel ways. This is because the underlying training data contains those categories (and pictures of people that have been labelled with those categories). We did not make the underlying training data responsible for these classifications. We imported the categories and training images from a popular data set called ImageNet, which was created at Princeton and Stanford University and which is a standard benchmark used in image classification and object detection.

ImageNet Roulette is meant in part to demonstrate how various kinds of politics propagate through technical systems, often without the creators of those systems even being aware of them.

Where do these labels come from?

The tool is based on ImageNet, a database of images and labels that was, and still is, one of the biggest and most accessible sources of training data for image recognition algorithms. As Quartz reports, it was assembled from images collected online, and tagged mainly by Mechanical Turk workers — humans who classified images en masse for pennies.

Because the makers of ImageNet don’t own the photos they collected, they can’t just give them out. But if you’re curious, you can look up the photos’ tags and get a list of URLs that were the original sources of the photos. For example, “person, individual, someone, somebody, mortal, soul” > “scientist” > “linguist, linguistic scientist” > “psycholinguist” leads to this list of photos, many of which seem to have come from university faculty websites.
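To make that chain of tags concrete, here is a minimal sketch of how such a category path could be stored and walked. The `PARENT` map and the `path_to_root` helper are hypothetical names made up for illustration; the map holds only the one branch quoted above, not the real database, and this is not how ImageNet itself is necessarily queried.

```python
# Hypothetical parent map containing only the branch quoted in the article.
# Each tag points to the broader category directly above it.
PARENT = {
    "psycholinguist": "linguist, linguistic scientist",
    "linguist, linguistic scientist": "scientist",
    "scientist": "person, individual, someone, somebody, mortal, soul",
}


def path_to_root(label, parent=PARENT):
    """Return the categories from the broadest one down to `label`."""
    chain = [label]
    # Keep climbing until we reach a category with no recorded parent.
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return list(reversed(chain))


if __name__ == "__main__":
    print(" > ".join(path_to_root("psycholinguist")))
```

Every narrow tag inherits its place from the broader ones above it, which is why a person-shaped image can end up anywhere under the “person” branch.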

Browsing those images gives us a peek into what has happened here. The psycholinguists tend to be white folks photographed in that faculty headshot sort of way. If your photo looks like theirs, you may be tagged a psycholinguist. Likewise, other tags depend on how similar you look to training images with those tags. If you are bald, you may be tagged as a skinhead. If you have dark skin and fancy clothes, you may be tagged as wearing African ceremonial clothing.
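The “looks like the training images” idea can be sketched with a toy nearest-neighbour lookup. This is a deliberate simplification and an assumption on my part: the real tool almost certainly uses a trained neural network, not this method, and the two-number “features”, the `TRAINING` list and the `nearest_tag` function are all invented for illustration.

```python
import math

# Invented training set: each entry is (feature vector, tag).
# Imagine the two numbers as something like "headshot framing" and
# "plain office background"; the values are made up.
TRAINING = [
    ((0.9, 0.8), "psycholinguist"),
    ((0.8, 0.9), "psycholinguist"),
    ((0.2, 0.1), "nonsmoker"),
    ((0.1, 0.3), "nonsmoker"),
]


def nearest_tag(features, training=TRAINING):
    """Return the tag of the training example closest to `features`."""
    # The classifier knows nothing about what a tag means. It only
    # measures distance to examples that humans already labelled.
    _, tag = min(training, key=lambda example: math.dist(example[0], features))
    return tag


if __name__ == "__main__":
    print(nearest_tag((0.85, 0.85)))
```

Nothing in the lookup checks whether the tag is true of the new photo. Whatever patterns (and prejudices) sit in the labelled examples come straight back out.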

Far from being an unbiased, objective algorithm, ImageNet reflects the biases in the images that its creators collected, in the society that produced those images, in the mTurk workers’ minds, in the dictionaries that provided the words for the labels. Some person or computer long ago put “blackamoor” into an online dictionary, but since then many somebodies must have seen “blackamoor” in their AI’s tags (in freaking 2019!) and didn’t say, wow, let’s remove this. Yes, algorithms can be racist and sexist, because they learned it from watching us, alright? They learned it from watching us.


Comments

    I do not think it is surprising. AI is based on algorithms. The latter are based on a series of assumptions which you then program into machine code. The assumptions behind the profiles are mostly general observations which will be racist. If we then add the many other labels that modern society uses, such as trans, etc., it will be very interesting to see how AI will deal with this.

    Seems to me the problem is that the tags are too specific and they're not teaching the AI what the tag actually applies to. For example, if you have a tag of "doctor" and it's mostly white men you need to teach the algorithm what a doctor actually is. Like making it look for *other* medical indicators, like scrubs or medical equipment, not just letting it assume all white men are doctors.

    Similarly the tags need to be hierarchical. Assuming that we're classifying humans, male/female is a pretty high level tag. Same with other obvious physical traits, young/old, black/white and so on. Since they rely on (usually) fairly obvious traits there is no weird bias at play. So the AI could examine an image and say "male, young, Asian". Other tags (like profession, or "rape suspect") become lower priority and again, would need to rely on other indicators.
