
Hitting the Books: How biased AI can hurt users or boost a business's bottom line

“Why would you build an expensive algorithm if you can’t bias it in your favor?” - Robert Crandall, former president of American Airlines


I'm not sure why people are worried about AI surpassing humanity's collective intellect any time soon; we can't even get the systems we have today to quit emulating some of our more ignoble tendencies. Or rather, perhaps we humans must first detangle ourselves from these very same biases before expecting to see them eliminated from our algorithms.

In A Citizen's Guide to Artificial Intelligence, John Zerilli leads a host of prominent researchers and authors in the field of AI and machine learning to present readers with an approachable, holistic examination of the history and current state of the art, the potential benefits of and challenges facing ever-improving AI technology, and how this rapidly advancing field could influence society for decades to come.

A Citizen's Guide to AI by John Zerilli (MIT Press)

Excerpted from “A Citizen's Guide to AI” Copyright © 2021 by John Zerilli with John Danaher, James Maclaurin, Colin Gavaghan, Alistair Knott, Joy Liddicoat and Merel Noorman. Used with permission of the publisher, MIT Press.


Human bias is a mix of hardwired and learned biases, some of which are sensible (such as “you should wash your hands before eating”), and others of which are plainly false (such as “atheists have no morals”). Artificial intelligence likewise suffers from both built-in and learned biases, but the mechanisms that produce AI’s built-in biases are different from the evolutionary ones that produce the psychological heuristics and biases of human reasoners.

One group of mechanisms stems from decisions about how practical problems are to be solved in AI. These decisions often incorporate programmers’ sometimes-biased expectations about how the world works. Imagine you’ve been tasked with designing a machine learning system for landlords who want to find good tenants. It’s a perfectly sensible question to ask, but where should you go looking for the data that will answer it? There are many variables you might choose to use in training your system — age, income, sex, current postcode, high school attended, solvency, character, alcohol consumption? Leaving aside variables that are often misreported (like alcohol consumption) or legally prohibited as discriminatory grounds of reasoning (like sex or age), the choices you make are likely to depend at least to some degree on your own beliefs about which things influence the behavior of tenants. Such beliefs will produce bias in the algorithm’s output, particularly if developers omit variables which are actually predictive of being a good tenant, and so harm individuals who would otherwise make good tenants but won’t be identified as such.
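To make the point concrete, here is a minimal sketch in Python, using made-up data and hypothetical column names rather than any real tenant-screening product. It shows how the developer's choice of training features is itself a modelling decision: leave out a genuinely predictive variable, keep a proxy, and the model's scores quietly inherit that choice.

```python
# Minimal sketch (hypothetical data and column names): the developer decides
# which variables the model ever gets to see.
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Toy application records; "paid_on_time" is the outcome we want to predict.
applicants = pd.DataFrame({
    "income":        [32_000, 54_000, 41_000, 28_000, 67_000, 39_000],
    "postcode_rank": [3, 1, 2, 4, 1, 3],               # proxy variable; may encode neighbourhood bias
    "tenure_years":  [1.0, 4.5, 2.0, 0.5, 6.0, 3.0],   # genuinely predictive, but easy to omit
    "paid_on_time":  [0, 1, 1, 0, 1, 1],
})

# The feature list reflects the developer's beliefs about what matters.
# Omitting "tenure_years" while keeping "postcode_rank" skews the output.
chosen_features = ["income", "postcode_rank"]

model = LogisticRegression()
model.fit(applicants[chosen_features], applicants["paid_on_time"])

new_applicant = pd.DataFrame({"income": [36_000], "postcode_rank": [4]})
print(model.predict_proba(new_applicant[chosen_features]))  # score reflects only the chosen features
```

Nothing in the code flags that a useful predictor was left out; the model simply scores applicants on whatever it was given, and good tenants who would have been identified by the missing variable never get the chance.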

The same problem will appear again when decisions have to be made about the way data is to be collected and labeled. These decisions often won’t be visible to the people using the algorithms. Some of the information will be deemed commercially sensitive. Some will just be forgotten. The failure to document potential sources of bias can be particularly problematic when an AI designed for one purpose gets co-opted in the service of another — as when a credit score is used to assess someone’s suitability as an employee. The danger inherent in adapting AI from one context to another has recently been dubbed the “portability trap.” It’s a trap because it has the potential to degrade both the accuracy and fairness of the repurposed algorithms.

Consider also a system like TurnItIn. It’s one of many anti-plagiarism systems used by universities. Its makers say that it trawls 9.5 billion web pages (including common research sources such as online course notes and reference works like Wikipedia). It also maintains a database of essays previously submitted through TurnItIn that, according to its marketing material, grows by more than fifty thousand essays per day. Student-submitted essays are then compared with this information to detect plagiarism. Of course, there will always be some similarities if a student’s work is compared to the essays of large numbers of other students writing on common academic topics. To get around this problem, its makers chose to compare relatively long strings of characters. Lucas Introna, a professor of organization, technology and ethics at Lancaster University, claims that TurnItIn is biased.

TurnItIn is designed to detect copying, but all essays contain something like copying. Paraphrasing is the process of putting other people’s ideas into your own words, demonstrating to the marker that you understand the ideas in question. It turns out that there’s a difference in the paraphrasing of native and nonnative speakers of a language. People learning a new language write using familiar and sometimes lengthy fragments of text to ensure they’re getting the vocabulary and structure of expressions correct. This means that the paraphrasing of nonnative speakers of a language will often contain longer fragments of the original. Both groups are paraphrasing, not cheating, but the nonnative speakers get persistently higher plagiarism scores. So a system designed in part to minimize biases from professors unconsciously influenced by gender and ethnicity seems to inadvertently produce a new form of bias because of the way it handles data.
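Here is a rough illustration of the mechanism, not TurnItIn's actual algorithm: a matcher that only counts shared character runs above some minimum length will land harder on writers whose paraphrases reuse longer fragments of the original wording.

```python
# Illustrative sketch only: a crude overlap score that ignores short matches
# and counts only long shared runs of characters.
from difflib import SequenceMatcher

def overlap_score(source: str, essay: str, min_run: int = 20) -> float:
    """Fraction of the essay covered by shared runs of at least `min_run` characters."""
    matcher = SequenceMatcher(None, source, essay, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks()
                  if block.size >= min_run)
    return matched / max(len(essay), 1)

source = ("The industrial revolution transformed patterns of work and settlement "
          "across Europe during the nineteenth century.")

# Fluent paraphrase: mostly reworded, so only short fragments survive.
native = ("Work and where people lived changed dramatically in Europe "
          "as industry expanded through the 1800s.")

# Paraphrase that reuses a longer chunk of the original phrasing.
nonnative = ("The industrial revolution transformed patterns of work and settlement "
             "in many countries, and this happened in the nineteenth century.")

print(overlap_score(source, native))     # low: no long shared run
print(overlap_score(source, nonnative))  # higher: one long shared run
```

Both toy essays restate the same idea in the writer's own words, but only the one built from longer borrowed fragments crosses the length threshold and gets flagged.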

There’s also a long history of built-in biases deliberately designed for commercial gain. One of the greatest successes in the history of AI is the development of recommender systems that can quickly and efficiently find consumers the cheapest hotel, the most direct flight, or the books and music that best suit their tastes. The design of these algorithms has become extremely important to merchants — and not just online merchants. If the design of such a system meant your restaurant never came up in a search, your business would definitely take a hit. The problem gets worse the more recommender systems become entrenched and effectively compulsory in certain industries. It can set up a dangerous conflict of interest if the same company that owns the recommender system also owns some of the products or services it’s recommending.
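The mechanics of that conflict are easy to sketch. The toy ranker below uses hypothetical listings, not any real company's code, to show how a quiet boost for the platform owner's own products reorders an otherwise price-sorted page.

```python
# Minimal sketch (hypothetical data): a neutral price ranking versus one that
# quietly favours the platform owner's own listings.
from dataclasses import dataclass

@dataclass
class Listing:
    name: str
    price: float
    owned_by_platform: bool

listings = [
    Listing("Rival Hotel A", 95.0, False),
    Listing("Rival Hotel B", 99.0, False),
    Listing("Platform's Own Hotel", 110.0, True),
]

def neutral_rank(items):
    # Cheapest first, no thumb on the scale.
    return sorted(items, key=lambda x: x.price)

def biased_rank(items, owner_boost=20.0):
    # Subtracting a boost from the owner's effective price pushes its
    # listings up the page even when they are worse deals for the user.
    return sorted(items, key=lambda x: x.price - (owner_boost if x.owned_by_platform else 0.0))

print([x.name for x in neutral_rank(listings)])
print([x.name for x in biased_rank(listings)])
```

The user never sees the boost; they just see a page on which the owner's product happens to sit at the top.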

This problem was first documented in the 1960s after the launch of the SABRE airline reservation and scheduling system jointly developed by IBM and American Airlines. It was a huge advance over call center operators armed with seating charts and drawing pins, but it soon became apparent that users wanted a system that could compare the services offered by a range of airlines. A descendant of the resulting recommender engine is still in use, driving services such as Expedia and Travelocity. It wasn’t lost on American Airlines that their new system was, in effect, advertising the wares of their competitors, so they set about investigating ways in which search results could be presented so that users would more often select American Airlines. Although the system would be driven by information from many airlines, it would systematically bias the purchasing habits of users toward American Airlines. Staff called this strategy screen science.

American Airlines’ screen science didn’t go unnoticed. Travel agents soon spotted that SABRE’s top recommendation was often worse than those further down the page. Eventually the president of American Airlines, Robert L. Crandall, was called to testify before Congress. Astonishingly, Crandall was completely unrepentant, testifying that “the preferential display of our flights, and the corresponding increase in our market share, is the competitive raison d’être for having created the [SABRE] system in the first place.” Crandall’s justification has been christened “Crandall’s complaint,” namely, “Why would you build and operate an expensive algorithm if you can’t bias it in your favor?”

Looking back, Crandall’s complaint seems rather quaint. There are many ways recommender engines can be monetized. They don’t need to produce biased results in order to be financially viable. That said, screen science hasn’t gone away. There continue to be allegations that recommender engines are biased toward the products of their makers. Ben Edelman collated all the studies in which Google was found to promote its own products via prominent placements in such results. These include Google Blog Search, Google Book Search, Google Flight Search, Google Health, Google Hotel Finder, Google Images, Google Maps, Google News, Google Places, Google+, Google Scholar, Google Shopping, and Google Video.

Deliberate bias doesn’t only influence what you are offered by recommender engines. It can also influence what you’re charged for the services recommended to you. Search personalization has made it easier for companies to engage in dynamic pricing. In 2012, an investigation by the Wall Street Journal found that the recommender system employed by the travel company Orbitz appeared to be recommending more expensive accommodation to Mac users than to Windows users.
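A toy version of that kind of steering, purely illustrative and not Orbitz's actual system, might look like this: once a site can infer something about the user, such as the operating system from the User-Agent header, it can reorder what that user sees.

```python
# Illustrative sketch (hypothetical): price steering keyed off the User-Agent.
hotels = [
    {"name": "Budget Inn", "price": 79.0},
    {"name": "Midtown Suites", "price": 129.0},
    {"name": "Harbour Grand", "price": 210.0},
]

def recommend(hotels, user_agent: str):
    # Steering, not surcharging: the prices themselves are unchanged,
    # but users on one platform see the expensive options first.
    pricier_first = "Macintosh" in user_agent
    return sorted(hotels, key=lambda h: h["price"], reverse=pricier_first)

print([h["name"] for h in recommend(hotels, "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15)")])
print([h["name"] for h in recommend(hotels, "Mozilla/5.0 (Windows NT 10.0; Win64; x64)")])
```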