
Hitting the Books: The bias behind AI assistants' failure to understand accents

Can a new technology really be revolutionary if it isn't universally accessible?


The age of speaking to our computers just as we do to other humans is finally upon us, but voice-activated assistants like Siri, Alexa, and Google Home haven't proven quite as revolutionary — or inclusive — as we'd hoped they'd be. While these systems make a commendable effort to accurately interpret commands whether you picked up your accent in Houston or Hamburg, users with heavier or less common accents, such as Caribbean or Cockney, routinely find their requests ignored. In her essay "Siri Disciplines" from Your Computer Is on Fire, published by the MIT Press, Towson University professor Dr. Halcyon M. Lawrence examines some of the more glaring shortcomings of this nascent technology, how those preventable failures have effectively excluded a sizeable number of potential users, and the Western biases underpinning the issue.

Your Computer Is on Fire
MIT Press

Excerpted from “Your Computer Is on Fire,” copyright © 2021, edited by Thomas S. Mullaney, Benjamin Peters, Mar Hicks, and Kavita Philip. Used with permission of the publisher, MIT Press.


Voice technologies are routinely described as revolutionary. Aside from the technology’s ability to recognize and replicate human speech and to provide a hands-free environment for users, these revolutionary claims, by tech writers especially, emerge from a number of trends: the growing numbers of people who use these technologies, the increasing sales volume of personal assistants like Amazon’s Alexa or Google Home, and the expanding number of domestic applications that use voice. If you’re a regular user (or designer) of voice technology, then the aforementioned claim may resonate with you, since it is quite possible that your life has been made easier because of it. However, for speakers with a nonstandard accent (for example, African-American vernacular or Cockney), virtual assistants like Siri and Alexa are unresponsive and frustrating — there are numerous YouTube videos that demonstrate and even parody these cases. For me, a speaker of Caribbean English, there is “silence” when I speak to Siri; this means that there are many services, products, and even pieces of information that I am not able to access using voice commands. And while I have other ways of accessing these services, products, and information, what is the experience of accented speakers for whom speech is the primary or singular mode of communication? This so-called “revolution” has left them behind. In fact, Mar Hicks pushes us to consider that any technology that reinforces or reinscribes bias is not revolutionary but oppressive. The fact that voice technologies do nothing to change existing “social biases and hierarchies,” but instead reinforce them, means that these technologies, while useful to some, are in no way revolutionary.

One might argue that these technologies are nascent, and that more accents will be supported over time. While this might be true, the current trends aren’t compelling. Here are some questions to consider: first, why has accent support been developed primarily for standard varieties of English in Western cultures (such as American, Canadian, and British English)? Second, for non-Western cultures for which nonstandard accent support has been developed (such as Singaporean and Hinglish), what is driving these initiatives? Third, why hasn’t there been any nonstandard accent support for minority speakers of English? Finally, what adjustments — and at what cost — must standard and foreign-accented speakers of English make to engage with existing voice technologies?

In his slave narrative, Olaudah Equiano wrote, “I have often taken up a book, and have talked to it, and then put my ears to it, when alone, in hopes it would answer me; and I have been very much concerned when I found it remained silent.” Equiano’s experience with the traditional interface of a book mirrors the silence that nonstandard and foreign speakers of English often encounter when they try to interact with speech technologies like Apple’s Siri, Amazon’s Alexa, or Google Home. Premised on the promise of natural language use for speakers, these technologies encourage their users not to alter their language patterns in any way for successful interactions. If you possess a foreign accent or speak in a dialect, speech technologies practice a form of “othering” that is biased and disciplinary, demanding a form of postcolonial assimilation to standard accents that “silences” the speaker’s sociohistorical reality.

Because these technologies have not been fundamentally designed to process nonstandard and foreign-accented speech, speakers often have to make adjustments to their speech — that is, change their accents — to reduce recognition errors. The result is the sustained marginalization and delegitimization of nonstandard and foreign-accented speakers of the English language. This forced assimilation is particularly egregious given that the number of second-language speakers of English has already exceeded the number of native English-language speakers worldwide. The number of English as a Second Language (ESL) speakers will continue to increase as English is used globally as a lingua franca to facilitate commercial, academic, recreational, and technological activities. One implication of this trend is that, over time, native English speakers may exert less influence over the lexical, syntactic, and semantic structures that govern the English language. We are beginning to witness the emergence of hybridized languages like Spanglish, Konglish, and Hinglish, to name a few. Yet despite this trend and the obvious implications, foreign-accented and nonstandard-accented speech is marginally recognized by speech-mediated devices.

Gluszek and Dovidio define an accent as a “manner of pronunciation with other linguistic levels of analysis (grammatical, syntactical, morphological, and lexical), more or less comparable with the standard language.” Accents are particular to an individual, location, or nation, identifying where we live (through geographical or regional accents, like Southern American, Black American, or British Cockney, for example), our socioeconomic status, our ethnicity, our caste, our social class, or our first language. The preference for one’s own accent is well documented. Individuals view people with accents similar to their own more favorably than people with different accents. Research has demonstrated that even babies and children show a preference for their native accent. This is consistent with the theory that similarity in attitudes and features affects both the communication processes and the perceptions that people form about each other.

However, with accents, similarity attraction is not always the case. Researchers have been challenging the similarity-attraction principle, suggesting that it is rather context-specific and that cultural and psychological biases can often lead to positive perceptions of non-similar accents. Dissimilar accents sometimes carry positive stereotypes which lead to positive perceptions of the speech or speaker. Studies also show that even as listeners are exposed to dissimilar accents, they show a preference for standard accents, like standard British English as opposed to nonstandard varieties like Cockney or Scottish accents.

On the other hand, non-similar accents are not always perceived positively, and foreign-accented speakers face many challenges. For example, Flege notes that speaking with a foreign accent entails a variety of possible consequences for second-language (L2) learners, including accent detection, diminished acceptability, diminished intelligibility, and negative evaluation. Perhaps one of the biggest consequences of having a foreign accent is that L2 users oftentimes have difficulty making themselves understood because of pronunciation errors. Even accented native speakers (speakers of variants of British English, like myself, for example) experience similar difficulty because of differences in pronunciation.

Lambert et al. produced one of the earliest studies on language attitudes that demonstrated language bias. Since then, research has consistently demonstrated negative perceptions about speech produced by nonnative speakers. As speech moves closer to unaccented, listener perceptions become more favorable, and as speech becomes less similar, listener perceptions become less favorable; said another way, the stronger the foreign accent, the less favorable the speech.

Nonnative speech evokes negative stereotypes such that speakers are perceived as less intelligent, less loyal, less competent, poor speakers of the language, and as having weak political skill. But the bias doesn’t stop at perception, as discriminatory practices associated with accents have been documented in housing, employment, court rulings, lower-status job positions, and, for students, the denial of equal opportunities in education.

Despite the documented ways in which persons who speak with an accent routinely experience discriminatory treatment, there is still very little mainstream conversation about accent bias and discrimination. In fall 2017, I received the following student evaluation from one of my students, who was a nonnative speaker of English and a future computer programmer:

I’m gonna be very harsh here but please don’t be offended — your accent is horrible. As a non-native speaker of English I had a very hard time understanding what you are saying. An example that sticks the most is you say goal but I hear ghoul. While it was funny at first it got annoying as the semester progressed. I was left with the impression that you are very proud of your accent, but I think that just like movie starts [sic] acting in movies and changing their accent, when you profess you should try you speak clearly in US accent so that non-native students can understand you better.

While I was taken aback, I shouldn’t have been. David Crystal, a respected and renowned British linguist who is a regular guest on a British radio program, said that people would write in to the show to complain about pronunciations they didn’t like. He states, “It was the extreme nature of the language that always struck me. Listeners didn’t just say they ‘disliked’ something. They used the most emotive words they could think of. They were ‘horrified,’ ‘appalled,’ ‘dumbfounded,’ ‘aghast,’ ‘outraged,’ when they heard something they didn’t like.” Crystal goes on to suggest that reactions are so strong because one’s pronunciation (or accent) is fundamentally about identity. It is about race. It is about class. It is about one’s ethnicity, education, and occupation. When a listener attends to another’s pronunciation, they are ultimately attending to the speaker’s identity.

As I reflected on my student’s “evaluation” of my accent, it struck me that this comment would have incited outrage had it been made about the immutable characteristics of one’s race, ethnicity, or gender; yet when it comes to accents, there is an acceptability about the practice of accent bias, in part because accents are seen as a mutable characteristic of a speaker, changeable at will. As my student noted, after all, movie stars in Hollywood do it all the time, so why couldn’t I? Although individuals have demonstrated the ability to adopt and switch between accents (called code switching), to do so should be a matter of personal choice, as accent is inextricable from one’s identity. To put upon another an expectation of accent change is oppressive; to create conditions where accent choice is not negotiable by the speaker is hostile; to impose an accent upon another is violent.

One domain where accent bias is prevalent is in seemingly benign devices such as public address systems and banking and airline menu systems, to name a few; but the lack of diversity in accents is particularly striking in personal assistants like Apple’s Siri, Amazon’s Alexa, and Google Home. For example, while devices like PA systems require listeners only to comprehend standard accents, personal assistants require not only comprehension but also the performance of standard accents by users. Therefore, these devices demand that the user assimilate to standard Englishes — a practice that, in turn, alienates nonnative and nonstandard English speakers.