Note: links in the video transcript go directly to the corresponding commentary, and vice versa. Each excerpted video is indicated with a hyphen; speaker changes are indicated with line breaks and with parenthetical speaker indicators when necessary. When videos provided their own subtitles, these are indicated with “[onscreen subtitles]”. Even with variant spellings, punctuation, and unintuitive translations into English, these self-selected subtitles are preferred, since they navigate accent and language, which is often what’s at issue with Siri. Transcriptions are notorious places for subtle linguistic oppression to take place (Lueck, 2017, "Discussion" section; McFarlane & Snell, 2017). Conducting the analysis by video is one way to preserve the complex and ambiguous aspects of the YouTube parodies.
This project brings together for the first time a range of YouTube videos that parody Apple’s voice assistant Siri. Part 1 explored how the YouTubers responsible for these videos all creatively reimagined Siri to be more like them. Part 2 is now devoted to the negative case: what do we miss out on by not having identity-based Siris? What’s lacking or problematic about today’s Siri?
Overall, there are two big arguments.
The first argument is very simple, and revolves around who Siri works well for, who Siri can hear and understand. From the beginning, Siri was promised as being flexible in how it understands requests:
-[Apple presenter] "We want to talk to it any way we’d like. Someone might ask, ‘Will it rain in Cupertino?’ Or ‘Is the weather gonna get worse today?’" (Apple Special Event 2011)
But we already suspect from the California-centered variants that this flexibility only extends so far. According to these parodies, Siri doesn’t understand people with certain accents, like Italian and Chinese, or people who speak non-standard English dialects, like Hawaiian Pidgin English or Scottish. Take this example that pits Hawaiian Pidgin English “vs” standard Siri:
-[on-screen text] “Eh, Siri, try tell Da Kine cancel da pizza… But we need one nadda pound Poke” (Siri vs Hawaiian Pidgin English)
Obviously, this request differs from Standard English in how sounds are pronounced, but it also differs in how words are arranged and in which expressions are used, some of which Siri may not know. Humorously, of course, an exaggeratedly bad Siri gets lost in all this difference:
-[on-screen text] [Siri] "I don’t understand. Do you mean cancerous leeches between one otter’s brown boto and one ballsack green marbles?"
Siri’s vulgar mishearing also dramatizes how it feels to be made fun of and “unheard,” so to speak. In other words, in these parodies, Siri’s uneven performance is not just a technological failure, but a social one that reinscribes exclusive patterns from the past.
A few of the videos make this explicit. In a mock Apple commercial, Davy So calls Siri’s performance disparity “racist”:
-“Did you know Siri is racist? ... See if Siri recognizes you with your accent.” (Introducing the Iphone 5c and 5s)
He demonstrates several accents stereotypical of different Asian nationalities.
-[Korean, Chinese, Vietnamese, Indian, with onscreen labels]
If Siri can’t understand some accents, it excludes the people who speak with them. The term “racist” ties that exclusion to a history of discrimination.
Similarly, in the Jewish Siri, what is at first a problem of pronunciation
-“Siri, play avrohaim fried”
[Siri] “Searching for Afghanistan Freed by Mohammed…” (The Jewish Siri)
is quickly clarified as a problem of ethnic acceptance:
-“Siri, play Av-ro-ham Fried.”
[Siri] “That name is too Jewish for me to understand.”
These statements are exaggerations. This isn’t what the “real” Siri says; rather, the parody pulls out a thread about how it feels to not be included and extrapolates it to reference a long history of anti-Semitism.
People are not powerless when they’re misheard, and they compensate in various ways, but to add insult to injury, Siri doesn’t understand that, either. We see this in a mix of scripted and unscripted parodies. The Hawaiian video tries one way to compensate: explaining a concept or phrase that Siri might not know:
-“tako poke: cut up octopus with onions” (Siri vs Hawaiian Pidgin English)
A Scottish parody tries this as well:
-“that’s just bread with chips” (Apple Scotland)
In addition, we see people simplifying their requests;
-“forget about the pint” (Apple Scotland)
justifying their requests;
-[Siri] “Who would you like to call?”
[Son] “Who would you like to call?”
[Mother] “I want to speak to you, please. I want to speak to you, because it’s eh, my son that’s give me the new phone, and me no know how to use it.” (Nonna Paola)
adding politeness markers like “please” and “excuse me”;
-"I want to speak to you, please." (Nonna Paola)
asking double questions;
-“Excuse me, what’s the time in Italy? Can you tell me the time please?” (Nonna Paola)
providing a leading answer;
-“I hate tattoos, you like tattoo? I don’t think so” (Nonna Paola)
and speaking a little slower:
-“What’s a highland cow?” (Siri Vz Scottish Accent)
These are not unexpected strategies, and the fact that Siri doesn’t respond to any of them suggests that Siri hasn’t been designed to interact with people who speak a different dialect or have a different accent.
Is this all on Siri? Most of these videos say yes, expressing frustration at Siri’s selective hearing.
-“Look, you fuckin cow” (Apple Scotland)
-[Siri] “I’m not sure I understand.”
“No, of course you didn’t, you stupid fuckwit.” (Siri Vz Scottish Accent)
In particular, this anger signals that there is a cost to not being understood.
That said, it’s worth mentioning that in three cases, it’s the accented speakers who need to change.
The first case is a set of 10 videos, all helpfully captioned into English and illicitly uploaded to YouTube from a Japanese TV show under the title “Funny when Japanese Try to Speak English with Siri! So Hilarious!” In each segment, a host wheels around a gigantic phone and surprises celebrities with an “English test.” If they say the word in a way that Siri recognizes, they win the match;
-"Alarm" [correct] (Funny when Japanese Try To Speak English With Siri! So Hilarious!)
otherwise, they have to try again and may lose. There’s more to say about these videos, but for the sake of this project, what matters is that they make Siri out to be the arbiter of Standard English. It’s not Siri that needs to improve, but the celebrities’ English.
The second and third cases connect being accented to being old or stupid. In Stuff Hyderabadi Moms Say to Siri, the mother’s age and corresponding lack of familiarity with using Siri are partly to blame for Siri not understanding her. Similarly, an adult son coaches his Italian mother to change the way she talks in order for Siri to understand her:
-[Mother] “Oh, come on, she talk."
[Son] "You know, you have to talk proper, you have to talk properly” (Nonna Paola)
These three cases are done playfully, teasing Japanese celebrities and old mothers for not being understood by Siri. But we can also read them as absorbing and extending, in the process, Siri’s totalizing linguistic logic.
To recap so far, these parodies collectively argue that we’re suffering from a Siri that performs unevenly. By focusing our attention on who Siri works well for, we see that these parodies consistently depict Siri as not being able to understand certain groups of people. This is in continuity with historical patterns of exclusion, such that some parodies call Siri “racist.” To make matters worse, when people try to overcome this exclusion by compensating for Siri, it doesn’t work, increasing the rejection and anger they feel. A few parodies even elicit Siri’s rejection to tease other people. Overall, then, these parodies shift the focus from a technical discussion of how Siri works to a social discussion of Siri’s impact on society. In the terms of this project, these videos suggest that we don’t need to be afraid of intentionally giving Siri an identity: today’s Siri is already a social agent that communicates identity-based inclusion and exclusion.