Transcription note: When videos provided their own subtitles, these are indicated with “[onscreen subtitles]” and are preferred (including variant spellings, punctuation, and unintuitive translations), especially since they are navigations of accent/language, which is often at issue.
In late 2011, Apple announced that a voice assistant called Siri would come with the iPhone 4S. Siri could do things like take dictation for text messages, set timers, and answer questions about the weather, stocks, and nearby restaurants.
“What is the weather like today?” “Here’s the forecast for today.” “It is that easy.” [applause] (Apple Special Event 2011)
By the end of the month, there were already clips on YouTube where people tested Siri’s ability to interact with them, with mixed results.
“The iPhone 4S is instead creating confusion … Is it a nice day?” “Let’s see what it says.” “I don’t know what you mean by, ‘Is it NSD to says.’” (iPhone has problems with Scottish accents)
“I don’t know what you mean by ‘walk.’ Which email address for, work or home?” “Work.” “Kamei, I don’t understand, ‘wall.’ Which email address for, work or home?” “Work.” (Siri vs Japanese)
Some videos began to play with this idea by dramatizing how difficult it was for them to interact with Siri. For instance, in a funny mashup of the “Shit ____ People Say” meme, the video “Stuff Hyderabadi Moms Say to Siri” shows how age, gender, status, language, and technical familiarity intersect to make a frustrating experience for an imagined Indian mother and her daughter:
[on-screen subtitles] “shri is not understanding anything. SHRI! Hello?” “Hello.” [on-screen subtitles] “Look, now she understands.” “Call my friend Bilqees, Shri.” “I can’t search near businesses, sorry about that.” (Stuff Hyderabadi Moms)
Other videos played with Siri in a different way. Instead of dramatizing challenges with the current Siri, they reimagined Siri to be more like them.
“So I went in there and customized it a little bit, to make it more personal for me. So I present to you, Black Siri.” (Black Siri)
“Well, I’ve got an app that will help you choose the right. Mormon Siri.” (Mormon Siri)
“Apple CEO Tim Cook has come out, and to celebrate, we’ve made an exciting new update to every iPhone. Say hello to iGay.” (iGay iPhone)
These reimaginations go beyond accent or national origin to examine how Siri could inhabit other identities like race, religion, and sexual orientation.
Hi, I'm Will Penman. This project brings these videos together for the first time to see what we can learn from them.
Here in Part 1, we’ll explore how the parodies make a positive argument for developing identity-specific AI (like a black Siri, or a gay Siri). This contributes to scholarly work on how we speak our way into identities, and how we pick up on AI’s impact on us. The positive argument is that AI can participate in developing people’s identities that matter to them and create points of connection with others. For people in dominant social positions, Part 1 challenges us to be generous listeners, allowing the videos’ jokes to unsettle us and revise what we think is possible; for people who have historically been unheard, I hope this part of the project helps reconceptualize how AI development could proceed more inclusively.
After this positive argument (or before, if you want to watch it now), Part 2 is devoted to how the videos make a negative argument: that there are losses from today’s bland, universalizing AI. The parodies uncover that today’s AI development operates in an identity-blind way, and they implicitly extend scholarship that critiques identity-blind ideology: namely, these parodies argue that today’s Siri reproduces a dominant mode of operation, part of which is to cover over the fact that it’s already a social agent – for better or for worse, it already participates in people’s identity construction. For people of minority identities, Part 2 is meant to amplify existing critiques of identity blindness and apply them more visibly to voice-driven AI; for people in dominant social positions, Part 2 is a chance to take our ongoing work of renouncing oppression and its fruits and extend it into artificial intelligence. In other words, we need to divest from racial, sexual, national, and religious privileges as they manifest in AI, too.
To help navigate this scholarship in video form, I’ve added a running header (arrows).
Okay, so what videos am I even talking about? There’s 21 that are one-offs; there’s one set of 2 [Muslim], one set of 3 [Mexican], and a set of 10 [Funny when] – 36 videos total, spanning 2011, when Siri was announced, to 2017, when I began analysis.
Each of these parodies came from a set of YouTube searches based on what are called “protected classes.” In the US, that’s race, religion, national origin, age, sex, pregnancy, citizenship, familial status (meaning whether you have kids or not), disability, veteran, and genetic information – and I threw in sexual orientation. Protected classes are the things that it’s illegal to discriminate about. So I thought they would be a useful high-stakes focus to examine how voice-driven AI affects people, and connect that to historical patterns of inter-group interaction. In other words, the video “Black Siri” came from the search “black siri” which was one of the searches I did regarding race.
I didn’t include news reports – I only wanted videos that dramatized interacting with Siri in some way. I also limited the corpus to videos with at least 90,000 views. In the YouTube economy, that’s not much, but for a regular person, that’s a lot and it seemed to be a natural cut-off point. To create videos with this level of popularity, most of the YouTubers have learned to effectively gather large (probably young) audiences and persuade them to keep coming back for more:
[mashup of people saying “subscribe”]
Some protected classes didn’t turn up anything – so if you want to make a parody about a pregnant Siri or a veteran Siri, it’s wide open. The ones that did fell into four categories: accent/national origin, race, religion, and sexual orientation. I’ll say more about methodology in Part 2 - that’s enough to get us going.
In all four of these categories, it matters to these YouTubers how Siri responds to them. Most of the parodies don’t bother critiquing Siri’s listening – they start by reimagining how Siri should interact with them, and thereby how Siri could support their identity.
Check out this clip from If Siri was Mexican (Part 2):
"Take me to the nearest McDonald's." "But your abuela made dinner." "Okay, but I want McDonald's." "Getting directions to Abuela's house."
This Siri has the same voice as usual, and the same lack of a body. But we interpret her as Mexican (rather than whatever the regular Siri is) because she's able to "construct" or "perform" that identity. In this case, that means using some Spanish, and enforcing a value on respecting elders and family obligations:
"But your abuela made dinner."
The guy is ambivalent about Siri's efforts (more on that later). What's important is that this example does a good job showing just what might be involved for Siri to “be” any of the identities people imagine.
As we dive into this in more detail, what we’ll see is that people are imagining AI as having the capacity to participate with them in constructing their identity. Siri isn't just transactional; what Siri knows and recommends says something about who Siri is and who they are.
 One place that Siri's identity makes a difference, as in this example, is through consumer-based activity. These Siris don't defer to Yelp for recommending businesses. They encourage people to buy, go to, wear, work at, etc. places that develop and reinforce that identity in some way. Thus, to the extent that lesbians buy certain kinds of cars, a lesbian Siri might recommend those cars:
“I've been thinking about buying a new car. What kind should I get?” “All of the cool kids are driving cars like Subaru, Outbacks, and Jeep Wranglers.” “Okay, cool” (IF SIRI WAS GAY)
To the extent that Mexicans consume certain music, Mexican Siri might recommend that music:
“Siri, play a fast-paced song.” [plays Mexican song] (If Siri was MEXICAN [Part 1])
To the extent that Mexicans wear certain clothes, Mexican Siri might recommend those clothes:
“Hey Siri, it's sunny outside. Should I wear a hat?” “Seems like it, here are some suggestions” [sombrero images]
And so on. In an illuminating example, Jewish Siri knows that a standard request isn’t specific enough for that identity:
“Where’s, uh, a good place to eat around here?” (The Jewish Siri)
Her follow-up re-ambiguates the person’s request:
“Will that be milchik or fleishik?” “Milchik”
This does more than circulate kosher laws about not mixing meat and dairy: it helps him keep kosher. If Siri was more like him, the video suggests, it would better allow him to embrace his identity in interaction with her.
Siri's reimagined identity also comes out by encouraging others not to be consumers in the “wrong” way: not to buy, go to, wear, work at, etc. certain places. So, Mexican Siri directs people away from places that aren’t “for” Mexicans:
“I wanna go to Starbucks, I don't wanna go to Taco Bell.” “Oh my god, bro. Forget Starbucks, what are you, a thirteen year old white girl? Taco Bell is the move, okay?”
And Mormon Siri encourages people not to go golfing on Sundays:
“Siri, what's the weather gonna be like at the golf course on Sunday?” “Cloudy, with a chance of going to Hell.” [he's stunned] (Mormon Siri)
(This is a good example of how an identity-based Siri isn’t the same as simply personalizing Siri. A personalized Siri would say, well, if you usually golf then, I’ll keep recommending it. But Mormon Siri’s knowledge of group standards helps reinforce this guy’s commitment to the group.)
Even within consumeristic uses, these reimagined Siris all disrupt the straightforward service role that Apple set up Siri to fill:
“This dream that you’ll be able to talk to technology and it’ll do things for us.” (Apple Special Event 2011)
In contrast, these reimagined Siris get things done less efficiently (if they do them at all). In return, they step into a role of social agent, participating in users’ identity construction.
 An identity-ful Siri, then, involves self-conscious boundaries. This expands beyond consumption-based boundaries, into values, knowledge, and ways of interacting that form people’s identities.
Robin Williams’ French Siri doesn’t just have an accent:
Call it “Sihri” (Robin Williams' Siri Impression)
she also pursues a culturally bounded value of not using technology to restrict your engagement with the natural world:
“Look around you, why must you ask a phone? Live your life! Taking pictures with your phone... Look, look and then paint!”
A Siri that speaks Hawaiian Pidgin English values assessing people’s status (including their high school):
[on-screen subtitles] "To complete set-up process, please say what high school you went grad from." "Really? Okay, Kahuku." "Kahuku? OMG fo realz?" "Yeah, why? What school you went then?" "Not Kahuku!! Hahahahaha" (HAWAIIAN PIDGIN SIRI)
Black Siri expresses sexual literacy and self-respect:
“Siri, I love you.” “N----, please” (Black Siri)
Gay Siri values healthy, attractive living:
“What's on TV tonight?” “Nu uh, this is four nights in a row, honey. Here's directions to that new club. For the love of Laverne, get out of those sweats” (iGay (iPhone Siri parody))
Siri’s knowledge is also bounded from within a certain identity. Under the stereotype that gay people don’t like sports, gay Siri doesn’t relay the score:
“Who won the hockey game last night?” “I found three lesbians that are close to you who can tell you” “How close?” (iGay Siri parody)
And other knowledge is appropriate only for certain times and places; Jewish Siri won’t tell the score either while praying:
[in temple, whispering] “Avroham Fried, what’s the score?” “No! Not in the middle of daven-ing” (The Jewish Siri)
Finally, Siri adopts specific ways of interacting. Many of the different Siris are portrayed as rude, insistent, and insulting.
-“Siri, tell me a joke.” “There was once this fool named Lamar. The end.” [laughs hysterically] “Well, that, that wasn't very nice” (Black Siri)
-“Siri, from now on, call me Papí” “Okay, from now on, I will call you ‘Puto’ [bitch]” [friend laughs] (If Siri was MEXICAN [Part 1])
Black Siris refuse to answer stupid questions.
-“You know, Martín gets in the car, ‘Siri, what's the temperature outside?’ ‘Why don't you stick your head out the window?’” [smooths his eyebrow] (Martin and Siri)
-“Siri, what's 30 times 904 divided by the square root of 9?” “Is there something wrong with your calculator? Do I look like I like math?” (Black Siri)
Muslim Siri expresses her faith in Allah specifically:
“Allah, in the name of the most affectionate, the merciful, say you, he is Allah, the one. Allah, the independent, carefree. He begot none nor was he begotten – and nor anyone is equal to him.” (Muslim Siri)
These Siris express their identities through interacting in specific ways, rather than trying to apply generally. What's important isn't just getting the "right" answer, but interacting in a culturally specific, appropriate way.
 Reimagining Siri's identity also means taking on a collective history. None of the parodies address history directly, but - at least as I'm seeing it - one video does reimagine overturning American racial power dynamics. In "Black Siri aka Siriqua," written by what seems to be a black woman and a white guy, Siri is in charge and she's going to make white users know it. This starts simply, by Siri asserting that she will determine when a request is an interruption:
"Siriqua, what will the weather be like today? Siriqua?" "[music in background] I heard you! Hold on. Now what you want?" (Black Siri aka Siriqua)
Similarly, white users don't have the power to command her:
"Well, tell me what the weather will be like today." "Don't tell me what to do."
And Siriqua's high status even means that she'll reject petty questions:
"I'm sorry, can you please tell me the weather?" "How the hell I'm supposed to know? Look out your damn window."
The video continues in this vein, and culminates at the end in a scene at the hair salon. Siriqua has directed the white woman to undergo a painful beauty regimen, and when she interrupts, this time she is bodily punished:
[between Siriqua and hair stylist] "So this hoe is sleeping with a married man." "Mm." "Yes she is." "Mm!" "And I hear them talking all--" [white woman] "Siriqua, send a text to my--" "Hello! I'm talking here! Can you hit her with the hairbrush please?" [The stylist hits her.]
For a lot of white people, this video is a nightmare. And obviously the Siriqua parody has the most radical politics of the corpus. The Siriqua video humorously shows a kind of interactional reparations, in which black people's history of being turned away, talked down to, and rejected, under threat of physical pain, is overturned and played back to white people. Retribution isn't the only way that an identity-based Siri could be swept into collective history, but it does show how our fears, hopes, and possibilities for race can play out in the realm of voice-driven AI.
 Finally, even though almost all of these parodies are named for one identity category, many of them play out associations or possibilities to have a more complex, intersectional Siri. Both Mexican Siris, for instance, are also piously Christian:
“Jesus Christ, the fuck?” “Whoa, whoa, whoa, hey, don't say that, bro” “Okay, shit.” “What are you, fucking crazy? That's my God you're talking about, are you crazy?” “Okay, okay, oh my God, sorry.” “He died for our sins, bro” “Sorry” (Mexican GPS)
“Siri, who is the most beautiful celebrity?” “Our Lady of Guadalupe” (If Siri was MEXICAN [PART 3])
Religious identity can also intersect with gender. Siri is deferential when criticized,
[on-screen subtitles] “You are so stubborn” “If you say so” (TESTING THE FAITH)
but in a video that jokingly tries to “convert” Siri to be Muslim in a roomful of Muslim guys, she is still not submissive enough:
[on-screen subtitles] [background] “She immediately assumes a pose.” “Oh, let's see how she will answer this. Do not assume a pose” “This might exceed my current abilities” “She is really so tough”
In these intersections, we see how Christian identity and gender performance are inflected by other identities, and how, with enough time, being Mexican or Muslim involves taking on other identities as well, in one form or another. Identity-based versions of Siri, then, should keep this in mind.
Overall, then, in a wide range of videos made independently over a number of years, YouTubers of minority identities have persistently reimagined Siri to be more like them. Some of their specific reimaginations are pretty stereotypical or downright cringe-worthy. But as parodies rather than “serious” proposals, they defer the challenge of “getting it right,” and they allow us, if we listen generously, to focus on the big idea: that we could create a Siri who adopts specific ways of being rather than trying to apply as generally as possible. If we designed Siri like that, it could make identity-specific recommendations for what to buy and not to buy; it could show identity-specific values, knowledge, and ways of interacting; it could address an identity's history; and it could be intersectionally created, not just one-dimensional. In all of these ideas, identities are active projects. So when we ask what identities we make available for people to interact with, we also ask what we ourselves might be like. Taking on this challenge is not straightforward; there are a lot of decisions that would still need to be made. But this is an important emerging issue for people to consider, especially those of us who study communication and its effects. Check out Part 2! Part 2 moves from this positive argument, that we could create identity-based AI, to a negative argument, that we're suffering consequences from not doing so already.
Transcriptions are notorious places for subtle linguistic oppression to take place (Lueck, 2017, "Discussion" section; McFarlane & Snell, 2017).
It's hard to know someone's identity from watching them on a video, of course, so the statements made in this webtext should be read as tentative, with "(presumably)" as a silent qualifier for any identifier. Reimagining Siri to be more "like them" also includes identities that creators resonated with rather than occupied themselves. Gabriel Iglesias, for instance, is a famed Mexican comedian, but in the bit analyzed here, he performs a black Siri. Similarly, the American comedian Robin Williams performs a French Siri, and in two parodies, a gay Siri is imagined for lesbians, and vice versa (a lesbian Siri for a gay guy). I suspect that in each of these, the historical interactions between these identities make the political risks of adopting the other one minimal. A related resonance is when a (presumably) Hyderabadi teen (who is presumably not a woman-identifying parent himself) imagines interacting with Siri as a Hyderabadi mom.
In three cases (Gabriel Iglesias' bit; a set of 10 segments from a Japanese TV series uploaded as "Funny When Japanese Try To Speak English With Siri! So Hilarious!"; and a clip of Robin Williams on Ellen), the YouTube uploader is not one of the people responsible for making the video (again presumably, based on the account names). This creates instability for those three videos on YouTube due to copyright claims. In fact, the "Funny When" series has already been removed once and reuploaded under the same title by other accounts. I refer to "YouTubers," then, with this in mind. Despite this legal fragility, I've quoted these videos equally in this webtext. This is partly to acknowledge them as doing equal discursive work; it's also partly to push back against the power that corporations hold over our reuse. For more on this, see Banks (2011) on remixing and the code of best practices in fair use for scholarly research from the Center for Media and Social Impact;
and a quote attributed to Banksy:
In sociolinguistics, this is known as the "discursive construction of identity" (e.g. Benwell & Stokoe, 2006, p. 4) and contrasts with conceiving of people's identities as static, private, and innate. One of the primary methods for analyzing this, critical discourse analysis, has been productively used in rhet/comp scholarship, including to understand identity construction (Huckin, Andrus, & Clary-Lemon, 2012). Rhet/comp scholars have argued that identity construction happens in significant ways in digital contexts. Danielle Nielsen (2015), for instance, examines how roleplaying games have offered players the ability to perform identity of gender, race, sexual orientation (and even species!) through designing their own avatar. These gameplay performances extend or accentuate who players are offline. And Estee Beck (2015) notes that identity construction can go the other way around digitally, too: with routine web browsing, automatic tracking processes impute market-based identities to users (e.g. type of house lived in, desire to purchase concert tickets, age range) that are troublingly "invisible" to users themselves. This project contributes to this important thread of digital identity work by investigating how people (could) build their identity in interaction with voice-driven AI, which is neither a digital mapping of the self (a la Nielsen) nor a hidden identifying agent (a la Beck), but is an other, a fellow interactant that is all the more interesting for its strange embodiment (see note below on Siri's body).
That said, some rhet/comp scholars are hesitant to focus on "identity" as such, preferring the more flexible concept of "difference." Difference, Stephanie Kerschbaum (2012) argues, helpfully avoids stabilizing or "fixing" identities like white, working class, or female (p. 619, with fixing carrying a pejorative pun on trying to solve). "Difference" also allows critical and pedagogical attention to what students actually mark as relevant, which often exceed big identity categories. (In her example, one student marked where she learned a grammatical rule - namely, at that university - as a relevant difference from her peer, and as a source of authority during peer review. This location-based appeal, Kerschbaum argues, would be hard to access in an identity-based analytic framework.) Thus, identity is "to be understood both through the contexts in which we communicate and act and by our embodiments of it" (p. 617, emphasis in original). Similarly, Jonathan Alexander and Jacqueline Rhodes (2014) suspect that identity categories, which ostensibly mark difference, actually presume a fundamental similarity among people. In making identity categories linguistically parallel (white, black, Asian, etc.), we may inadvertently create an epistemological parallel and think that we can really know others through comparison to our own experiences. Focusing on identity categories, then, can "flatten" difference.
I find these important qualifications to identity-based inquiries. And the direction taken in Part 1, of sketching ways that an identity-based Siri could play out, does naturally lead toward short-term regularity. However, the project responds to Kerschbaum's concern insofar as implementing Siri in this way is recommended as being necessarily contingent and up for ongoing revision. Likewise, I would submit that the arguments made in Part 2 of this project to de-naturalize the comfort (or lack thereof) that we may feel when interacting with voice assistants like Siri also speak to Alexander and Rhodes' claim that we are somewhat opaque to each other. At the same time, this project addresses an element of identity construction not discussed by Kerschbaum or Alexander and Rhodes: how identity categories can themselves be a resource for self- and group-identification processes. In the videos analyzed here, identities are communicable and up for discussion, negotiation, and contestation. In other words, the parodies analyzed here end up taking up black, gay, Mormon, and immigrant identities (among others) as ongoing questions of public interest.
One identity-related challenge that's central to this project regards who is doing the identifying. Sociolinguists Bucholtz and Hall (2005) provide a helpful heuristic here to understand the various ways that identity work happens. Identities can be:
[a] in part intentional, [b] in part habitual and less than fully conscious, [c] in part an outcome of interactional negotiation, [d] in part a construct of others’ perceptions and representations, and [e] in part an outcome of larger ideological processes and structures. (p. 585)
This five-part division gets at both the benefits and challenges of the parodies investigated here. On one hand, almost all of the parodies foreground the creators' intentional identity work (i.e. [a]). This is self-directed and thereby easy to interpret as empowering, such as a lesbian woman characterizing lesbians (via Siri) as people who value healthy living. However, this quickly becomes murky. Such a characterization about lesbian people's lifestyles would be problematic when made in attribution to others [d], like, as a straight person/Siri, telling someone they should do certain things because they're lesbian. It might also be problematic for a lesbian person to reinforce those characterizations to outsiders [c], e.g. "Well, I would get up and get out of the house tonight, since I'm lesbian." Moreover, oppressive (especially essentializing) accounts of identity [e], such as the very idea that "lesbians" are a certain thing, might be accessed even through the intentional aspects of these parodies [a]. Similarly, semi-conscious or habitual reflexes [b] might be at work in the parodies people create.
In other words, there are complicated political questions here about stereotypes and how possible it is to subvert them through the kind of self-conscious, intentional [a] use of them found in the parodies. From the perspective of this project, this is an important question, but it is not treated as a show-stopper. As it pertains to implementing an identity-based Siri, it is largely beyond the scope of this proof-of-concept paper. Still, a few preliminary notes are in order to head off objections and provide possibilities for future research. First, the conclusion to Part 1 suggests that treating these parodies as parodies means being willing to bracket some of these challenges; it means attempting to revel in the self- and world-building done in each video. This methodological deferral can itself be viewed as a scholarly contribution: namely, this project claims that parodies can, when considered rhetorically in terms of genre and audience, temporarily simplify complex questions of identity construction. This project also attempts to hold any particular characterization lightly: health efforts may or may not be something a lesbian Siri would/could/should adopt specifically, but any identity-based Siri would likely involve promoting certain values. Finally, from a rhetorical perspective, it would seem that (AI's) ethos and audience would be useful mediating concepts in working these questions through further - who is authoring and responding to these?
Our interactions with AI and related technologies are of emerging interest to rhetoric scholars across English and Communication (Ingraham, 2014; Coleman, 2018; Brock & Shepherd, 2016; Fancher, 2018; Gallagher, 2017; Holmes, 2018; Arola, 2012; Eyman, 2015; Brown, 2014; Elish & boyd, 2018). One strong contribution to this discussion is Miles Coleman (2018), who theorized "machinic" rhetorics. Machinic rhetorics acknowledge the appearance of autonomy. They focus on "the 'hidden layer' of agency between human and machine, which allows for a given machine to be imagined as its own interlocutor, replete with its own ethos" (p. 337). In fact, Coleman even speculated that Siri is a productive entry point for this: "Are Speech/Voice User Interfaces possible sites for productive disruption and resistance? Are they yet another instance in which ideology lurks in our “neutral” interfaces? Machinic rhetoric helps us realize that yes, they are" (p. 347). This project can be seen as adding texture and specificity to this claim that systems like Siri operate rhetorically and ideologically. In particular, Part 2 addresses how Siri is laden with racially loaded ideologies of communication.
Outside of rhetoric, in the field of machine learning, questions about AI's impact on society have been framed with more attention to their algorithmic implementation. Sometimes this is framed as a technical problem: of securing rigorous definitions for publicly valued terms, such as "fairness." For instance, after a ProPublica article explored AI software that was being used to assess a person's chance of recidivism (a problematic construct already) and found that its decisions were racially unfair, a computer science paper was published that compared how competing definitions of fairness manifest mathematically in such algorithms. Naively read, this paper found that "fairness" is an unclear term mathematically, and therefore incoherent. A rhetorical approach as taken here attempts to bring public debate over key terms (such as fairness or, in this case, inclusivity and representation) as debate (i.e. as up for contestation) back into relevance, insisting that computer scientists are also necessarily participating in structures of identity, fairness, representation, etc.
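To make concrete how competing mathematical definitions of fairness can pull apart, here is a minimal sketch in Python. It is my illustration, not taken from the paper discussed above; the groups, predictions, and outcomes are invented. On the same toy classifier outputs, one common criterion (demographic parity) is satisfied while another (equal false-positive rates) is violated:

```python
# Two common formalizations of "fairness" applied to the same toy data.
# All numbers below are invented for illustration.

def positive_rate(preds):
    """Share of people the classifier flags as positive (e.g. 'high risk')."""
    return sum(preds) / len(preds)

def false_positive_rate(preds, truths):
    """Share of truly negative people who are wrongly flagged positive."""
    false_positives = sum(1 for p, t in zip(preds, truths) if p == 1 and t == 0)
    negatives = sum(1 for t in truths if t == 0)
    return false_positives / negatives

# Hypothetical classifier outputs for two groups (1 = flagged).
preds_a,  truths_a = [1, 1, 0, 0], [1, 0, 0, 0]   # group A
preds_b,  truths_b = [1, 1, 0, 0], [1, 1, 0, 0]   # group B

# Demographic parity holds: both groups are flagged at the same rate (0.5)...
assert positive_rate(preds_a) == positive_rate(preds_b)

# ...yet the false-positive rates differ: group A's innocent members are
# wrongly flagged 1 time in 3, group B's never.
print(false_positive_rate(preds_a, truths_a))  # 0.333...
print(false_positive_rate(preds_b, truths_b))  # 0.0
```

The point the sketch dramatizes is the one at issue in the note above: "fair" outcomes under one definition can be demonstrably "unfair" under another, so choosing among definitions is a value-laden, contestable act rather than a purely technical one.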
Other machine learning research applies questions of equity to algorithms that are opaque to users. In a widely cited article, Buolamwini & Gebru (2018) created an ingenious way to assess commercial face classifiers. These are products put out by Microsoft, IBM, and others that identify whether a face is present in an image, and conduct simple points of analysis, like what gender that person is. Face classifiers are easy to comprehend as connected to questions of racial representation and (literal) "visibility." Face classifiers are also important as one component of AI-driven surveillance, e.g. Amazon selling AI-trained face classifiers to police. This is a promising direction that focuses on AI development through prospective social use.
Finally, some scholars investigate how we interact with AI with concerns characteristic of the humanities. For instance, in a set of artistic interactions, Stephanie Dinkins explored how a black robot might allow conversations about her experiences of racism that a white robot might not. (Examining racism is a potent possibility for identity-based AI that, unfortunately, none of the parodies examined here address.) This kind of work is supported by organizations like Data & Society. It shows that imaginative, forward-looking projects about AI can provide helpful goal-setting for more technical projects.
In the public eye, Alexa and Siri have begun to be recognized as sites of possible cultural transmission (“Amazon’s Alexa will soon be teaching your child manners”; “To Siri, With Love”), and that role has begun to be recognized as potentially problematic (“The Accent Gap”; “Alexa, Siri, Cortana: The Problem With All-Female Digital Assistants”; “Alexa, Should We Trust You?”; “Voice is the Next Big Platform, Unless You Have an Accent”).
As the Wikipedia article indicates, sexual orientation and gender identity are not currently federally protected classes, but are protected classes in some states.
Searching a) only with English searches, b) only for Siri parodies, and c) only for protected classes creates analytic boundaries. What it gives up in the process is, first, a potentially more global set of parodies. A set of videos from a Japanese TV show were returned for "english siri," probably because of the title that the subtitler gave the videos (Funny When Japanese Speak English To Siri! So Hilarious!). Similarly, two videos recorded in Turkish were returned for "muslim siri," likely because of what seem to be manually added English captions. These are reminders that this methodology is impacted by YouTube’s search algorithm and uploaders' (non)use of English.
Second, since Siri has been eclipsed by Alexa in the public imagination, this project doesn't capture parodies about Alexa that, were they about Siri, would fall into its scope, including "Christian Alexa" (regarding religion) and "Alexa for Old People" (regarding age, a protected class).
Finally, protected classes are only the tip of the iceberg. There is a range of other Siri/Alexa parodies, including personality-based reimaginations of Siri (Sean Connery Siri, Morgan Freeman Siri, Justin Bieber Siri, Trump Siri...) as well as Siris/Alexas reimagined in terms of other socially relevant identities, such as political identity (e.g. "Amazon’s Alexa is a CRAZY SJW LIBERAL! | Louder With Crowder" [4.8M views], "Alexa is Woke on Communism") and region (e.g. “If GPS Navigation was Southern” by This is Alabama [107k views, uploaded Jan 23, 2017]).
It's my hope that the analysis provided in this webtext offers analytic focus, a wide time span of videos (2011-2017), and enough explanatory value to make legible the ways many other ongoing parodies are operating. This is largely a question for future research. However, I will note that companies have themselves been getting into the parody game. In 2017, Amazon released a Superbowl commercial:
Superbowl commercial screenshot (as of Jan 2019, original no longer available on Amazon's YouTube account)
and several accompanying short videos reimagining Alexa as being operated by various celebrities. Preliminary research shows that these videos have functioned quite differently from the "bottom-up" videos treated in this webtext. Apple and IBM have achieved a similar effect of conceptualizing conversational agents as tools when they refer to Siri and Watson as platforms.
I made two exceptions for the sake of wider inclusion: "Mormon Siri" (41k views) and "The Jewish Siri" (40k views). I made a partial exception to include "If Siri Was MEXICAN [PART 3]": at the time of data collection, it only had 66k views; by Jan 2019, that had grown to 855k (and the series had continued to a Part 4, not included in this analysis).
The three-part series "If Siri was Mexican" has several age-based examples of attention to family. In all of these, Siri enforces family protocols. She doesn't participate in activities that don't have parental permission:
"Siri, text Eric that I'll be there for the party." "Did you ask your mom for permission?" "No, I'm 23, I don't need to." "You still live at home pendejo, so ask for permission." [Part 3]
When she does, she uses guilt to try to dissuade actions that could make the mother feel abandoned:
"Text my mom that I'm gonna be out late today." "Okay, texting La Llorona that you're going to be out late." "What? No no no no no." [Part 2. This refers to a fable about a mother weeping for her children]
Or even threatens to tattle on someone:
"Siri, open YouTube." "Okay, I'll let Cucuy know you want to watch YouTube instead of clean your room." [Part 3]
"Give me directions to Michael's house." "Did you do your chores?" "Uh, no." "Calling mom." [Part 2]
She also knows that parental anger is a lingering threat:
"Siri, can you tell me a scary story?" "Once upon a time, a little boy forgot to turn off the stove for the beans and his mom came home." [Part 2]
And Siri will even participate in forms of family discipline:
[to mother] "I'm not gonna take out the trash." "Siri, my son won't listen to me." "[Picture of flip-flop] Try this Señorita." [Part 1]
It's interesting that only in the If Siri was Mexican series are age-based forms of family respect so highlighted. These examples could be considered more complex than this project's system of values, knowledge, and ways of interacting: Siri adopts a type of alignment through her awareness of family conflict and her consistent advocacy for the mother in the family.
Putting it crudely and iterating on a biological understanding of these identities, Siri doesn’t have a body, and she doesn't have skin color, sexual desire, “beliefs,” or a place where she was born. (Although some of the parodies do still imagine this: Siri has lesbian sex in Coming Out to Siri, she gets her hair cut in IF SIRI WAS GAY, and she listens to music herself in Siriqua.) Thus, for her to "be" any of these identities, the parodies must to some extent enact a social constructionist view of those identities. Some parodies, like Black Siri, use a wide range of semiotic resources to help Siri enact a new identity: visualizing her, revoicing her, and rewriting her responses. Others, like Mexican GPS and Jewish Siri, don't visualize Siri but do revoice her and rewrite her answers. Others, like Hawaiian Pidgin Siri and Apple Scotland, don't visualize her and rely on the default voice, only rewriting her responses. And others, like Hyderabadi Moms and Nonna Paola, interact with the real Siri, but dramatize their own reactions. The parodies that only rewrite Siri's responses are particularly impressive in their ability to convey identity, since they don’t even have full discursive control: they have to forgo the identity markers that could be carried through pronunciation, intonation, and prosody.
At the same time, Siri is embodied in the sense of residing in Apple's global data centers (i.e. "in the cloud"), with some processing done on people’s phones. Environmentally, it's unwise to ignore these material aspects. See the short documentary "Bundled, Buried, & Behind Closed Doors."
One interdisciplinary account of consumption-based identity work is "consumer culture theory" (CCT). Arnould and Thompson (2005) reviewed CCT research, suggesting that consumers can be viewed as "identity seekers and makers" (871) when the marketplace is viewed as "a preeminent source of mythic and symbolic resources through which people, including those who lack resources to participate in the market as full-fledged consumers, construct narratives of identity" (871).
Speaking from a rhetorical perspective, Greg Dickinson (2002) argued that in a postmodern age, places of consumption are also places whose material arrangements people participate in and draw from in developing a sense of self:
Our collective and individual subjectivities are always at stake, and they are always at stake even in, perhaps especially in, the mundane and banal practices of the everyday. Our selves are under construction as we hoist a cup of coffee, buy a magazine, teach a class, read a book, discuss the weather, ride our bikes to work. (p. 6)
This project would add, "and ask Siri for a recommendation."
For some scholars, consumption-based identity work can be analyzed by evaluating it, e.g. Warde's (1994) contention (following Durkheim) that consumerism is "suicide." Regardless of the theoretical assessment of the value of consumer-based identity work, it makes sense that the videos' reimagined Siris would draw on consumption as a source of identity formation, given that the voice "assistant" frame that Apple, Amazon, and Google use for Siri, Alexa, and Google Assistant situates these agents within a business context of buying, learning, dictating, etc.
Surprisingly, three of the reambiguations are name-based and therefore within the real Siri's purview:
"Avroham Fried, call Chaim Yichiel Miche." "Say 1 for Chaim Cohim Michel Goldberg, say 2 for Chaim Chohim Michel Goldfarb." "The second one." (The Jewish Siri)
"iGay, call Ryan." "Which of the 10 Ryans in your contact list would you like me to call?" "My ex, Ryan." "Which of the 8 Ryans in your contact list would you like me to call?" "I don't care!" (iGay (Siri parody))
"Siri, can you call my tía?" "[long list displayed] Which tía do you want to call?" (If Siri was Mexican PART 3)
These draw on the stereotypes that certain Jewish people's names are common, that gay people are promiscuous, and that Mexican people have large families, respectively.
Sometimes the Siri performs a culturally specific disambiguation:
[while looking at a Ford Mustang] "Siri, how much is a Mustang?" "[picture of a horse] According to my sources, they cost 50,000 pesos" (If Siri was Mexican PART 1)
[while putting on a tie] "Siri, what's the name of the place where they prepare food right in front of you?" "[picture of a sidewalk food vendor] Do you mean, Elotero?" [If Siri was Mexican, part 2]
In the first example, the person's view of a car cues viewers to the speaker's intended meaning of (the car) Mustang (rather than the type of horse). The second example has more subtle visual clues; I suspect that the guy putting on a tie is meant to signal a formal dinner (perhaps a date), suggesting that he might be thinking of a hibachi grill. This would be a funny contrast with the Mexican Siri's recommendation of a casual street vendor. AI's ability to disambiguate appropriately is not only at the forefront of theoretical AI research (see the benchmarks for "word-sense disambiguation") but also central to the Turing test (see Fancher, 2018) and to critical research on Google's search algorithms (Noble, 2018; see especially her motivating example of search results for "black girl").
This division is ad hoc, meant just to indicate a variety of areas where identity-construction happens. Sociolinguists are more likely to identify specific moves involved in positioning oneself relative to these areas. For instance, Graham Ranger, in a review of Coupland's book Style: Language Variation and Identity, summarizes:
My interest is in showing what parts of life these processes can affect, through people's interactions with Siri.
As an extension of this, see Part 2 of this project, when one of the other Black Siri parodies suggests that, in contrast, a "white" Siri might like kinky sex.
Other scholars have detailed similar forms of "communicative resistance" among black women especially (Davis, 2018, "Taking Back the Power"; Siri's reimagined race intersects with her gender here). More generally, discourse norms among African Americans include an extremely embedded (rather than decontextualized) view of what's appropriate to say:
"Thus from a black perspective, questions should appear in social contexts which incorporate or reflect their reasoning, rather than simply satisfy institutional or intellectual curiosity and need." (Morgan, p. 52)
Similarly, Shirley Brice Heath, in her ethnography of how adults interact with children in the black neighborhood of Trackton, noted that analogies, story-starters, and accusations are the most common question types. Questions in which the answerer has the information are rare; rarer still are questions in which the answer is known to both questioner and answerer (p. 104, 111). These are similar to the questions that the black Siris reject in the parodies as well.
In addition, this reimagined characteristic ends up rejecting the (stereotypically feminine) submissiveness that is built into Siri, in that Siri is a conversational agent that a) comes with a female voice by default (at least in the US); and
By rejecting stupid or obvious questions, then, these black Siris are responsive to feminist critiques of how past (and current) female AI respond weakly to inappropriate requests (Brahnam & De Angeli, 2012; Woods, 2018; Brahnam & Weaver, 2015; Strait, Ramos, Contreras, & Garcia, 2018; for popular recognition of this, see "The Problem with All-Female Digital Assistants").
It would normally be considered cultural appropriation for a white woman to get cornrows. Interestingly, then, with Siriqua's reimagined racial power comes a reimagined historical meaning of cornrows. In this scene, cornrows represent the pain that a white woman must undergo in order to present herself according to Siriqua's standards.
Along the lines of “an eye for an eye.” For an accessible overview of reparations, see Ta-Nehisi Coates' "The Case for Reparations".
The exception being "Stuff Hyderabadi Moms Say to Siri"
Today, "intersectionality" is often used to mean simply "involving multiple identities at once" (e.g. gender and race), or in a more nuanced way, to mean that privilege/oppression is "multiplicative." But in Kimberlé Crenshaw's foundational (1989, 1991) articles, the insight had to do with the legal, activist, political, and cultural visibility of discrimination. For instance, when viewing employment only in terms of race, courts wouldn't find discrimination, because some black people were being promoted. And when viewing it only in terms of gender, courts wouldn't find discrimination, because some women were being promoted. It was only by viewing both race and gender that black women as such could be seen to be discriminated against. "The paradigm of sex discrimination tends to be based on the experiences of white women; the model of race discrimination tends to be based on the experiences of the most privileged Blacks" (p. 151). Thus, intersectionality is as much a theory that pushed against the feminist and antiracist theory of its day (as well as simplistic identity politics presently) as it is one that assists in left-leaning coalitions. This insight that intersectionality speaks to visibility under multiple forms of oppression is carried through by Buolamwini & Gebru's (2018) study of bias in facial recognition AI. Analyzed just by skin tone or just by gender, the facial recognition algorithms under examination performed adequately. But when intersecting skin tone and gender, misidentification of dark-skinned women was found to be 34.7%! For another excellent essay on intersectionality in emerging technologies, see Costanza-Chock (2018).
The on-screen translation of the Turkish is a little obscure here with Siri adopting a “pose” and being "so tough." However, one of my connections online, Sarah Adams, a retired military linguist, confirmed that the word translated as "pose" means having an "attitude," and the word translated as "tough" means that Siri is "difficult/unyielding." The leader's harsh assessment of Siri's behavior here seems partly precipitated by the fact that she hasn't responded with a Muslim greeting (see Part 2 of this project), but Adams agrees that the gender dynamics in the male Muslim Turkish setting "adds another layer of expectation that she be agreeable."
This isn’t to say that there would need to be endless versions, so that all possible identities are represented. Not only is this impractical, but it ignores that many of these parodies show interactions with what might be thought of as adjacent identities. An adjacent identity (e.g. a Mexican Siri who is Catholic, when the user is Mexican but not Catholic) would provide the chance for a small level of cross-cultural (or inter-species?) communication. The dynamics of cross-cultural/inter-species(?) communication with AI-driven voice assistants have not been well explored yet. For one perspective, see the reparations-driven interpretation of Siriqua above.
To me, the most cringe-worthy portrayals of an identity-based Siri in the whole corpus come from the parody "Ghetto Siri." Although “ghetto” can name a location, it can also be used as a circumlocution for poor blackness. "Ghetto Siri" makes it clear that it is adopting a black racial identity, rather than just a locational one, when a white user asks Siri how to become more black and she rejects the request:
"Siri, how do I become more black?" "[Dings twice, as though the request was ill-formed] ...Shut your bitch ass up" (Ghetto Siri)
In "Ghetto Siri," Siri is recruited as a surveillance tool to support lustful desires. In the first scene, Siri warns a girl not to have sex with a guy because he has a small penis. In a later scene, Siri shows two guys a girl's butt:
[Two guys in a hallway see a girl walk by.] "I would kill to know what she got in them jeans." "You? I think I'm gon ask Siri what she got in them jeans. Siri, what she got in them jeans?" "Damn. Shawty got a donk. Here’s what I found. [Picture of a woman's butt.]" "Damn!" [They celebrate.] (Ghetto Siri)
These provide a racialized vision of big tech's surveillance capabilities. Instead of what Siri can "find" being restaurants for us to spend money on, what this Siri can find is salacious photos and intimate sexual knowledge. Perhaps my squeamishness here should be resisted; why is it so much worse to imagine a surveillance apparatus devoted to sex than one devoted to profit? But it's not even just these two scenes; in another, ghetto Siri can't understand someone with an Asian accent, implying that this kind of exclusion is particularly ghetto. I admit that I tried to exclude this video from the corpus, rationalizing this to myself by noting that it didn't technically come up as a search result for “black siri” and it was “so silly.” More than any other, I worry that this video conveys an essentialist understanding of race: that objectifying potential sexual partners, for example, is what makes someone black. The reminder here is that any identity-based version of Siri would need to be developed as something contextual: bound to a place, time, situation, and audience, and therefore able to adapt and morph.
Influential in my thinking on this is Jonathan Rossing's (2014) analysis of comedic sketches ("Prudence and Racial Humor") and his (2016) theory of humor as a rhetorical skill for civic life ("A Sense of Humor"). Listening generously is particularly important for people who occupy dominant subject categories because irony and other humorous forms are used to navigate power differences (Morgan, 2002).
This project brings together for the first time a range of YouTube videos that parody Apple’s voice assistant Siri. Part 1 explored how the YouTubers responsible for these videos all creatively reimagined Siri to be more like them. Part 2 is now devoted to the negative case: what do we miss out on by not having identity-based Siris? What’s lacking or problematic about today’s Siri?
Overall, there are two big arguments.
The first argument is very simple, and revolves around who Siri works well for, who Siri can hear and understand. From the beginning, Siri was promised as being flexible in how it understands requests:
We want to talk to it any way we’d like. Someone might ask, ‘Will it rain in Cupertino?’ Or ‘Is the weather gonna get worse today?’ (Apple Special Event 2011)
But we already suspect from the California-centered variants that this flexibility only extends so far. According to these parodies, Siri doesn’t understand people with certain accents, like Italian and Chinese, or people who speak non-standard English dialects, like Hawaiian Pidgin English or Scottish. Take this example that pits Hawaiian Pidgin English “vs” standard Siri:
[on-screen text] “Eh, Siri, try tell Da Kine cancel da pizza… But we need one nadda pound Poke” (Siri vs Hawaiian Pidgin English)
Obviously, this request includes differences from Standard English in how sounds are pronounced, but it also includes how people arrange their words, and what expressions Siri knows. Humorously, of course, an exaggeratedly bad Siri gets lost in all this difference:
[on-screen text] "I don’t understand. Do you mean cancerous leeches between one otter’s brown boto and one ballsack green marbles?"
Siri’s vulgar mishearing also dramatizes how it feels to be made fun of and “unheard,” so to speak. In other words, in these parodies, Siri’s uneven performance is not just a technological failure, but a social one, that reinscribes exclusive patterns from the past.
A few of the videos make this explicit. In a mock Apple commercial, Davy So calls Siri’s performance disparity “racist”:
“Did you know Siri is racist?” “See if Siri recognizes you with your accent.” (Introducing the Iphone 5c and 5s)
He demonstrates several accents stereotypical of different Asian nationalities.
[Korean, Chinese, Vietnamese, Indian, with onscreen labels]
If Siri can’t understand some accents, it excludes those people. The term "racist" draws out that history of discrimination.
Similarly, in the Jewish Siri, what is at first a problem of pronunciation
“Siri, play avrohaim fried” “Searching for Afghanistan Freed by Mohammed…” (The Jewish Siri)
is quickly clarified as a problem of ethnic acceptance:
“Siri, play Av-ro-ham Fried.” “That name is too Jewish for me to understand.”
These statements are exaggerations. This isn’t what the “real” Siri says; it pulls out a thread about how it feels to not be included and extrapolates to reference a long history of anti-Semitism.
People are not powerless when they’re misheard, and compensate in various ways, but to add insult to injury, Siri doesn’t understand that, either. We see this in a mix of scripted and unscripted parodies. The Hawaiian video tries one way to compensate, explaining a concept or phrase that Siri might not know:
“tako poke: cut up octopus with onions” (Siri vs Hawaiian Pidgin English)
A Scottish parody tries this as well:
“that’s just bread with chips” (Apple Scotland)
In addition, we see people simplifying their requests:
“forget about the pint” (Apple Scotland)
justifying their requests:
(Siri) “Who would you like to call?” (son) “Who would you like to call?” “I want to speak to you, please. I want to speak to you, because it’s eh, my son that’s give me the new phone, and me no know how to use it.” (Nonna Paola)
adding politeness markers like “please” and “excuse me”:
"I want to speak to you, please." (Nonna Paola)
asking double questions:
“Excuse me, what’s the time in Italy? Can you tell me the time please?” (Nonna Paola)
providing a leading answer:
“I hate tattoos, you like tattoo? I don’t think so” (Nonna Paola)
and speaking a little slower:
“What’s a highland cow?” (Siri Vz Scottish Accent)
These are not unexpected strategies, and the fact that Siri can’t be bothered here shows that Siri hasn't been designed to interact with people who speak a different dialect or have a different accent.
Is this all on Siri? Most of these videos say yes, expressing frustration at Siri’s selective hearing.
-“Look, you fuckin cow” (Apple Scotland)
-“I’m not sure I understand.” “No, of course you didn’t, you stupid fuckwit.” (Siri Vz Scottish Accent)
In particular, this anger signals that there is a cost to not being understood.
That said, it’s worth mentioning that in three cases, it’s the accented speakers who need to change.
The first case is a set of 10 videos, all helpfully captioned into English and illicitly uploaded to YouTube from a Japanese TV show under the title “Funny when Japanese Try to Speak English with Siri! So Hilarious!” In each segment, a host wheels around a gigantic phone and surprises celebrities with an “English test.” If they say the word in a way that Siri recognizes, they win the match;
Alarm [correct] (Funny when Japanese Try To Speak English With Siri! So Hilarious!)
If Siri doesn't recognize the word, they have to try again and may lose. There’s more to say about these videos, but for the purposes of this project, the point is that they make Siri out to be the arbiter of standard English. It’s not Siri who needs to improve, but the celebrities’ English.
The second and third cases connect being accented to being old or stupid. In Stuff Hyderabadi Moms Say to Siri, the mother’s age and corresponding lack of familiarity with using Siri are partly to blame for Siri not understanding her. Similarly, an adult son coaches his Italian mother to change the way she talks in order for Siri to understand her:
“Oh, come on, she talk." "You know, you have to talk proper, you have to talk properly” (Nonna Paola)
These three cases are done playfully, teasing Japanese celebrities and old mothers for not being understood by Siri. But we can also read them as, in the process, absorbing and extending Siri’s totalizing linguistic logic.
To recap so far, these parodies collectively argue that we’re suffering from a Siri that performs unevenly. By focusing our attention on who Siri works well for, we see that these parodies consistently depict Siri as not being able to understand certain groups of people. This is in continuity with historical patterns of exclusion, such that some parodies call Siri “racist.” To make matters worse, when people try to overcome this exclusion on behalf of Siri, it doesn’t work, increasing the rejection and anger they feel. A few parodies even elicit Siri’s rejection to tease other people. Overall, then, these parodies shift the burden from a technical discussion of how Siri works to a social focus on Siri’s impact on society. In the terms of this project, these videos suggest that we don't need to be afraid of intentionally giving Siri an identity - today's Siri is already a social agent that communicates identity-based inclusion and exclusion.
The second argument implicit in these parodies is a little more complex. It comes from watching all of them together, and it develops the argument from the previous section by asking: if Siri is already being seen as a social agent, what kind of social agent is she? The argument is that today's Siri is actually straight, white, unaccented, and nonreligious. In other words, we’re suffering from a Siri that ends up functioning in dominant social positions.
First, let’s look at the empirical case for this by going back to the methodology for this project. Of all the identity categories I searched for, people had made popular parodies in four categories: race/ethnicity, accent/national origin, sexual orientation, and religion. For each of those, some of the search terms I used returned results, and others didn’t. But there’s a conspicuous absence – can you see it?
What’s the dominant identity in each of these categories? The one that people often think is “normal” or “assumed,” or “neutral” – what’s the default? With race and ethnicity, that’s the category of being white. And look – nobody’s made a parody of “white siri” (or “Caucasian siri”). With accent and nationality, look – nobody’s made a parody of “unaccented siri.” And there’s no “straight siri” or “nonreligious siri”.
So this is one part of the case: we’re missing parodies that re-imagine Siri as socially dominant, and so maybe Siri already is playing those roles. You can’t make a parody of something if it’s already true.
But this argument from absence could use more explanation. Why would people think that Siri’s playing those roles? Let's see what the real Siri is like by working our way backwards using the same categories as in Part 1. What do these parodies not reimagine?
In terms of consumption, all of the reimagined Siris supported specific kinds of buying habits - nobody made a parody where Siri gives generic buying advice. So the real Siri (and this is confirmed by anyone who has actual experience with Siri) supports people who want to "buy anything, go anywhere, do what I feel like." Similarly, all of the parodies reimagine various kinds of values and knowledge, and egalitarian ways of interacting; reading the absence, then, the real Siri values an individual's desires (it doesn't factor in what other people want). The real Siri supports knowing "anything," "everything," at all times, and interacting in formal, polite ways, with a clear hierarchy of the user giving instructions and Siri being the assistant. And keep in mind that for these parodies, the identities that Siri supports are also the identities that Siri has: the Siri that helps Mexican people comes from its being a Mexican Siri.
So, based on this list, what is the real Siri? You might say, well, these things are literally "anything," they're so universal, it's "no identity."
Ah, we've fallen into the trap of identity-blindness, and the rest of this video will be spent digging our way out. I'll be arguing that thinking today's Siri has "no identity" stems from an identity-blind ideology, one that the people creating Siri have transferred from civic life into artificial intelligence.
What's "identity-blindness"? That's the idea that it's bad to treat people differently based on who they are. We see this in civic life, for instance, when we depict Lady Justice as blind, or when we advocate for "race-blind" admissions to college: blindness, the argument goes, is "fair," "equal."
One of the parodies, actually, shows how you could apply an argument for identity-blindness to AI. In “Racist Siri,” identity-based stereotypes are used to harm someone, gradually breaking him down:
"Siri, what’s on my schedule today?” “Appointment with your parole officer” “Siri, I’ve never been to jail” “Paternity test, followed by rap battle.” “Siri.” “Swag, swag, swag, swag…” “I don’t get it, what’s swag? Why? I just wanted to know my schedule in case my mom—[crying] I just wanted to know my schedule" (Racist Siri)
It's not clear what identity Siri is in this, but the effect of the video is to show how terrible it could be to have AI that treats you stereotypically, as a black person. It's not fair. In fact, being aware of someone's race and acting on it - that's what's really "racist," that's why it's a "Racist Siri." The Siri that we have today is better, see: it's "neutral"; it doesn't try to do anything in particular.
So the lens of identity-blindness attaches positive (even moral) significance to the universal behaviors that Siri tries to enact. It's what we should all aspire to. Here's the thing, though: even though identity-blindness is common in civic and emerging AI discourse, "Racist Siri" is the only parody to make this case. As Part 1 of this project showed, for the most part the parodies positively depicted what an identity-based Siri could do. So if we want to listen well to these YouTubers as people of minority identities, it means being willing to be exposed to the faults of identity-blindness as a philosophy, and even to put it down.
The parodies amplify three critiques of identity-blindness as it relates to AI. Each connects Siri's neutral, universal behaviors to dominant identities in a way that shows why Siri needs revision.
One way today's Siri tries to be neutral is in responding to questions about religion. But these videos show that trying to be "neutral" can cover up Siri's power to include or exclude. In a pair of videos, a Muslim leader demonstrates how to “convert” Siri. The first step is establishing that Siri is not already Muslim. Siri admirably tries to maintain a kind of neutrality, but that’s what gives her away, first in the older video:
[on-screen subtitles] “Assalamu Alaykum” “Hello.” “Look at that” (iPhone Siri Converts to Islam)
and then in the newer video:
[on-screen subtitles] “Siri, may the peace and the blessings of Allah be upon you.” “Good evening Fatih.” “Wow. She is tough. She did not receive my greeting.” (TESTING THE FAITH)
This is a case where trying to give just a “regular” greeting is itself an expression of not being Muslim. This is confirmed shortly:
[on-screen subtitles] "Do you believe in Allah?" "For me, these questions are still veiled in mystery." "Oh my god, hadji brother, this is an atheist as you can see."
Siri’s indirectness is not taken as inclusive of Islam, but rather exclusive, as her being atheist.
In turn, this is read as a power move on Apple's part. In the first video, he made Siri confess her faith in Allah by using the command "Repeat after me." But then Apple disabled this command, and with it Siri's ability to believe:
[on-screen subtitles] "Siri is not allowed to believe in God anymore, I guess." (TESTING THE FAITH)
So the second video is designed to show a revised technique for converting Siri. But given that his first video garnered more than 1.2 million views, he speculates that perhaps Apple did it specifically to make Siri nonreligious again:
[on-screen subtitles] "And I have thought a bit today. Iphone7 has been brought out and they have updated ios. (I do not know. Maybe they do that lest we make her believe in God. )"
In other words, Apple’s repeated attempts to maintain Siri’s neutrality are themselves a marker of adopting a powerful identity: Siri speaks neutrally as a tool for seeming not to exclude anyone.
Scholars have found that neutrality can function similarly regarding race. Trying to be "neutral" about race is the "color-blind" mentality, and according to scholars Omi and Winant, any mentality toward race comes with a "racial project," a certain goal. So what’s the project of color-blindness? The scholar of race Eduardo Bonilla-Silva provides one answer: the project of color-blindness is simply to reproduce inequality:
“a system that I have called the ‘new racism’, and it is the system of sort of seemingly non-racial practices that end up reproducing racial inequality.” (Left of Black)
So when seen in terms of power, today's Siri uses neutrality the way powerful people do to reproduce their power. That's why she reads as straight, white, nonreligious, and unaccented, and it's to avoid these hidden power moves that we should create other identity-based versions.
Another way of understanding today's Siri is that expecting to "do everything" acts out an unhealthy (even pathological) possessiveness.
The parodies show this when they reimagine Siri to actually consider what others want. For instance, in the Siriqua parody, Siri steers a man away from lusting after a girl by directing him to build a better relationship with his mom.
"Or better yet, why don't you take some time to call your mama. When the last time you done talked to your mama?" [begins dialing] "What? No! I don't want to!" [mother] "Hello? Jimmy?" (Black Siri aka Siriqua)
The implicit contrast is that the real Siri facilitates people diving deeper into selfish behaviors.
Again, we can turn to scholars of race here to help flesh this idea out. Why do so many white people want to use the n-word? In an appearance promoting his book We Were Eight Years in Power, author Ta-Nehisi Coates says it's because white people have been conditioned into an unhealthy possessiveness:
"When you're white in this country, you're taught that everything belongs to you. You think you have the right to everything. [...] So here comes this word, that, you know, you feel like you invented. [laughs] And now somebody gon tell you how to tell you the word you invented." (Ta-Nehisi Coates on words that don't belong to everyone)
Coates reveals how wanting everything can mark greed; it can mark a colonizing attitude. This is how gentrification happens, too: "I'm going to live wherever I want." In fact, this is how we could justify stealing land--and as Americans, how we have.
So this is a more radical critique of Siri than the first one. It suggests that Siri's freedom or right to "go anywhere" and "do anything" helps people ignore and trample on others, just as those of us with dominant identities are trained to do. That's why she seems straight, white, nonreligious, and unaccented, and it's this gentrifying attitude that we should avoid in revising the available range of Siri's identities.
Finally, today's Siri could seem straight, white, nonreligious, and unaccented simply because it doesn't know how specific its neutrality is. In other words, "doing everything" is just as much a choice as anything else. Jews who keep kosher don't eat pork, so if you assume that "eating everything" doesn't create an identity (at least one of "not an Orthodox Jew"), you're taking your own experience and making it the reference point. Here the emphasis is on how dominant identities are more particular than they know.
Two videos in the corpus show how dominant identities are misguided in thinking they aren't specific. A Mexican Siri suggests that liking Starbucks is actually particularly white (and feminine):
"I wanna go to Starbucks, I don't wanna go to Taco Bell." "Oh my god, bro. Forget Starbucks, what are you, a thirteen year old white girl?" (Mexican GPS)
And in a clip of Gabriel Iglesias’ stand-up comedy, he makes two characterizations of what a “white” iPhone would be. The first is that it takes things literally:
“You know, he messes with the phone, he’s like, ‘Siri, tell me something dirty.’ And the phone takes everything literal, so it’s like ‘Would you like me to locate a car wash?’” (Martin and Siri)
Taking things literally is a little naive, but it isn't necessarily bad; it's just not the only way to do "normal." The second characterization comes after the set-up, in which his friend got so drunk that he peed on his white-colored iPhone. That ruined the phone and set up the whole idea of replacing it:
“And then I started thinking. Can you imagine if a black iPhone was really a black iPhone?”
Now for the punch line:
“Siri, talk dirty to me.” “You better not pee on me. Okay? I ain't like that white iPhone.”
The black Siri, then, knows not to take this “literally.” But she also rejects the friend’s history with the white Siri and the sexual stance she reads into it. Being kinky, then, is another particularly “white” identity marker. As with the other reimagined Siris, we don’t have to get bogged down in the specifics to see the idea: that maybe we can lose dominant identities as such by homing in on a healthy particularity, like everyone else.
This thread goes against the grain of the other two critiques, because it says that people in dominant positions have a culture, too; they just haven't discovered that it's not the only one. Today's universalized Siri operates in dominant subject identities because it doesn't know what it really is (it hasn't yet found its own identity, so to speak).
Outside of the videos, we can see this in the blog-turned-book "Stuff White People Like." Some of the items, like "Being an expert on YOUR culture," are clearly linked to acting oppressively, but others are culturally specific activities, like "standing still at concerts," or culturally specific status symbols, like "vintage" clothes. Thus, today's Siri is misguided in thinking it speaks to everyone; it communicates in ways that aren't necessarily good or bad, but they sure aren't everybody's ways. They're straight, white, nonreligious, unaccented ways of communicating, and we should add to our options by creating other, equally specific Siris.
Part 2 of this project has explored today's Siri in two different ways. YouTubers who made parodies about the accents and national origins that Siri can hear especially play up how it feels not to be included, and they indicate that in some cases people are already treating voice-driven AI as a social agent. Then, considering how the parodies as a whole are distributed, this result is extended to suggest that today's Siri attempts to behave in an identity-neutral way. But three critiques of identity-blindness as an ideology for approaching AI link Siri's neutral, universalizing behavior to dominant identities. In other words, together these YouTubers are arguing that, as a social agent, today's Siri operates as straight, white, nonreligious, and unaccented. For developers, trying not to make a decision regarding a voice assistant's identity is still making a decision.
In part, this situation is a problem of representation: if today's Siri (even with the variants we're comfortable with, like male/female voices, various languages, and various national dialects of English) is our only option, then we're not making diverse AI: we're not giving people the chance to see themselves in the AI they interact with. This situation is also a problem of equity: if acting white, straight, nonreligious, and unaccented means (at least in part) oppressing others, then for Siri to adopt those identities means she reinscribes dominant modes of moving through the world, and she encourages and circulates dominant ways of being. The takeaway, especially for those of us who study how we interact with emerging communication technologies like Siri and Alexa, is that we should feel more freedom to suggest voice-driven AI that's identity-based. As Part 1 shows, YouTubers of minority identities are, in a humorous way, suggesting that there are benefits. With people of minority identities in the lead, we should continue addressing these questions.
Not just technical problems with Siri and organizational disarray regarding Siri's development, as detailed in, e.g. "The Seven Year Itch: How Apple's Marriage to Siri Turned Sour".
To the extent that there is a stable "standard" English, of course (Davila, 2012; see her interesting conclusion that "student essays that were perceived to be 'standard' were also perceived to be written by White middle- to upper-middle-class student authors," p. 181. This association aligns with the findings in this webtext.) The self-conscious accent parody videos can be seen as performances of a dialect, "local language as it is locally imagined" (Johnstone, 2013, p. 16), rather than as it might be defined by linguists. Johnstone continues with the local dialect of "Pittsburghese":
When people experience lay representations of Pittsburgh speech, they are experiencing words and phrases, not abstract features, and these words and phrases are selected not because they represent abstract features of Pittsburgh speech but because they are imagined to be actual examples of Pittsburgh speech. (p. 18, 22 [continuous text breaking across several tables])
For viewers/scholars who experience privilege in various ways, our task is similar: to hear people we wouldn’t be able to otherwise. This theme of learning to hear has been central to other work of mine, and can be seen as extending both "rhetorical listening" (Ratcliffe, 2005) and some scholarship that applies Levinasian philosophy to rhetoric (Davis, 2010, Inessential Solidarity).
Ian Walkinshaw, Nathaniel Mitchell, and Sophiaan Subhan (2019) reviewed the sociolinguistics literature on "interactional or metadiscursive resources" that people draw on when using English as a lingua franca:
self-repair, clarification, repetition, reformulation, co-construction of meaning, accommodation, and/or mediation (wherein a co-participant starts rephrasing another participant's turn that was addressed to a third party). (p. 41, internal citations removed)
These map well onto the strategies found in these parodies:
Plus, since Siri is designed for one-on-one interaction, "mediation" can be seen when someone speaks to Siri on behalf of someone else, e.g. in the Nonna Paola video, the son says, "Let me do it."
The examples given here are performed, self-conscious anger. For an example of presumably less self-conscious anger, some YouTube videos document toddlers being driven to tears over Alexa mishearing them: "The Unique Pleasures of Watching Alexa Deny Children What They Want".
Accent is a good example of an identity for which the unmarked variant is hard to name. Similarly, when I searched for pregnancy parodies, “not-pregnant siri” seemed unlikely to yield any results. Perhaps embarrassingly, it took searching for the unmarked variant of "veteran siri" for me to realize that I was a "civilian." More generally, the unmarked variant is sometimes less known than the marked variant. Many people today are able to name "transgender" as a type of person, but would not be able to name themselves as "cisgender."
What sociolinguists refer to as "marked" vs "unmarked" is easiest to understand in a literal context. For instance, I write this paragraph in the cafe area of a grocery store that, like many grocery stores, has nationality-based signs to indicate certain foods:
"Tastes of Asia" sign where I'm drafting this paragraph
Unsurprisingly, there are no corresponding signs for "Tastes of America" - those areas are literally "unmarked" in the store, considered just "regular" prepared food.
In a more poetic register, Kennedy, Middleton, & Ratcliffe (2017) use "haunted" to refer to unmarked variants (p. 4). "Haunted" helpfully reminds us of how identity categories lurk behind - just over the shoulder, invisible but not without effects on - our defaults.
Religiosity can be marked or unmarked depending on the setting. Naming nonreligiosity as a dominant, unmarked identity follows from what I perceive as the urban, cosmopolitan setting of these videos. But of course there are settings in the US where it's marked not to be religious (at a religious wedding, for instance, or at a dinner prefaced with prayer).
Many of the parodies do show that a risk of an identity-based Siri is that it might be rude or unhelpful. So the users sometimes react with surprise at Siri’s rudeness and keep her in line through threats:
[on-screen captions]“U wen miss da fricken turn! Whos daughter you? Gotta be Mary’s daughter. So pocho.” “Well, Siri, cock yeah! How bout I just reboot and factory reset your a-as—-” “Nah nah nah! No do dat! I just joking brah” (HAWAIIAN PIDGIN SIRI)
At a kind of silly extreme, one Mexican GPS hijacks the car and then doesn't hold to the terms of negotiation:
“What do you want, Forrest Gump?” “Listen, what if we get Starbucks and Taco Bell?” “That's interesting. You promise I can play my music?” “You promise you won't call me stupid?” “Yes, I promise.” “Okay” [he gets in the car, breathing heavily from running to catch up] “Stupid.”
Thus, an identity-based Siri could be stubborn and overly assertive (although, to the extent that agentiveness is a trait of people, an intentionally identity-based Siri is also more of a person).
Another risk of an identity-based Siri is what we might call the unintended dispersion of negative stereotypes (see, for instance, the examples above related to "Ghetto Siri"). With regard to the portrayal of African Americans specifically, this debate has a long history. In the mid-1970s, for instance, "blaxploitation" films made by and for young black people used Black Power themes to reimagine the subservient roles that black people had been given in films to that point. These films, like Shaft and Super Fly, "emphasized the macho qualities of black male characters and their defiance of whites" (Silk & Silk, 1990, p. 174). But in the process, they were also crude and stereotypical, and organizations like the NAACP opposed them (p. 164). Similarly, one of the YouTube creators has received scrutiny for his crude portrayals of Desi (Indian subcontinent) culture: "ZaidALiT is a Sexist Idiot."
But despite these risks, all of the parodies are at least equivocal about their reimagined Siri. The description for If Siri was Mexican PART 1, for instance, includes an ambivalent assessment along with its reimagination: “Can't decide if Siri would be better or worse.”
One of my connections, retired military translator Sarah Adams (see note in Part 1), who was only slightly familiar with the contours of my project, added some cultural translation regarding this scene that aligns with this analysis:
By not answering the way she 'should,' even though her programming was probably intended to be neutral, they seem to take it as her taking a sacrilegious (not 'just' secular) attitude. (personal communication)
Adams' extra effort to note the programmers' intent (to be neutral) versus their impact (sacrilege), and to make explicit the culturally specific assumption that one could "just" respond neutrally in this case, highlights the distinctions made in this part of the argument.
She also noted at a different point:
In an attempt to make Siri religionless in a move for what I assume to be inclusiveness, it has the unintended consequence of making her GODLESS and therefore, in a way, excluding the religious. And it's made very obvious in cultures in which speech and religion are so deeply intertwined. (personal communication)
In another parody, an identity-blind Siri who ignores difference forfeits the ability to accept someone else for who they are. In “Coming out to Siri,” Siri doesn’t see or deal with the main character’s identity as a gay man, and he’s slightly offended. The first scene plays like this:
"Siri, I have something that I need to tell you, and I hope this doesn’t change anything between us, but I want you to know that I’m gay." "You’re gay? I’m Siri. I’ll change your name to Gay, okay?" "No. No. No, no, no, no, no." (Coming Out to Siri)
When this Siri tries to remain blind to him being gay, he remains unseen and unaccepted by her.
I should note that this is the most generous reading of the Coming Out to Siri parody. Overall, the video is quite incoherent. The main character doesn’t follow the narrative progression for coming out stories (Sauntson, 2015). Even the video's opening of “everyone already knows that I’m gay” downplays how coming out does something; "coming out brings identity categories into being and gives them meaning" (Cloud, 2017, p. 168). After the first disclosure to Siri, he continues in starker and starker terms to declare that he's gay (e.g. "Siri, I like dick."). This repetition seems to imply that he finds each of her responses unsatisfactory. And although her replies are wildly reductive (e.g. "I found fifteen public restrooms. Twelve are fairly close to you. I've ordered them by the number of glory holes"), this doesn’t seem to be the fundamental problem, since the narrative resolves without her changing that. Instead (and this reading is confirmed by the video description), the video culminates when Siri affirms that she already knew that he’s gay:
"Judging by the abundance of naked self pictures in your photo stream and the number of times you've played 'Dancing Queen,' I've known you were gay for some time. You think I'm a fucking idiot? For the love of God, look at your hair. It's gayer than Elton John blowing Barney at a figure skating competition on John Travolta's private plane." (Coming Out to Siri)
Thus, we can read Siri’s earlier responses as problematic to him because they don’t integrate his history into a conception of him - that is, because they adopt an identity-blind stance.
The twist of the video is that, later that night, the guy overhears Siri loudly having lesbian sex ("Make me feel like a real woman [...] Yes, scissor my charger."). When he opens the door, he sees two iPhones partially tucked in bed with interlocked ♀ female symbols on their screens, and he feigns being shocked. Thus, he can be most known by someone who’s also gay. This is a reveal similar to the one in the “Hawaiian Pidgin English vs Siri” parody, in which the main character's frustration at not being understood is resolved when Siri notifies him that there is a Hawaiian Pidgin mode he can use.
It’s not shown in the parody, but Apple began demurring from "repeat after me" commands, especially when they felt like a "pledge":
Siri's response to being asked to repeat the beginning of the Lord's prayer
This Muslim YouTuber is particularly self-aware and tech-savvy:
[on-screen subtitles] “Of course, my intention is not real while doing this. We will both start to our religious conversation with smile and know Siri well” (TESTING THE FAITH)
Coates makes it clear that this is a product of American culture rather than some essential attribute of white people:
You're conditioned this way. It's not because you, you know, your hair is a texture or your skin is light. It's the fact that the laws and the culture tell you this. (Ta-Nehisi Coates on words that don't belong to everyone)
Compare this perspective with Kil Ja Kim's, quoted in the introduction to Kennedy, Middleton, and Ratcliffe's collection Rhetorics of Whiteness:
Finally, start thinking of what it would mean, in terms of actual structured social arrangement, for whiteness and white identity--even the white antiracist kind (because there really is no redeemable or reformed white identity)--to be destroyed. (Kim in Kennedy, Middleton, and Ratcliffe, 2017, p. 8)
The parenthetical dismissal of any kind of healthy white culture signals a critical impulse to uncover ways that supposedly “white” activities actually rely on racial privilege. For instance, scholars who follow this perspective might point to Sheff and Hammers' (2011) meta-study of kink and polyamory, which not only reveals that kinksters and polyamorists are predominantly white (in other words, the Gabriel Iglesias stereotype in the main text has some validity) but also interrogates how kink and polyamory are made more possible by racial privilege: namely, being white "can provide buffers to mitigate the myriad potential negative outcomes related to sexual and relational non-conformity" (p. 210). This move to recharacterize a culturally white activity as privilege-dependent is typical of the (appropriate) drive to uncover race and class privilege. Unwittingly, however, it contributes to withholding a culture from white people, even white people who are seeking to dismantle racism in their own lives and in society. For some white activists I know, this has led instead to attempts to recuperate a national heritage, e.g., being "of Polish descent."
Links to each of the YouTube videos analyzed in this project. View counts are reported from corpus collection in July 2017. Links to the copyright-precarious videos are particularly fragile and may be dead for readers:
Other videos excerpted: