Perception of AI Voices
Published on: July 5 2023
Artificial voices, created via AI, are poised to become an important tool for brands. Already used on phone helplines and in interactive voice-based apps (such as Apple’s Siri and Amazon’s Alexa), they are increasingly being considered for advertising. Yet how will the use of such voices influence consumers’ impressions of a brand?

The concept of computers and robots that can talk has long been a staple of science fiction, showcasing a future where computers communicate as fluently as humans. Philip K. Dick’s ‘Do Androids Dream of Electric Sheep?’ (the basis of the movie ‘Blade Runner’) envisioned androids indistinguishable from humans, including in their speech. Movies like ‘2001: A Space Odyssey’ introduced the chillingly eloquent HAL 9000, and ‘Star Wars’ gave us C-3PO, a humanoid robot with human-like intelligence and fluency in over six million forms of communication. On TV, ‘Star Trek’ brought to life Data, an android with an impressive capacity for human speech, and ‘Westworld’ created robotic hosts that simulate human behavior seamlessly.

In many of these examples, the AI voices, while fluent, are overly formal, stilted or lacking in emotional cues. These portrayals have shaped our expectations of AI, prefigured current advances in technology, and continue to inspire our understanding of what AI could become. So it may come as a surprise to some that we already have AI voices that sound fully human, emotional cues included, and are often indistinguishable from real voices. Compare this with the faces of realistic robots, which still tend to occupy the ‘uncanny valley’: they creep us out by looking very realistic, but not quite realistic enough. An AI voice recording, by contrast, can now genuinely pass as a human voice.

AI voices and brands

Voice is becoming more important than ever for brands, so selecting the right one matters. AI voices can be particularly useful when a brand needs to produce content in multiple languages or dialects, as AI can generate voices with a variety of accents and speech patterns. A brand can also use a single AI voice, optimised for its personality, that is seamlessly fluent in many languages. Using AI voices also allows for greater flexibility and scalability, as they can be generated and modified on demand.

Brands using AI voices stand to benefit in several ways beyond scalability and versatility. AI voices are often more cost-effective than hiring professional voice actors, particularly for large projects or those requiring multiple languages. That opens up the possibility of more professional-sounding audio in online ads for smaller brands with more modest budgets. The use of AI also ensures consistent voice quality and tone across different content, bolstering brand identity, whereas human voices are more subject to variability.

AI voices can be meticulously customized to achieve specific sounds or tones that align with the brand’s identity or the message’s context, something that can be challenging with human voices. Furthermore, the use of AI voices signals a brand’s innovative and tech-forward stance, potentially attracting a certain audience segment. It might even sometimes be advantageous for a brand to make a voice more conspicuously artificial, precisely because an undisguised AI voice would otherwise be a flawless simulation of a human one.

However, using AI voices carries risks that can affect a brand’s image, particularly once the use of AI is disclosed. First, customers might perceive brands using AI voices as impersonal or devoid of emotional depth, leading to a potential loss of human touch, a factor crucial in industries or situations that thrive on human connection and empathy. Second, ethical issues might arise, with some customers expressing concerns about AI potentially displacing jobs for voice actors.

Lastly, misrepresentation can become a problem. If a brand uses AI voices without clear disclosure and consumers later discover this, it could breed feelings of deception and potentially tarnish the brand’s reputation.

Implicit perceptions of AI voices

Voices play an important role in our first impressions of a person. A voice can tell you about a person’s physical features, such as their size and age, and, of course, about their psychological characteristics. We are adept at rapidly forming first impressions of people on the basis of their voice alone: within 400 ms(1). The two characteristics that seem most important when forming a first impression of a voice are trust and dominance(2). These would have been important from an evolutionary perspective, helping our ancestors to quickly size up any stranger and determine whether they posed a threat. The process is so rapid, and so far outside our conscious awareness, that it is next to impossible for us to describe verbally. That makes voice perception an ideal topic for implicit research.

At CloudArmy we have run a series of implicit and timed-response tests on people’s perception of AI voices in advertising. Now that AI voices can be practically indistinguishable from real voices, we believe the real question for advertisers is: should brands disclose that a voice is AI generated? There are obviously risks with both disclosing and not disclosing. Our research, in partnership with audio expert Steve Keller and to be presented at this year’s Nudgestock festival, used a series of audio ads for invented brands (so that consumers taking part had no baggage from prior perceptions of the brands). The ads were recorded by both human voiceover artists and AI voices, and we systematically varied whether participants were told they were about to hear an AI voice.

What we found was that while a majority (60%) of people believed they could tell whether a voice was AI or human, when we checked their ability to do so, they were no better than chance!
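For readers curious how a "no better than chance" claim can be checked, here is a minimal sketch using an exact binomial test. It is not the analysis CloudArmy ran: the function name and the counts are hypothetical, chosen purely for illustration, and the 50% chance level assumes each clip is equally likely to be AI or human.

```python
from math import comb


def binomial_two_sided_p(correct: int, trials: int, chance: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Returns the probability, under pure guessing, of any outcome that is
    no more likely than the observed number of correct answers.
    """
    def pmf(k: int) -> float:
        return comb(trials, k) * chance**k * (1 - chance) ** (trials - k)

    observed = pmf(correct)
    # Small relative tolerance so ties are not lost to floating-point rounding.
    total = sum(pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts, NOT data from the study.
p_near_chance = binomial_two_sided_p(correct=104, trials=200)   # 52% correct
p_above_chance = binomial_two_sided_p(correct=130, trials=200)  # 65% correct

print(f"104/200 correct: p = {p_near_chance:.3f}")
print(f"130/200 correct: p = {p_above_chance:.6f}")
```

A large p-value (as in the first hypothetical case) means the accuracy is consistent with guessing; a very small one (as in the second) would suggest listeners really can tell the voices apart.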

However, while they could not consciously detect whether a voice was AI, their implicit responses told a different story.

People’s implicit responses to a brand after listening to an ad for it read by an AI voice were significantly less positive and trusting than the responses of those who heard the same ad read by a human voice. (Remember, this is even though the AI voice was not consciously detectable as such.)

Moreover, simply being told that the ad they were about to hear was voiced by an AI led to a less positive implicit response to the brand, irrespective of whether the ad they heard was actually voiced by a human or an AI. The knowledge that AI was involved shaped the implicit response.

In other words: while people can’t reliably distinguish an AI voice from a human one, they subconsciously have a more negative response to it. And if they are told that it is an AI talking, that negative response persists even if the voice was human.

Our Nudgestock 2023 presentation is available to watch on YouTube.

The world of AI voices is an evolving field, teeming with possibilities. For forward-thinking brands, these tools can offer exciting new ways to connect with audiences and convey their brand’s identity. However, understanding and managing consumers’ perceptions of AI voices is crucial for their successful integration. As we continue to explore the boundaries of AI technology, research such as that conducted by CloudArmy will be vital in shaping its role in brand communication.

(1) Mileva, M. and Lavan, N., 2023. Trait impressions from voices are formed rapidly within 400 ms of exposure. Journal of Experimental Psychology: General.
(2) McAleer, P., Todorov, A. and Belin, P., 2014. How do you say ‘Hello’? Personality impressions from brief novel voices. PLoS ONE, 9(3), p.e90779.