Why voice and conversational AI is on the rise

UX is changing fast, and businesses will have to learn new types of interaction design. Voice promises to be huge, but designing voice-based interactions requires very different skills.

Not that long ago, “designing for mobile” was the hot topic in user experience (UX). But while many businesses are still struggling with the small screen, UX is moving on at an unforgiving pace as conversational AI gains momentum.

Voice, gesture, gaze and text: the new frontiers of UX design

In the next few years, according to Grit’s Nicholas Thompson, those familiar visual experiences will be joined by entirely new ways of interacting with systems, brands and content.

He predicts that “UX is about to fracture,” citing products like Amazon’s Echo, Google’s Soli, Facebook’s bots and Microsoft’s HoloLens, which are variously operated with voice, gesture, text and/or gaze controls.

The coming UX revolution means businesses will urgently need to develop new design skills. “Companies that don’t build competencies now will face even harsher disruption than those that ignored mobile,” Nicholas warns.

In truth, though, some skills are more urgent than others. It’ll be a while before a critical mass of people start buying groceries by staring at items in an AR headset display. But one of the “new” types of UX is already with us, and is growing at a fantastic rate.

Voice is on the rise

The stats say it all. According to Kleiner Perkins’ 2016 Internet Trends report, in 2013, only 30% of U.S. smartphone owners used a voice assistant like Apple’s Siri or Google Now. By 2015, that figure had risen to 65% – driven primarily by hands-free convenience.

Source: KPCB, via Business Insider

And people aren’t just talking to their phones – there’s a whole range of voice-controlled, internet-connected devices coming on to the market: black boxes that entirely do away with screens and visual interfaces.

Amazon’s Echo speaker was the first to arrive, launching in the U.S. in June 2015. Its voice persona, Alexa, fired the imagination of pioneering brands who understand that consumers value the convenience of simply being able to say what they want, rather than having to fire up an app and navigate through menus and buttons.

Since the launch of Echo, those brands have developed more than 2,000 “skills” (voice-controlled services) for Alexa. If you have an Amazon Echo speaker – it launched earlier this month in the UK – you can ask Alexa to tell you the weather, order you a pizza from Domino’s, fetch you a cab via Uber, or any of several thousand other tasks – all without a screen, button or hyperlink in sight.

Designing for voice is a whole different proposition

So voice-controlled interfaces are here, now, and growing fast. But for companies that want to take advantage (and who wouldn’t?), it’s going to require a very big shift in user experience design thinking. Designing for voice is a wildly different proposition from designing for graphical interfaces, requiring very different ways of modelling, designing and developing human-machine interactions.

At VoxGen we know – perhaps better than most – that it’s very hard to get voice-based interactions right, and very easy to get them wrong. That’s because we’ve spent over 10 years designing interactive voice response (IVR) applications, a domain where voice has always been the primary interaction method.

Is IVR a shining example of how to design effortless, convenient voice interactions? Not usually. You don’t need me to tell you that when it’s done badly, IVR is a terrible, hated interface. So many brands still have awful IVRs that sound like robots, fail to understand what you’re saying, make you repeat things over and over, and sometimes even cut you off in mid-flow.

At VoxGen we’ve always known IVR can be done better, and for the past 10 years that’s been our mission. We’ve poured all of our energy, talents and resources into designing smart, connected voice interactions that feel like a natural conversation. We know first-hand how hard that is to get right.

We also know that it takes imagination and drive to assemble a team with the right skills to deliver an exceptional voice-based customer experience, including:

Creative skills to design an on-brand voice persona, including the ability to incorporate appropriate social factors like region/accent, class and occupational dialect into the persona design.

Linguistic skills to design dialogs that meet users’ needs and expectations within the capabilities of the underlying technologies.

Specialized audio engineering skills to create flawless, smooth-sounding sequences of automated speech to support a natural and engaging speech interaction.

An understanding of psycholinguistics and how users will relate to voice interactions over time and with repeat usage.

UX research skills for voice interaction: the skills and underlying process to embrace the cultural and domain factors of conversational interfaces as well as individual differences.

Investing in the voice experience will pay off

Outside of IVR, most businesses are starting at ground zero when it comes to designing voice-based interactions. But if Alexa and her kind continue to take off, the rewards for getting it right will be huge.

Those that pay close attention to how people want to interact via voice, and invest in the right skills to create those user-friendly experiences, will gain an edge in the UX stakes. They’ll not only deliver the kind of experience that customers want, but they’ll also be able to achieve efficiencies by automating frequent tasks – whether in the IVR or (eventually) one of the new voice interaction platforms.

We don’t think voice-based interfaces are ready to overtake screen-based experiences just yet. Our experience in IVR tells us there will be a tough learning curve to navigate first. But as Forbes Technology Council member David Rajan put it:

“Voice is now maturing in a way where it will become an unparalleled part of the user experience, and we will need to consider how we ‘design’ voice experiences more and more in the coming years.”


One final thought: if you’re looking for somewhere to “start making and playing” with voice-based interactions, your IVR might be just the place. And if you need expert outside advice on how to turn your IVR into a smart, connected, conversational experience, we’ll be only too happy to help.
