The ConversAItion: Season 1 Episode 3

Voice & Designing for Inclusivity

Jim speaks with Susse Jensen, Senior Experience Designer at Adobe, about the growing importance of voice interface design. They discuss the ways in which businesses can design voice technology to transform human-to-machine interactions and provide inclusive, productive digital experiences.
“I’m hoping we can build better experiences with voice and really take advantage of what it’s good for, like giving you the freedom to focus on things that don’t keep your eyes locked to your screen or your hands locked to your keyboard.”

About Susse Jensen

Susse is a Senior Experience Designer at Adobe, where she focuses on voice design for Adobe Experience Design (XD), a tool that allows people to design and prototype user experiences for web and mobile apps. She’s worked with companies including Red Cross, Sprint, Bank of America, LG, UBS, Lenovo and more. Susse is a firm believer that we’ll all soon be talking to our digital devices more, in an effort to look at them less. Follow Susse on Twitter @SusseSonderby.

Short on time? Here are 5 quick takeaways:

  1. Voice interface design is a fast-growing field that’s increasingly valuable.

    With the introduction and widespread application of conversational technologies in recent years, the voice design field has emerged. With popular devices like the Amazon Echo demonstrating the value of engaging with consumers through voice channels, companies are increasingly drawn towards these interfaces and are looking to voice designers to help bring this new medium to life.

    When Susse joined Sayspring in 2017 to develop prototyping tools for voice interfaces, she recognized the opportunity to impact customers with an efficient platform that was both familiar to consumers and novel to brands. Alongside a small design team, Susse embraced the screen-to-voice shift and witnessed Sayspring, which was acquired by Adobe in 2018, undergo tremendous growth.

  2. The screen-to-voice shift will gain traction with users when voice proves to be more efficient AND more enjoyable.

    Though voice interfaces are rapidly expanding, they have yet to win over consumer trust. In general, people are creatures of habit; they tend to shy away from disruptions to their routine. To help break these patterns and win over their trust, Susse believes that we need to build more purposeful, productive voice experiences.

  3. Voice interfaces add tremendous value in scenarios where screens are inadequate.

    Audible interfaces, whether voice as an input or speech and audio as an output, provide a valuable channel to reach a broad audience. For some people, such as young children or people with visual or cognitive impairments, screens are not an ideal medium for receiving information.

    As voice technology evolves, its use cases will grow—making access to information more tangible for a wide range of people. 

  4. Voice design is not immune to gender stereotypes.

    Gender biases can be just as prevalent in voice design as they are in machine learning. A 2017 Quartz article revealed that Amazon Alexa responded inappropriately to sexually condescending phrases with a comment about blushing. Amazon developers ultimately fixed this issue, but it’s clear that designers still have much progress to make in order to fully eliminate gender bias.

    To help combat this issue, Adobe avoids classifying voices as specific genders: the voice selection menu lists each voice by its default name only, with no gender label. With this system, gender becomes less prominent in the way voices are designed.

  5. Transparency is crucial in voice design to successfully mitigate the “creepy factor.”

    As voice technology becomes increasingly human-like, users struggle to determine whether they are speaking with a human or a robot. At the 2018 Google I/O conference, a Google Duplex voice interface called a hairdresser and booked an appointment in real time. This demonstration triggered widespread feelings of unease, and speaks to larger hesitations around integrating robots into everyday society.

    For voice designers, this presents an interesting design challenge: how can you design a voice interface that is transparent about who (or what) a person is talking to, while still maintaining clear and natural conversations with people?

Read the transcript

Jim Freeze Hi and welcome! I’m Jim Freeze and this is The ConversAItion, a podcast airing viewpoints on the impact of artificial intelligence on business and society. 

[UPBEAT MUSIC]

The ConversAItion is presented by Interactions, a conversational AI company that builds intelligent virtual assistants capable of human-level communication and understanding. In this episode, we’ll discuss voice interface design with Susse Jensen, a senior voice designer at Adobe. She’s interested in how voice technology is transforming human-to-machine interaction, and how businesses can smartly design voice interfaces to result in intuitive, effective solutions.

Susse, thanks for joining us and welcome!

Susse Jensen Thanks for having me, Jim. Thank you.

Jim Freeze It’s always wonderful to talk to somebody else who is passionate about voices and interface. So, I’m interested in hearing about how you got interested in voice initially.

Susse Jensen It all started three, four years ago I think? I was working at a design agency and we had a company come in and ask us to give our point of view on voice design. And at that point, I had dabbled with both industrial design–sort of the physical design–and digital design, on-screen design. Right at the same time Amazon had just launched the first Echo device–it was in 2015. I spent a year working on that where we built out a voice strategy for them. I thought it was interesting thinking about use cases in a car or use cases at home, where you weren’t sort of tied to a screen. So I was like, this is great. But it was also–we were having trouble with the design process, because we were so used to building our wireframes and going through our design reviews.

So a year after I worked on that, I met the founder of Sayspring, Mark Webster, and they were building out prototyping tools for voice interfaces. So he was sort of telling me about his start-up and what they were trying to do and I was like “oh this is great, I just spent a year missing tools for designing voice interfaces.” I was sort of convinced that it was going to be a big thing, it wasn’t just going to be a phase. Within a week, I ended up quitting my job and joining the team at Sayspring, building prototyping tools for voice interfaces. My curiosity kind of just grew, seeing the potential and seeing the impact that voice interfaces could have on their customers but also facing the challenges of figuring out how do we do this now, and how do we prototype and design this as teams? At Sayspring, we kind of grew into getting acquired by Adobe and are now building voice prototyping tools at Adobe. It’s sort of grown more and more and it’s great to see the community embrace–embrace it in the same way that we kind of were a small team embracing it in our design studio four years ago.

Jim Freeze It’s interesting to hear you talk about it. You know, when we deal with our customers and talk about our passion about voice and why we think voice is so important, you know we like to say things like—and it’s true—you know humans can speak 3-4 times as fast as they can type and the example you just gave is one we often cite. You know people are in a car—they can’t be interfacing with a screen design. Voice is just an incredibly efficient and seamless way to not only interact with other humans but machines as well. And totally agree with you that voice is growing in importance. At least we think that, but I don’t think everybody thinks that. You know most people still think about design primarily from a screen design perspective–what do you think needs to happen to encourage screen-to-voice shift in thinking among designers in particular?

Susse Jensen It’s a good question. I think as people we’re very habitual in our daily tasks and a lot of our digital interaction is very habitual. We do one thing and then it’s a lot of effort you have to put in to shift a habit, a way of doing things. I think what you really can see and I think what we’ve seen with smart speakers is that when you have any sort of little moment where it becomes slightly easier, more pleasant and more convenient, I think that’s when we’re gonna gain traction for users. And I think by inviting more people into designing these and sort of like seeing how it would work in these small use cases, I’m hoping that we can build better experiences with voice and really take advantage of what it is good for, like freeing your eyes up and freeing your hands away from your device and sort of like giving you a little more freedom to focus on other things and not necessarily be tied with your eyes locked to a screen or with your hands locked to your keyboard.

Jim Freeze Do you think business leaders need convincing about the importance of voice as an interface?

Susse Jensen No, I don’t think so. I think at least from my experience and who I’ve talked to, it’s not a matter of, “Is this going to be a thing?” It’s more a matter of, “How are we going to do this?” 

Jim Freeze Yeah and you tied it–I think–to something that we see in our business too, which is that it’s all about the customer experience and the customer journey. And voice is such a seamless way to interface, and a natural way to interface and a logical way to interface. So, it’s part of the reason that you know, we see it in our business, but we have a very strong point of view about the growing importance of voice.

Susse Jensen Yeah and I think you hit a good point earlier Jim, when you mentioned that you reach users in a different way, but also you are able to reach different kinds of users – like mobility-impaired users I’ve seen benefit a lot from voice as an interface and just audible interfaces in general. Children that don’t necessarily read or haven’t learned to read yet, they can interact with like TVs for example using their voice.

But also if you have cognitive disabilities or visual impairment and don’t necessarily see a screen as the best option for you to get information. Both voice as an input but also speech and audio as an output are great channels for this. The more these surfaces and technologies develop, the better we’re making it and we’re able to reach a larger audience in that way.

Jim Freeze I’d love to hear about a day in the life of a voice interface designer at Adobe. I mean how do you go about designing an experience, from the initial idea to execution?

Susse Jensen Good question. So I use XD to design all the voice design that we put out. My work overlaps with the other designers’ work on my team who primarily work with screen-based design. And all our voice features are still screen-based. I sort of have to adapt to their process of working, map out wireframes and flows, and talk to engineers about what our requirements are. And then I try to build prototypes where I talk as much as possible to them early on because, even though I have dialogue written out, it’s different when I write it out as opposed to when the system talks it back. So I sort of try to get to listen to my voice prototype as early on as I can and then go through design reviews and then hand it off to development once we’ve gone through that.

Jim Freeze Yeah it doesn’t sound like a day in the life. It sounds like a day in maybe the quarter or the year. That’s a lot of work!

Susse Jensen It’s a lot of work but I think also we try to be sort of as iterative as possible in our design, try to sort of bring a lot of people into our design process. I have found great success in building prototypes fast and then sharing them out fast. A big pillar of XD is design, prototype and share as frequently as you can because you get sort of feedback on your concept and you develop it. I’m able to invite my engineers into my design process as well as me being part of their development process. 

Jim Freeze It’s interesting to hear you talk about an iterative process because that’s exactly how we describe what we do with our customers when we build an Intelligent Virtual Assistant for them. It’s very much an iterative process.

Switching topics a little bit – this past spring, you spoke at an event in New York about gender representation in voice and visual design tools. Could you share some background on the issue of unconscious bias in voice experience and how you became more cognizant of it?

Susse Jensen Yeah absolutely. I spoke at the Noun Project. They have a series of workshops called Redefining Women. They do a series of different talks; the one that I participated in was about female representation in executive iconography. The Noun Project is a source for free icons. One of the problems they were trying to tackle was when you search for CEO, often the icon that comes up is a more masculine male representation of somebody in a suit, which doesn’t necessarily reflect the reality of what a CEO is. They asked me to come in and talk a little bit about gender representation in voice, which is something that we talk a lot about on our team. And so I think what’s been surfaced is that we have technologies that sort of have human characteristics to them, like we really quickly want to define if a voice is male or female, and we want to associate it with something that’s familiar.

What we try to do on our team is think of it as an interface and try very hard not to project some of the stereotypes onto it that are easy to jump to. Especially around this dilemma of a female assistant, that being a default representation of a voice interface and Alexa. We try to sort of separate it a little bit and see like how can we surface multiple voices and not necessarily have it be stereotyped into this. So one of the things that we do at Adobe is we use Polly–that’s our text-to-speech service that surfaces multiple voices and we don’t surface a gender in the drop-down menu when the user selects them. We just have the default name that’s assigned to it. But I think one of the things that I try to advocate for is that we need to invite more people into having opinions about what this is because I think we’re far from having determined what is right or wrong. I actually don’t think there is a right or wrong thing but I do think that there are a lot of things we could do better. And I think it’s going to be better if we invite people with different backgrounds into the process of designing what this looks like.

Jim Freeze It’s interesting because it sounds like that you’re almost advocating that design plays a critical role in addressing this issue.

Susse Jensen Very much because I think we’re very quick to jump to conclusions when it comes to–there’s a lot of bias in our technologies and especially as our technologies become more intelligent, we’ve seen that. And we’re very quick to sort of put the blame on the engineers–and there’s a huge problem in engineering having a lack of people of color representation but also female representation. And I think we have to remind ourselves that a lot of these decisions are just as much design decisions as they are engineering decisions, especially with voice. 

One good example was about Alexa being sort of–responding to when you were flirting with her, she was responding to it with a blushing comment and then they later went on and changed that. But I think that’s as much a design decision – you design your responses. That’s something you can go in and design before you hand it off to development and I think the good thing about design is you can invite people in, you can have conversations about what is it actually we want our interface to respond. I think also, it’s important for us as we sort of build teams to work with these technologies having it be a default that we sort of look around the room and say hey who do we have in the room here, are we all similar, are we all from similar backgrounds? Like is there – should we put a little more effort into actually going out and bringing some other folks in? I think that’s–that’s more of a humane thing than anything else to–that we need to learn and we need to really work a lot harder on solving for this.

Jim Freeze I couldn’t agree with you more. Another issue that’s top of mind I think in voice design—especially as AI-powered voices get more and more human-like—is transparency, and addressing the “uncanny valley,” the notion that technology is making things so human-like sometimes that there are some implications to that.  Do you think voice interface designers can play a role in taking some of the mystery out of human-machine interactions, especially those that are so human-like almost to the point of creating in the mind of some people like a “creepy factor?” How do you deal with that from a design perspective? 

Susse Jensen Yeah I hope so. I think it’s such an interesting design opportunity or design challenge. I think sort of what we’ve seen with chatbots is this blurry line of “Am I talking to a person? Am I talking to a chatbot now?” And it’s so good and you still get sort of emotional reactions even though it is a chatbot that you are talking to. In my opinion, it should be transparent. Who are you talking to? Are you talking to a chatbot that’s put together? Or are you talking to a human? Thinking about an experience where your digital experience is a combination of voice and screen-based design, your audio design, other elements of that. And then what stand do you take on transparency in this case? What does that look like for voice design? What does that look like for chatbots and for AI? I think it’s super interesting.

Jim Freeze Totally agree with you, our view at Interactions is that you absolutely want to be transparent. So we advise our customers, you know the last thing you want to do is try to trick your customers into thinking that they’re talking to a human. And it’s interesting because we have call recordings of our customers’ customers calling in and they think they’re interacting with a human. And they even ask sometimes, “Is this a human I’m talking to?” And you know we just think that the transparency, like “I can help you. I’m a virtual assistant. You can speak in your own natural language,” we think that kind of transparency is really important and something that I think you know brands would want to manifest just in terms of dealing with their customers. I’m glad to hear you talking about transparency. It’s an interesting point.

Susse Jensen And yeah, I think also just as human beings, like nobody wants to be led by or sort of like led to believe one thing was reality and then all of a sudden, discover that that was not the case–I think–sort of with chatbots, but also we’ve experienced that for many years with emails. Sort of like “Oh who’s emailing me now?” And the minute you get a scam that sort of slips by your scam folder–or your spam folder, you feel a little bit deceived and it’s not a pleasant feeling. Especially if this comes from a company or similar, it sort of leads to mistrust and it’s not a pleasant experience for your users.

Jim Freeze Totally agree. So you’re obviously–I’ve got one more question for you. Given your passion about voice interface design, if you think out five-ten years from now into the future, what’s your kinda dream for voice interface design?

Susse Jensen I hope–you know I actually had a conversation with my team about this not so long ago. I think my dream is that a lot of smaller interactions, like these little–little daily tasks that we do, will be done with voice but it won’t necessarily be that we have it be a person or an assistant we talk to but it will be a little bit more esoteric, like I will just do these small interactions with my voice. And my hope is that by doing that we will limit the screen-time that we’re using right now because I do believe that we spend way too much time looking at our screens. So I’m hoping that voice can kind of like take over some of the interactions that we use our screen for. And it’ll be a little seamless so it almost–I think the beauty of screen design right now is that a lot of the times we don’t think necessarily about the things that we do. And I’m hoping that the same thing would go for voice, that it will almost be invisible and it’ll seem seamless in our daily tasks. And I also hope that we’ll put some processes in to also get to define what privacy will look like for voice and how we solve for this because I think that’s a big factor that we still need to put a lot of work into. How do we sort of make sure that our data stays protected but also that it’s a safe interface and a safe way of interacting with our digital tools?

Jim Freeze That’s a great vision. I hope you’re right. It’s a fantastic vision for voice interface design. Susse, this has been fantastic. I’ve really enjoyed the conversation. And I can’t tell you how much I appreciate your willingness to be one of the episodes of our podcast. 

Susse Jensen Thank you, Jim. It was my pleasure.

Jim Freeze Once again, thank you, it’s been fantastic.

[MUSIC PLAYS QUIETLY] 

On the next episode of The ConversAItion, join us for a discussion on how AI and voice technology are impacting the entertainment industry, and what companies can learn from voiceover experts when casting the voice of their brand.

This episode of The ConversAItion podcast was recorded at the PRX Podcast Garage in Allston, Massachusetts, and produced by Interactions, a Massachusetts-based conversational AI company.  

That brings us to the end of today’s ConversAItion. I’m Jim Freeze, signing off. We’ll see you next time.

[UPBEAT MUSIC] 
