the ConversAItion: Season 5 Episode 32

Unpacking The Technology Behind User-Centric Conversational Design

Susan Hura is a renowned speech technology expert and user experience designer with more than 35 years of experience in the linguistics field. She also happens to be the Director of Conversational Design Services at Interactions. This week, Susan joins us on The ConversAItion to walk us through the evolution of speech technology, from phone trees to virtual assistants, the importance of user testing and why every conversation should be considered a contract. You can find Susan on Twitter @speechusability.

Listen on Google Podcasts, Spotify, RadioPublic, Stitcher, TuneIn or the podcatcher of your choice.

Read the transcript

“Every conversation is a contract between two people. We learn to be expert conversationalists before we can even walk, so when we talk about applications in technology being conversational, it has more to do with a gut feeling of playing by the rules. The technology can’t force me down a specific path; it just lets me answer the question without having to find the magic words to make it work.”

About Susan Hura

Dr. Susan Hura is a conversational user experience designer and strategist with 35+ years of experience in linguistics, user-centered design and speech technologies. As the Director of Conversation Design Services at Interactions, she leads a team creating transformational conversational experiences for global enterprises. Previously, Susan was Program Chair of the SpeechTEK Conference (2007-2016) and founding president of the Association for Conversational Interaction Design. She holds a Doctorate in Linguistics from the University of Texas at Austin, and a BA in Linguistics from Ohio State University.

Short on time? Here are 5 quick takeaways:

Speech technology has evolved tremendously over the last decade.

The customer service industry has been defined by limited speech technology for decades, with applications like phone trees and IVRs constrained by a finite list of words spoken by the customer. Anything beyond those specific phrases is a “no match,” meaning that the application can’t move the conversation forward. While technology has progressed into new applications, such as chatbots, it’s still very limited in its understanding of human conversation. So while customers can say, “I want to know the balance on my account” instead of “account balance,” they still have to use those specific keywords to progress the interaction.
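The constrained matching described above can be sketched in a few lines of Python. This is only an illustration, not how any particular IVR product works: the `GRAMMAR` table, the intent names and both matching functions are hypothetical.

```python
# Hypothetical fixed grammar: each intent lists the phrases the system listens for.
GRAMMAR = {
    "account_balance": {"account balance", "balance"},
    "make_payment": {"make a payment", "payment"},
}


def match_exact(utterance: str) -> str:
    """Succeed only when the caller says one of the listed phrases verbatim."""
    text = utterance.lower().strip()
    for intent, phrases in GRAMMAR.items():
        if text in phrases:
            return intent
    return "no_match"


def match_keywords(utterance: str) -> str:
    """Slightly more forgiving: succeed when every word of a listed phrase
    appears somewhere in the utterance. The keywords are still required."""
    words = set(utterance.lower().replace(",", " ").replace(".", " ").split())
    for intent, phrases in GRAMMAR.items():
        if any(set(phrase.split()) <= words for phrase in phrases):
            return intent
    return "no_match"


print(match_exact("I want to know the balance on my account"))
print(match_keywords("I want to know the balance on my account"))
print(match_keywords("How much money do I have"))
```

The last call shows the limitation: a caller who never says a listed keyword still ends up in a no match, even though the request is perfectly clear to a human listener.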

But things have changed for the better, particularly in the past few years. Susan describes today’s speech technology landscape as a “brave new world,” in which users can speak freely and in their own words. A modern conversational application has the ability to understand exactly what users are saying, what they mean and what kind of service they need, illustrating just how far the industry has come.

Developers must look beyond traditional linguistics to account for the nuances of conversation.

While speech technology has seen ample progress in recent years, there are still shortcomings in today’s applications. Susan believes these are in large part due to a failure to understand how conversation between two human beings actually works and how we use language to get things done in the real world. According to Susan, not many linguists actually spend a lot of time studying conversations, which has created a knowledge gap. Why? Linguistics concerns what we know as native speakers of a language, while conversation covers how people interact with one another.

Users can, and should, be a part of the design process.

User testing is a simple, yet important step in developing conversational technology—but it’s often overlooked. The process involves recruiting a set of representative users to interact with a realistic version of an application, observing their reactions, patterns, and thought processes, and asking them questions about their experience. While designers are deeply involved in the development of the technology from start to finish, users are far more removed from the process—so they’re able to objectively judge an application and reveal any assumptions that may have been made during the design process.

User testing ultimately allows designers to get a fresh perspective on which elements should be tweaked in the application’s next iteration to optimize the experience. Susan has run hundreds of tests with thousands of participants over the course of her career, and has never regretted any of them.

A good conversational application feels intuitive and comfortable for users.

Susan likes to think of every conversation as a contract, since the people involved are implicitly promising to listen, understand and give a relevant response to one another. While we don’t explicitly list those rules, we quickly become aware of them if one is broken. Consider when an IVR states, “There was a payment of $127 from your checking account ending in 1234. Is that right?” We might want to correct it by saying, “No, the payment was $137, not $127.” But an inflexible system would likely respond, “I didn’t catch that. Was this information right? Please say yes or no,” which keeps us from giving it the correct answer.
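As a rough sketch of the more cooperative behavior, the Python below accepts a correction inside the caller's reply instead of insisting on a bare yes or no. The function name, the status labels and the tiny word lists are all assumptions made for illustration, and a real system would rely on natural language understanding rather than regular expressions.

```python
import re


def handle_confirmation(reply: str, proposed_amount: int) -> dict:
    """Interpret a reply to 'There was a payment of $X. Is that right?'"""
    text = reply.lower()
    # Any dollar amount in the reply that differs from the proposed one
    # is treated as the caller's correction.
    amounts = [int(a) for a in re.findall(r"\$?(\d+)", text)]
    corrected = next((a for a in amounts if a != proposed_amount), None)
    words = set(re.findall(r"[a-z]+", text))

    if corrected is not None:
        return {"status": "corrected", "amount": corrected}
    if words & {"no", "nope", "not"}:
        # Rejected without a correction: the system should ask for the right amount.
        return {"status": "rejected", "amount": None}
    if words & {"yes", "yeah", "correct"}:
        return {"status": "confirmed", "amount": proposed_amount}
    return {"status": "unclear", "amount": None}


print(handle_confirmation("No, the payment was $137, not $127.", 127))
```

With this shape, the reply from the example is handled in one turn: the system picks up $137 as the corrected amount rather than re-prompting for yes or no.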

Language comes to us naturally. We begin to learn it before we can even walk. So, good conversational technology should be just as intuitive, comfortable and collaborative as a conversation with another human to fulfill its end of the bargain. It shouldn’t force people to use specific words or answer narrow questions to get an answer.

The future of conversational technology is multimodal and flexible.

Susan is optimistic about what the future of conversational technology holds for customers. First, she anticipates a rise in multimodal conversations that are able to transition seamlessly from voice to chat and back again, without losing the thread. This would effectively mimic the human ability to stop and write something down before going back to speaking. While this capability has been in the works for years, we have yet to see it in practice. Today’s advanced technology may allow customers to finally experience it.

Susan also expects greater emphasis on advanced dialogue, which allows for more flexible interactions. For instance, if a caller is trying to make a dinner reservation, there are several components that comprise the request: the location, the number of people, the date and the time. Advanced dialogue technologies will be able to simultaneously collect multiple pieces of information in any order, enabling the caller to take a nonlinear yet streamlined path through the conversation.
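The reservation example is often described as slot filling: the application tracks which pieces of information it already has and asks only for what is missing. The Python below is a minimal sketch under that assumption. The slot names, the prompts and the toy regular expressions are hypothetical and would be replaced by a real language understanding component.

```python
import re

# Toy patterns, one per slot. They only cover phrasings like the example below.
SLOT_PATTERNS = {
    "location": r"\bat ([A-Z][A-Za-z' ]*?)(?= for\b| on\b| at\b|[,.]|$)",
    "party_size": r"\bfor (\w+) (?:people|person)\b",
    "date": r"\bon ((?:January|February|March|April|May|June|July|August"
            r"|September|October|November|December) \d{1,2}(?:st|nd|rd|th)?)",
    "time": r"\bat (\d{1,2}(?::\d{2})? ?(?:AM|PM))",
}

PROMPTS = {
    "location": "Which restaurant would you like?",
    "party_size": "How many people?",
    "date": "What date?",
    "time": "What time?",
}


def extract_slots(utterance: str) -> dict:
    """Pull out every slot value that appears in the utterance, in any order."""
    found = {}
    for slot, pattern in SLOT_PATTERNS.items():
        match = re.search(pattern, utterance)
        if match:
            found[slot] = match.group(1)
    return found


def next_prompt(slots: dict):
    """Return the question for the first missing slot, or None when complete."""
    for slot in SLOT_PATTERNS:
        if slot not in slots:
            return PROMPTS[slot]
    return None


full = ("I want to make a reservation at Main Street Cafe "
        "for four people on November 24th, at 7:00 PM")
print(extract_slots(full))
print(next_prompt(extract_slots("I want to make a reservation")))
```

Both openings go through the same two functions: the detailed request fills every slot at once, while the bare request simply leads to the first question.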

TRANSCRIPT

EPISODE 32: SUSAN HURA

Jim Freeze Hi! And welcome. I’m Jim Freeze, and this is The ConversAItion, a podcast airing viewpoints on the impact of artificial intelligence on business and society.

[UPBEAT MUSIC]

On today’s episode, we’re doing something a bit different. I’ll be speaking with Dr. Susan Hura, renowned speech technology expert and Director of Conversational Design Services at the company I work for, Interactions. From academia, to entrepreneurship, to consulting, Susan has been deeply entrenched in the linguistics field for over 35 years, working closely with both businesses and consumers to understand and build conversational experiences.

Today, she’ll give us the full download on how AI-based speech technology has evolved over time, why and how businesses can incorporate user behavior and feedback into conversation design, and what a truly conversational experience looks like.

Susan, welcome to the show! We’re thrilled to have you. Over 4 seasons, you are the 1st Interactions co-worker that we’ve had on The ConversAItion!

Susan Hura That’s amazing, Jim, thank you so much. I am, I am super excited to be here and chat with you.

Jim Freeze Well, I’m super excited to have a fellow Interactions employee on The ConversAItion. So you’ve had a pretty impressive career in speech technology and conversation design working with businesses to help them better communicate with their customers, not to mention that you, like me, are also a graduate of the Ohio State University. Can you walk us through your experiences and what drew you to Interactions?

Susan Hura Sure. I’d be happy to do that. I’ll try to give you the short version. So I actually have been kind of obsessed with speech technologies for as long as I can remember. I actually read an article about computer speech recognition when I was still in high school. And I wrote away to MIT. This was when you had to write a physical letter to get them to send you a paper. And they sent me a bunch of stuff, and I was like, wow, I understand these individual words, but I have no idea what it means, except that it was the coolest thing I’d ever heard of. And that eventually led me to getting a degree in linguistics. So unlike a lot of linguists, I didn’t do it out of pure love of language, but because I always had this eye on conversational technologies. So I spent some time there in academia. I spent some time after that at Lucent Bell Labs, which is academia light, and since then I have been so fortunate to be able to make a living as a linguist of all things, working with these technologies that I am still super excited about.

Jim Freeze That’s fantastic. And obviously, the speech technology industry has really evolved tremendously over the last decade, from IVRs to chatbots, to virtual assistants. Can you walk us through these changes and how it has evolved into what we have today?

Susan Hura Sure, speech technologies that have been commercially available, right? Ones that have been ready for prime time, so that organizations can use it to communicate with their customers. They have indeed come a really long way. We used to only have the ability to do very limited kinds of recognition. Where to build a customer service application, what we had to do was basically instruct the caller: here are the words that you can say. And then we would have to, in the application, code up and say, listen for this set of words.

Jim Freeze And we’re all very familiar with that, aren’t we?

Susan Hura Absolutely. You can say A, B or C and anything that fell too far outside of that was just treated as a no match. Something that we had no idea how to deal with. And yes, we progressed over time so that if, if the user said, I want the balance on my account instead of my account balance, we eventually got a little better, but it was really constrained. And so the name of the game in conversation design back then was how to bring the user in and ask questions so that they would give us exactly the set of responses that we were expecting, because otherwise we couldn’t help them.

And now of course, we’ve entered this brave new world where we really have the ability to let users speak to us however they choose. They can describe their issues in their own words. And now, through the amazing natural language technologies that we have, we are able to understand not just what they’re saying, but what they mean and then take that to use, to provide them with a really superior kind of service.

Jim Freeze There’s no doubt about it. Where does the industry still fall short and what steps should we take to move towards better conversation design?

Susan Hura That’s a really good question. I think that some of the big failings are really because of a failure to understand how conversation between two human beings actually works. So when I say conversation, I’m not talking about a lack of natural language processing abilities, because it’s not the language I’m talking about, but how we, as people, use language to get stuff done in the real world. And, and to be honest, Jim, not a lot of linguists actually spend a ton of time studying conversations. So linguistics tends to be concerned with what you know when you are a native speaker of language, but conversation is not just what you know, it’s about how you interact with someone else. And I think that’s a big gap in the entire field is that there’s not a lot of people who have spent a lot of time understanding how human conversations work.

Jim Freeze Actually I think it probably has had an impact on how you view things from a philosophy perspective. And what I mean by that is you’re a big proponent of working with users to better understand how they want to communicate with companies. Can you talk about that process and what steps businesses should take to incorporate that user design concept into communications and their strategy?

Susan Hura Yeah, absolutely. Just jump in here and stop me because as I’m sure you know, Jim, this is like my very favorite thing to talk about.

Jim Freeze I know, I know. And, and it’s important. It’s very important.

Susan Hura It is. And you know, here’s the thing: user research as a concept is not anything that is specific to conversation. It’s certainly nothing that I came up with. And when you talk about what does it mean to do user research? It sounds deceptively simple, right? Oh, we’re trying to build an application that lets users do this certain set of things. How do we know if it works? Well, you recruit a set of representative users to interact with a realistic version of an application and then you observe what they do and ask them what they think about it. I mean that, that’s the kind of idea that underlies a user centric design practice, but it also is something that passes the grandma test. You could explain to just about anyone that yeah, we want to know if this works for users. So we ask them to try it and tell us what they think.

Right. So it sounds so basic, but the kinds of insights that you can get out of that are tremendous. And it’s really there, the combination of observing what people do when they’re interacting with an application and then that ability to say, Hey, what were you thinking when that happened? That’s what gives you the true power. There’s lots of ways of observing what users do, right? Every organization collects some kind of data about how people are interacting with applications. But the problem there is you can see patterns sometimes, but you don’t necessarily know what they mean. What was the user’s motivation? What was going on in their head when it happened? And so that’s the beauty of usability testing is we can see stuff happen and ask people about it.

So by getting that opinion feedback, what we’re able to do is prioritize the kinds of observations that we’re making in terms of how people interact. It’s only through that prioritization based on how much it matters to the end user that we’re then able to go back and say, yes, this is something that we will definitely fix. This is a nice to have, and this is a next release when we’re able, we’ll make this optimization. So it’s a super powerful set of tools that, again, on the face of it seems really simple.

Jim Freeze Now you’re right. It does seem really simple, but it’s in products we interact with on a daily basis that it becomes pretty obvious when that kind of a process of involving users hasn’t been done. I think my favorite experience with this is, I don’t know, it’s like 10 years ago, I’m in a rental car and I’m looking for a button inside the car to open the trunk. I couldn’t find it. It wasn’t in all the logical places and it ended up, it was in the glove box and I’m like, who thought that was a logical place? You know, it’s pretty clear that no user was ever involved in that decision. So, I think what you’re talking about applying those concepts to design is really important, and I’m sure there are times that you’ve done usability testing and been really surprised by the results.

Susan Hura It happens every single time, Jim.

Jim Freeze Every single time.

Susan Hura I have run hundreds of tests with thousands of participants over my career. I stopped counting at around 1500 participants, and never once at the end of a usability test did I say, eh, you know, wait, that time might have been better spent doing something else. We always learned something because here’s the thing, no matter how expert your designers are when they’re building an application, no matter what best practices you follow, no matter how much you understand about an organization, you are not the end user as the designer. Any of us who are involved in building these experiences, we know too much. We can’t, in some ways, think like the average person who’s going to interact with the application. And that’s why testing is so valuable is that you bring in people who haven’t been involved in figuring out how to solve the problems, so that they can reveal essentially assumptions that we may have made, places where we weren’t able to see with clarity, what the right path forward was simply because of the position of being the ones who are building it.

Jim Freeze It makes total sense. And it actually, you know, I’ve heard you say before that conversational experiences aren’t what we typically think they are. And so, from your perspective, what makes a truly conversational experience?

Susan Hura Yeah, it’s funny because conversational is a term that I hear a lot during user testing. So when people like an experience with one of our apps, they’ll say, yeah, that was really conversational. But I don’t think it means what you might think it means from the outside. So a lot of times when people say, oh, this app is super conversational, they tend to equate that with it being more casual and chatty, and somewhat less formal than you might expect. But I actually don’t think that’s what users are referring to when they describe something as conversational. I think what they mean is they feel comfortable that interacting with this application is intuitive. When a conversational application feels intuitive and it feels comfortable, I think what that means is that we’ve built the application in a way that the app is fulfilling its end of the bargain.

So every time we have a conversation like you and I are doing here today, we are essentially playing according to a rule book. I say that every conversation is a contract between the people in the conversation. What’s the contract? Well, you’re essentially promising to pay attention, to listen, to try to understand what the other person’s saying, and then give a timely and relevant response. Now we’re not aware of those rules as we’re talking to another person until somebody breaks a rule. And that’s because we’re all able to do conversation. We all learn to be expert conversationalists before we could even walk around. Right. We are all great at conversation as toddlers. But so, so you know, these applications, when we talk about them being conversational, it has more to do with that gut feeling of oh yeah, this played by the rules. This didn’t force me down a specific path. This let me just answer the question and not think about what I have to say in order to find the magic words to make this thing work.

Jim Freeze You’re hitting on a key concept, which is letting people use their own words to communicate what they want. And simply being able to understand that. And it’s simple but it’s really profound to be able to have a technology that’s designed in a way that truly understands the intent of the speaker.

Susan Hura It’s also about, you know, just behaving in a way that conforms to these rules of conversation that we all play by. Here’s one of my favorite examples: most of us have had this kind of experience in a bad old IVR system where the application comes back and it’s confirming some details with you. So it might be a payment of $127 from your checking account ending in 1234. Is that right?

If you, as the user, notice something that’s wrong in what was just said, the rules of human conversations say the right thing for you to do, is to offer that correction. Yeah. Right. So the right thing to do is say, “oh, no, it was $137.” But in bad, old IVR days, you might get, I’m sorry, was this information correct? Please say yes or no.

Jim Freeze Yeah. Yeah. Please speak like a robot.

Susan Hura It’s not the rules of conversation, right? That’s the rules of bad old IVR. What you’re supposed to do, what you’re supposed to do is not just say, “nope, I don’t confirm that.” And then say nothing else. That’s distinctly unhelpful to the conversation. Right? If you view conversation as something cooperative where the two people involved are trying to get to a certain outcome, then the helpful thing to do is to say, no, it’s 137. So a truly conversational experience is one that is number one, aware that that is the right thing to do. And that we build our applications to accommodate those things and allow people to say the things that just come naturally to them.

Jim Freeze That’s a great example. I do have one last question for you.

Susan Hura Sure.

Jim Freeze We like you to, to look into the future five to 10 years. How do you think the industry is going to progress towards achieving more grounded, more natural, truly conversational experiences?

Susan Hura So I’m going to make predictions in sort of two directions. Although I think they’re really very connected. The one thing that I think is going to take off is conversations that occur in more than one modality. So conversations that can move seamlessly from voice to chat, maybe back to voice, without losing the thread of what you were talking about.

Jim Freeze Yep.

Susan Hura People have been talking about this for 15 years now.

Jim Freeze But none of us ever experience it. Right?

Susan Hura No, never. It almost never happens. But I think that there is the number one, the technology, I think is finally to a point where we can make that happen. But more importantly, we’ve got some sophistication in the way that we think about designing conversations that would actually enable that to happen. I mean, with human beings, we do this all the time. You know, it’s like, “Oh, let me jot that down for you. Let me draw you a map.” Right. We can do it in human conversations. I think we may finally be getting there in these automated conversations. The other thing is truly some of the advanced dialogue technologies that we’ve got so that we can enable even more flexibility in these automated conversations. An advanced dialogue, for example, can allow you to take more, I guess I would say nonlinear paths through a conversation.

So being able to build an application where we know we need to collect these five pieces of information from the user in order to say, make this reservation that they’re trying to make. But to build it in such a way that we are able to take anything from, I want to make a reservation, or I want to make a reservation at Main Street Cafe for four people on November 24th, at 7:00 PM that we could handle in the automated conversation, either one of those equally gracefully. So I think some of the advanced dialogue technologies are finally going to allow us to do that in a way that is more streamlined. And that is easier to maintain in a conversational application.

Jim Freeze You speak about a very exciting future, and hopefully not 10 years out, but in the very near future. Susan, thank you so much. This has been fascinating. I really appreciate you being the first Interactions guest on The ConversAItion.

Susan Hura Well, you’re welcome, Jim.

[UPBEAT MUSIC]

That’s all for this episode of The ConversAItion. In our next episode, I’ll be speaking with Andrew Giessel, Director of AI and Data Science at Moderna. Andrew will discuss the digital-first culture at this leading pharma company, and how they leverage AI to streamline key systems and processes, including the development of one of the first COVID-19 vaccines.

This episode of The ConversAItion podcast was produced by Interactions, a conversational AI company. I’m Jim Freeze, and we’ll see you next time.

Check out more episodes of The ConversAItion.
