When was the last time you tried to use the speech recognition feature on your phone or in your car? Maybe it was to ask your GPS program for directions, to place a call without taking your eyes off the road, or to check on traffic ahead.
Did it work? If your experience is like mine, you probably had marginal success. For example, my GPS hears me correctly only about 50 percent of the time and struggles with street or town names. Admittedly, my Boston accent, coupled with the noisy environment of my car, doesn’t make things easier. Having been forced to pull over on several occasions to manually input my destination, I now make it a point to enter the location into my GPS before I hit the road.
According to a recent study by AAA on the potential safety risks of hands-free systems for vehicles, a detour in your morning commute could be the least of your concerns. The report found that voice-command systems can cause distracted driving, even if a driver’s eyes are on the road and both hands are on the wheel. This recent phenomenon, called “inattention blindness,” is as dangerous as manual texting while driving.
Common barriers like these have marginalized the value of voice recognition technology in smart devices, relegating it largely to entertainment and infotainment. Look no further than the Google Now commercial where a child asks if dogs can dream, or the new Amazon Echo interactive speaker, which serves as a nifty way to dictate notes or ask about the weather (and, more importantly, to add things to your Amazon shopping list). When it comes to the important things, most people opt for solutions that are sure to work every time.
Consumer research backs this up, too: according to a recent study by Affinnova, 41 percent of Americans feel strongly that the smart products they’ve seen or heard about are gimmicky. More than half say they won’t upgrade to a smart product until the maker can prove it has value beyond novelty.
For smart devices to take off, we need to focus on the interface. This means speech recognition that works regardless of background noise, accents or barking dogs, and natural language understanding that grasps not just what’s being said or typed, but the intent behind it. Whether via speech, text or touch, the interface is the critical link that can turn a novelty into a must-have.
To continue reading, head to WIRED Innovation Insights for the full article.