“Agent, agent, agent.”
If you are like me, your experience with most of the voice recognition systems offered by the airlines devolves quickly into a frustrated plea for human help. Still, while voice interaction is not yet at the point HAL 9000 promised by 1992, there’s little doubt that it has come a long way in the last 10-15 years.
Today, Yahoo announced the broad availability of voice-enabled search on the oneSearch mobile platform and a concurrent investment in vlingo, the partner providing the voice technology:
With the voice-enabled version of Yahoo! oneSearch, consumers can search for anything, including flight numbers, locations, Web site names, local restaurants, and more, by simply speaking…Whereas most mobile voice recognition systems are specific to vertical categories such as local listings, Yahoo! oneSearch with Voice lets consumers perform “wide open” searches – returning relevant results for practically every kind of query.
It continues a buzz of activity and investment in voice-recognition and text-to-speech that was capped by SpinVox’s recent $100m round and $500m valuation, but also includes some other notable events:
- Nuance announced voicemail-to-email transcription on April 1
- YouMail added voicemail-to-email transcription on April 1
- Jott added the ability to reply to email and SMS by voice on March 29 (Computerworld)
- Microsoft showcased audible text messaging in Sync
There’s a palpable mix of excitement and skepticism about the buzz. The virtual receptionist Wildfire that seduced the Valley in 1995 found a respectable exit to Orange / France Telecom in 2000, but the service had only 10,000 users when it was discontinued in 2005. TellMe did find $100 million in annual revenue powering corporate voice interaction systems for companies like FedEx, and an $800+ million exit to Microsoft in March 2007, but the technology never seemed to deliver on a fraction of what HAL promised. And Nuance has built a $4 billion market cap, but they’ve rolled up almost every company in the market to get there.
Today, with a new influx of investment and interest, I’m bullish on the applications that deliver a specific value in a narrow context, and bearish on the broader applications that seek to be your voice window to the world.
Despite some persistent limitations and frustrations, several of the narrow and focused applications look like they will clear the bar for acceptability and be broadly adopted:
- SpinVox, Simulscribe and Nuance voicemail-to-email/SMS transcription
- GOOG-411
- Garmin voice directions
But the broader applications (like oneSearch by voice, vlingo, VoiceOnTheGo) will need to bite off small pieces of the puzzle (oneSearch and vlingo do this by limiting voice to the input and presenting results visually), or I expect that they’ll struggle.
As usual, it’s all about the user experience, and one of the biggest challenges is that the visual UIs that are de facto standards are improving so rapidly, especially on a 3G iPhone. We could use GOOG-411 while driving, but the predictive power of Google SMS makes it my preferred choice. We could use oneSearch by voice for local search, but iGoogle maps for the BlackBerry delivers more information and more value.
Bottom line: Despite large investments in voice recognition and text-to-speech applications in the next 18 months, and many new product launches by start-ups and established players, only a small set of narrowly focused services that clear a high user-experience bar and deliver targeted value will thrive.