“Agent, agent, agent.”

If you are like me, your experience with most of the voice recognition systems offered by the airlines devolves quickly into a frustrated plea for human help. Still, while voice interaction is not yet at the point HAL 9000 promised by 1992, there’s little doubt that it has come a long way in the last 10-15 years.

Today, Yahoo announced the broad availability of voice-enabled search on the oneSearch mobile platform and a concurrent investment in vlingo, the partner providing the voice technology:

With the voice-enabled version of Yahoo! oneSearch, consumers can search for anything, including flight numbers, locations, Web site names, local restaurants, and more, by simply speaking…Whereas most mobile voice recognition systems are specific to vertical categories such as local listings, Yahoo! oneSearch with Voice lets consumers perform “wide open” searches – returning relevant results for practically every kind of query.

It continues a buzz of activity and investment in voice-recognition and text-to-speech that was capped by SpinVox’s recent $100m round and $500m valuation, but also includes some other notable events:

There’s a palpable mix of excitement and skepticism about the buzz. The virtual receptionist Wildfire that seduced the Valley in 1995 found a respectable exit to Orange / France Telecom in 2000, but the service had only 10,000 users when it was discontinued in 2005. TellMe did find $100 million in annual revenue powering corporate voice interaction systems for companies like FedEx, and an $800+ million exit to Microsoft in March 2007, but the technology never seemed to deliver even a fraction of what HAL promised. And Nuance has built a $4 billion market cap, but it has rolled up almost every company in the market to get there.

Today, with a new influx of investment and interest, I’m bullish on the applications that deliver a specific value in a narrow context, bearish on the broader applications that seek to be your voice window to the world.

Despite some persistent limitations and frustrations, several of the narrow and focused applications look like they will clear the bar for acceptability and be broadly adopted:

But the broader applications (like oneSearch by voice, vlingo, VoiceOnTheGo) will need to bite off small pieces of the puzzle (oneSearch and vlingo do this by limiting voice to the input and presenting results visually), or I expect that they’ll struggle.

As usual, it’s all about the user experience, and one of the biggest challenges is that the visual UIs that are de facto standards are improving so rapidly, especially on a 3G iPhone. We could use GOOG-411 while driving, but the predictive power of Google SMS makes it my preferred choice. We could use oneSearch by voice for local search, but iGoogle maps for the BlackBerry delivers more information and more value.

Bottom line: Despite large investments in voice recognition and text-to-speech applications in the next 18 months, and many new product launches by start-ups and established players, only a small set of narrowly focused services that clear a high user-experience bar and deliver targeted value will thrive.
