November 18th, 2008

Google Voice Recognition / Voice Search API

Last night, Google launched a new version of their iPhone app that supports voice search. Instead of typing, you just hold the phone up to your ear and speak your search term, and Google returns matching results. It’s not perfect. I tried to search for my own name, Joe Lazarus, and the first attempt returned results for “trailblazer” (flattering, but not exactly what I was looking for). However, when I tried a second time and spoke more clearly, it worked like a charm.

Google should offer this technology as an API for third-party developers. Voice is a logical way to interact with mobile devices, but the technology required to support voice recognition is out of reach for all but a handful of companies. While the voice search interface is fairly simple, there is a lot of sophisticated technology running behind the scenes. As Andy Baio discovered, the app compresses the audio recording of your voice into a tiny file and sends it off to Google’s servers, which run some complex voice recognition analysis on the file and identify both the most likely phrase that you spoke and a number of possible alternatives… all in a matter of seconds.
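
To make that last step a little more concrete, here is a rough Python sketch of how an app might handle a result that contains one most likely phrase plus a few alternatives. Everything in it is an assumption for illustration: the `Hypothesis` class, the `confidence` score, the `rank_hypotheses` function, and the sample values are all invented here and are not taken from Google's actual format.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One candidate transcription (hypothetical shape, not Google's format)."""
    phrase: str
    confidence: float  # assumed score between 0 and 1; higher means more likely


def rank_hypotheses(hypotheses):
    """Return (best_phrase, alternative_phrases), ordered by confidence."""
    if not hypotheses:
        raise ValueError("recognizer returned no hypotheses")
    ranked = sorted(hypotheses, key=lambda h: h.confidence, reverse=True)
    return ranked[0].phrase, [h.phrase for h in ranked[1:]]


# Made-up sample values, for illustration only.
sample = [
    Hypothesis("joe lazarus", 0.41),
    Hypothesis("trailblazer", 0.47),
    Hypothesis("joe lazarus blog", 0.12),
]

best, alternatives = rank_hypotheses(sample)
print(best)
print(alternatives)
```

With a list of alternatives in hand, an app could show the top guess and let the user tap a runner-up when the first guess misses, rather than making them speak the whole phrase again.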

If Google offered a voice recognition API, third-party developers could use it to power voice-enabled interfaces for nearly any application. App developers for the iPhone, Android, and BlackBerry could allow people to fill in forms with their voice. Car GPS manufacturers could offer people an option to simply say an address or the name of a local business rather than fumbling with a clumsy keyboard while driving.

This sort of technology is perfectly suited to be a centralized platform. It’s prohibitively expensive and complex for most companies to develop in-house, yet nearly every device and software developer would benefit from voice features. Meanwhile, by offering this as an open web service, Google could become the de facto voice interface, not just for their own applications, but for any application using their API. Google is already forming a rich profile of our interests and behavior based on the keywords we type into their search engine. A voice recognition API would allow them to tap into our activities across any number of applications… our local business preferences, our driving patterns, our to-do list dictation, and so on. Clearly there would be some privacy concerns, but I’m sure those could be worked out over time.

What do you think?  If you have an iPhone, download the latest version of the Google Mobile App and try voice search for yourself.
