
Google unveils machine learning APIs for developers

By Daniel Kobialka Aug 2, 2016 11:34am

Google (NASDAQ: GOOG) has launched two machine learning application programming interfaces (APIs) into open beta that could make it easier for mobile developers to incorporate voice and language recognition capabilities into their apps.

Here's a closer look at the new machine learning APIs and what they could offer developers going forward.

What are Google's new machine learning APIs?

Two Google Cloud machine learning products are entering beta: the Cloud Natural Language API and the Cloud Speech API.

Google said the Cloud Natural Language API is based on natural language understanding technology and lets developers "reveal the structure and meaning of your text in a variety of languages, with initial support for English, Spanish and Japanese."

The API's features include:

Entity Recognition: Enables developers to identify people, organizations and other entities in a block of text and label them accordingly
Sentiment Analysis: Provides insights into the overall sentiment of a block of text
Syntax Analysis: Evaluates a sentence and parses it to define the structure and meaning of text
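
As a rough illustration, the three features above can each be treated as a separate request against the API. The sketch below only builds the request (URL and JSON body) and sends nothing. The `v1` endpoint paths and JSON field names are assumptions drawn from Google's later public REST documentation, not from this article, and may differ from the 2016 beta; `build_language_request` is a hypothetical helper name.

```python
import json

# Assumed base URL, taken from Google's later public REST docs (not this article).
BASE_URL = "https://language.googleapis.com/v1/documents"

# Map each feature named in the article to its assumed REST method name.
METHODS = {
    "entities": "analyzeEntities",
    "sentiment": "analyzeSentiment",
    "syntax": "analyzeSyntax",
}


def build_language_request(feature, text, language=None):
    """Return (url, json_body) for one analysis request; nothing is sent."""
    if feature not in METHODS:
        raise ValueError(f"unknown feature: {feature!r}")
    document = {"type": "PLAIN_TEXT", "content": text}
    if language is not None:
        document["language"] = language  # e.g. "en", "es" or "ja"
    body = {"document": document, "encodingType": "UTF8"}
    return f"{BASE_URL}:{METHODS[feature]}", json.dumps(body)


if __name__ == "__main__":
    url, body = build_language_request(
        "entities", "Google is based in Mountain View.", "en"
    )
    print(url)
    print(body)
```

A real call would also need credentials and an HTTP client, which are left out here to keep the sketch self-contained.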

Google's Cloud Speech API offers developers speech-to-text conversion in over 80 languages.

The API is based on voice recognition technology, and some of its features include asynchronous calling capabilities and "word hints" that allow developers to add custom words and phrases to API calls to improve recognition.
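
To make the "word hints" idea concrete, here is a minimal sketch of how custom phrases might be attached to a recognition request. It only builds the JSON body and sends nothing. The endpoint, the `speechContexts` field and the other field names are assumptions drawn from Google's later public REST documentation and may not match the 2016 beta; `build_speech_request` is a hypothetical helper name.

```python
import base64
import json

# Assumed endpoint, taken from Google's later public REST docs (not this article).
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_speech_request(audio_bytes, language_code, hints=(), sample_rate_hz=16000):
    """Return the JSON body for one synchronous recognition request; nothing is sent."""
    config = {
        "encoding": "LINEAR16",  # assumed input format: raw 16-bit PCM audio
        "sampleRateHertz": sample_rate_hz,
        "languageCode": language_code,
    }
    if hints:
        # "Word hints": custom words and phrases to bias recognition toward.
        config["speechContexts"] = [{"phrases": list(hints)}]
    # Audio is sent inline as base64 text in this sketch.
    audio = {"content": base64.b64encode(audio_bytes).decode("ascii")}
    return json.dumps({"config": config, "audio": audio})


if __name__ == "__main__":
    body = build_speech_request(b"\x00\x01", "en-US", hints=["VoiceBase"])
    print(RECOGNIZE_URL)
    print(body)
```

The asynchronous calling mentioned above would go through a different method and is not shown here.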

How can developers use these APIs?

To better understand how developers can use Google's new APIs, let's take a look at an example of how one company used Google's Cloud Speech API alpha to improve its speech analytics and predictive analytics products.

VoiceBase, a company that provides APIs for automatic speech-to-text, speech analytics and predictive insights, represents one of many businesses that could benefit from Google's new APIs.

The company has offered speech services via API for five years, but Google's new APIs may enable it to extend its reach and allow developers to incorporate voice and language recognition capabilities into their apps faster than ever before.

VoiceBase previously signed up for the Cloud Speech API alpha (along with more than 5,000 companies) and plans to use the beta release to offer voice transcription through Google, too.

"[VoiceBase] is going to offer transcription through Google or our own engine, depending on the language, with speech analytics and predictive analytics services layered on top of either transcript as an option," VoiceBase CEO Walter Bachtiger told FierceDeveloper in an interview. "Google is good at providing best-of-breed solutions across a broad range of products, or in this case a broad range of content and use cases, and we do expect them to be one of the biggest players in this industry in the future."

Google's new APIs also will allow VoiceBase to expand the language coverage of its speech analytics and predictive analytics to 80 languages.

"By rolling out the Google Speech API, Google added an important component to their cloud offering," Bachtiger said. "Surfacing the information in spoken content has been and will become critical in applications that handle voice. Looking at the various cloud offerings for speech – Google clearly has the most powerful solution among the big guys and that's one of the reasons we are excited to work with them and provide choices to developers."

Should developers incorporate voice and language recognition capabilities into their apps?

The Google Cloud Natural Language and Cloud Speech APIs highlight the potential of adding voice and language recognition capabilities to apps. As a result, these APIs could make a world of difference for mobile developers, particularly for those who want to boost their app engagement levels.

App engagement remains an ongoing struggle for developers, as recent data shows. For instance, Appboy's "Spring 2016 Mobile Customer Retention Report," which covered more than 300 apps worldwide, indicated that more than 75 percent of new users fail to return to an app the day after first use.

The report also showed the average app's retention rate declines over the first three months of use, reaching an average of 4.1 percent after 90 days.

[Image: Google's Cloud Speech API features context recognition to deliver better results.]

In contrast, imagine what end users could experience if an app is able to communicate with them and learn from them as well. Thanks to machine learning and Google's Cloud Natural Language and Cloud Speech APIs, developers may be able to move one step closer to transforming this dream into reality.

Ultimately, the combination of machine learning and Google's new APIs could help developers drive user engagement.

"There are already a million reasons for a user to leave your app – communicating with a human being shouldn't be one of them," Rob Spectre, chief developer evangelist at cloud communications company Twilio, told FierceDeveloper. "By connecting your users directly inside your app, not only are you reducing your communication costs, you're also keeping your users engaged."

Spectre pointed out that incorporating voice and language recognition capabilities into an app could also help developers stay ahead of their rivals.

"The bar for user experience on mobile continues to rise every year – seamlessness is becoming a base consumer expectation," he noted. "The developers that are staying ahead of that curve are the ones that provide an uninterrupted flow for everything their users want to do. Bringing that flow to your communications so users aren't forced to hop out of the app to communicate with a driver, courier or agent will be the new standard."

Second and third images courtesy of Google.
