Cloud Speech-to-Text

Speech-to-text conversion powered by machine learning and available for short-form or long-form audio.


Powerful speech recognition

Google Cloud Speech-to-Text enables developers to convert audio to text by applying powerful neural network models in an easy-to-use API. The API recognizes 120 languages and variants to support your global user base. You can enable voice command-and-control, transcribe audio from call centers, and more. It can process real-time streaming or prerecorded audio, using Google’s machine learning technology.
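As a rough illustration of what a request for prerecorded audio might look like, the sketch below assembles a JSON body in the shape used by the REST `speech:recognize` method. The helper name `build_recognize_request`, the sample bucket URI, and the default encoding and sample rate are assumptions made for illustration. Nothing is sent over the network.

```python
import json


def build_recognize_request(gcs_uri, language_code="en-US",
                            encoding="LINEAR16", sample_rate_hertz=16000):
    """Return a dict shaped like a speech:recognize request body.

    Hypothetical helper: it only assembles the payload and never calls the API.
    """
    if not gcs_uri.startswith("gs://"):
        raise ValueError("expected a Cloud Storage URI starting with gs://")
    return {
        "config": {
            "encoding": encoding,                  # audio encoding (assumed default)
            "sampleRateHertz": sample_rate_hertz,  # sample rate (assumed default)
            "languageCode": language_code,         # BCP-47 language tag
        },
        "audio": {"uri": gcs_uri},                 # prerecorded audio in Cloud Storage
    }


if __name__ == "__main__":
    # The bucket and object names are placeholders.
    body = build_recognize_request("gs://example-bucket/call.wav")
    print(json.dumps(body, indent=2))
```

Keeping the payload construction separate from the HTTP call makes it easy to inspect or unit-test the request before any audio is sent.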



Benefits

Automatic Speech Recognition
Automatic Speech Recognition (ASR), powered by deep learning neural networks, for applications such as voice search or speech transcription.
Noise Robustness
Handles noisy audio from many environments without requiring additional noise cancellation.
Global Vocabulary
Recognizes 120 languages and variants with an extensive vocabulary.
Auto-Detect Language
For multilingual scenarios, you can specify two to four language codes; Cloud Speech-to-Text identifies which of those languages is spoken and returns the transcript.
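The two-to-four language code limit can be sketched as a small validation helper. The function name `build_language_config` is hypothetical. The field names `languageCode` and `alternativeLanguageCodes` reflect one reading of the REST recognition config, with the first code treated as the primary language, and should be taken as assumptions rather than a confirmed contract.

```python
def build_language_config(language_codes):
    """Split a list of candidate languages into a primary code plus alternatives.

    Hypothetical helper: enforces the two-to-four code limit described above.
    """
    codes = list(language_codes)
    if not 2 <= len(codes) <= 4:
        raise ValueError("specify two to four language codes")
    if len(set(codes)) != len(codes):
        raise ValueError("language codes must be distinct")
    primary, *alternatives = codes
    return {
        "languageCode": primary,                   # assumed field name
        "alternativeLanguageCodes": alternatives,  # assumed field name
    }


if __name__ == "__main__":
    print(build_language_config(["en-US", "es-ES", "fr-FR"]))
```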

Features

Cloud Speech-to-Text API pricing

Cloud Speech-to-Text is priced per 15 seconds of audio processed after a 60-minute free tier. For details, please see our pricing guide.
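Standard models (all models except enhanced phone and video)
Speech Recognition (without Data Logging - default): 0-60 minutes, Free; over 60 minutes up to 1 million minutes, $0.006 / 15 seconds **
Speech Recognition (with Data Logging opt-in): 0-60 minutes, Free; over 60 minutes up to 1 million minutes, $0.004 / 15 seconds **

Premium models* (enhanced phone, video)
Speech Recognition (without Data Logging - default): 0-60 minutes, Free; over 60 minutes up to 1 million minutes, $0.009 / 15 seconds **
Speech Recognition (with Data Logging opt-in): 0-60 minutes, Free; over 60 minutes up to 1 million minutes, $0.006 / 15 seconds **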
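The pricing arithmetic can be worked through with a short estimator. This is a sketch under stated assumptions: the rates are copied from the table above, a partial 15-second increment is assumed to round up, and the free tier is applied once to the total audio duration. The names `estimate_cost`, `RATES`, and the tier labels are hypothetical, and the pricing guide remains the authority.

```python
import math

FREE_TIER_SECONDS = 60 * 60          # 60-minute free tier
INCREMENT_SECONDS = 15               # billing unit from the pricing text
MAX_SECONDS = 1_000_000 * 60         # the table covers up to 1 million minutes

# USD per 15-second increment, keyed by (model tier, data logging opt-in).
RATES = {
    ("standard", False): 0.006,
    ("standard", True): 0.004,
    ("premium", False): 0.009,
    ("premium", True): 0.006,
}


def estimate_cost(audio_seconds, model_tier="standard", data_logging=False):
    """Estimate the charge in USD for a total amount of processed audio."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    if audio_seconds > MAX_SECONDS:
        raise ValueError("usage is outside the range covered by the table")
    billable = max(0, audio_seconds - FREE_TIER_SECONDS)
    # Assumption: a partial increment is billed as a full one.
    increments = math.ceil(billable / INCREMENT_SECONDS)
    return round(increments * RATES[(model_tier, data_logging)], 3)


if __name__ == "__main__":
    # 100 minutes of audio on a standard model without data logging.
    print(estimate_cost(100 * 60))
```

For 100 minutes of audio, 40 minutes fall outside the free tier, which is 160 increments of 15 seconds; at $0.006 each that comes to $0.96 for a standard model without data logging.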
