Text to speech IBM
What is IBM Watson text-to-speech?
IBM Watson text-to-speech is a cloud API service that turns written text into audio. It offers natural-sounding voices in multiple languages, and the voices can be customized. IBM uses neural speech synthesis to create these artificial voices. The service can be integrated into an existing app or used through Watson Assistant.
Possible use cases for this text-to-speech software include accessibility tools for people with vision impairment or other disabilities, reading texts and emails aloud to commuters, video voice-overs, educational reading tools, and home-automation systems.
In addition to text-to-speech, there are a variety of other language processing applications available through IBM Watson, including speech recognition software.
How text-to-speech works in IBM
To use IBM Watson text-to-speech, start by creating an IBM Cloud account, then enable the TTS service or any other available Watson speech service. You will be provided with a text box to input your desired text and a drop-down selection of voices. When you're ready, press play to hear your newly created audio. While this service is available in multiple languages, the input text must be in the same language as the desired output voice. All languages are also available in both male and female voices.
IBM uses neural speech synthesis to create a variety of natural-sounding voices. Neural speech synthesis is a form of machine learning in which a deep neural network is trained on audio samples of a live human voice. The trained model then uses what it has learned to synthesize natural-sounding speech, which is output as an audio file such as WAV. From the samples it can learn things such as appropriate inflection and intonation, which make the speech much easier for the listener to follow and process.
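For developers who want to call the service from an existing app rather than use the demo text box, the request can be sketched roughly as follows. This is a minimal illustration and not IBM's official client: it uses only the Python standard library, the helper name build_synthesize_request is made up for this example, the service URL and API key are placeholders that come from your own IBM Cloud service credentials, and the voice name is only an example. The function builds the HTTP request without sending it, so it can be inspected safely.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_synthesize_request(service_url, api_key, text,
                             voice="en-US_AllisonV3Voice"):
    """Build (but do not send) a POST request for the /v1/synthesize endpoint.

    service_url and api_key are placeholders: both come from the credentials
    of your own Text to Speech service instance. The voice is an example.
    """
    query = urllib.parse.urlencode({"voice": voice})
    endpoint = f"{service_url.rstrip('/')}/v1/synthesize?{query}"

    # IBM Cloud API keys are sent as HTTP basic auth with the user "apikey".
    credentials = f"apikey:{api_key}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")

    return urllib.request.Request(
        endpoint,
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "audio/wav",  # ask for WAV audio in the response
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Sending the request needs real credentials and network access, so it is
# left commented out here:
#
# req = build_synthesize_request(SERVICE_URL, API_KEY, "Hello from Watson")
# with urllib.request.urlopen(req) as response:
#     with open("hello.wav", "wb") as audio_file:
#         audio_file.write(response.read())
```

Because the input text must match the language of the chosen voice, an English voice like the one assumed above should only be given English text.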
Alternative to expensive TTS software
IBM Watson text-to-speech has three pricing levels. A free Lite plan is available, but it covers only up to 10,000 characters per month. The Standard plan costs $0.02 USD per thousand characters. A Premium plan is also available, but IBM must be contacted directly for pricing.
Several other providers offer text-to-speech API or SDK products.
One alternative, Amazon Polly, has pricing that starts at $4.00 per million characters; however, the price increases to $16.00 per million characters if the TTS uses the more natural-sounding neural voices. One million characters is approximately 23 hours of audio.
Microsoft offers text-to-speech as part of its Azure Cognitive Services, with quite a variety of pricing options, including one that also comes in at $16.00 USD per million characters for real-time synthesis using neural voices.
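To compare these prices on a common scale, the figures quoted above can be converted to a cost per million characters and a rough cost per hour of audio. The sketch below is only an illustration using the numbers in this article: it assumes the Azure figure is per million characters, ignores free tiers and volume discounts, and relies on the approximation that one million characters is about 23 hours of audio. The provider labels and function names are made up for this example.

```python
# USD per one million characters, taken from the figures quoted in this article.
PRICE_PER_MILLION = {
    "IBM Watson Standard": 0.02 * 1000,  # $0.02 per thousand characters
    "Amazon Polly standard": 4.00,
    "Amazon Polly neural": 16.00,
    "Azure neural": 16.00,  # assumed to be per million characters
}

# Approximation quoted in the article: 1,000,000 characters is about 23 hours.
HOURS_PER_MILLION_CHARS = 23


def cost_usd(provider, characters):
    """Estimated cost in USD to synthesize the given number of characters."""
    return PRICE_PER_MILLION[provider] * characters / 1_000_000


def cost_per_audio_hour(provider):
    """Rough cost in USD for one hour of generated audio."""
    return PRICE_PER_MILLION[provider] / HOURS_PER_MILLION_CHARS


for name in PRICE_PER_MILLION:
    print(f"{name}: ${cost_usd(name, 1_000_000):.2f} per million characters, "
          f"about ${cost_per_audio_hour(name):.2f} per hour of audio")
```

On these numbers, the IBM Standard plan works out to $20.00 per million characters, which is somewhat above the neural-voice prices quoted for Amazon Polly and Azure.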
Some more examples that may be worth investigating include Speechify, Murf, Play.ht, NaturalReader, Voice Dream Reader, Balabolka and Panopreter Basic. Below is a more in-depth look at one of these options that may best suit the average user.
Speechify TTS Reader
Speechify is an extremely user-friendly text-to-speech application with natural-sounding audio that lets users listen to documents, articles, PDFs, books, emails and even text messages. Speechify is available as an app, making it very simple to input text from various parts of your phone. It even syncs directly with certain apps and phone features, and it can read text from a photograph. Speechify offers a free basic version that reads at a rate of 220 words per minute in 10 standard voice options. This free version allows you to scan and listen to any printed text. You can save content and access it on multiple devices. Speechify is available on iOS, Android and the web.
As an upgrade, Speechify also offers a premium version for $139 per year, which averages out to $11.58 per month. With this version, you gain access to over 30 natural-sounding voices in more than 20 different languages. You can still scan and listen to any printed text, but with this version you can listen at speeds up to five times faster than with the free version. As a bonus, Speechify Premium offers powerful highlighting and note-taking tools and advanced skipping and importing capabilities.