Secret tip: How to bring your Text to Speech voices to life

The biggest difference between a Text to Speech voice and a natural voice is that a Text to Speech voice will always pronounce the words in the same way.

There’s no way to say something with a particular intonation: to wheedle your mom into getting you an ice cream, to let someone know that you are happy, or to express your anger.

Well, that used to be the case. Here’s a secret not many people know: most voices that are included in our apps offer sounds and expressions. If you’re using Proloquo2Go, you may be familiar with its ExpressivePower™ feature, which provides an easy way to listen to and select expressions and sounds. These same sounds and expressions are available in Pictello and Proloquo4Text. Wrise users can also access the sounds and expressions through the Infovox iVox voices.
 
Normally, Text to Speech is created by combining word sounds. Acapela Group, the Text to Speech developer we work with, makes sure the voice talent reads a script that is designed to capture as many sounds and sound combinations as possible. When the database is done, almost any word can be spoken by combining those sounds. That is why words will always be pronounced without emotion. See the process in action in the video “Text to Speech, How does it work?”.
 
To make sure the sounds and expressions have a unique intonation, we had to step away from this method and record each expression individually. Expressions such as “Excellent!”, “Goodbye!” and “Stop it!” can now be pronounced with emotion. There are also many sounds available, like yawning, coughing and laughing. For our unique children’s voices we even took this a step further by recording sounds that kids use when playing. Need the sound of an ambulance or a barking dog? It’s all there! 
 
You may not have realized it, but before the release of our British English children’s voices back in 2012, no genuine children’s Text to Speech voices were available. The only available “children’s voices” were tweaked adult voices that sounded quite robotic. As the majority of our users are children who use an Augmentative and Alternative Communication (AAC) solution, we took on the challenge of developing genuine Text to Speech children’s voices together with Acapela Group. After all, a child using an AAC system to communicate deserves to sound just like their peers.

So how does it work? For expressions it’s fairly easy: you simply start with a capital letter and add an exclamation mark at the end of the expression, as in “Excellent!” or “Stop it!”. Remember, this does not work for every word or expression. It only works for the expressions we recorded individually. For sounds it’s a little bit trickier: you need to enter the sound code.
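
If you prepare text for a voice in a script, the expression rule can be sketched in a few lines of Python. This is only an illustration, not part of any of our apps or of Acapela Group’s software: the function name `as_expression` and the small `RECORDED_EXPRESSIONS` set are made up for the example, and it assumes the closing punctuation is the exclamation mark seen in examples such as “Excellent!”. The real list of recorded expressions depends on the voice and language.

```python
# Hypothetical set of individually recorded expressions.
# The real list differs per voice and per language.
RECORDED_EXPRESSIONS = {"excellent", "goodbye", "stop it"}


def as_expression(text, recorded=RECORDED_EXPRESSIONS):
    """Format text as an expression, or return None if it was not recorded."""
    # Drop surrounding whitespace and any trailing punctuation.
    core = text.strip().rstrip("!?.").strip()
    # Only individually recorded expressions get the special intonation.
    if core.lower() not in recorded:
        return None
    # Start with a capital letter and end with an exclamation mark.
    return core[0].upper() + core[1:] + "!"


print(as_expression("stop it"))  # Stop it!
print(as_expression("hello"))    # None
```

Returning `None` for anything outside the recorded set mirrors the reminder above: the trick does nothing for words that were not recorded individually.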

Which sounds and expressions are available varies from voice to voice, and from language to language. Older voices do not have sounds or expressions, but the children’s voices have many. Also keep in mind that every voice is different and has its own personality. Have a look at this Acapela Group document to find all the expressions and sounds per language and voice.

Not using any of our products yet but want to try them out? Demo Infovox iVox and Wrise for free!