Converting TTS audio format

Last reviewed: 12/15/2011

HOW Article ID: H121102

The information in this article applies to:

  • SpeechKit 7

Summary

SAPI 5 provides automatic audio conversion for TTS audio, but SAPI 4 and most native APIs do not. For these APIs, the ChantTTS class can convert the audio format automatically.

Voices generate audio in specific formats, such as 22kHz 16-bit mono. Applications may need the audio in a different format, such as 16kHz 16-bit mono to stream over the internet or 8kHz 8-bit mono to play back over telephony connections.
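As a rough illustration of why the target format matters, the sketch below computes the uncompressed PCM data rate for the three formats mentioned above. It assumes "22kHz" means 22,050 Hz, the usual rate behind that label. The class and method names are hypothetical and are not part of SpeechKit.

```csharp
using System;

// Hypothetical helper, not part of SpeechKit.
public static class PcmDataRate
{
    // Bytes per second of uncompressed PCM audio.
    public static int BytesPerSecond(int sampleRateHz, int bitsPerSample, int channels)
    {
        return sampleRateHz * (bitsPerSample / 8) * channels;
    }

    public static void Main()
    {
        // Assumption: "22kHz" is 22,050 Hz.
        Console.WriteLine(BytesPerSecond(22050, 16, 1)); // 44100
        Console.WriteLine(BytesPerSecond(16000, 16, 1)); // 32000
        Console.WriteLine(BytesPerSecond(8000, 8, 1));   // 8000
    }
}
```

Under these assumptions, 8kHz 8-bit mono needs less than a fifth of the bandwidth of 22kHz 16-bit mono.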

More Information

The ChantTTS StartPlayback method enables the application to generate synthesized speech in a variety of formats, regardless of the audio format the voice generates.

If the voice is a SAPI 5 voice, SAPI 5 converts the audio generated by the voice. For SAPI 4 and other native TTS APIs such as Acapela BabTTS, Cepstral SWIFT, Nuance RSSolo, and Nuance Vocalizer, the ChantTTS object converts the audio. The application can freely switch voices and still produce the same audio format for playback.
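To give a sense of what such a conversion involves, here is a minimal sketch that resamples 16-bit mono PCM by linear interpolation and reduces it to 8-bit unsigned samples (the 8-bit PCM convention in wave files). It is only an illustration under simplifying assumptions: it applies no low-pass filter, so downsampling may alias. It does not describe how ChantTTS or SAPI 5 perform conversion internally, and all names in it are hypothetical.

```csharp
using System;

// Hypothetical illustration, not the ChantTTS implementation.
public static class PcmConvertSketch
{
    // Resample mono 16-bit samples by linear interpolation.
    // No low-pass filter is applied, so downsampling may alias.
    public static short[] Resample(short[] input, int fromHz, int toHz)
    {
        if (input.Length == 0) return new short[0];
        int outLength = (int)((long)input.Length * toHz / fromHz);
        short[] output = new short[outLength];
        for (int i = 0; i < outLength; i++)
        {
            double pos = (double)i * fromHz / toHz;        // position in the input
            int i0 = (int)pos;                             // sample at or before pos
            int i1 = Math.Min(i0 + 1, input.Length - 1);   // next sample, clamped
            double frac = pos - i0;
            output[i] = (short)Math.Round(input[i0] * (1.0 - frac) + input[i1] * frac);
        }
        return output;
    }

    // Convert a signed 16-bit sample to an unsigned 8-bit sample.
    public static byte ToUnsigned8(short sample)
    {
        return (byte)((sample >> 8) + 128);
    }

    public static void Main()
    {
        short[] oneSecond = new short[22050];              // one second of silence
        short[] resampled = Resample(oneSecond, 22050, 8000);
        Console.WriteLine(resampled.Length);               // 8000
        Console.WriteLine(ToUnsigned8(0));                 // 128 (8-bit silence)
    }
}
```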

The following example illustrates using the StartPlayback method to synthesize speech and save the audio to a wave file in 16kHz 16-bit mono format.

// Instantiate ChantTTS object
NChantTTS NChantTTS1 = new NChantTTS();

// Synthesize text to speech and save to a wave file in 16kHz 16-bit mono format
NChantTTS1.StartPlayback(
    "Hello world", ChantPlaybackObject.CPOText,
    "myoutaudiofile.wav", ChantPlaybackResult.CPRFile,
    ChantAudioFormat.CAF16kHz16BitMono,
    ChantPlaybackStyle.CPSAsynchronous);

See the Chant SpeechKit 7 help file for more synthesis examples.