Synchronizing Animation with TTS Playback
Last reviewed: 2/21/2008
HOW Article ID: H020801
The information in this article applies to:
- SpeechKit 5
You can synchronize screen display animation with synthesized speech playback.
The SpeechKit ChantTTS class enables you to register for phoneme, viseme, and visual synthesis events that provide the information needed to synchronize the audio with the visuals.
In SAPI 4, Microsoft defined a visual event that fires during the audio playback process of the synthesized speech. This event identifies the phoneme being spoken and values representing the mouth positions typical of enunciating that phoneme. The phoneme event provides the duration of the phoneme to indicate the playback time of generated audio for that sound. The visual and phoneme event information can be used by your application to synchronize screen display animation of characters, cursor movement, presentation flow, and other visual cues.
In SAPI 5, Microsoft dropped the visual event and replaced it with a viseme event. Visemes represent standard mouth positions based on an approach used by Walt Disney animators. The viseme event provides the current viseme value, its duration, and the next viseme value. Your application can use the current and next viseme values to produce smooth transitions during animation.
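The transition idea can be sketched as follows. This is a minimal illustration in Python, not SpeechKit code: the VisemeEvent record, the MOUTH_OPENNESS table and its values, and the blend_openness function are all hypothetical names introduced here, and the sketch assumes the duration is reported in milliseconds. It holds the current mouth shape for the first part of the viseme, then eases linearly toward the next shape.

```python
from dataclasses import dataclass


@dataclass
class VisemeEvent:
    """Hypothetical record holding the three values a viseme event reports."""
    current: int      # current viseme value
    next: int         # next viseme value
    duration_ms: int  # duration of the current viseme (milliseconds assumed)


# Hypothetical mapping from viseme value to how open the mouth is drawn
# (0.0 = closed, 1.0 = fully open). These numbers are illustrative only.
MOUTH_OPENNESS = {0: 0.0, 1: 0.8, 2: 1.0, 21: 0.0}


def blend_openness(event, elapsed_ms, blend_fraction=0.5):
    """Return the mouth openness to draw elapsed_ms into the current viseme.

    The current shape is held, then interpolated linearly toward the next
    shape over the final blend_fraction of the viseme's duration.
    """
    start = MOUTH_OPENNESS.get(event.current, 0.0)
    end = MOUTH_OPENNESS.get(event.next, 0.0)
    if event.duration_ms <= 0:
        return end
    # Clamp progress to the range 0.0..1.0 so late frames do not overshoot.
    progress = min(max(elapsed_ms / event.duration_ms, 0.0), 1.0)
    blend_start = 1.0 - blend_fraction
    if progress <= blend_start:
        return start
    t = (progress - blend_start) / blend_fraction
    return start + (end - start) * t
```

An animation loop would call blend_openness once per frame with the time elapsed since the event fired. A real character would blend full mouth shapes rather than a single number, but the timing logic is the same.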
Your application can use phoneme, viseme, and visual synthesis events for live or pre-synthesized audio playback. With SpeechKit, you can synthesize text and get the audio back in buffers, files, or streams, capture the event information, and then use that information to drive screen display animation when you play back the audio buffers, files, or streams.
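For pre-synthesized audio, one way to reuse captured events is to turn them into a timeline and look up the active viseme from the current playback position. The sketch below is a Python illustration under stated assumptions, not SpeechKit code: build_timeline and viseme_at are hypothetical helpers, events are assumed to have been saved as (viseme value, duration in milliseconds) pairs in the order they fired, and the first event is assumed to start at the beginning of the audio with no gaps between events.

```python
from bisect import bisect_right


def build_timeline(events):
    """Convert captured (viseme_value, duration_ms) pairs into start times.

    Returns (start_times, viseme_values, total_ms), where start_times[i] is
    the offset in milliseconds at which viseme_values[i] begins.
    """
    start_times = []
    viseme_values = []
    offset_ms = 0
    for viseme_value, duration_ms in events:
        start_times.append(offset_ms)
        viseme_values.append(viseme_value)
        offset_ms += duration_ms
    return start_times, viseme_values, offset_ms


def viseme_at(start_times, viseme_values, total_ms, position_ms):
    """Return the viseme active at position_ms, or None outside the audio."""
    if not start_times or position_ms < 0 or position_ms >= total_ms:
        return None
    # Find the last event whose start time is at or before position_ms.
    index = bisect_right(start_times, position_ms) - 1
    return viseme_values[index]
```

During playback, the application would query its audio device for the current position and pass it to viseme_at on each frame. Driving the lookup from the playback position, rather than from a separate timer, keeps the animation aligned even if playback is paused or stutters.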
To see what the phoneme, viseme, and visual synthesis events look like for the voices your applications use, start the Chant Developer Workbench and open Speech Synthesizers under File->Open. Select the voice you want. In the Command window, enter the following: StartPlayback "Show me the events" and press the Enter key.
Click the Events window and review the events and their values. You will see how many events are generated for each word as it is broken down into phonemes.
For more information about phonemes and synthesis events, review the SpeechKit and VoiceMarkupKit help files and contact Chant support via your Help Desk with any questions.