Last reviewed: 3/23/2024 10:36:13 AM
Fine-tuning TTS With VoiceMarkupKit
Text-to-speech (TTS) markup is text with embedded indicators that control how the text is synthesized into speech. Speaking qualities such as speed, pitch, emphasis, and word pronunciation can be tailored when speech is produced from the text.
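To make the idea concrete, the following minimal sketch builds a small W3C SSML document in which one phrase is spoken slowly at a higher pitch and one word is emphasized. It uses only the Python standard library and is not the VoiceMarkupKit API; the function name `build_ssml` and its parameters are hypothetical, chosen for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML recommendation.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(lead, emphasized, rest, rate="slow", pitch="high"):
    """Return an SSML string (hypothetical helper, not part of VoiceMarkupKit).

    The whole phrase is wrapped in <prosody> to set rate and pitch, and
    one word is wrapped in <emphasis> so the synthesizer stresses it.
    """
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": "en-US"},
    )
    prosody = ET.SubElement(speak, "prosody", {"rate": rate, "pitch": pitch})
    prosody.text = lead                      # text before the emphasized word
    emphasis = ET.SubElement(prosody, "emphasis", {"level": "strong"})
    emphasis.text = emphasized               # the stressed word
    emphasis.tail = rest                     # text after the emphasized word
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Please arrive ", "before", " nine."))
```

The elements `prosody` and `emphasis` and the attribute values `slow`, `high`, and `strong` come from the W3C SSML specification. How closely a given synthesizer honors them varies by engine.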
Chant VoiceMarkupKit comprises software classes that handle the complexities of generating text-to-speech markup for various markup syntaxes. This enables you to tailor speech synthesis to the familiar dialects, speaking patterns, and accents of your end users. You can adjust TTS markup as needed for the synthesizer to enhance playback quality.
Synthesizers (i.e., speech APIs) interpret different markup syntaxes. VoiceMarkupKit supports the following markup syntaxes:
Speech API | Markup Syntax |
---|---|
Acapela TTS | AcaTTS Tags |
Cepstral Swift | W3C SSML |
CereProc CereVoice | W3C SSML, CereVoice Tagset |
Microsoft Azure Speech | Azure Speech SSML |
Microsoft SAPI 5 | SAPI 5 XML Markup, W3C SSML (SAPI 5.3+) |
Microsoft Speech Platform | W3C SSML |
Microsoft .NET System.Speech | W3C SSML |
Microsoft .NET Microsoft.Speech | W3C SSML |
Microsoft WindowsMedia (UWP and WinRT) | W3C SSML |
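The table matters because the same intent is written differently in each syntax. As a rough illustration, the sketch below renders "speak slowly and stress one word" in W3C SSML and in SAPI 5 XML markup. The function names are hypothetical and this is plain Python string formatting, not VoiceMarkupKit code. The tag names (`prosody` and `emphasis` for SSML, `rate` and `emph` for SAPI 5) are taken from the respective specifications, while pairing `rate="slow"` with `speed="-3"` is an assumption, since the two scales are not defined as equivalent.

```python
from xml.sax.saxutils import escape


def slow_emphasis_ssml(lead, word, rest):
    """W3C SSML version (hypothetical helper): <prosody> sets rate, <emphasis> stresses."""
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
        f'<prosody rate="slow">{escape(lead)}'
        f'<emphasis>{escape(word)}</emphasis>{escape(rest)}</prosody>'
        '</speak>'
    )


def slow_emphasis_sapi5(lead, word, rest):
    """SAPI 5 XML version (hypothetical helper): <rate> adjusts speed, <emph> stresses."""
    # speed is a relative adjustment; -3 is an assumed stand-in for "slow".
    return (
        f'<rate speed="-3">{escape(lead)}'
        f'<emph>{escape(word)}</emph>{escape(rest)}</rate>'
    )


if __name__ == "__main__":
    print(slow_emphasis_ssml("Turn ", "left", " now"))
    print(slow_emphasis_sapi5("Turn ", "left", " now"))
```

Both helpers escape the input text so that characters such as `&` and `<` do not break the markup.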
For more information about fine-tuning speech synthesis with VoiceMarkupKit, review the following topics: