How do I manage speech synthesis directly with the synthesizer?
Last reviewed: 7/8/2022
HOW Article ID: H072204
The information in this article applies to:
- SpeechKit 11
Summary
Optimize speech synthesis applications by managing the synthesizer directly.
More Information
A speech synthesizer converts text to speech and produces audio bytes for playback or persistence. It also generates events that indicate processing states.
The Microsoft Speech API (SAPI5) runtime is a part of Windows that gives applications control over the playback and processing of a synthesizer's audio bytes and events. It optionally provides streaming audio playback and time-sequenced event posting. Microsoft includes synthesizers in many Windows SKUs.
Synthesizers from other speech technology vendors typically render the audio bytes and events but rely on applications, SAPI5, or both to handle playback and event processing. Some support the Microsoft runtime, and all provide their own proprietary speech API with an SDK and runtimes.
SpeechKit provides common speech synthesis management for multiple application scenarios across the various speech technology APIs by managing speech synthesis directly with the synthesizer.
SpeechKit includes libraries for the following Speech APIs for speech synthesis:
Speech API | Platforms |
---|---|
Apple AVFoundation | ARM, x64, x86 |
Cepstral Swift | x64, x86 |
CereProc CereVoice | x64, x86 |
Google android.speech.tts | ARM |
Microsoft SAPI 5 | x64, x86 |
Microsoft Speech Platform | x64, x86 |
Microsoft .NET System.Speech | x64, x86 |
Microsoft .NET Microsoft.Speech | x64, x86 |
Microsoft WindowsMedia (UWP) | ARM, x64, x86 |
Microsoft WindowsMedia (WinRT) | x64, x86 |
Libraries for the most popular synthesizer speech APIs are included in Chant Developer Workbench. For additional libraries that support different APIs, runtimes, versions, and vendors, contact Chant Support.
With SpeechKit, a single request synthesizes speech for either playback or persistence.
// Create Synthesizer
_Synthesizer = _SpeechKit.createAndroidSynthesizer();
// Synthesize speech for playback
_Synthesizer.speak("Hello world.");
// Create Acapela TTS Synthesizer
_Synthesizer = _SpeechKit.CreateAcaTTSSynthesizer();
// Or, Create CereProc CereVoice Synthesizer
_Synthesizer = _SpeechKit.CreateCereVoiceSynthesizer();
// Or, Create Microsoft Speech Platform Synthesizer
_Synthesizer = _SpeechKit.CreateMSPSynthesizer();
// Or, Create Microsoft SAPI5 (Desktop) Synthesizer
_Synthesizer = _SpeechKit.CreateSAPI5Synthesizer();
// Or, Create Cepstral Synthesizer
_Synthesizer = _SpeechKit.CreateSwiftSynthesizer();
// Or, Create Microsoft WindowsMedia Synthesizer
_Synthesizer = _SpeechKit.CreateWindowsMediaSynthesizer();
// Synthesize speech for playback
_Synthesizer.Speak("Hello world.");
// Free the synthesizer
_Synthesizer.Dispose();
// Create Acapela TTS Synthesizer
_Synthesizer = _SpeechKit->CreateAcaTTSSynthesizer();
// Or, Create CereProc CereVoice Synthesizer
_Synthesizer = _SpeechKit->CreateCereVoiceSynthesizer();
// Or, Create Microsoft Speech Platform Synthesizer
_Synthesizer = _SpeechKit->CreateMSPSynthesizer();
// Or, Create Microsoft SAPI5 (Desktop) Synthesizer
_Synthesizer = _SpeechKit->CreateSAPI5Synthesizer();
// Or, Create Cepstral Synthesizer
_Synthesizer = _SpeechKit->CreateSwiftSynthesizer();
// Or, Create Microsoft WindowsMedia Synthesizer
_Synthesizer = _SpeechKit->CreateWindowsMediaSynthesizer();
// Synthesize speech for playback
_Synthesizer->Speak(L"Hello world.");
// Free the synthesizer
delete _Synthesizer;
// Create Acapela TTS Synthesizer
_Synthesizer = _SpeechKit->CreateAcaTTSSynthesizer();
// Or, Create CereProc CereVoice Synthesizer
_Synthesizer = _SpeechKit->CreateCereVoiceSynthesizer();
// Or, Create Microsoft Speech Platform Synthesizer
_Synthesizer = _SpeechKit->CreateMSPSynthesizer();
// Or, Create Microsoft SAPI5 (Desktop) Synthesizer
_Synthesizer = _SpeechKit->CreateSAPI5Synthesizer();
// Or, Create Cepstral Synthesizer
_Synthesizer = _SpeechKit->CreateSwiftSynthesizer();
// Or, Create Microsoft WindowsMedia Synthesizer
_Synthesizer = _SpeechKit->CreateWindowsMediaSynthesizer();
// Synthesize speech for playback
_Synthesizer->Speak("Hello world.");
// Free the synthesizer
delete _Synthesizer;
// Create Acapela TTS Synthesizer
_Synthesizer := _SpeechKit.CreateAcaTTSSynthesizer();
// Or, Create CereProc CereVoice Synthesizer
_Synthesizer := _SpeechKit.CreateCereVoiceSynthesizer();
// Or, Create Microsoft Speech Platform Synthesizer
_Synthesizer := _SpeechKit.CreateMSPSynthesizer();
// Or, Create Microsoft SAPI5 (Desktop) Synthesizer
_Synthesizer := _SpeechKit.CreateSAPI5Synthesizer();
// Or, Create Cepstral Synthesizer
_Synthesizer := _SpeechKit.CreateSwiftSynthesizer();
// Or, Create Microsoft WindowsMedia Synthesizer
_Synthesizer := _SpeechKit.CreateWindowsMediaSynthesizer();
// Synthesize speech for playback
_Synthesizer.Speak('Hello world.');
// Free the synthesizer
_Synthesizer.Destroy();
// Create Acapela TTS Synthesizer
_Synthesizer = _SpeechKit.createAcaTTSSynthesizer();
// Or, Create CereProc CereVoice Synthesizer
_Synthesizer = _SpeechKit.createCereVoiceSynthesizer();
// Or, Create Microsoft Speech Platform Synthesizer
_Synthesizer = _SpeechKit.createMSPSynthesizer();
// Or, Create Microsoft SAPI5 (Desktop) Synthesizer
_Synthesizer = _SpeechKit.createSAPI5Synthesizer();
// Or, Create Cepstral Synthesizer
_Synthesizer = _SpeechKit.createSwiftSynthesizer();
// Or, Create Microsoft WindowsMedia Synthesizer
_Synthesizer = _SpeechKit.createWindowsMediaSynthesizer();
// Synthesize speech for playback
_Synthesizer.speak("Hello world.");
// Free the synthesizer
_Synthesizer.dispose();
// Create iOS TTS Synthesizer
_synthesizer = [_speechKit createiOSSynthesizer];
// Synthesize speech for playback
[_synthesizer speak: @"Hello world."];
// Free the synthesizer
[_synthesizer dispose];
' Create Acapela TTS Synthesizer
_Synthesizer = _SpeechKit.CreateAcaTTSSynthesizer()
' Or, Create CereProc CereVoice Synthesizer
_Synthesizer = _SpeechKit.CreateCereVoiceSynthesizer()
' Or, Create Microsoft Speech Platform Synthesizer
_Synthesizer = _SpeechKit.CreateMSPSynthesizer()
' Or, Create Microsoft SAPI5 (Desktop) Synthesizer
_Synthesizer = _SpeechKit.CreateSAPI5Synthesizer()
' Or, Create Cepstral Synthesizer
_Synthesizer = _SpeechKit.CreateSwiftSynthesizer()
' Or, Create Microsoft WindowsMedia Synthesizer
_Synthesizer = _SpeechKit.CreateWindowsMediaSynthesizer()
' Synthesize speech for playback
_Synthesizer.Speak("Hello world.")
' Free the synthesizer
_Synthesizer.Dispose()
To track the progress or state of speech synthesis, the application handles event callbacks.
public class MainActivity extends AppCompatActivity implements com.speechkit.JChantSpeechKitEvents
{
...
// Set the callback object
_Synthesizer.setChantSpeechKitEvents(this);
// Register for callbacks
_Synthesizer.registerCallback(ChantSpeechKitCallback.CCTTSInitComplete);
...
@Override
public void initComplete(Object o, TTSEventArgs ttsEventArgs)
{
if (_Synthesizer.getChantEngines() != null)
{
for (JChantEngine engine : _Synthesizer.getChantEngines())
{
// Add name to list
_Engines.add(engine.getName());
}
}
...
}
}
// Register Event Handler
_Synthesizer.WordPosition += Synthesizer_WordPosition;
private void Synthesizer_WordPosition(object sender, WordPositionEventArgs e)
{
if (e != null)
{
int startPosition = e.Position;
int wordLength = e.Length;
...
}
}
// Register Event Handler
_Synthesizer->SetWordPosition(WordPosition);
void CALLBACK WordPosition(void* Sender, CWordPositionEventArgs* Args)
{
if (Args != NULL)
{
int startPosition = Args->GetPosition();
int wordLength = Args->GetLength();
...
}
}
// Register Event Handler
_Synthesizer->SetWordPosition(WordPosition);
void CALLBACK WordPosition(void* Sender, CWordPositionEventArgs* Args)
{
if (Args != NULL)
{
int startPosition = Args->GetPosition();
int wordLength = Args->GetLength();
...
}
}
// Register event handler
_Synthesizer.WordPosition := WordPosition;
procedure TForm1.WordPosition(Sender: TObject; Args: TWordPositionEventArgs);
var
startPosition: Integer;
wordLength: Integer;
begin
if (Args <> nil) then
begin
startPosition := Args.Position;
wordLength := Args.Length;
...
end;
end;
public class Frame1 extends JFrame implements com.speechkit.JChantSpeechKitEvents
// Set the callback
_Synthesizer.setChantSpeechKitEvents(this);
// Register Callbacks for visual cues.
_Synthesizer.registerCallback(ChantSpeechKitCallback.CCTTSWordPosition);
public void wordPosition(Object sender, WordPositionEventArgs args)
{
if (args != null)
{
int startPosition = args.getPosition();
int wordLength = args.getLength();
...
}
}
// Set the callback
[_synthesizer setDelegate:(id<SPChantSynthesizerDelegate>)self];
-(void) rangeStart:(NSObject*)sender args:(SPRangeStartEventArgs*)args
{
[[self textView1] setSelectedRange:NSMakeRange([args location], [args length])];
}
Dim WithEvents _Synthesizer As NSAPI5Synthesizer = Nothing
Private Sub Synthesizer_WordPosition(ByVal sender As System.Object, ByVal e As WordPositionEventArgs) Handles _Synthesizer.WordPosition
Dim startPosition As Integer
Dim wordLength As Integer
If (e IsNot Nothing) Then
startPosition = e.Position
wordLength = e.Length
...
End If
End Sub
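The handlers above receive a character position and a length for each word as it is spoken. As a minimal sketch of what an application might do with those two values, the Java code below maps them back to the word in the original text, for example to highlight it. `WordHighlighter` and `wordAt` are hypothetical names and not part of SpeechKit, and the sketch assumes the position is a zero-based character offset into the text that was passed to the speak call.

```java
// Hypothetical helper, not part of SpeechKit: maps a word-position event
// (character offset and length) back to the word in the spoken text.
final class WordHighlighter {
    private WordHighlighter() {}

    // Assumption: position is a zero-based character offset into text.
    // Returns an empty string when the range does not fall inside the text.
    static String wordAt(String text, int position, int length) {
        if (text == null || position < 0 || length <= 0 || position >= text.length()) {
            return "";
        }
        // Clamp the end so a range that runs past the text is truncated.
        int end = Math.min(text.length(), position + length);
        return text.substring(position, end);
    }

    public static void main(String[] args) {
        String text = "Hello world.";
        System.out.println(wordAt(text, 0, 5)); // first word
        System.out.println(wordAt(text, 6, 5)); // second word
    }
}
```

Inside a real handler, the same two values would come from the event arguments (for example `args.getPosition()` and `args.getLength()` in the Java sample above) instead of literals.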
Some synthesizers support property settings that control basic aspects of how synthesis occurs.
// No properties
// Set the speaking volume
_Synthesizer.SetProperty("Volume","50");
// Set the speaking volume
_Synthesizer->SetProperty(L"Volume",L"50");
// Set the speaking volume
_Synthesizer->SetProperty("Volume","50");
// Set the speaking volume
_Synthesizer.SetProperty('Volume','50');
// Set the speaking volume
_Synthesizer.setProperty("Volume","50");
// No properties
' Set the speaking volume
_Synthesizer.SetProperty("Volume","50")
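Because the property value is passed as a string, an application that keeps the volume as a number has to convert it before the call. The Java sketch below shows one way to do that. `VolumeProperty` and `toPropertyValue` are hypothetical names and not part of SpeechKit, and the 0 to 100 range is an assumption made for illustration; check the documentation of the synthesizer in use for the values it actually accepts.

```java
// Hypothetical helper, not part of SpeechKit: converts a numeric volume
// into the string form expected by a string-valued property setter.
final class VolumeProperty {
    // Assumed limits for illustration only; real limits depend on the synthesizer.
    static final int MIN_VOLUME = 0;
    static final int MAX_VOLUME = 100;

    private VolumeProperty() {}

    // Clamps the volume into the assumed range and returns it as a string.
    static String toPropertyValue(int volume) {
        int clamped = Math.max(MIN_VOLUME, Math.min(MAX_VOLUME, volume));
        return Integer.toString(clamped);
    }

    public static void main(String[] args) {
        System.out.println(toPropertyValue(50));
        System.out.println(toPropertyValue(150)); // clamped to the assumed maximum
    }
}
```

The result would then be passed as the second argument of the property call shown above, for example `_Synthesizer.setProperty("Volume", VolumeProperty.toPropertyValue(50));`.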