How do I manage application-generated speech requests asynchronously?
Last reviewed: 12/14/2021
HOW Article ID: H072122
The information in this article applies to:
- Speech Manager
Summary
Applications may generate speech requests dynamically as a result of many different types of interactions, which creates complex request management situations.
More Information
Chant Speech Manager supports two types of speech requests: transcription and speech synthesis.
With transcription requests, audio streams (i.e., buffers and files) serve as the audio source for speech recognition.
With synthesis requests, audio is generated by synthesizing speech from text and is returned as buffers, written to a file, or streamed for live playback.
Requests are created, scheduled, and destroyed. A request is created with optional parameters that specify the details for transcription or synthesis. Once a request is created, it can be managed with the following operations:
- Cancel - cancel the request;
- Interrupt - cancel the current request and process this request immediately (speech synthesis only);
- Priority - place the request at the head of the queue to be the next request processed; or
- Schedule - append the request at the end of the queue to be processed.
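The four operations above can be pictured as operations on a queue. The following is a minimal sketch in Java, assuming a simple in-memory model. The class RequestQueueSketch and all of its methods are hypothetical and are not part of the Chant Speech Manager API. This article does not state what happens internally to an interrupted request, so the sketch simply discards it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of the four scheduling operations; NOT the Chant API.
class RequestQueueSketch {
    // Requests waiting to be processed, head of the queue first.
    private final Deque<String> pending = new ArrayDeque<>();
    // Request currently being processed, or null if there is none.
    private String current;

    // Schedule: append the request at the end of the queue.
    void schedule(String request) {
        pending.addLast(request);
    }

    // Priority: place the request at the head of the queue.
    void priority(String request) {
        pending.addFirst(request);
    }

    // Interrupt: drop the current request and process this one immediately.
    void interrupt(String request) {
        current = request;
    }

    // Cancel: remove the request whether it is current or still pending.
    // Returns true if the request was found and removed.
    boolean cancel(String request) {
        if (request.equals(current)) {
            current = null;
            return true;
        }
        return pending.remove(request);
    }

    // Take the next pending request and make it the current one.
    // Returns null when the queue is empty.
    String processNext() {
        current = pending.pollFirst();
        return current;
    }

    String current() {
        return current;
    }

    // Snapshot of the pending requests in processing order.
    List<String> pending() {
        return new ArrayList<>(pending);
    }
}
```

In this model, scheduling "a" and "b" and then giving "c" priority leaves the pending order as c, a, b, and an interrupt replaces whatever request is current without touching the pending queue.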
Speech Manager uses the SpeechKit speech API libraries to process the requests. All SpeechKit speech API property settings and event handling are supported.
// Instantiate SpeechManager
NSpeechManager _SpeechManager = new NSpeechManager();
if (_SpeechManager != null)
{
// Set license properties
//_SpeechManager.SetLicense("LicenseRegistrationNumber", "LicenseSerialNumber");
// Else, for evaluation, set only evaluation serial number
_SpeechManager.SetLicense(string.Empty, "EvaluationSerialNumber");
// Create transcription request
NChantTranscribeAudioRequest transcribeRequest = _SpeechManager.CreateTranscribeAudioRequest("", "", "myaudio.wav");
if (transcribeRequest != null)
{
// Register for recognition events
transcribeRequest.RecognitionDictation += Recognizer_RecognitionDictation;
// Optionally register for begin/end events
transcribeRequest.AudioSourceStart += Recognizer_AudioSourceStart;
transcribeRequest.AudioSourceStop += Recognizer_AudioSourceStop;
// Schedule request
transcribeRequest.ScheduleRequest();
// Since we no longer need it, destroy it
transcribeRequest.Dispose();
}
// Create synthesis request
NChantSpeakRequest speakRequest = _SpeechManager.CreateSpeakRequest("See how easy it is to talk with Speech Manager.");
if (speakRequest != null)
{
// Optionally register for begin/end events
speakRequest.AudioDestStart += Synthesizer_AudioDestStart;
speakRequest.AudioDestStop += Synthesizer_AudioDestStop;
speakRequest.Done += Synthesizer_TTSDone;
speakRequest.Started += Synthesizer_TTSStarted;
// Schedule request
speakRequest.ScheduleRequest();
// Since we no longer need it, destroy it
speakRequest.Dispose();
}
}
// Instantiate SpeechManager object
CSpeechManager* _SpeechManager = new CSpeechManager();
if (_SpeechManager != NULL)
{
// Set license properties
//_SpeechManager->SetLicense(L"LicenseRegistrationNumber", L"LicenseSerialNumber");
// Else, for evaluation, set only evaluation serial number
_SpeechManager->SetLicense(L"", L"EvaluationSerialNumber");
// Create transcription request
CChantTranscribeAudioRequest* pTranscribeRequest = _SpeechManager->CreateTranscribeAudioRequest(L"", L"", L"myaudio.wav");
if (pTranscribeRequest != NULL)
{
// Register for recognition events
pTranscribeRequest->SetRecognitionDictation(RecognitionDictation);
// Optionally register for begin/end events
pTranscribeRequest->SetAudioSourceStart(AudioSourceStart);
pTranscribeRequest->SetAudioSourceStop(AudioSourceStop);
// Schedule request
pTranscribeRequest->ScheduleRequest();
// Since we no longer need it, destroy it
delete pTranscribeRequest;
}
// Create synthesis request
CChantSpeakRequest* pSpeakRequest = _SpeechManager->CreateSpeakRequest(L"See how easy it is to talk with Speech Manager.");
if (pSpeakRequest != NULL)
{
// Optionally register for begin/end events
pSpeakRequest->SetAudioDestStart(AudioDestStart);
pSpeakRequest->SetAudioDestStop(AudioDestStop);
pSpeakRequest->SetDone(TTSDone);
pSpeakRequest->SetStarted(TTSStarted);
// Schedule request
pSpeakRequest->ScheduleRequest();
// Since we no longer need it, destroy it
delete pSpeakRequest;
}
}
// Instantiate SpeechManager object
CSpeechManager* _SpeechManager = new CSpeechManager();
if (_SpeechManager != NULL)
{
// Set license properties
//_SpeechManager->SetLicense("LicenseRegistrationNumber", "LicenseSerialNumber");
// Else, for evaluation, set only evaluation serial number
_SpeechManager->SetLicense("", "EvaluationSerialNumber");
// Create transcription request
CChantTranscribeAudioRequest* pTranscribeRequest = _SpeechManager->CreateTranscribeAudioRequest("","","myaudio.wav");
if (pTranscribeRequest != NULL)
{
// Register for recognition events
pTranscribeRequest->SetRecognitionDictation(RecognitionDictation);
// Optionally register for begin/end events
pTranscribeRequest->SetAudioSourceStart(AudioSourceStart);
pTranscribeRequest->SetAudioSourceStop(AudioSourceStop);
// Schedule request
pTranscribeRequest->ScheduleRequest();
// Since we no longer need it, destroy it
delete pTranscribeRequest;
}
// Create synthesis request
CChantSpeakRequest* pSpeakRequest = _SpeechManager->CreateSpeakRequest("See how easy it is to talk with Speech Manager.");
if (pSpeakRequest != NULL)
{
// Optionally register for begin/end events
pSpeakRequest->SetAudioDestStart(AudioDestStart);
pSpeakRequest->SetAudioDestStop(AudioDestStop);
pSpeakRequest->SetDone(TTSDone);
pSpeakRequest->SetStarted(TTSStarted);
// Schedule request
pSpeakRequest->ScheduleRequest();
// Since we no longer need it, destroy it
delete pSpeakRequest;
}
}
var
_SpeechManager: TSpeechManager;
transcribeRequest: TChantTranscribeAudioRequest;
speakRequest: TChantSpeakRequest;
begin
// Instantiate SpeechManager object
_SpeechManager := TSpeechManager.Create();
if (_SpeechManager <> nil) then
begin
// Set license properties
//_SpeechManager.SetLicense('LicenseRegistrationNumber', 'LicenseSerialNumber');
// Else, for evaluation, set only evaluation serial number
_SpeechManager.SetLicense('', 'EvaluationSerialNumber');
// Create transcription request
transcribeRequest := _SpeechManager.CreateTranscribeAudioRequest('','','myaudio.wav');
if (transcribeRequest <> nil) then
begin
// Register for recognition events
transcribeRequest.RecognitionDictation := RecognitionDictation;
// Optionally register for begin/end events
transcribeRequest.AudioSourceStart := AudioSourceStart;
transcribeRequest.AudioSourceStop := AudioSourceStop;
// Schedule request
transcribeRequest.ScheduleRequest();
// Since we no longer need it, destroy it
transcribeRequest.Destroy();
end;
// Create synthesis request
speakRequest := _SpeechManager.CreateSpeakRequest('See how easy it is to talk with Speech Manager.');
if (speakRequest <> nil) then
begin
// Optionally register for begin/end events
speakRequest.AudioDestStart := AudioDestStart;
speakRequest.AudioDestStop := AudioDestStop;
speakRequest.Done := TTSDone;
speakRequest.Started := TTSStarted;
// Schedule request
speakRequest.ScheduleRequest();
// Since we no longer need it, destroy it
speakRequest.Destroy();
end;
end;
end;
// Create SpeechManager object
JSpeechManager _SpeechManager = new JSpeechManager();
// Set license properties
//_SpeechManager.setLicense("LicenseRegistrationNumber", "LicenseSerialNumber");
// Else, for evaluation, set only evaluation serial number
_SpeechManager.setLicense("", "EvaluationSerialNumber");
// Create transcription request
JChantTranscribeAudioRequest transcribeRequest = _SpeechManager.createTranscribeAudioRequest("", "", "myaudio.wav", "", "");
if (transcribeRequest != null)
{
// Set the callback
transcribeRequest.setChantSpeechKitEvents(this);
// Register for recognition events
transcribeRequest.registerCallback(ChantSpeechKitCallback.CCSRRecognitionDictation);
// Optionally register for begin/end events
transcribeRequest.registerCallback(ChantSpeechKitCallback.CCSAudioStart);
transcribeRequest.registerCallback(ChantSpeechKitCallback.CCSAudioStop);
// Schedule request
transcribeRequest.scheduleRequest();
// Since we no longer need it, destroy it
transcribeRequest.dispose();
}
// Create synthesis request
JChantSpeakRequest speakRequest = _SpeechManager.createSpeakRequest("See how easy it is to talk with Speech Manager.", "", "");
if (speakRequest != null)
{
// Set the callback
speakRequest.setChantSpeechKitEvents(this);
// Optionally register for begin/end events
speakRequest.registerCallback(ChantSpeechKitCallback.CCDAudioStart);
speakRequest.registerCallback(ChantSpeechKitCallback.CCDAudioStop);
speakRequest.registerCallback(ChantSpeechKitCallback.CCTTSDone);
speakRequest.registerCallback(ChantSpeechKitCallback.CCTTSStarted);
// Schedule request
speakRequest.scheduleRequest();
// Since we no longer need it, destroy it
speakRequest.dispose();
}
Dim _SpeechManager As NSpeechManager
Dim WithEvents _SRRequest As NChantTranscribeAudioRequest
Dim WithEvents _TTSRequest As NChantSpeakRequest
' Instantiate SpeechManager
_SpeechManager = New NSpeechManager()
If (_SpeechManager IsNot Nothing) Then
' Set license properties
'_SpeechManager.SetLicense("LicenseRegistrationNumber", "LicenseSerialNumber")
' Else, for evaluation, set only evaluation serial number
_SpeechManager.SetLicense(string.Empty, "EvaluationSerialNumber")
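' audioFile is assumed to be a form control (for example, a TextBox) that holds the audio file path
If (Not String.IsNullOrEmpty(audioFile.Text.Trim())) Then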
' Create transcription request
_SRRequest = _SpeechManager.CreateTranscribeAudioRequest("", "", "myaudio.wav")
If (_SRRequest IsNot Nothing) Then
' Schedule request
_SRRequest.ScheduleRequest()
' Since we no longer need it, destroy it
_SRRequest.Dispose()
End If
End If
' Create synthesis request
_TTSRequest = _SpeechManager.CreateSpeakRequest("See how easy it is to talk with Speech Manager.")
If (_TTSRequest IsNot Nothing) Then
' Schedule request
_TTSRequest.ScheduleRequest()
' Since we no longer need it, destroy it
_TTSRequest.Dispose()
End If
End If
Syntax Options
A SpeechManager supports the following methods:
- CreateTranscribeAudioRequest - Instantiate a transcription request.
- CreateSpeakRequest - Instantiate a speak request.
- FlushSRRequests - Remove all transcription requests.
- FlushTTSRequests - Remove all speak requests.
- QuiesceSRRequests - Stop processing all transcription requests allowing the current request to finish.
- QuiesceTTSRequests - Stop processing all speak requests allowing the current request to finish.
- StartSRRequests - Start processing transcription requests.
- StartTTSRequests - Start processing speak requests.
- StopSRRequests - Stop processing all transcription requests canceling the current request.
- StopTTSRequests - Stop processing all speak requests canceling the current request.
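The difference between quiescing, stopping, and flushing is easiest to see side by side. The following is a minimal sketch in Java, assuming a single queue and a dispatcher that advances one step at a time. The class QueueControlSketch and its methods are hypothetical and are not part of the Chant Speech Manager API. The sketch also assumes that flushing discards only pending requests, which this article does not specify.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of start/quiesce/stop/flush; NOT the Chant API.
class QueueControlSketch {
    private final Deque<String> pending = new ArrayDeque<>();
    // Request currently being processed, or null if there is none.
    private String current;
    // True while the dispatcher may pick up new requests.
    private boolean running;

    void add(String request) {
        pending.addLast(request);
    }

    // Start: allow the dispatcher to pick up requests.
    void start() {
        running = true;
    }

    // Quiesce: pick up no further requests, but let the current one finish.
    void quiesce() {
        running = false;
    }

    // Stop: pick up no further requests and cancel the current one.
    void stop() {
        running = false;
        current = null;
    }

    // Flush: discard all pending requests (assumed to leave the current one).
    void flush() {
        pending.clear();
    }

    // One dispatcher step: the current request finishes, and the next
    // request is picked up only while processing is started.
    String step() {
        current = running ? pending.pollFirst() : null;
        return current;
    }

    String current() {
        return current;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

In this model, a request that is in progress when quiesce() is called stays current until the next step, whereas stop() clears it at once. In both cases the pending requests remain queued until processing is started again or the queue is flushed.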
The CreateTranscribeAudioRequest method has five optional parameters that may be used to control speech recognition:
- commands - (optional) Comma-separated list of commands for command recognition.
- grammar - (optional) File path of speech recognition grammar for grammar recognition.
- audiofile - (optional) File path of audio file for transcription. Use PutAudioBytes for audio buffers.
- api - (optional) Speech API. Supported APIs include: sapi5, dragon, and msp.
- engine - (optional) Speech engine name, id, or language.
A TranscribeAudioRequest supports the following methods:
- CancelRequest - Cancels the request and removes it.
- PriorityRequest - Schedules the request as the next request to process.
- PutAudioBytes - Sets the source audio for the transcription.
- SetProperty - Sets the engine properties.
- ScheduleRequest - Schedules the request at the end of the queue to be processed.
The CreateSpeakRequest method has one required parameter and five optional parameters that may be used to control speech synthesis:
- text - (required) The text from which to synthesize speech.
- options - (optional) Speech synthesis options.
- outfile - (optional) The file path of where to write the synthesized audio.
- outformat - (optional) The audio output format.
- api - (optional) Speech API. Supported APIs include: sapi5, windowsmedia, msp, acatt, swift, cerevoice.
- engine - (optional) Speech engine name, id, or language.
A SpeakRequest supports the following methods:
- CancelRequest - Cancels the request and removes it.
- GetAudioBytes - Returns the audio bytes from speech synthesis.
- InterruptRequest - Cancels the current request and schedules the request as the next request to process.
- PriorityRequest - Schedules the request as the next request to process.
- SetProperty - Sets the engine properties.
- ScheduleRequest - Schedules the request at the end of the queue to be processed.
Development and Deployment
Speech Manager applications require the Speech Manager library and the applicable SpeechKit Speech API libraries:
- C++Builder, C++, and Delphi applications require the Speech Manager library (CSpeechManager.dll or CSpeechManagerX64.dll) and the applicable SpeechKit Speech API libraries in the same directory as the application .exe.
- Java applications require chant.speechmanager.jar, speechkit.jar, and chant.shared.jar in the target system Java JRE lib directory, or the classpath must include the directory where these libraries are placed on the target system. The Speech Manager library (JSpeechManager.dll or JSpeechManagerX64.dll) and the applicable SpeechKit Speech API libraries must be in the target system Java JRE bin directory.
- C# and VB .NET applications require the assembly libraries Chant.SpeechManager.Windows.dll, Chant.SpeechKit.Windows.dll, and Chant.Shared.Windows.dll embedded in the application or located in the same directory as the application .exe. The Speech Manager library (NSpeechManager.dll or NSpeechManagerX64.dll) must be registered as a COM library on the target system, and the applicable SpeechKit Speech API libraries must be in the same directory as the application .exe.