Managing queued speech processing requests
Last reviewed: 7/8/2022
HOW Article ID: H072222
The information in this article applies to:
- Speech Manager 2
Summary
Applications may generate speech requests dynamically as a result of many different types of interactions, which can create complex management situations.
More Information
Chant Speech Manager supports two types of speech requests: transcription and speech synthesis.
With transcription requests, audio streams (i.e., buffers and files) serve as the audio source for speech recognition.
With synthesis requests, audio is generated by synthesizing speech from text and is returned as buffers, written to a file, or streamed for live playback.
Requests are created, scheduled, and destroyed. A request is created with optional parameters that specify the details for transcription or synthesis. Once a request is created, it can be managed with one of the following actions:
- Cancel - cancel the request;
- Interrupt - cancel the current request and process this request immediately (speech synthesis only);
- Priority - place the request at the head of the queue to be the next request processed; or
- Schedule - append the request at the end of the queue to be processed.
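The effect of the four actions on queue order can be illustrated with a small, self-contained model. This is only a sketch: the RequestQueueModel class and its members are hypothetical names invented for illustration, requests are represented as plain strings, and nothing here reflects how Speech Manager is actually implemented.

```csharp
using System;
using System.Collections.Generic;

// Toy model of a request queue; this is NOT the Speech Manager implementation.
class RequestQueueModel
{
    private readonly LinkedList<string> _pending = new LinkedList<string>();

    // The request being processed, or null when idle.
    public string Current { get; private set; }

    public IEnumerable<string> Pending => _pending;

    // Schedule: append the request at the end of the queue.
    public void Schedule(string request) => _pending.AddLast(request);

    // Priority: place the request at the head of the queue so it is processed next.
    public void Priority(string request) => _pending.AddFirst(request);

    // Interrupt: cancel the current request and process this one immediately.
    public void Interrupt(string request) => Current = request;

    // Cancel: drop the request, whether it is current or still pending.
    public void Cancel(string request)
    {
        if (Current == request) Current = null;
        else _pending.Remove(request);
    }

    // Advance to the next pending request; returns null when the queue is empty.
    public string Next()
    {
        if (_pending.Count == 0) { Current = null; return null; }
        Current = _pending.First.Value;
        _pending.RemoveFirst();
        return Current;
    }
}

class Program
{
    static void Main()
    {
        var queue = new RequestQueueModel();
        queue.Schedule("A");
        queue.Schedule("B");
        queue.Priority("C");     // pending: C, A, B
        queue.Next();            // current: C; pending: A, B
        queue.Interrupt("D");    // current: D (C is canceled); pending: A, B
        queue.Cancel("A");       // pending: B
        // prints "current=D pending=B"
        Console.WriteLine($"current={queue.Current} pending={string.Join(",", queue.Pending)}");
    }
}
```

In the model, Priority changes only what runs next, while Interrupt also displaces the request in progress, which is why the article limits Interrupt to speech synthesis.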
Speech Manager uses the SpeechKit speech API libraries to process the requests. All SpeechKit speech API property setting and event handling is supported.
// Instantiate SpeechManager
NSpeechManager _SpeechManager = new NSpeechManager();
if (_SpeechManager != null)
{
    // Set credentials
    _SpeechManager.SetCredentials("Credentials");

    // Create transcription request
    NChantTranscribeAudioRequest transcribeRequest = _SpeechManager.CreateTranscribeAudioRequest("", "", "myaudio.wav");
    if (transcribeRequest != null)
    {
        // Register for recognition events
        transcribeRequest.RecognitionDictation += Recognizer_RecognitionDictation;
        // Optionally register for begin/end events
        transcribeRequest.AudioSourceStart += Recognizer_AudioSourceStart;
        transcribeRequest.AudioSourceStop += Recognizer_AudioSourceStop;
        // Schedule request
        transcribeRequest.ScheduleRequest();
        // Since we no longer need it, destroy it
        transcribeRequest.Dispose();
    }

    // Create synthesis request
    NChantSpeakRequest speakRequest = _SpeechManager.CreateSpeakRequest("See how easy it is to talk with Speech Manager.");
    if (speakRequest != null)
    {
        // Optionally register for begin/end events
        speakRequest.AudioDestStart += Synthesizer_AudioDestStart;
        speakRequest.AudioDestStop += Synthesizer_AudioDestStop;
        speakRequest.Done += Synthesizer_TTSDone;
        speakRequest.Started += Synthesizer_TTSStarted;
        // Schedule request
        speakRequest.ScheduleRequest();
        // Since we no longer need it, destroy it
        speakRequest.Dispose();
    }
}
Syntax Options
A SpeechManager supports the following methods:
- CreateTranscribeAudioRequest - Instantiate a transcription request.
- CreateSpeakRequest - Instantiate a speak request.
- FlushSRRequests - Remove all transcription requests.
- FlushTTSRequests - Remove all speak requests.
- QuiesceSRRequests - Stop processing all transcription requests, allowing the current request to finish.
- QuiesceTTSRequests - Stop processing all speak requests, allowing the current request to finish.
- StartSRRequests - Start processing transcription requests.
- StartTTSRequests - Start processing speak requests.
- StopSRRequests - Stop processing all transcription requests, canceling the current request.
- StopTTSRequests - Stop processing all speak requests, canceling the current request.
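The difference between the Quiesce, Stop, and Flush methods can be sketched with a toy model of a single queue. The ProcessingModel class and its members are hypothetical names used only for illustration, and requests are plain strings. This article does not state whether a flush also affects the request in progress, so the model assumes it does not.

```csharp
using System;
using System.Collections.Generic;

// Toy model of the queue-processing controls; NOT the Speech Manager implementation.
class ProcessingModel
{
    private readonly Queue<string> _pending = new Queue<string>();

    public bool Running { get; private set; }
    public string Current { get; private set; }
    public List<string> Completed { get; } = new List<string>();
    public List<string> Canceled { get; } = new List<string>();

    public void Schedule(string request) => _pending.Enqueue(request);

    // Start: begin (or resume) taking requests from the queue.
    public void Start() => Running = true;

    // Quiesce: take no new requests, but let the current one finish.
    public void Quiesce() => Running = false;

    // Stop: take no new requests and cancel the current one.
    public void Stop()
    {
        Running = false;
        if (Current != null) { Canceled.Add(Current); Current = null; }
    }

    // Flush: discard everything still waiting (assumed to leave Current alone).
    public void Flush() => _pending.Clear();

    // One processing step: finish the current request, then take the next if running.
    public void Step()
    {
        if (Current != null) { Completed.Add(Current); Current = null; }
        if (Running && _pending.Count > 0) Current = _pending.Dequeue();
    }
}

class Program
{
    static void Main()
    {
        var model = new ProcessingModel();
        model.Schedule("A");
        model.Schedule("B");
        model.Schedule("C");

        model.Start();
        model.Step();      // A becomes current
        model.Quiesce();
        model.Step();      // A finishes; B is not started
        model.Start();
        model.Step();      // B becomes current
        model.Stop();      // B is canceled
        model.Flush();     // C is discarded
        // prints "completed=A canceled=B"
        Console.WriteLine($"completed={string.Join(",", model.Completed)} canceled={string.Join(",", model.Canceled)}");
    }
}
```

Speech Manager exposes these controls separately for transcription (SR) and synthesis (TTS) requests; the model shows one queue for brevity.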
The CreateTranscribeAudioRequest method has five optional parameters that may be used to control speech recognition:
- commands - (optional) Comma-separated list of commands for command recognition.
- grammar - (optional) File path of speech recognition grammar for grammar recognition.
- audiofile - (optional) File path of audio file for transcription. Use PutAudioBytes for audio buffers.
- api - (optional) Speech API. Supported APIs include: sapi5, dragon, and msp.
- engine - (optional) Speech engine name, id, or language.
A TranscribeAudioRequest supports the following methods:
- CancelRequest - Cancels the request and removes it.
- PriorityRequest - Schedules the request as the next request to process.
- PutAudioBytes - Sets the source audio for the transcription.
- SetProperty - Sets the engine properties.
- ScheduleRequest - Schedules the request at the end of the queue to be processed.
The CreateSpeakRequest method has one required parameter and five optional parameters that may be used to control speech synthesis:
- text - (required) The text from which to synthesize speech.
- options - (optional) Speech synthesis options.
- outfile - (optional) The file path of where to write the synthesized audio.
- outformat - (optional) The audio output format.
- api - (optional) Speech API. Supported APIs include: sapi5, windowsmedia, msp, acatt, swift, cerevoice.
- engine - (optional) Speech engine name, id, or language.
A SpeakRequest supports the following methods:
- CancelRequest - Cancels the request and removes it.
- GetAudioBytes - Returns the audio bytes from speech synthesis.
- InterruptRequest - Cancels the current request and schedules the request as the next request to process.
- PriorityRequest - Schedules the request as the next request to process.
- SetProperty - Sets the engine properties.
- ScheduleRequest - Schedules the request at the end of the queue to be processed.
Development and Deployment
Speech Manager applications require the Speech Manager library and the applicable SpeechKit Speech API libraries:
- C++Builder, C++, and Delphi applications require the Speech Manager library (CSpeechManager.dll or CSpeechManagerX64.dll) and the applicable SpeechKit Speech API libraries in the same directory as the application .exe.
- Java applications require chant.speechmanager.jar, speechkit.jar, and chant.shared.jar in the target system Java JRE lib directory, or in another directory on the target system that is included in the classpath. The Speech Manager library (JSpeechManager.dll or JSpeechManagerX64.dll) and the applicable SpeechKit Speech API libraries must be in the target system Java JRE bin directory.
- C# and VB .NET applications require the assembly libraries Chant.SpeechManager.Windows.dll, Chant.SpeechKit.Windows.dll, and Chant.Shared.Windows.dll embedded in the application or located in the same directory as the application .exe. The Speech Manager library (NSpeechManager.dll or NSpeechManagerX64.dll) must be registered as a COM library on the target system, and the applicable SpeechKit Speech API libraries must be in the same directory as the application .exe.