How do I manage application-generated speech requests asynchronously?

Last reviewed: 12/14/2021

HOW Article ID: H072122

The information in this article applies to:

  • Speech Manager

Summary

Applications may generate speech requests dynamically as a result of many different types of interactions, creating complex management situations.

More Information

Chant Speech Manager supports two types of speech requests: transcription and speech synthesis.

With transcription requests, audio streams (i.e., buffers and files) serve as the audio source for speech recognition.

With synthesis requests, audio is synthesized from text and returned as buffers, written to a file, or streamed for live playback.

Requests are created, scheduled, and destroyed. A request is created with optional parameters that specify the details for transcription or synthesis. Once a request is created, it can be managed with one of the following actions:

  • Cancel - cancel the request;
  • Interrupt - cancel the current request and process this request immediately (speech synthesis only);
  • Priority - place the request at the head of the queue to be the next request processed; or
  • Schedule - append the request at the end of the queue to be processed.

The request can be destroyed by the application when appropriate.
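
The four actions listed above can be pictured as operations on a simple queue. The following is a minimal sketch only: the class RequestQueueSketch and all of its methods are hypothetical names invented for illustration and are not part of the Chant Speech Manager API. It models requests as plain strings and assumes that canceling a request removes it whether it is pending or currently being processed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of the request queue described above; not the Chant API.
class RequestQueueSketch {
    private final Deque<String> pending = new ArrayDeque<>();
    private String current; // request being processed, or null when idle

    // Schedule: append the request at the end of the queue.
    void schedule(String request) { pending.addLast(request); }

    // Priority: place the request at the head of the queue.
    void priority(String request) { pending.addFirst(request); }

    // Interrupt: cancel the current request and process this one immediately.
    void interrupt(String request) { current = request; }

    // Cancel: remove the request whether it is current or pending (assumption).
    boolean cancel(String request) {
        if (request.equals(current)) { current = null; return true; }
        return pending.remove(request);
    }

    // Take the next pending request, if any, and make it the current one.
    String processNext() { current = pending.pollFirst(); return current; }

    String current() { return current; }
    List<String> pending() { return new ArrayList<>(pending); }

    public static void main(String[] args) {
        RequestQueueSketch queue = new RequestQueueSketch();
        queue.schedule("greeting");
        queue.schedule("status");
        queue.priority("alert");
        System.out.println(queue.pending());  // [alert, greeting, status]
        queue.processNext();                  // "alert" becomes current
        queue.interrupt("emergency");         // replaces "alert"
        System.out.println(queue.current());  // emergency
        queue.cancel("status");
        System.out.println(queue.pending());  // [greeting]
    }
}
```

In this model, Priority and Schedule differ only in which end of the queue receives the request, while Interrupt bypasses the queue entirely.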

Speech Manager uses the SpeechKit speech API libraries to process the requests. All SpeechKit speech API property settings and event handling are supported.


// Instantiate SpeechManager
NSpeechManager _SpeechManager = new NSpeechManager();
if (_SpeechManager != null)
{
    // Set license properties
    //_SpeechManager.SetLicense("LicenseRegistrationNumber", "LicenseSerialNumber");
    // Else, for evaluation, set only evaluation serial number
    _SpeechManager.SetLicense(string.Empty, "EvaluationSerialNumber");
    // Create transcription request
    NChantTranscribeAudioRequest transcribeRequest = _SpeechManager.CreateTranscribeAudioRequest("", "", "myaudio.wav");
    if (transcribeRequest != null)
    {
        // Register for recognition events
        transcribeRequest.RecognitionDictation += Recognizer_RecognitionDictation;
        // Optionally register for begin/end events
        transcribeRequest.AudioSourceStart += Recognizer_AudioSourceStart;
        transcribeRequest.AudioSourceStop += Recognizer_AudioSourceStop;

        // Schedule request
        transcribeRequest.ScheduleRequest();

        // Since we no longer need it, destroy it
        transcribeRequest.Dispose();
    }
    // Create synthesis request
    NChantSpeakRequest speakRequest = _SpeechManager.CreateSpeakRequest("See how easy it is to talk with Speech Manager.");
    if (speakRequest != null)
    {
        // Optionally register for begin/end events
        speakRequest.AudioDestStart += Synthesizer_AudioDestStart;
        speakRequest.AudioDestStop += Synthesizer_AudioDestStop;
        speakRequest.Done += Synthesizer_TTSDone;
        speakRequest.Started += Synthesizer_TTSStarted;

        // Schedule request
        speakRequest.ScheduleRequest();

        // Since we no longer need it, destroy it
        speakRequest.Dispose();
    }
}
        

// Instantiate SpeechManager object
CSpeechManager* _SpeechManager = new CSpeechManager();
if (_SpeechManager != NULL)
{
	// Set license properties
	//_SpeechManager->SetLicense(L"LicenseRegistrationNumber", L"LicenseSerialNumber");
	// Else, for evaluation, set only evaluation serial number
	_SpeechManager->SetLicense(L"", L"EvaluationSerialNumber");
	// Create transcription request
	CChantTranscribeAudioRequest* pTranscribeRequest = _SpeechManager->CreateTranscribeAudioRequest(L"", L"", L"myaudio.wav");
	if (pTranscribeRequest != NULL)
	{
	    // Register for recognition events
	    pTranscribeRequest->SetRecognitionDictation(RecognitionDictation);
	    // Optionally register for begin/end events
	    pTranscribeRequest->SetAudioSourceStart(AudioSourceStart);
	    pTranscribeRequest->SetAudioSourceStop(AudioSourceStop);

	    // Schedule request
	    pTranscribeRequest->ScheduleRequest();

	    // Since we no longer need it, destroy it
	    delete pTranscribeRequest;
	}
	// Create synthesis request
	CChantSpeakRequest* pSpeakRequest = _SpeechManager->CreateSpeakRequest(L"See how easy it is to talk with Speech Manager.");
	if (pSpeakRequest != NULL)
	{
	    // Optionally register for begin/end events
	    pSpeakRequest->SetAudioDestStart(AudioDestStart);
	    pSpeakRequest->SetAudioDestStop(AudioDestStop);
	    pSpeakRequest->SetDone(TTSDone);
	    pSpeakRequest->SetStarted(TTSStarted);

	    // Schedule request
	    pSpeakRequest->ScheduleRequest();

	    // Since we no longer need it, destroy it
	    delete pSpeakRequest;
	}
}
    

// Instantiate SpeechManager object
CSpeechManager* _SpeechManager = new CSpeechManager();
if (_SpeechManager != NULL)
{
	// Set license properties
	//_SpeechManager->SetLicense("LicenseRegistrationNumber", "LicenseSerialNumber");
	// Else, for evaluation, set only evaluation serial number
	_SpeechManager->SetLicense("", "EvaluationSerialNumber");
	// Create transcription request
	CChantTranscribeAudioRequest* pTranscribeRequest = _SpeechManager->CreateTranscribeAudioRequest("","","myaudio.wav");
	if (pTranscribeRequest != NULL)
	{
	    // Register for recognition events
	    pTranscribeRequest->SetRecognitionDictation(RecognitionDictation);
	    // Optionally register for begin/end events
	    pTranscribeRequest->SetAudioSourceStart(AudioSourceStart);
	    pTranscribeRequest->SetAudioSourceStop(AudioSourceStop);

	    // Schedule request
	    pTranscribeRequest->ScheduleRequest();

	    // Since we no longer need it, destroy it
	    delete pTranscribeRequest;
	}
	// Create synthesis request
	CChantSpeakRequest* pSpeakRequest = _SpeechManager->CreateSpeakRequest("See how easy it is to talk with Speech Manager.");
	if (pSpeakRequest != NULL)
	{
	    // Optionally register for begin/end events
	    pSpeakRequest->SetAudioDestStart(AudioDestStart);
	    pSpeakRequest->SetAudioDestStop(AudioDestStop);
	    pSpeakRequest->SetDone(TTSDone);
	    pSpeakRequest->SetStarted(TTSStarted);

	    // Schedule request
	    pSpeakRequest->ScheduleRequest();

	    // Since we no longer need it, destroy it
	    delete pSpeakRequest;
	}
}

var
  _SpeechManager: TSpeechManager;
  transcribeRequest: TChantTranscribeAudioRequest;
  speakRequest: TChantSpeakRequest;
begin
    // Instantiate SpeechManager object
    _SpeechManager := TSpeechManager.Create();
    if (_SpeechManager <> nil) then
    begin
        // Set license properties
        //_SpeechManager.SetLicense('LicenseRegistrationNumber', 'LicenseSerialNumber');
        // Else, for evaluation, set only evaluation serial number
        _SpeechManager.SetLicense('', 'EvaluationSerialNumber');
        // Create transcription request
        transcribeRequest := _SpeechManager.CreateTranscribeAudioRequest('','','myaudio.wav');
        if (transcribeRequest <> nil) then
        begin
            // Register for recognition events
            transcribeRequest.RecognitionDictation := RecognitionDictation;
            // Optionally register for begin/end events
            transcribeRequest.AudioSourceStart := AudioSourceStart;
            transcribeRequest.AudioSourceStop := AudioSourceStop;

            // Schedule request
            transcribeRequest.ScheduleRequest();

            // Since we no longer need it, destroy it
            transcribeRequest.Destroy();
        end;
        // Create synthesis request
        speakRequest := _SpeechManager.CreateSpeakRequest('See how easy it is to talk with Speech Manager.');
        if (speakRequest <> nil) then
        begin
            // Optionally register for begin/end events
            speakRequest.AudioDestStart := AudioDestStart;
            speakRequest.AudioDestStop := AudioDestStop;
            speakRequest.Done := TTSDone;
            speakRequest.Started := TTSStarted;

            // Schedule request
            speakRequest.ScheduleRequest();

            // Since we no longer need it, destroy it
            speakRequest.Destroy();
        end;
    end;
end;
    

// Create SpeechManager object
JSpeechManager _SpeechManager = new JSpeechManager();
// Set license properties
//_SpeechManager.setLicense("LicenseRegistrationNumber", "LicenseSerialNumber");
// Else, for evaluation, set only evaluation serial number
_SpeechManager.setLicense("", "EvaluationSerialNumber");
// Create transcription request
JChantTranscribeAudioRequest transcribeRequest = _SpeechManager.createTranscribeAudioRequest("", "", "myaudio.wav", "", "");
if (transcribeRequest != null)
{
	// Set the callback
	transcribeRequest.setChantSpeechKitEvents(this);
	// Register for recognition events
	transcribeRequest.registerCallback(ChantSpeechKitCallback.CCSRRecognitionDictation);
	// Optionally register for begin/end events
	transcribeRequest.registerCallback(ChantSpeechKitCallback.CCSAudioStart);
	transcribeRequest.registerCallback(ChantSpeechKitCallback.CCSAudioStop);
	// Schedule request
	transcribeRequest.scheduleRequest();
	// Since we no longer need it, destroy it
	transcribeRequest.dispose();
}
// Create synthesis request
JChantSpeakRequest speakRequest = _SpeechManager.createSpeakRequest("See how easy it is to talk with Speech Manager.", "", "");
if (speakRequest != null)
{
	// Set the callback
	speakRequest.setChantSpeechKitEvents(this);
	// Optionally register for begin/end events
	speakRequest.registerCallback(ChantSpeechKitCallback.CCDAudioStart);
	speakRequest.registerCallback(ChantSpeechKitCallback.CCDAudioStop);
	speakRequest.registerCallback(ChantSpeechKitCallback.CCTTSDone);
	speakRequest.registerCallback(ChantSpeechKitCallback.CCTTSStarted);
	// Schedule request
	speakRequest.scheduleRequest();
	// Since we no longer need it, destroy it
	speakRequest.dispose();
}
    

Dim _SpeechManager As NSpeechManager
Dim WithEvents _SRRequest As NChantTranscribeAudioRequest
Dim WithEvents _TTSRequest As NChantSpeakRequest
    ' Instantiate SpeechManager
    _SpeechManager = New NSpeechManager()
    If (_SpeechManager IsNot Nothing) Then
        ' Set license properties
        '_SpeechManager.SetLicense("LicenseRegistrationNumber", "LicenseSerialNumber")
        ' Else, for evaluation, set only evaluation serial number
        _SpeechManager.SetLicense(string.Empty, "EvaluationSerialNumber")
        If (Not String.IsNullOrEmpty(audioFile.Text.Trim())) Then
            ' Create transcription request
            _SRRequest = _SpeechManager.CreateTranscribeAudioRequest("", "", "myaudio.wav")
            If (_SRRequest IsNot Nothing) Then
                ' Schedule request
                _SRRequest.ScheduleRequest()

                ' Since we no longer need it, destroy it
                _SRRequest.Dispose()
            End If
        End If
        ' Create synthesis request
        _TTSRequest = _SpeechManager.CreateSpeakRequest("See how easy it is to talk with Speech Manager.")
        If (_TTSRequest IsNot Nothing) Then
            ' Schedule request
            _TTSRequest.ScheduleRequest()

            ' Since we no longer need it, destroy it
            _TTSRequest.Dispose()
        End If
    End If

Syntax Options

A SpeechManager supports the following methods:

  • CreateTranscribeAudioRequest - Instantiate a transcription request.
  • CreateSpeakRequest - Instantiate a speak request.
  • FlushSRRequests - Remove all transcription requests.
  • FlushTTSRequests - Remove all speak requests.
  • QuiesceSRRequests - Stop processing all transcription requests allowing the current request to finish.
  • QuiesceTTSRequests - Stop processing all speak requests allowing the current request to finish.
  • StartSRRequests - Start processing transcription requests.
  • StartTTSRequests - Start processing speak requests.
  • StopSRRequests - Stop processing all transcription requests canceling the current request.
  • StopTTSRequests - Stop processing all speak requests canceling the current request.
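
The difference between the Quiesce, Stop, Flush, and Start methods listed above can be sketched with a small model. This is an illustration under stated assumptions, not the library's implementation: QueueControlSketch and its methods are hypothetical names, requests are modeled as strings, and Flush is assumed to remove only pending requests without touching the one currently being processed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of start, quiesce, stop, and flush; not the Chant API.
class QueueControlSketch {
    private final Deque<String> pending = new ArrayDeque<>();
    private String current;         // request being processed, or null
    private boolean running = true; // whether new requests are taken from the queue

    void schedule(String request) { pending.addLast(request); }

    // Start: resume taking requests from the queue.
    void start() { running = true; }

    // Quiesce: stop taking new requests, but let the current one finish.
    void quiesce() { running = false; }

    // Stop: stop taking new requests and cancel the current one.
    void stop() { running = false; current = null; }

    // Flush: remove every pending request (assumed to leave the current one alone).
    void flush() { pending.clear(); }

    // Advance one step: the current request finishes, then the next one
    // is taken from the queue only if processing is running.
    String step() {
        current = running ? pending.pollFirst() : null;
        return current;
    }

    String current() { return current; }
    int pendingCount() { return pending.size(); }
}
```

In this model, a request that is in progress when quiesce() is called remains current until the next step, whereas stop() clears it at once. Pending requests stay queued in both cases until flush() is called or start() resumes processing.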

The CreateTranscribeAudioRequest method has five optional parameters that may be used to control speech recognition:

  • commands - (optional) Comma-separated list of commands for command recognition.
  • grammar - (optional) File path of speech recognition grammar for grammar recognition.
  • audiofile - (optional) File path of audio file for transcription. Use PutAudioBytes for audio buffers.
  • api - (optional) Speech API. Supported APIs include: sapi5, dragon, and msp.
  • engine - (optional) Speech engine name, id, or language.

A TranscribeAudioRequest supports the following methods:

  • CancelRequest - Cancels the request and removes it.
  • PriorityRequest - Schedules the request as the next request to process.
  • PutAudioBytes - Sets the source audio for the transcription.
  • SetProperty - Sets the engine properties.
  • ScheduleRequest - Schedules the request at the end of the queue to be processed.

The CreateSpeakRequest method has one required parameter and five optional parameters that may be used to control speech synthesis:

  • text - (required) The text from which to synthesize speech.
  • options - (optional) Speech synthesis options.
  • outfile - (optional) The file path of where to write the synthesized audio.
  • outformat - (optional) The audio output format.
  • api - (optional) Speech API. Supported APIs include: sapi5, windowsmedia, msp, acatt, swift, cerevoice.
  • engine - (optional) Speech engine name, id, or language.

A SpeakRequest supports the following methods:

  • CancelRequest - Cancels the request and removes it.
  • GetAudioBytes - Returns the audio bytes from speech synthesis.
  • InterruptRequest - Cancels the current request and schedules the request as the next request to process.
  • PriorityRequest - Schedules the request as the next request to process.
  • SetProperty - Sets the engine properties.
  • ScheduleRequest - Schedules the request at the end of the queue to be processed.

Development and Deployment

Speech Manager applications require the Speech Manager library and the applicable SpeechKit Speech API libraries:

  • C++Builder, C++, and Delphi applications require the Speech Manager library (CSpeechManager.dll or CSpeechManagerX64.dll) and the applicable SpeechKit Speech API libraries in the same directory as the application .exe.
  • Java applications require chant.speechmanager.jar, speechkit.jar, and chant.shared.jar either in the target system Java JRE lib directory or in a directory included in the classpath on the target system. The Speech Manager library (JSpeechManager.dll or JSpeechManagerX64.dll) and the applicable SpeechKit Speech API libraries must be in the target system Java JRE bin directory.
  • C# and VB .NET applications require the assembly libraries Chant.SpeechManager.Windows.dll, Chant.SpeechKit.Windows.dll, and Chant.Shared.Windows.dll embedded in the application or located in the same directory as the application .exe. The Speech Manager library (NSpeechManager.dll or NSpeechManagerX64.dll) must be registered as a COM library on the target system, and the applicable SpeechKit Speech API libraries must be in the same directory as the application .exe.