Last reviewed: 3/23/2024 10:02:45 AM
Event Handling
To track recognition operations, applications receive event notifications. Event availability varies among Speech APIs.
Event | Event Arguments | Description |
---|---|---|
ApiError | ChantAPIErrorEventArgs | Notifies the application of an API error |
AudioSourceLevel | AudioLevelEventArgs | Notifies the application of the audio level |
AudioSourceStart | AudioEventArgs | Notifies the application the speech recognizer has started processing audio |
AudioSourceStop | AudioEventArgs | Notifies the application the speech recognizer has stopped processing audio |
DialogClosed | DialogClosedEventArgs | Notifies the application the speech recognizer dialog has closed |
FalseRecognition | SREventArgs | Notifies the application the speech recognizer was unable to recognize speech from the utterance |
InitComplete | SREventArgs | Notifies the application that speech engine enumeration is complete |
Interference | InterferenceEventArgs | Notifies the application the speech recognizer detected interference |
Paused | SREventArgs | Notifies the application the speech recognizer has paused processing audio |
PhraseStart | SREventArgs | Notifies the application the speech recognizer has detected the start of a phrase |
PropertyChange | PropertyChangeEventArgs | Notifies the application a speech recognizer property has changed |
RecognitionCommand | RecognitionCommandEventArgs | Notifies the application the speech recognizer recognized speech from a command vocabulary |
RecognitionCommandHypothesis | RecognitionCommandEventArgs | Notifies the application the speech recognizer may have recognized speech from a command vocabulary |
RecognitionDictation | RecognitionDictationEventArgs | Notifies the application the speech recognizer recognized speech from a dictation vocabulary |
RecognitionDictationHypothesis | RecognitionDictationEventArgs | Notifies the application the speech recognizer may have recognized speech from a dictation vocabulary |
RecognitionGrammar | RecognitionGrammarEventArgs | Notifies the application the speech recognizer recognized speech from a grammar vocabulary |
RecognitionGrammarHypothesis | RecognitionGrammarEventArgs | Notifies the application the speech recognizer may have recognized speech from a grammar vocabulary |
RecognitionHypothesis | WindowsMediaRecognitionHypothesisEventArgs | Notifies the application the WindowsMedia (via WinRT C++) speech recognizer may have recognized speech |
RecognitionOther | RecognitionOtherEventArgs | Notifies the application the speech recognizer recognized speech from another application context (Windows desktop recognizers) or found no command vocabulary match (Android, iOS, and macOS recognizers) |
RecognitionTimeOut | SREventArgs | Notifies the application the speech recognizer has stopped processing audio |
RequestUI | RequestUIEventArgs | Notifies the application the speech recognizer recommends invoking one of its dialogs |
Sound | SREventArgs | Notifies the application the speech recognizer detected sound |
SRBookMark | SRBookMarkEventArgs | Notifies the application the speech recognizer detected a bookmark in the audio source |
SRCancel | CancelEventArgs | Notifies the application the speech recognizer canceled the request |
UtteranceBegin | SREventArgs | Notifies the application the speech recognizer detected the beginning of an utterance |
UtteranceEnd | SREventArgs | Notifies the application the speech recognizer detected the end of an utterance |
Some events provide data values that are returned in argument objects. Argument data availability varies among Speech APIs.
- AudioEventArgs
  - File - File name
  - MCSAudioSourceEventArgs
    - SessionId - Session identifier
  - WindowsAudioEventArgs
    - AudioStreamOffset - Audio stream offset
    - AudioTimeOffset - Audio time offset
- AudioLevelEventArgs
  - Level - Audio level
  - AudioStreamOffset - Audio stream offset
  - AudioTimeOffset - Audio time offset
- CancelEventArgs
  - MCSCancelEventArgs
    - ErrorCode - Cancel error code
    - ErrorDetails - Cancel error details
    - Reason - Cancel reason
  - MCSSRCancelEventArgs
    - Offset - Session offset
    - SessionId - Session identifier
  - MCSCancelEventArgs
- ChantAPIErrorEventArgs
  - Function - API function name
  - Message - API error message
  - RC - API error return code
- DialogClosedEventArgs
  - Dialog - Dialog identifier
  - ExitCode - Dialog exit code
- InterferenceEventArgs
  - Interference - The type of interference detected by the recognizer
  - AudioStreamOffset - Audio stream offset
  - AudioTimeOffset - Audio time offset
- PropertyChangeEventArgs
  - Property - Property that changed
  - Value - New property value
  - AudioStreamOffset - Audio stream offset
  - AudioTimeOffset - Audio time offset
- RecognitionCommandEventArgs
  - Phrase - Recognized phrase
  - AndroidRecognitionCommandEventArgs
    - Alternates - Collection of alternate recognized phrases
    - Confidence - Indicates the confidence of the speech recognizer in the recognition result
    - Semantics - Collection of semantics for recognized phrase
    - Words - Collection of recognized words
  - SFRecognitionCommandEventArgs
    - Alternates - Collection of alternate recognized phrases
    - Confidence - Indicates the confidence of the speech recognizer in the recognition result
    - Semantics - Collection of semantics for recognized phrase
    - Words - Collection of recognized words
  - WindowsRecognitionCommandEventArgs
    - AnnotatedPhrase - Annotated recognized phrase
    - StreamTime - Recognition result absolute time for start of phrase audio
    - Length - Recognition result length of the phrase specified in 100 nanosecond units
    - TickCount - Number of milliseconds elapsed from the start of the system to the start of the current result
    - Start - The total 100 nanosecond units from the start of the stream to the start of the phrase
    - SAPIElements - Collection of SAPIElements for recognized phrase
    - SAPIPhrases - Collection of SAPIPhrases for recognized phrase
    - AudioStreamOffset - Audio stream offset
    - AudioTimeOffset - Audio time offset
  - WindowsMediaRecognitionCommandEventArgs
    - Alternates - Collection of alternate recognized phrases
    - Confidence - Indicates the confidence of the speech recognizer in the recognition result
    - Duration - The amount of time required for the utterance.
    - Phrase - Recognized phrase
    - RawConfidence - Indicates the relative confidence of the result when compared with a collection of alternatives
    - Rules - Collection of rules for recognized phrase
    - Semantics - Collection of semantics for recognized phrase
    - StartTime - The start time of the utterance
    - Status - The result state

      Value | Description |
      ---|---|
      0 | The recognition session or compilation succeeded |
      1 | A topic constraint was set for an unsupported language |
      2 | The language of the speech recognizer does not match the language of a grammar |
      3 | A grammar failed to compile |
      4 | Audio problems caused recognition to fail |
      5 | User canceled recognition session |
      6 | An unknown problem caused recognition or compilation to fail |
      7 | A timeout due to extended silence or poor audio caused recognition to fail |
      8 | An extended pause, or excessive processing time, caused recognition to fail |
      9 | Network problems caused recognition to fail |
      10 | Lack of a microphone caused recognition to fail |

    - Words - Collection of recognized words
- RecognitionDictationEventArgs
  - Text - Recognized phrase
  - AndroidRecognitionDictationEventArgs
    - Alternates - Collection of alternate recognized phrases
    - Confidence - Indicates the confidence of the speech recognizer in the recognition result
  - MCSRecognitionDictationEventArgs
    - Alternates - Alternative recognition results
    - Confidence - Confidence of recognition from 0.0 (no confidence) to 1.0 (full confidence)
    - Lexical - The actual words recognized
    - ITN - Inverse-text-normalized form of the recognized text and other transformations applied
    - MaskedITN - Normalized form with profanity masked
    - SessionId - Session identifier
    - Offset - Offset of the recognized speech in ticks. A single tick represents one hundred nanoseconds
    - Duration - Duration of the recognized speech that does not include trailing or leading silence
    - Words - Word level timing result list
  - SFRecognitionDictationEventArgs
    - Alternates - Collection of alternate recognized phrases
    - Confidence - Indicates the confidence of the speech recognizer in the recognition result
  - WindowsRecognitionDictationEventArgs
    - StreamTime - Recognition result absolute time for start of phrase audio
    - Length - Recognition result length of the phrase specified in 100 nanosecond units
    - TickCount - Number of milliseconds elapsed from the start of the system to the start of the current result
    - Start - The total 100 nanosecond units from the start of the stream to the start of the phrase
    - AudioStreamOffset - Audio stream offset
    - AudioTimeOffset - Audio time offset
  - WindowsMediaRecognitionDictationEventArgs
    - Alternates - Collection of alternate recognized phrases
    - Confidence - Indicates the confidence of the speech recognizer in the recognition result
    - Text - Recognized phrase
- RecognitionGrammarEventArgs
  - Phrase - Recognized phrase
  - AndroidRecognitionGrammarEventArgs
    - Alternates - Collection of alternate recognized phrases
    - Confidence - Indicates the confidence of the speech recognizer in the recognition result
    - Rules - Collection of rules for recognized phrase
    - Semantics - Collection of semantics for recognized phrase
    - Words - Collection of recognized words
  - SFRecognitionGrammarEventArgs
    - Alternates - Collection of alternate recognized phrases
    - Confidence - Indicates the confidence of the speech recognizer in the recognition result
    - Rules - Collection of rules for recognized phrase
    - Semantics - Collection of semantics for recognized phrase
    - Words - Collection of recognized words
  - WindowsRecognitionGrammarEventArgs
    - AnnotatedPhrase - Annotated recognized phrase
    - StreamTime - Recognition result absolute time for start of phrase audio
    - Length - Recognition result length of the phrase specified in 100 nanosecond units
    - TickCount - Number of milliseconds elapsed from the start of the system to the start of the current result
    - Start - The total 100 nanosecond units from the start of the stream to the start of the phrase
    - SAPIElements - Collection of SAPIElements for recognized phrase
    - SAPIPhrases - Collection of SAPIPhrases for recognized phrase
    - SAPIRules - Collection of SAPIRules for recognized phrase
    - AudioStreamOffset - Audio stream offset
    - AudioTimeOffset - Audio time offset
  - WindowsMediaRecognitionGrammarEventArgs
    - Alternates - Collection of alternate recognized phrases
    - Confidence - Indicates the confidence of the speech recognizer in the recognition result
    - Duration - The amount of time required for the utterance.
    - Phrase - Recognized phrase
    - RawConfidence - Indicates the relative confidence of the result when compared with a collection of alternatives
    - Rules - Collection of rules for recognized phrase
    - Semantics - Collection of semantics for recognized phrase
    - StartTime - The start time of the utterance
    - Status - The result state

      Value | Description |
      ---|---|
      0 | The recognition session or compilation succeeded |
      1 | A topic constraint was set for an unsupported language |
      2 | The language of the speech recognizer does not match the language of a grammar |
      3 | A grammar failed to compile |
      4 | Audio problems caused recognition to fail |
      5 | User canceled recognition session |
      6 | An unknown problem caused recognition or compilation to fail |
      7 | A timeout due to extended silence or poor audio caused recognition to fail |
      8 | An extended pause, or excessive processing time, caused recognition to fail |
      9 | Network problems caused recognition to fail |
      10 | Lack of a microphone caused recognition to fail |

    - Words - Collection of recognized words
- RecognitionHypothesisEventArgs
  - Text - Recognized phrase
  - MCSRecognitionHypothesisEventArgs
    - Duration - Audio stream offset
    - Offset - Audio time offset
    - SessionId - Session identifier
  - WindowsMediaRecognitionHypothesisEventArgs
- RecognitionOtherEventArgs
  - Command - Recognized phrase
  - AndroidRecognitionOtherEventArgs
  - SFRecognitionOtherEventArgs
  - WindowsRecognitionOtherEventArgs
    - AudioStreamOffset - Audio stream offset
    - AudioTimeOffset - Audio time offset
- RequestUIEventArgs
  - RequestUI - The requested recognizer dialog
  - AudioStreamOffset - Audio stream offset
  - AudioTimeOffset - Audio time offset
- SRBookMarkEventArgs
  - MarkValue - Bookmark value
  - RecoEventFlag - Speech recognition event flag
  - AudioStreamOffset - Audio stream offset
  - AudioTimeOffset - Audio time offset
- SREventArgs
  - AndroidSREventArgs
  - MCSSREventArgs
    - Duration - Audio stream offset
    - Offset - Audio time offset
    - SessionId - Session identifier
  - SFSREventArgs
  - WindowsSREventArgs
  - WindowsMediaSREventArgs
    - AudioStreamOffset - Audio stream offset
    - AudioTimeOffset - Audio time offset
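The Status values listed above for the WindowsMedia command and grammar event arguments can be turned into readable messages with a small lookup. The following is a minimal sketch rather than part of the documented API: the `RecognitionStatus` class and its method names are hypothetical, and it assumes the Status value is available to the application as an `int`. Only the value descriptions come from the list above.

```java
/**
 * Hypothetical helper (not part of the documented API): maps the numeric
 * Status values listed above to their descriptions.
 */
final class RecognitionStatus {
    // Index = Status value; text copied from the Status value list above.
    private static final String[] DESCRIPTIONS = {
        "The recognition session or compilation succeeded",
        "A topic constraint was set for an unsupported language",
        "The language of the speech recognizer does not match the language of a grammar",
        "A grammar failed to compile",
        "Audio problems caused recognition to fail",
        "User canceled recognition session",
        "An unknown problem caused recognition or compilation to fail",
        "A timeout due to extended silence or poor audio caused recognition to fail",
        "An extended pause, or excessive processing time, caused recognition to fail",
        "Network problems caused recognition to fail",
        "Lack of a microphone caused recognition to fail"
    };

    private RecognitionStatus() { }

    /** Returns true only for Status 0 (success). */
    static boolean isSuccess(int status) {
        return status == 0;
    }

    /** Returns the listed description, or a fallback for values outside 0..10. */
    static String describe(int status) {
        if (status < 0 || status >= DESCRIPTIONS.length) {
            return "Unrecognized status value " + status;
        }
        return DESCRIPTIONS[status];
    }

    public static void main(String[] args) {
        // Example value only; a real handler would read it from the event arguments.
        int status = 9;
        if (!isSuccess(status)) {
            System.out.println("Recognition failed: " + describe(status));
        }
    }
}
```

Keeping the fallback branch means a value added by a later API version is reported rather than causing an array index error.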
Event arguments may contain the following class objects:
- ChantAlternate
  - Phrase - Recognized phrase
  - AndroidAlternate
    - Confidence - Indicates the confidence of the speech recognizer in the recognition result
  - MCSAlternate
    - Confidence - Indicates the confidence of the speech recognizer in the recognition result
    - Lexical - The actual words recognized
    - ITN - Inverse-text-normalized form of the recognized text and other transformations applied
    - MaskedITN - Normalized form with profanity masked
    - SessionId - Session identifier
    - Offset - Offset of the recognized speech in ticks. A single tick represents one hundred nanoseconds
    - Duration - Duration of the recognized speech that does not include trailing or leading silence
    - Words - Word level timing result list
  - SFAlternate
    - Confidence - Indicates the confidence of the speech recognizer in the recognition result
  - WindowsMediaAlternate
    - Confidence - Indicates the confidence of the speech recognizer in the recognition result
- ChantSAPIElement
  - DisplayText - The display text for this element
  - LexicalForm - The lexical form of this element
  - Pronunciation - The phonemes for this element
  - ActualConfidence - The actual confidence for this element
  - SREngineConfidence - The confidence score computed by the SR engine
  - AudioTimeOffset - The starting offset of the element in 100-nanosecond units of time relative to the start of the phrase
  - AudioSizeTime - The length of the element in 100-nanosecond units of time
  - AudioStreamOffset - The starting offset of the element in bytes relative to the start of the phrase in the original input stream
  - AudioSizeBytes - The size of the element in bytes in the original input stream
  - RetainedStreamOffset - The starting offset of the element in bytes relative to the start of the phrase in the retained audio stream
  - RetainedSizeBytes - The size of the element in bytes in the retained audio stream
- ChantSAPIPhrase
  - PropName - Property name
  - PropID - Property ID
  - ValStr - ValStr value
  - Val - Val value
  - SREngineConfidence - Confidence computed by the speech recognition engine
  - Confidence - Confidence computed by SAPI
- ChantSAPIRule
  - Name - Rule name
  - ID - Rule ID
  - SREngineConfidence - Confidence computed by the speech recognition engine
  - Confidence - Confidence computed by SAPI
- ChantSemantic
  - PropName - Property name
  - PropValue - Property value
- ChantWord
  - Text - Word text
  - MCSWord
    - Duration - Word duration
    - Offset - Word offset
  - WindowsMediaWord
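Several of the properties above (for example the MCS `Offset`, the Windows `Length` and `Start`, and the `ChantSAPIElement` `AudioTimeOffset` and `AudioSizeTime`) are described in units of 100 nanoseconds. A minimal sketch of converting such values to a standard duration type follows. The `TickTimes` class is hypothetical and not part of the documented API, and the sketch assumes the application has already obtained the tick value as a `long`.

```java
import java.time.Duration;

/**
 * Hypothetical helper (not part of the documented API): converts offsets and
 * lengths reported in 100-nanosecond ticks into java.time.Duration values.
 */
final class TickTimes {
    /** One tick is 100 nanoseconds, as stated in the property descriptions above. */
    static final long NANOS_PER_TICK = 100L;

    private TickTimes() { }

    /** Converts a tick count to a Duration; throws ArithmeticException on overflow. */
    static Duration fromTicks(long ticks) {
        return Duration.ofNanos(Math.multiplyExact(ticks, NANOS_PER_TICK));
    }

    /** Returns the end position of a span, given its start offset and length in ticks. */
    static Duration endOf(long offsetTicks, long lengthTicks) {
        return fromTicks(Math.addExact(offsetTicks, lengthTicks));
    }

    public static void main(String[] args) {
        // Example values only, not real recognizer output.
        long offsetTicks = 12_500_000L;
        long lengthTicks = 7_500_000L;
        System.out.println("start: " + fromTicks(offsetTicks).toMillis() + " ms");
        System.out.println("end:   " + endOf(offsetTicks, lengthTicks).toMillis() + " ms");
    }
}
```

Note that `TickCount` is described in milliseconds rather than ticks, so it would not go through this conversion.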
Event notifications are received in callback routines as follows:
_Recognizer = _SpeechKit.createChantRecognizer();
if (_Recognizer != null)
{
// Set the callback
_Recognizer.setChantSpeechKitEvents(this);
// Register for recognition dictation callbacks
_Recognizer.registerCallback(ChantSpeechKitCallback.CCSRRecognitionDictation);
}
_Recognizer = _SpeechKit.CreateChantRecognizer();
if (_Recognizer != null)
{
_Recognizer.RecognitionCommand += this.Recognizer_RecognitionCommand;
}
_Recognizer = _SpeechKit->CreateChantRecognizer();
if (_Recognizer != NULL)
{
// Register Event Handlers
_Recognizer->SetRecognitionCommand(RecognitionCommand);
}
_Recognizer = _SpeechKit->CreateChantRecognizer();
if (_Recognizer != NULL)
{
// Register Event Handlers
_Recognizer->SetRecognitionCommand(RecognitionCommand);
}
_Recognizer := _SpeechKit.CreateChantRecognizer();
if (_Recognizer <> nil) then
begin
// Register Event Handlers
_Recognizer.RecognitionCommand := RecognitionCommand;
end;
_Recognizer = _SpeechKit.createChantRecognizer();
if (_Recognizer != null)
{
// Set the callback object
_Recognizer.setChantSpeechKitEvents(this);
// Register for callbacks
_Recognizer.registerCallback(ChantSpeechKitCallback.CCSRRecognitionCommand);
}
_recognizer = [_speechKit createChantRecognizer];
if (_recognizer != nil)
{
[_recognizer setDelegate:(id<SPChantRecognizerDelegate>)self];
}
_Recognizer = _SpeechKit!.createChantRecognizer()
if (_Recognizer != nil)
{
_Recognizer!.delegate = self
}
_Recognizer = _SpeechKit.CreateChantRecognizer()
// Declaring event handler routines with a Handles clause in VB automatically registers them for event notifications
Private Sub Recognizer_RecognitionCommand(ByVal sender As System.Object, ByVal e As RecognitionCommandEventArgs) Handles _Recognizer.RecognitionCommand
The recognizer object sends all notifications to the event handlers. All event data is contained in an arguments object.
@Override
public void recognitionDictation(Object o, RecognitionDictationEventArgs recognitionDictationEventArgs)
{
// Display recognized speech
final EditText textBox1 = (EditText) findViewById(R.id.textbox1);
if ((textBox1 != null) && (recognitionDictationEventArgs.getText() != null)) {
textBox1.append( recognitionDictationEventArgs.getText() + "\n" );
}
...
}
private void Recognizer_RecognitionCommand(object sender, RecognitionCommandEventArgs e)
{
if ((e != null) && (e.Phrase != null))
{
...
}
}
void CALLBACK RecognitionCommand(void* Sender, CRecognitionCommandEventArgs* Args)
{
...
// Get the command properties
if ((Args != NULL) && (wcslen(Args->GetPhrase()) > 0))
{
...
}
}
void RecognitionCommand(void* Sender, CRecognitionCommandEventArgs* Args)
{
// Get the command
if ((Args != NULL) && (Args->GetPhrase().Length() > 0))
{
...
}
}
procedure TForm1.RecognitionCommand(Sender: TObject; Args: TRecognitionCommandEventArgs);
begin
// Get the command properties
If ((Args <> nil) and (Length(Args.Phrase) > 0)) then
begin
...
end;
end;
public void recognitionCommand(Object sender, RecognitionCommandEventArgs args)
{
if ((args != null) && (args.getPhrase() != null))
{
...
}
}
-(void)recognitionDictation:(NSObject *)sender args:(SPRecognitionDictationEventArgs *)args
{
NSString* newText = [NSString stringWithFormat:@"%@%@ ", [_textView1 text], [args text]];
[_textView1 setText:newText];
}
func recognitionDictation(sender: SPChantRecognizer, args: SPRecognitionDictationEventArgs)
{
let newText = String(format: "%@%@ ", self.textView1.text, args.text)
self.textView1.text = newText
}
Private Sub Recognizer_RecognitionCommand(ByVal sender As System.Object, ByVal e As RecognitionCommandEventArgs) Handles _Recognizer.RecognitionCommand
If ((e IsNot Nothing) AndAlso (e.Phrase IsNot Nothing)) Then
...
End If
End Sub
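The handlers above only check that a phrase is present. Where the event arguments also carry Alternates and Confidence, a handler might filter on confidence before acting. The sketch below illustrates that idea with stand-in types: `AlternatePicker` and its nested `Alternate` class are hypothetical, not the SDK's own classes, and the 0.0 to 1.0 confidence scale is an assumption taken from the MCS description (other Speech APIs may use a different scale).

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of the documented API) for choosing among alternates. */
final class AlternatePicker {

    /** Hypothetical stand-in for an alternate result: a phrase plus a confidence score. */
    static final class Alternate {
        final String phrase;
        final double confidence; // assumed scale: 0.0 (none) to 1.0 (full)

        Alternate(String phrase, double confidence) {
            this.phrase = phrase;
            this.confidence = confidence;
        }
    }

    private AlternatePicker() { }

    /**
     * Returns the phrase of the highest-confidence alternate at or above
     * minConfidence, or null when the list is null, empty, or nothing qualifies.
     */
    static String bestPhrase(List<Alternate> alternates, double minConfidence) {
        if (alternates == null) {
            return null;
        }
        Alternate best = null;
        for (Alternate a : alternates) {
            // Skip missing entries and anything below the threshold.
            if (a == null || a.phrase == null || a.confidence < minConfidence) {
                continue;
            }
            if (best == null || a.confidence > best.confidence) {
                best = a;
            }
        }
        return best == null ? null : best.phrase;
    }

    public static void main(String[] args) {
        // Example data only, not real recognizer output.
        List<Alternate> alternates = Arrays.asList(
            new Alternate("open file", 0.62),
            new Alternate("open mile", 0.31));
        System.out.println(bestPhrase(alternates, 0.5));
    }
}
```

Returning null when nothing clears the threshold lets the caller decide whether to ignore the utterance or prompt the user to repeat it.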