How Tos

Last reviewed: 9/5/2022

Article ID: H072203

HOW: Managing listening context with speech recognition vocabularies

The information in this article applies to:

  • SpeechKit 11

Summary

Explore the new way of managing listening context with speech recognition vocabularies.

More Information

Vocabularies define the listening context from which to recognize speech. Speech recognition engines may support four types of vocabularies:

  • command,
  • grammar,
  • dictation, and
  • dictation topic.

To recognize speech with vocabularies, applications define the vocabulary, enable it to make it active, and disable it to make it inactive.

Applications can instantiate, enable, and disable vocabularies as needed to manage the listening context. More than one vocabulary may be enabled concurrently.

For Windows desktop applications, speech recognition begins when StartRecognition is invoked and stops when StopRecognition is invoked. Disabling a vocabulary during recognition has no effect. Speech recognition must be stopped before changing the listening context by enabling and disabling vocabularies.
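The stop, change, restart sequence can be sketched with a small in-memory model. This is only an illustration of the rule described above: ListeningContextSketch and all of its methods are hypothetical names, not SpeechKit classes, and the model assumes that enabling a vocabulary during recognition is ignored in the same way as disabling one.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative in-memory model only; none of these names are SpeechKit APIs.
class ListeningContextSketch {
    private final Set<String> enabled = new LinkedHashSet<>();
    private boolean recognizing;

    void startRecognition() { recognizing = true; }
    void stopRecognition() { recognizing = false; }
    boolean isRecognizing() { return recognizing; }

    // Returns false, leaving the context unchanged, while recognition is running.
    // Assumption: enabling is ignored during recognition just as disabling is.
    boolean enable(String vocabulary) {
        if (recognizing) return false;
        enabled.add(vocabulary);
        return true;
    }

    // Returns false, leaving the context unchanged, while recognition is running.
    boolean disable(String vocabulary) {
        if (recognizing) return false;
        enabled.remove(vocabulary);
        return true;
    }

    Set<String> activeVocabularies() { return new LinkedHashSet<>(enabled); }

    // Stop, change the context, then resume only if recognition was running before.
    void switchContext(String from, String to) {
        boolean wasRecognizing = recognizing;
        stopRecognition();
        disable(from);
        enable(to);
        if (wasRecognizing) startRecognition();
    }

    public static void main(String[] args) {
        ListeningContextSketch context = new ListeningContextSketch();
        context.enable("commands");
        context.startRecognition();
        System.out.println(context.disable("commands")); // false: ignored while recognizing
        context.switchContext("commands", "text");
        System.out.println(context.activeVocabularies()); // [text]
    }
}
```

Wrapping the three steps in one helper such as switchContext keeps callers from changing vocabularies while recognition is still running.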

For Android, iOS, and macOS applications, speech recognition begins when StartRecognition is invoked. It stops automatically when either an utterance is recognized or a timeout exception occurs, except when a dictation vocabulary is used. With a dictation vocabulary, speech recognition restarts automatically after each recognized utterance or timeout exception until StopRecognition is invoked. Disabling a vocabulary during recognition has no effect. Speech recognition must be stopped before changing the listening context by enabling and disabling vocabularies.
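The difference between the two session lifecycles on Android, iOS, and macOS can be pictured as a tiny state machine. The sketch below is an illustrative model under that reading of the paragraph above; MobileSessionSketch and its methods are hypothetical names, not SpeechKit APIs.

```java
// Illustrative model of the Android, iOS, and macOS session lifecycle.
// Hypothetical names; this is not a SpeechKit class.
class MobileSessionSketch {
    private final boolean dictationEnabled;
    private boolean recognizing;

    MobileSessionSketch(boolean dictationEnabled) {
        this.dictationEnabled = dictationEnabled;
    }

    void startRecognition() { recognizing = true; }
    void stopRecognition() { recognizing = false; }
    boolean isRecognizing() { return recognizing; }

    // Called when an utterance is recognized or a timeout occurs.
    void onUtteranceOrTimeout() {
        if (!recognizing) return;
        // Without a dictation vocabulary the session ends here.
        // With one, recognition restarts automatically, so it stays active.
        recognizing = dictationEnabled;
    }

    public static void main(String[] args) {
        MobileSessionSketch commands = new MobileSessionSketch(false);
        commands.startRecognition();
        commands.onUtteranceOrTimeout();
        System.out.println(commands.isRecognizing()); // false: must call startRecognition again

        MobileSessionSketch dictation = new MobileSessionSketch(true);
        dictation.startRecognition();
        dictation.onUtteranceOrTimeout();
        System.out.println(dictation.isRecognizing()); // true: restarted automatically
        dictation.stopRecognition();
        System.out.println(dictation.isRecognizing()); // false
    }
}
```

In practice this means a command-driven application restarts recognition itself after each result, while a dictation session runs until it is explicitly stopped.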

Command Vocabulary

A command vocabulary consists of words and phrases that are spoken as commands. It is a simple list of the possible spoken words and phrases. The speech recognition engine matches recognized speech against the list and returns the best match. Command vocabularies are typically very small. For example, a command vocabulary may contain 100 or fewer entries.

A command phrase can optionally have a property value associated with it. This enables processing recognition results on the basis of alternate values in addition to command strings.
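One way to picture property values is as a lookup from phrase to a value the application switches on, so that several phrasings share one handler. The sketch below is a self-contained illustration only: CommandPropertySketch, its methods, the integer constants, and the choice of an integer value type are all assumptions, not the SpeechKit API.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative lookup of a property value per command phrase.
// Hypothetical names and value type; this is not a SpeechKit class.
class CommandPropertySketch {
    static final int OPEN = 1;
    static final int CLOSE = 2;
    static final int NO_MATCH = -1;

    private final Map<String, Integer> values = new HashMap<>();

    // Associates a phrase with a value; matching is case-insensitive.
    void addCommand(String phrase, int value) {
        values.put(phrase.toLowerCase(Locale.ROOT), value);
    }

    // Returns the value for the recognized text, or NO_MATCH if it is not a command.
    int valueFor(String recognizedText) {
        return values.getOrDefault(recognizedText.toLowerCase(Locale.ROOT), NO_MATCH);
    }

    public static void main(String[] args) {
        CommandPropertySketch vocab = new CommandPropertySketch();
        vocab.addCommand("open file", OPEN);
        vocab.addCommand("load file", OPEN);   // alternate phrasing, same value
        vocab.addCommand("close file", CLOSE);

        // Both phrasings reach the same branch because they share a value.
        System.out.println(vocab.valueFor("Load file") == OPEN); // true
        System.out.println(vocab.valueFor("save file"));         // -1
    }
}
```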

For Android, iOS, and macOS applications, the RecognitionOther event is fired if there is no command vocabulary match for recognized speech.

To create a command vocabulary, instantiate a ChantCommandVocab class object. Then add the commands.

// Define the vocabulary
JChantCommandVocab _CommandVocab = _Recognizer.createCommandVocab("commands");

// Add commands to the vocabulary
_CommandVocab.addCommand("Open <filename>");
_CommandVocab.addCommand("Print <filename>");
_CommandVocab.addCommand("Close <filename>");

// Add choices to list
_CommandVocab.addChoiceToList("expenses", "filename");
_CommandVocab.addChoiceToList("status report", "filename");

// Enable the vocabulary
_CommandVocab.enable();

// Disable the vocabulary
_CommandVocab.disable();
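
The example above suggests that a placeholder such as <filename> stands for any choice in the list of the same name. Assuming that reading, the following self-contained sketch shows which concrete phrases three commands and two choices would produce. ChoiceListSketch and expand are hypothetical names used for illustration; the actual expansion is performed by the recognizer, not by application code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative expansion of <list> placeholders into concrete phrases.
// Hypothetical helper; this is not a SpeechKit class.
class ChoiceListSketch {
    static List<String> expand(List<String> commands, Map<String, List<String>> lists) {
        List<String> phrases = new ArrayList<>();
        for (String command : commands) {
            List<String> partial = new ArrayList<>();
            partial.add(command);
            for (Map.Entry<String, List<String>> list : lists.entrySet()) {
                String placeholder = "<" + list.getKey() + ">";
                List<String> next = new ArrayList<>();
                for (String phrase : partial) {
                    if (!phrase.contains(placeholder)) {
                        next.add(phrase); // nothing to substitute for this list
                        continue;
                    }
                    for (String choice : list.getValue()) {
                        next.add(phrase.replace(placeholder, choice));
                    }
                }
                partial = next;
            }
            phrases.addAll(partial);
        }
        return phrases;
    }

    public static void main(String[] args) {
        List<String> commands =
            Arrays.asList("Open <filename>", "Print <filename>", "Close <filename>");
        Map<String, List<String>> lists = new LinkedHashMap<>();
        lists.put("filename", Arrays.asList("expenses", "status report"));

        // Prints six phrases, from "Open expenses" to "Close status report".
        for (String phrase : expand(commands, lists)) {
            System.out.println(phrase);
        }
    }
}
```

Keeping the choices in a list means a new file name is added once rather than once per command.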

Grammar Vocabulary

A grammar vocabulary consists of words and phrases and combinations of words and phrases expressed in a grammar syntax. Recognizers have their own syntax for expressing grammars.

Recognizer                                              Speech API        Grammar Syntax
------------------------------------------------------  ----------------  --------------------------------
Microsoft SAPI 5 (all languages)                        SAPI 5            SAPI 5 XML Grammar, W3C SRGS XML
Microsoft Speech Platform (all languages)               SAPI 5            W3C SRGS XML
Microsoft .NET System.Speech (all languages)            System.Speech     W3C SRGS XML
Microsoft .NET Microsoft.Speech (all languages)         Microsoft.Speech  W3C SRGS XML
Microsoft WindowsMedia (UWP and WinRT) (all languages)  WindowsMedia      W3C SRGS XML
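
For orientation, a small W3C SRGS XML grammar covering the same open, print, and close commands might look like the string embedded in the sketch below. The grammar content, the SrgsGrammarSketch class, and its helper methods are illustrative assumptions; the code only uses the standard Java XML parser to confirm that the grammar is well-formed and to list its rules, and does not load it into a recognizer.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative SRGS grammar plus a well-formedness check; not a SpeechKit class.
class SrgsGrammarSketch {
    static final String SRGS_NS = "http://www.w3.org/2001/06/grammar";

    // Hypothetical example grammar: a command word followed by a file name.
    static final String GRAMMAR =
        "<grammar version=\"1.0\" xml:lang=\"en-US\" root=\"command\"\n"
      + "         xmlns=\"" + SRGS_NS + "\">\n"
      + "  <rule id=\"command\" scope=\"public\">\n"
      + "    <one-of>\n"
      + "      <item>open</item>\n"
      + "      <item>print</item>\n"
      + "      <item>close</item>\n"
      + "    </one-of>\n"
      + "    <ruleref uri=\"#filename\"/>\n"
      + "  </rule>\n"
      + "  <rule id=\"filename\">\n"
      + "    <one-of>\n"
      + "      <item>expenses</item>\n"
      + "      <item>status report</item>\n"
      + "    </one-of>\n"
      + "  </rule>\n"
      + "</grammar>\n";

    // Parses the grammar text; throws IllegalStateException if it is not well-formed XML.
    static Document parse() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(GRAMMAR)));
        } catch (Exception e) {
            throw new IllegalStateException("Grammar is not well-formed XML", e);
        }
    }

    // Name of the rule the grammar starts from.
    static String rootRule() {
        return parse().getDocumentElement().getAttribute("root");
    }

    // Ids of all rules, in document order.
    static List<String> ruleIds() {
        NodeList rules = parse().getElementsByTagNameNS(SRGS_NS, "rule");
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < rules.getLength(); i++) {
            ids.add(((Element) rules.item(i)).getAttribute("id"));
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(rootRule()); // command
        System.out.println(ruleIds());  // [command, filename]
    }
}
```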

To create a grammar vocabulary from a grammar source file, use the ChantGrammarVocab class.

// Not supported on Android

Dictation Vocabulary

A dictation vocabulary represents a dictionary of all possible words from which speech is recognized. These vocabularies are integrated with the speech recognition engine. They are typically very large. For example, a dictionary may contain 30,000 words or significantly more.

To create a dictation vocabulary, use the ChantDictationVocab class.

// Define the vocabulary
JChantDictationVocab _DictationVocab = _Recognizer.createDictationVocab("text");

// Enable the vocabulary
_DictationVocab.enable();

// Disable the vocabulary
_DictationVocab.disable();

SAPI5 currently defines one specialized dictation topic: Spelling. SAPI5 recognition engines are not required to support specialized dictation topics, including Spelling.

To enable a topic with SAPI5, use the ChantDictationVocab class and specify the topic name.

// Not supported on Android