How do I integrate a VoCon Hybrid recognizer that accesses the Nuance Mobile Application Server (NMAS)?

Last reviewed: 5/1/2012

HOW Article ID: H051201

The information in this article applies to:

  • SpeechKit 7

Summary

Chant Application Framework Vocabulary Management provides a remote vocabulary object for controlling speech recognition with a remote server. SpeechKit supports VoCon 3200 Hybrid V4.3 for accessing the Nuance Mobile Application Server (NMAS) with remote vocabularies.

More Information

A remote vocabulary is an XML file containing the parameters needed to use speech recognition services on a remote server. The parameters are unique to the service provider.

VoCon 3200 Hybrid Release 4.3 introduces a new Nuance Mobile Speech Platform (NMSP) add-on feature that supports the capabilities of Nuance Mobile Application Server (NMAS).

The VoCon NMSP add-on feature connects your application over TCP/IP, through the NMSP gateway, to a Nuance Mobile Application Server (NMAS). This server executes queries to perform dictation recognition and command execution.

A remote vocabulary represents a single query request for the server. Your application may have one or more remote vocabularies representing the various queries needed at runtime.

The query request is invoked when the remote vocabulary is enabled and terminated when the remote vocabulary is disabled.

For dictation query requests, the query is invoked until the vocabulary is disabled, simulating continuous speech recognition.

The following remote vocabulary illustrates using dictation services with the NMAS server:


<?xml version="1.0" ?>
<csrremotevocab>
<connectionparams>
	<host>65.124.114.138</host>
	<port>443</port>
	<applicationid>your_id</applicationid>
	<applicationkey>your_key</applicationkey>
</connectionparams>
<queryparams>
	<applicationversion>1</applicationversion>
	<applicationlanguage>enus</applicationlanguage>
	<carrier>wifi</carrier>
</queryparams>
<acmodfile>C:\Nuance\VoCon Hybrid\SDK_v4_3\models\acmod4_900_enu_gen_car_f16_v1_0_0.dat</acmodfile>
<nmascommandname>AUTOMOTIVE_RESOLVE_DICTATION</nmascommandname>
<nmaslinkname>BODY</nmaslinkname>
</csrremotevocab>
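
Because a remote vocabulary is plain XML, its settings can be inspected before the file is handed to the recognizer. The following is a minimal sketch in Python using only the standard library; it is not part of SpeechKit or VoCon, and the function name summarize_vocab is hypothetical. It assumes the host, port, and NMAS elements are all present, as in the dictation sample above.

```python
import xml.etree.ElementTree as ET

# Sample remote vocabulary; the host, id, and key values are placeholders.
REMOTE_VOCAB = """<?xml version="1.0" ?>
<csrremotevocab>
<connectionparams>
	<host>65.124.114.138</host>
	<port>443</port>
	<applicationid>your_id</applicationid>
	<applicationkey>your_key</applicationkey>
</connectionparams>
<nmascommandname>AUTOMOTIVE_RESOLVE_DICTATION</nmascommandname>
<nmaslinkname>BODY</nmaslinkname>
</csrremotevocab>
"""


def summarize_vocab(xml_text):
    """Return the connection and NMAS settings found in a remote vocabulary.

    Hypothetical helper: assumes every element it reads is present.
    """
    root = ET.fromstring(xml_text)
    return {
        "host": root.findtext("connectionparams/host"),
        "port": int(root.findtext("connectionparams/port")),
        "command": root.findtext("nmascommandname"),
        "link": root.findtext("nmaslinkname"),
    }


if __name__ == "__main__":
    print(summarize_vocab(REMOTE_VOCAB))
```

A check like this can catch a missing host or a non-numeric port before the application attempts to connect.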

The following remote vocabulary elements may be defined:

Element              Parent Element    Definition
csrremotevocab                         Vocabulary root element.
connectionparams     csrremotevocab    Defines the session parameters.
host                 connectionparams  Defines the IP address of the remote NMSP server.
port                 connectionparams  Defines the port number of the remote NMSP server.
applicationid        connectionparams  Defines the application id.
applicationkey       connectionparams  Defines the application key.
queryparams          csrremotevocab    Defines the query parameters.
applicationversion   queryparams       Defines the application version string.
applicationlanguage  queryparams       Defines the application language identifier.
carrier              queryparams       Defines the carrier name.
phonemodel           queryparams       Defines the phone model, e.g. "iPhone" (this key is optional).
phonenetwork         queryparams       Defines the phone network (this key is optional).
phoneos              queryparams       Defines the phone OS, e.g. "4.0" (this key is optional).
phonesubmodel        queryparams       Defines the phone sub-model, e.g. "iPhone2,1" (this key is optional).
dictationtype        queryparams       Defines the query type as Dictation, Websearch, or Automotive-Dictation.
audiosource          queryparams       Indicates the source of audio from the user as SpeakerAndMicrophone, HeadsetInOut, HeadsetBT, HeadPhone, or LineOut (this key is optional).
commandtimeout       queryparams       Defines the query timeout value, in milliseconds (this key is optional).
asrparams            csrremotevocab    Defines the feature extractor and recognition parameters.
fxparams             asrparams         Defines the feature extractor properties.
startenable          fxparams          Defines an LH_FX_PARAM_START_ENABLE value.
tsilence             fxparams          Defines an LH_FX_PARAM_TSILENCE value.
recparams            asrparams         Defines the distributed recognizer properties.
buffersize           recparams         Defines the buffer size between front-end and back-end, in milliseconds.
acmodfile            csrremotevocab    Defines the acoustic model file.
pcmaudio             csrremotevocab    Defines a collection of source audio files.
audiofile            pcmaudio          Defines an audio file (PCM wave audio).
localcontext         csrremotevocab    Defines a collection of local contexts.
contextfile          localcontext      Defines a context file (compiled grammar).
samplerate           csrremotevocab    Defines the audio sample rate.
deviceid             csrremotevocab    Defines the local audio source device id.
framesize            csrremotevocab    Defines the number of audio samples in the frame.
buffercount          csrremotevocab    Defines the number of audio frames to buffer.
ssenoisereduction    csrremotevocab    A flag to indicate whether noise reduction should be applied to the audio sent to the remote server.
nmascommandname      csrremotevocab    Defines the NMAS remote query command name: AUTOMOTIVE_RESOLVE_DICTATION (dictation) or NMDP_ASR_COMMAND (command).
nmaslinkname         csrremotevocab    Defines the NMAS audio link name: BODY (dictation) or AUDIO_INFO (command).
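
The parent/child relationships listed above lend themselves to a simple structural check. The sketch below, in Python with the standard library only, walks a remote vocabulary and reports any element that is unknown or sits under a different parent than documented. The PARENTS map is transcribed from the element list above; the helper name check_structure is hypothetical, and the article does not say how the recognizer itself treats misplaced elements.

```python
import xml.etree.ElementTree as ET

# Element -> documented parent (None marks the root element).
PARENTS = {
    "csrremotevocab": None,
    "connectionparams": "csrremotevocab", "queryparams": "csrremotevocab",
    "asrparams": "csrremotevocab", "acmodfile": "csrremotevocab",
    "pcmaudio": "csrremotevocab", "localcontext": "csrremotevocab",
    "samplerate": "csrremotevocab", "deviceid": "csrremotevocab",
    "framesize": "csrremotevocab", "buffercount": "csrremotevocab",
    "ssenoisereduction": "csrremotevocab",
    "nmascommandname": "csrremotevocab", "nmaslinkname": "csrremotevocab",
    "host": "connectionparams", "port": "connectionparams",
    "applicationid": "connectionparams", "applicationkey": "connectionparams",
    "applicationversion": "queryparams", "applicationlanguage": "queryparams",
    "carrier": "queryparams", "phonemodel": "queryparams",
    "phonenetwork": "queryparams", "phoneos": "queryparams",
    "phonesubmodel": "queryparams", "dictationtype": "queryparams",
    "audiosource": "queryparams", "commandtimeout": "queryparams",
    "fxparams": "asrparams", "recparams": "asrparams",
    "startenable": "fxparams", "tsilence": "fxparams",
    "buffersize": "recparams",
    "audiofile": "pcmaudio", "contextfile": "localcontext",
}


def check_structure(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "csrremotevocab":
        problems.append(f"unexpected root element: {root.tag}")

    def walk(parent):
        for child in parent:
            if child.tag not in PARENTS:
                problems.append(f"unknown element: {child.tag}")
            elif PARENTS[child.tag] != parent.tag:
                problems.append(
                    f"{child.tag} belongs under {PARENTS[child.tag]}, "
                    f"found under {parent.tag}"
                )
            walk(child)

    walk(root)
    return problems
```

Calling check_structure on the text of a vocabulary file returns an empty list when every element is in its documented position.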

The following remote vocabulary illustrates all the elements that may be defined:


<?xml version="1.0" ?>
<csrremotevocab>
<connectionparams>
	<host>65.124.114.138</host>
	<port>443</port>
	<applicationid>your_id</applicationid>
	<applicationkey>your_key</applicationkey>
</connectionparams>
<queryparams>
	<applicationversion>1</applicationversion>
	<applicationlanguage>enus</applicationlanguage>
	<carrier>ATT</carrier>
	<phonemodel>iPhone</phonemodel>
	<phonenetwork>ATT</phonenetwork>
	<phoneos>4.0</phoneos>
	<phonesubmodel>iPhone2,1</phonesubmodel>
	<dictationtype>WebSearch</dictationtype>
	<audiosource>HeadsetInOut</audiosource>
	<commandtimeout>200000</commandtimeout>
</queryparams>
<asrparams>
	<fxparams>
		<startenable>1</startenable>
		<tsilence>1200</tsilence>
	</fxparams>
	<recparams>
		<buffersize>3000</buffersize>
	</recparams>
</asrparams>
<acmodfile>C:\Nuance\VoCon Hybrid\SDK_v4_3\models\acmod4_900_enu_gen_car_f16_v1_0_0.dat</acmodfile>
<pcmaudio>
	<audiofile>myaudio1.wav</audiofile>
	<audiofile>myaudio2.wav</audiofile>
</pcmaudio>
<localcontext>
	<contextfile>C:\Nuance\VoCon Hybrid\SDK_v4_3\samples\data\ctx\acmod4_900_enu_gen_car_f16_v1_0_0\commands_names.fcf</contextfile>
</localcontext>
<samplerate>16000</samplerate>
<deviceid>-1</deviceid>
<framesize>384</framesize>
<buffercount>5</buffercount>
<ssenoisereduction>0</ssenoisereduction>
<nmascommandname>NMDP_ASR_COMMAND</nmascommandname>
<nmaslinkname>AUDIO_INFO</nmaslinkname>
</csrremotevocab>
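
The element definitions pair each NMAS command name with a link name: AUTOMOTIVE_RESOLVE_DICTATION with BODY for dictation, and NMDP_ASR_COMMAND with AUDIO_INFO for commands. A minimal Python sketch of a check for that pairing follows. The helper name query_kind is hypothetical, and treating a mismatched pair as an error is an assumption; the article does not state how the server responds to one.

```python
import xml.etree.ElementTree as ET

# Command name -> (query kind, matching link name), per the element definitions.
NMAS_PAIRS = {
    "AUTOMOTIVE_RESOLVE_DICTATION": ("dictation", "BODY"),
    "NMDP_ASR_COMMAND": ("command", "AUDIO_INFO"),
}


def query_kind(xml_text):
    """Return "dictation" or "command" for a remote vocabulary.

    Raises ValueError when the command name is not one of the two documented
    values or when the link name does not match it (assumed to be an error).
    """
    root = ET.fromstring(xml_text)
    command = (root.findtext("nmascommandname") or "").strip()
    link = (root.findtext("nmaslinkname") or "").strip()
    if command not in NMAS_PAIRS:
        raise ValueError(f"unrecognized nmascommandname: {command!r}")
    kind, expected_link = NMAS_PAIRS[command]
    if link != expected_link:
        raise ValueError(
            f"{command} expects nmaslinkname {expected_link}, found {link!r}"
        )
    return kind
```

Applied to the full vocabulary above, which uses NMDP_ASR_COMMAND with AUDIO_INFO, the function would classify the query as a command request.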