How do I integrate the VoCon Hybrid recognizer that accesses the Nuance Mobile Application Server (NMAS)?
Last reviewed: 5/1/2012
HOW Article ID: H051201
The information in this article applies to:
- SpeechKit 7
Summary
Chant Application Framework Vocabulary Management provides a remote vocabulary object for controlling speech recognition with a remote server. SpeechKit supports VoCon 3200 Hybrid V4.3 for accessing the Nuance Mobile Application Server (NMAS) with remote vocabularies.
More Information
A remote vocabulary is an XML file containing the parameters for using speech recognition services on a remote server. The parameters are unique to the service provider.
VoCon 3200 Hybrid Release 4.3 introduces a new Nuance Mobile Speech Platform (NMSP) add-on feature that supports the capabilities of Nuance Mobile Application Server (NMAS).
The VoCon NMSP add-on feature connects your application over TCP/IP through the NMSP gateway with a Nuance Mobile Application Server (NMAS). This server executes queries to perform dictation recognition and command execution.
A remote vocabulary represents a single query request for the server. Your application may have one or more remote vocabularies representing the various queries needed at runtime.
The query request is invoked when the remote vocabulary is enabled and terminated when the remote vocabulary is disabled.
For dictation query requests, the query is invoked repeatedly until the vocabulary is disabled, which simulates continuous speech recognition.
The following remote vocabulary illustrates using dictation services with the NMAS server:
<?xml version="1.0" ?>
<csrremotevocab>
<connectionparams>
<host>65.124.114.138</host>
<port>443</port>
<applicationid>your_id</applicationid>
<applicationkey>your_key</applicationkey>
</connectionparams>
<queryparams>
<applicationversion>1</applicationversion>
<applicationlanguage>enus</applicationlanguage>
<carrier>wifi</carrier>
</queryparams>
<acmodfile>C:\Nuance\VoCon Hybrid\SDK_v4_3\models\acmod4_900_enu_gen_car_f16_v1_0_0.dat</acmodfile>
<nmascommandname>AUTOMOTIVE_RESOLVE_DICTATION</nmascommandname>
<nmaslinkname>BODY</nmaslinkname>
</csrremotevocab>
The following remote vocabulary elements may be defined:
Element | Parent Element | Definition |
---|---|---|
csrremotevocab | | Defines the vocabulary root element. |
connectionparams | csrremotevocab | Defines the session parameters. |
host | connectionparams | Defines the IP address of the remote NMSP server. |
port | connectionparams | Defines the port number of the remote NMSP server. |
applicationid | connectionparams | Defines the application id. |
applicationkey | connectionparams | Defines the application key. |
queryparams | csrremotevocab | Defines the query parameters. |
applicationversion | queryparams | Defines the application version string. |
applicationlanguage | queryparams | Defines the application language identifier. |
carrier | queryparams | Defines the carrier name. |
phonemodel | queryparams | Defines the phone model, e.g. "iPhone" (this key is optional). |
phonenetwork | queryparams | Defines the phone network (this key is optional). |
phoneos | queryparams | Defines the phone OS, e.g. "4.0" (this key is optional). |
phonesubmodel | queryparams | Defines the phone sub-model, e.g. "iPhone2,1" (this key is optional). |
dictationtype | queryparams | Defines the query type as Dictation, Websearch, or Automotive-Dictation. |
audiosource | queryparams | Indicates the source of audio from the user as SpeakerAndMicrophone, HeadsetInOut, HeadsetBT, HeadPhone, or LineOut (this key is optional). |
commandtimeout | queryparams | Defines query timeout value, in milliseconds (this key is optional). |
asrparams | csrremotevocab | Defines the feature extractor and recognition parameters. |
fxparams | asrparams | Defines the feature extractor properties. |
startenable | fxparams | Defines a LH_FX_PARAM_START_ENABLE value. |
tsilence | fxparams | Defines a LH_FX_PARAM_TSILENCE value. |
recparams | asrparams | Defines the distributed recognizer properties. |
buffersize | recparams | Defines the buffer size between front-end and back-end, in milliseconds. |
acmodfile | csrremotevocab | Defines the acoustic model file. |
pcmaudio | csrremotevocab | Defines a collection of source audio files. |
audiofile | pcmaudio | Defines an audio file (pcm wave audio). |
localcontext | csrremotevocab | Defines a collection of local contexts. |
contextfile | localcontext | Defines a context file (compiled grammar). |
samplerate | csrremotevocab | Defines the audio sample rate. |
deviceid | csrremotevocab | Defines the local audio source device id. |
framesize | csrremotevocab | Defines the number of audio samples in the frame. |
buffercount | csrremotevocab | Defines the number of audio frames to buffer. |
ssenoisereduction | csrremotevocab | A flag to indicate whether noise reduction should be applied to the audio sent to the remote server. |
nmascommandname | csrremotevocab | Defines the NMAS remote query command name: AUTOMOTIVE_RESOLVE_DICTATION (dictation) or NMDP_ASR_COMMAND (command). |
nmaslinkname | csrremotevocab | Defines the NMAS audio link name: BODY (dictation) or AUDIO_INFO (command). |
The following remote vocabulary illustrates all the elements that may be defined:
<?xml version="1.0" ?>
<csrremotevocab>
<connectionparams>
<host>65.124.114.138</host>
<port>443</port>
<applicationid>your_id</applicationid>
<applicationkey>your_key</applicationkey>
</connectionparams>
<queryparams>
<applicationversion>1</applicationversion>
<applicationlanguage>enus</applicationlanguage>
<carrier>ATT</carrier>
<phonemodel>iPhone</phonemodel>
<phonenetwork>ATT</phonenetwork>
<phoneos>4.0</phoneos>
<phonesubmodel>iPhone2,1</phonesubmodel>
<dictationtype>WebSearch</dictationtype>
<audiosource>HeadsetInOut</audiosource>
<commandtimeout>200000</commandtimeout>
</queryparams>
<asrparams>
<fxparams>
<startenable>1</startenable>
<tsilence>1200</tsilence>
</fxparams>
<recparams>
<buffersize>3000</buffersize>
</recparams>
</asrparams>
<acmodfile>C:\Nuance\VoCon Hybrid\SDK_v4_3\models\acmod4_900_enu_gen_car_f16_v1_0_0.dat</acmodfile>
<pcmaudio>
<audiofile>myaudio1.wav</audiofile>
<audiofile>myaudio2.wav</audiofile>
</pcmaudio>
<localcontext>
<contextfile>C:\Nuance\VoCon Hybrid\SDK_v4_3\samples\data\ctx\acmod4_900_enu_gen_car_f16_v1_0_0\commands_names.fcf</contextfile>
</localcontext>
<samplerate>16000</samplerate>
<deviceid>-1</deviceid>
<framesize>384</framesize>
<buffercount>5</buffercount>
<ssenoisereduction>0</ssenoisereduction>
<nmascommandname>NMDP_ASR_COMMAND</nmascommandname>
<nmaslinkname>AUDIO_INFO</nmaslinkname>
</csrremotevocab>
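Because a remote vocabulary is plain XML, it can be sanity-checked before it is handed to the recognizer. The following is a minimal sketch, not part of SpeechKit or the VoCon SDK: it uses only Python's standard library, and the names `EXPECTED_PARENT` and `check_remote_vocab` are hypothetical helpers introduced for illustration. The parent relationships are transcribed from the element table in this article; the check assumes that table is complete and does not validate element values.

```python
import xml.etree.ElementTree as ET

# Parent of each element, transcribed from the element table in this article.
# An empty string marks the root element.
EXPECTED_PARENT = {
    "csrremotevocab": "",
    "connectionparams": "csrremotevocab",
    "host": "connectionparams",
    "port": "connectionparams",
    "applicationid": "connectionparams",
    "applicationkey": "connectionparams",
    "queryparams": "csrremotevocab",
    "applicationversion": "queryparams",
    "applicationlanguage": "queryparams",
    "carrier": "queryparams",
    "phonemodel": "queryparams",
    "phonenetwork": "queryparams",
    "phoneos": "queryparams",
    "phonesubmodel": "queryparams",
    "dictationtype": "queryparams",
    "audiosource": "queryparams",
    "commandtimeout": "queryparams",
    "asrparams": "csrremotevocab",
    "fxparams": "asrparams",
    "startenable": "fxparams",
    "tsilence": "fxparams",
    "recparams": "asrparams",
    "buffersize": "recparams",
    "acmodfile": "csrremotevocab",
    "pcmaudio": "csrremotevocab",
    "audiofile": "pcmaudio",
    "localcontext": "csrremotevocab",
    "contextfile": "localcontext",
    "samplerate": "csrremotevocab",
    "deviceid": "csrremotevocab",
    "framesize": "csrremotevocab",
    "buffercount": "csrremotevocab",
    "ssenoisereduction": "csrremotevocab",
    "nmascommandname": "csrremotevocab",
    "nmaslinkname": "csrremotevocab",
}


def check_remote_vocab(xml_text):
    """Return a list of structural problems; an empty list means none were found."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != "csrremotevocab":
        problems.append(f"unexpected root element <{root.tag}>")

    def walk(node):
        # Compare each child's actual parent with the parent listed in the table.
        for child in node:
            expected = EXPECTED_PARENT.get(child.tag)
            if expected is None:
                problems.append(f"unknown element <{child.tag}>")
            elif expected != node.tag:
                problems.append(
                    f"<{child.tag}> should be inside <{expected}>, "
                    f"found in <{node.tag}>"
                )
            walk(child)

    walk(root)
    return problems
```

To use it, read the vocabulary file into a string and pass it to `check_remote_vocab`. For example, a file that places `tsilence` directly under `csrremotevocab` yields one message saying the element belongs inside `fxparams`.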