What's new in SpeechKit 5?
Last reviewed: 5/7/2007
FAQ Article ID: F050701
GO FASTER!
Gain development productivity and runtime efficiency.
1. Internal enhancements for streamlined performance.
- Classes re-factored for efficiency.
- Runtime functions converted to secure CRT functions.
- Libraries compiled and linked with the latest versions of the compiler and runtimes.
- SAPI 5 TTS support enhanced by eliminating threading for asynchronous file requests, which also eliminates the SAPI 5 file data stamp.
- ChantSR and ChantTTS request queue managers' multi-threaded processing optimized for maximum asynchronous and synchronous throughput.
- New WindowProc method added to optimize filtering of messages related to defined resources, event handling, and SMAPI recognition.
- Libraries included for both .NET Framework v1.1 and v2.
- Libraries digitally signed for applications addressing Windows Vista certification requirements.
2. New defined constants added for speech API properties and events.
Speech APIs define many enumerated constants for property and event data. New Chant enumerated constants are available for these values to streamline code and enhance readability.
- ChantAudioReason: The reason for the SAPI 4 audio stop event.
- ChantBookmarkOption: A numeric bookmark option from a SAPI 5 input audio stream.
- ChantSPVFeature: Information about the features of the SAPI 5 phonemes and visemes.
- ChantSRGram: The vocabulary types supported by the recognizer.
- ChantSRInterface: The COM interfaces supported by a SAPI 4 recognizer.
- ChantSRSequencing: The recognition schemes supported by the recognizer.
- ChantTTSAge: The voice age.
- ChantTTSFeature: The features supported by a synthesizer.
- ChantTTSGender: The voice gender.
- ChantTTSHint: Information about the features of the SAPI 4 phonemes and visual events.
- ChantTTSInterface: The COM interfaces supported by a SAPI 4 synthesizer.
- ChantVisemeType: Represents a mouth position during speech synthesis with a SAPI 5 voice.
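To illustrate why named constants read better than raw numbers, here is a minimal Python sketch. It is not SpeechKit code: VoiceGender is a hypothetical stand-in for an enumeration such as ChantTTSGender, and its member names and numeric values are invented for illustration rather than taken from the SpeechKit reference.

```python
from enum import IntEnum


class VoiceGender(IntEnum):
    """Hypothetical stand-in for a gender enumeration; values are assumed."""
    NEUTRAL = 0
    FEMALE = 1
    MALE = 2


def describe_gender(raw_value):
    """Map a raw numeric property value to a readable name."""
    try:
        return VoiceGender(raw_value).name.lower()
    except ValueError:
        # The raw value is not one of the named constants.
        return "unknown"


if __name__ == "__main__":
    # A comparison against a named constant documents itself,
    # unlike a comparison against the bare number 1.
    print(describe_gender(1) == "female")
```

With named constants, a test such as `value == VoiceGender.FEMALE` states its intent directly, whereas `value == 1` forces the reader to look up the meaning.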
3. Updated documentation format for improved usability.
Documentation formatted to support user customization options:
- Unified Class Library Reference for all component library formats.
- Selectable programming language display for syntax and example code sections.
- Expandable section formatting.
- Page print option.
GO WIRELESS!
Liberate your users from their desktops with wireless headsets.
4. New ChantAudio class for unified audio management.
Managing audio is a challenge in most applications, especially in programming languages that lack access to Win32 APIs. The new ChantAudio class provides access to basic audio functions:
- ChantSR and ChantTTS classes use the common audio handling of the ChantAudio class instead of the various speech API implementations.
- Provides non-queued audio recording and playback capabilities directly, or queued requests via the ChantSR and ChantTTS classes.
- Detects arrival and removal of USB audio devices (wireless headsets).
- Enumerates audio devices, mixers, and mixer lines.
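Conceptually, device arrival and removal can be derived by comparing two enumerations of the available devices. The Python sketch below shows only that comparison. It does not use the ChantAudio class: the function name diff_devices and the device names are hypothetical, and how ChantAudio detects changes internally is not described in this article.

```python
def diff_devices(before, after):
    """Return (arrived, removed) device names between two snapshots."""
    before_set, after_set = set(before), set(after)
    arrived = sorted(after_set - before_set)   # present now, absent earlier
    removed = sorted(before_set - after_set)   # present earlier, absent now
    return arrived, removed


if __name__ == "__main__":
    # Hypothetical device lists taken before and after plugging in a headset.
    earlier = ["Speakers", "Microphone"]
    later = ["Speakers", "Microphone", "USB Headset"]
    arrived, removed = diff_devices(earlier, later)
    print("arrived:", arrived)
    print("removed:", removed)
```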
5. New ChantPlaybackResult options added to support a variety of playback requirements.
Playback results for buffer, file, and stream formats can specify whether or not audio header data is part of the result object. Audio header data may be required for subsequent playback. "New" indicates that the audio header is included.
- CPRBufferNew: Synthesized audio output is written to buffer. Buffer includes audio header.
- CPRStreamNew: Synthesized audio output is written to a stream. Audio header written to stream.
6. New ChantRecordingResult options added to support a variety of audio archival and playback requirements.
Recording results for buffer, file, and stream formats can specify whether or not audio header data is part of the audio result object. Audio header data may be required for subsequent playback. "New" indicates that the audio header is included.
- CRRBufferNew: Recorded audio or recognized speech as text is written to buffer. Every buffer includes audio header.
- CRRBufferUtterance: Recorded audio or recognized speech as text is written to buffer. No audio header included.
- CRRFileUtterance: Recorded audio or recognized speech as text is written to a file. No audio header included.
- CRRStreamNew: Recorded audio is written to a stream. Audio header written to stream for each recognition result.
- CRRStreamUtterance: Recorded audio is written to a stream. No audio header included.
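The header matters because raw samples alone do not record the sample rate, channel count, or sample size that a player needs. The Python sketch below wraps header-less audio in a header using the standard wave module. It rests on assumptions not stated in this article: a WAV (RIFF) container and 16-bit mono PCM at 22050 Hz. The function name add_wav_header is hypothetical and is not part of SpeechKit.

```python
import io
import wave


def add_wav_header(raw_pcm, sample_rate=22050, channels=1, sample_width=2):
    """Return WAV bytes: a header followed by the given raw PCM samples."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)       # assumed mono
        wav_file.setsampwidth(sample_width)   # assumed 2 bytes = 16-bit
        wav_file.setframerate(sample_rate)    # assumed 22050 Hz
        wav_file.writeframes(raw_pcm)
    return buffer.getvalue()


if __name__ == "__main__":
    silence = b"\x00\x00" * 100               # 100 frames of 16-bit silence
    wav_bytes = add_wav_header(silence)
    # The result is longer than the input by the size of the header.
    print(len(wav_bytes) - len(silence))
    # A reader can now recover the format from the data itself.
    with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
        print(reader.getframerate(), reader.getnframes())
```

A result without a header suits cases where the application already knows the format and appends or processes samples itself. A result with a header can be handed to a player as is.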
7. New file name token substitution for generating unique audio result file names for StartPlayback and StartRecording requests.
For example, myfile&year.wav would result in the file name myfile2007.wav. The following tokens are available:
- &year
- &month
- &dayofweek
- &day
- &hour
- &minute
- &second
- &milliseconds
- &random
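To show how such tokens might expand, here is a minimal Python sketch. It is not the SpeechKit implementation: the function name expand_tokens is hypothetical, and the zero-padding, the use of English day names for &dayofweek, and the range of &random are assumptions, since this article specifies only the token names and the &year example.

```python
import random
import re
from datetime import datetime

# "dayofweek" is listed before "day" so the longer token is matched first.
TOKEN_PATTERN = re.compile(
    r"&(year|month|dayofweek|day|hour|minute|second|milliseconds|random)"
)

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def expand_tokens(template, now=None, rng=None):
    """Replace &tokens in a file name template with date and time values."""
    now = now if now is not None else datetime.now()
    rng = rng if rng is not None else random.Random()
    fixed = {
        "year": f"{now.year:04d}",
        "month": f"{now.month:02d}",               # assumed zero-padded
        "dayofweek": DAY_NAMES[now.weekday()],     # assumed English day name
        "day": f"{now.day:02d}",
        "hour": f"{now.hour:02d}",
        "minute": f"{now.minute:02d}",
        "second": f"{now.second:02d}",
        "milliseconds": f"{now.microsecond // 1000:03d}",
    }

    def replace(match):
        token = match.group(1)
        if token == "random":
            # Assumed range; each occurrence draws its own number.
            return str(rng.randint(0, 99999))
        return fixed[token]

    return TOKEN_PATTERN.sub(replace, template)


if __name__ == "__main__":
    stamp = datetime(2007, 5, 7, 9, 30, 15, 123000)
    print(expand_tokens("myfile&year.wav", stamp))
    print(expand_tokens("take-&hour&minute&second-&milliseconds.wav", stamp))
```

Combining several tokens, for example time of day plus milliseconds, makes collisions between consecutive requests less likely than a date-only name.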
GO NATIVE!
Speak and listen with the latest recognizers and synthesizers.
8. New speech recognition API support for Nuance VoCon 3200.
Your SpeechKit applications can listen with the Nuance VoCon 3200 recognizer.
- Added Nuance VoCon 3200 API support for native access to the VoCon recognizer, and enhanced ChantSR vocabulary management to support command, grammar, and native grammars.
9. Extended ChantWord class properties with SAPI 5 and VoCon 3200 recognition event data.
Additional recognition event properties are available on the ChantWord class objects for SAPI 5 and VoCon 3200:
- AudioTimeOffset (SAPI 5): Starting offset of the element in 100-nanosecond units of time relative to the start of the phrase
- AudioSizeTime (SAPI 5): Length of the element in 100-nanosecond units of time
- AudioStreamOffset (SAPI 5): Starting offset of the element in bytes relative to the start of the phrase in the original input stream
- AudioSizeBytes (SAPI 5): Size of the element in bytes in the original input stream
- BeginTime (VoCon 3200): Start time of the word
- Confidence (VoCon 3200): The confidence value of this alternative
- EndTime (VoCon 3200): End time of the word
- ID (VoCon 3200): The ID of the word when using grammars
- RetainedStreamOffset (SAPI 5): Starting offset of the element in bytes relative to the start of the phrase in the retained audio stream
- RetainedSizeBytes (SAPI 5): Size of the element in bytes in the retained audio stream
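The SAPI 5 time properties above are expressed in 100-nanosecond units, so ten million units make one second. The Python sketch below converts such values to seconds. The function names are hypothetical, and plain integers stand in for the AudioTimeOffset and AudioSizeTime values read from a ChantWord object.

```python
TICKS_PER_SECOND = 10_000_000  # one tick is 100 nanoseconds


def ticks_to_seconds(ticks):
    """Convert a count of 100-nanosecond units to seconds."""
    return ticks / TICKS_PER_SECOND


def word_time_span(audio_time_offset, audio_size_time):
    """Return (start, end) of a word in seconds from the phrase start."""
    start = ticks_to_seconds(audio_time_offset)
    return start, start + ticks_to_seconds(audio_size_time)


if __name__ == "__main__":
    # A word that starts 0.5 s into the phrase and lasts 0.25 s.
    print(word_time_span(5_000_000, 2_500_000))
```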
10. New speech synthesis API support for Cepstral and Nuance RealSpeak Solo.
Your SpeechKit applications can talk with Cepstral and RealSpeak Solo voices without SAPI. Enjoy the full benefit of Cepstral and RealSpeak Solo voices in your applications.
- Added Cepstral Swift API support for native access to Cepstral voices.
- Added Nuance RealSpeak Solo API support for native access to RealSpeak Solo voices.
11. Extended ChantPlaybackStyle playback options with Cepstral and RealSpeak Solo features.
Control speech synthesis with new playback options:
- CPSSpellOut (Cepstral and RealSpeak Solo): Spell out each letter of the words instead of pronouncing them.
- CPSSpeakPhonemes (Cepstral): Treat input as phoneme strings to be spoken.
- CPSIsSSML (Cepstral): The text is parsed for SSML markup.
- CPSIsHTML (RealSpeak Solo): The text is parsed for HTML.
- CPSLineOut (RealSpeak Solo): The text is read line-by-line.
- CPSNoBlocking (Cepstral): If the port's utterance queue is full, return with an error rather than waiting for room to become available.
- CPSWordOut (RealSpeak Solo): The text is read word-by-word.
12. New SMAPI (ViaVoice) vocabulary error notifications supported via CCGMError event callback.
In cases where you create dynamic vocabularies, certain words may not be supported by the ViaVoice dictionary. Your application then receives a not-in-vocabulary error code. This new event enables your application to detect the error and respond appropriately.
GO MOBILE!
Release the full potential of your applications on mobile devices.
13. New SpeechKit WinCE Developer Edition for PC 2003 and Windows Mobile 5 devices.
Integrate application-ready components that enable mobile applications to speak and listen. The new WinCE Developer Edition includes:
- CDLL Component library for PC 2003 and Windows Mobile 5.
- .NET Framework (CF2) component library for Windows Mobile 5.
- Speech API support for:
  - Recognizers: Microsoft SAPI 5 compliant recognizers (e.g., Microsoft Voice Command) and Nuance VoCon 3200.
  - Synthesizers: Cepstral, Microsoft SAPI 5 compliant synthesizers, and Nuance RealSpeak Solo.