No matter how you slice it, a tomato is just a tomato. However, your end users (customers and clients) may be partial to their kind of tomato. Your applications need to accommodate and adjust to their manner of speaking when recognizing and synthesizing speech.
A lexicon is a collection of word pronunciations that a speech recognition engine (i.e., recognizer) uses to improve recognition accuracy and a speech synthesis engine (i.e., synthesizer) uses to enhance the quality of its pronunciation.
Lexicons play an important role in the accuracy of speech recognition. A speech recognition engine (i.e., recognizer) uses lexicons in the process of recognizing speech. Lexicons consist of the words that a recognizer understands and returns as recognized speech. Since it's impractical for a recognizer to maintain every possible word and context in its spoken language, you enhance the accuracy of speech recognition by extending its lexicon.
Lexicons play an important role in the quality of text-to-speech playback. A text-to-speech engine (i.e., synthesizer) uses lexicons to obtain pronunciation information associated with words to generate the appropriate speech sounds for the word. For example, with a lexicon you may ensure "record" is pronounced correctly when used as a noun and when used as a verb.
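To make the "record" example concrete, here is a minimal sketch of a lexicon as a lookup from a word and its part of speech to a pronunciation. The dictionary layout, the `pronounce` helper, and the IPA strings are illustrative assumptions, not a LexiconKit API or an engine's storage format.

```python
# Toy in-memory lexicon: (word, part of speech) -> IPA pronunciation.
# The structure and values are illustrative assumptions only.
LEXICON = {
    ("record", "noun"): "ˈrɛkərd",
    ("record", "verb"): "rɪˈkɔrd",
    ("tomato", None): "təˈmeɪtoʊ",
}


def pronounce(word, pos=None):
    """Return the pronunciation for word, preferring a part-of-speech match."""
    key = word.lower()
    if (key, pos) in LEXICON:
        return LEXICON[(key, pos)]
    # Fall back to an entry stored without a part of speech, if any.
    return LEXICON.get((key, None))
```

A synthesizer that knows the part of speech can then pick the matching entry, for example `pronounce("record", "verb")`, while a word missing from the lexicon returns `None` and falls back to the engine's default pronunciation rules.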
What is Lexicon Management?
Lexicon management enables you to:
- tailor pronunciations to specific end user preferences,
- extend recognizer and synthesizer lexicons, and
- create, delete, edit, import, and export lexicons as part of your deployed applications.
Application benefits include:
- improved speech recognition accuracy and
- enhanced speech synthesis clarity.
What is LexiconKit?
Chant LexiconKit handles the complexities of creating and editing lexicons for deployment with applications and generating pronunciations.
LexiconKit provides a simple way to create and edit lexicons with word pronunciations. Applications can generate pronunciations as part of their runtime operation to enable real-time customization and tailoring of speech recognition and speech synthesis environments.
It simplifies the process of managing word pronunciations for Cepstral Swift API, Microsoft SAPI 5, Microsoft Speech Platform, and Nuance Vocalizer API lexicon formats to use with your favorite speech recognizers and synthesizers.
LexiconKit includes C++, C++Builder, Delphi, Java, .NET Framework, and Silverlight class library formats to support your programming languages, along with sample projects for popular IDEs such as the latest Visual Studio from Microsoft and RAD Studio from Embarcadero.
The class libraries can be integrated with 32-bit and 64-bit applications.
Lexicon Management Architecture
LexiconKit provides a simple way to create and edit lexicon word pronunciations. Applications can generate pronunciations to enable real-time customization and tailoring of speech recognition and speech synthesis.
LexiconKit manages the resources and interacts directly with the applicable speech application program interface (API). The LexiconKit class supports the following speech APIs for lexicon management: Cepstral Swift API, Microsoft SAPI 5, Microsoft Speech Platform, and Nuance Vocalizer API.
LexiconKit encapsulates all of the technologies necessary to make the process of generating word pronunciations simple and efficient.
LexiconKit simplifies the process of generating pronunciations by handling the low-level activities directly with the speech recognition and synthesis engines.
Instantiate LexiconKit to generate the default word pronunciation within the application and destroy LexiconKit to release its resources when lexicon management is no longer needed.
The goal of lexicon management is to adjust to the end user's manner of speaking for enhanced speech recognition accuracy and speech synthesis quality. A LexiconKit application can:
- Create word pronunciations on demand; and
- Edit lexicon word pronunciations for ensuring maximum recognition accuracy and speech synthesis quality.
Chant LexiconKit handles the complexities of generating word pronunciations. This allows applications to enhance the quality of speech recognition and speech synthesis and offer administrative features for maintaining word pronunciations.
Recognizers and synthesizers have unique formats for word pronunciations, lexicon formats, and approaches for runtime inclusion. LexiconKit supports the following recognizers and synthesizers and their lexicon formats:
|Speech API|File Format|
|Cepstral Swift API|.txt|
|Microsoft Speech Platform|W3C .pls|
|Nuance Vocalizer|.dct (.dcb .dcc)|
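As an illustration of the W3C .pls format listed above, the following sketch builds a small Pronunciation Lexicon Specification (PLS) 1.0 document with Python's standard library. It does not use LexiconKit; the `build_pls` helper and the sample IPA strings are assumptions made for this example, while the element names and namespace come from the W3C PLS 1.0 specification.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C PLS 1.0 specification.
PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialize PLS elements without a prefix (as the default namespace).
ET.register_namespace("", PLS_NS)


def build_pls(entries, lang="en-US"):
    """Build a PLS document from {word: [ipa, ...]} and return it as a string.

    build_pls is a hypothetical helper for this sketch, not a library call.
    """
    root = ET.Element(
        f"{{{PLS_NS}}}lexicon",
        {"version": "1.0", "alphabet": "ipa", f"{{{XML_NS}}}lang": lang},
    )
    for word, pronunciations in entries.items():
        lexeme = ET.SubElement(root, f"{{{PLS_NS}}}lexeme")
        ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = word
        # A lexeme may carry several alternative pronunciations.
        for ipa in pronunciations:
            ET.SubElement(lexeme, f"{{{PLS_NS}}}phoneme").text = ipa
    return ET.tostring(root, encoding="unicode")


# Two ways of saying "tomato" (sample transcriptions, for illustration).
pls_xml = build_pls({"tomato": ["təˈmeɪtoʊ", "təˈmɑːtəʊ"]})
```

Writing `pls_xml` to a file with a `.pls` extension and UTF-8 encoding yields a lexicon with one `lexeme` containing a `grapheme` and two `phoneme` alternatives. Which alphabets and attributes a given engine accepts is engine-specific and should be checked against its documentation.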
Within Chant Developer Workbench, you can:
- Create and edit W3C lexicons (.pls);
- Create and edit Cepstral lexicon files (.txt);
- Create and edit Nuance Vocalizer user dictionary text files (.dct);
- Generate word pronunciation phonemes; and
- Edit word pronunciation phonemes.
You may explore the capabilities of Chant LexiconKit for 30 days. To continue to use the product after 30 days, you must purchase a license for the software or stop using the software and remove it from your system.
A valid purchased license gives you the right to construct executable applications that use the applicable class library and distribute it with executable applications without royalty obligations to Chant.
The Chant LexiconKit license is a single end-user license. Each developer who installs and uses LexiconKit to develop applications must have their own license.
LexiconKit class library names vary by platform: Windows 32-bit and 64-bit. This helps ensure the correct library is deployed with your application.
LexiconKit System Requirements
- Intel processor or equivalent,
- Microsoft Windows 10,
- 120 MB of hard drive space,
- CD-ROM drive,
- VGA or higher-resolution monitor,
- Microsoft SAPI 5 or Microsoft Speech Platform recognizer,
- Cepstral, Microsoft SAPI 5, Microsoft Speech Platform, or Nuance Vocalizer synthesizer,
- C++, C++Builder, Delphi, Java, or .NET Framework development environment, and
- close-talk microphone.