No matter how you slice it, a tomato is just a tomato. However, your end users (customers and clients) may be partial to their kind of tomato. Your applications need to accommodate and adjust to their manner of speaking when recognizing and synthesizing.
A lexicon is a collection of word pronunciations that a speech recognition engine (i.e., recognizer) uses to improve recognition accuracy and a speech synthesis engine (i.e., synthesizer) uses to enhance the quality of its pronunciation.
Lexicons play an important role in the accuracy of speech recognition. A recognizer uses lexicons in the process of recognizing speech: a lexicon consists of the words that the recognizer understands and returns as recognized speech. Because it is impractical for a recognizer to maintain every possible word and context in its spoken language, you improve recognition accuracy by extending its lexicon.
Lexicons play an important role in the quality of text-to-speech playback. A synthesizer uses lexicons to obtain the pronunciation information associated with a word so that it can generate the appropriate speech sounds. For example, with a lexicon you can ensure that "record" is pronounced correctly both as a noun and as a verb.
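As a rough illustration of the "record" example, the sketch below models a lexicon as a lookup from a spelling and an optional part of speech to a pronunciation. It is a conceptual sketch only: the `LEXICON` table, the `lookup` function, and the IPA strings are illustrative assumptions, not the LexiconKit API or any engine's internal format.

```python
from typing import Optional

# Hypothetical in-memory lexicon: (spelling, part of speech) -> IPA string.
# A part of speech of None marks a default pronunciation for the spelling.
LEXICON = {
    ("record", "noun"): "ˈrɛkərd",
    ("record", "verb"): "rɪˈkɔːrd",
    ("tomato", None): "təˈmeɪtoʊ",
}


def lookup(word: str, pos: Optional[str] = None) -> Optional[str]:
    """Return the pronunciation for word, preferring a part-of-speech match.

    Falls back to the default entry for the spelling, and returns None
    when the lexicon has no entry at all.
    """
    key = word.lower()
    if (key, pos) in LEXICON:
        return LEXICON[(key, pos)]
    return LEXICON.get((key, None))
```

With this table, `lookup("record", "noun")` and `lookup("record", "verb")` return different pronunciations for the same spelling, while a word without an entry returns `None`, the case in which an engine would fall back to its built-in pronunciation rules.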
What is Lexicon Management?
Lexicon management enables you to:
- tailor pronunciations to specific end user preferences,
- extend recognizer and synthesizer lexicons to deploy with your application, and
- create, edit, and speak pronunciations as part of your deployed applications.
Application benefits include:
- improved speech recognition accuracy and
- enhanced speech synthesis clarity.
What is LexiconKit?
Chant LexiconKit handles the complexities of creating and editing lexicons for deployment with applications and generating and speaking pronunciations.
LexiconKit gives you a simple way to create, edit, and speak word pronunciations in lexicons. Applications can generate and speak pronunciations as part of their runtime operation to enable real-time customization and tailoring of speech recognition and speech synthesis environments.
LexiconKit includes C++, C++Builder, Delphi, Java, and .NET Framework class library formats to support your programming language of choice, plus sample projects for popular IDEs, such as the latest Visual Studio from Microsoft, RAD Studio from Embarcadero, and the Java IDEs Eclipse, IntelliJ, JDeveloper, and NetBeans.
The class libraries can be integrated with 32-bit and 64-bit applications for Windows platforms.
Lexicon Management Architecture
LexiconKit provides a simple way to create and edit lexicon word pronunciations. Applications can generate pronunciations to enable real-time customization and tailoring of speech recognition and speech synthesis.
LexiconKit manages the resources and interacts directly with the applicable speech application program interface (API). The LexiconKit class supports the following speech APIs for lexicon management:
- Acapela TTS,
- Cepstral Swift,
- Microsoft SAPI 5,
- Microsoft Speech Platform, and
- Microsoft WindowsMedia.
LexiconKit encapsulates the technologies needed to generate and speak word pronunciations. It handles the low-level interaction with the speech recognition and synthesis engines directly, which keeps the process simple and efficient for your application.
Instantiate LexiconKit to generate and speak the default word pronunciation within the application, and destroy LexiconKit to release its resources when lexicon management is no longer needed.
The goal of lexicon management is to adjust to the end user's manner of speaking for enhanced speech recognition accuracy and speech synthesis quality. A LexiconKit application can:
- Create word pronunciations on demand;
- Edit lexicon word pronunciations to maximize recognition accuracy and speech synthesis quality; and
- Speak lexicon word pronunciations to fine-tune definitions.
Chant LexiconKit handles the complexities of generating word pronunciations. This allows applications to enhance the quality of speech recognition and speech synthesis and offer administrative features for maintaining word pronunciations.
Recognizers and synthesizers have unique formats for word pronunciations, lexicon formats, and approaches for runtime inclusion. LexiconKit supports the following recognizers and synthesizers and their lexicon formats:
| Speech API | Alphabets | File Format |
| Acapela TTS | ipa, acatts | .dic |
| Cepstral Swift API | swift | .txt |
| Microsoft (SAPI 5, Speech Platform, WindowsMedia) | ipa, sapi, ups | W3C .pls |
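To make the W3C .pls row more concrete, the sketch below builds a minimal Pronunciation Lexicon Specification (PLS) 1.0 document with Python's standard library. The element and attribute names come from the W3C specification; the `build_pls` helper and the example IPA strings are illustrative assumptions and are not part of LexiconKit.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_pls(entries, alphabet="ipa", lang="en-US"):
    """Return a W3C PLS 1.0 document as a string.

    entries maps each grapheme (spelling) to a list of phoneme strings
    written in the given alphabet.
    """
    # Serialize the PLS namespace as the default namespace (no prefix).
    ET.register_namespace("", PLS_NS)
    lexicon = ET.Element(
        f"{{{PLS_NS}}}lexicon",
        {
            "version": "1.0",
            "alphabet": alphabet,
            f"{{{XML_NS}}}lang": lang,
        },
    )
    for grapheme, phonemes in entries.items():
        lexeme = ET.SubElement(lexicon, f"{{{PLS_NS}}}lexeme")
        ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = grapheme
        # One <phoneme> element per alternative pronunciation.
        for phoneme in phonemes:
            ET.SubElement(lexeme, f"{{{PLS_NS}}}phoneme").text = phoneme
    return ET.tostring(lexicon, encoding="unicode")


if __name__ == "__main__":
    # Two ways of saying "tomato" in one lexeme.
    print(build_pls({"tomato": ["təˈmeɪtoʊ", "təˈmɑːtəʊ"]}))
```

Listing both pronunciations in one lexeme is how PLS expresses alternatives for a single spelling. How a particular recognizer or synthesizer treats the alternatives depends on the engine, so check the engine's documentation before relying on their order.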
Within Chant Developer Workbench, you can:
- Create and edit W3C lexicons (.pls);
- Create and edit Cepstral lexicon files (.txt);
- Generate word pronunciation phonemes;
- Edit word pronunciation phonemes; and
- Speak word pronunciation phonemes.
You may explore the capabilities of Chant LexiconKit for 30 days. To continue to use the product after 30 days, you must purchase a license for the software or stop using the software and remove it from your system.
A valid purchased license gives you the right to construct executable applications that use the applicable class library and distribute it with executable applications without royalty obligations to Chant.
The Chant LexiconKit license is a single end-user license. Each developer who installs and uses LexiconKit to develop applications must have their own license.
LexiconKit class library names vary by platform: Windows 32-bit and 64-bit. This helps ensure the correct library is deployed with your application.
Chant LexiconKit is licensed separately or as part of Chant Developer Workbench. You may purchase a license for Chant LexiconKit online at the Chant store or through your preferred software reseller.
LexiconKit System Requirements
- Intel processor or equivalent,
- Microsoft Windows 10 or 11,
- 120 MB of hard drive space,
- CD-ROM drive,
- VGA or higher-resolution monitor,
- Microsoft SAPI 5, Microsoft Speech Platform, or Microsoft WindowsMedia recognizer,
- Acapela, Cepstral, Microsoft SAPI 5, Microsoft Speech Platform, or Microsoft WindowsMedia synthesizer,
- C++, C++Builder, Delphi, Java (JDK 1.8, 11, 13, 14, 15, 16, 17, 18), or .NET (.NET Framework 4.5+, .NET Core 3.1, .NET 5.0, .NET 6.0) Windows development environment, and
- close-talk microphone.