Generating, Editing, and Speaking Pronunciations with Azure Speech

Last reviewed: 7/16/2023

HowTo Article ID: H032310

The information in this article applies to:

  • Chant Developer Workbench 2023
  • LexiconKit 9

Working with phonemes to generate, edit, and speak pronunciations with Azure Speech

Applications can speak words more clearly when synthesizing with Microsoft Azure Speech by supplying tailored pronunciations.

Developing Apps for Azure Speech with LexiconKit

LexiconKit supports tailoring lexicons for more accurate speech recognition and precise word pronunciation when synthesizing.

Lexicons may be created and edited interactively in the Developer Workbench, which accesses Microsoft Azure Speech cloud resources.

Applications may use the MCSSynthesizer class to call the Microsoft Cognitive Services speech API, which provides access to Microsoft Azure Speech cloud resources.

Tailoring Pronunciations in the Developer Workbench

LexiconKit provides a comprehensive development and testing environment to generate, edit, and test phonemes with Azure Speech. Type and listen to changes on demand.

PLS Lexicon Editing: Generate, edit, and test word pronunciations with Azure Speech.

Easily select the speaking language and the phoneme alphabet to generate and edit word pronunciations in the lexicon document.

Review the details explained in PLS Lexicon Editing.
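
For orientation, a PLS lexicon document pairs each word (grapheme) with a pronunciation (phoneme) for a given language and phoneme alphabet. The Java sketch below assembles such a document following the W3C Pronunciation Lexicon Specification 1.0. The class PlsLexiconSketch, its methods, and the sample IPA transcription of "tomato" (təˈmeɪtoʊ) are illustrative assumptions and are not part of LexiconKit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of LexiconKit: renders a W3C PLS 1.0 lexicon as text.
class PlsLexiconSketch {

    // Escape the characters that are special in XML text and attribute values.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Build a lexicon document for one language and one phoneme alphabet.
    // Each map entry pairs a word (grapheme) with its pronunciation (phoneme string).
    static String buildLexicon(String language, String alphabet, Map<String, String> entries) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<lexicon version=\"1.0\"\n");
        xml.append("    xmlns=\"http://www.w3.org/2005/01/pronunciation-lexicon\"\n");
        xml.append("    alphabet=\"").append(escapeXml(alphabet)).append("\"\n");
        xml.append("    xml:lang=\"").append(escapeXml(language)).append("\">\n");
        for (Map.Entry<String, String> entry : entries.entrySet()) {
            xml.append("  <lexeme>\n");
            xml.append("    <grapheme>").append(escapeXml(entry.getKey())).append("</grapheme>\n");
            xml.append("    <phoneme>").append(escapeXml(entry.getValue())).append("</phoneme>\n");
            xml.append("  </lexeme>\n");
        }
        xml.append("</lexicon>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        Map<String, String> entries = new LinkedHashMap<>();
        // IPA for "tomato" (US English), written with Unicode escapes.
        entries.put("tomato", "t\u0259\u02C8me\u026Ato\u028A");
        System.out.print(buildLexicon("en-US", "ipa", entries));
    }
}
```

The language and alphabet attributes on the lexicon element correspond to the speaking language and phoneme alphabet selected in the editor; changing the alphabet changes how every phoneme string in the document is interpreted.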

Tailoring Pronunciations in Apps

LexiconKit includes the MCSSynthesizer class to enumerate Azure Speech voices and to generate and speak pronunciations.

To generate a pronunciation for a lexicon word, pass the word, the word type (i.e., part of speech), the language, and the alphabet to the synthesizer's GeneratePhonemes method. To speak a pronunciation, pass the phonemes, the language, and the alphabet to its SpeakPhonemes method.


// Instantiate LexiconKit
NLexiconKit _LexiconKit = new NLexiconKit();
if (_LexiconKit != null)
{
    // Set credentials
    _LexiconKit.SetCredentials("Credentials");
    // Create synthesizer
    NMCSSynthesizer _Synthesizer = _LexiconKit.CreateMCSSynthesizer("speechKey", "speechRegion");
    if (_Synthesizer != null)
    {
        string phonemes = _Synthesizer.GeneratePhonemes("tomato", "Noun", "en-US", "ipa");
        _Synthesizer.SpeakPhonemes(phonemes, "en-US", "ipa");
        _Synthesizer.Dispose();
    }
    _LexiconKit.Dispose();
}

// Instantiate LexiconKit object
CLexiconKit* _LexiconKit = new CLexiconKit();
if (_LexiconKit != NULL)
{
    // Set credentials
    _LexiconKit->SetCredentials(L"Credentials");
    // Create synthesizer
    CMCSSynthesizer* _Synthesizer = _LexiconKit->CreateMCSSynthesizer(L"speechKey", L"speechRegion");
    if (_Synthesizer != NULL)
    {
        wchar_t* phonemes = _Synthesizer->GeneratePhonemes(L"tomato", L"Noun", L"en-US", L"ipa");
        _Synthesizer->SpeakPhonemes(phonemes, L"en-US", L"ipa");
        delete _Synthesizer;
    }
    delete _LexiconKit;
}

// Instantiate LexiconKit object
CLexiconKit* _LexiconKit = new CLexiconKit();
if (_LexiconKit != NULL)
{
    // Set credentials
    _LexiconKit->SetCredentials("Credentials");
    // Create synthesizer
    CMCSSynthesizer* _Synthesizer = _LexiconKit->CreateMCSSynthesizer("speechKey", "speechRegion");
    if (_Synthesizer != NULL)
    {
        String phonemes = _Synthesizer->GeneratePhonemes("tomato", "Noun", "en-US", "ipa");
        _Synthesizer->SpeakPhonemes(phonemes, "en-US", "ipa");
        delete _Synthesizer;
    }
    delete _LexiconKit;
}

var 
    _LexiconKit: TLexiconKit;
    _Synthesizer: TMCSSynthesizer;
    phonemes: string;
begin
    // Instantiate LexiconKit object
    _LexiconKit := TLexiconKit.Create();
    if (_LexiconKit <> nil) then
    begin
        // Set credentials
        _LexiconKit.SetCredentials('Credentials');
        // Create synthesizer
        _Synthesizer := _LexiconKit.CreateMCSSynthesizer('speechKey', 'speechRegion');
        if (_Synthesizer <> nil) then
        begin
            phonemes := _Synthesizer.GeneratePhonemes('tomato', 'Noun', 'en-US', 'ipa');
            _Synthesizer.SpeakPhonemes(phonemes, 'en-US', 'ipa');
            _Synthesizer.Destroy();
        end;
        _LexiconKit.Destroy();
    end;
end;

// Instantiate LexiconKit object
JLexiconKit _LexiconKit = new JLexiconKit();
if (_LexiconKit != null)
{
    // Set credentials
    _LexiconKit.setCredentials("Credentials");
    JMCSSynthesizer _Synthesizer = _LexiconKit.createMCSSynthesizer("speechKey", "speechRegion");
    if (_Synthesizer != null)
    {
        String phonemes = _Synthesizer.generatePhonemes("tomato", "Noun", "en-US", "ipa");
        _Synthesizer.speakPhonemes(phonemes, "en-US", "ipa");
        _Synthesizer.dispose();
    }
    _LexiconKit.dispose();
}

Dim _LexiconKit As NLexiconKit
Dim WithEvents _Synthesizer As NMCSSynthesizer
Dim phonemes As String
' Instantiate LexiconKit
_LexiconKit = New NLexiconKit()
If (_LexiconKit IsNot Nothing) Then
    ' Set credentials
    _LexiconKit.SetCredentials("Credentials")
    _Synthesizer = _LexiconKit.CreateMCSSynthesizer("speechKey", "speechRegion")
    If (_Synthesizer IsNot Nothing) Then
        phonemes = _Synthesizer.GeneratePhonemes("tomato", "Noun", "en-US", "ipa")
        _Synthesizer.SpeakPhonemes(phonemes, "en-US", "ipa")
        _Synthesizer.Dispose()
    End If
    _LexiconKit.Dispose()
End If
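
When working with Azure Speech directly, a phoneme string is typically attached to a word through the SSML phoneme element. The Java sketch below builds such a request body as plain text, to show how the phonemes, language, and alphabet used in the samples above fit together. SsmlPhonemeSketch and its methods are hypothetical helpers, the voice name is only an example value, and nothing here is claimed to reflect how LexiconKit calls the service internally.

```java
// Hypothetical helper, not part of LexiconKit: builds SSML text with a phoneme element.
class SsmlPhonemeSketch {

    // Escape the characters that are special in XML text and attribute values.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Wrap a word in an SSML phoneme element carrying its pronunciation.
    static String phonemeElement(String word, String phonemes, String alphabet) {
        return "<phoneme alphabet=\"" + escapeXml(alphabet)
                + "\" ph=\"" + escapeXml(phonemes) + "\">"
                + escapeXml(word) + "</phoneme>";
    }

    // Wrap SSML content in the speak and voice elements for one language and voice.
    static String speakDocument(String language, String voiceName, String content) {
        return "<speak version=\"1.0\""
                + " xmlns=\"http://www.w3.org/2001/10/synthesis\""
                + " xml:lang=\"" + escapeXml(language) + "\">"
                + "<voice name=\"" + escapeXml(voiceName) + "\">"
                + content
                + "</voice></speak>";
    }

    public static void main(String[] args) {
        // IPA for "tomato" (US English), written with Unicode escapes.
        String phonemes = "t\u0259\u02C8me\u026Ato\u028A";
        String element = phonemeElement("tomato", phonemes, "ipa");
        // The voice name below is an example value; substitute any available voice.
        System.out.println(speakDocument("en-US", "en-US-JennyNeural", element));
    }
}
```

Escaping the phoneme string matters because it is placed in an XML attribute; an unescaped quote or ampersand would make the SSML malformed.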

Review the details explained in Generating, Editing, Speaking, and Persisting Pronunciations.