How do I use Azure Speech in my app?
Last reviewed: 9/12/2023
HowTo Article ID: H032309
The information in this article applies to:
- Chant Developer Workbench 2023
- SpeechKit 12
Developing and Testing Apps that Speak and Listen with Azure Speech
Applications can leverage the speech recognition and speech synthesis services of Microsoft's Speech SDK 1.29 through the SpeechKit MCSRecognizer and MCSSynthesizer classes.
Developing Apps for Azure Speech with SpeechKit
SpeechKit supports many speech SDKs by providing API-specific recognizer and synthesizer classes. Apps instantiate the API-specific class to access resources for that API.
The MCSRecognizer and MCSSynthesizer classes access the Microsoft Cognitive Services speech API, which provides Microsoft Azure Speech cloud resources.
Synthesizing Speech with Azure Speech Voices
Synthesizing with MCSSynthesizer is identical to other SpeechKit synthesizer classes: instantiate the API class, optionally enumerate and select a voice, and speak. Playback is real-time to an output device, streamed, or written to a file. UI events for real-time playback are synchronized. The Synthesis samples for C++Builder, C++, Delphi, Java, and .NET Windows have been updated to illustrate MCSSynthesizer usage.
private JSpeechKit _SpeechKit = null;
private JMCSSynthesizer _Synthesizer = null;
// Create SpeechKit object
_SpeechKit = new JSpeechKit();
// Set credentials
_SpeechKit.setCredentials("Credentials");
_Synthesizer = _SpeechKit.createMCSSynthesizer("speechKey", "speechRegion");
if (_Synthesizer != null)
{
// Set the callback object
_Synthesizer.setChantSpeechKitEvents(this);
// For Azure Speech, set Activity for callbacks on UI thread
_Synthesizer.setActivity(this);
}
...
if (_Synthesizer != null)
{
_Synthesizer.speak("Hello world");
}
NSpeechKit _SpeechKit = null;
NMCSSynthesizer _Synthesizer = null;
// Instantiate SpeechKit
_SpeechKit = new NSpeechKit();
if (_SpeechKit != null)
{
// Set credentials
_SpeechKit.SetCredentials("Credentials");
_Synthesizer = _SpeechKit.CreateMCSSynthesizer("speechKey", "speechRegion");
}
...
if (_Synthesizer != null)
{
_Synthesizer.Speak("Hello world");
}
CSpeechKit* _SpeechKit;
CMCSSynthesizer* _Synthesizer;
_SpeechKit = new CSpeechKit();
if (_SpeechKit != NULL)
{
// Set credentials
_SpeechKit->SetCredentials(L"Credentials");
// Create synthesizer
_Synthesizer = _SpeechKit->CreateMCSSynthesizer(L"speechKey", L"speechRegion");
}
...
if (_Synthesizer != NULL)
{
_Synthesizer->Speak(L"Hello world");
}
CSpeechKit* _SpeechKit;
CMCSSynthesizer* _Synthesizer;
_SpeechKit = new CSpeechKit();
if (_SpeechKit != NULL)
{
// Set credentials
_SpeechKit->SetCredentials("Credentials");
// Create synthesizer
_Synthesizer = _SpeechKit->CreateMCSSynthesizer("speechKey", "speechRegion");
}
...
if (_Synthesizer != NULL)
{
_Synthesizer->Speak("Hello world");
}
var
_SpeechKit: TSpeechKit;
_Synthesizer: TMCSSynthesizer;
begin
// Instantiate SpeechKit object
_SpeechKit := TSpeechKit.Create();
if (_SpeechKit <> nil) then
begin
// Set credentials
_SpeechKit.SetCredentials('Credentials');
// Create synthesizer
_Synthesizer := _SpeechKit.CreateMCSSynthesizer('speechKey', 'speechRegion');
end;
end;
begin
if (_Synthesizer <> nil) then
begin
_Synthesizer.Speak('Hello world');
end;
end;
JSpeechKit _SpeechKit = null;
JMCSSynthesizer _Synthesizer = null;
// Create SpeechKit object
_SpeechKit = new JSpeechKit();
// Set credentials
_SpeechKit.setCredentials("Credentials");
// Create Synthesizer object
_Synthesizer = _SpeechKit.createMCSSynthesizer("speechKey", "speechRegion");
if (_Synthesizer != null)
{
_Synthesizer.speak("Hello world");
}
Dim _SpeechKit As NSpeechKit
Dim WithEvents _Synthesizer As NMCSSynthesizer
Private Sub Window_Loaded(ByVal sender As System.Object, ByVal e As System.Windows.RoutedEventArgs) Handles MyBase.Loaded
' Instantiate SpeechKit
_SpeechKit = New NSpeechKit()
If (_SpeechKit IsNot Nothing) Then
' Set credentials
_SpeechKit.SetCredentials("Credentials")
_Synthesizer = _SpeechKit.CreateMCSSynthesizer("speechKey", "speechRegion")
End If
End Sub
...
If (_Synthesizer IsNot Nothing) Then
_Synthesizer.Speak("Hello World")
End If
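The samples above register an events callback object so that playback events arrive synchronized with the live audio. A common use of such events is highlighting the word currently being spoken. The sketch below is a minimal, self-contained illustration of that lookup in Java; the offset-based event shape is an assumption for illustration, not the actual SpeechKit event signature:

```java
// Given the text being spoken and a character offset reported by a
// word-boundary event, return the word starting at that offset.
// The offset-based event shape is an illustrative assumption.
class WordHighlighter {
    private final String text;

    WordHighlighter(String text) {
        this.text = text;
    }

    // Returns the word beginning at offset, or "" if out of range.
    String wordAt(int offset) {
        if (offset < 0 || offset >= text.length()) {
            return "";
        }
        int end = offset;
        while (end < text.length() && !Character.isWhitespace(text.charAt(end))) {
            end++;
        }
        return text.substring(offset, end);
    }
}
```

In a real app, the UI would select and scroll to the returned word as each word-boundary event fires during playback.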
Recognizing Speech with Azure Speech
Recognizing with MCSRecognizer is identical to other SpeechKit recognizer classes: instantiate the API class, optionally select a language, and recognize. Recognition is real-time via a microphone, an audio stream, or a file. The Dictation samples for C++Builder, C++, Delphi, Java, and .NET Windows have been updated to illustrate MCSRecognizer usage.
private JSpeechKit _SpeechKit = null;
private JMCSRecognizer _Recognizer = null;
// Create SpeechKit object
_SpeechKit = new JSpeechKit();
// Set credentials
_SpeechKit.setCredentials("Credentials");
// Create Recognizer object
_Recognizer = _SpeechKit.createMCSRecognizer("speechKey", "speechRegion");
if (_Recognizer != null)
{
// Set the callback object
_Recognizer.setChantSpeechKitEvents(this);
// For Azure Speech, set Activity for callbacks on UI thread
_Recognizer.setActivity(this);
}
...
if (_Recognizer != null)
{
_Recognizer.startRecognition();
}
...
if (_Recognizer != null)
{
_Recognizer.stopRecognition();
}
NSpeechKit _SpeechKit = null;
NMCSRecognizer _Recognizer = null;
// Instantiate SpeechKit
_SpeechKit = new NSpeechKit();
if (_SpeechKit != null)
{
// Set credentials
_SpeechKit.SetCredentials("Credentials");
_Recognizer = _SpeechKit.CreateMCSRecognizer("speechKey", "speechRegion");
if (_Recognizer != null)
{
// Setup handler for recognition results
_Recognizer.RecognitionDictation += this.Recognizer_RecognitionDictation;
}
}
...
if (_Recognizer != null)
{
_Recognizer.StartRecognition();
}
...
if (_Recognizer != null)
{
_Recognizer.StopRecognition();
}
CSpeechKit* _SpeechKit;
CMCSRecognizer* _Recognizer;
_SpeechKit = new CSpeechKit();
if (_SpeechKit != NULL)
{
// Set credentials
_SpeechKit->SetCredentials(L"Credentials");
// Create recognizer
_Recognizer = _SpeechKit->CreateMCSRecognizer(L"speechKey", L"speechRegion");
if (_Recognizer != NULL)
{
// Register Event Handlers
_Recognizer->SetRecognitionDictation(RecognitionDictation);
}
}
...
if (_Recognizer != NULL)
{
_Recognizer->StartRecognition();
}
...
if (_Recognizer != NULL)
{
_Recognizer->StopRecognition();
}
CSpeechKit* _SpeechKit;
CMCSRecognizer* _Recognizer;
_SpeechKit = new CSpeechKit();
if (_SpeechKit != NULL)
{
// Set credentials
_SpeechKit->SetCredentials("Credentials");
// Create recognizer
_Recognizer = _SpeechKit->CreateMCSRecognizer("speechKey", "speechRegion");
if (_Recognizer != NULL)
{
// Register Event Handlers
_Recognizer->SetRecognitionDictation(RecognitionDictation);
}
}
...
if (_Recognizer != NULL)
{
_Recognizer->StartRecognition();
}
...
if (_Recognizer != NULL)
{
_Recognizer->StopRecognition();
}
var
_SpeechKit: TSpeechKit;
_Recognizer: TMCSRecognizer;
begin
// Instantiate SpeechKit object
_SpeechKit := TSpeechKit.Create();
if (_SpeechKit <> nil) then
begin
// Set credentials
_SpeechKit.SetCredentials('Credentials');
// Create recognizer
_Recognizer := _SpeechKit.CreateMCSRecognizer('speechKey', 'speechRegion');
if (_Recognizer <> nil) then
begin
// Register Event Handlers
_Recognizer.RecognitionDictation := RecognitionDictation;
end;
end;
end;
begin
if (_Recognizer <> nil) then
begin
_Recognizer.StartRecognition();
end;
end;
begin
if (_Recognizer <> nil) then
begin
_Recognizer.StopRecognition();
end;
end;
JSpeechKit _SpeechKit = null;
JMCSRecognizer _Recognizer = null;
// Create SpeechKit object
_SpeechKit = new JSpeechKit();
// Set credentials
_SpeechKit.setCredentials("Credentials");
// Create Recognizer object
_Recognizer = _SpeechKit.createMCSRecognizer("speechKey", "speechRegion");
if (_Recognizer != null)
{
// Set the callback object
_Recognizer.setChantSpeechKitEvents(this);
// Register for callbacks
_Recognizer.registerCallback(ChantSpeechKitCallback.CCSRRecognitionDictation);
}
if (_Recognizer != null)
{
_Recognizer.startRecognition();
}
if (_Recognizer != null)
{
_Recognizer.stopRecognition();
}
Dim _SpeechKit As NSpeechKit
Dim WithEvents _Recognizer As NMCSRecognizer
Private Sub Window_Loaded(ByVal sender As System.Object, ByVal e As System.Windows.RoutedEventArgs) Handles MyBase.Loaded
' Instantiate SpeechKit
_SpeechKit = New NSpeechKit()
If (_SpeechKit IsNot Nothing) Then
' Set credentials
_SpeechKit.SetCredentials("Credentials")
_Recognizer = _SpeechKit.CreateMCSRecognizer("speechKey", "speechRegion")
End If
End Sub
...
If (_Recognizer IsNot Nothing) Then
_Recognizer.StartRecognition()
End If
...
If (_Recognizer IsNot Nothing) Then
_Recognizer.StopRecognition()
End If
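The recognizer samples above register a RecognitionDictation handler but do not show its body. As a self-contained sketch of the pattern in Java, the listener interface and method names below are illustrative stand-ins, not the actual SpeechKit callback types; a handler typically appends each recognized phrase to a running transcript:

```java
// Illustrative stand-in for a SpeechKit dictation callback; the
// interface and method names are hypothetical, not the real API.
interface DictationListener {
    void onDictation(String recognizedText);
}

// Accumulates recognized phrases into a running transcript.
class TranscriptBuilder implements DictationListener {
    private final StringBuilder transcript = new StringBuilder();

    @Override
    public void onDictation(String recognizedText) {
        if (transcript.length() > 0) {
            transcript.append(' ');
        }
        transcript.append(recognizedText);
    }

    public String getTranscript() {
        return transcript.toString();
    }
}
```

Between StartRecognition and StopRecognition, the recognizer invokes the handler once per recognition result; the handler should do minimal work and return quickly so recognition is not delayed.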
SpeechKit App Integration
Explore these details specific to your application development environment:
- C++Builder Android Applications
- Delphi Android Applications
- Android Java Applications
- C++Builder VCL and FireMonkey Applications
- Delphi VCL and FireMonkey Applications
- Java Applications
- Microsoft Visual C++ Applications
- .NET Windows Forms Applications
- .NET WPF Applications
Testing Azure Speech in Chant Developer Workbench
Another way to see Microsoft cloud speech services in action is to access and test them via the Developer Workbench.
Cloud Access Setup
To access the Azure cloud, paste your SpeechKey and SpeechRegion values in the View->Options->Environment->Credentials->Microsoft Cognitive Services fields and press the Save toolbar button.
Testing Azure Speech Synthesis
Open a Speech Synthesizers tab (View->Speech Synthesizers). The Speech API list includes two new entries: Microsoft Cognitive Services (C++) Speech Synthesis and Microsoft Cognitive Services (.NET) Speech Synthesis. The C++ API is the SpeechKit library used by C++Builder, C++, Delphi, Java, and .NET Windows apps. The .NET API is the SpeechKit library used by .NET apps. This approach is also used for SAPI, Microsoft Speech Platform, and Windows Media APIs.
Select Microsoft Cognitive Services (C++) Speech Synthesis, select a voice, enter text, and press the Start button. Observe the events in the Events window. Notice they are synchronized with the live playback.
Select Microsoft Cognitive Services (.NET) Speech Synthesis, select a voice, enter text, and press the Start button. Observe the events in the Events window. Notice they are synchronized with the live playback.
Testing Azure Speech Recognition
Open a Speech Recognizers tab (View->Speech Recognizers). The Speech API list includes two new entries: Microsoft Cognitive Services (C++) Speech Recognition and Microsoft Cognitive Services (.NET) Speech Recognition. The C++ API is the SpeechKit library used by C++Builder, C++, Delphi, Java, and .NET Windows apps. The .NET API is the SpeechKit library used by .NET apps. This approach is also used for SAPI, Microsoft Speech Platform, and Windows Media APIs.
Select Microsoft Cognitive Services (C++) Speech Recognition, optionally select a language, optionally select the text dictation vocabulary, and press the Start button. Speak into the microphone. Press the Stop button when finished. When a dictation vocabulary is enabled, spoken punctuation is interpreted. Observe the events in the Events window and output in the Output window.
Select Microsoft Cognitive Services (.NET) Speech Recognition, optionally select a language, optionally select the text dictation vocabulary, and press the Start button. Speak into the microphone. Press the Stop button when finished. When a dictation vocabulary is enabled, spoken punctuation is interpreted. Observe the events in the Events window and output in the Output window.