Last reviewed: 3/23/2024 10:56:42 AM

<speak>

A speak element is the required root element of a W3C SSML document. It defines the beginning and end of the text markup.

Unlike in W3C SSML, the speak element in Azure Speech has two possible children: backgroundaudio and voice.

<?xml version="1.0"?>
<speak version="1.0"
         xmlns="http://www.w3.org/2001/10/synthesis" 
         xmlns:mstts="https://www.w3.org/2001/mstts"
         xml:lang="en-US">
  ... the body ...
</speak>

Attributes

version

Specifies the version of the SSML specification used to interpret the document. The value must be "1.0".

xml:lang

Instructs the synthesizer to speak content in the indicated language.

xmlns

The URI of the SSML namespace: "http://www.w3.org/2001/10/synthesis".
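
As a minimal sketch of the three attributes above, the document below declares only version, xmlns, and xml:lang. It assumes that the xmlns:mstts declaration can be left out when no mstts-prefixed element is used. The voice name and the spoken sentence are illustrative placeholders, not values taken from this reference.

```xml
<?xml version="1.0"?>
<!-- version, xmlns, and xml:lang are the attributes described above -->
<speak version="1.0"
       xmlns="http://www.w3.org/2001/10/synthesis"
       xml:lang="en-US">
  <!-- Assumed voice name; substitute one available to your Speech resource -->
  <voice name="en-US-JennyNeural">
    Hello, this text is spoken in US English.
  </voice>
</speak>
```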

Children

<backgroundaudio> and <voice>.
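
The following sketch shows both children inside one speak element. It assumes that the background audio element is written with the mstts prefix (mstts:backgroundaudio) and accepts src and volume attributes. The audio URL and the voice name are hypothetical placeholders.

```xml
<?xml version="1.0"?>
<speak version="1.0"
       xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:mstts="https://www.w3.org/2001/mstts"
       xml:lang="en-US">
  <!-- Hypothetical audio file; assumed to be placed before the voice element -->
  <mstts:backgroundaudio src="https://example.com/background.wav" volume="0.5"/>
  <!-- Assumed voice name -->
  <voice name="en-US-JennyNeural">
    This sentence is spoken over the background audio.
  </voice>
</speak>
```

The mstts prefix resolves only because the root element declares xmlns:mstts, so that declaration should stay on speak whenever an mstts-prefixed element appears in the body.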

Parents

None.

Source: Microsoft Azure Speech Synthesis Markup Language (SSML)