Using SSML for TTS

This page explains how to use Speech Synthesis Markup Language (SSML) with the text-to-speech option of Telesign’s Voice API.

What is SSML?

SSML is a web standard for the generation of synthetic speech. Use SSML when you want fine control over how your message is converted to speech, rather than relying on the defaults applied to a plain text message.

Reserved Characters

The following characters are reserved in SSML, so use the associated escape code when including them in the content of your message.

Character Escape Code
" &quot;
& &amp;
' &apos;
< &lt;
> &gt;
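
When message text comes from user input or a database, escaping by hand is easy to get wrong. The following is a minimal Python sketch that applies the escape codes listed above using the standard library's `xml.sax.saxutils.escape`. The helper name `escape_ssml_text` is hypothetical and not part of any Telesign SDK.

```python
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; the two quote characters are
# passed as extra entities so that all five reserved characters are covered.
_QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def escape_ssml_text(text: str) -> str:
    """Replace the five reserved characters with their escape codes."""
    return escape(text, _QUOTE_ENTITIES)


print(escape_ssml_text("Tom & Jerry's <show>"))
# prints: Tom &amp; Jerry&apos;s &lt;show&gt;
```

This escapes every quotation mark, which is stricter than the quotation mark rules require but always safe.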

Using Quotation Marks

There are a few special rules related to the " and ' characters above:

Double Quotation Marks

  • Must always be escaped when in an attribute value delimited by double quotes.
  • Do not need to be escaped when in textual context in the message. For example: <speak>He said, "Do. Or do not. There is no try."</speak>
  • Do not need to be escaped when in an attribute value delimited by single quotes.

Single Quotation Marks

  • Must be escaped when used as an apostrophe.
  • Do not need to be escaped when in textual context in the message.
  • Do not need to be escaped when in an attribute value delimited by double quotes.
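
The double quotation mark rules can be checked locally with an XML parser before a message is sent. The sketch below uses Python's `xml.etree.ElementTree` and only tests XML well-formedness; it does not confirm that Telesign accepts the attribute values, which are deliberately artificial here. The helper name `is_well_formed` is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(ssml: str) -> bool:
    """Return True if the string parses as XML, False otherwise."""
    try:
        ET.fromstring(ssml)
        return True
    except ET.ParseError:
        return False


# Double quotes in message text: no escaping needed.
in_text = '<speak>He said, "Do. Or do not. There is no try."</speak>'

# Double quote inside a double-quoted attribute value: must be escaped.
unescaped_attr = '<speak><prosody volume="lo"ud">Do.</prosody></speak>'
escaped_attr = '<speak><prosody volume="lo&quot;ud">Do.</prosody></speak>'

# Double quote inside a single-quoted attribute value: no escaping needed.
single_delimited = "<speak><prosody volume='lo\"ud'>Do.</prosody></speak>"

for sample in (in_text, unescaped_attr, escaped_attr, single_delimited):
    print(is_well_formed(sample), sample)
```

Only the `unescaped_attr` sample fails to parse.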

Supported Tags

Telesign’s implementation of SSML supports the following tags.

<speak>

Enclose your entire message within these tags.

Example
SSML
<speak>Do. Or do not. There is no try.</speak>
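
If the message is assembled in code, an XML library can add the enclosing tags and escape `&`, `<`, and `>` in one step. This is a minimal sketch using Python's `xml.etree.ElementTree`; the function name `speak` is a hypothetical helper chosen for illustration.

```python
import xml.etree.ElementTree as ET


def speak(message: str) -> str:
    """Wrap plain text in <speak> tags, escaping &, < and > along the way."""
    root = ET.Element("speak")
    root.text = message
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


print(speak("Do. Or do not. There is no try."))
# prints: <speak>Do. Or do not. There is no try.</speak>
```

Because the input is treated as plain text, any tags inside `message` are escaped rather than interpreted.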

<break>

Add a pause in your message.

Attributes

Name Values Meaning
strength
none No pause. Use this to remove a normally occurring pause, such as after a period.
x-weak No pause. Same effect as none.
weak Pause of the same duration as one after a comma.
medium Pause of the same duration as one after a comma. Same effect as weak.
strong Pause of the same duration as one after a sentence.
x-strong Pause of the same duration as one after a paragraph.
time
{n}s (max = 10s) The duration of the pause, in seconds.
{n}ms (max = 10000ms) The duration of the pause, in milliseconds.

Example
SSML
<speak>Do<break strength="strong"/> Or do not<break time="1s"/> There is no try.</speak>

If you do not include any attributes, the effect varies depending on whether the tag is next to punctuation:

  • Next to comma: Has the same effect as including the attribute strength="strong".
  • Next to period: Has the same effect as including the attribute strength="x-strong".
  • Not next to punctuation: Has the same effect as including the attribute strength="medium".

Example
SSML
<speak>Do.<break/> Or do not.<break/> There is no try.</speak>
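
The attribute values and limits above can be enforced before a message is sent. The following Python sketch builds a `<break>` tag and rejects values outside the documented ranges. The helper `break_tag` is hypothetical. Rejecting negative durations and refusing to combine `strength` with `time` are conservative assumptions, since this page does not say how either case is handled.

```python
from typing import Optional

STRENGTHS = {"none", "x-weak", "weak", "medium", "strong", "x-strong"}
MAX_PAUSE_MS = 10_000  # documented maximum: 10s, or 10000ms


def break_tag(strength: Optional[str] = None, time_ms: Optional[int] = None) -> str:
    """Build a <break> tag with at most one of strength or time."""
    if strength is not None and time_ms is not None:
        # Assumption: combining both is not described, so refuse it.
        raise ValueError("pass either strength or time_ms, not both")
    if strength is not None:
        if strength not in STRENGTHS:
            raise ValueError(f"unsupported strength: {strength!r}")
        return f'<break strength="{strength}"/>'
    if time_ms is not None:
        if not 0 <= time_ms <= MAX_PAUSE_MS:
            raise ValueError(f"time must be between 0 and {MAX_PAUSE_MS} ms")
        return f'<break time="{time_ms}ms"/>'
    # No attributes: the effect depends on adjacent punctuation.
    return "<break/>"


ssml = (
    "<speak>Do" + break_tag(strength="strong")
    + " Or do not" + break_tag(time_ms=1000)
    + " There is no try.</speak>"
)
print(ssml)
```

The output matches the first example in this section, except that the one-second pause is written as `time="1000ms"`.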

<prosody>

Control the volume, speaking rate, and pitch of the voice. At least one attribute must be included. <prosody> tags can be nested within other <prosody> tags.

Attributes

Name Values Meaning
volume
default Resets to the default volume level for the selected voice.
silent, x-soft, soft, medium, loud, x-loud Sets the volume to a predefined value for the selected voice.
+{n}dB, -{n}dB Changes the volume from the current level to higher (+) or lower (-) by a measurement in decibels. Example: +6dB approximately doubles the current volume. Example: -6dB approximately halves the current volume.
rate
x-slow, slow, medium, fast, x-fast Sets the speaking rate to a predefined value for the selected voice.
{n}% (min = 20%, max = 200%) Sets the speaking rate to this percentage of the current rate. Example: 50% halves the current rate. Example: 200% doubles the current rate.
pitch
default Resets to the default pitch for the selected voice.
x-low, low, medium, high, x-high Sets the pitch to a predefined value for the selected voice.
+{n}%, -{n}% Increases (+) or decreases (-) the pitch by this percentage.

Example
SSML
<speak><prosody volume="loud" rate="slow">Do. </prosody><prosody volume="-6dB" rate="150%">Or do <prosody pitch="low">not.</prosody> </prosody>There is no try.</speak>
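
Nested `<prosody>` tags are easier to keep balanced when generated by a small helper. The Python sketch below enforces the "at least one attribute" rule and the 20% to 200% range for percentage rates. It also shows why a 6 dB change corresponds to roughly doubling, assuming the decibel value expresses an amplitude ratio. Both function names are hypothetical.

```python
from typing import Optional
from xml.sax.saxutils import quoteattr


def prosody(content: str, volume: Optional[str] = None,
            rate: Optional[str] = None, pitch: Optional[str] = None) -> str:
    """Wrap content in <prosody>; content may already contain nested tags."""
    given = {"volume": volume, "rate": rate, "pitch": pitch}
    attrs = {name: value for name, value in given.items() if value is not None}
    if not attrs:
        raise ValueError("<prosody> needs at least one attribute")
    if rate is not None and rate.endswith("%"):
        percent = float(rate[:-1])
        if not 20 <= percent <= 200:
            raise ValueError("rate must be between 20% and 200%")
    # quoteattr() adds the delimiters and escapes reserved characters.
    rendered = "".join(f" {name}={quoteattr(value)}" for name, value in attrs.items())
    return f"<prosody{rendered}>{content}</prosody>"


def db_to_amplitude_ratio(db: float) -> float:
    """Convert a decibel change to an amplitude ratio (assumed interpretation)."""
    return 10 ** (db / 20)


inner = prosody("not.", pitch="low")
print(prosody("Or do " + inner + " ", volume="-6dB", rate="150%"))
print(round(db_to_amplitude_ratio(6), 2))   # close to 2
print(round(db_to_amplitude_ratio(-6), 2))  # close to 0.5
```

The content argument is inserted as-is so that nesting works, which means plain text passed to `prosody` must already be escaped.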

Next Steps