<Say>

The <Say> verb converts text to speech (TTS) that is read back to the caller. <Say> is useful for dynamic text that is difficult to pre-record. The verb offers a choice of voices, each with its own supported set of languages and genders.

The default TTS engine used by <Say> is Amazon Polly. All voices and languages supported by Polly can be used by <Say>.

The <Say> verb also supports SSML. See the list of supported SSML tags.

<Say> Attributes

<Say> supports the following attributes that change its behavior:

Attribute | Allowed Values         | Default Value
voice     | see below              | Polly.Joanna
language  | see below              | en-US
history   | disable, compact, full | disable
loop      | integer > 0            | 1

voice

The voice attribute selects the TTS provider and voice used to read the text. Current TTS providers include Deepgram and AWS. For a list of Deepgram voices, see the Deepgram docs. For AWS, both standard and Neural Polly voices can be used; see the Polly docs for a complete list.

The voice attribute has the following syntax:

<engine>.<voice>(-Neural)?

For example:

Polly.Matthew
Polly.Matthew-Neural
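
The syntax above can be checked mechanically before a document is sent. The following is a minimal Python sketch, not part of the platform: the parse_voice helper and its regular expression are hypothetical, derived only from the <engine>.<voice>(-Neural)? pattern. It assumes engine and voice names contain only letters and digits, which may need loosening for some providers, and it does not verify that the engine or voice actually exists.

```python
import re

# Hypothetical pattern derived from the documented syntax <engine>.<voice>(-Neural)?
# Assumption: engine and voice names consist of letters and digits only.
VOICE_PATTERN = re.compile(
    r"^(?P<engine>[A-Za-z0-9]+)\.(?P<voice>[A-Za-z0-9]+)(?P<neural>-Neural)?$"
)


def parse_voice(value: str) -> dict:
    """Split a voice attribute value into engine, voice, and a neural flag."""
    match = VOICE_PATTERN.match(value)
    if match is None:
        raise ValueError(f"not in <engine>.<voice>(-Neural)? form: {value!r}")
    return {
        "engine": match.group("engine"),
        "voice": match.group("voice"),
        "neural": match.group("neural") is not None,
    }


print(parse_voice("Polly.Matthew"))
print(parse_voice("Polly.Matthew-Neural"))
```

A value without an engine prefix, such as "Matthew", raises ValueError in this sketch.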

language

The language attribute specifies a language and locale, along with the associated accent and pronunciation. The language must match the selected voice. Refer to the TTS provider docs for supported values.

history

The history attribute determines whether the <Say> verb is logged in the history array of the CDR. By default, it is not logged.

For <Say>, the compact and full values log the following payloads to the history array, respectively.

compact

{
  "payload": {
    "text": "spoken words here..."
  }
}

full

{
  "payload": {
    "text": "spoken words here...",
    "voice": "Polly.Ruth-Neural",
    "language": "en-US",
    "loop": "1"
  }
}
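
To make the difference between the three history values concrete, the payload shapes above can be modeled in a few lines of Python. This is only an illustrative sketch: history_payload is a hypothetical helper, not the platform's implementation, and its defaults are taken from the attribute table. It keeps loop as a string because that is how it appears in the full payload above.

```python
import json


def history_payload(mode, text, voice="Polly.Joanna", language="en-US", loop=1):
    """Return the dict logged for a <Say>, or None when history is 'disable'."""
    if mode == "disable":
        return None  # nothing is logged
    if mode not in ("compact", "full"):
        raise ValueError(f"unknown history mode: {mode!r}")
    payload = {"text": text}
    if mode == "full":
        # Mirrors the documented full payload; loop is a string there.
        payload.update({"voice": voice, "language": language, "loop": str(loop)})
    return {"payload": payload}


print(json.dumps(history_payload("compact", "spoken words here..."), indent=2))
print(json.dumps(
    history_payload("full", "spoken words here...", voice="Polly.Ruth-Neural"),
    indent=2,
))
```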

loop

The loop attribute specifies how many times you'd like the text repeated. The default is once.
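
When the response document is generated by an application, the loop value can be validated before it is written out. The sketch below uses Python's standard xml.etree.ElementTree module; build_say is a hypothetical helper name, and the "integer > 0" check is taken from the attribute table rather than from any documented server-side behavior.

```python
import xml.etree.ElementTree as ET


def build_say(text, loop=1):
    """Build a <Response> containing one <Say> that is repeated `loop` times."""
    if not isinstance(loop, int) or loop < 1:
        raise ValueError("loop must be an integer greater than 0")
    response = ET.Element("Response")
    say = ET.SubElement(response, "Say", {"loop": str(loop)})
    say.text = text
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(response, encoding="unicode")


print(build_say("Please hold.", loop=3))
```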

<Say> Nouns

The noun of <Say> is the text that is read to the caller. Its length is limited to 3,000 single-byte UTF-8 characters, not including SSML tags.
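
The limit above can be checked on the client side before sending a document. The following Python sketch rests on two assumptions that the text does not confirm: that the limit is counted in bytes of the UTF-8 encoding (so a multi-byte character counts more than once), and that a naive regular expression is good enough to strip SSML tags. The helper names and the sample tag are illustrative only.

```python
import re

MAX_SAY_BYTES = 3000  # limit stated in the text above
# Naive tag matcher (assumption): not a full SSML parser.
TAG_PATTERN = re.compile(r"<[^>]+>")


def say_text_size(text: str) -> int:
    """Return the UTF-8 byte length of text with SSML-style tags removed."""
    return len(TAG_PATTERN.sub("", text).encode("utf-8"))


def fits_say_limit(text: str) -> bool:
    """Check the stripped text against the assumed 3,000-byte limit."""
    return say_text_size(text) <= MAX_SAY_BYTES


sample = 'Hello <break time="1s"/> world'
print(say_text_size(sample), fits_say_limit(sample))
```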

<Say> Examples

Simple Usage

Read back text to the caller using the default voice and language, then end the call.

<Response>
  <Say>Hello, and goodbye.</Say>
  <Hangup/>
</Response>

Advanced Usage

Read back text to the caller using a Neural voice from Polly. Neural voices cost more than standard voices, but generally sound more natural.

<Response>
  <Say voice="Polly.Matthew-Neural" language="en-US">Transferring you to a live agent now.</Say>
  <Redirect>/transferToSupport</Redirect>
</Response>