<Say>
The <Say> verb converts text to speech (TTS) that is read back to the caller. <Say> is useful for dynamic text that is difficult to pre-record. The verb offers a choice of voices, each with its own supported set of languages and genders.
The current TTS engine used by <Say> is Amazon Polly. All voices and languages supported by Polly can be used by <Say>.
The <Say> verb also supports SSML. Please see supported SSML tags.
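As a rough illustration of SSML inside <Say>, the sketch below pauses briefly and then reads a number digit by digit. The <break> and <say-as> tags are standard SSML that Amazon Polly understands, but whether this platform accepts them, and whether a <speak> wrapper is needed, is an assumption here; check the list of supported SSML tags before relying on it.

```xml
<Response>
  <!-- Assumption: <break> and <say-as> are among the supported SSML tags. -->
  <Say>
    Your verification code is<break time="500ms"/>
    <say-as interpret-as="digits">4921</say-as>.
  </Say>
</Response>
```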
<Say> Attributes
<Say> supports the following attributes that change its behavior:
Attribute | Allowed Values | Default Value |
---|---|---|
voice | see below | Polly.Joanna |
language | see below | en-US |
history | disable, compact, full | disable |
loop | integer > 0 | 1 |
voice
The voice attribute allows you to select the TTS provider and voice to use when reading the text. Current TTS providers include Deepgram and AWS. For a list of Deepgram voices, please see their docs. For AWS, both standard and Neural Polly voices can be used. See Polly docs for a complete list.
The voice attribute has the following syntax:
<engine>.<voice>(-Neural)?
For example:
Polly.Matthew
Polly.Matthew-Neural
language
The language attribute allows you to specify a language and locale, with the associated accent and pronunciations. The language must match the selected voice. Refer to the TTS provider docs for more values.
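A minimal sketch of pairing a voice with its matching language follows. It assumes the Polly voice Vicki, which Amazon lists as a German (de-DE) voice with a Neural variant; the spoken text is illustrative, and the exact language codes accepted should be confirmed against the provider docs.

```xml
<Response>
  <!-- Vicki is a German Polly voice, so language is set to de-DE to match. -->
  <Say voice="Polly.Vicki-Neural" language="de-DE">Guten Tag. Ihr Anruf wird jetzt verbunden.</Say>
</Response>
```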
history
The history attribute determines whether the <Say> verb is logged in the history array of the CDR. The default is to not log.
For <Say>, the compact and full values log the following payloads to the history array, respectively.
compact
{
  "payload": {
    "text": "spoken words here..."
  }
}
full
{
  "payload": {
    "text": "spoken words here...",
    "voice": "Polly.Ruth-Neural",
    "language": "en-US",
    "loop": "1"
  }
}
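To show how the attribute relates to the logged payload, the sketch below sets history to full and reuses the attribute values that appear in the full payload above. The <Response> wrapper follows the examples elsewhere on this page; the exact entry written to the CDR is best confirmed against a real call.

```xml
<Response>
  <!-- With history="full", the text, voice, language, and loop values
       would be expected to appear in the CDR history array. -->
  <Say voice="Polly.Ruth-Neural" language="en-US" loop="1" history="full">spoken words here...</Say>
</Response>
```

Switching the value to compact would, per the payloads above, log only the text.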
loop
The loop attribute specifies how many times the text is read to the caller. The default is 1, meaning the text is read once.
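For instance, a short prompt can be read several times in a row. This is a minimal sketch using the default voice and language; the prompt text is illustrative.

```xml
<Response>
  <!-- loop="3" reads the same prompt three times in a row. -->
  <Say loop="3">All agents are busy. Please stay on the line.</Say>
</Response>
```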
<Say> Nouns
The noun is the text that is read to the caller. The length is limited to 3,000 UTF-8 single-byte characters, not including SSML tags.
<Say> Examples
Simple Usage
Read back text to the caller, using default voice/language, and end the call.
<Response>
  <Say>Hello, and goodbye.</Say>
  <Hangup/>
</Response>
Advanced Usage
Read back text to the caller using a Neural voice from Polly. While it costs more, Neural TTS generally produces more natural-sounding speech than the standard engine.
<Response>
  <Say voice="Polly.Matthew-Neural" language="en-US">Transferring you to a live agent now.</Say>
  <Redirect>/transferToSupport</Redirect>
</Response>