Have you ever wanted to create an IVR that recognizes the caller’s voice?
I guess you have, but a lot of questions probably come to mind as soon as you think about recognizing a caller’s voice: what software do I need to recognize voice, how do I program it, do I need additional hardware besides Asterisk PBX, and so on.

However, the truth is that creating speech-enabled dial plans is much easier than it looks at first glance.
Of course, to make it easy you need the right set of tools.

In this article we will discuss how you can create a simple speech-enabled dial plan, and we will create a Hello World dial plan that recognizes the caller’s voice.
You can also download the ready-made dial plan used here: Download hello-world speech enabled dial plan

If you have any trouble loading the dial plan, contact us and we’ll gladly work with you to help you set it up.

The first step is to have three pieces of software installed and running:

1. Asterisk PBX

Asterisk PBX is available in two major distributions – standalone Asterisk PBX without a GUI, and Asterisk PBX with a GUI.
We recommend Asterisk PBX with a GUI since it is much easier to set up through the GUI than by manually editing configuration files under Linux.

There are several Asterisk GUIs on the market (Elastix, PIAF, AsteriskNOW etc.), but we recommend the Elastix GUI. Download it here.

Download the Elastix PBX ISO package and follow the on-screen instructions to install it.
We recommend installing it on a virtual machine rather than dedicating a complete PC to this test.
After you learn more, you can always switch to a dedicated PC.

Besides the Asterisk GUI, the ISO will install the Linux OS, the MySQL database, the Asterisk core PBX etc.

2. Visual Dialplan

Visual Dialplan is an intuitive and easy-to-use tool for dial plan development. It is an especially good choice for developing speech-enabled dial plans, since it comes with an ASR (Automatic Speech Recognition) grammar editor and integrates the ASR grammar with the dial plan during the deployment process. This integration will significantly speed up your speech-enabled dial plan development and deployment.
Also, you don’t need to know Linux or have Asterisk experience to use Visual Dialplan and create complex dial plans.

Visual Dialplan comes with several ready-to-use dial plan examples, including several speech-enabled ones.
You can start by trying and modifying one of the samples or you can start with the example we use in this article.

Download Visual Dialplan here: Visual Dialplan download

3. LumenVox automated speech recognizer

The LumenVox automated speech recognizer is a software solution that converts spoken audio into text, providing users with a more efficient means of input. A speech recognizer compares spoken input to a list of phrases to be recognized, called a grammar. The grammar is used to constrain the search, enabling the recognizer to return the text that represents the best match. This text is then used to drive the next steps of your speech-enabled application.

This is the last piece of software you need to create speech-enabled dial plans.
More details about this software and download options can be found on the LumenVox web site.

 

Create hello-world speech enabled dial plan

This dial plan was developed using Visual Dialplan for Asterisk and is pre-configured for use with Elastix or any other compatible Asterisk GUI (AsteriskNOW, PIAF, trixbox etc.).

The output of Visual Dialplan is standard Asterisk extensions.conf code and grammar files, automatically deployed and loaded to the Asterisk server.
However, you will still need to manually place the sound files used in this dial plan in the /var/lib/asterisk/sounds/ folder.
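
If you happen to be working directly on the Asterisk server, this copy step can be scripted. The following is a minimal Python sketch and is not part of Visual Dialplan; the `install_sounds` helper, the `./sounds` source folder in the usage note, and the list of audio extensions are assumptions made for illustration.

```python
import shutil
from pathlib import Path

# Assumption: the sound files use one of these common Asterisk audio extensions.
SOUND_EXTENSIONS = {".wav", ".gsm", ".ulaw", ".alaw", ".sln"}


def install_sounds(src_dir, dest_dir="/var/lib/asterisk/sounds"):
    """Copy every sound file from src_dir into dest_dir and return the copied names."""
    src = Path(src_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(src.iterdir()):
        # Skip sub-folders and anything that is not a sound file.
        if path.is_file() and path.suffix.lower() in SOUND_EXTENSIONS:
            shutil.copy2(path, dest / path.name)
            copied.append(path.name)
    return copied
```

Calling `install_sounds("./sounds")` would copy the files into the default sounds folder; you will likely need sufficient permissions to write there.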

Contexts description

The entry point for this dial plan is the vdp-inbound or vdp-outbound context.

This dial plan plays the welcome message and asks the caller to select a sex.
Depending on the caller’s input, the say block will play number one or number two (1 for male or 2 for female).

Speech enabled dial plan

The complete ASR logic is in the ASR-select-sex context.

ASR logic

Each speech recognition action sets two variables: the first is the value of the recognized word and the second is the recognition score (a number between 1 and 100).

The ASR-select-sex context plays the hello-world voice file and listens for the user input. If the user input is ‘0’, i.e. the user didn’t say anything, it will play the `did-not-get-it` voice file and repeat the question.

The dial plan will then check whether the speech score (recognition score) is higher than the “sex-treshold” value. If it is, the dial plan proceeds and sets the variable “sex-value” to 1 or 2 depending on the caller input (1 if the user said “male”, 2 if the user said “female”). If the score is below the threshold (poor recognition), it asks the user to repeat the entry.

After this, the context will return control to the vdp-inbound context to evaluate the GotoIf expression.

If the expression returns true, the say block will say 1; if it returns false, it will say 2.
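
To make the flow easier to follow, here is a small Python simulation of the logic described in this section. It is only a sketch and not the code that Visual Dialplan generates: the function names, the example threshold of 50, and the assumption that the GotoIf expression tests whether the value equals 1 are all hypothetical.

```python
def asr_select_sex(attempts, threshold):
    """Simulate the ASR-select-sex context.

    attempts: iterable of (value, score) pairs, one per recognition attempt.
    Returns (sex_value, played), where played lists the voice files played.
    """
    played = []
    for value, score in attempts:
        played.append("hello-world")        # ask the question
        if value == "0":                    # the caller said nothing
            played.append("did-not-get-it")
            continue                        # repeat the question
        if score > threshold:               # good recognition: accept it
            return value, played
        # Poor recognition: fall through and ask again.
    return None, played


def say_block(sex_value):
    """Assumed GotoIf test: true (say 1) when the value is "1", otherwise say 2."""
    return 1 if sex_value == "1" else 2


# Silence first, then a low-score match, then a confident "female".
value, played = asr_select_sex([("0", 0), ("1", 30), ("2", 80)], threshold=50)
print(value, say_block(value))
```

In the real dial plan the loop is driven by the caller on the line rather than by a prepared list of attempts.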

ASR grammar

The ASR grammar in this dial plan is very simple – we will instruct the engine to recognize just two words, “Male” and “Female”.
If the word “Male” is recognized, the “sex” variable will be set to “1”.
If the word “Female” is recognized, the “sex” variable will be set to “2”.

sex
#ABNF 1.0 UTF-8;
language en-US;
mode voice;
tag-format ;
root $sex;

$Male = (Male);
$Female = (Female);
$sex = $Male {$="1";}|$Female {$="2";};
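
The effect of the grammar above can be pictured as a simple lookup from the recognized word to the value stored in the “sex” variable. The Python sketch below only illustrates that mapping and does not perform any speech recognition; the `interpret` helper and its case-insensitive matching are assumptions.

```python
# Each recognized word carries the value its grammar rule assigns.
GRAMMAR_TAGS = {"male": "1", "female": "2"}


def interpret(utterance):
    """Return the value for a recognized word, or None if it is not in the grammar."""
    return GRAMMAR_TAGS.get(utterance.strip().lower())


print(interpret("Male"))
print(interpret("Female"))
```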

 

Deployment

The output of Visual Dialplan is standard Asterisk code (extensions.conf) and a standard ASR grammar file.

Simply select “Dialplan” and then “Deploy” from the main menu and Visual Dialplan will SSH to your remote Asterisk server, deploy the dial plan and grammar files, and reload them into the Asterisk PBX.

You’ll be presented with a box to confirm the remote deployment. Just click yes and a few seconds later a confirmation window will appear.

Make sure you validate your Asterisk dial plan before you deploy it to the Asterisk server. This will make the whole process go a lot smoother.

Download hello-world speech enabled dial plan here