VoiceXML Tutorial: Part 1
Introduction and User Interaction with DTMF
Presented by Plum Voice
About Plum
Plum offers high-performance, versatile, and scalable IVR systems and hosting that can automate any phone call. We pride ourselves on delivering solutions that satisfy customers ranging from small and medium-sized businesses to some of the largest enterprises in the world.
1. VoiceXML Tutorial
VoiceXML 2.0 is the World Wide Web Consortium (W3C) standard for scripting voice applications. In this tutorial, we construct a VoiceXML interactive voice response (IVR) application for a customer service center. Some aspects of this tutorial assume you have your own web server, which is the recommended configuration for a full production-level application. Starting from a simple "Hello World" application, we build a telephony application that includes:
- Dynamic response driven by touch-tone or speech input
- Advanced text-to-speech (TTS) synthesis and automatic speech recognition (ASR)
- System integration with enterprise databases
1.1 Introduction to VXML
We begin with nearly the simplest complete VoiceXML application. The application here is analogous to an answering machine set to play an announcement only.
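A minimal sketch of such an announcement-only application is shown here. The prompt wording is an assumption chosen for illustration; the structure (an XML declaration, a vxml root element, and a form containing a single block) follows the VoiceXML 2.0 standard.

```xml
<?xml version="1.0"?>
<!-- Root element: every VoiceXML 2.0 document is enclosed in <vxml>. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <!-- A block holds content that runs once, with no user input. -->
    <block>
      <!-- Plain text inside a prompt is rendered by text-to-speech.
           The wording here is illustrative only. -->
      <prompt>
        Hello world!
      </prompt>
    </block>
  </form>
</vxml>
```

When the call connects, the platform speaks the prompt and, because nothing else remains to execute, ends the session.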
Also, as the <?xml?> tag declares, every VoiceXML document is an XML document. The basic structure of VoiceXML should be familiar to anyone who has looked at HTML web documents: tags are set off by angle brackets (< and >). VoiceXML documents must adhere strictly to the XML standard. The document must begin with the <?xml?> tag, and the rest of the document is enclosed within the <vxml> tag.
For static prompts such as this welcome message, we will probably want to use a human announcer instead of TTS. TTS has come a long way, but there is still no substitute for the real thing. For recorded prompts, we use the <audio> tag.
The text within the <audio> tag is not required. We could have included no content, writing the element either as a single empty (self-closing) tag or as an opening and closing tag with nothing between them; the two forms are equivalent. The text included within the <audio> tag is something like the ALT text for images in HTML: if the VoiceXML platform is unable to open or play the source ("src") file, it falls back on generating TTS from the included text.
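The three forms can be compared side by side in the sketch below. It assumes the recording lives at wav/welcome.wav relative to the application script and uses "Welcome to Plum Voice." as the fallback text; adjust both to match your own application. Playing the same file three times is only for comparison, and a real application would use one form.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <!-- Recorded prompt with fallback text: the text is rendered by TTS
           only if the file cannot be opened or played. -->
      <audio src="wav/welcome.wav">Welcome to Plum Voice.</audio>

      <!-- No fallback text, written as an empty (self-closing) element. -->
      <audio src="wav/welcome.wav"/>

      <!-- Equivalent to the self-closing form above. -->
      <audio src="wav/welcome.wav"></audio>
    </block>
  </form>
</vxml>
```

Including fallback text is usually the safer choice, since a missing or unreadable file then degrades to synthesized speech rather than silence.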
It is good practice to store your audio files on the same local server as your application script. For example, here is what the files would look like on our local server.
[Screenshot: the files folder on our local server]
In the files folder, test.php is the script that contains the reference to the audio file welcome.wav.
welcome.wav is stored in our wav folder. Thus, when referencing the source ("src") file in our <audio> tag, we write: <audio src="wav/welcome.wav">Welcome to Plum Voice.</audio>

The benefit of storing audio files on your local server, as opposed to the audio repository, is easier file management. Suppose you want to rename one of your audio files. If the file is stored on your own server, you can simply rename it yourself. With the audio repository, however, you cannot manage files this way. For example, if you deleted a recording in the audio repository (say, 12.wav) and uploaded a replacement file, the replacement would not take the deleted recording's old name. It would instead take the next highest number available among your recordings (say, 21.wav).

If you are concerned about loading times for audio files served from your local server, note that once these files have been cached, they load as quickly as files stored in our audio repository.
1.2 User Interaction with DTMF
Grammars tell a speech recognizer what it should listen for, and so describe the utterances a user may say. Starting with VoiceXML 2.0, the W3C requires all VoiceXML platforms to support at least one common format, the XML Form of the W3C Speech Recognition Grammar Specification (SRGS). Plum implements the SRGS+XML grammar format for both voice and DTMF grammars, as well as the JSpeech Grammar Format (JSGF).
To control user input, we can explicitly create input fields and specify allowable grammars for user input. We do this by explicitly using the <field> and <grammar> tags.
The following example shows how to set up a grammar for DTMF input from the user:
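A minimal sketch of such a field, assuming a hypothetical three-option customer service menu. The form and field names, the prompt wording, and the digit choices are illustrative and not taken from the original slides. The inline SRGS+XML grammar is declared in DTMF mode and accepts a single key press of 1, 2, or 3.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- "main_menu" and "choice" are hypothetical names. -->
  <form id="main_menu">
    <field name="choice">
      <prompt>
        For sales, press 1. For support, press 2. For billing, press 3.
      </prompt>

      <!-- Inline SRGS+XML grammar; mode="dtmf" restricts it to key presses. -->
      <grammar type="application/srgs+xml" mode="dtmf"
               root="digit" version="1.0">
        <rule id="digit">
          <!-- Exactly one of the listed keys matches the grammar. -->
          <one-of>
            <item>1</item>
            <item>2</item>
            <item>3</item>
          </one-of>
        </rule>
      </grammar>

      <!-- Runs once the caller's input has matched the grammar. -->
      <filled>
        <prompt>You pressed <value expr="choice"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Any key outside the grammar (for example 9) does not fill the field, so the platform's default handling for unmatched input applies.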
About us
Plum Voice was founded in 2000 as The Plum Group Inc. With headquarters in New York and offices in New York City, Boston, Denver and London, Plum creates technologies for personalized audio communication. Plum provides interactive voice response platforms, systems and hosting services to developers and companies to automate call center and business processes over the phone. Products and services include:
- The Plum VoiceXML Platform
- Plum IVR Hosting Suite
- Plum Survey
- Plum IVR Server
- Plum Professional Services
- QuickFuse
Up Next: User Interaction with Speech, Built-in Grammars, and Standard Events