VoiceXML 2.1 Development Guide


VoiceXML Tutorial: Using the VoiceXML 2.1 "marktime" Shadow Variable to Simplify Dialog Choices

This month's Voxeo tutorial covers the marktime shadow variable, a new addition in the VoiceXML 2.1 specification. So what is this shadow variable all about, anyway? Per the specification:

"When these properties are set on the application.lastresult$ object, if an input item (as defined in section 2.3 of [VXML2]) has also been filled and has its shadow variables assigned, the interpreter must also assign markname and marktime shadow variables, the values of which equal the corresponding properties of the application.lastresult$ object."

In layman's terms, marktime lets us detect exactly where a caller barged in on a prompt inside a voice recognition field. When a recognition field receives valid input, lastresult$.marktime is populated with a time value (in milliseconds) showing where the caller interrupted our <prompt>, and the same value is available on the field's own shadow variable. Leveraging this shadow variable, we can assign a universal grammar ("that one," or even a DTMF key) to choose items from a list, and then use some simple arithmetic to determine which choice the caller wants.
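To make the quoted rule concrete, here is a minimal ECMAScript sketch of the copy the interpreter performs when a field is filled. It is a simplified model only: the function name assignMarkShadowVars and the sample values are assumptions made for illustration, and a real interpreter does this internally rather than in application code.

```javascript
// Simplified model of the VoiceXML 2.1 rule quoted above: when marktime and
// markname are set on application.lastresult$, a filled input item receives
// the same values on its shadow variable (for example myfield$.marktime).
// Hypothetical helper for illustration; not part of any VoiceXML API.
function assignMarkShadowVars(lastresult, fieldShadow) {
  // Start from the shadow properties the field already has.
  var result = {};
  for (var key in fieldShadow) {
    if (Object.prototype.hasOwnProperty.call(fieldShadow, key)) {
      result[key] = fieldShadow[key];
    }
  }
  // Mirror the mark properties only when they have actually been set.
  if (lastresult.marktime !== undefined) {
    result.marktime = lastresult.marktime;
  }
  if (lastresult.markname !== undefined) {
    result.markname = lastresult.markname;
  }
  return result;
}

// Made-up sample: the caller said "that one" and marktime was 17250 ms.
var lastresult = { utterance: 'that one', marktime: 17250 };
var shadow = assignMarkShadowVars(lastresult, { utterance: 'that one' });
```

In this sample only marktime is mirrored, because markname was never set on the lastresult object.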


Note: This tutorial assumes that the user is using the Prophecy VoiceXML platform. Other VXML platforms and partitions on the Voxeo network do not support this new addition to the specification.

Note: At the time of this writing, the only shadow variable supported for this new feature is lastresult$.marktime. To be clear, lastresult$.markname is not presently supported but will be added in future software releases of the platform.


Step 1: Write the Main VoiceXML File

Our VoiceXML dialog is going to be pretty simple, comparatively speaking: All we really need to get started is a simple form/block/field/filled structure that most of you have worked with already:


<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xmlns:voxeo="http://community.voxeo.com/xmlns/vxml">
<meta name="maintainer" content="your_email_address@somewhere.com"/>

  <form>

    <block>
      <!-- Record the call; the info value becomes part of the recording's file name -->
      <voxeo:recordcall value="100" info="marktest"/>
    </block>

    <field name="LenandBarrys">

      <!-- The introduction cannot be interrupted -->
      <prompt bargein="false">
        Welcome to Len and Barry's Ice Cream Ordering System.
        You can select from the following menu your favorite ice cream of the day.
        At any time you can say "that one."
      </prompt>

      <!-- Each flavor is followed by a pause in which the caller can say "that one" -->
      <prompt>
        <break time="500ms"/>
        Cocoa Banana Cabana
        <break time="2500ms"/>

        Banana Caramel Crunch
        <break time="2500ms"/>

        Chocolate Devotion
        <break time="2500ms"/>

        German Chocolate
        <break time="700ms"/>
      </prompt>

      <!-- A single phrase selects whichever flavor was just played -->
      <grammar mode="voice" root="LenBarry">
        <rule id="LenBarry" scope="public">
          <one-of>
            <item>that one</item>
          </one-of>
        </rule>
      </grammar>

      <filled>
      </filled>

    </field>

  </form>

</vxml>


Nothing too fancy in the code above. We have a very basic field, containing a very basic SRGS grammar with a single choice defined. We learned all about this stuff in VoiceXML kindergarten, right? There are a few things that savvy readers will notice about our starting snippet. First, we have defined a <voxeo:recordcall> element in our <block>. Second, we have defined a Voxeo XML namespace within the <vxml> element. These are important for the next step in the tutorial: to set up the boundaries for our "universal choice" grammar, we need to determine when each ice cream choice is played and how long it takes to render to the caller. More importantly, you'll note that we have added a pause after each flavor of ice cream listed in our TTS. Each pause gives the caller a "window" in which to declare a choice, and lets the system determine which choice was really meant.


Step 2: Fun with Goldwave

With this basic code in place, we will now map our application and give it a test call. I know, it's incomplete at this moment, but we need to make an assessment of where we can "bookmark" our choices on a visual timeline that represents the audio. We now place a test call to the app as it currently stands and stay totally quiet; putting the phone or mic on "mute" helps quite a bit. Once the default "noinput" prompt has played, we can hang up and grab the audio file that is generated by the <voxeo:recordcall> element.


If you are running Prophecy on the hosted network, then you can find your audio file uploaded to your Voxeo File Manager, within the 'root/recordings' directory. Your filename will look something like this:

123-123-123abc123abc123abc123abc12-0-marktest.wav


If you are running Prophecy locally, then the audio from the <voxeo:recordcall> element will be dumped into the "C:\Program Files\Voxeo\www\MRCP\Recordings" directory, and will look something like this.

1-1-71ee1fce0f3d516ed2c3beb139ad4609-0-marktest.wav

Now the fun begins, as promised. You will next want to grab a free copy of Goldwave, which is a highly functional audio editing utility. Once you have this downloaded and installed, you will then want to grab the audio recording generated by the <voxeo:recordcall> and drag it into the Goldwave window. This will give us a tidy representation of where the audio is output and how much time there is between the rendering of the different ice cream choices. An annotated screenshot of what this will look like is found below for reference:



From the annotations, we can see several distinct areas of text-to-speech output that occurred when we queried the caller, each followed by a user-defined pause. Checking the 'timeline' at the bottom of our Goldwave window, we can then note some timing "bookmarks":




Step 3: Timing is Everything

Our Goldwave diagram and grocery list of timing values above give us several "windows" that are open for input from our callers. Because we are filtering caller input by the point in time at which it was uttered, some simple conditional statements are enough to determine which flavor the caller chose. Since the introductory prompt is set to bargein="false", we can discount any user input between 0ms and 12800ms in our calculations. As a matter of fact, all we care about for the first choice is its endpoint (including the 2500ms pause), so any value less than or equal to this endpoint selects it. Following this logic, we then start putting our conditional logic in place:


  <if cond="LenandBarrys$.marktime &lt;= 16500">
    <!-- The caller must have chosen "Cocoa Banana Cabana" -->
  </if>
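Before writing the remaining conditions, it can help to lay the windows out as data. The ECMAScript sketch below does that using only numbers that appear in this tutorial: the 12800ms end of the introductory prompt and the 16500, 23000 and 27500ms endpoints used in its conditions. The function name buildWindows and the property names are made up for illustration, and the last window is left open-ended because the tutorial gives no endpoint for it.

```javascript
// Turn an ordered list of choice endpoints into explicit time windows.
// Each window starts where the previous one ended; the first starts when
// the non-interruptible introduction finishes.
// Hypothetical helper; names and structure are assumptions for illustration.
function buildWindows(introEndMs, choices) {
  var windows = [];
  var start = introEndMs;
  for (var i = 0; i < choices.length; i++) {
    var end = choices[i].endMs; // null means "no upper bound"
    windows.push({ name: choices[i].name, fromMs: start, toMs: end });
    start = end;
  }
  return windows;
}

var windows = buildWindows(12800, [
  { name: 'Cocoa Banana Cabana', endMs: 16500 },
  { name: 'Banana Caramel Crunch', endMs: 23000 },
  { name: 'Chocolate Devotion', endMs: 27500 },
  { name: 'German Chocolate', endMs: null }
]);
```

Each entry's fromMs and toMs then correspond to the lower and upper bounds of one branch of the conditional logic.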



For the rest of our choices, we base our logic on the starting point of the choice in question and the endpoint of the <break> tag immediately following it. Remember that "<" and "&" cannot appear literally inside an XML attribute value, as they would cause a parse error. They must be written as "&lt;" and "&amp;" (so "&&" becomes "&amp;&amp;"), and we also write ">" as "&gt;" for consistency:



<filled>
  <log expr="'****** LenandBarrys$.marktime' + LenandBarrys$.marktime + '********'" />

  <if cond="LenandBarrys$.marktime &lt;=16500">
    <prompt>You chose Cocoa Banana Cabana.</prompt>

  <elseif cond="LenandBarrys$.marktime &gt;=16500 &amp;&amp; LenandBarrys$.marktime &lt;=23000" />
    <prompt>You chose Banana Caramel Crunch.</prompt>

  <elseif cond="LenandBarrys$.marktime &gt;=23000 &amp;&amp; LenandBarrys$.marktime &lt;=27500" />
    <prompt>You chose Chocolate Devotion.</prompt>

  <else />
    <prompt>You chose German Chocolate.</prompt>

  </if>
</filled>
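The same decision can also be written as a small table-driven ECMAScript function, which may be easier to maintain if the menu grows. This is only a sketch of an alternative, not the tutorial's method: the names FLAVOR_WINDOWS, LAST_FLAVOR and flavorForMarktime are assumptions, and the thresholds are the ones used in the conditions above. Because each branch is only tested after the earlier ones have failed, the lower bounds are not needed, and a value exactly on a boundary (such as 16500) resolves to the earlier flavor, just as it does in the if/elseif chain.

```javascript
// Upper bound (in milliseconds) of each flavor's window, in menu order.
// Thresholds are taken from the tutorial's conditions; names are hypothetical.
var FLAVOR_WINDOWS = [
  { upToMs: 16500, flavor: 'Cocoa Banana Cabana' },
  { upToMs: 23000, flavor: 'Banana Caramel Crunch' },
  { upToMs: 27500, flavor: 'Chocolate Devotion' }
];

// Anything later than the last threshold falls through to the final item,
// mirroring the <else/> branch.
var LAST_FLAVOR = 'German Chocolate';

function flavorForMarktime(marktimeMs) {
  for (var i = 0; i < FLAVOR_WINDOWS.length; i++) {
    if (marktimeMs <= FLAVOR_WINDOWS[i].upToMs) {
      return FLAVOR_WINDOWS[i].flavor;
    }
  }
  return LAST_FLAVOR;
}
```

A function like this could be placed in a VoiceXML <script> element and called with the field's marktime value, although whether that is preferable to the explicit conditions is a matter of taste.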



You'll note that we also added the inevitable <log> statement, as it is going to give us some external validation on our math skills and show the actual marktime value for us in the Voxeo debugger. And we always keep the debugger open when making test calls, right? Right?
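If you save the debugger output from several test calls, the marktime values can be pulled back out of the logged messages with a short ECMAScript helper. The sketch below assumes only that the message produced by our <log> expression appears somewhere in each line; the surrounding debugger format is not assumed, and the function name extractMarktime is hypothetical.

```javascript
// Extract the numeric marktime from a line containing the tutorial's log
// message, e.g. "****** LenandBarrys$.marktime17250********".
// Returns null when the line does not contain the message.
// Hypothetical helper for illustration only.
function extractMarktime(logLine) {
  // \D* tolerates an optional separator between the label and the number.
  var match = /LenandBarrys\$\.marktime\D*(\d+)/.exec(logLine);
  return match ? parseInt(match[1], 10) : null;
}

var sample = extractMarktime('****** LenandBarrys$.marktime17250********');
```

Comparing a handful of extracted values against the timeline from Step 2 is a quick way to confirm that the thresholds sit where you expect.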


Step 4: Code Complete!

Ahh yes, "code complete." This is a phrase that is, to any developer, a joyous thing to hear. All that remains now is to add some basic event handling for our OOGNI (out-of-grammar and no-input) events, and we are in a state of development Nirvana. Imagine Siddhartha himself with a day off work and you'll get the picture. Let's bask in the enlightenment of our fully developed, code-complete application:


<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xmlns:voxeo="http://community.voxeo.com/xmlns/vxml">

<meta name="maintainer" content="your_email_address@somewhere.com"/>

  <form>

    <block>
      <!-- Record the call; the info value becomes part of the recording's file name -->
      <voxeo:recordcall value="100" info="marktest"/>
    </block>

    <field name="LenandBarrys">

      <!-- The introduction cannot be interrupted -->
      <prompt bargein="false">
        Welcome to Len and Barry's Ice Cream ordering system.
        You can select from the following menu your favorite ice cream of the day.
        At any time you can say "that one."
      </prompt>

      <!-- Each flavor is followed by a pause in which the caller can say "that one" -->
      <prompt>
        <break time="500ms"/>

        Cocoa Banana Cabana
        <break time="2500ms"/>

        Banana Caramel Crunch
        <break time="2500ms"/>

        Chocolate Devotion
        <break time="2500ms"/>

        German Chocolate
        <break time="700ms"/>
      </prompt>

      <!-- A single phrase selects whichever flavor was just played -->
      <grammar mode="voice" root="LenBarry">
        <rule id="LenBarry" scope="public">
          <one-of>
            <item>that one</item>
          </one-of>
        </rule>
      </grammar>

      <!-- Handle silence and out-of-grammar input, then play the menu again -->
      <catch event="noinput nomatch">
        <log expr="'***** OOGNI event caught: ' + _event"/>
        <prompt>
          I'm sorry, but I didn't quite catch that.
          Let's try again.
        </prompt>
        <reprompt/>
      </catch>

      <!-- Map the barge-in time to the flavor that was playing -->
      <filled>
        <log expr="'***** LenandBarrys$.marktime' + LenandBarrys$.marktime + '********'"/>

        <if cond="LenandBarrys$.marktime &lt;=16500">
          <prompt>
            You chose Cocoa Banana Cabana.
          </prompt>

        <elseif cond="LenandBarrys$.marktime &gt;=16500 &amp;&amp; LenandBarrys$.marktime &lt;=23000"/>
          <prompt>
            You chose Banana Caramel Crunch.
          </prompt>

        <elseif cond="LenandBarrys$.marktime &gt;=23000 &amp;&amp; LenandBarrys$.marktime &lt;=27500"/>
          <prompt>
            You chose Chocolate Devotion.
          </prompt>

        <else/>
          <prompt>
            You chose German Chocolate.
          </prompt>

        </if>
      </filled>

    </field>

  </form>

</vxml>
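The cond attributes in the listing above write "&&" as "&amp;&amp;" and the comparison operators as "&lt;" and "&gt;". If conditions like these are ever generated by a script rather than typed by hand, the escaping can be automated. The ECMAScript sketch below shows one hypothetical way to do it; the helper name escapeForXmlAttribute is an assumption and is not part of any Voxeo or VoiceXML API.

```javascript
// Escape an ECMAScript expression so it can be placed inside a
// double-quoted XML attribute such as cond="...".
// Hypothetical helper for illustration only.
function escapeForXmlAttribute(expr) {
  return expr
    .replace(/&/g, '&amp;')   // must run first so later entities are not re-escaped
    .replace(/</g, '&lt;')    // a literal "<" is not allowed in attribute values
    .replace(/>/g, '&gt;')    // not strictly required, escaped for consistency
    .replace(/"/g, '&quot;'); // would otherwise end the attribute early
}

var raw = 'LenandBarrys$.marktime >= 16500 && LenandBarrys$.marktime <= 23000';
var escaped = escapeForXmlAttribute(raw);
// escaped now holds:
// LenandBarrys$.marktime &gt;= 16500 &amp;&amp; LenandBarrys$.marktime &lt;= 23000
```

The ampersand replacement has to come first; otherwise the "&" characters introduced by the other replacements would themselves be escaped.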



Step 5: Dial In and Test Out Your Code

All that remains now is to test your code. If you are using the Voxeo hosted platform, simply use the File Manager to upload your code onto our free hosting environment, or you can host it on your own webserver if you like. If you are using a local instance of the Prophecy platform, then you can place the files in your 'www' subdirectory and change your call routing to point to this file.

If you have any difficulties with your hand-typed code from this tutorial, you can always grab the zip file below to use as a comparison.


Download the Code!

  Prophecy source code



© 2008 Voxeo Corporation  |  Voxeo IVR  |  VoiceXML & CCXML IVR Developer Site