VoiceXML 2.1 Development Guide


Tutorial: XML Grammars

This tutorial builds on concepts covered in previous tutorials. If you have not completed most of the VoiceXML tutorial section, we recommend you go through those tutorials first.

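In this tutorial, we will compare the GSL and grXML grammar formats, create a VoiceXML file that loads an external grXML grammar, and then extend that grammar with optional utterances and slot definitions.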

The different grammar formats

With the newer voice browsers available today, we have several options when deciding what formats to code our grammars in. What's so great about that, you ask? Some back story on the formats will clear this up. Most of us are accustomed to the GSL grammar format, as illustrated below:


Inline GSL grammar


<grammar type="text/gsl">
  [ gimli aragorn legolas (frodo ?baggins) ]
</grammar>


Inline GSL grammar with slot definitions


<grammar type="text/gsl">
  <![CDATA[
    [
      [gimli] {<MySlot "dwarves">}
      [aragorn] {<MySlot "men">}
      [legolas] {<MySlot "elves">}
      (frodo ?baggins) {<MySlot "hobbits">}
    ]
  ]]>
</grammar>



This grammar format has served us well over the years, so where does the deficiency lie? Well, nowhere really. The primary reason to consider coding your applications using grXML is that GSL is a Nuance-proprietary format. If you spend months working on a custom grammar and then decide to run it on a non-Nuance platform, all your work may have been for naught. grXML, the XML form of the W3C Speech Recognition Grammar Specification (SRGS), is not subject to this limitation: the VoiceXML 2.0 specification requires conforming platforms to support it. You can certainly expect grXML to be THE standard for future VXML coding, and for the GSL and ABNF formats to be leaving for the Gray Havens.

Additionally, grXML allows more flexibility regarding grammar structure, repeated phrases, and special utterances (such as phrases you do not want recognized at all), as well as a bevy of other options. The grXML format is also easier to debug: while a troublesome GSL grammar can take hours of troubleshooting to get right, a grXML grammar is plain XML, so opening it in your favorite web browser will flag malformed markup right away (it will not, of course, catch mistakes in the grammar logic itself). In summary, while the GSL format does work just fine, it can be cantankerous to work with (especially for those new to writing voice apps), is proprietary, and limits the platforms your code can run on. grXML is easier to debug, is platform independent, and has the backing of the W3C, and Eowyn.
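
As a rough illustration of the "repeated phrases" and "special utterances" mentioned above, the sketch below uses two features defined by the W3C SRGS specification: the repeat attribute on <item>, and the special GARBAGE rule, which matches speech you do not want recognized. The rule name HERO and the word choices are made up for this example, and how well a given platform handles GARBAGE is an assumption you should verify.

```xml
<?xml version="1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" root="HERO">

  <rule id="HERO" scope="public">
    <!-- filler speech before the name is matched but not recognized -->
    <item repeat="0-1"> <ruleref special="GARBAGE"/> </item>

    <!-- "great" may be said zero, one, or two times -->
    <item repeat="0-2"> great </item>

    <one-of>
      <item> gimli </item>
      <item> legolas </item>
    </one-of>
  </rule>
</grammar>
```

With this sketch, "gimli", "great gimli", and "great great legolas" would all be candidates for a match.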


Step 1: creating our initial VoiceXML file

If you have been through the previous tutorials, authoring a simple form/field grammar document is no big deal. If you are simply skipping ahead to the last tutorial without researching vxml/grammar theory, then the code below may cause abject terror.




<?xml version="1.0" encoding="UTF-8"?>

<vxml version = "2.1">

<meta name="author" content="J. R. R. Tolkien"/>
<meta name="copyright" content="2004 voxeo corporation"/>
<meta name="maintainer" content="YourEmailAddress@here.com"/>

<form id="F1">
  <field name="F_1">
    <prompt> who is the coolest character in the Lord Of The Rings saga? </prompt>

    <grammar src="XMLGrammar.xml" type="application/grammar-xml"/>

    <filled>
      <prompt>
        you said
          <value expr="lastresult$.utterance"/>
        from the race of
          <value expr="F_1$.interpretation.F_1"/>
        is the coolest character. I would have to agree.
      </prompt>
    </filled>
  </field>
</form>
</vxml>




A few notes about the above structure, which should seem familiar. Note the <meta> tag (not the author's self-aggrandizing namestamp, but the 'maintainer' one). If you have done your homework, you will remember that adding this tag enables debugging via email, which is a super-handy thing to have in the event your application decides to detonate due to a coding error. Get in the habit of doing this in all of your VXML documents, and you will be very glad that you did. Trust us on this one.

You will also notice we have explicitly declared the grammar type as 'application/grammar-xml'. Remember, we must always declare the proper MIME type for the grammar, or else you will undoubtedly get the 'application has an internal error' message, a haunting call from the darkness of Isengard, which is bad.
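
Rather than letting the caller hear the generic error message, you can catch grammar-loading failures yourself. The sketch below relies on the standard VoiceXML events error.badfetch and error.unsupported.format. Which of the two a particular platform throws for a wrong MIME type is an assumption to test, and the prompt wording is made up.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1">
  <form id="F1">

    <!-- thrown when the grammar file cannot be fetched or parsed -->
    <catch event="error.badfetch">
      <prompt> sorry, the grammar could not be loaded. </prompt>
      <exit/>
    </catch>

    <!-- thrown when the platform does not support the declared grammar type -->
    <catch event="error.unsupported.format">
      <prompt> sorry, that grammar format is not supported here. </prompt>
      <exit/>
    </catch>

    <field name="F_1">
      <prompt> who is the coolest character in the Lord Of The Rings saga? </prompt>
      <grammar src="XMLGrammar.xml" type="application/grammar-xml"/>
    </field>
  </form>
</vxml>
```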

Step 2: Writing our XML grammar

We will start off writing a grammar of medium complexity. Our goals for this grammar are twofold: we want a grammar explicitly defining return values, and we want that grammar to reside in an external file. Let us also base it on the GSL grammar above, so we have something to compare it to:


<?xml version= "1.0"?>

<grammar xmlns="http://www.w3.org/2001/06/grammar"
          xml:lang="en-US" root = "MYRULE">

  <rule id="MYRULE" scope="public">

    <one-of>
      <item> gimli </item>
      <item> aragorn </item>
      <item> legolas </item>
      <item> frodo baggins </item>
      <item> rosie o donnell </item>
    </one-of>

  </rule>
</grammar>


Here we have a grXML document that illustrates a simple grXML structure, sans fancy slot definitions. If 'legolas' is the user utterance, then all that will be returned to the invoking VXML is, you guessed it, the dual-purpose utterance/value of 'legolas'. This is where the journey begins, fellow wanderers, so let us take a closer look at some of the key points of this grammar structure so we do not get lost on the way to Mordor, shall we? 
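
Because this version of the grammar defines no slots, the field variable itself is all there is to read back. A minimal sketch of a field for the slot-less grammar, assuming (as described above) that the platform assigns the recognized words to the field variable F_1:

```xml
<field name="F_1">
  <prompt> who is the coolest character in the Lord Of The Rings saga? </prompt>
  <grammar src="XMLGrammar.xml" type="application/grammar-xml"/>

  <filled>
    <prompt>
      <!-- with no slot defined, F_1 holds the utterance itself, e.g. legolas -->
      you said <value expr="F_1"/>
    </prompt>
  </filled>
</field>
```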

Riding a barrel through Mirkwood

You would not be the intrepid adventurer you are if the topic of optional utterances did not strike your fancy. While they are not really needed as part of our lesson, it is a question bound to arise.  When using GSL syntax, we could define utterances that are not necessarily required, but still allowable all the same, by adding a '?' in front of them:


<grammar type="text/gsl"> [ gimli aragorn legolas (frodo ?baggins) ] </grammar>


So how do we do this in grXML, you ask? Easy as rabbit pie, My Precious. We want to use the 'repeat' attribute of the <item> element, and do some creative nesting of our tags:



<?xml version= "1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
          xml:lang="en-US" root = "MYRULE">

  <rule id="MYRULE" scope="public">

    <one-of>
      <item>

        <!-- optional utterance prefixes go here -->
        <!-- <item repeat="0-1">(some utterance)</item> -->

        <one-of>
          <item> gimli </item>
          <item> aragorn </item>
          <item> legolas </item>
          <item> frodo </item>
          <item> rosie </item>
        </one-of>

        <!-- optional utterance suffixes go here -->
        <item repeat="0-1"> baggins </item>
        <item repeat="0-1"> o donnell </item>

      </item>
    </one-of>

  </rule>
</grammar>


Okay, to break down the key points of this humble addition to our grammar: those with keen eyes will notice we nested our original utterances within another level of <one-of>/<item> tags. As we are accounting for an utterance suffix in this case (for last names), we have placed these optional utterances below the required utterances (first names). If we wanted to add an utterance prefix, we would place it just above the required utterances. Even an Uruk-hai could figure that one out! Also, note that we have added the aforementioned 'repeat' attribute to these utterances, and specified that they can be said either once or not at all. One side effect of this layout is that either suffix is accepted after any of the first names, so 'rosie baggins' would match just as well as 'frodo baggins'.
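
To make the prefix case concrete, here is a sketch of the same grammar with the commented-out placeholder filled in. The word "master" is a made-up prefix chosen only for illustration. The nesting follows the structure used above.

```xml
<?xml version="1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" root="MYRULE">

  <rule id="MYRULE" scope="public">
    <one-of>
      <item>

        <!-- optional prefix: said once or not at all -->
        <item repeat="0-1"> master </item>

        <!-- required utterance -->
        <one-of>
          <item> gimli </item>
          <item> aragorn </item>
          <item> legolas </item>
          <item> frodo </item>
        </one-of>

        <!-- optional suffix: said once or not at all -->
        <item repeat="0-1"> baggins </item>

      </item>
    </one-of>
  </rule>
</grammar>
```

This sketch is meant to accept "frodo", "master frodo", "frodo baggins", and "master frodo baggins".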

Adding our slot definitions

One last thing to do before we leave Middle Earth for good -- add in some slot returns.


<?xml version= "1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
          xml:lang="en-US" root = "MYRULE">

  <rule id="MYRULE" scope="public">

    <one-of>
      <item>

        <!-- optional utterance prefixes go here -->
        <!-- <item repeat="0-1">(some utterance)</item> -->

        <one-of>
          <item> gimli   <tag> <![CDATA[ <F_1 "dwarves"> ]]>     </tag> </item>
          <item> aragorn <tag> <![CDATA[ <F_1 "men"> ]]>         </tag> </item>
          <item> legolas <tag> <![CDATA[ <F_1 "elves"> ]]>       </tag> </item>
          <item> frodo   <tag> <![CDATA[ <F_1 "hobbits"> ]]>     </tag> </item>
          <item> rosie   <tag> <![CDATA[ <F_1 "cave trolls"> ]]> </tag> </item>
        </one-of>

        <!-- optional utterance suffixes go here -->
        <item repeat="0-1"> baggins </item>
        <item repeat="0-1"> o donnell </item>

      </item>
    </one-of>

  </rule>
</grammar>



Note: our additions to the Precious, lit by the Light of Eärendil, are the <tag> elements that now follow each name. For every slot we want to return, we nest a <tag> element within the <item> element holding the utterance. Within this <tag>, the slot definition and its return value must be enclosed in a CDATA section, so that the angle brackets of the slot definition are not parsed as XML markup and do not (like Shelob) bring our journey to a screeching halt. Inside the CDATA section, we define our slot (which, you will note, matches the field name we defined in the VXML document) within angle brackets:

<F_1>

...and then we define our return value within double quotes contained by the slot definition:

<F_1 "hobbits">

Say, don't we need some slots for our optional utterances? Nope. Remember, it doesn't matter if our callers say 'frodo baggins' or just 'frodo': our return value will be, in this instance, 'hobbits'. The only action needed now is for the Ring-Bearer to upload and map the application.
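
The slot syntax inside <tag> mirrors the GSL slot commands shown at the start of this tutorial. For comparison only, here is a hedged sketch of similar slot returns written in the W3C Semantic Interpretation for Speech Recognition (SISR) tag format, where each <tag> holds a small script that assigns to the out variable. Whether your platform accepts tag-format="semantics/1.0" is an assumption to check against its documentation, and the platform this tutorial targets may not.

```xml
<?xml version="1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" root="MYRULE"
         tag-format="semantics/1.0">

  <rule id="MYRULE" scope="public">
    <one-of>
      <!-- out.F_1 plays the role of the F_1 slot -->
      <item> gimli   <tag> out.F_1 = "dwarves"; </tag> </item>
      <item> aragorn <tag> out.F_1 = "men"; </tag> </item>
      <item> legolas <tag> out.F_1 = "elves"; </tag> </item>

      <!-- here the optional surname is scoped to frodo only -->
      <item>
        frodo
        <item repeat="0-1"> baggins </item>
        <tag> out.F_1 = "hobbits"; </tag>
      </item>
    </one-of>
  </rule>
</grammar>
```

No CDATA section is needed in this form, because the tag content contains no angle brackets.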











  ANNOTATIONS: EXISTING POSTS
darrenj
2/9/2006 7:33 PM (EST)
Riddle me this, Batman. In this tutorial:

1. The document references the grammar as XMLGrammar.cfm instead of XMLGrammar.xml.

2. It seems from looking at it that the optional utterances will be accepted for each declared utterance. This is in fact the case when tested.

This begs the question. Would I associate the optional utterance of baggins with just the utterance of frodo like:

  <item> frodo  <tag> <![CDATA[  <F_1 "hobbits">    ]]> </tag>
    <one-of>
      <item repeat="0-1"> baggins </item>
    </one-of>
  </item>

AND Why doesn't this work with o donnel like this?

  <item> rosie  <tag> <![CDATA[  <F_1 "cave trolls"> ]]> </tag>
    <one-of>
      <item repeat="0-1"> o donnel </item>
    </one-of>
  </item>
MattHenry
2/10/2006 2:47 PM (EST)

Hiya Darren,

I'd be happy to help clarify for you. Regarding the .cfm grammar reference, this is an artifact from my original test code (I use .cfm cache headers when working on code to make sure I'm not getting a stale copy when testing). In any case, I got this fixed in our internal build of the docs, which is set to be updated sometime in the next week or two.

On to your second question:

A) The optional utterances would indeed work for all grammar values; I didn't recall Rosie Baggins or Frodo O Donnel having any speaking roles in the trilogy, so I thought it was a fairly safe bet that these wouldn't be encountered by anyone.

I think you are close on your syntax; hard to tell exactly how you have your stuff nested from the code fragments, but for the edification of our developers, I'll post a full-blown sample that further subdivides the 'optionals' on a per-entry basis:


---------------------------------------

<?xml version= "1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US" root="INTRO">

<rule id="INTRO">
  <one-of>

    <!-- ************* -->

    <item>
      <one-of>
        <item> erik </item>
      </one-of>
      <tag> <![CDATA[ <MySlot "Ponch"> ]]> </tag>
      <item repeat="0-1"> estrada </item>
    </item>

    <!-- ************* -->

    <item>
      <one-of>
        <item> jon </item>
      </one-of>
      <tag> <![CDATA[ <MySlot "bon jovi"> ]]> </tag>
      <item repeat="0-1"> bon jovi </item>
    </item>

  </one-of>
</rule>
</grammar>

---------------------------------------

B) It is difficult to diagnose why this snippet isn't working, but I suspect that the values aren't nested properly in one-of/item tags one more level up.

Hope this helps clear things up!

~Matthew Henry
draftbeer80
12/17/2006 6:45 AM (EST)
Hi everybody!

I'm starting to play with external xml grammar.  And I need a little assistance.  First I'm gonna post the code of my test app (XMLGramTest.xml):

<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1">
  <form id="main">
    <field name="F_1">
      <prompt>Speak.</prompt>
      <grammar src="testGram.xml" type="application/grammar-xml"/>

      <noinput>Nothing heard.<reprompt/></noinput>
      <nomatch>No match found.<reprompt/></nomatch>
    </field>

    <filled>
      <prompt>
        You said.  <value expr="F_1$.interpretation.MySlot"/><break time="2000"/>
      </prompt>
    </filled>
  </form>
</vxml>

And here's the code for the external grammar (testGram.xml):

<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US" root="TOPLEVEL">
  <rule id="TOPLEVEL" scope="public">
    <one-of>
      <item><ruleref uri="#LOWLEVEL"/><tag><![CDATA[<MySlot $return>]]></tag></item>
    </one-of>
  </rule>

  <rule id="LOWLEVEL">
    <one-of>
      <item>hello<tag>return ("hi!")</tag></item>
    </one-of>
  </rule>
</grammar>

As you can see in the grammar code, we have two rules. The toplevel rule references the lowlevel rule. Now, what I really want to happen is simple. I want the string "hi", which is returned by the lowlevel rule after a match is found, to be passed up to the main rule and put into the MySlot slot. That way, it will be passed to the application using the grammar. I know there's something wrong with it, and I've been trying to figure it out for hours. I think it's time for me to ask for help from the geniuses. Hahah.

Thanks
Chris ;-D
MattHenry
12/18/2006 1:31 PM (EST)


Chris,

Can you please let me know which VXML platform that you are using to test this grammar? I don't want to spend time working on a solution that may not apply to the VXML version that you are using, (Voxeo-Motorola VXML vs. Prophecy-Voicecenter 7.0 VXML).

~Matt
draftbeer80
12/18/2006 8:15 PM (EST)
Hi Matt!

I am testing that application with Prophecy Voice Platform 7.0.  I installed it in my computer and I pointed it to a local webserver. 
I hope that helped. ;-D
VoxeoTony
12/18/2006 9:05 PM (EST)
Hello Chris,

With the platform details you sent in, we were able to test run some code with modifications for you. In the grammar section, we updated the code to define the slot more directly than the method you were using. Matt wanted me to get this new snippet to you.

<item>hello<tag><![CDATA[ <mySlot "hi"> ]]></tag></item>

In addition, with your main page you listed the line

$.interpretation.MySlot

But it is not needed for your code.  Without it the code worked fine.  Please give it a try and let us know your thoughts.

Regards,

Tony
draftbeer80
12/21/2006 12:35 AM (EST)
Hi!

I've tried the above suggestion. I'm sorry, but it doesn't work. At least, not the way I wanted it to. If you re-examine the code I posted above, what I really wanted is that, if the user says "hello", the main program should say "You said HI" instead of "You said HELLO". In my repeated trial and error, I always end up with the latter.

By the way, in the above grammar I posted, what does $return put into the MySlot slot? Because I get the result "You said UNDEFINED". Isn't $return supposed to have the value "hi"?

I appreciate so much your patience. 

Thanks.  ;-D
Chris


© 2008 Voxeo Corporation