Internet Engineering Task Force                              D. Cromwell
INTERNET DRAFT                                           Nortel Networks
File: draft-cromwell-navdec-mgcp-audio-pkg-00.txt
Date: November 1998
Expires: May 1999


                  A Syntax For The MGCP Audio Package


Status of this Document

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   To view the entire list of current Internet-Drafts, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern
   Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific
   Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).


Abstract

   MGCP (the Media Gateway Control Protocol) is a protocol for
   controlling a VOIP (Voice Over IP) gateway from an external call
   agent. The protocol defines a number of event packages, each of
   which defines a group of events supporting a particular type of
   gateway functionality. For example, there is a package for MF
   trunks, one for DTMF, and one for RTP. A gateway may support one or
   more packages depending on the functionality it provides. A
   Residential Gateway might support the Generic Media, DTMF, Line, and
   RTP packages while an MF Trunk Gateway could support the Generic
   Media, MF, DTMF, MF Trunk, and RTP packages.

   This document proposes a set of IVR-related events that constitute
   an MGCP Announcement Package for use by an Announcement Server
   Gateway. This event package provides support for the standard IVR
   operations of Play Announcement, Play Collect, and Play Record. It
   supports direct references to simple audio as well as indirect
   references to simple and complex audio. It also provides
   multi-language audio variables, control of audio interruptibility,
   digit buffer control, special key sequences, and support for
   reprompting during data collection.


Table of Contents

   1. Introduction
   2. Announcement Server Package
   3. Events
   4. Parameters
   5. Return Parameters
   6. Variable Qualifiers
   7. Key Qualifiers
   8. Examples
   9. References
   10. Author's Address


1. Introduction

   The following syntax supports both simple and complex audio
   structures. A simple audio structure might be a single announcement
   such as "Welcome to Bell South's Automated Directory Assistance
   Service." A complex audio structure might consist of an announcement
   followed by a voice variable followed by another announcement, for
   example "There are thirty seven minutes remaining on your prepaid
   calling card," where the number of minutes is a voice variable.

   There are two methods of specifying complex audio. The first is to
   directly reference the individual components. This requires a
   complete description of each component to be passed via the
   protocol. The second method is to provision the components on the
   Announcement Server as a single referenceable entity and to export
   that reference to the call agent. In this case, only the reference
   (plus any required variable-related data) is passed via the
   protocol, and no description of individual components is necessary.

   The Announcement Server Package provides significant functionality,
   which is controlled by specifying parameters. However, most
   parameters are optional, and where appropriate they default to
   reasonable values.
   An audio application that uses single references to (provisioned)
   complex audio structures, and which takes advantage of parameter
   optionality and defaults, can specify audio events using a minimum
   of syntax.


2. Announcement Server Package

   Package Name: A

   The Announcement Server Package consists of events, parameters,
   variable qualifiers, and key qualifiers.


3. Events

    ______________________________________________________________________
   | Symbol       | Definition             | R     | S      Duration      |
   |______________|________________________|_______|______________________|
   | pl(parms)    | Play Announcement      |       | TO     variable      |
   | pc(parms)    | Play Collect           |       | TO     variable      |
   | pr(parms)    | Play Record            |       | TO     variable      |
   |______________|________________________|_______|______________________|

   The events provided by the ANN Package are defined as follows:

   Play Announcement:
      Plays an announcement in situations where there is no need for
      interaction with the user. Because there is no need to monitor
      the incoming media stream, this event is an efficient mechanism
      for treatments, informational announcements, etc.

   Play Collect:
      Plays a prompt and collects DTMF digits entered by a user. If no
      digits are entered or an invalid digit pattern is entered, the
      user may be reprompted and given another chance to enter a
      correct pattern of digits.

   Play Record:
      Plays a prompt and records user speech. If the user does not
      speak, the user may be reprompted and given another chance to
      record.


4. Parameters

   The Play Announcement, Play Record, and Play Collect events may each
   be qualified by a string of parameters, most of which are optional.
   Where appropriate, parameters default to reasonable values. The only
   event with a required parameter is Play Announcement. If a Play
   Announcement event is not provided with a parameter specifying some
   form of playable audio, an error is returned to the application.
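
   The event syntax and the Play Announcement rule described above can
   be illustrated with a short Python sketch. It is not part of the
   package definition. The function names are hypothetical, the
   tokenizing rule (a "symbol:" token starts a new parameter, and a
   bare token is a further comma-separated field of the parameter
   before it) is an assumption inferred from the examples in this
   document, and "playable audio" is assumed to mean the parameters
   ps, ts, dt, si, tn, and vb from the Parameters table.

```python
import re

# Assumption: "playable audio" means one of these parameter symbols.
AUDIO_PARAMS = {"ps", "ts", "dt", "si", "tn", "vb"}
EVENTS = {"pl", "pc", "pr"}

_EVENT_RE = re.compile(r"^\s*([a-z]+)\((.*)\)\s*$")
_KEY_RE = re.compile(r"^([a-z]{2,3}):(.*)$")


def parse_event(text):
    """Split 'pl(ps:5, it:3)' into ('pl', [('ps', ['5']), ('it', ['3'])])."""
    match = _EVENT_RE.match(text)
    if match is None or match.group(1) not in EVENTS:
        raise ValueError(f"not an ANN package event: {text!r}")
    name, body = match.groups()
    params = []
    for token in body.split(","):
        token = token.strip()
        key_match = _KEY_RE.match(token)
        if key_match:
            # A 'symbol:value' token starts a new parameter.
            params.append((key_match.group(1), [key_match.group(2)]))
        elif params:
            # A bare (possibly empty) token is a further field of the
            # preceding parameter, e.g. 'usd' in 'vb:my,usd,en,1153'.
            params[-1][1].append(token)
        elif token:
            raise ValueError(f"value without a parameter symbol: {token!r}")
    return name, params


def check_play_announcement(text):
    """Parse an event and reject a Play Announcement with no audio."""
    name, params = parse_event(text)
    if name == "pl" and not any(key in AUDIO_PARAMS for key, _ in params):
        raise ValueError("Play Announcement needs a playable audio parameter")
    return name, params
```

   Under these assumptions, check_play_announcement("pl(ps:5)") is
   accepted, while check_play_announcement("pl(it:3)") raises an error
   because no playable audio is specified. The sketch does not handle
   values that themselves contain commas or colons.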

   The parameters, and the events to which each applies, are listed in
   the following table:

    ______________________________________________________________________
   |                              Parameters                              |
   |______________________________________________________________________|
   | Symbol    | Definition                      |   pl   |   pc   |  pr  |
   |___________|_________________________________|________|________|______|
   | ps        | provisioned segment             |   x    |   x    |  x   |
   | ts        | text to speech                  |   x    |   x    |  x   |
   | dt        | display text                    |   x    |   x    |  x   |
   | si        | silence                         |   x    |   x    |  x   |
   | tn        | tone                            |   x    |   x    |  x   |
   | vb        | variable                        |   x    |   x    |  x   |
   | it        | iterations                      |   x    |        |      |
   | iv        | interval                        |   x    |        |      |
   | du        | duration                        |   x    |        |      |
   | sp        | speed                           |   x    |   x    |  x   |
   | vl        | volume                          |   x    |   x    |  x   |
   | ip        | initial prompt                  |        |   x    |  x   |
   | rp        | reprompt                        |        |   x    |  x   |
   | nd        | no digits reprompt              |        |   x    |      |
   | ns        | no speech reprompt              |        |        |  x   |
   | fa        | failure announcement            |        |   x    |  x   |
   | sa        | success announcement            |        |   x    |  x   |
   | cb        | clear digit buffer              |        |   x    |  x   |
   | mx        | maximum # of digits             |        |   x    |      |
   | mn        | minimum # of digits             |        |   x    |      |
   | pa        | digit pattern                   |        |   x    |      |
   | fd        | first digit timer               |        |   x    |      |
   | id        | inter digit timer               |        |   x    |      |
   | ed        | extra digit timer               |        |   x    |      |
   | pr        | pre-speech timer                |        |        |  x   |
   | po        | post-speech timer               |        |        |  x   |
   | tr        | total recording length timer    |        |        |  x   |
   | rsk       | restart key                     |        |   x    |  x   |
   | rik       | reinput key                     |        |   x    |  x   |
   | rtk       | return key                      |        |   x    |  x   |
   | psk       | position key                    |        |   x    |  x   |
   | stk       | stop key                        |        |   x    |  x   |
   | sik       | start input key                 |        |   x    |  x   |
   | eik       | end input key                   |        |   x    |  x   |
   | na        | # of attempts                   |        |   x    |  x   |
   | ik        | interrupting key sequence       |        |   x    |  x   |
   | ap        | amount played                   |        |   x    |  x   |
   | dc        | digits collected                |        |   x    |      |
   | ri        | recording id                    |        |        |  x   |
   |___________|_________________________________|________|________|______|

   Parameters to the ANN package events are defined as follows:

   Text To Speech:
      Specifies a text string to be converted to speech.

   Display Text:
      Specifies a text string to be displayed on a device.

   Silence:
      Specifies a length of silence to be played, in units of 100
      milliseconds.

   Tone:
      Specifies a tone to be played by algorithmic generation. Exact
      specification of this parameter is TBD. Most tones will likely
      be recorded, not generated.

   Variable:
      Specifies a multi-language voice variable by type, subtype,
      language, and value.

   Provisioned Segment:
      Reference by id and possibly language to a provisioned sequence
      of recordings, spoken text, display text, silence, tones,
      variables, or other provisioned segments. An id shall be
      specified as a unique 32 bit binary integer.

      A provisioned segment could, for example, resolve to a single
      recording, a single voice variable, or a single tone. An example
      of a more complex provisioned segment is a recording followed by
      silence followed by a voice variable followed by another
      recording.

      Provisioned segments may reference other provisioned segments.
      For example, a provisioned segment could resolve to a recording
      followed by another provisioned segment. Direct or transitive
      definition of a provisioned segment in terms of itself must not
      be permitted.

      Speech components of provisioned segments shall support multiple
      languages. Language is encoded using the two letter codes
      defined in ISO standard 639, Code For The Representation Of
      Names Of Languages [4]. If a language indicator is not supplied,
      the segment is played in the system-defined default language.

   Iterations:
      The maximum number of times an announcement is to be played. A
      value of minus one indicates the announcement is to be repeated
      forever. Defaults to one if not specified.

   Interval:
      The interval of silence to be inserted between iterative plays.
      Specified in units of 100 milliseconds. Defaults to one second
      if not specified.

   Duration:
      The maximum amount of time to play and possibly replay an
      announcement. Takes precedence over iterations and interval.
      Specified in units of 100 milliseconds. No default.

   Speed:
      The relative playback speed of an announcement, specified as a
      positive or negative percentage variation from the normal
      playback speed.

   Volume:
      The relative playback volume of an announcement, specified as a
      positive or negative percentage variation from the normal
      playback volume.

   Initial Prompt:
      The initial announcement prompting the user to either enter DTMF
      digits or to speak. If not specified, the event immediately
      begins digit collection or recording.

   Reprompt:
      Played after the user has made an error such as entering an
      invalid digit pattern or not speaking. Defaults to Initial
      Prompt.

   No Digits Reprompt:
      Played after the user has failed to enter any digits during a
      Play Collect event. Defaults to Reprompt.

   No Speech Reprompt:
      Played after the user has failed to speak during a Play Record
      event. Defaults to Reprompt.

   Failure Announcement:
      Played when all data entry attempts have failed. No default.

   Success Announcement:
      Played when data collection has succeeded. No default.

   Clear Digit Buffer:
      If set to true, clears the digit buffer before playing the
      initial prompt. Defaults to false.

   Maximum # Of Digits:
      The maximum number of digits to collect. Defaults to one.

   Minimum # Of Digits:
      The minimum number of digits to collect. Defaults to one.

   Digit Pattern:
      An extended regular expression specifying a digit collection
      pattern. Uses extended regular expressions as supported by the
      Rogue Wave Class Library [6], which supports a subset of the
      POSIX.2 standard [7] for regular expressions. If not specified,
      pattern matching is not attempted.

   First Digit Timer:
      The amount of time allowed for the user to enter the first
      digit. Specified in units of 100 milliseconds.
      Defaults to five seconds.

   Inter Digit Timer:
      The amount of time allowed for the user to enter each subsequent
      digit. Specified in units of 100 milliseconds. Defaults to three
      seconds.

   Extra Digit Timer:
      The amount of time to wait for a user to enter a final digit
      once the maximum expected number of digits has been entered.
      Typically this timer is used to wait for a terminating key in
      applications where a specific key has been defined to terminate
      input. Specified in units of 100 milliseconds. If not specified,
      this timer is not activated.

   Pre-speech Timer:
      The amount of time to wait for the user to initially speak.
      Specified in units of 100 milliseconds. Defaults to three
      seconds.

   Post-speech Timer:
      The amount of silence necessary after the end of the last speech
      segment for the recording to be considered complete. Specified
      in units of 100 milliseconds. Defaults to two seconds.

   Total Recording Length Timer:
      The maximum allowable length of the recording, not including pre
      or post speech silence. Specified in units of 100 milliseconds.
      If not specified, this timer is not activated.

   Restart Key:
      Defines a command key, optionally followed by a sequence of
      keys, with the following action: discard any digits collected or
      recording in progress, replay the prompt, and resume digit
      collection or recording. No default.

   Reinput Key:
      Defines a command key, optionally followed by a sequence of
      keys, with the following action: discard any digits collected or
      recording in progress and resume digit collection or recording.
      No default.

   Return Key:
      Defines a command key, optionally followed by a sequence of
      keys, with the following action: terminate the current event and
      any queued event and return the terminating key sequence to the
      call processing agent.
      No default.

   Position Key:
      Defines a key with the following action: stop playing the
      current announcement and resume playing at the beginning of the
      first, last, previous, next, or current segment of the
      announcement. No default.

   Stop Key:
      Defines a key with the following action: terminate playback of
      the announcement. No default.

   Start Input Keys:
      Defines a set of keys that are acceptable as the first digit
      collected. This set of keys can be specified to interrupt a
      playing announcement or to not interrupt a playing announcement.
      Defaults to 0-9.

   End Input Key:
      Specifies a key that signals the end of user input. Also
      specifies whether or not the key is included in the collected
      digits. Defaults to #.

   Number Of Attempts:
      The number of attempts the user is given to enter a valid digit
      pattern or to make a recording. Defaults to one. Also used as a
      return parameter from the Play Collect and Play Record events,
      giving the number of attempts the user made.

   Amount Played:
      A return parameter from the Play Announcement, Play Collect, and
      Play Record events indicating the length of an interrupted
      announcement that was played before the interrupt. Specified in
      units of 100 milliseconds.

   Digits Collected:
      A return parameter from the Play Collect event indicating the
      digits that were collected.

   Recording Id:
      A return parameter from the Play Record event indicating the id
      of a recording that was made. Specified as a unique 32 bit
      binary integer.


5. Return Parameters

   Each event has an associated set of possible return parameters,
   which are listed in the following table. The parameters themselves
   are defined in the previous section.

    ________________________
   | Event | Parameters     |
   |_______|________________|
   | pl    | none           |
   | pc    | ip, lp, na, dc |
   | pr    | ip, lp, na, ri |
   |_______|________________|

   Here are some examples of how these parameters are used:

   The Play Announcement event completed successfully.

      O: pl()

   The Play Collect event completed successfully on the user's second
   attempt when the user entered the digits 04375182.

      O: pc(na:2, dc:04375182)

   The Play Record event was successful on the user's first attempt;
   the id of the recording made by the user is 983.

      O: pr(na:1, ri:983)


6. Variable Qualifiers

   Variables are specified by type, subtype, language, and value.
   Subtype is a refinement of type. For example, the variable type
   Money might have an associated range of subtypes such as Dollar,
   Rupee, Dinar, etc. Language is encoded using the two letter codes
   defined in ISO standard 639, Code For The Representation Of Names Of
   Languages [4]. If a language indicator is not supplied, the variable
   is played in the system-defined default language. A small excerpt
   from ISO 639 follows:

    _________________
   | Code | Language |
   |______|__________|
   | cs   | Czech    |
   | cy   | Welsh    |
   | da   | Danish   |
   |______|__________|

   Variables can be invoked directly as a parameter to an event:

      S: pl(vb:my,usd,en,1153)

   or as an element of provisioned audio:

      S: pl(ps:37,my,usd,en,1153)

   Not all variables require a subtype. In that case, the subtype is
   left blank:

      S: pl(vb:da,,en,101598)

   If a variable is to be played in the default language, the language
   indicator is left blank:

      S: pl(vb:da,,,101598)

   In some cases it may be desirable to play an announcement that
   contains an embedded variable without playing the variable itself.
   To do this, replace type, subtype, language, and value with a single
   dollar sign:

      S: pl(ps:37,$)
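
   The positional encoding of variables shown above (type, subtype,
   language, value, with blank fields for an absent subtype or the
   default language) can be sketched in Python as follows. This is a
   minimal illustration, not part of the package definition; the class
   and function names are hypothetical, and the functions produce only
   the event text that follows "S: " in the examples.

```python
from typing import NamedTuple, Optional


class Variable(NamedTuple):
    vtype: str           # variable type symbol, e.g. "my" or "da"
    value: str           # e.g. "1153" or "101598"
    subtype: str = ""    # left blank when the type has no subtype
    language: str = ""   # left blank to use the default language

    def fields(self) -> str:
        # Fields always appear in the order type,subtype,language,value.
        return ",".join((self.vtype, self.subtype, self.language, self.value))


def play_variable(var: Variable) -> str:
    """Play a variable passed directly as an event parameter."""
    return f"pl(vb:{var.fields()})"


def play_segment(segment_id: int, var: Optional[Variable] = None,
                 suppress: bool = False) -> str:
    """Play a provisioned segment, optionally with an embedded variable."""
    if suppress:
        # A single dollar sign plays the segment without its variable.
        return f"pl(ps:{segment_id},$)"
    if var is None:
        return f"pl(ps:{segment_id})"
    return f"pl(ps:{segment_id},{var.fields()})"
```

   For example, play_variable(Variable("da", "101598", language="en"))
   yields pl(vb:da,,en,101598), and play_segment(37, suppress=True)
   yields pl(ps:37,$).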

    _______________________________________________________________________
   |                          Variable Qualifiers                          |
   |_______________________________________________________________________|
   | Symbol  | Definition               |  Type  |  Subtype  | Subtype Of  |
   |_________|__________________________|________|___________|_____________|
   | da      | date                     |   x    |           |             |
   | di      | digits                   |   x    |           |             |
   | gn      | generic                  |        |     x     | di          |
   | na      | North American DN        |        |     x     | di          |
   | du      | duration                 |   x    |           |             |
   | my      | money                    |   x    |           |             |
   | nm      | number                   |   x    |           |             |
   | ca      | cardinal                 |        |     x     | nm          |
   | or      | ordinal                  |        |     x     | nm          |
   | si      | silence                  |   x    |           |             |
   | st      | string                   |   x    |           |             |
   | tx      | text                     |   x    |           |             |
   | dt      | display text             |        |     x     | tx          |
   | ts      | text to speech           |        |     x     | tx          |
   | ti      | time                     |   x    |           |             |
   | tw      | twelve hour format       |        |     x     | ti          |
   | tf      | twenty four hour format  |        |     x     | ti          |
   | tn      | tone                     |   x    |           |             |
   | wk      | weekday                  |   x    |           |             |
   |_________|__________________________|________|___________|_____________|

   Variable qualifiers are defined as follows:

   Date:
      Speaks a date specified as MMDDYY. For example, "101598" is
      spoken as "October fifteenth nineteen ninety eight."

   Digits:
      Speaks a string of digits one at a time. If the subtype is North
      American DN, the format of which is NPA-NXX-XXXX, the digits are
      spoken with appropriate pauses between the NPA and NXX and
      between the NXX and XXXX. If the subtype is generic, the digits
      are spoken with no pauses.

   Duration:
      Duration is specified in seconds and is spoken in one or more
      units of time as appropriate, e.g. "3661" is spoken as "One
      hour, one minute, and one second."

   Money:
      Money is specified in the smallest units of a given currency and
      is spoken in one or more units of currency as appropriate, e.g.
      "110" in U.S. Dollars would be spoken "one dollar and ten cents."
      The three letter codes defined in ISO 4217, Currency And Funds
      Code List [5], are used to specify the currency subtype. A small
      excerpt from ISO 4217 follows:

    __________________________________________________________
   | Alpha-code | Numeric-code | Currency | Entity            |
   |____________|______________|__________|___________________|
   | GQE        | 226          | Ekwele   | Equatorial Guinea |
   | GRD        | 300          | Drachma  | Greece            |
   | GTQ        | 320          | Quetzal  | Guatemala         |
   |____________|______________|__________|___________________|

   Month:
      Speaks the specified month, e.g. "10" is spoken as "October."
      Specification is in MM format with "01" denoting January, "02"
      denoting February, etc.

   Number:
      Speaks a number in cardinal form or in ordinal form. For
      example, "100" is spoken as "one hundred" in cardinal form and
      "one hundredth" in ordinal form.

   Silence:
      Plays a specified period of silence. Specification is in units
      of 100 milliseconds.

   String:
      Speaks each character of a string, e.g. "a34bc" is spoken "A,
      three, four, b, c."

   Text:
      Produces the specified text as speech or displays it on a
      device.

   Time:
      Speaks a time (specified in twenty four hour format) in either
      twelve hour format or twenty four hour format. For example,
      "1700" is spoken as "Five pm" in twelve hour format or as
      "Seventeen hundred hours" in twenty four hour format.

   Tone:
      Plays an algorithmically generated tone, specification of which
      is TBD. Most applications will probably use prerecorded tones.

   Weekday:
      Speaks the day of the week, e.g. "Monday." Weekdays are
      specified as single digits, with "1" denoting Sunday, "2"
      denoting Monday, etc.


7. Key Qualifiers

   The Start Input parameter, which defines a set of keys that are
   acceptable as the first digit collected, can be qualified to specify
   whether or not they interrupt a playing announcement.
   The End Input parameter, which specifies a key that signals the end
   of user input, can be qualified to specify whether or not it is
   included in the collected digits that are sent to the call agent.
   These qualifiers are presented in the following table:

    _______________________________________________________________________
   |                            Key Qualifiers                             |
   |_______________________________________________________________________|
   | Symbol          | Definition                    | Qualifies           |
   |_________________|_______________________________|_____________________|
   | ni              | Non-interruptible             | sik                 |
   | in              | Interruptible                 | sik                 |
   | ex              | Exclude key                   | eik                 |
   | ic              | Include key                   | eik                 |
   |_________________|_______________________________|_____________________|


8. Examples

   Play a provisioned audio segment in the default language:

      S: pl(ps:5)

   Play the same provisioned audio segment in English:

      S: pl(ps:5,en)

   Play a sequence of three provisioned segments:

      S: pl(ps:5, ps:6, ps:7)

   Play three seconds of silence:

      S: pl(si:30)

   Play text as speech, specifying the text as a parameter:

      S: pl(ts:hello)

   Display text on a device, specifying the text as a parameter:

      S: pl(dt:hello)

   Play "Eleven dollars and fifty three cents" in English, specifying
   the variable as a parameter:

      S: pl(vb:my,usd,en,1153)

   Play "October fifteenth, nineteen ninety eight" in English,
   specifying the variable as a parameter.
   This is an example of a variable without a subtype:

      S: pl(vb:da,,en,101598)

   Play a segment followed by 1 second of silence, followed by a
   variable date, followed by another segment, specifying this sequence
   using parameters:

      S: pl(ps:45, si:10, vb:da,,en,101598, ps:534)

   The same operation as above, except that the sequence of segment,
   silence, variable, and segment is defined in data as provisioned
   segment 37:

      S: pl(ps:37,da,,en,101598)

   Play an announcement 10% faster than normal speed and 5% softer than
   normal volume:

      S: pl(ps:7, sp:+10, vl:-5)

   Play an announcement three times with two seconds of silence between
   plays:

      S: pl(ps:98, it:3, iv:20)

   The same operation as above, except that the operation is terminated
   after twenty seconds:

      S: pl(ps:98, it:3, iv:20, du:200)

   Clear the digit buffer before playing the prompt, and give the user
   two attempts to enter a three digit pattern, each digit being in the
   range 0-9. The user can signal end of input using the # key.

      S: pc(ip:87, cb:true, mx:3, na:2)

   Give the user three chances to enter a three digit pattern, each
   digit of which is in the range 0-9. If the first or second attempt
   fails (for example, if the user enters just one or two digits), a
   reprompt is played. If the user enters no digits on the first or
   second attempt, a no digits reprompt is played.
   If all three attempts fail, a failure announcement is played. If one
   of the attempts is successful, a success announcement is played. The
   user can signal end of input using the # key.

      S: pc(ip:87, rp:5, nd:409, fa:9, sa:18, mx:3, na:3)

   Give the user a single attempt to enter a 4 digit pattern, each
   digit in the range 0-9. Allow 8 seconds for the user to enter the
   first digit, and allow 6 seconds for the user to enter each
   subsequent digit. The user can signal end of input using the # key.

      S: pc(ip:4, fd:80, id:60, mx:4)

   Give the user one chance to enter 2 digits where the first digit is
   3, 4, or 5 and the second digit is any digit except 5, 6, or 7.

      S: pc(ip:8, pa:[3-5][^567])

   Give the user three chances to enter an 11 digit number that begins
   with 0 or 1. If the user makes a mistake while entering digits, the
   user can press the * key to discard any digits already collected,
   replay the prompt, and resume collection.

      S: pc(ip:33, mx:11, sik:0-1,ic, rsk:*, na:3)

   Give the user three chances to enter an 11 digit number that begins
   with 0 or 1. If the user makes a mistake while entering digits, the
   user can press the key sequence *11 to discard any digits already
   collected, replay the prompt, and resume collection. If the user
   enters the key sequences *12, *13, or *14, the current event is
   terminated along with any queued event, and the terminating key
   sequence is returned to the call agent for processing.

      S: pc(ip:33, mx:11, sik:0-1,ic, rsk:*,11, rtk:*,12,13,14, na:3)

   Give the user two chances to make a recording. After playing the
   prompt, wait 5 seconds for the user to speak, otherwise replay the
   initial prompt and try again. If the user does speak, wait for seven
   seconds after speech stops to make sure the user is finished, and
   return a reference to the recording to the call agent.

      S: pr(ip:6, pr:50, po:70, na:2)


9. References

   [1] Bradner, S., "Key words for use in RFCs to Indicate Requirement
       Levels", BCP 14, RFC 2119, March 1997.

   [2] Arango, M., Dugan, A., Elliott, I., Huitema, C., and Pickett,
       S., "Media Gateway Control Protocol (MGCP)", Version 0.1,
       November 9, 1998, INTERNET DRAFT (this is a work in progress
       and currently has no status as a standard).

   [3] Cromwell, D., Durling, M., "Requirements For Control Of A Media
       Services Function", Version 0.0, November, 1998, INTERNET DRAFT
       (this is a work in progress and currently has no status as a
       standard).

   [4] ISO 639, "Code For The Representation Of Names Of Languages",
       1998.

   [5] ISO 4217, "Currency And Funds Code List", 1981.

   [6] Tools.h++ Class Reference Version 7, Rogue Wave Software Inc.,
       1996.

   [7] ANSI/IEEE Standard 1003.2 (Portable Operating System
       Interface), Version D11.2, September 1991.


10. Author's Address

   David Cromwell
   Nortel Networks
   Box 13010
   Research Triangle Park, NC 27709

   Phone: (919) 992-1373