Internet Engineering Task Force                       Janusz Dobrowolski
Internet Draft                                              Kumar Vemuri
draft-dobrowolski-voip-cm-01.txt                     Lucent Technologies
Nov 10, 2000
Expires: May 10, 2001
IPTel Working Group


   Internet-based Service Creation and the Need for a VoIP Call Model


STATUS OF THIS MEMO

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract:

   This Internet Draft explains the concept of a Telephony Call Model
   in some detail using a simple example. We then explore how support
   for a common Call Model among a plurality of network nodes, each of
   which performs service processing functions, facilitates more rapid
   service creation and promotes feature transparency.

1.0 Introduction:

   A previous Internet-Draft [1] titled "Call Model For IP Telephony"
   raised a number of issues related to the service aspects of the IP
   Telephony Call Model. This I-D attempts to clarify this area,
   mainly by studying some of those issues in greater detail.

   Today the Internet, through VoIP technology, supports a viable means
   of making and receiving telephone calls over packet networks. The
   main goals of VoIP telephony include:

   I.   Making the IP telephony paradigm as close as possible to the
        Internet paradigm.

   II.  Making the telephony service creation paradigm as close as
        possible to the Internet service creation environment.

J. Dobrowolski et al          Internet Draft                    [Page 1]

Internet-based Service Creation and the Need for a VoIP Call Model

   III. Enabling IP Telephony Enhanced Services to execute identically,
        in terms of end-user look and feel, regardless of the service
        provider execution platform.

   Several benefits are expected from the use of this technology as a
   result of the above focus. Some of these include:

   a. New and powerful Internet-based service creation mechanisms and
      paradigms can be more easily supported, thus permitting
      independent third parties (including, but not limited to,
      telecommunications companies) to create and deploy newer, more
      attractive multimedia services. (Ref I & II above).

   b. There is a low barrier to entry and a shorter learning curve (as
      compared to application development in the telephony industry),
      and there are orders of magnitude more developers in the Internet
      space than in the telecom space. Thus more services can be
      deployed in shorter intervals. Internet mechanisms are more open.
      Several industry fora such as PARLAY [2], OSA [3] and JAIN [4]
      are targeting a regulated means for opening up telephone networks
      to external developers through secured, controlled mechanisms.
      This makes network services more accessible to third-party
      programmers, and enables IP Telephony Enhanced Services to
      execute identically, in terms of end-user look and feel,
      regardless of the service provider execution platform. (Ref III
      above).

   c. Various standardization bodies and partnership projects are
      working on assuring an identical look and feel of a service to an
      end user. One of the approaches is to use the Virtual Home
      Environment (VHE). Without going into the details and definitions
      of VHE, one can state that at least one of the VHE approaches
      advocates service emulation in the visited network.
      This is a disadvantage as compared to the Internet paradigm,
      where the end user typically accesses the same service regardless
      of the Internet access provider. (Ref I above).

   In the rest of this document, we briefly study Internet Service
   Creation Environments and then present how support for a common Call
   Model in the VoIP domain would be beneficial from the service
   creation and execution perspective.

2.0 Internet Service Creation Environments:

   SIP [5] and H.323 [6] are the call signaling protocols in most
   common use in today's packet networks for VoIP. Each has its own
   service creation mechanisms.

   H.323 has the H.450.x [7] series of standards, each of which is
   targeted at the additional signaling required for each new service.
   H.323 also supports service deployment on gatekeepers when the
   "gatekeeper-routed call signaling" model is used.

   SIP supports three different service creation techniques: the SIP
   CGI [8], CPL [9] and Servlet [10] mechanisms. One of these
   techniques, namely the SIP CPL, was designed to be signaling
   architecture and protocol agnostic. In other words, it is capable of
   supporting both H.323- and SIP-compliant network nodes.

   To quote from the specification: "If an output to a node is not
   specified, it indicates that the CPL server should perform a node-
   or protocol-specific action. Some nodes have specific default
   actions associated with them; for others, the default action is
   implicit in the underlying signalling protocol, or can be configured
   by the administrator of the server."

3.0 Signaling Protocol Independence for Service Implementation:

   This capability for services to function in a protocol or signaling
   architecture independent manner is fairly important.
   There may exist protocol-level differences that make the execution
   of certain services impossible within certain contexts, and a
   properly thought-out protocol-agnostic design can lessen
   discrepancies in application behavior.

   Let us consider a simple example. Assume that the following
   application has been implemented as a CPL script: "If the waiting
   time for a call to be answered exceeds 15 sec, and the caller is
   calling from a multimedia terminal, then present the caller with an
   offer of an animated commercial displayed before or after the call
   is connected, in exchange for a 10% discount on the call". (Say the
   caller has to click on a button or otherwise interact with the "Ad
   Applet" on his screen to obtain the discount).

   Now assume that one of the execution platforms is not aware of the
   "StableCall" (Call Active) state. Implementation of this feature
   would then be impossible on this platform. (You cannot give someone
   a discount without knowing whether or when the call was set up or
   torn down).

   This also brings up the point that it is important to track states
   as call processing proceeds, for it is by such close state-tracking
   that one can provide new and innovative services to improve the
   quality of user experience. Systems that execute enhanced services
   today implement such state machines, called Call Models. Some VoIP
   protocols like SIP also support deployments which are edge-centric
   in terms of call-model deployment (both the UAS (User Agent Server)
   and the UAC (User Agent Client) implement state machines, but at the
   end-points).

4.0 Call Model Explanation:

   The example which follows is presented to make the point that a Call
   Model always exists, even if the State Machine is not physically
   implemented. The example also explains a relationship between
   protocol and signaling.
   Definition: Call Model - A Call Model is an abstract representation
   of user and/or terminal and/or network expectations built during the
   process of establishing, progressing and terminating a call. A Call
   Model is most conveniently represented using the notation of a
   graphical Finite State Machine (Moore and Mealy state machines are
   commonly used).

   While the terms "Call Model", "Call State Machine" and "Finite State
   Machine" are often used interchangeably, it is worthwhile to discuss
   the subtle differences between these terms, for clarity.

   Finite State Machines or FSMs embody processing within states and
   state transitions supported by logical entities within particular
   systems. FSMs can be used to describe the states of a large number
   of different kinds of systems. Complete system behavior can be
   represented by a collection of cooperating Finite State Machines.

   One such application of Finite State Machines is in modeling the
   states associated with a call. In such cases, we call these "Call
   State Machines". Call state machines may represent call state from
   the user, network, or terminal perspective. A related application of
   FSMs lies in modeling the states of a protocol; "Protocol State
   Machines" are a result of this use.

   A Call Model is a set of call state machines which, when taken
   together, represents the complete state of a given call. Depending
   on what kind of call model is being used, and the number of call
   state machines needed or supported, the call model may comprise one,
   two or more call state machines for a single two-party call. A Call
   Model contains FSM states which are abstract in the sense of not
   necessarily being associated with any logical or physical entity.

   For example, in traditional telephony systems today, a "half-call"
   model is used.
   Thus, there are two FSMs on every switch involved in a two-party
   call: one models the state of the "originating" half-call,
   associated with the calling party; the other models the state of the
   "terminating" half-call, tied to the call state of the called party.
   Thus, there is a single call model, and two different call state
   machines. (See Figure 0).

  Calling Party's                        Called Party's
     Telephone                             Telephone
       *====*                                *====*
        |  |                                  |  |
        +--+  +---------+       +---------+   +--+
         |    | {O-FSM} |       | {O-FSM} |    |
         +----+         +-------+         +----+
              | {T-FSM} |       | {T-FSM} |
              +---------+       +---------+
               Switch 1          Switch 2

   Figure 0: Call Model Example: Half-call Model.

   We explain the concept of a Call Model in very simple terms, using
   the example of the "two cans, one string" model that small children
   play with to communicate. This model consists of two cans, A and B,
   that each have a small hole drilled in the bottom. A thread or
   string is pulled through the holes, connecting the cans together,
   and knotted on the inside. We embellish this simple model with a
   bell fastened to each of the two cans.

   Operation is simple. The two cans are placed on two tables. Alice
   (aged 8) wants to call Bob (aged 7). She picks up one of the cans
   and tugs lightly at the string. Bob's can moves, and the attached
   bell rings. Bob picks up his can and (assuming the string is taut)
   conversation commences. Eventually, one of them puts their can down,
   and the "call" is terminated.

   +///---+                   +-///--+
   |can A |-------------------|can B |
   +------+                   +------+

   Can A and can B with attached bells rest on a counter top
   (Idle, NULL state).
   Alice picks up can A. (Off-hook).

   +-///--+                   +--///-+
   |can A |-------------------|can B |
   +------+                   +------+
                <--------

   Alice gently tugs on the string. (Call Sent).
   Can B "rings" (Alerting)

   +-///--+     -------->     +--///-+
   |can A |-------------------|can B |
   +------+     <-------      +------+

   Bob picks up can B. (Connected).
   Communication commences (Active Call).

   +///---+                   +-///--+
   |can A |-------------------|can B |
   +------+                   +------+

   Alice or Bob puts down his or her can. (Disconnected).
   The other party hangs up as well.

   Figure 1: Simple Call Model example. "Two Cans, One String" model.
   The comments in parentheses represent call states for this simple
   case.

   This very simple example illustrates several very important aspects
   of a Call Model and the finite state machine associated with a call.
   Even though it is minimalistic in scope, it still supports several
   states. For instance:

   a. Initially both cans are on their respective tables. No call
      exists. This is the NULL or IDLE state.

   b. Alice picks up her can; she has just "gone off-hook" (of course,
      there is no dial tone, and no dialing of digits).

   c. Alice tugs lightly at the string. This is similar to "Call Sent"
      in some of the existing telephony Call Models.

   d. Bob's can "rings". "Alerting".

   e. Bob answers, and conversation ensues. "Call Active".

   f. Alice or Bob places their can back on the counter.
      "Disconnected".

   Admittedly, this is a very simplistic model. In practice, more
   states would be added in order to widen the classes of applications
   that could be implemented. But the basic concept of the Call Model
   remains the same.

   The above example shows that a mutually agreed-to Call Model exists,
   with simple state machines (agreed-upon procedures) "executed" by
   the callee and caller. This also indicates that the Call Model is a
   higher level of agreement (more fundamental) than protocol or
   signaling.
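
   As a purely illustrative sketch (not part of any specification),
   states a. through f. above can be written down as a transition
   table in Python. The event names, and the return to the idle state
   once the other party also hangs up, are assumptions made for this
   example.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()          # a. both cans on the table
    OFF_HOOK = auto()      # b. Alice has picked up can A
    CALL_SENT = auto()     # c. Alice has tugged the string
    ALERTING = auto()      # d. can B "rings"
    ACTIVE = auto()        # e. Bob has answered
    DISCONNECTED = auto()  # f. one party has put a can down


class Event(Enum):
    # Hypothetical event names chosen for this sketch.
    PICK_UP_A = auto()
    TUG = auto()
    BELL_RINGS = auto()
    PICK_UP_B = auto()
    PUT_DOWN = auto()
    OTHER_PUT_DOWN = auto()


# (current state, event) -> next state
TRANSITIONS = {
    (State.IDLE, Event.PICK_UP_A): State.OFF_HOOK,
    (State.OFF_HOOK, Event.TUG): State.CALL_SENT,
    (State.CALL_SENT, Event.BELL_RINGS): State.ALERTING,
    (State.ALERTING, Event.PICK_UP_B): State.ACTIVE,
    (State.ACTIVE, Event.PUT_DOWN): State.DISCONNECTED,
    (State.DISCONNECTED, Event.OTHER_PUT_DOWN): State.IDLE,
}


def step(state, event):
    """Return the next state, or raise if the model has no such transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event.name} not allowed in state {state.name}"
        ) from None


def run(events, state=State.IDLE):
    """Apply events in order and return the list of states visited."""
    trace = [state]
    for event in events:
        state = step(state, event)
        trace.append(state)
    return trace
```

   Feeding run() the six events in the order listed walks the call
   from IDLE through ACTIVE and back to IDLE; an event that the table
   does not allow in the current state (for example a tug while both
   cans are still on the table) is rejected rather than silently
   ignored.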

                            +----------------+
                            |  Origination   |
   Originating     +------->|    Attempt     |------>+  Called_Party
   _Terminal       |        +----------------+       |  _Selected
   _Seized         |                                 |  (Chosen string
   (Alice's can up)|                                 |   to Bob)
                   |                                 |
                   |                                 V
          +----------------+                +----------------+
          |      Idle      |                |   Addressing   |
          |     State      |                |                |
          +----------------+                +----------------+
   Disconnect      ^                                 |  Called_Party
   _Acknowledge    |                                 |  _Alerted
   (Alice's can    |                                 V  (Pulled string
   down)  +----------------+                +----------------+ to Bob)
          |   Disconnect   |                |    Alerting    |
          |                |                |                |
          +----------------+                +----------------+
                   ^                                 |
                   |                                 |
   Disconnect      |                                 |  Connect
   _Request        |                                 |  _Acknowledge
   (Bob's can      |        +----------------+       |  (Bob lifts his
   down)           +--------|     Stable     |<------+   can)
                            |      Call      |
                            +----------------+

   Figure 2: An Example Call Model (Network Perspective). Alice is the
   Caller, Bob the Callee. Bob disconnects the call after
   communication. ("Cans and String" model).

   Let's assume that tugging the string was replaced with raising a
   hand. This would be an example of retaining the same Call Model and
   the same Finite State Machine while changing the signaling.

   Now let's assume that Bob always keeps the can next to his ear and
   ignores the tug on the string. This would change the Call Model and
   the State Machines. Signaling (tugging) would not be affected,
   though it would become somewhat redundant (not really needed).

   In the cans-and-string example, the Call Model state machines are
   (logically) executing within the minds of the two communicating
   parties. They monitor external behavior (signaling), and determine
   what actions are required of them to make the communication work.
   Once these simple actions are carried out, communication commences.
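
   The point that signaling can change while the Call Model stays the
   same can be sketched in Python as follows. This is a minimal
   illustration under stated assumptions: the three-state model is a
   deliberate simplification of Figure 2, and the signal and event
   names are hypothetical.

```python
# Abstract call-model events (hypothetical names for this sketch).
SETUP, ANSWER, RELEASE = "setup", "answer", "release"

# One shared call model: state -> {event -> next state}.
# A simplification of Figure 2 that keeps only three states.
CALL_MODEL = {
    "Idle": {SETUP: "Alerting"},
    "Alerting": {ANSWER: "StableCall"},
    "StableCall": {RELEASE: "Idle"},
}

# Two signaling "vocabularies" that map onto the same abstract events.
STRING_SIGNALING = {
    "tug string": SETUP,
    "lift can": ANSWER,
    "put can down": RELEASE,
}
HAND_SIGNALING = {
    "raise hand": SETUP,
    "lift can": ANSWER,
    "put can down": RELEASE,
}


def run_call(signals, signaling, state="Idle"):
    """Translate each signal to an event and drive the shared call model."""
    visited = [state]
    for signal in signals:
        event = signaling[signal]          # signaling layer
        state = CALL_MODEL[state][event]   # call model layer
        visited.append(state)
    return visited
```

   Running the model with the string vocabulary and with the raised-
   hand vocabulary visits the same sequence of states, because only
   the mapping from signals to events differs. Changing CALL_MODEL
   itself (for example, removing the Alerting state because Bob always
   holds the can to his ear) is the other case described above: the
   signals still exist, but the model that interprets them is a
   different one.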
   In real networks it is the finite state machine executing on every
   node in the network that satisfies user, network and end-point
   expectations with respect to signaling interactions during call
   setup, processing, stable state, possible roaming state in a
   wireless network, and tear-down. It should be noted that though the
   finite state machines executing on different network nodes may be
   different, the most consistent user experience results when the
   state machines comply with the same Call Model.

5.0 Services-related Considerations:

   It therefore stands to reason that if a service creation model were
   built to directly leverage the states of a given Call Model, then
   there would be no deviation from user-expected behavior, provided
   every network node complied with the same Call Model.

   For example, suppose a service creation environment in common use
   supported the development of scripts that instruct network nodes
   what to do when certain events (say Ea, Eb1, Eb2 and Ec) occur
   during call processing in states A, B and C of a Call Model
   respectively. Unless every node that executes such a script does
   indeed support each of these states, the behavior that results from
   executing the script may depend on the state machine upon which the
   script-suggested changes are invoked (say feature X requires that
   the call transition from state A to B on event Ea, but the node
   implements a Call Model with no state B!).

   As another example, if a feature "Call Forward on No Answer" were to
   be deployed into a network, but one or more network call processing
   nodes had no support for a NoAnswerTimer (that times out if a call
   is not answered after a certain number of rings), then those nodes
   would be incapable of supporting the feature, simply because they
   would be incapable of timing out were a call not answered within a
   pre-specified time window.

   The CPL specification does consider this issue.
   To quote from the specification: "The language is also designed so
   that a server can easily confirm scripts' validity at the time they
   are delivered to it, rather than discovering them while a call is
   being processed."

   Network nodes executing a Call Model could easily validate scripts
   by comparing the set of call states and corresponding actions to be
   taken against the set of states contained within the supported Call
   Model. However, a common Call Model has the additional advantage
   that script developers can confidently build a script that they know
   will execute seamlessly on ALL network nodes, even across boundaries
   of administrative domains.

   Features (and feature developers) often assume a certain level of
   "similarity" between network nodes, and elements that violate this
   assumption might lead to unexpected changes in application behavior.
   In other words, a given feature may execute differently across
   different nodes unless some commonality between these network nodes
   is guaranteed. That "commonality" is precisely what the Call Model
   offers. This makes the case for a common Call Model in the VoIP
   domain.

   It is to be further noted that as the number of states in this Call
   Model state machine increases (up to a threshold beyond which the
   rise in complexity offsets these benefits), there is a greater
   opportunity for service developers to add more services. Services
   add value during various stages of call processing and thereby
   positively impact the user experience. This suggests that we adopt a
   Call Model with a large number of states as the base Call Model in
   the VoIP domain, while ensuring that the adopted model satisfies
   multimedia considerations, so that newer kinds of services may be
   more efficiently supported.
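
   A delivery-time check of the kind discussed in this section might
   look like the following Python sketch. It is not the CPL validation
   procedure; the rule format (state, event, next state) and all names
   are assumptions made for illustration, reusing states A, B and C
   and events Ea, Eb1, Eb2 and Ec from the example earlier in this
   section.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """One script instruction: in `state`, on `event`, move to `next_state`."""
    state: str
    event: str
    next_state: str


def missing_states(script, supported_states):
    """Return, sorted, every state the script uses that the node lacks."""
    required = {rule.state for rule in script}
    required |= {rule.next_state for rule in script}
    return sorted(required - set(supported_states))


def accept_script(script, supported_states):
    """Accept a script at delivery time, or reject it before any call runs."""
    missing = missing_states(script, supported_states)
    if missing:
        raise ValueError("unsupported call states: " + ", ".join(missing))
    return True


# Hypothetical "feature X": it needs a transition from A to B on Ea.
FEATURE_X = [
    Rule("A", "Ea", "B"),
    Rule("B", "Eb1", "B"),
    Rule("B", "Eb2", "C"),
    Rule("C", "Ec", "A"),
]

NODE_WITH_COMMON_MODEL = {"A", "B", "C"}
NODE_WITHOUT_STATE_B = {"A", "C"}
```

   A node whose Call Model contains A, B and C accepts FEATURE_X, while
   a node with no state B rejects it when the script is delivered
   rather than misbehaving in the middle of a call. With a common Call
   Model, the supported set is the same on every node, so a script
   that passes this check once passes it everywhere.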

   The difficulty of retaining an identical look and feel in existing
   telephony (sometimes also called feature transparency) is directly
   related to the proliferation of different variations of Call Models.
   Minor variances may be tolerated, so long as they do not impact
   service execution. For instance, the Internet supports several
   TCP/IP variants, but the State Models used are similar enough in
   most cases to permit seamless interoperation [11].

   SIP CPL is designed to be protocol and signaling architecture
   agnostic. However, the interpretation of the tags themselves when
   CPL scripts are delivered is up to the target nodes that receive the
   scripts, based on the protocols they execute. Thus, though the
   language itself factors out signaling-level considerations, the
   resulting behavior at script execution may still be a function of
   the underlying signaling protocol. A common Call Model would help
   ensure that the expected behavior results most, if not all, of the
   time. It may not solve all problems in that regard, but it is a
   major step in the direction of ensuring uniform service behavior
   across the network.

6.0 Conclusion:

   A common Call Model, when deployed for VoIP, could significantly
   help accelerate the development of services by third-party
   application developers writing to Internet paradigms. This would
   also support seamless operation of services across a diverse set of
   network nodes, and in Multi-Vendor Environments (MVEs).

7.0 Security Considerations:

   This draft addresses general Call Model requirements in the IP
   Telephony domain. At this level of discussion, specific security
   considerations do not apply.

   More generally, it stands to reason that the signaling exchanges
   between IP-based network entities should be secured from tampering
   to ensure that call setup, communication, and tear-down take place
   in a manner that is consistent with user-expected behavior.
   Security mechanisms in common use in the Internet domain today could
   be employed to ensure that these requirements are met.

8.0 References:

   [1]  Call Model For IP Telephony, IETF Internet Draft, Janusz
        Dobrowolski et al, IPTel Working Group, Work in Progress.

   [2]  PARLAY,

   [3]  3rd Generation Partnership Project, Technical Specification
        Group Services and Systems Aspects, "Virtual Home Environment/
        Open Services Architecture", 3GTS 23.127, July 2000.

   [4]  JAIN,

   [5]  Handley, et al, 'SIP: Session Initiation Protocol', RFC 2543,
        Internet Engineering Task Force, March 1999.

   [6]  Recommendation H.323 (02/98) - Packet-based multimedia
        communications systems

   [7]  Recommendation H.450.1 (02/98) - Generic functional protocol
        for the support of supplementary services in H.323

   [8]  Common Gateway Interface for SIP, J. Lennox, J. Rosenberg,
        H. Schulzrinne, IETF Internet-Draft. Work in progress.

   [9]  CPL: A Language for User Control of Internet Telephony
        Services, J. Lennox, H. Schulzrinne, IETF Internet-Draft. Work
        in progress.

   [10] The SIP Servlet API, A. Kristensen, A. Byttner, IETF
        Internet-Draft, Expires March 2000. Work in progress.

   [11] Why we don't know how to simulate the Internet (Section 4.2),
        Sally Floyd and Vern Paxson, ACIRI, Berkeley, CA, October 11,
        1999.

9.0 Authors' addresses:

   Janusz Dobrowolski,
   Lucent Technologies,
   263 Shuman Blvd.
   Naperville, IL 60566 USA.
   jdobrowolski@lucent.com

   Kumar Vemuri,
   Lucent Technologies,
   263 Shuman Blvd.
   Naperville, IL 60566 USA.
   vvkumar@lucent.com

10.0 Acknowledgments:

   The authors would like to thank John Stanaway and John Voelker for
   reading through earlier versions of the draft and providing some
   insightful comments. We would also like to thank Milo Orsic for
   interesting discussions on this and related topics.

11.0 Full Copyright Statement:

   Copyright (C) The Internet Society (2000). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.