Network Working Group J. Babiarz Internet-Draft X-G. Liu Intended status: Informational K. Chan Expires: December 31, 2007 Nortel M. Menth University of Wuerzburg June 29, 2007 Three State PCN Marking draft-babiarz-pcn-3sm-00 Status of this Memo By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This Internet-Draft will expire on December 31, 2007. Copyright Notice Copyright (C) The IETF Trust (2007). Abstract This document proposes metering and marking mechanisms for PCN- enabled nodes to label packets with pre-congestion information. The marker marks all PCN packets with an admission-stop (AS) codepoint if the PCN traffic rate on a link exceeds its admissible rate (AR) and when it exceeds its supportable rate (SR), it marks some of those Babiarz, et al. Expires December 31, 2007 [Page 1] Internet-Draft Three State PCN Marking June 2007 packets exceeding SR with an excess-traffic (ET) codepoint. The flows with ET-marked packets will be terminated until the aggregate PCN traffic on the path decreases below its SR. This document proposes metering and marking mechanisms for these objectives. 
Table of Contents

   1.  Introduction
     1.1.  Requirements Notation
     1.2.  Overview of PCN
     1.3.  Terminology used in this Document
   2.  Three State PCN Marker with Marking Frequency Reduction for
       Marked Flow Termination
     2.1.  SR-Meter and ET-Marker
       2.1.1.  Mechanism for SR-Metering and ET-Marking
       2.1.2.  Pseudo Code for the SR-Meter and ET-Marker
       2.1.3.  Configuration of the SR-Meter and ET-Marker
       2.1.4.  Characteristics of the Proposed SR-Meter and ET-Marker
               Behaviour
     2.2.  AR-Meter and AS-Marker
       2.2.1.  Mechanism for AR-Metering and AS-Marking
       2.2.2.  Pseudo Code for AR-Meter and AS-Marker
       2.2.3.  Configuration of the AR-Meter and AS-Marker
       2.2.4.  Characteristics of the Proposed AR-Metering and
               AS-Marking Behaviour
     2.3.  Marking Codepoints
   3.  Security Considerations
   4.  Changes from Previous Revision
     4.1.  Acknowledgements
   5.  Informative References
   Appendix A.  Overview of Token Bucket (TB) and Virtual Queue (VQ)
     A.1.  Virtual Queue (VQ)
     A.2.  Token Bucket (TB)
     A.3.  Tail Marking
     A.4.  Tail Marking with Marking Frequency Reduction
     A.5.  Threshold Marking
     A.6.  Related Work
   Appendix B.  Discussion of PCN Characteristics
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   Pre-Congestion Notification (PCN) builds on the concepts of [RFC3168], "The addition of Explicit Congestion Notification (ECN) to IP". It is used to implement admission control and flow termination for real-time flows (such as voice, video and multimedia streaming) in DiffServ [RFC2474], [RFC2475] enabled networks. Flow admission control determines whether a new flow can be added into the network without overloading any of its links, whereas flow termination reduces the current PCN traffic load by terminating marked flows when at least one link in the network is overloaded for some reason.

   For each link in a PCN-enabled network an admissible rate (AR) is configured. When the current PCN traffic rate on a link exceeds its AR, the link is said to be AR-overloaded. Then, the corresponding PCN node re-marks all PCN packets on this link with an "admission-stop" (AS) codepoint. The PCN egress nodes analyze the packet markings and if sufficiently many packets are AS-marked within an ingress-egress aggregate, the PCN egress nodes signal "admission-stop" for this aggregate to the appropriate admission control entity. The objective is to prevent the admitted PCN traffic on the links from exceeding their ARs. When the PCN egress nodes stop receiving AS-marked packets, they signal "admission-continue" after some time.

   Similarly, a supportable rate (SR) is configured for each link in a PCN-enabled network. When the current PCN traffic rate on a link exceeds its SR, the link is said to be SR-overloaded. Then, the corresponding PCN node re-marks some of the PCN packets on this link with an "excess-traffic" (ET) codepoint.
   The PCN egress nodes pass the marking information to the appropriate flow termination entity (e.g. at the respective PCN ingress nodes) to terminate flows in order to reduce the PCN traffic rate of the SR-overloaded link below its SR. The reader is referred to [PCN Architecture draft] for details.

   A simple and intuitive approach for flow termination works as follows. When every packet exceeding the SR is re-marked to ET, the excess traffic rate can be determined at the PCN egress node by measuring the rate of ET-marked packets. This rate is signalled to the appropriate flow termination entity to choose a suitable set of flows for termination. However, this method has two major drawbacks.

   1.  A trustworthy measurement result takes time and delays the flow termination.

   2.  In case of multipath routing, the flows of a single ingress-egress aggregate may take different paths. Terminating arbitrary flows of an ingress-egress aggregate may result in the termination of flows that do not contribute to the SR-overload of the bottlenecked link.

   This document proposes "marked flow termination" with "marking frequency reduction" to solve these problems. In order to terminate only flows that really contribute to the experienced SR-overload, only ET-marked flows must be terminated (marked flow termination). Terminating all ET-marked flows leads to a very fast reaction because no rate measurement at the PCN egress is required. However, if all PCN packets exceeding SR are ET-marked, this terminates more flows than necessary. Therefore, we propose to reduce the marking frequency by a slow-down parameter "s", which controls the speed of the flow termination so that no more flows than necessary are terminated.
   Thus, the objective of "marked flow termination" with marking frequency reduction is to quickly terminate the right flows and the right number of flows in case of SR-overload.

   This document defines metering behaviours to quickly detect whether the PCN traffic rate on a link has exceeded its admissible or supportable rate and marking behaviours to label packets with admission-stop and excess-traffic information accordingly. In particular, marking frequency reduction is supported by the proposed ET-marker. This yields "three state PCN marking" because packets are labelled either with "no pre-congestion" (NP), "admission-stop" (AS), or "excess-traffic" (ET).

   The document is structured as follows. After a short introduction to PCN and the terminology used, we explain the admission control and "marked flow termination" functions. Then, "three state PCN marking with marking frequency reduction" is described which is suitable to support "marked flow termination". The appendix summarizes background information about token bucket (TB) and virtual queue (VQ) metering and marking as they are the base for the proposed metering and marking mechanisms, and it discusses the behaviour of the proposed three state PCN marking with marking frequency reduction for "marked flow termination".

1.1.  Requirements Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

1.2.  Overview of PCN

   PCN traffic can be classified by DSCP or group of DSCPs and is forwarded by an appropriate PHB. For each link, an admissible and a supportable rate (AR, SR) are configured. PCN traffic enters the network with a "no-precongestion" (NP) mark. PCN nodes meter the PCN traffic on every link.
   When the PCN traffic rate on a link exceeds the corresponding AR, the PCN node re-marks all NP-marked PCN packets to "admission-stop" (AS). Similarly, it re-marks some non-ET-marked PCN packets to "excess-traffic" (ET) when the PCN traffic rate on a link exceeds the corresponding SR.

   Figure 1 summarizes the relation between the AR and SR thresholds and the marking behaviour. The SR normally is at least a delta above the AR and a delta below the maximum service rate for PCN traffic for the sake of stability of the measurement-based reactive system. If the PCN traffic rate is below AR, no packets are re-marked; if it is between AR and SR, all NP-marked PCN packets are re-marked to AS; and if it is above SR, some non-ET-marked PCN packets are re-marked to ET and all other NP-marked PCN packets are re-marked to AS. Hence, the meters and markers operate in a marking-aware mode: NP-marked packets can be re-marked to AS or ET, AS-marked packets can be re-marked to ET but not to NP, and ET-marked packets cannot be re-marked at all.

       PCN traffic rate
                   100%^
                       |  AR- and SR-overload:
                       |  re-mark SOME non-ET-marked
                       |  packets to ET and the remaining to AS,
                       |  indicating that AR and SR are exceeded
       Supportable rate|----------------------------------------------
                       |  AR-overload:
                       |  re-mark ALL NP-marked packets to AS,
                       |  indicating admissible rate is exceeded
       Admissible rate |----------------------------------------------
                       |
                       |  No overload: do not re-mark any packets
                       |
                     0%+------------------------------------------------->

               Figure 1: Packet re-marking by PCN nodes.

   PCN egress nodes monitor the PCN traffic per ingress-egress aggregate. If they detect sufficiently many AS-marked packets, they relay this information to the appropriate admission control entities. More specifically, if most of the packets are AS-marked, the PCN traffic rate of at least one of the links in the ingress-egress path has exceeded its admissible rate, i.e. it is AR-overloaded.
   Therefore, the PCN egress signals "admission-stop" to the admission control entity for the corresponding ingress-egress aggregate such that no further flows are admitted.

   As an alternative, probing may be used for admission control. The PCN ingress node generates and sends probe packets to test and verify the pre-congestion level on the ingress-egress path. The PCN egress intercepts these probe packets. If they are AS- or ET-marked, the requesting flows are not admitted (blocked). Probing is useful or even essential under the following conditions:

   o  when an ingress-egress aggregate carries no traffic,

   o  in the presence of multipath routing, when some paths are already AR-overloaded, but others are not, and further flows should be admitted if they use a non-overloaded path.

   Although flows are admitted only if the PCN traffic rate does not exceed the AR on any link of their paths, it is possible that the PCN traffic rate on a link exceeds the SR, e.g., due to changed sending behaviour of admitted flows or due to route changes after a failure. PCN egress nodes monitor the PCN traffic and if they observe ET-marked packets, they send the information about the corresponding flows to the flow termination entity (e.g. the appropriate PCN ingress node) for possible termination. If several ET-marked packets arrive at the PCN egress node within a short interval, the PCN egress node may collect the information of several ET-marked flows and send them in a single message to the appropriate flow termination entity to reduce the signalling rate. As a consequence of the ongoing flow termination process, the PCN traffic rate decreases on the SR-overloaded link until it drops below the SR such that no further PCN packets are ET-marked.
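   The collection of several ET-marked flows into a single message, as described above, could be sketched as follows. This is only an illustration under stated assumptions: the class name EtReportBatcher, the fixed collection interval, and the send callback are hypothetical and are not defined by this document, which leaves the signalling to the flow termination entity open.

```python
class EtReportBatcher:
    """Collects flows seen with ET-marked packets and reports them in batches.

    Hypothetical helper: 'interval' (seconds) and 'send' (a callback taking
    a sorted list of flow identifiers) are assumptions of this sketch.
    """

    def __init__(self, interval, send):
        self.interval = interval
        self.send = send
        self.pending = set()       # flow ids awaiting a report
        self.window_start = None   # arrival time of first pending ET packet

    def on_packet(self, flow_id, mark, now):
        # Report an expired batch before looking at the new packet.
        self.flush_if_due(now)
        if mark == "ET":
            if not self.pending:
                self.window_start = now
            self.pending.add(flow_id)

    def flush_if_due(self, now):
        # One message per collection interval keeps the signalling rate low.
        if self.pending and now - self.window_start >= self.interval:
            self.send(sorted(self.pending))
            self.pending.clear()
```

   A set is used so that a flow with several ET-marked packets inside one interval is reported only once per message.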
   It is important to terminate only ET-marked flows because other flows of the same ingress-egress aggregate may take different paths such that they do not contribute to the observed SR-overload. This description of flow termination is just one possible approach and other methods are possible with the marking proposed in this document.

1.3.  Terminology used in this Document

   We provide a brief definition of the terminology used in this document.

   o  PCN - Pre-Congestion Notification meters traffic rates per service class on a link and notifies the PCN egress nodes using packet marking whether a certain rate threshold is exceeded on any link of the path through the PCN-enabled network taken by the packet. The rate thresholds may be significantly lower than the line rates such that PCN egress nodes are notified long before queues build up in the buffers and real congestion occurs. PCN is intended for the implementation of measurement-based admission control and flow termination for real-time inelastic traffic, e.g., voice. The PCN marking in the packet headers needs to be standardized.

   o  ECN Field - Refers to the use of the standardized two bit field in the IP header that is used for signalling Explicit Congestion Notification [RFC3168]. In the PCN framework the ECN field may be reused to signal two levels of Pre-Congestion Notification Marking.

   o  Service class - By service class we mean a grouping of packets belonging to one or more applications or services that generate traffic with similar characteristics and require similar QoS treatment. See [RFC4594] for details.

   o  Admissible Rate - A rate threshold configured for links in the network; if it is exceeded by PCN traffic, the link is said to be AR-overloaded and no further flows using the AR-overloaded link should be admitted. AR-overload is a type of pre-congestion.
   o  Supportable Rate - A rate threshold configured for links in the network; if it is exceeded by PCN traffic, the link is said to be SR-overloaded and some flows using the SR-overloaded link should be terminated. SR-overload is a type of pre-congestion.

   o  Admission Control - It is the function of admitting or blocking requests of new flows or sessions for access to the network to prevent AR-overload.

   o  Flow Termination - It is the function of terminating already admitted flows in the sense that they cannot continue to send PCN traffic. Only flows contributing to SR-overload are terminated to reduce SR-overload.

   o  No pre-congestion (NP) - Default marking for PCN packets that have not been carried over links with any type of pre-congestion (AR-overload or SR-overload).

   o  Admission-stop (AS) - Marking for packets to indicate that they have been carried over at least a single link with AR-overload.

   o  Excess-traffic (ET) - Marking for packets to indicate that they have been carried over at least a single link with SR-overload.

2.  Three State PCN Marker with Marking Frequency Reduction for Marked Flow Termination

   The three state PCN marker (3sM) meters PCN packet streams per link and performs packet re-marking according to Figure 1. As a consequence the following three marking states are required:

   o  no pre-congestion (NP),

   o  admission-stop (AS), and

   o  excess-traffic (ET).

   In theory, the meter meters each packet and passes the packet and the metering result to the marker, and the marker marks packets according to the results of the meter, as shown in Figure 2. However, the two functions are described by integrated algorithms.

                          +------------+
                          |   Result   |
                          |            V
                      +-------+    +--------+
                      |       |    |        |
    Packet stream ===>| Meter |===>| Marker |===> Marked packet stream
                      |       |    |        |
                      +-------+    +--------+

          Figure 2: Block diagram of meter and marker function.
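   The marker stage of Figure 2 has to respect the marking-aware mode described in Section 1.2: a mark may be upgraded from NP to AS or ET and from AS to ET, but never downgraded. A minimal sketch of that rule is given below. The function name remark and the numeric severity table are assumptions of this illustration only; the document itself describes meter and marker as integrated algorithms.

```python
NP, AS, ET = "NP", "AS", "ET"

# Assumed ordering for this sketch: NP < AS < ET.
_SEVERITY = {NP: 0, AS: 1, ET: 2}


def remark(current, proposed):
    """Combine a packet's current mark with the mark proposed by a meter.

    The stronger mark wins, so AS is never reset to NP and ET is final.
    """
    if _SEVERITY[proposed] > _SEVERITY[current]:
        return proposed
    return current
```

   With this rule, an ET-marked packet stays ET-marked on every later link, and an AS-marked packet arriving at a link without overload keeps its AS mark.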
   The meter and marker controlling AR and marking packets with AS are called AR-meter and AS-marker, and the meter and marker controlling SR and marking packets with ET are called SR-meter and ET-marker. The marking may be coded in the ECN field [RFC3168] of the packet for a specified PHB in a specific manner.

   In the following, we choose to explain the behaviour of both meters and markers based on a token bucket (TB). Similar metering and marking algorithms have been used before (e.g., srTCM [RFC2697]) and are readily available in today's routers. However, our description does not mandate TB-based implementations. The same behaviour can be achieved from a similar implementation based on a virtual queue (VQ) or by any other approach. The concepts of TB and VQ together with tail and threshold marking as well as reduced marking frequency are explained in Appendix A.

   The packet sizes counted by the meters and markers pertain to the size of the IP packet including its header bytes.

2.1.  SR-Meter and ET-Marker

   We explain the mechanism for SR-metering and ET-marking, give pseudo code, explain its configuration, and discuss its behaviour. We use object-oriented notation for most variables.

2.1.1.  Mechanism for SR-Metering and ET-Marking

   We propose an SR-meter and ET-marker based on a token bucket with tail marking and marking frequency reduction (see Appendix A for explanation). The TB has a bucket of size TB.size which is continuously filled with tokens at rate TB.rate. When a non-ET-labelled PCN packet arrives, it is re-marked to "ET" if the fill state of the bucket (TB.fill) in tokens is smaller than the packet's size (packet.size) in bytes; in that case, "s" additional tokens are added to the bucket. Otherwise, the fill state is reduced by packet.size tokens. The slow-down parameter "s" reduces the marking frequency of the mechanism.
   If an ET-marked packet arrives, the TB's fill state is also incremented by "s" (this is an option and needs further discussion, see Section 2.1.4).

2.1.2.  Pseudo Code for the SR-Meter and ET-Marker

   The behaviour of the token bucket with tail marking and marking frequency reduction for SR-metering and ET-marking is expressed by the following pseudo code using object-oriented notation. It requires the time TB.lastUpdate at which the fill state of TB was last updated and a global variable "now" providing the current time. A PCN packet has the variables packet.mark showing its marking (NP, AS, ET) and packet.size showing its size.

      Input: pcn packet

      TB.fill = min(TB.size, TB.fill + TB.rate * (now - TB.lastUpdate));
      if (packet.mark <> ET)   // if packet is not ET-marked
         if (TB.fill < packet.size)
            packet.mark = ET;
            TB.fill = min(TB.size, TB.fill + s);
         else
            TB.fill = TB.fill - packet.size;
         endif
      else
         TB.fill = min(TB.size, TB.fill + s);   // optional, see 2.1.4
      endif
      TB.lastUpdate = now;

      Output: void

   According to the comment at the end of Section 2.1.4, the algorithm may be simplified by omitting the statement commented with "optional, see 2.1.4". Further simulations need to evaluate its impact.

2.1.3.  Configuration of the SR-Meter and ET-Marker

   The following parameters must be configured:

   o  TB.rate: supportable rate (SR)

   o  TB.size: supportable burst size (SBS), needs to be set appropriately (-> simulations)

   o  Slow-down parameter "s": needs to be set appropriately (-> simulations)

2.1.4.  Characteristics of the Proposed SR-Meter and ET-Marker Behaviour

   The proposed mechanism can be applied with and without marking frequency reduction, i.e., s>0 and s=0, respectively.

   a) No marking frequency reduction (s=0)

   If the slow-down parameter is set to s=0, marking frequency reduction is switched off. As an alternative, a simplified version of the given algorithm can be used.
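   To make the effect of the slow-down parameter concrete, the pseudo code of Section 2.1.2 can be transcribed into a small executable sketch. It is an illustration, not a reference implementation: the class and function names, the initially full bucket, the clock starting at zero, and all numeric values in count_et() are assumptions chosen for this example. Setting s=0 yields the behaviour of case a).

```python
from dataclasses import dataclass

NP, AS, ET = "NP", "AS", "ET"


@dataclass
class Packet:
    size: int        # IP packet size in bytes, header included
    mark: str = NP


class SrMeterEtMarker:
    """Token bucket with tail marking and marking frequency reduction."""

    def __init__(self, rate, size, s=0, credit_et_arrivals=True):
        self.rate = rate          # TB.rate: supportable rate, bytes/second
        self.size = size          # TB.size: supportable burst size, bytes
        self.s = s                # slow-down parameter, bytes
        self.credit_et_arrivals = credit_et_arrivals  # optional step (2.1.4)
        self.fill = float(size)   # assumption: the bucket starts full
        self.last_update = 0.0    # assumption: the clock starts at zero

    def process(self, packet, now):
        # Refill for the elapsed time, capped at the bucket size.
        elapsed = now - self.last_update
        self.fill = min(self.size, self.fill + self.rate * elapsed)
        if packet.mark != ET:
            if self.fill < packet.size:
                packet.mark = ET
                self.fill = min(self.size, self.fill + self.s)
            else:
                self.fill -= packet.size
        elif self.credit_et_arrivals:
            self.fill = min(self.size, self.fill + self.s)
        self.last_update = now
        return packet.mark


def count_et(s, packets=1000):
    """Offer 100-byte packets at twice the configured rate; count ET marks."""
    marker = SrMeterEtMarker(rate=1000, size=500, s=s)
    marks = [marker.process(Packet(size=100), now=0.05 * (i + 1))
             for i in range(packets)]
    return marks.count(ET)
```

   Under these assumed values, count_et(0) marks roughly the excess half of the packets, whereas count_et(400) marks far fewer, because every ET-marked packet lets up to 400 further bytes pass unmarked.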
   If the PCN traffic rate on a link constantly exceeds its SR, the fill state of the TB decreases. Arriving packets for which the number of tokens in the bucket does not suffice are ET-marked. The size of the token bucket (supportable burst size (SBS)) controls how fast the marker reacts to a traffic rate above SR: if it is set to a low value, packets are already marked at a rate lower than SR in the presence of bursts; if it is set to a high value, marking starts with a delay after the PCN traffic rate exceeds SR.

   A nice property of this option is that the rate of the ET-marked packets is exactly the excess traffic rate under the assumption that there was no significant packet loss. Its drawback in conjunction with "marked flow termination" is that too many flows are terminated if all flows with ET-marked packets are selected for termination. With the marking approach where s=0, the egress node needs to perform rate measurements to determine how much traffic needs to be terminated.

   b) Marking frequency reduction (s>0)

   If the slow-down parameter is set to a value s>0, marking frequency reduction is achieved because for each marked packet up to "s" additional bytes can pass the SR-meter and ET-marker without being re-marked to ET. Thus, increasing the slow-down parameter "s" decreases the number of ET-marked packets. In combination with "marked flow termination", a suitable "s" is required to achieve a fast termination of sufficiently many flows without terminating more flows than necessary. Its drawback in conjunction with the measurement of the excess traffic rate at the PCN nodes is that this rate cannot be determined directly. For each ET-marked packet, up to "s" additional bytes can pass without being ET-marked.
   Therefore, an upper bound for the excess traffic rate is obtained by the byte rate of ET-marked packets at the PCN egress node plus the packet rate of ET-marked packets multiplied by "s" bytes per ET-marked packet. If "s" is significantly larger than the maximum packet size, a CPU-saving approximation may be used: the rate of ET-marked packets multiplied by "s" bytes/packet. Hence, it is possible to infer an upper bound for the excess traffic rate when marking frequency reduction is applied.

   ET-marked packets arriving at the SR-meter and ET-marker signify that a flow on the link will be terminated soon. Therefore, the same fill state increment is performed by the SR-meter and ET-marker as if it had ET-marked the packet itself. This is a matter of fairness and streamlining, but possibly has no visible impact under overload conditions. Therefore, the algorithm may be simplified by ignoring packets that are already ET-marked. This is subject to future research.

2.2.  AR-Meter and AS-Marker

   We explain the mechanism for AR-metering and AS-marking, give pseudo code for this mechanism, explain its configuration, and discuss its behaviour.

2.2.1.  Mechanism for AR-Metering and AS-Marking

   We propose an AR-meter and AS-marker based on a token bucket with threshold marking (see Appendix A for explanation). The TB has a bucket of size TB.size which is continuously filled with tokens at rate TB.rate. The AR-meter and AS-marker consider only packets that are not ET-marked. When a non-ET-marked PCN packet arrives, it is re-marked to "AS" if the fill state of the bucket (TB.fill) in tokens is smaller than the packet's size (packet.size) in bytes. Otherwise, the fill state is reduced by packet.size tokens; if the fill state is then smaller than the marking threshold (TB.threshold), the packet is also re-marked to "AS", whereas if the fill state is then larger than or equal to the marking threshold, the packet is not re-marked.

   The AR-meter and AS-marker are sensitive to ET-markings in the sense that only non-ET-marked packets are considered. Therefore, packets should be first SR-metered and ET-marked by a PCN node before being AR-metered and AS-marked. However, this requirement may be relaxed.

2.2.2.  Pseudo Code for AR-Meter and AS-Marker

   The behaviour of the token bucket with threshold marking for AR-metering and AS-marking is expressed by the following pseudo code using the same nomenclature as above.

      Input: pcn packet

      if (packet.mark <> ET)   // consider only non-ET-marked packets
         TB.fill = min(TB.size, TB.fill + TB.rate * (now - TB.lastUpdate));
         if (TB.fill < packet.size)
            packet.mark = AS;
         else
            TB.fill = TB.fill - packet.size;
            if (TB.fill < TB.threshold)
               packet.mark = AS;
            endif
         endif
         TB.lastUpdate = now;
      endif

      Output: void

   Note: The above algorithm may be simplified by considering all PCN packets (NP-marked, AS-marked and ET-marked). This needs further analysis and simulations to evaluate its impact.

2.2.3.  Configuration of the AR-Meter and AS-Marker

   The following parameters must be configured:

   o  TB.rate: admissible rate (AR)

   o  TB.size (TBS): needs to be set appropriately (-> simulations)

   o  TB.threshold: TBS-ABS with ABS being the admissible burst size, needs to be set appropriately (-> simulations)

2.2.4.  Characteristics of the Proposed AR-Metering and AS-Marking Behaviour

   If the AR is exceeded, the TB fill state continuously decreases, eventually falls below its marking threshold TB.threshold, and only increases again once the PCN traffic rate on the link falls below AR. As a consequence, all packets are AS-marked during that time and admission of further flows is stopped until the PCN traffic rate drops below AR. In particular, all probe packets are also AS-marked.
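   The threshold-marking behaviour just described can be reproduced with a direct transcription of the pseudo code in Section 2.2.2. The sketch below is illustrative only: the class name, the run_demo() helper, the initially full bucket, and all numeric values (AR of 1000 bytes/s, TBS of 1000 bytes, threshold of 600 bytes) are assumptions made for this example.

```python
NP, AS, ET = "NP", "AS", "ET"


class ArMeterAsMarker:
    """Token bucket with threshold marking for AR-metering and AS-marking."""

    def __init__(self, rate, size, threshold):
        self.rate = rate            # TB.rate: admissible rate, bytes/second
        self.size = size            # TB.size (TBS), bytes
        self.threshold = threshold  # TB.threshold = TBS - ABS, bytes
        self.fill = float(size)     # assumption: the bucket starts full
        self.last_update = 0.0      # assumption: the clock starts at zero

    def process(self, mark, packet_size, now):
        if mark == ET:
            return ET               # ET-marked packets are not metered
        elapsed = now - self.last_update
        self.fill = min(self.size, self.fill + self.rate * elapsed)
        if self.fill < packet_size:
            mark = AS
        else:
            self.fill -= packet_size
            if self.fill < self.threshold:
                mark = AS
        self.last_update = now
        return mark


def run_demo():
    """40 packets at twice AR, then 40 packets at half AR (100 bytes each)."""
    marker = ArMeterAsMarker(rate=1000, size=1000, threshold=600)
    overload = [marker.process(NP, 100, 0.05 * (i + 1)) for i in range(40)]
    recovery = [marker.process(NP, 100, 2.0 + 0.2 * (j + 1))
                for j in range(40)]
    return overload, recovery
```

   In this run, AS-marking sets in once the fill state has dropped below the threshold, continues for a few packets after the rate has fallen below AR, and stops once the bucket has refilled above the threshold.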
   TB.threshold controls how fast the marker reacts to a PCN traffic rate that exceeds AR: if it is set to a high value (close to TB.size), packets are marked in the presence of bursts at a PCN traffic rate lower than AR; if it is set to a low value, AS-marking is delayed when the PCN traffic exceeds AR. In addition, the parameter TB.threshold controls how long the marker continues marking packets after the PCN traffic rate falls below AR.

   ET-marked packets are not subject to AR-metering and AS-marking because they belong to flows that are terminated soon. In conjunction with marking frequency reduction, every ET-marked packet stands for up to "s" additional bytes that would have been ET-marked without marking frequency reduction and, as a consequence, would not have been subject to AR-metering and AS-marking. Therefore, the fill state of the TB for the AR-metering and AS-marking algorithm may be incremented by the slow-down parameter "s" whenever an ET-marked packet is observed. However, this makes the AR-metering and AS-marking algorithm more complex and this issue probably has only minor effect in practice. Therefore, this is not reflected in the pseudo code. The impact of this difference is subject to future simulations.

2.3.  Marking Codepoints

   PCN metering and marking requires classification of traffic which is subject to PCN metering and PCN marking (PCN-capable). Furthermore, PCN-aware flows that are subject to PCN-marking require at least the following codepoints:

   o  "no-precongestion" (NP),

   o  "admission-stop" (AS), and

   o  "excess-traffic" (ET).

   These signals may be encoded by re-using the two-bit ECN field or by different DS codepoints. The actual encoding is out of the scope of this document.

3.  Security Considerations

   The Three State PCN Marker has no known security concerns.

4.  Changes from Previous Revision

   The 00 version of this draft defines PCN metering and marking for both admission control and flow termination. This draft incorporates the metering and marking approach for flow termination that was defined in draft-babiarz-pcn-explicit-marking-00. Simulation results for the proposed "AR-metering and AS-re-marking" and "SR-metering and ET-re-marking" will be published in a separate draft.

4.1.  Acknowledgements

   The authors would like to thank the following people for reviewing this draft or earlier versions thereof and for their suggestions to make this document more complete: Dave McDysan, Nicolas Chevrollier and Frank Lehrieder.

5.  Informative References

   [Maglaris-88]
              Maglaris et al., "Performance Models of Statistical Multiplexing in Packet Video Communications", IEEE Transactions on Communications 36, pp. 834-844, July 1988.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2474]  Nichols, K., Blake, S., Baker, F., and D. Black, "Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers", RFC 2474, December 1998.

   [RFC2475]  Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., and W. Weiss, "An Architecture for Differentiated Services", RFC 2475, December 1998.

   [RFC2697]  Heinanen, J. and R. Guerin, "A Single Rate Three Color Marker", RFC 2697, September 1999.

   [RFC2698]  Heinanen, J. and R. Guerin, "A Two Rate Three Color Marker", RFC 2698, September 1999.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, September 2001.

   [RFC3246]  Davie, B., Charny, A., Bennet, J., Benson, K., Le Boudec, J., Courtney, W., Davari, S., Firoiu, V., and D. Stiliadis, "An Expedited Forwarding PHB (Per-Hop Behavior)", RFC 3246, March 2002.

   [RFC4594]  Babiarz, J., Chan, K., and F. Baker, "Configuration Guidelines for DiffServ Service Classes", RFC 4594, August 2006.

   [SIM-07]   Liu, X-G. and J. Babiarz, "Simulation Results for Explicit PCN Marking and Flow Termination", http://standards.nortel.com/pcn/Simulation_EPCN.pdf, February 2007.

Appendix A.  Overview of Token Bucket (TB) and Virtual Queue (VQ)

   Token buckets (TB) and virtual queues (VQ) serve to control whether a packet conforms to its flow's indicated rate R and burst size S. Therefore, TB parameters are frequently used as traffic descriptors. TBs and VQs are dual approaches: while packets are TB-conformant as long as sufficient tokens are in the bucket at their arrival times, they are VQ-conformant as long as sufficient free space is available in the queue at their arrival times. Therefore, TBs and VQs can be used interchangeably and, in particular, algorithms given based on a TB description can be implemented by a VQ and vice versa.

   In the following, we explain the basic VQ and TB mechanisms (Appendix A.1 and Appendix A.2). Packets are marked depending on the state of the VQ or TB at their arrival time. There are different marking options. Only those packets that do not conform to their flow description may be marked (tail marking, Appendix A.3), or only some non-conforming packets may be marked (tail marking with marking frequency reduction, Appendix A.4), or all packets may be marked until the flow again reaches conformity (threshold marking, Appendix A.5). Appendix A.6 gives an overview of where and how VQ and TB descriptions are used.

A.1.  Virtual Queue (VQ)

   We use an object-oriented notation to make the algorithms more readable. The VQ has a VQ rate (VQ.rate) and a queue which is capable of storing up to VQ.size bytes. The current length of the queue is denoted by VQ.length. This length is reduced over time at rate VQ.rate.
   When a packet arrives, it is "accepted" by the VQ and increments
   VQ.length by its size (packet.size) if there is still enough free
   space in the queue to accommodate it; otherwise it is "rejected".  As
   the queue length is decreased continuously over time, the behaviour
   of a VQ is best described by a fluid model.  However, the state of
   the VQ shortly after packet arrivals can be calculated based on the
   current time "now" and the length of the VQ at the last update time
   of the VQ (VQ.lastUpdate) using the following algorithm:

      VQ.length = max(0, VQ.length - (now - VQ.lastUpdate) * VQ.rate);
      if (VQ.length + packet.size <= VQ.size)
         VQ.length = VQ.length + packet.size;
      endif
      VQ.lastUpdate = now;

A.2.  Token Bucket (TB)

   The TB is basically the same mechanism, but it looks at the problem
   from a different angle.  The TB has a TB rate (TB.rate) and a bucket
   which is capable of storing up to TB.size tokens.  A token is the
   permission to send one byte.  The current fill state of the bucket is
   denoted by TB.fill.  This fill state is increased over time at rate
   TB.rate.  When a packet arrives, it is "accepted" and decrements
   TB.fill by its size (packet.size) if there are enough tokens in the
   bucket to send the entire packet; otherwise it is "rejected".  As the
   fill state is increased continuously over time, the behaviour of a TB
   is best described by a fluid model.  However, the state of the TB
   shortly after packet arrival can be calculated based on the current
   time "now" and the fill state of the TB at the last update time of
   the TB (TB.lastUpdate) using the following algorithm:

      TB.fill = min(TB.size, TB.fill + (now - TB.lastUpdate) * TB.rate);
      if (TB.fill >= packet.size)
         TB.fill = TB.fill - packet.size;
      endif
      TB.lastUpdate = now;

A.3.  Tail Marking

   To check whether the packets of a stream with rate R and maximum
   burst size MBS conform to the description R and MBS, the stream is
   metered either by a VQ with rate R and size MBS, or by a token bucket
   with rate R and bucket size MBS.  If a packet is accepted by the VQ
   or by the TB, it is marked in-profile.  If it is rejected, it is
   marked out-of-profile.  The corresponding pseudo code for the VQ is:

      VQ.length = max(0, VQ.length - (now - VQ.lastUpdate) * VQ.rate);
      if (VQ.length + packet.size <= VQ.size)
         VQ.length = VQ.length + packet.size;
         packet.mark = in-profile;
      else
         packet.mark = out-of-profile;
      endif
      VQ.lastUpdate = now;

   and for the TB:

      TB.fill = min(TB.size, TB.fill + (now - TB.lastUpdate) * TB.rate);
      if (TB.fill >= packet.size)
         TB.fill = TB.fill - packet.size;
         packet.mark = in-profile;
      else
         packet.mark = out-of-profile;
      endif
      TB.lastUpdate = now;

A.4.  Tail Marking with Marking Frequency Reduction

   The objective of tail marking with marking frequency reduction is to
   mark only some of the packets that are out-of-profile.  The strength
   of the reduction can be controlled by the slow-down parameter "s".
   When a packet is classified out-of-profile, the VQ length is
   decremented by "s" bytes and the TB fill state is incremented by "s"
   tokens, respectively.  As a consequence, the VQ and the TB are not
   likely to mark consecutive packets as out-of-profile, which reduces
   their marking frequency.  The corresponding pseudo code for the VQ
   is:

      VQ.length = max(0, VQ.length - (now - VQ.lastUpdate) * VQ.rate);
      if (VQ.length + packet.size <= VQ.size)
         VQ.length = VQ.length + packet.size;
         packet.mark = in-profile;
      else
         VQ.length = max(0, VQ.length - s); // marking frequency reduction
         packet.mark = out-of-profile;
      endif
      VQ.lastUpdate = now;

   and for the TB:

      TB.fill = min(TB.size, TB.fill + (now - TB.lastUpdate) * TB.rate);
      if (TB.fill >= packet.size)
         TB.fill = TB.fill - packet.size;
         packet.mark = in-profile;
      else
         TB.fill = min(TB.size, TB.fill + s); // marking frequency reduction
         packet.mark = out-of-profile;
      endif
      TB.lastUpdate = now;

   If the slow-down parameter is set to s=0, the marking algorithm
   behaves like pure tail marking.

A.5.  Threshold Marking

   The objective of threshold marking is to mark all packets with, e.g.,
   "rate-exceeded" as long as some packets are out-of-profile with
   respect to flow parameters R and MBS.  We achieve that by setting the
   VQ or TB size larger than MBS and by marking packets whenever the VQ
   length exceeds MBS or the fill state of the TB falls below
   TB.size - MBS.  Thus, marking thresholds need to be configured
   differently for VQ and TB to obtain the same behavior:

   o  VQ.threshold = MBS;

   o  TB.threshold = TB.size - MBS.

   The corresponding pseudo code for the VQ is:

      VQ.length = max(0, VQ.length - (now - VQ.lastUpdate) * VQ.rate);
      if (VQ.length + packet.size <= VQ.size)   // packet in-profile
         VQ.length = VQ.length + packet.size;
         if (VQ.length > VQ.threshold)
            packet.mark = "rate-exceeded";
         endif
      else                                      // packet out-of-profile
         packet.mark = "rate-exceeded";
      endif
      VQ.lastUpdate = now;

   and for the TB:

      TB.fill = min(TB.size, TB.fill + (now - TB.lastUpdate) * TB.rate);
      if (TB.fill >= packet.size)               // packet in-profile
         TB.fill = TB.fill - packet.size;
         if (TB.fill < TB.threshold)
            packet.mark = "rate-exceeded";
         endif
      else                                      // packet out-of-profile
         packet.mark = "rate-exceeded";
      endif
      TB.lastUpdate = now;

A.6.  Related Work

   The single rate three color marker (srTCM) [RFC2697] meters an IP
   packet stream and marks its packets green, yellow, or red.
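   For orientation, the color-blind mode of the srTCM can be sketched in
   Python as follows.  This is a sketch under stated assumptions, not the
   normative algorithm: it credits tokens in a fluid manner on packet
   arrival instead of CIR times per second, it uses the [RFC2697]
   parameters committed information rate (CIR), committed burst size
   (CBS), and excess burst size (EBS), and the class name SrTcm and the
   method mark() are hypothetical.

```python
class SrTcm:
    """Color-blind srTCM sketch (fluid token update; names are assumptions)."""

    def __init__(self, cir: float, cbs: float, ebs: float):
        self.cir = cir      # committed information rate, bytes per second
        self.cbs = cbs      # committed burst size, bytes
        self.ebs = ebs      # excess burst size, bytes
        self.tc = cbs       # committed bucket starts full
        self.te = ebs       # excess bucket starts full
        self.last_update = 0.0

    def mark(self, now: float, size: float) -> str:
        # New tokens fill the committed bucket first; only the remainder
        # is credited to the excess bucket.
        tokens = (now - self.last_update) * self.cir
        self.last_update = now
        to_committed = min(tokens, self.cbs - self.tc)
        self.tc += to_committed
        self.te = min(self.ebs, self.te + (tokens - to_committed))

        if self.tc >= size:
            self.tc -= size
            return "green"
        if self.te >= size:
            self.te -= size
            return "yellow"
        return "red"        # neither bucket is decremented
```

   With CIR = 1000 bytes/s, CBS = 1500 bytes, and EBS = 3000 bytes, a
   back-to-back burst of 1000-byte packets is colored green first, then
   yellow while the excess bucket lasts, and red afterwards.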
   The traffic is described by a committed information rate (CIR), a
   committed burst size (CBS), and an excess burst size (EBS).  If a
   packet conforms to a token bucket with parameters (CIR, CBS), it is
   colored green.  If it conforms to (CIR, EBS), it is colored yellow.
   Otherwise, it is colored red.  The implementation given in [RFC2697]
   is slightly different from an exact TB implementation, but it behaves
   similarly.  srTCM is implemented in most of today's routers and can
   be used to perform threshold marking using the following mapping:

      green = non-AS; yellow = AS; red = AS;
      CIR = rate; CBS = MBS; EBS = TB.size or VQ.size

   Therefore, the proposed behavior for admission-stop marking can be
   obtained with today's technology.

   In contrast to the srTCM, the two rate three color marker (trTCM)
   [RFC2698] meters an IP packet stream based on a committed and a peak
   information rate (CIR, PIR) and marks its packets either green,
   yellow, or red.  The traffic is described by a CIR, a committed burst
   size (CBS), a PIR, and a peak burst size (PBS).  If a packet conforms
   to both (PIR, PBS) and (CIR, CBS), it is marked green.  If a packet
   conforms to (PIR, PBS) only, it is marked yellow.  Otherwise, it is
   marked red.

   Both srTCM [RFC2697] and trTCM [RFC2698] offer a color-aware and a
   color-blind operation mode, in which the packet marking respectively
   depends on or is independent of already existing packet markings.
   This is required for PCN marking in a similar way.  Both markers are
   useful, for example, for ingress policing of traffic streams in a
   DiffServ environment.

   Policers drop out-of-profile packets instead of marking them.
   Shapers delay out-of-profile packets until they become in-profile.
   The leaky bucket shapes a traffic stream to a maximum rate R and
   drops all packets that exceed a maximum burst size MBS.
   [http://en.wikipedia.org/wiki/Leaky_bucket]

Appendix B.  Discussion of PCN Characteristics

   The Three State PCN marking behaviour can be realized with a normal
   token bucket arrangement, such as the one used for coloring packets
   in trTCM [RFC2698], with the following changes.  The Peak Information
   Rate (PIR) is redefined to be the supportable rate (SR) and the
   Committed Information Rate (CIR) to be the admissible rate (AR).
   When the PIR or SR token bucket runs out of tokens, the non-
   conforming packet is marked red or excess-traffic (ET).  The delta
   functionality is that each time a packet is ET-marked, "s" tokens are
   added to the SR token bucket.  Like the trTCM, the Three State PCN
   Marker also has a lower rate, CIR or AR, against which packets are
   measured.  The delta functionality for the AR token bucket is that a
   packet is marked as admission-stop (AS) when the token bucket falls
   to a level "k" rather than when it is empty.  This delta
   functionality can be added to many of the routers that are in use
   today.

   Here we highlight some of the characteristics and benefits of the
   excess-traffic (ET) marking approach:

   1.   This marking approach works with Equal Cost Multipath (ECMP)
        routing in the network.  The router whose traffic is above the
        supportable rate ET-marks packets.  The egress node detects ET-
        marked packets and signals flow termination to the ingress node.

   2.   It works reasonably well in the presence of low and high packet
        loss (see the simulation results for details).  Packet loss
        introduces only a small additional delay before the excess load
        reduction through flow termination is completed.  If a marked
        packet is lost and the overload is still present, the congested
        router will mark another packet.

   3.   This approach is not very sensitive to the number of flows: it
        works well with small and large numbers of flows and with
        constant bit rate, variable rate (on-off voice), and variable
        rate MPEG-4 like traffic.
        To date, simulation results [SIM-07] indicate that this marking
        approach is not very sensitive to different flow counts and
        traffic characteristics.

   4.   It is believed that this marking approach will work reasonably
        well in the presence of multiple congestion points in the path.
        The ET marking approach has an exponential decay marking
        property, whereby the marking frequency decreases as the excess
        traffic load decreases, and ET marking stops once traffic drops
        below the supportable rate.  In other words, flow termination
        slows down as the load approaches the supportable rate.  This
        behaviour reduces traffic to just below the supportable rate on
        all the routers in the path.

   5.   Simulations show that with mixed traffic of different rates and
        packet sizes, the ET-marking approach marks higher bandwidth
        flows more aggressively.  Therefore, after a link failure, more
        flows in total can be supported while the aggregate traffic
        stays below the supportable rate.

   6.   It is also believed that the ET marking approach and its
        exponential decay property should work well with bidirectional
        flows.  When a bidirectional flow is terminated, excess load is
        removed from the pre-congested router(s), which reduces the
        frequency of marking.

   7.   Simulations show that it works well over a wide range of round-
        trip times (RTTs) that are reasonable for real-time traffic.
        See the simulation results with different RTTs in [SIM-07].  The
        "s" value should be configured so that it is greater than the
        number of octets transmitted by a flow in one RTT.  For
        aggregates of voice flows that have various rates, using the
        mean rate should produce reasonable results.  More work is
        required before any guidelines for "s" can be stated.

   8.   The behaviour is stable under most operating conditions.
        Simulations show very good accuracy both for constant rate and
        variable rate traffic with small and large numbers of overload
        conditions.

   9.   This approach works well in gateway-to-gateway and host-to-host
        deployment models.

   10.  The ET marking behaviour is compatible with the behaviour in
        [RFC3168]; should a PCN flow encounter a router that performs
        [RFC3168] marking, this would provide some protection against
        congestion.  A PCN flow with a CE-marked packet would be
        terminated.

   11.  If the egress edge node (gateway) reports which flows need to be
        removed (terminated) rather than how much bandwidth, then any
        reasonable value for "s" can be used.

   The value for "s" and the algorithm do not need to be standardized,
   but only the metering and marking behaviour.  Different algorithms
   may be used to obtain the described metering and marking behaviour.

Authors' Addresses

   Jozef Z. Babiarz
   Nortel
   3500 Carling Avenue
   Ottawa, Ont.  K2H 8E9
   Canada

   Phone: +1-613-763-6098
   Email: babiarz@nortel.com


   Xiao-Gao Liu
   Nortel
   3500 Carling Avenue
   Ottawa, Ont.  K2H 8E9
   Canada

   Phone: +1-613-763-7516
   Email: xgliu@nortel.com


   Kwok Ho Chan
   Nortel
   600 Technology Park Drive
   Billerica, MA  01821
   USA

   Phone: +1-978-288-8175
   Email: khchan@nortel.com


   Dr. Michael Menth
   University of Wuerzburg
   Institute of Computer Science
   Am Hubland, D-97074 Wuerzburg, Room B206
   Germany

   Phone: (+49)-931/888-6644
   Email: menth@informatik.uni-wuerzburg.de

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.
   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).