Internet Engineering Task Force                                C-Y Lee
INTERNET DRAFT                                            L. Andersson
Expires December 1999                                  Nortel Networks
                                                          Ken Carlberg
                                                                  SAIC
                                                            Bora Akyol
                                                                Pluris
                                                             June 1999

          Engineering Paths for Multicast Traffic using MPLS

Status of this memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

To view the list of Internet-Draft Shadow Directories, see http://www.ietf.org/shadow.html.

Abstract

This document describes a solution to engineer paths for IP multicast traffic in a network, by directing the control messages that set up multicast trees onto engineered paths. This proposal partitions the multicast traffic engineering problem so that multicast routing protocols do not have to be modified to set up engineered routes or allocate resources for multicast traffic, and resource allocation protocols such as RSVP or CR-LDP do not have to set up forwarding state (in this case labels) as multicast routing protocols do. Resources are allocated on the same trip on which paths are selected and set up.

An important aspect of this proposal is that it enables multicast paths to be engineered in an aggregatable manner, allowing this solution to scale in the backbone.

Expires December 1999                                        [Page 1]
Internet Draft   Engineering Paths for Multicast Traffic   June 1999

1. Overview

In general, traffic is engineered to traverse certain paths so as to utilize resources in a network in a more optimal manner, while at the same time improving the level of service that can be offered.
In conventional IP routing, traffic may be engineered to use a path by configuring preferred links towards a destination with a lower metric. This method only allows traffic to be engineered based on the destination address. Since forwarding is based on the destination address only, traffic cannot be engineered based on other attributes of the packet (which may be useful for traffic engineering purposes), such as the source address or the requested service level.

In contrast, MPLS abstracts the forwarding paradigm and allows traffic to be forwarded based on attributes (known as the forwarding equivalence class (FEC) in MPLS) in addition to the destination address. This provides a versatile and convenient syntax for traffic engineering purposes.

This document describes a way to provide a basic traffic engineering mechanism for multicast. Traffic Engineering (TE) functionalities (in the MPLS entity) are used to forward the join control messages of multicast protocols based on different traffic engineering requirements, and to allocate resources. (Note, however, that multicast data packets are forwarded based on Layer 3 (L3) address information and are not label switched.)

Using this basic multicast traffic engineering mechanism, ISPs can:

   - define particular FECs for their network, and the resources required to receive traffic from a certain root prefix;
   - decrease fanout at a node by limiting the number of paths towards the node (prefix);
   - allow only certain paths to carry multicast traffic;
   - experiment with heuristics to better engineer multicast trees;
   - use a function to dynamically compute suitable paths based on current or predicted network resources.

All these additional network or content provider specific functions to engineer traffic can be developed independently of the basic traffic engineering mechanism.

2.0 Motivation

The fundamental problem with doing multicast Traffic Engineering (TE) is the difficulty in doing it in a scalable manner.
Multicast routes are very difficult (and some claim impossible) to aggregate. One can associate a label with a unicast route (prefix), and packets sent to that destination can be aggregated and engineered by associating them with the label. Since multicast routes are not aggregatable in general, associating a label with a multicast route will require per flow/group resource allocation. In essence, this kind of association will result in RSVP (or ATM) style resource allocation and is more applicable to per flow QOS than to traffic engineering.

In contrast, the approach taken in this proposal decouples traffic engineering from multicast route setup, thereby allowing the resources and paths for multicast data delivery to be allocated independently. This implies that resources and paths can be aggregated and engineered, and that traffic can be statistically multiplexed, enabling network operators to provide differentiated services for multicast traffic in a scalable manner.

3.0 Scope

This draft describes mechanisms which are applicable to multicast routing protocols such as PIM-SM, CBT, BGMP, Express or Simple Multicast, which will be called 'control driven' in this draft. 'Data driven' or flood and prune protocols (e.g. DVMRP and PIM-DM) are described in another draft.

This proposal assumes that a multicast group/tree has a common 'QOS' requirement. It is envisaged that heterogeneous receiver requirements can be met by layer encoding data in different multicast groups, or by other variations of layer encoding.

It should be noted that the MPLS concepts of interest here are the FEC, the ERO, and resource allocation and path selection.
An entirely new supporting protocol could be designed to support the traffic engineering mechanisms proposed here. However, since the concepts of interest have already been defined and have been implemented in one form or another, the solution is described in terms of how it can be realized in MPLS.

4.0 Approach

A control driven multicast routing protocol sends a 'join' message to graft a node onto a multicast distribution tree, creating multicast routes in the process. Since join messages are forwarded based on unicast routes, if the conventional routing table is used, the multicast routes that are set up will be based on conventional routes. To constrain multicast paths, the join message should instead be sent via paths that are computed or statically configured.

This draft describes a scheme where multicast routing control messages (including join messages) are forwarded by the MPLS entity in a router on the constrained path. To allow a router to process control messages, the control messages should contain the router alert option.

The control message is identified at the ingress LSR by its FEC. Based on the FEC, the MPLS entity can derive the path the control message should take and allocate resources accordingly. A multicast routing protocol would set up the forwarding state on the port/interface where the join is received.

To enable the establishment of multicast forwarding state based on constraint (unicast) routes, multicast routing protocols which verify the Reverse Path Forwarding (RPF) interface must turn off this check. To prevent redundant data and loops, a loop avoidance scheme based on the concepts described in [MPLS-LOOP-AVOID] or [SM] can be used in the routing protocol. If there is a loop, the routing protocol should not create forwarding state for the group on the port where the join is received.
Other alternatives for sending the join on the engineered path, such as

   - extending CR-LDP/TE-RSVP to send and merge joins for the multicast tree associated with a label
   - changing the multicast routing protocol to send the join along the explicit route,

require either multicast routing protocol functionality to be present in MPLS or MPLS functionality to be incorporated into multicast routing protocols.

This proposal uses MPLS (label and explicit route object) to cause engineered paths to be selected, but forwards data using multicast routing. It does not require MPLS and multicast routing protocols to be merged, an exercise which tends to

   - result in redundant or reinvented functionality at L2/L3;
   - increase the complexity of multicast traffic engineering while not providing any means of aggregating multicast traffic engineering.

The alternative approaches listed above require traffic to be engineered for each group/tree, since multicast labels/routes are most likely not aggregatable. Each group must be assigned a different label as well. In contrast, this proposal allows a network provider to aggregate the 'QOS' path towards a root or root prefix (since resource allocation and path selection can be independent of the setup of forwarding states/routes). The root prefix could be a subnet or a domain. Multicast traffic in the backbone network can then be provisioned in a more scalable manner and statistically multiplexed on the (aggregated) engineered paths.

5.0 Procedure

5.1 Egress LSR

At any egress LSR (i.e. a router where the traffic exits the MPLS network) that may join multicast trees, the FECs, the associated path selection mechanisms and the resources required are specified. These FECs will match the control messages of routing protocols (e.g. PROTO_ID=PIM-SM/CBT, destination = root prefix/well known multicast address, TOS=codepoint).
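As an illustration only, the following Python sketch shows one way an egress LSR might match a control message against configured FECs of the form given above (protocol, root prefix, TOS codepoint). The names ControlMessage, FecSpec and match_fec are hypothetical and are not defined by this draft or by any MPLS specification; a real implementation would match on parsed packet headers.

```python
import ipaddress
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class ControlMessage:
    """Hypothetical view of a multicast routing control message."""
    proto_id: str      # e.g. "PIM-SM" or "CBT"
    destination: str   # root address the join is forwarded towards
    tos: int           # TOS codepoint carried by the packet


@dataclass(frozen=True)
class FecSpec:
    """Hypothetical FEC entry configured at the egress LSR."""
    proto_ids: FrozenSet[str]   # routing protocols this FEC applies to
    root_prefix: str            # e.g. "10.0.0.0/8"
    tos: Optional[int] = None   # None means any codepoint matches

    def matches(self, msg: ControlMessage) -> bool:
        if msg.proto_id not in self.proto_ids:
            return False
        network = ipaddress.ip_network(self.root_prefix)
        if ipaddress.ip_address(msg.destination) not in network:
            return False
        return self.tos is None or self.tos == msg.tos


def match_fec(fecs: Iterable[FecSpec], msg: ControlMessage) -> Optional[FecSpec]:
    """Return the first configured FEC that matches msg, or None."""
    for fec in fecs:
        if fec.matches(msg):
            return fec
    return None
```

A message that matches no FEC would simply be forwarded by L3 along the conventional unicast route; first-match ordering is an assumption of this sketch, not a requirement of the proposal.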
Note that the message that carries this information traverses the network from egress to ingress. The path selection mechanisms can be based on a static table, a constraint based routing table, or a (dynamic) path selection algorithm. (See also Section 6.0, Path Selection.)

Figure 1 shows the passage of control messages in an egress LSR (dotted lines) and the interface between the various entities in the LSR (+++ lines).

When a control message arrives at the ingress LSR (ingress with respect to the control message, i.e. the egress LSR for data), the packet will be sent to L3 for processing (where a multicast routing protocol may set up forwarding states), since the control message contains the IP Router Alert option. After processing the control message, L3 will attempt to forward the packet towards the destination specified in the control message.

            ------------------------
           |    Multicast routes    |
            ------------------------
                       +
                       +
            -------------------------
           |    Multicast Routing    |
            -------------------------
              ^                  |
              |                  |
              |                  v
          ----------        ------------
    ----> |  MPLS  |        |   MPLS   | ---->
          ----------        ------------
                                  +
                                  +
                                  +
                           ---------------
                          | FEC, Path and |
                          | Resource      |
                          | Specification |
                           ---------------

          Fig. 1 At the egress (wrt data flow) LSR

If the packet (control message) matches the FEC defined in the above manner, the MPLS entity will invoke the appropriate path selection mechanism. The root address of the multicast tree may be provided to the path selection mechanism to obtain the constraint routes towards the root. The root address of a multicast tree can be retrieved via a generic API provided by multicast routing protocols.

The constraint routes obtained from the path selection mechanism will be placed in an ERO. An MPLS control message (CR-LDP/RSVP with MPLS extension) containing the FEC, the ERO TLV and the resources required (e.g. Traffic Parameter and any other relevant TLVs) will be prepended to the IP packet.
The MPLS entity should then forward the MPLS control message to the next hop specified in the ERO. To allow routers downstream to process this control message, the packet will be labeled as Router Alert. Each explicit route in the ERO is removed as the message traverses the explicit path towards the root, in the same manner as described in CR-LDP and TE-RSVP.

5.2 Intermediate LSRs

Figure 2 shows the passage of control messages in an intermediate LSR (dotted lines) and the interface between the various entities in the LSR (+++ lines).

            ------------------------
           |    Multicast routes    |
            ------------------------
                       +
                       +
            -------------------------
           |    Multicast Routing    |
            -------------------------
              ^                  |
              |                  |
              |                  v
          ----------        ------------
    ----> |  MPLS  |        |   MPLS   | ---->
          ----------        ------------
              +                  +
              +                  +
              +                  +
            ----------------------
           |         FEC          |
           |        State         |
            ----------------------

          Fig. 2 At an intermediate LSR

When the next hop (or another intermediate node) receives the packet with the Router Alert label, the packet will be taken out of the forwarding path and directed to the MPLS entity. (If the control messages were not labeled, L3 would send the control message directly to an L3 multicast routing protocol instead of to the MPLS entity.)

The MPLS entity will allocate the resources requested by the CR-LDP or RSVP with MPLS extension message and create a state for the FEC (and other objects, e.g. ERO, Traffic), called the FEC state for short. It will then send the packet to the multicast routing protocol (MRP). The MRP will then create the forwarding state for the group and will forward the join message towards the root. Since the FEC for this control message will match the FEC state created earlier, the join message will be dispatched to the MPLS entity, which will process the ERO object and send the packet to the next hop listed in the ERO.
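The hop-by-hop handling of the ERO described above can be sketched as follows. This is a simplified illustration, not the ERO processing rules of CR-LDP or TE-RSVP: it assumes strict hops only, represents the ERO as a plain list of node identifiers, and uses a hypothetical function name (process_ero).

```python
from typing import List, Optional, Tuple


def process_ero(local_id: str, ero: List[str]) -> Tuple[Optional[str], List[str]]:
    """Consume this node's own entry from the ERO.

    Returns (next_hop, remaining_ero). next_hop is None when the ERO is
    exhausted, i.e. the message has reached the end of the explicit path.
    The input list is not modified.
    """
    remaining = list(ero)
    # Remove our own entry if it heads the list, as each explicit hop is
    # stripped while the message travels towards the root.
    if remaining and remaining[0] == local_id:
        remaining.pop(0)
    next_hop = remaining[0] if remaining else None
    return next_hop, remaining


# Walk the explicit route [Rx, Ry, Rz] hop by hop.
ero = ["Rx", "Ry", "Rz"]
for node in ["Rx", "Ry", "Rz"]:
    next_hop, ero = process_ero(node, ero)
    print(node, "forwards to", next_hop)
```

An implementation that supports loose hops would, instead of reading the next entry directly, resolve it through local routing information before forwarding.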
Note that the FEC need only be specified in the ingress LSR; intermediate LSRs are informed of the FEC information by previous hops. Similarly, the explicit (constraint) routes are only computed or configured at the ingress LSR; the next hop and other intermediate nodes learn of the explicit routes via the ERO object propagated from the ingress LSR. A Loose Source Route can be specified in the ERO, in which case intermediate nodes (LSRs) may forward the message to the next explicit route/node specified in the ERO based on local routing information.

If an LSR already has an FEC state, the packet will be sent directly to L3 for processing. L3 will decide if it needs to forward this control message any further. If it is a join message, and there are already L3 forwarding states, the join is terminated. If it is a maintenance control message, the control message is processed and forwarded. This packet will match the FEC state created earlier, and MPLS will forward the packet according to the next hop in the ERO list associated with this label and FEC.

5.3 Loops

If the MPLS control message specifies looping explicit routes:

   * if the tree is uni-directional, only the join message will loop. Data will not loop, since data flows in one direction only, from the root to the members.

   * if the tree is bi-directional, the join message will loop, but because permanent states would not be established in this case, data will not be forwarded on the looping path.

However, if there is a change in the next hop towards the root at a node where there is already an existing forwarding state, then multicast routing protocols which use bi-directional trees, or a hybrid of uni-directional and bi-directional branches, could invoke a loop avoidance procedure. One way to avoid loops in this case is to use the splice message described in [SM]. This procedure should ideally be specified in the multicast routing protocol itself.
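The proposal does not require explicit routes to be checked for loops, but an operator or path selection mechanism could do so before a route is placed in an ERO. The sketch below is one assumed way to perform that check; the function name find_looping_hop and the list-of-identifiers representation are hypothetical.

```python
from typing import Hashable, Optional, Sequence


def find_looping_hop(explicit_route: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the first node that appears twice in the route, or None.

    A repeated node means a join sent along this explicit route would
    revisit that node, i.e. the route loops.
    """
    seen = set()
    for hop in explicit_route:
        if hop in seen:
            return hop
        seen.add(hop)
    return None
```

For example, the route [Rx, Ry, Rz] yields None, while [Rx, Ry, Rx, Rz] yields Rx. This only catches loops within a single strict explicit route; loops introduced by loose hops or by a change of next hop still need the protocol-level procedure described above.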
6.0 Path Selection

This proposal allows different path selection algorithms to be used, depending on the association between FEC and path selection mechanism. Paths can be configured, computed, discovered or obtained through other means. A path selection mechanism will return the constraint routes given, for example, the group address, the root of the multicast tree and other criteria. How the paths are selected is independent of this proposal, but a generic interface (API) between path selection algorithms and this multicast traffic engineering scheme is required and is for further study (FFS).

7.0 Examples

This section lists some examples of how multicast traffic can be engineered using the procedures described in this proposal.

a) A network operator may define an explicit route [Rx, Ry, Rz] towards a domain with prefix 10.0.0.0 for multicast traffic. Any member joining a group whose root address has the prefix 10.0.0.0 will have data delivered to it via the explicit route [Rz, Ry, Rx] (data flows in the reverse direction of the join control message). This explicit route may be a Loose Source Route, or a route calculated by an algorithm, e.g. an Interior Gateway Protocol (IGP) which can provide constraint based routes. It is worth noting that the explicit route can be the desired path from a root towards a member instead of the reverse path (from member towards the root).

b) A variation of the above may define an additional field of interest in the FEC, the TOS. This will allow a network operator to allocate resources for traffic belonging to a diffserv forwarding class, e.g. Assured Forwarding.
c) To decrease fanout, egress LSRs (where multicast data traffic exits) can obtain the constraint routes via manual configuration or from a constraint based routing entity, which can be developed independently of the basic TE scheme described in this proposal.

d) Load balancing: a load balancing algorithm can provide the alternative path that a control message can take, depending on the QOS requirement of the group and the current utilization of the equal cost paths. As mentioned in the Scope section, this draft assumes that the QOS requirement of the group is constant (or the maximum value is used) or can be averaged to a constant, for traffic engineering purposes.

e) Policy routing: different paths may be defined for different groups.

8.0 Acknowledgments

The authors are grateful to Dirk Ooms and Yunzhou Li for reviewing this draft and for their helpful suggestions to improve this proposal. Thanks to Jon Crowcroft for providing insightful comments.

References

[ARCH] E. Rosen, A. Viswanathan, R. Callon, "Multiprotocol Label Switching Architecture", Work in Progress, July 1998.

[TE-MPLS] Awduche, D. et al., "Requirements for Traffic Engineering over MPLS", Internet Draft, draft-ietf-mpls-traffic-eng-00.txt, October 1998.

[CRLDP] L. Andersson, A. Fredette, B. Jamoussi, R. Callon, P. Doolan, N. Feldman, E. Gray, J. Halpern, J. Heinanen, T. E. Kilty, A. G. Malis, M. Girish, K. Sundell, P. Vaananen, T. Worster, L. Wu, R. Dantu, "Constraint-Based LSP Setup using LDP", Work in Progress, January 1999.

[TE-RSVP] D. Awduche, L. Berger, D-H. Gan, T. Li, G. Swallow, Vijay Srinivasan, Internet Draft, draft-ietf-mpls-rsvp-lsp-tunnel-02.txt, September 1999.

B. Rajagopalan, R. Nair, "Multicast Routing with resource reservation", Journal of High Speed Networks 7 (1998) 113-139.

[CBT] Ballardie, Cain, Zhang, "Core Based Tree Multicast Routing", Internet-Draft, March 1998.

[PIM-SM] Estrin, Farinacci, Helmy, Thaler, Deering, Handley, Jacobson, Liu, Sharma, and Wei, "Protocol independent multicast-sparse mode Specification", RFC-2117, June 1997.

[BGMP] Thaler, Estrin, Meyers, "Border Gateway Multicast Protocol Specification", Internet-Draft, March 1998.

[Express] H. Holbrook, D. Cheriton, Sigcomm Paper.

[SM] Perlman et al, "Simple Multicast", Internet-Draft, draft-perlman-simple-multicast-02.txt, March 1999.

[MPLS-LOOP-AVOID] C-Y Lee, L. Andersson, Y. Ohba, "Avoiding Loops in MPLS", Internet Draft, draft-leecy-mpls-loop-avoid-00.txt, June 1999.

YAM, K. Carlberg, J. Crowcroft, Hipparch 1998.

Authors' Information

Cheng-Yin Lee
Nortel Networks
PO Box 3511, Station C
Ottawa, ON K1Y 4H7, Canada
leecy@nortel.com

Loa Andersson
Nortel Networks Inc
Kungsgatan 34, PO Box 1788
111 97 Stockholm
Sweden
Phone: +46 8 441 78 34
Mobile: +46 70 522 78 34
email: loa_andersson@baynetworks.com

Ken Carlberg
SAIC
MS 1-2-8
1710 Goodridge Drive
McLean, VA. 22102

Bora Akyol
Pluris Terabit Network Systems
10445 Bandley Drive
Cupertino, CA 95014
USA
Phone: (408) 861-3302
Fax: (408) 863-0271
email: akyol@pluris.com