INTERNET DRAFT                                                  M. Ohta
draft-ohta-static-multicast-02.txt        Tokyo Institute of Technology
                                                           J. Crowcroft
                                              University College London
Expires on December 25, 1999                                  June 1999

                            Static Multicast

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

The current IP Multicast model appears to achieve a level of simplicity by extending the IP unicast addressing model, which has moved from the historical classful A, B, and C net numbers to the mask and longest match schemes of CIDR, with a new classful address space, class D. The routing systems have also been built in a deceptively simple way, in one of three manners: broadcast and prune (DVMRP, Dense Mode PIM), destination list based tree computation (MOSPF), or single centered trees (current sparse mode PIM and CBT). The multicast service creates the illusion of a spectrum that one can "tune in to", as an application writer. Due to this view, many have seen the multicast pilot service, the Mbone, as a worldwide Ethernet, where simple distributed algorithms can be used to allocate "wavelengths" and advertise them through "broadcast" on a channel (the session directory) associated with a spectrum.

These three pieces of the picture have tempted people to construct a distributed architecture for a number of next level services that cannot work at more than a modest scale, since they ignore the basic spirit of location independence for senders and receivers of IP packets, whether unicast or multicast. The problem is that many of these services attempt to group activities at the source, when it is only at join time that user grouping becomes apparent (if you like, multicast usage is a good example of very late binding). These services include Address Allocation and Session Creation, Advertisement and Discovery.

This memo proposes approaches that solve some current multicast problems rather statically, with a DNS and URL based approach, and that avoid the pitfalls of trying to use address allocation to implement traffic aggregation for different sources, or aggregation of multicast route policy control through control of such aggregated sources.

Note that a minor level of aggregation occurs in applications which source cumulative layered data (e.g. audio/video/game data, ref vic/rat/rlc). This memo is orthogonal to such an approach, which in any case only results in a small constant factor reduction in state.

Much of the additional baggage of IP multicast is associated with multimedia conferencing on the Mbone. However, commercial Internet use of multicast includes many other applications, and for these SDR may not be the best directory model.

1. Introduction

Multicast and related applications have traditionally been developed in the Routing and Transport areas. Naturally, designers have tried to solve many problems using techniques familiar to those working in the routing and transport areas, that is, with flooding or multicast.

Of course, global flooding or multicast does not scale very well, which means that scalable solutions that make use of these techniques alone are often impossible in the world wide Internet.

An attempt to reduce the scalability requirement by localizing the multicast and flooding area through TTL or administrative scoping (intra-site, intra-provider multicast etc.) works only in a small scale experiment like the Mbone. In the real Internet, senders and receivers of multicast communication, in general, may be using different providers and are distributed beyond AS boundaries.

As a result, there was a hope that address aggregation and unicast area topology report aggregation could solve the multicast scalability problems in the same way that they have bailed out the unicast Internet from problems with the limitation of router memory and the capacity needed for route update reports:

   a) Unicast addresses refer to a location, however. Multicast addresses are logical addresses, and refer to sets of members who may be anywhere, and may be sent to by sources which may also be in any of many places. This means that for unrelated multicast groups there is no meaningful allocation at session creation time of a mask/prefix style multicast address, either for the destination group or for the sources. (We anticipate that, in general, a relationship between groups can be expected only when the groups belong to a single application, and that there are far more unrelated groups than related ones.)

   b) To control the amount of state and routing control messages, the Internet has divided the routing systems into autonomous systems/regions, which can run their own routing, and need only report summarized information at the edge to another region. This serves two purposes in the Unicast world:

      1/ Inter-domain routing protocols can be deployed that are different in different areas (this may be applied recursively).

      2/ Summarization can be applied at "min-cut" points in the topology, and reachability information only needs to be exported/imported across borders.

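
For contrast with the multicast case discussed next, the unicast summarization described in 2/ can be sketched with Python's standard ipaddress module. This is only an illustration: the prefixes are documentation addresses chosen for the example and the variable names are assumptions, not something taken from this memo.

```python
import ipaddress

# Hypothetical prefixes used inside one region (documentation address space).
internal = [
    ipaddress.ip_network("192.0.2.0/26"),
    ipaddress.ip_network("192.0.2.64/26"),
    ipaddress.ip_network("192.0.2.128/25"),
]

# At the border, contiguous prefixes collapse into a single advertisement,
# because every address in the block is reached in the same direction.
exported = list(ipaddress.collapse_addresses(internal))
print(exported)  # [IPv4Network('192.0.2.0/24')]
```

The collapse is valid only because a unicast prefix names a location; that is exactly the property multicast group addresses lack.
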
Note that autonomous system boundaries exist merely for the operational purpose of easy policy description. The boundary itself does not contribute to reducing the amount of routing information, which can be accomplished with multi-layered OSPF without BGP.

With multicast, while one could define inter-working boundaries and functions as the IDMR WG has, the principal goal of scaling the reports at a border cannot be achieved in a location independent manner (in the sense that, without moving all the receivers to a particular region, there is no aggregation feasible).

As a result of this confusion, intra-domain multicast protocols, which are expected to operate within a single AS, have been developed that scale poorly, even though there was no known inter-domain multicast protocol which solves the scalability problem.

It has been shown [MANOLO] that aggregation of multicast routing table entries, the number of which is a major scalability problem for IP multicast, is, in general, impossible.

The impossibility proof assumes nothing about QoS. That is, multicast QoS flow state can be aggregated only as well (or as badly) as state for multicast best effort communication. RSVP may be extended to aggregate RSVP requests of strongly interrelated flows, for example, for streams with layered encoding, which may or may not share a single multicast address, the latter case of which may result in a small constant factor of routing table entry reduction.

There may be a counter argument that broadcast/prune within a region (== big ether) and sparse mode in other regions, for clumpy casts, can overcome the problem. However, forwarding for "sparse in other regions" needs a routing table entry of its own. Moreover, even within the region, "broadcast/prune" scales worse than the theoretical lower bound of PIM-SM/CBT [MANOLO].

The impossibility of multicast routing entry aggregation applies to the entire Internet as well as to a small part of it, such as a single subnet. Still, in a small part of the Internet containing 4 parties (hosts, subnets, subdomains, etc.), there are only 16 possible multicast forwarding patterns, so that, in a sense, multicast routing table aggregation is possible if multicast addresses can be assigned according to the distribution pattern of the receivers. So, there may still be some misunderstanding that in multi-access link layers, such as Ethernet or ATM, where link local addresses can be assigned dynamically, multicast forwarding state may be aggregated. However, at the boundary, a full routing table look-up based on the IP addresses is necessary. Worse, if the part contains 16 parties, there are 65536 possible patterns of receivers, so that virtually no aggregation is possible. MPLS hype is, as usual, as good/bad as using some multi-access link and does not help here.

Thus, it is now necessary to thoroughly reconsider the architecture of multicast. Given a theoretical lower bound of multicast routing table entries, now is the time to find a multicast algorithm that achieves that lower bound. It is also meaningful to make the multicast architecture independent of the unicast address hierarchy.

Fortunately, some problems can easily be solved for many common cases using techniques available in other areas without scalability problems.

Since the legacy multicast architecture was constructed carefully on the assumption that routing table aggregation is possible, it is necessary to change some of it to deploy new techniques. To solve hard scalability problems, it is necessary to recognize that all the details of all the protocols are tightly interrelated.

The multicast problems identified in this memo as better solved in the internet or application area are:

Multicast Address Allocation

   There was a proposal to allocate multicast addresses dynamically along the unicast address hierarchy. Such an allocation policy was expected to enhance the possibility of aggregation.

   However, as shown in the next section, it is impossible to aggregate a class D multicast routing table using simple mask and longest match type approaches. Thus, while it is still possible to aggregate multicast address allocations, it is not meaningful.

   There is a misunderstanding that multicast addresses are a scarce resource that must be assigned dynamically. But, first of all, dynamic assignment does not mean efficient assignment. Secondly, as a multicast routing table cannot be aggregated, the limitation on routing table size in many of today's routers is such that we will simply run out of memory on a router before we run out of addresses, or even use a significant piece of the class D address space. Considering the current global unicast routing table size, 2^16 global multicast addresses are more than enough.

   Given these arguments, we can see that, on performance grounds, it is meaningful to allocate multicast addresses statically through the DNS.

Multicast Core/RP Location

   CBT and PIM-SM were developed as intra-domain multicast protocols designed to be independent of the underlying unicast routing protocols. Naturally, they achieve the lower bound of spatial routing table size complexity.

   However, CBT and PIM-SM are not totally independent of the unicast routing architecture, since they depend on flooding within an AS to locate the core or rendez-vous point. While this scales a little better than static assignment, it is still fairly bad.

   On the other hand, it is straightforward to use DNS to map from a DNS multicast name to the multicast address, core and RP.

   This solution may not have been an option when dynamic multicast address assignment was a MUST and DNS dynamic update was not possible. However, this is now rectified, since DNS update is being implemented.

Multicast Session Announcement

   The announcement of multicast sessions can be performed over a special multicast channel. However, this approach does not scale if the number of multicast channels increases. Of course, it is possible to introduce a hierarchy of multicast session announcement channels. The complex structure of the real world makes the relationships between session announcements a complex problem. In such a system, users would join a session directory hierarchy by joining a group for some level, then following the hierarchy, short-cuts or links, changing between several multicast groups to reach the final destination multicast group for the session they seek.

   But, as is shown later, multicasting costs routing table entries and associated protocol processing power in the routers if multicast data flows over the routers. Hence it is desirable to constrain the number of multicast channels to be as small as possible.

   If, instead, we use the WWW as an EPG (Electronic Program Guide) and embed SDP or SMIL information in RTP URLs, it can be used as a multicast session declaration with an arbitrarily complex structure including hierarchy, short-cuts or links, and we can use search techniques on this static data more easily.

Of course, neither DNS nor WWW scales automatically. They must continue to scale anyway, and a lot of effort has been and will continue to be spent to make them scale better, more dynamically and more securely. Their servers are also becoming more capable (caching etc.). DNS will be used for unicast name to address lookup for the foreseeable future, as WWW will be the preferred way to retrieve information.

2. Meaningless Aggregation of Multicast Addresses

It is, in general, impossible to aggregate Internet standard multicast routing table entries. The minimum amount of state in each multicast router must be proportional to the number of multicast data flows which are running over it.

The locations of receivers differ from one multicast application to another. Multicast forwarding must be performed over a tree from each source (or from the core/RP) to the receiver set for each and every application. The sources are different too. Thus, the tree is different for each multicast.

It is possible to aggregate multicast address allocation by making multicast location dependent with, say, a root domain. Then it is possible to aggregate routing table entries to the root domain. For some types of central sets of agencies (traditional broadcast TV/Radio) it might be possible to site their feeds at the same places in the Internet. But this is antithetical to the arbitrary growth allowed by random siting/evolution of content providers today, even in the Web. Sheer numbers preclude building unicast pipes from each source to a central set of sites.

However, it is still impossible to aggregate routing table entries to the receivers. The distribution pattern of receivers is unrelated to the location of the root domain. That is, a separate routing table entry is necessary for each multicast application.

A group of multicast receivers sharing a root domain may still have weak relationships, in that most of them do not have any member in domains far from the root domain. Then, it is possible to share a default routing table entry that says not to forward anything. But such an entry is meaningless, because there is no data packet that will be forwarded for that entry, and we still need unaggregated routing table entries for each multicast running over multicast routers.

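
The argument above can be made concrete with a minimal sketch. It assumes a hypothetical forwarding table that maps each group address to the set of outgoing interfaces on one router; the function name `aggregate`, the interface names and the sample addresses are illustrative assumptions, not part of any protocol. Two sibling prefixes are merged only when their outgoing interface sets are identical, which is the condition a mask and longest match scheme needs.

```python
import ipaddress

def aggregate(entries):
    """Merge sibling prefixes whose outgoing interface sets are identical."""
    table = {ipaddress.ip_network(p): frozenset(o) for p, o in entries.items()}
    changed = True
    while changed:
        changed = False
        # Visit the longest prefixes first; sorted() snapshots the keys
        # so the table can be mutated inside the loop.
        for net in sorted(table, key=lambda n: -n.prefixlen):
            if net not in table or net.prefixlen == 0:
                continue
            parent = net.supernet()
            low, high = parent.subnets()
            if low in table and high in table and table[low] == table[high]:
                oifs = table.pop(low)
                del table[high]
                table[parent] = oifs
                changed = True
    return table

# Location-like case: adjacent addresses all leave through the same interface.
unicast_like = {f"225.0.0.{i}/32": {"if1"} for i in range(4)}

# Multicast case: adjacent groups with unrelated receiver sets (hypothetical).
multicast = {
    "225.0.0.0/32": {"if1"},
    "225.0.0.1/32": {"if2", "if3"},
    "225.0.0.2/32": {"if1", "if3"},
    "225.0.0.3/32": {"if2"},
}

print(len(aggregate(unicast_like)))  # 1 (a single /30 entry)
print(len(aggregate(multicast)))     # 4 (nothing can be merged)
```

Numerically adjacent group addresses help only if the receiver sets happen to coincide, which is why aggregated address allocation alone does not reduce forwarding state.
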
Alternatively, it is possible to assign multicast addresses aggregated according to the statically or runtime detected distribution pattern of the receiver hosts, areas or domains. However, even with 32 receiver hosts, areas or domains, we need 32 bits for the aggregation prefix of the multicast addresses, which is too many for IPv4. Even the IPv6 address space does not help a lot (96 receivers is not a great step forward!). Moreover, as the multicast membership changes dynamically, the multicast address itself must change dynamically.

In other words, if we stay in line with the current model of Internet standard multicast, it is impossible to aggregate multicast routing table entries. It is meaningless to try to aggregate multicast address assignment.

It is, of course, meaningful and necessary to delegate multicast address allocation hierarchically.

3. The Difficulty of (Multicast) Address Assignment

Compared to the administrative effort for unicast address assignment by IANA, Internic, RIPE, APNIC and all the country NICs, and the development of the policy they use, it is trivially easy to develop a DHCP protocol.

The difficulty with DHCP was the fact that a client cannot be reached by its IP address. In the absence of this bootstrap problem, it is trivial to develop a DHCP-like dynamic multicast address assignment protocol for clients whose unicast addresses are already established. It could be as simple as a new option field of DHCP.

However, such a use of DHCP is meaningless unless an administrator of the DHCP server has been delegated a block of unicast addresses and establishes a policy on how to assign them to clients.

We argue that the DHCP-like mechanism for multicast is a misdirected solution. Basically, multicast address assignment is not a protocol issue.

4. Recycling the Unicast Policy, Mechanism and Established Address Assignment for Multicast Policy, Mechanism and Address Assignment

If rather static allocation of multicast addresses is acceptable, it is possible to reuse the policy, mechanism, address assignment and protocol of unicast address assignment for multicast addresses.

For example, if we decide to use 225.0.0.0/8 for the static allocation, it is trivial to delegate the authority of multicast address 225.1.2.3 to an administrator of 3.2.1.in-addr.arpa, the administrator of 1.2.3.0/24.

We can simply define that the multicast DNS name should be looked up as:

    3.2.192.225.in-addr.arpa.    CNAME  mcast.3.2.192.in-addr.arpa.
    mcast.3.2.192.in-addr.arpa.  PTR    bbc.com.
    bbc.com.                     A      225.192.2.3

Additionally, if we construct applications that check the reverse mapping, unauthorized use of multicast addresses will be automatically rejected, which is what we are doing today with unicast addresses.

Note that the administrator of 3.2.192.in-addr.arpa is not necessarily the final person to be delegated the address, but can further delegate the authority of mcast.3.2.192.in-addr.arpa. to someone else.

It should also be noted that, while the delegation uses the existing policy, mechanism, assignment and protocol, it does not mean that the multicast address must be used within the unicast routing domain of the unicast address block. Just as MX servers or name servers can be located anywhere in the Internet regardless of the location of the hosts under the DNS domain they are serving, multicast channels can be used anywhere in the world.

The assignment policy automatically assures global uniqueness. However, it is still possible to have multicast addresses with local scopes, as long as they share globally unique well known DNS names, which is what we are using for intra-subnet multicast with IANA assigned well known names [IANA].

5. Core/RP location

Locating the core of CBT or the rendez-vous point of PIM-SM through DNS is straightforward:

    bbc.com.  A     225.192.2.3
              RVP   london-station.bbc.com.

or

    bbc.com.  A     225.192.2.3
              CORE  london-station.bbc.com.

Again, just as MX servers or name servers can be located anywhere in the Internet regardless of the location of the hosts under the DNS domain they are serving, cores or rendez-vous points can be located anywhere in the world.

CORE and RVP RRs have exactly the same syntax as the PTR RR. Their query type values are .

Neither the current CBT nor PIM-SM allows a single multicast group to have multiple cores or rendez-vous points, though future extensions may. Thus, at the DNS level, a single node may have multiple CORE or RVP RRs. That is, the following DNS node is a valid node:

    bbc.com.  A     225.192.2.3
              RVP   london-station.bbc.com.
              RVP   wales-station.bbc.com.

6. Session Announcement

The proposal is essentially to use an RTP URL combined with SDP, like:

    rtp://london-station.bbc.com/?t=2873397496+2873404696&
        m=audio+3456+RTP/AVP+0&m=video+2232+RTP/AVP+31

The URL contains all the information necessary to establish a session, including the domain name (or multicast address), port number(s), RTP payload type and optional QoS requirement.

Then, users surfing over the WWW can actively search for, or randomly encounter, some multicast or unicast RTP URL.

If the user clicks on the anchor of a URL, the user will be asked whether to receive data (which should be the default for multicast), to send data, or both (which should be the default for unicast). The user will also be asked for the source or destination of the data, with an appropriate default (the TV in the living room), and the multicast session begins, if necessary, with RSVP.

7. Policy Considerations

Some people wrongly believe that the separation of multicast routing into the classical intra and inter domain functions is necessary for multicast policy control.

Multicast policy is controlled by controlling the forwarding direction of multicast control or data packets, which is controlled by the unicast routing table.

If separate control of multicast and unicast policy is desired, what is necessary is to instantiate two sets of unicast routing tables, one used for normal unicast routing and the other to be consulted by multicast protocols.

While there may be intra and inter domain unicast routing and inter domain multicast forwarding policy, we don't need a novel protocol to disseminate policies in some multicast specific way for inter and intra domain multicast policy routing. Forwarding is forwarding.

8. References

[MANOLO] Manolo Sola, Masataka Ohta, Toshinori Maeno, "Scalability of Internet Multicast Protocols", Proceedings of INET'98, http://www.isoc.org/inet98/proceedings/6d/6d_3.htm, July 1998.

[IANA] For now, http://www.iana.org/

[CBT] RFC 2189, Core Based Trees (CBT version 2) Multicast Routing. A. Ballardie. September 1997.

[PIM] RFC 2117, Protocol Independent Multicast-Sparse Mode (PIM-SM): Protocol Specification. D. Estrin, D. Farinacci, A. Helmy, D. Thaler, S. Deering, M. Handley, V. Jacobson, C. Liu, P. Sharma, L. Wei. June 1997.

9. Security Considerations

For routers to merge multiple JOIN requests, which may contain different (forged or wrong) Cores/RPs for the same group, and to forward the JOIN to the true Core/RP, the routers must be able to look up the Core/RP from the group address by themselves with some (weak or strong) security.

The Core/RP information cannot be flooded in advance because of the obvious scalability problem, so the look up must be on demand. Because of this, e-mail, WWW or SDR cannot be used to look up the Core/RP, and DNS is the easiest way.

10. Authors' Addresses

Masataka Ohta
Computer Center
Tokyo Institute of Technology
2-12-1, O-okayama, Meguro-ku
Tokyo 152, JAPAN

Phone: +81-3-5734-3299
Fax: +81-3-5734-3415
EMail: mohta@necom830.hpcl.titech.ac.jp

Jon Crowcroft
Dept. of Computer Science
University College London
London WC1E 6BT, UK

EMail: j.crowcroft@cs.ucl.ac.uk