Internet Draft                                            J-J. Pansiot
June 1999                                                      D. Grad
Expires December 1999                                          T. Noel
                                                             A. Alloui
                                                     LSIIT, Strasbourg

LAR : a Logical Addressing and Routing Architecture for IPv6 Wide-area Multicasting

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

This document describes an architecture for IPv6 wide-area multicasting. It is based on two levels of addressing: a logical addressing level is used to identify logical objects, such as multicast groups or mobile hosts, independently of their current IPv6 address. Logical addresses are used on top of regular IPv6 addresses. This architecture makes it possible to define mechanisms for inter-domain multicasting and mobility in a unified way.

1. Introduction

In this draft we present an architecture (called LAR, for Logical Addressing and Routing) for efficient IPv6 multicasting and mobility. This architecture is based on two types of addresses:

- Routing addresses (such as current IPv6 unicast addresses) are used in the routing process. These addresses may change, reflecting changes in routing (mobility, renumbering).

- Logical addresses are used to uniquely identify hosts or other objects.
  These addresses must not change when routing changes. For example, multicast addresses may be considered as identifiers.

We observe that in some situations many protocols in the Internet already use packets containing two levels of addressing. This can be illustrated by two examples (for the time being, we consider only destination addresses):

- In PIM-SM [RFC2362] or CBT [RFC2189], packets sent by a source which is not part of the multicast tree are encapsulated towards the RP or Core: the outer header contains a routing address (of the RP) while the inner header contains a logical address (a multicast address).

- In Mobile IPv6 [MOBILE], when a packet is forwarded to a mobile host from its Home Agent, it contains an encapsulation header. The outer header contains a routing address (the current address of the mobile), while the inner header (the routing header) contains the home address of the mobile. This last address may be considered as an identifier, since it should not change over time when routing changes.

In both cases, the outer address is a routing address and the inner one an identifier. Our architecture is a generalization of this idea. It allows some simplification of multicast protocols, in particular in the case of sparse and inter-domain groups, and unifies mechanisms used for multicasting and mobility.

Our architecture considers logical objects, each having a logical address (called a LAR address). In the following we call a LAR host (respectively a LAR router) a host (resp. a router) implementing LAR. A node is either a host or a router. The basic logical object is a logical node. For example, a mobile host has a LAR address, independently of its physical location in the network. Other logical objects (e.g. groups) have logical addresses derived from host addresses, greatly simplifying multicast address allocation.

LAR packets contain a logical header, containing source and destination logical addresses, in addition to the usual header containing source and destination routing addresses. The LAR destination address corresponds to the final destination, while the routing destination address may correspond to an intermediate node, for example a multicast router that duplicates packets sent to a group. Routing addresses may be changed from one logical hop to the next, in much the same way as MAC addresses are changed when going through a router.

A node maintains a LAR cache table, similar to the ARP cache, containing the current correspondence between the routing addresses and the LAR addresses of its correspondents. In the case of multicasting, a node maintains a Logical Routing table that associates a LAR group address with the list of LAR addresses of its neighbors in the multicast tree. This table contains an entry for a group only if the node has an active role in the group (it duplicates packets).

Hosts wishing to join a group send their join request to the group manager. This group manager, possibly after some security checking, forwards the request down the tree towards the new member.

Advantages of this architecture:

- Since join requests travel downstream from root to members, routers don't need to know root (or RP, core) addresses in advance: there is no need for a bootstrap router mechanism. Similarly, Border Routers don't need to know where the root domain is (a la BGMP).

- Since LAR multicast addresses are just identifiers derived from host LAR addresses, there is no need for a multicast address allocation protocol such as [MASC].

- For a given group, only routers that duplicate packets (i.e. routers with at least 3 interfaces in the multicast tree) must maintain information for that group. This implies a big saving in state, especially for sparse wide area groups.
  In particular, it is possible for a multicast tree to cross whole transit domains without any state information being maintained in these domains.

- Since a multicast router knows its neighbors (in a given multicast tree) by their logical addresses, a change of unicast route between two neighbors does not break the tree as long as another unicast route exists. In particular, the failure of a single link interrupts a multicast communication only for the time the unicast routing protocol needs to find another route, without any action at the LAR multicast routing level (see section 6).

- Since nodes maintain in a cache the correspondence between LAR addresses and routing addresses, mobility mechanisms are just updates of this cache. This works not only for unicast (such as in Mobile IPv6) but also for multicast. When a mobile moves, it just needs to update the cache table of its neighbors in the multicast trees it belongs to. There is no need to remove a tree branch at the former location and add a tree branch at the new location. The duration of traffic disruption is about the same as in the unicast case, namely the time of a binding update.

- The LAR cache hides routing addresses from applications: an application need not know of events such as renumbering, or a change of interface for a multihomed host. Transport or application level connections are not broken by these events.

- LAR multicast trees can be constructed even if some intermediate routers (or even whole routing domains) do not implement LAR, or do not wish to duplicate packets for a given group. At worst, the multicast tree won't be optimal. In particular, it is not mandatory for a group member to be directly connected to a LAR router.

- In the LAR architecture, it is possible to define a group (a set of hosts sharing some goals) and subgroups (subsets of hosts sharing an application at a given time).
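
The two tables behind several of these advantages (the LAR cache table and the Logical Routing table) can be sketched as follows. This is only an illustration under stated assumptions: the class name LarNode, its method names, the router and member LAR addresses other than the draft's clarinet example, and the IPv6 addresses (taken from the 2001:db8::/32 documentation prefix) are all hypothetical, and no message format is implied.

```python
from dataclasses import dataclass, field


@dataclass
class LarNode:
    """Hypothetical model of the per-node LAR state (names are illustrative)."""
    lar_addr: str
    # LAR cache table: LAR address of a correspondent -> its current IPv6 address
    cache: dict = field(default_factory=dict)
    # Logical routing table: group LAR address -> LAR addresses of tree neighbors
    routes: dict = field(default_factory=dict)

    def binding_update(self, lar_addr, new_ipv6):
        """A neighbor moved or was renumbered: only the cache entry changes."""
        self.cache[lar_addr] = new_ipv6

    def forward(self, group, from_lar):
        """Return (neighbor LAR address, IPv6 destination) for each copy to send."""
        neighbors = self.routes.get(group, [])
        return [(n, self.cache[n]) for n in neighbors if n != from_lar]


GROUP = "15.155.2654.3"        # jazz.clarinet.u-strasbg.fr in the draft's notation
router = LarNode("15.155.1")   # made-up LAR address for a duplicating router
router.routes[GROUP] = ["15.155.2654", "15.155.77", "15.155.88"]
router.cache = {
    "15.155.2654": "2001:db8:1::10",
    "15.155.77": "2001:db8:1::20",
    "15.155.88": "2001:db8:1::30",
}

before = router.forward(GROUP, from_lar="15.155.2654")
router.binding_update("15.155.88", "2001:db8:2::30")   # a mobile member moves
after = router.forward(GROUP, from_lar="15.155.2654")
```

In this sketch the list stored in routes[GROUP] is identical before and after the move; only the IPv6 destination used for the mobile neighbor differs, which is the sense in which mobility reduces to a cache update.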

Note that this draft is not the specification of a protocol, but the general description of an architecture. Section 2 deals with addressing and naming, section 3 with communication management, section 4 with logical routing, section 5 with mobility, section 6 with robustness, and section 7 with inter-domain issues, mainly scalability and policy. Section 8 deals with security considerations, and section 9 presents an outline of how we implemented LAR. Finally, section 10 presents a brief comparison with related proposals.

2. LAR addressing and naming

In this architecture, we propose a new addressing space in addition to the usual IPv6 addressing. It provides a single, simplified framework for multicasting and mobility.

2.1. Addressing

The basic model is as follows: any logical object has a logical address that is globally unique and non-ambiguous in the Internet. There are primarily two types of logical objects: logical hosts and communications.

The basic object that can be addressed in our architecture is a host implementing LAR. Such a host acquires an individual logical address by the same mechanisms by which it acquires a fully qualified domain name. The main differences between usual IPv6 addresses and LAR addresses are the following:

- A host may have only one LAR address even if it is multihomed. This is because applications usually don't care about the actual interface used, and should not be disrupted when the interface used changes. Note that it is possible for a physical host to have several logical addresses, if it is hosting several logical hosts. This could be the case of several web servers with different names sharing the same host.

- A LAR address is independent of routing information. Therefore it remains unchanged when a host changes its physical location (mobility), its Internet provider (renumbering) or the interface used (multihoming).

Therefore host LAR addresses are logical host identifiers, and one may compare them to other identifiers such as MAC addresses or EUI-64 identifiers, except that:

- a LAR address identifies a host, not an interface

- a LAR address is independent of the hardware (it remains unchanged even if the actual machine or interface is changed)

- one can easily retrieve the name of a host from its LAR address (see section 2.2).

The other type of logical object with a logical address is a communication. A communication LAR address identifies either a point-to-point or a multipoint (group) communication. In the point-to-point case, the use of logical addressing provides stable communications with mobiles and may offer a solution for letting TCP connections survive when network addresses change [ETCP].

A group LAR address is similar to an IPv6 group address. A group, in our architecture, has a manager, which deals with the control of the group (see section 3 for more details). Several applications may be run simultaneously in the group, possibly involving subgroups of members. Indeed, members of the same group do not necessarily all have the same needs or the same resources. For example, some will ask for a video stream whereas others simply wish to interact using a whiteboard application. The group manager can decide to create two communication instances (two trees) with two different communication (logical) addresses, allowing each member to take part in one or the other (or both) communication(s). One may have several trees in the same group to deal with applications having different requirements, for example a shared centered tree for a whiteboard, and several source rooted shortest path trees for video sources.

Moreover, filtering may be used to deliver data only to interested members of the group: a packet sent to a subgroup travels along a tree but may be filtered at some intermediate nodes.
That is, only one tree has to be maintained for several applications with distinct subsets of receivers.

On the other hand, there is a particular shared tree connecting all group members, called the control tree. It supports the exchange of control messages inside the whole group. It can be used in particular by the group manager to advertise the creation of new instances, their logical addresses, and the launching of new applications.

A group LAR address is constructed by adding a suffix to the logical address of the group's creator. Thus, group address allocation may be achieved in a simple and distributed way without the need for a multicast address allocation protocol. In the same way, the logical address of a communication instance is derived from the address of the group by adding a suffix.

2.2. Naming, addressing and their relationship

In many cases it is helpful for logical objects to have a name for human use. We define an addressing schema and a name service providing the name to LAR address and LAR address to name mappings in a simple way. This service may be achieved using an extension to the DNS, by adding a new record type and using dynamic updates.

Addresses and names used in LAR are built in the same hierarchical way: host (creator) > group > instance (tree). One can query the name service with the (LAR) name of a host to get its logical address and its IPv6 address, and similarly a request with a group (LAR) name will provide the logical address of the group as well as the name of its manager.

Unlike the current Domain Name System, where the name and address hierarchies are completely independent, we propose an addressing schema in which logical addresses and names reflect the same hierarchy. This is possible only because LAR addresses are independent of routing. A host LAR address is a coding of a Fully Qualified Domain Name.
Each domain gives a unique number to each of its child domains. The "human" representation of LAR addresses will be dotted decimal. For example, if top domain fr is given number 15 among all top domains, domain u-strasbg is given number 155 inside domain fr, and host clarinet is given number 2654 inside domain u-strasbg.fr, we have:

   clarinet.u-strasbg.fr <-> 15.155.2654

This notation is very similar to the designation of objects in a SNMP MIB [RFC 1155]. With this notation, we don't need to create a new hierarchy of Domain Name Servers. The address to name table can be automatically constructed from the name to address table.

In the same way, one can easily add a level of hierarchy to the logical names, taking into account that groups created by a host add a level to the tree structure. In our example, if host clarinet is responsible for the creation of a group called jazz with (local) number 3, then one has the mapping:

   jazz.clarinet.u-strasbg.fr <-> 15.155.2654.3

As far as encoding is concerned, we may use the compact encoding used in SNMP for object identifiers. This is possible only if this variable length encoding fits into a fixed size LAR address field. We think this is possible in the case of IPv6 addresses. Indeed, with an SNMP-like encoding, we expect that each host can be assigned a LAR address using at most ten bytes. Therefore we suggest that LAR addresses be taken inside the IPv6 address space, with a specific prefix. If we assume for example a 12-bit prefix LAR_PREFIX and a 4-bit address length field, a LAR address will be of the form:

   LAR_PREFIX.ADDRESS_LENGTH.LAR_HOST.SELECTORS

coded with 16 bytes. The prefix and length field occupy 2 bytes, and the LAR_HOST part should in most cases be less than 10 bytes, leaving at least 4 bytes for selectors. This leaves for example the possibility for one host to create 2^16 groups, each one with 2^16 instances.
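
As an illustration of the layout above, the following sketch packs a dotted-decimal host address and two selectors into 16 bytes. It rests on several assumptions that the draft does not fix: the prefix value 0xFEF is made up, the 4-bit length field is taken to count the bytes of LAR_HOST, each number is encoded in base 128 with a continuation bit as SNMP does for object identifier sub-identifiers, the selectors are a 16-bit group number followed by a 16-bit instance number placed right after LAR_HOST, and unused trailing bytes are zero.

```python
LAR_PREFIX = 0xFEF   # hypothetical 12-bit prefix; the draft assigns no value
ADDRESS_BYTES = 16   # a LAR address has the size of an IPv6 address


def encode_component(n: int) -> bytes:
    """Base-128 encoding, high bit set on every byte except the last one."""
    if n < 0:
        raise ValueError("components must be non-negative")
    out = [n & 0x7F]
    n >>= 7
    while n:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(out))


def encode_lar_address(host: str, group: int = 0, instance: int = 0) -> bytes:
    """Pack LAR_PREFIX.ADDRESS_LENGTH.LAR_HOST.SELECTORS into 16 bytes."""
    host_bytes = b"".join(encode_component(int(p)) for p in host.split("."))
    if len(host_bytes) > 0xF:
        raise ValueError("LAR_HOST does not fit the 4-bit length field")
    head = (LAR_PREFIX << 4) | len(host_bytes)   # 12-bit prefix + 4-bit length
    selectors = group.to_bytes(2, "big") + instance.to_bytes(2, "big")
    packed = head.to_bytes(2, "big") + host_bytes + selectors
    if len(packed) > ADDRESS_BYTES:
        raise ValueError("address does not fit in 16 bytes")
    return packed.ljust(ADDRESS_BYTES, b"\x00")


# jazz.clarinet.u-strasbg.fr <-> 15.155.2654.3 (group number 3, control tree)
jazz = encode_lar_address("15.155.2654", group=3)
```

Under these assumptions the host part 15.155.2654 takes 5 bytes (one for 15, two each for 155 and 2654), comfortably below the ten-byte estimate given above.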

Note that with this choice current applications need not be aware of the difference between LAR addresses and IPv6 addresses, and should be able to use both types of addresses.

3. Communication manager

3.1. Communication manager functionalities

Any LAR host has a Communication Manager. This LAR entity manages LAR communications initiated or used by this host. The creator of a group can define a certain number of parameters associated with this group. For example, it can authorize only certain hosts to join. It can also define a set of applications that can operate above the LAR communication, and it can define security rules for admission.

This functionality offers some security, since requests to join a LAR communication must go through the communication manager of the group creator (or its delegated manager). Joining a LAR group communication does not depend only on the initiative of the new candidate, but also on the management rules used for this particular group communication.

The Communication manager of a LAR host records the characteristics of the communications it belongs to. At the time of its initialization, each local manager must have at least the following configuration parameters:

- Its logical address,

- Its logical name,

- LAR host addresses for which it is the delegated manager.

If an application wishes to join a group communication, it asks its (local) manager to carry out the join procedure. The manager must get in touch with the manager of the group. Thereafter, it is responsible for communicating the LAR communication address to the application.

The manager records, for each group for which it is responsible, the characteristics of the group and the available applications.
It deals with LAR address allocation, it queries and updates the domain name service (DNS) for logical names and addresses, and it receives and processes join requests. We describe these mechanisms by detailing the various group operations in the LAR architecture: creating, joining and leaving a group.

3.2. Group naming

A group communication is identified by a logical hierarchical name, similar to a DNS (Domain Name System) name. For instance, "jazz.clarinet.u-strasbg.fr" could represent a conference organized by some user of host clarinet, whose goal could be to multicast jazz songs all over the Internet. The association between a group name and the corresponding unique LAR address can be performed by the DNS. The DNS should support dynamic updates [RFC 2136] because groups may be created dynamically. This association may also be obtained by other means.

To join a LAR communication, the group LAR address and the name of the group manager are needed. Thus, the LAR group address and the LAR name of the manager are associated with the name of the group communication. In most cases, we expect that the manager name is the name of the group creator, and is a suffix of the group name. In the previous example, the communication manager of jazz.clarinet.u-strasbg.fr is located on host clarinet.u-strasbg.fr. But in some cases, the communication manager of a group communication may be located on another host.

3.3. Group creation

Preliminary remark: in our architecture, groups are explicitly created, contrary to the current multicasting model.

Let us take the example of an application on host H named NAME_H wishing to create a group named PARTIAL_NAME. The application transmits to the host local manager a group creation message that specifies this name. Upon receipt of this message, the manager gives a full name and address to the group.
It prepends the partial name provided by the host's application to its own name (i.e. NAME_G = PARTIAL_NAME.NAME_H), then it creates a LAR address identifying the group by adding a selector to its own logical address (i.e. @LG = @LH.SEL). The selector uniquely identifies this group among all groups created by this manager. Then the manager dynamically updates the DNS, which associates the group name (NAME_G) with its LAR communication address (@LG) as well as with the name of the group manager (in this case NAME_H).

Finally, the manager triggers the creation of a multicast tree, reduced to a single node H if H is the root of the tree. The logical routing table for H associates the group address with its own LAR address (i.e. @LG -> @LH). If the root R of the tree is not H, the tree is reduced to an edge joining H to R, and the logical routing table of H contains @LG -> @LH, @LR.

When an application creates a group, it has the possibility to define some rules for group management. For example, it can decide whether hosts may send data to the group without being a member (open group). It can also ask the group manager to authorize only certain hosts to join the communication.

As seen above, when an application wants to create a group, it asks its local manager to manage it. However, it is possible for the local manager to delegate this responsibility to another (remote) manager. For example, this could be the case for a mobile host delegating management to a fixed host.

3.4. Joining a group

We suppose that host H learns a group name by any means independent of our architecture, for example www, session directory, electronic mail, or news, and that H wishes to join this group. The host queries the DNS, which returns the LAR group address (noted @LG) and the LAR manager address (noted @LM). The host could also learn these addresses directly.

The host can then send a join request to the group manager.
Upon receipt of the request, the manager decides whether to accept H as a new member of the group. If it accepts, the manager triggers the insertion of H in the multicast tree associated with the group (see algorithm in section 3.5). Finally, host H receives the LAR address of its neighbor in the tree, and stores it in the logical routing table entry corresponding to this group.

3.5 Tree construction

The first node of a tree is the root. When the membership of a new host H has been accepted, the group manager sends a join acknowledgment to the root. This message travels hop-by-hop down the tree, until the point where the unicast route to the new member leaves the tree. Note that there are several cases:

- At some tree node A, the next hop towards H is different from the next hop towards any tree node. Then a new edge (A, H) is created.

- At some tree node A, the next hop towards H is the same as the next hop towards some tree node B, but B is not on the unicast route towards H. This means that the route from A to B and the route from A to H are the same up to a router C, where C is not yet a LAR node in the tree. In this case, the edge (A, B) is split into two edges (A, C) and (C, B) by adding a new tree node C. Then an edge (C, H) is created.

In both cases, H knows it has been inserted in the tree when it receives the edge creation message. This message contains the logical and network addresses of its neighbor in the tree. Note that adding a new member adds at most one new router to the tree.

3.6. Leaving a group

A group member may announce that it is leaving to its neighbor with a LAR_LEAVE message. The neighbor node may also detect that the host has left (or failed) if it has not received a LAR_HELLO message for a specified time.

3.7 Tree instances

Once a group is created, with its associated bi-directional tree connecting all members (called the control tree), one or several applications may be launched using this tree. If an application has
a need for a special kind of tree (for example a source rooted tree for a video stream), it may ask the manager to trigger the creation of a new communication instance (with a new address derived from the group address), corresponding to a new tree. The creation of this new instance is advertised via the control tree, and other members may join this new tree.

4. Logical Routing

4.1. Reduced trees and Logical forwarding

A LAR tree is a tree whose vertices are nodes implementing a LAR entity, and whose edges are (network) routes between these nodes. In the common case of unicast communications, a LAR tree is usually only a single edge connecting two hosts. In the case of multicast communications, a LAR tree is usually composed of several nodes, including end nodes of degree one corresponding to members (sources or receivers), and interior nodes with degree at least two, which duplicate and forward packets to their neighbors. A packet is forwarded from one node to all end nodes of the LAR tree corresponding to the LAR destination address. In the case of multicast, LAR addresses can be seen as tree identifiers.

Note that a LAR neighbor is not necessarily a direct neighbor at the network level. Instead, one may construct reduced multicast trees, where interior nodes have a duplication role: basically they have degree at least 3. Therefore an edge of a reduced multicast tree is in fact a route at the network level. Intermediate routers along logical tree edges forward packets based on the IPv6 level address as usual, and don't even need to implement a LAR entity. In most unicast cases, only LAR entities at endpoints are concerned.

In a LAR node N, given a LAR destination address, the forwarding process consists in determining the neighbors in the LAR tree and then sending a copy of the packet to all the neighbors except the one it was received from.
Therefore the IPv6 header must contain the IPv6 address of LAR node N as its source, in order for the receiving LAR node to know which incoming logical edge was used.

4.2. Logical Routing Data Structures

One role of the LAR entity is to dynamically map LAR addresses to IPv6 addresses in the presence of network-level changes. This is done by maintaining a LAR cache table. For each LAR node address in use, this table contains an entry with the current correspondence between the node LAR address and a usable IPv6 address. In addition, this entry may contain a list of other IPv6 addresses of the same node, for example for routers or multihomed hosts. This table is updated when IPv6 addresses change (mobility, renumbering, ...).

A LAR routing protocol is in charge of constructing and maintaining a LAR tree called a reduced tree. At each node, local tree information is maintained in the LAR logical routing table. This table contains the current correspondence between a LAR communication address and the list of neighbor LAR addresses to which a packet must be sent. In the case of a multicast LAR address, this is the list of adjacent nodes in the multicast tree. This table is the local view of a LAR tree.

A flag UNI_TREE is encoded in the address. If this flag is set, data may only flow from the root of the tree. If this flag is not set, the tree is bi-directional: data arriving by any edge is propagated to all other edges. The initial tree for a group (the control tree) is always bi-directional to allow multipoint to multipoint communication.

Another problem with multicast routing is the possibility for sources outside a group to send data to the group (such a group is called open). We propose the following mechanism: when a LAR communication is created, its creator decides whether it is open or not. A corresponding flag OPEN_TREE is encoded in the address.
A LAR router may or may not accept to be part of a tree whose address has the OPEN_TREE flag set. When a router receives a multicast packet from a node which is not a neighbor in the LAR tree, the packet is forwarded to all neighbors if the tree is open; otherwise it is discarded. In the former case, an OUTSIDE flag is set in the header to warn all members that the sender is not a member of the group.

Note: We could define a hop by hop option such that a packet sent toward the root of the tree by an outside source is intercepted by the first router which is part of the open tree, and then propagated along the bi-directional tree. However, we suggest that the default behavior be non-open.

4.3. Filtering

When a new application is launched inside a group or inside a specific communication instance, not all group members may be willing or able to run this application, which introduces the notion of a subgroup. A packet sent to a subgroup travels along the tree corresponding to the group (or instance) but may be filtered at some intermediate nodes. In order to do this, a logical routing table entry may contain a MASK field for each neighbor, and the logical packet header includes a SUBGROUP field. Only packets whose SUBGROUP field matches the MASK value (that is, the logical AND of these two values is not all zeroes) are forwarded onto the corresponding logical edges. Mask values are transmitted by receivers, using the tree maintenance messages (LAR_HELLO). They are OR'ed at router nodes, to be forwarded to their neighbors.

In addition to subgroups, this mechanism makes it possible to create send-only branches for sources not willing to receive data.

4.4. Sending to a logical object

A LAR packet sent by a node S to a logical object D (tree or neighbor in case of unicast) is encapsulated in an IPv6 packet.
At the IPv6 level, source and destination addresses are IPv6 addresses of the current logical edge (@NA as sender and @NB as neighbor if we consider logical edge A B). At the LAR level, source and destination addresses are logical addresses of the initial sender @LS and of the final destination @LD (tree or neighbor).

   |         IPv6 header          |     LAR header      | DATA
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   @NA    |   @NB    |  ...   |   @LS    |   @LD    |  ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The receiving node B, following a logical routing table lookup, duplicates the packet to each tree neighbor, encapsulating the unchanged logical header in each IPv6 packet (@NB as source and @NN for each neighbor).

5. Mobility

The LAR architecture is well suited for communications where correspondents must be identified logically (i.e. independently of their current IPv6 address), like mobile hosts. Indeed, LAR hosts use logical addresses as identifiers. In particular, applications or transport level connections may use these identifiers instead of IPv6 addresses. This can be done transparently since LAR addresses and IPv6 addresses share the same addressing space.

The binding between the IPv6 addresses and the LAR address of a host is done through the LAR cache table. It associates the logical identifier with the list of current IPv6 addresses of a LAR host.

When a mobile node moves and acquires a new IPv6 address, it continues to use its LAR address as an identifier of the communication. It only updates the LAR cache table of its correspondents by sending its new IPv6 address, similarly to the binding update of Mobile IP. However, contrary to Mobile IP, this mechanism works also for multicasting: the mobile host updates the LAR cache table of its neighbors in LAR trees. There is no need to tunnel data between the Home Agent and the Mobile or to have the mobile make a new join at its new location.
This relatively simple mechanism allows a transparent use of mobile hosts in the LAR framework.

6. Robustness

In this section we consider problems that may arise from failures in the network.

6.1 Link failure

The most common failure is a network link failure. In our architecture, this would lead to the following scenario:

1) A unicast route used by a logical edge (say from A to D) becomes unavailable,

2) the unicast routing process detects this problem and computes a new unicast route,

3) two cases may arise, depending on whether or not the last hop of the unicast route changes:

   - If the new route has the same last hop as the old one in both directions (for example route A-B-C-D becomes A-B-E-C-D after the failure of link B-C), there is no change at all in the LAR tree, and multicast traffic is disrupted only while unicast routing converges,

   - If the new route has a new last hop (for example route A-B-C-D becomes route A-B-E-D after the failure of link C-D), LAR packets sent along the logical edge A-D will be sent to the IPv6 address of D corresponding to the failed link. In this case it is possible that a NET (or HOST) UNREACHABLE message is received. The LAR routing process of A may then update the LAR cache table entry for host D, by removing the unavailable IPv6 address from the list and replacing it by another one. This is possible since a LAR cache entry contains the list of all known IPv6 addresses for a node. Again the LAR tree itself is not changed, and multicasting is disrupted while unicast routing converges, plus the time to update the cache entry.

Note also that all trees using the same logical edge A-D are repaired simultaneously.

6.2 Node failure

The failure of a node which is not a node of the LAR tree, but is an intermediate router along a logical edge, is very similar to the failure of a link, and is solved in the same way.
   More generally, most changes in the unicast routing that do not affect the reachability of LAR nodes of a tree will be handled in the same way.

   A more serious problem is the failure of a LAR tree node. Note that in the classical model for multicasting, all nodes forwarding data packets for a group are part of the multicast tree, whereas in our architecture, only a fraction of these nodes are LAR tree nodes. A tree maintenance protocol is used, such as in CBT and PIM-SM: tree nodes periodically send a HELLO message to their parent in the tree (with an acknowledgement from the parent). The absence of HELLO for a specified duration indicates that the downstream node has failed. If the specified duration is greater than the unicast routing protocol convergence delay, it really means that the node is dead or the network is partitioned. In this case the failed node is removed from the logical routing table.

   Symmetrically, sons of the failed node detect this failure. They may flush the entire subtree. Another possibility would be for these sons to ask via the group manager to be grafted to the tree. This implies that routers must keep track of group managers. This solution would be more scalable for very large groups when the failure is close to the root.

   The root failure is an even more important issue. If we consider that the group manager has not failed, or that several redundant managers have been set up, a possible solution is as follows: the group manager detects the root failure and chooses a new root. When sons of the failed root detect this failure, they ask the manager to be grafted back to the tree. The manager sends this request to the new root, similarly to the failure of any LAR node. Note that the group identifier is not linked to the root address, so the root can be changed without changing the whole tree.
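   The HELLO-based detection of a failed downstream node described in this section can be sketched as follows. This is an illustration only: the class name TreeNode, the method names, the "HELLO-ACK" return value and the timer values are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
class TreeNode:
    """Hypothetical per-tree state kept by a LAR node for its downstream nodes."""

    def __init__(self, hello_timeout):
        # Assumption: hello_timeout is configured to be greater than the
        # convergence delay of the unicast routing protocol.
        self.hello_timeout = hello_timeout
        self.last_hello = {}   # downstream LAR address -> time of last HELLO

    def on_hello(self, child, now):
        # Record the HELLO and acknowledge it to the downstream node.
        self.last_hello[child] = now
        return "HELLO-ACK"

    def expire(self, now):
        # Remove downstream nodes whose HELLO has been absent for too long.
        dead = [c for c, t in self.last_hello.items()
                if now - t > self.hello_timeout]
        for c in dead:
            del self.last_hello[c]
        return dead


parent = TreeNode(hello_timeout=30)
parent.on_hello("LAR-C", now=0)
parent.on_hello("LAR-D", now=20)
removed = parent.expire(now=40)   # C has been silent for 40 s, D for 20 s
```

   A downstream node could apply the same timer to missing acknowledgements in order to detect the failure of its parent before flushing its subtree or asking the group manager to be grafted back.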
6.3 Tree reshaping

   In the LAR architecture, many changes in the underlying network (change in unicast routing, renumbering, move of a mobile host) are taken into account by updating the LAR cache table. Network routes corresponding to a LAR edge change but the logical tree remains the same. In many cases this will lead to a non-optimal tree. In particular, it is possible that at a given time, two edges from a node share the same first network hop. The LAR routing protocol is in charge of detecting this kind of situation and of reshaping the tree by adding and/or removing a LAR node.

   Example: for a given tree, assume node A has 3 LAR neighbors B, C and D. The unicast route from A to B is A-E-B, and the unicast route from A to C is A-F-C. These two routes are disjoint. Assume now that node E fails. The unicast routing will replace route A-E-B by, say, A-F-G-B. The LAR tree is again operational, but LAR edges A-B and A-C start with the same first hop F. In this case the LAR routing protocol will detect this, and split edge A-C into two edges A-F and F-C where F is a new LAR node in the tree. Then edge A-B will be replaced by edge F-B.

7. Inter-domain issues

   Inter-domain multicasting raises several scalability and policy routing problems.

7.1. Tree state in routers

   State to be maintained in a router which is part of a multicast tree is similar to other sparse mode protocols (CBT, PIM-SM). However, fewer routers will be involved in a given tree, especially for very sparse groups. For example a group consisting of 3 members in one site in California and 3 other members in another site in France might involve only two routers, one on each site, whereas current protocols would involve at least a dozen routers along the route between the two sites, including routers in Internet backbones.
   Note that in our architecture, if a multicast tree connects N different LANs, it involves at most N-1 routers.

   Obviously, if multicast becomes widely used in the Internet, routers may still have a huge number of trees to remember, and tree aggregation is a hard problem. The notion of filtering and subgroups is a first step to avoid the creation of many similar trees.

7.2. Address allocation, root advertisement

   Our architecture needs neither a specific multicast address allocation protocol for IPv6 nor protocols to advertise the root (Core, RP) of trees. This saves both state in routers and bandwidth. The root of a tree is chosen by its communication manager and may be replaced by the same manager. The manager may be dynamically retrieved from the group address via the DNS. Therefore several problems with multicasting are handled by communication managers, not routers. A significant part of the load due to a group is supported by the group manager, which seems more scalable.

7.3. Very large groups

   A problem with very large groups is the reconnection of whole subtrees when a node fails. Our notion of logical edge should simplify the grafting of a subtree back to a tree (see section 6.2).

7.4. Policy routing

   Multicast routing between autonomous systems must be subject to control in the same way as unicast routing is. An autonomous system needs an authorization (agreement) of a nearby autonomous system before using its resources for relaying traffic. An inter-domain multicast protocol must take into account policy constraints and hence offer a policy model.

   Currently, unicast policy routing is achieved through selection/restriction of destinations advertised to neighbors. Autonomous systems exchange routing information by an inter-domain routing protocol, such as BGP4. For example if domain A advertises destination D towards domain B, it means that packets coming through B with destination D are allowed to transit through A.
   The notion of routing policy for multicast is more complex. We will consider the following types of policy for a domain A, concerning a destination D, and a neighbor domain B:

   - Policy 1: A allows groups originating on the B side (whose creator is on the B side) to have members in destination D.

   - Policy 2: A forbids groups originating on the B side to have such members.

   BGP4+ is an extension of BGP4 that allows other types of routes to be advertised in addition to unicast IPv4 routes. This allows implementing different policies for unicast and multicast.

   In order for LAR to support policies 1 and 2, we assume that a boolean attribute MM (multicast member) is added to all destinations. Paths to D with MM set are advertised by A towards B if and only if multicast trees originating on the B side are allowed to transit through domain A towards members in D. The management of this attribute should not be a big extension to BGP. Note that we may have two routes to destination D advertised in the same direction: one with MM set, and one with MM unset. Routes received with MM set will be stored in the M-RIB (Multicast Routing Information Base) of the border router (BR).

   To comply with multicast routing policy as described above, the tree construction algorithm of section 3.5 is modified as follows. While the join acknowledgment message travels from the root to the new member H hop by hop:

   - If the current node is not a border router, or if the current node is a border router and the unicast route and multicast route towards H are the same, apply the algorithm of section 3.5 (use the unicast route table).

   - If the current node is a border router and the unicast route (MM unset) differs from the multicast route (MM set), send the join message along the multicast route to the ingress border router of the next AS. This is possible because of the two levels of addresses.
     This router will become a LAR node of the tree in order to ensure that the logical edge (which is a unicast route) complies with the multicast policy. This is similar to explicit routing.

   Note that in the LAR architecture, it is not necessary to advertise group addresses as in BGMP, since trees are constructed from root to members. Also, if we consider that most members will usually be receivers, policy is applied in the same direction as in the unicast case: just consider members as destinations and root as source.

8. Security considerations

   Since LAR addresses never change, both when travelling inside the network (no translation) and with time (mobiles keep their LAR address), an authentication header should be based on the LAR extension header. The IPv6 header will change with time (mobility) and as the packet passes through LAR nodes.

   Since a LAR address remains unchanged across network renumbering, and the name of the host can be retrieved from the address, it is also a good key for accounting.

   Note that a packet with a given LAR source address may come from anywhere, since this address is independent of physical location (easy spoofing). Therefore when security is a concern, the authentication header must be used.

   With regard to secure multicasting, LAR provides a first step for the design of a secure and scalable architecture. Indeed, a first level of access control is achieved by the group manager, without any router involvement. Moreover adding a new member to the secure tree will add at most one router, requiring at most one router authentication.

9. Implementation

   Part of the LAR architecture has been implemented using IPv6, on FreeBSD, both for multicasting and mobility. Since we have taken LAR addresses as a subset of IPv6 addresses, applications may use LAR without much change. LAR routing functionalities are provided by a daemon.
   The LAR logical routing table and the cache table are implemented in the kernel. The LAR header is implemented as an IPv6 destination option, inserted automatically by the kernel in each LAR packet. In addition, a hop-by-hop option is defined and used by the daemon in some control messages, in particular to implement efficiently the tree construction algorithm described in section 3.5.

10. Related work

   There have been some recent proposals [SIMPLE][EXPRESS] defining new multicasting architectures. Though these architectures were specified for IPv4, we give in this section a brief comparison of their design concepts with those of LAR.

   The basic idea of Simple Multicast (SM) is the identification of a group by a tuple (Core, ID). Multicast address allocation becomes simple: collisions are managed locally by the Core. Multicast tree construction is similar to CBT. However, routers do not need any extra mechanism to advertise or discover Cores; instead, the Core address is carried in control and data packets.

   EXPRESS defines a new multicast delivery service. In this proposal, a channel (tree) is defined for each pair (S, E), where S is the sender's source address and E is a channel destination address. Only S may send in channels (S,*). A host subscribes to a channel (S,E) by sending a request to the network (towards S), specifying both S and E in the subscription request.

   The main similarities with SM and EXPRESS are:

   - There is no need for a multicast address allocation protocol.

   - Routers do not need an extra mechanism to discover the location of the root of the multicast tree.

   However LAR differs from SM and EXPRESS in that:

   - A first level of access control is provided by the LAR group manager without any router involvement.

   - LAR tree construction is done from root to members. So, in case of asymmetric links, this will give better results than routes based on the reverse path.
   - A LAR group address is not linked to the root of the tree, so changing the root of a tree is possible without rebuilding most of the tree.

   - The use of logical addressing allows reduced multicast trees to be constructed, reducing the number of routers involved in a given tree, especially for sparse groups.

   Moreover, LAR trees present interesting properties:

   (1) They are more stable: in many cases, a change in unicast topology will not lead to a change of the multicast tree.

   (2) If the unicast level implements some form of load sharing, LAR will implicitly make use of it, since a LAR edge is a unicast route. This will be particularly true of very sparse groups.

11. Further Work

   Obviously many things remain to be done. Among them:

   - study how to deal efficiently with broadcast media

   - give a precise specification of the LAR routing protocol to construct a LAR tree

   - study how LAR could interact with other multicast protocols. Note that LAR can be run in parallel with other multicast protocols since it uses a different encapsulation.

   - improve LAR tree construction to deal with QoS.

12. References

   [BGMP]    D. Thaler, D. Estrin, D. Meyer, Border Gateway Multicast Protocol (BGMP): Protocol Specification, August 1998.

   [ETCP]    C. Huitema, Multi-homed TCP, Internet draft, May 1995.

   [EXPRESS] H. Holbrook and D. Cheriton, IP Multicast Channels: EXPRESS Support for Large-scale Single-source Applications, to appear in the Proc. of ACM SIGCOMM '99, Cambridge, Massachusetts, September 1999.

   [MALLOC]  D. Estrin et al., The Multicast Address-Set Claim (MASC) Protocol, August 1998.

   [RFC1157] J.D. Case, M. Fedor, M.L. Schoffstall, C. Davin, Simple Network Management Protocol (SNMP), May 1990.

   [MOBILE]  D.B. Johnson, C. Perkins, Mobility Support in IPv6, Internet draft (work in progress), draft-ietf-mobileip-ipv6-07.txt, November 1998.

   [RFC2137] D.
             Eastlake, Secure Domain Name System Dynamic Update, April 1997.

   [RFC2189] A. Ballardie, Core-Based Trees (CBT version 2) Multicast Routing, September 1997.

   [RFC2201] A. Ballardie, Core-Based Trees (CBT) Multicast Routing Architecture, September 1997.

   [RFC2362] D. Estrin, D. Farinacci, A. Helmy, D. Thaler, S. Deering, M. Handley, V. Jacobson, C. Liu, P. Sharma, L. Wei, Protocol Independent Multicast-Sparse Mode (PIM-SM): Protocol Specification, June 1998.

   [SIMPLE]  R. Perlman, C-Y Lee, A. Ballardie, J. Crowcroft, Z. Wang, T. Maufer, C. Diot, J. Thoo, M. Green, Simple Multicast: A Design for Simple, Low-Overhead Multicast, Internet draft, February 1999.

13. Authors' Addresses

   Jean-Jacques Pansiot, Dominique Grad, Thomas Noel, Abdelghani Alloui
   LSIIT, Louis Pasteur University
   Boulevard Sebastien Brant
   67400 ILLKIRCH
   FRANCE

   Email: {pansiot, grad, noel, alloui}@dpt-info.u-strasbg.fr

Full Copyright Statement

   "Copyright (C) The Internet Society (date). All Rights Reserved.

   This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into