Network Working Group                                      A. Csaszar
Internet Draft                                              G. Enyedi
Intended status: Standards Track                              S. Kini
Expires: September 7, 2011                                   Ericsson
                                                         March 7, 2011


                  IP Fast Re-Route with Fast Notification
                       draft-csaszar-ipfrr-fn-00.txt


Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on September 7, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Abstract


Csaszar, Enyedi, Kini Expires September 7, 2011               [Page 1]

Internet-Draft                IPFRR-FN                      March 2011


   This document describes a mechanism that provides IP fast reroute
   (IPFRR) by using a failure notification (FN) to nodes beyond the ones
   that first detect the failure (i.e. nodes that are directly connected
   to the failure point). The paths used when IPFRR-FN is active are in
   most cases identical to those used after Interior Gateway Protocol
   (IGP) convergence. The proposed mechanism can address all single node
   and link failures in an area and has been designed to allow traffic
   recovery traffic to happen quickly (The goal being to recover in
   under 50msec).

Table of Contents


   1. Introduction................................................2
   2. Overview of current IPFRR Proposals based on Local Repair.....5
   3. Requirements of an Explicit Failure Signaling Mechanism.......6
   4. Conceptual Operation of IPFRR relying on Fast Notification....7
      4.1. Preparation Phase.......................................7
      4.2. Failure Reaction Phase..................................7
         4.2.1. Activating Failure Specific Backups................8
   5. Operation Details...........................................9
      5.1. Message Handling and Encoding...........................9
         5.1.1. Failure Identification Message for OSPF...........10
         5.1.2. Failure Identification TLV for ISIS...............12
      5.2. Bypassing Legacy Nodes.................................12
      5.3. Capability Advertisement...............................13
   6. Protection against Replay Attacks...........................13
      6.1. Calculating LSDB Digest................................14
   7. Security Considerations.....................................14
   8. IANA Considerations........................................15
   9. References.................................................15
      9.1. Normative References...................................15
      9.2. Informative References.................................15
   10. Acknowledgments...........................................16
   Appendix A. Memory needs of a Naive Implementation.............17
      A.1. An Example Implementation..............................17
      A.2. Estimation of Memory Requirements......................18

1. Introduction

   Convergence of link-state IGPs, such as OSPF or IS-IS, after a link
   or node failure is known to be relatively slow. While this may be
   sufficient for many applications, some network SLAs and applications
   require faster reaction to network failures.

   IGP convergence time is composed mainly of:


Csaszar, Enyedi, Kini Expires September 7, 2011               [Page 2]

Internet-Draft                IPFRR-FN                      March 2011


   1. Failure detection at nodes adjacent to the failure

   2. Advertisement of the topology change

   3. Calculation of new routes

   4. Installing new routes to linecards

   Traditional Hello-based failure detection methods of link-state IGPs
   are relatively slow, hence a new, optimized, Hello protocol has been
   standardized [BFD] which can reduce failure detection times to the
   range of 10ms even if no lower layer notices the failure quickly
   (like loss of signal, etc.).

   Even with fast failure detection, reaction times of IGPs may take
   several seconds, and even with a tuned configuration it may take at
   least a couple of hundreds of milliseconds.

   To decrease fail-over time even further, IPFRR techniques [RFC5714],
   can be introduced. IPFRR solutions compliant with [RFC5714] are
   targeting fail-over time reduction of steps 2-4 with the following
   design principles:

           IGP                        IPFRR

   2. Advertisement of the       ==>    No explicit advertisement,
      topology change                only local repair

   3. Calculation of new routes   ==>    Pre-computation of new
                                routes

   4. Installing new routes      ==>    Pre-installation of backup
      to linecards                  routes

   Pre-computing means that the way of bypassing a failed resource is
   computed before any failure occurs. In order to limit complexity,
   IPFRR techniques typically prepare for single link, single node and
   single Shared Risk Link Group (SRLG) failures, which failure types
   are undoubtedly the most common ones. The pre-calculated backup
   routes are also downloaded to linecards in preparation for the
   failure, in this way sparing the lengthy communication between
   control plane and data plane when a failure happens.

   The principle of local rerouting requires forwarding a packet along a
   detour even if only the immediate neighbors of the failed resource
   know the failure. IPFRR methods observing the local rerouting
   principle do not explicitly propagate the failure information.


Csaszar, Enyedi, Kini Expires September 7, 2011               [Page 3]

Internet-Draft                IPFRR-FN                      March 2011


   Unfortunately, packets on detours must be handled in a different way
   than normal packets as otherwise they might get returned to the
   failed resource. Rephrased, a node not having *any* sort of
   information about the failure may loop the packet back to the node
   from where it was rerouted - simply because its default
   routing/forwarding configuration dictates that. As an example, see
   the following figure. Assuming a link failure between A and Dst, A
   needs to drop packets heading to Dst. If node A forwarded packets to
   Src, and if the latter had absolutely no knowledge of the failure, a
   loop would be formed between Src and A.

            +---+             +---+
            | B |-------------| C |
            +---+             +---+
           /                       \
          /                         \
         /                           \
    +---+            +---+  failure   +---+
    |Src|------------| A |-----X------|Dst|
    +---+            +---+            +---+
      =========>==============>=========>
                  Primary path


    Figure 1 Forwarding inconsistency in case of local repair: The path
                       of Src to Dst leads through A

   The basic problem that previous IPFRR solutions struggle to solve is,
   therefore, to provide consistent routing hop-by-hop without explicit
   signaling of the failure.

   To provide protection for all single failure cases in arbitrary
   topologies, the information about the failure must be given in *some*
   way to other nodes. That is, IPFRR solutions targeting full failure
   coverage need to signal the fact and to some extent the identity of
   the failure within the data packet as no explicit signaling is
   allowed. Such solutions have turned out to be considerably complex
   and hard or impossible to implement practically. The Loop Free
   Alternates (LFA) solution [RFC5286] does not give the failure
   information in any way to other routers, and so it cannot repair all
   failure cases such as the one in Figure 1.

   As discussed in Section 2. , solutions that address full failure
   coverage and rely on local repair, i.e. carrying some failure
   information within the data packets, fail to present a practical
   alternative to LFA. This draft, therefore, suggests that relaxing the


Csaszar, Enyedi, Kini Expires September 7, 2011               [Page 4]

Internet-Draft                IPFRR-FN                      March 2011


   local re-routing principle with carefully engineered explicit failure
   signaling is an effective approach.

   The idea of using explicit failure notification for IPFRR has been
   proposed before for Remote LFA Paths [RLFAP]. RLFAP limits the radius
   in which the notification is propagated. This draft attempts to work
   out in more detail what kind of failure dissemination mechanism is
   required to facilitate remote repair efficiently. Requirements for
   explicit signaling are given in Section 3. This draft does not limit
   the failure advertisement radius as opposed to RLFAP. As a result,
   the detour paths remain stable in most cases, since they are
   identical to those that the IGP will calculation after IGP
   convergence. Hence, micro-loop will not occur after IGP convergence.

   Note that the current -00 version of the draft only targets
   protection of single link and single node failures. SRLG protection
   is left for a future revision.

2. Overview of current IPFRR Proposals based on Local Repair

   The only practically feasible solution, Loop Free
   Alternates [RFC5286], offers the simplest resolution of the
   consistency problem: a node performing fail-over may only use a next-
   hop as backup if it is guaranteed that it does not send the packets
   back. These neighbors are called Loop-Free Alternates (LFA). LFAs,
   however, do not always exist, as shown in Figure 1 above, i.e., node
   A has no LFAs with respect to Dst. while it is true that tweaking the
   network configuration may boost LFA failure case coverage
   considerably [Ret2011], LFAs cannot protect all failure cases in
   arbitrary network topologies.

   The exact way of adding the information to data packets and its usage
   for forwarding is the most important property that differentiates
   most existing IPFRR proposals.

   Packets can be marked "implicitly", when they are not altered in any
   way, but some extra information owned by the router helps deciding
   the correct way of forwarding. Such extra information can be for
   instance the direction of the packet, e.g., the interface, which the
   packet arrived through, e.g. as in [FIFR]. Such solutions require
   what is called interface-based or interface-specific forwarding.

   Interface-based forwarding significantly changes the well-established
   nature of IP's destination-based forwarding principle, where the IP
   destination address alone describes the next hop. One embodiment
   would need to download different FIBs for each physical or virtual IP
   interface - not a very compelling idea. Another embodiment would


Csaszar, Enyedi, Kini Expires September 7, 2011               [Page 5]

Internet-Draft                IPFRR-FN                      March 2011


   alter the next-hop selection process by adding the incoming interface
   id also to the lookup fields, which would impact forwarding
   performance considerably.

   Other solutions mark data packets explicitly. Some proposals suggest
   using free bits in the IP header [MRC], which unfortunately do not
   exist in the IPv4 header. Other proposals resort to encapsulating re-
   routed packets with an additional IP header as in e.g. [NotVia] or
   [Eny2009b]. Encapsulation raises the problem of fragmentation and
   reassembly, which could be a performance bottleneck, if many packets
   are sent at MTU size. Another significant problem is the additional
   management complexity of the encapsulation addresses, which have
   their own semantics and need to be calculated in a failure specific
   manner.

3. Requirements of an Explicit Failure Signaling Mechanism

   Any signaling mechanism which should be used to advertise failure
   notifications and so to facilitate extremely quick remote repair
   should have the following properties.

   1. The signaling mechanism should be reliable. The mechanism needs to
      propagate the failure information to all interested nodes even in
      a network where a single link or a node is down.

   2. The mechanism should be fast in the sense that getting the
      notification packet to remote nodes through possible multiple hops
      should not require (considerably) more processing at each hop than
      plain fast path packet forwarding.

   3. The mechanism should involve simple and efficient processing to be
      feasible for implementation in the dataplane. This goal manifests
      itself in three ways: Origination of notification should very
      easy, e.g. creating a simple IP packet, the payload of which can
      be filled easily. When receiving the packet, it should be easy to
      recognize by dataplane linecards so that processing can commence
      after forwarding. No complex operations should be required in
      order to extract the information from the packet needed to
      activate the correct backup routes.

   4. The mechanism should be trustable; that is, it should provide
      means to verify the authenticity of the notifications without
      significant increase of the processing burden in the dataplane.

   5. Duplication of notification packets should be either strictly
      bounded or handled without significant dataplane processing
      burden.


Csaszar, Enyedi, Kini Expires September 7, 2011               [Page 6]

Internet-Draft                IPFRR-FN                      March 2011


   These requirements present a trade-off. A proper balance needs to be
   found that offers good enough authentication and reliability while
   keeping processing complexity sufficiently low to be feasible for
   data plane implementation. One such solution is proposed in [fn-
   transport], which is the assumed notification protocol in the
   following.

4. Conceptual Operation of IPFRR relying on Fast Notification

   This section outlines the operation of an IPFRR mechanism relying on
   Fast Notification.

4.1. Preparation Phase

   Like each IPFRR solution, here it is also required to have means for
   quick failure detection in place, such as lower layer notifications
   or BFD.
   The FN service needs to be activated and configured. The FN service
   should be bound to failure detection in such a way that FN can
   disseminate the information identifying the failure to the area.

   Failure specific alternative path computation should typically be
   executed at lower priority than other routing processing.

   Pre-computing the next hops on the new shortest paths for all the
   possible single failures may seem complex, however, it is not so
   difficult to realize: First, it can be done "offline", while the
   network is intact and the CP has few things to do. Second, for a
   single node, it is not needed to compute all the shortest paths with
   respect to any possible failures; only those link failures are needed
   to be taken into consideration, which are in the shortest path tree
   starting from the node.

   After having calculated the failure specific alternative next-hops,
   those which represent a change to the primary next-hop, should be
   pre-installed to the linecards together with the identifier of the
   failure, which triggers the switch-over. (The resource needs of an
   example implementation are briefly discussed in Appendix A.)

4.2. Failure Reaction Phase

   The main steps to be taken after a failure are the following:

   1. Quick dataplane failure detection


Csaszar, Enyedi, Kini Expires September 7, 2011               [Page 7]

Internet-Draft                IPFRR-FN                      March 2011


   2. Send information about failure using FN service right from
      dataplane.

   3. Forward the received notification as defined by the actually used
      FN protocol such as the one in [fn-transport]

   4. After learning about a local or remote failure, identify failure
      and activate failure specific backups, if needed, directly within
      dataplane

   After a node detects the loss of connectivity to another node, it
   should make a decision whether the failure can be handled locally. If
   local repair is not possible or not configured, for example because
   LFA is not configured or there are destinations for which no LFA
   exists, it should trigger the FN service to disseminate the failure
   description. For instance, if BFD detects a dataplane failure it
   normally invokes routines to notify the control plane. For the
   purpose of IPFRR, BFD (or any other lower layer failure detection
   method) should first trigger FPN before notifying the CP.

   After receiving the trigger, without any DP-CP communication
   involved, FN constructs a packet and adds the description of the
   failure (described in Section 5.1. ) to the payload. The description
   shall enable recipient nodes to decode that, e.g., node X lost
   connectivity to node Z. The encoding of the IPFRR-FN packet is
   described in Section 5.1.

   The packet is then disseminated by the FN service in the routing
   area. Note the synergy of the relation between BFD and IGP Hellos and
   between FN and IGP link state advertisements. BFD makes a dataplane
   optimized implementation of the routing protocol's Hello mechanism,
   Fast Notification makes a dataplane optimized implementation of the
   link state advertisement flooding mechanism of IGPs.

   In each hop, the recipient node needs to perform a "punt and
   forward". That is, the FN packet not only needs to be forwarded to
   the FN neighbors as the specific FN mechanism dictates, but a replica
   needs to be detached and, after forwarding, started to be processed
   by the dataplane card.

4.2.1. Activating Failure Specific Backups

   After the forwarding element extracted the contents of the
   notification packet, it knows that a node X has lost connectivity to
   a node Z via a link L. The recipient now needs to decide whether the
   failure was a link or a node failure. Two approaches can be thought


Csaszar, Enyedi, Kini Expires September 7, 2011               [Page 8]

Internet-Draft                IPFRR-FN                      March 2011


   of. Both options are based on the property that notifications advance
   in the network as fast as possible.

   In the first option, the router does not immediately make the
   decision, but instead starts a timer set to fire after a couple of
   milliseconds. If, the failure was a node failure, the node will
   receive further notifications saying that another node Y has lost
   connectivity to node Z through another link M. That is, if node Z is
   common in the notifications, the recipient can conclude that it is a
   node failure and already knows which node it is (Z). If link L is
   common in the notifications, then the recipient can decide for link
   failure (L). If further inconclusive notifications arrive, then it
   means multiple failures which case is not in scope for IPFRR, and is
   left for regular IGP convergence.

   After concluding about the exact failure, the data plane element
   needs to check in its pre-installed IPFRR database whether this
   particular failure results in any route changes. If yes, the linecard
   replaces the next-hops impacted by that failure with their failure
   specific backups which were pre-installed in the preparation phase.

   In the second option, the first received notification is handled
   immediately as a link failure, hence the router may start replacing
   its next-hops. In many cases this is a good decision. If, however,
   another notification arrives a couple of milliseconds later that
   points to a node failure, the router then needs to start replacing
   its next-hops again. This may cause a route flap but due to the quick
   dissemination mechanism the routing inconsistency is very short lived
   and likely takes only a couple of milliseconds.

   This draft recommends that out of the several FN delivery options
   defined in [fn-transport], the Redundant Tree transport option is
   preferred, which ensures that any event can reach each node from any
   source with any single link or node failure present in the network
   area as long as theoretically possible. This also means that any
   node, when activating failure specific backup entries in its FIB, may
   assume that other nodes have been notified as well and have changed
   their FIBs to present consistent routing. The exception is the case
   of legacy nodes, see Section 5.2. for details.

5. Operation Details

5.1. Message Handling and Encoding

   A failure identifier is needed that unambiguously describes the
   failed resource consistently among the nodes in the area. The
   schemantics of the identifiers are defined by the IGP used to pre-


Csaszar, Enyedi, Kini Expires September 7, 2011               [Page 9]

Internet-Draft                IPFRR-FN                      March 2011


   calculate and pre-install the backup forwarding entries, e.g. OSPF or
   ISIS.

   This draft defines a Failure Identification message class. Members of
   this class represent a routing protocol specific Failure
   Identification message to be carried with the Fast Notification
   transport protocol. Each message within the Failure Identification
   message class shall contain the following fields, the lengths of
   which are routing protocol specific. The exact values shall be
   aligned with the WG of the routing protocol:

   o Originator Router ID: the identifier of the router advertising the
      failure;

   o Neighbour Router ID: the identifier of the neighbour node to which
      the originator lost connectivity.

   o Link ID: the identifier of the link, through which connectivity
      was lost to the neighbour. The routing protocol should assign the
      same Link ID for bidirectional, broadcast or multi access links
      from each access point, consistently.

   o Sequence Number: [fn-transport] expects the applications of the FN
      service that require replay attack protection to create and verify
      a sequence number in FN messages.

   Routers forwarding the FN packets should ensure that Failure
   Identification messages are not lost, e.g. due to congestion. FN
   packets can be put a high precedence traffic class (e.g. Network
   Control). If the network environment is known to be lossy, the FN
   sender should repeat the same notification a couple of times, like a
   salvo fire.

   After the forwarding element processed the FN packet and extracted
   the Failure Identification message, it should decide what backups
   need to be activated if at all - as described in Section 4.2.1.

5.1.1. Failure Identification Message for OSPF

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           FN Length           |  FN App Type  | AuType|unused |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Originator Router ID                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Neighbour Router ID                       |


Csaszar, Enyedi, Kini Expires September 7, 2011              [Page 10]

Internet-Draft                IPFRR-FN                      March 2011


   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Link ID                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Sequence Number                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Sequence Number (cont'd)                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   FN Header fields:

      FN Length
         The length of the Failure Identification message for OSPF is 16
         bytes.

      FN App Type
         The exact values are to be assigned by IANA for the Failure
         Identification message class. For example, FN App Type values
         between 0x0008 and 0x000F could represent Failure
         Identification messages, from which 0x0008 could mean OSPF,
         0x0009 could be ISIS.

      AuType
         IPFRR-FN relies on the authentication options offered the FN
         transport service. Cryptographic authentication is recommended.


   Originator Router ID
      If the routing protocol is OSPF, then the value can take the OSPF
      Router ID of the advertising router.

   Neighbour Router ID
      The OSPF Router ID of the neighbour router to which connectivity
      was lost.

   Link ID
      If the link is a LAN, the Link ID takes the LSAID of its
      representing Network LSA.
      If the link is a point-to-point link, the Link ID can take the
      minimum or the maximum of the two interface IDs. The requirement
      is that it is performed consistently.

   Sequence Number
      This field stores a digest of the LSDB of the routing protocol, as
      described in Section 6.


Csaszar, Enyedi, Kini Expires September 7, 2011              [Page 11]

Internet-Draft                IPFRR-FN                      March 2011


5.1.2. Failure Identification TLV for ISIS

   TBA.

5.2. Bypassing Legacy Nodes

   Legacy nodes, while cannot originate fast notifications and cannot
   process them either, can be assumed to be able to forward the
   notifications. As [fn-transport] discusses, FN forwarding is based on
   multicast. It is safe to assume that legacy routers' multicast
   configuration can be set up statically so as to be able to propagate
   fast notifications as needed.

   When calculating failure specific alternative routes, IPFRR-FN
   capable nodes must consider legacy nodes as being fixed directed
   links since legacy nodes do not change packet forwarding in the case
   of failure. There are situations when an FN-IPFRR capable node can,
   exceptionally, bypass a non-IPFRR-FN capable node in order to handle
   a remote failure.

   As an example consider the topology depicted in Figure 2, where the
   link between C and D fails. C cannot locally repair the failure.

    +---+   +---+   +---+   +---+
    | E |---| F |---| G |---| H |
    +---+   +---+   +---+   +---+
      |            /          |
      |           /           |
      |          /            |
    +---+   +---+   +---+   +---+
    | A |---| B |---| C |-X-| D |
    +---+   +---+   +---+   +---+
      >========>==============>
         Traffic from A to D

                Figure 2 Example for bypassing legacy nodes

   First, let us assume that each node is IPFRR-FN capable. C would
   advertise the failure information using FN. Each node learns that the
   link between C and D fails, as a result of which C changes its
   forwarding table to send any traffic destined to D via B. B also
   makes a change, replacing its default next-hop (C) with G. Note that
   other nodes do not need to modify their forwarding at all.

   Now, let us assume that B is a legacy router not supporting IPFRR-FN
   but it is statically configured to multicast fast notifications as
   needed. As such, A will receive the notification. A's pre-


Csaszar, Enyedi, Kini Expires September 7, 2011              [Page 12]

Internet-Draft                IPFRR-FN                      March 2011


   calculations have been done knowing that B is unable to correct the
   failure. Node A, therefore, has pre-calculated E as the failure
   specific next-hop. Traffic entering at A and heading to D can thus be
   repaired.

5.3. Capability Advertisement

   The solution requires nodes to know which other nodes in the area are
   capable of IPFRR-FN. The most straightforward way to achieve this is
   to rely on the Router Capability TLVs available both in
   OSPF [RFC4970] and in IS-IS [RFC4971].

6. Protection against Replay Attacks

   To defend against replay attacks, recipients should be able to ignore
   a re-sent recording of a previously sent FN packet. This suggests
   that some sort of sequence number should be included in the FN
   packet, the verification of which should not need control plane
   involvement. Since the solution should be simple to implement in the
   dataplane, maintaining and verifying per-source sequence numbers is
   not the best option.

   We propose, therefore, that messages should be stamped with the
   digest of the actual routing configuration, i.e., a digest of the
   link state database of the link state routing protocol. The digest
   has to be picked carefully, so that if two LSDBs describe the same
   connectivity information, their digest should be identical as well,
   and different LSDBs should result in different digest values with
   high probability.

   The conceptual way of handling these digests could be the following:

   o When the LSDB changes, the IGP re-calculates the digest and
      downloads the new value to the dataplane element(s), in a secure
      way.

   o When a FN packet is originated, the digest is put into the FN
      message into the Sequence Number field.

   o Network nodes distribute (forward) the FN packet.

   o When processing, the dataplane element first performs an
      authentication check of the FN packet, as described in [fn-
      transport].


Csaszar, Enyedi, Kini Expires September 7, 2011              [Page 13]

Internet-Draft                IPFRR-FN                      March 2011


   o Finally, before processing the failure notification, the dataplane
      element should check whether its own known LSDB digest is
      identical with the one in the message.

   If due to a failure event a node disseminates a failure notification
   with FN, an attacker might capture the whole packet and re-send it
   later. If it resends the packet after the IGP re-converged on the new
   topology, the active LSDB digest is different, so the packet can be
   ignored. If the packet is replayed to a recipient who still has the
   same LSDB digest, then it means that the original failure
   notification was already processed but the IGP has not yet finished
   converging; the IPFRR detour is already active, the replica has no
   impact.

6.1. Calculating LSDB Digest

   We propose to create an LSDB digest that is conceptually similar
   to [ISISDigest]. The operation is proposed to be the following:

   o Create a hash from each LSA(OSPF)/LSP(ISIS) one by one

   o XOR these hashes together

   o When an LSA/LSP is removed, the new LSDB digest is received by
      computing the hash of the removed LSA, and then XOR to the
      existing digest

   o When an LSA/LSP is added, the new LSDB digest is received by
      computing the hash of the new LSA, and then XOR to the existing
      digest

7. Security Considerations

   The IPFRR application of Fast Notification does not raise further
   known security consideration in addition to those already present in
   Fast Notification itself. If an attacker could send false Failure
   Identification Messages or could hinder the transmission of legal
   messages, then the network would produce an undesired routing
   behavior. These issues should be solved, however, in [fn-transport].

   IPFRR-FN relies on the authentication mechanism provided by the Fast
   Notification transport protocol [fn-transport]. The specification of
   the FN transport protocol requires applications to protect against
   replay attacks with application specific sequence numbers. This
   draft, therefore, describes its own proposed sequence number in
   Section 6.


Csaszar, Enyedi, Kini Expires September 7, 2011              [Page 14]

Internet-Draft                IPFRR-FN                      March 2011


8. IANA Considerations

   The Failure Identification message types need to be allocated a value
   in the FN App Type field.

   IPFRR-FN capability needs to be allocated within Router Capability
   TLVs both for OSPF [RFC4970] and in IS-IS [RFC4971].

9. References

9.1. Normative References

   [RFC5286] A. Atlas, A. Zinin, "Basic specification for IP Fast-
             Reroute: Loop-Free Alternates", Internet Engineering Task
             Force: RFC 5286, 2008.

   [fn-transport] W. Lu, S. Kini, A. Csaszar, G. Enyedi, J. Tantsura, A.
             Tian, "Transport of Fast Notifications Messages", draft-lu-
             fn-transport-00, 2011

   [RFC4970] A. Lindem et al., Extensions to OSPF for Advertising
             Optional Router Capabilities, RFC 4970, 2007

   [RFC4971] JP. Vasseur et al., Intermediate System to Intermediate
             System (IS-IS) Extensions for Advertising Router
             Information, RFC 4971, 2007

9.2. Informative References

   [BFD]   D. Katz, D. Ward, "Bidirectional forwarding detection",
             RFC 5880, IETF, 2010

   [RFC5714] M. Shand, S. Bryant, "IP Fast Reroute Framework", RFC 5714,
             IETF, 2010.

   [Eny2009a]Gabor Enyedi, Gabor Retvari, Andras Csaszar, "On Finding
             Maximally Redundant Trees in Strictly Linear Time", IEEE
             Symposium on Computers and Communications (ISCC), 2009.

   [Eny2009b] Gabor Enyedi, Peter Szilagyi, Gabor Retvari, Andras
             Csaszar, "IP Fast ReRoute: Lightweight Not-Via without
             Additional Addresses", IEEE INFOCOM-MiniConference, Rio de
             Janeiro, Brazil, 2009.


Csaszar, Enyedi, Kini Expires September 7, 2011              [Page 15]

Internet-Draft                IPFRR-FN                      March 2011


   [FIFR]  J. Wand, S. Nelakuditi, "IP fast reroute with failure
             inferencing", In Proceedings of ACM SIGCOMM Workshop on
             Internet Network Management - The Five-Nines Workshop,
             2007.

   [MRC]   T. Cicic, A. F. Hansen, A. Kvalbein, M. Hartmann, R. Martin,
             M. Menth, S. Gjessing, O. Lysne, "Relaxed multiple routing
             configurations IP fast reroute for single and correlated
             failures", IEEE Transactions on Network and Service
             Management, available online: http://www3.informatik.uni-
             wuerzburg.de/staff/menth/Publications/papers/Menth08-Sub-
             4.pdf, September 2010.

   [NotVia] S. Bryant, M. Shand, S. Previdi, "IP fast reroute using Not-
             via addresses", Internet Draft, draft-ietf-rtgwg-ipfrr-
             notvia-addresses-06, 2010.

   [RLFAP] I. Hokelek, M. Fecko, P. Gurung, S. Samtani, S. Cevher, J.
             Sucec, "Loop-Free IP Fast Reroute Using Local and Remote
             LFAPs", Internet Draft, draft-hokelek-rlfap-01 (expired),
             2008.

   [Ret2011] G. Retvari, J. Tapolcai, G. Enyedi, A. Csaszar, "IP Fast
             ReRoute: Loop Free Alternates Revisited", to appear at IEEE
             INFOCOM 2011

   [ISISDigest]  J. Chiabaut and D. Fedyk. IS-IS Multicast
             Synchronization Digest. Available online:
             http://www.ieee802.org/1/files/public/docs2008/aq-fedyk-
             ISIS-digest-1108-v1.pdf, Nov 2008.


10. Acknowledgments

   The authors would like to thank Jeff Tantsura, Albert Tian, Wenhu Lu
   and Acee Lindem for the continuous discussions and comments on the
   topic, as well as Joel Halpern for his comments and review.

   This document was prepared using 2-Word-v2.0.template.dot.


Csaszar, Enyedi, Kini Expires September 7, 2011              [Page 16]

Internet-Draft                IPFRR-FN                      March 2011


Appendix A.                 Memory needs of a Naive Implementation

   Practical background might suggest that storing and maintaining
   backup next-hops for many potential remote failures could overwhelm
   the resources of router linecards. This section attempts to provide a
   calculation describing the approximate memory needs in reasonable
   sized networks with a possible implementation.

A.1. An Example Implementation

   Let us suppose that the forwarding engine is optimized for forwarding
   performance in the sense that recursive lookups are not performed for
   external destinations but each IP lookup gives back an adjacency (a
   number describing the next hop for the router), even if the packet
   will be terminated outside the current area. From the aspect of
   storing backup next-hops per destination, this is worse than using
   recursive lookup, since in this case the update of a lot more
   destinations is needed. In case of recursive lookup, external
   prefixes are resolved to internal destinations, so we can simply
   choose not to deal with external prefixes.

   This implementation uses an array for all the nodes in the area (node
   array in the sequel), made up by two pointers per record. Both of
   these pointers point to another array with a header describing its
   lengths. The first array (called alternative array) is basically an
   enumeration containing the IDs of those failures influencing a
   shortest path towards that node and an alternative neighbor, which
   can be used, when such a failure occurs. When a failure is detected,
   (either locally, or by FN), we can easily find the proper record in
   all the lists. Moreover, since these arrays can be sorted based on
   the failure ID, we can even use binary search to find the needed
   record.

   Now, we only need to know, which records in the FIB should be
   updated. Therefore there is a second pointer in the node array
   pointing to another enumeration (called FIB array in the sequel)
   containing pointers to the corresponding FIB entries. Recall, that if
   the node is an egress router, FIB array contains more than one entry.
   Moreover, there can be some prefixes reachable through more than one
   egres routers, thus these entries may be in more than one FIB arrays.


Csaszar, Enyedi, Kini Expires September 7, 2011              [Page 17]

Internet-Draft                IPFRR-FN                      March 2011


                  +-------+
                  | ptr6  |
                  +-------+-------+
                  | ptr4  | ptr5  |
          +-------+-------+-------+--   --+-------+
          | ptr1  | ptr2  | ptr3  |       | ptrk  |
          +-------+-------+-------+--   --+-------+
          |length |length |length |       |length |
          +-------+-------+-------+--   --+-------+
              *       *       *               *
             /|\     /|\     /|\             /|\
              |       |       |               |
              |       |       |               |
          +-------+-------+-------+--   --+-------+
          |   r1  |   r2  |   r3  |  ...  |   rk  |
          +-------+-------+-------+--   --+-------+
              |       |       |               |
              |       |       |               |
             \|/     \|/     \|/             \|/
              *       *       *               *
          +-------+-------+-------+--   --+-------+
          |length |length |length |       |length |
          +-------+-------+-------+--   --+-------+
          | fail1 | fail2 | fail3 |       | failk |
          | alt.1 | alt.2 | alt.3 |  ...  | alt.k |
          +-------+-------+-------+--   --+-------+
          | fail4 |       | fail5 |
          | alt.4 |       | alt.5 |
          +-------+       +-------+
          | fail6 |
          | alt.6 |
          +-------+

                 Figure 3 The way of storing alternatives

A.2. Estimation of Memory Requirements.

   Now, suppose that there are D prefixes, the area containing the
   router has V routers, a prefix is connected to K egress routers in
   average, a neighbor descriptor takes X bytes, a failure ID takes Y
   bytes and a pointer takes Z bytes. In this way, if there is no ECMP,
   this data structure takes

       2*Z*(V-1) + (2*(X+Y)*(V-1)+Y)*(V-1) + (K*D*Z+Y*(V-1))

   bytes altogether. The first part is the memory consumption of the
   node array. The memory consumption of all the FIB arrays is described


Csaszar, Enyedi, Kini Expires September 7, 2011              [Page 18]

Internet-Draft                IPFRR-FN                      March 2011


   by the last part (V-1 length fields and K*D pointers). The remaining
   part describes the maximum memory needed by an alternative arrays:
   any path can contain at most V-1 nodes and V-1 links, each record
   needs X+Y bytes plus we have a header for the array; there are
   records for all the other nodes in the area (V-1 nodes). Observe that
   this is a very rough overestimation, since most of the possible
   failures influencing the path will not change the next hop.

   For computing memory consumption, suppose that neighbor descriptors,
   failure IDs and pointers take 4 bytes, there are 200 (500) nodes in
   the area and we have 500K prefixes installed, and a prefix is
   reachable through 2 egress routers in average. In this case, we get
   that the node array needs about 1.6KB (4KB), the alternative array
   needs about 620KB (4MB), and the FIB array needs about 4MB (4MB).
   That is altogether less than 5MB (8MB) in reality, if there is no
   ECMP.

   If however, there are paths with equal costs, the size of the
   alternative array increases. Suppose that there are 10 equal paths
   between ANY two nodes in the network. This would cause that the
   alternative list gets 10 times bigger, and now it needs 6.2MB.
   Observe that now we need about 11MB (44MB) even in this extremely
   unrealistic case, which is likely acceptable for modern linecards
   with gigs of DRAM. Moreover, we need to stress here again that this
   is an extremely rough overestimation, so in reality much less memory
   will be enough.


Csaszar, Enyedi, Kini Expires September 7, 2011              [Page 19]

Internet-Draft                IPFRR-FN                      March 2011


Authors' Addresses

   Andras Csaszar
   Ericsson
   Irinyi utca 4-10, Budapest, Hungary, 1117
   Email: Andras.Csaszar@ericsson.com


   Gabor Sandor Enyedi
   Ericsson
   Irinyi utca 4-10, Budapest, Hungary, 1117
   Email: Gabor.Sandor.Enyedi@ericsson.com


   Sriganesh Kini
   Ericsson
   300 Holger Way, San Jose, CA 95134
   Email: sriganesh.kini@ericsson.com


Csaszar, Enyedi, Kini Expires September 7, 2011              [Page 20]