Routing area S. Hegde
Internet-Draft Juniper Networks, Inc.
Intended status: Standards Track P. Sarkar
Expires: January 9, 2017 Individual
July 8, 2016

Micro-loop avoidance using SPRING
draft-hegde-rtgwg-microloop-avoidance-using-spring-02

Abstract

When there is a change in network topology either due to a link going down or due to a new link addition, all the nodes in the network need to get the complete view of the network and re-compute the routes. There will generally be a small time window when the forwarding state of each of the nodes is not synchronized. This can result in transient loops in the network, leading to dropped traffic due to over-subscription of links. Micro-looping is generally more harmful than simply dropping traffic on failed links, because it can cause control traffic to be dropped on an otherwise healthy link involved in micro-loop. This can lead to cascading adjacency failures or network meltdown.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on January 9, 2017.

Copyright Notice

Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


Table of Contents

1. Introduction

Micro-loops are transient loops that occur during the period of time when some nodes have become aware of a topology change and have changed their forwarding tables in response, but slow routers have not yet modified their forwarding tables. This document provides mechanisms to prevent micro-loops in the network in the event of link up/down or metric change.The micro-loop prevention mechanism uses the basic principles of near-side tunnelling as described in [RFC5715] sec 6.2.

Micro-loops can be formed involving the PLRs or nodes which are not directly connected to the link/node going down. The nodes which are not directly connected to the node/link going down/up are referred to as remote nodes. The micro-loop prevention mechanism described in this document prevents possible micro-loops involving the remote nodes. A new sub-tlv is defined in ISIS router capability TLV [RFC4971] and OSPF router capability TLV [RFC4970] for discovering support of this feature. The details are described in Section 4. The operational procedures for micro-loop prevention are described in Section 3.

2. Procedures for Micro-loop prevention


   +----+ 10 +----+ 10 +----+  10   +----+ 10 +----+
   | S1 |----| R1 |----| S  |-------| E  |----| D1 |
   +----+    +----+    +----+       +----+    +----+
       \                  \          /
        \ 10               \ 100    / 60
         \                  \      /
          \   +----+         +----+   
           +--| R2 |---------| R3 |
              +----+    30   +----+
               /
              / 10
          +----+
          | S2 | 
          +----+

      

Figure 1: Sample Network

The topology shown in figure 1 illustrates a sample network topology where micro-loops can occur. The symmetric link metrics are shown in the diagram above. The traffic from S1 to D1 takes the path S1->R1->S->E->D1 and traffic from S2 takes the path S2->R2->S1->R1->S->E->D1 in normal operation. When the S->E link goes down, traffic can loop between S1->R2 when the FIB on S1 reflects the shortest path to D1 after the failure and the FIB on R2 reflects the shortest path to D1 before the failure. The mechanisms described in [I-D.ietf-rtgwg-uloop-delay] do not address micro-loops involving nodes that are not directly attached to the link that has just gone down or come up. For example when S->E link goes down, S and E are the Point of Local Repair (PLR) and micro-loops formed between S1 and R2 are not handled.

The basic principle of the solution is to send the traffic on tunnelled paths for a certain time period until all the nodes in the network process the event and update their forwarding plane. When the link S->E goes down, all the nodes in the network tunnel the traffic to the nearest PLR. The PLR S needs to maintain the backup path created using FRR ([RFC5286]) or other mechanisms until all other nodes in the network converge. The PLR S forwards the traffic to the affected destinations via the back-up path until the convergence procedure is complete. This document assumes 100% backup coverage for the destinations via various FRR mechanisms. This document describes the procedures corresponding to the traffic flow from sources (S nodes) to the destination nodes (D nodes). The procedures equally apply to the D nodes being source and S nodes being destination.

As soon as a node learns of the topology change, it modifies its FIB to use loop-free tunnelled paths for the affected traffic, and it starts a "convergence delay timer". When the "convergence delay timer" expires, the node modifies its FIB to use the SPF path based on the changed topology. The use of tunnelled paths during the convergence period ensures that (barring other topology changes) all traffic affected by the topology change travels on a loop-free path.

After all the nodes in the network converge to actual SPF path,PLR converges to SPF path and updates the FIB. This micro-loop prevention mechanism delays the time it takes for routing to converge to the optimal paths in the new topology by a factor of 3 but the convergence time is deterministic and completely avoids micro-loops.

In principle, near-side tunnelling could be accomplished using labels distributed via LDP. However, since the application requires that any given router have the potential to create a tunnel to nearly every other router in the IGP domain, a large number of targeted LDP sessions would be needed to learn the FEC-label bindings distributed by the PLRs. SPRING [I-D.ietf-spring-segment-routing] provides a more efficient method for distributing shortest path labels for this application, since any router can compute the locally significant FEC-label bindings for any other router without the need for targeted LDP sessions.

[RFC5715] describes other mechanisms to prevent micro-loop prevention. Near-side tunnelling is more suited for deployments as it does not need additional computation or additional state maintenance in the network nodes.Far side tunnelling has the disadvantage that it requires the use of not-via addresses [RFC6981] which requires additional address configuration on each node.Per destination non micro-looping path computation is another approach to prevent micro-loops but it is computationally intensive.

3. Detailed Solution based on SPRING

    

        +----+
        | R4 | SRGB:1000-2000
        +----+ SID:9
         / \ 
     5  /   \ 5
       /     \         SRGB:1000-2000
SID:1 /       \ SID:2   SID:3       SID:4     SID:5
   +----+ 10 +----+ 10 +----+  10   +----+ 10 +----+
   | S1 |----| R1 |----| S  |-------| E  |----| D1 |
   +----+    +----+    +----+       +----+    +----+
       \                  \          /
     10 \                  \ 100    / 60
         \  SRGB:1000-2000  \      /
          \   +----+         +----+   
           +--| R2 |---------| R3 |SID:7
        SID:6 +----+    30   +----+SRGB:1000-2000
               / 
              / 10
          +----+
          | S2 |SID:8 
          +----+SRGB:1000-2000

      

Figure 2: Sample SR Network

The above sample topology is provided with basic SPRING configurations of SRGB and the indices corresponding to each node. Each node has an SRGB 1000-2000 configured on the node. Same SRGB on all nodes is used for simplifying the example and the procedures are equally applicable when there is different SRGB configured on multiple nodes. Each node is provisioned with a MAX_CONVERGENCE_DELAY value that corresponds to its RIB to FIB convergence time. The information for support of the micro-loop prevention feature and the MAX_CONVERGENCE_DELAY value are flooded across the IGP domain (ISIS level/OSPF area). Each node in the IGP domain sets the MAX_CONVERGENCE_DELAY to the maximum of the values received in the domain.

3.1. Link-down event

When the S->E link goes down, all the nodes in the network receive the event via IGP database flooding. Each node supporting the micro-loop prevention mechanism specified in this document SHOULD perform the steps below.

  1. The PLRs (S and E) perform FRR local repair for destinations affected by the failure of the link. Each computing node identifies the destinations affected by the topology change.In the example above, the destination D1 is affected by S->E link down for nodes S1,R1,R2, and R4. For S2, although the path to D1 changes there is no change in the immediate next-hop and hence its not necessary for S2 to perform any specific actions to prevent micro-loops.
  2. For each affected destination, identify the nearest PLR advertising the change. The link-down event is advertised by both S and E. S is the nearest PLR for the nodes S1,R1,R2, and R4.
  3. Let the S->E link down event occurs at time T0.
  4. Start a timer T1 = max (all MAXIMUM_CONVERGENCE_DELAY) at all non-PLR nodes with affected destinations.
  5. Start a timer T2 = 2 * T1 at the PLR.
  6. For IP routes, modify the FIB for the affected destinations so that the nearest PLR's node-sid is pushed on the packet's label stack. For MPLS ingress and transit routes, modify the FIB for the affected destinations with a two label stack, the inner label corresponding to the destination and the outer label corresponding to the nearest PLR.
  7. In the case of ECMP paths to the nearest PLR, both tunnelled paths are used. S1 has ECMP paths to the destination D1 and both the paths are impacted. Both the paths are modified to carry two label stacks containing the nearest PLR on top and the destination label at the bottom.
  8. After the expiry of timer T1 all the non-PLR nodes modify their FIBs to use the shortest path as computed by the IGP, and they no longer push the node-SID of the nearest PLR on the packets.
  9. After the expiry of T2, the PLR converges and updates the FIB to represent shortest path.

The ingress MPLS routes at various nodes for destination D1 at specified time intervals is mentioned below.

    

   +======+=============+=================+=============+==============+
   | Node | Before T0   | T0-T1           | T1-T2       | After T2     |
   +======+=============+=================+=============+==============+
   | S1   | Push 1005,  | Push 1005,      | Push 1005,  | Push 1005,   |
   |      | Fwd to R1   | 1003(top), Fwd  | Fwd to R2   | Fwd to R2    |
   |      |             | to R1           |             |              |
   |      +-------------+-----------------+-------------+--------------+
   |      | Push 1005,  | Push 1005,      |             |              |
   |      | Fwd to R4   | 1003(top), Fwd  |             |              |
   |      |             | to R4           |             |              |
   +======+=============+=================+=============+==============+
   | S2   | Push 1005,  | Push 1005, Fwd  | Push 1005,  | Push 1005,   |
   |      | Fwd to R2   | to R2           | Fwd to R2   | Fwd to R2    |
   +======+=============+=================+=============+==============+
   | R1   | Push 1005,  | Push 1005, Fwd  | Push 1005,  | Push 1005,   |
   |      | Fwd to S    | to S            | Fwd to R4   | Fwd to R4    |
   |      +-------------+-----------------+-------------+--------------+
   |      |             |                 | Push 1005,  | Push 1005,   |
   |      |             |                 | Fwd to S1   | Fwd to S1    |
   +======+=============+=================+=============+==============+
   | R2   | Push 1005,  | Push 1005,      | Push 1005,  | Push 1005,   |
   |      | Fwd to S1   | 1003(top), Fwd  | Fwd to R3   | Fwd to R3    |
   |      |             | to S1           |             |              |
   +======+=============+=================+=============+==============+
   | R3   | Push 1005,  | Push 1005,      | Push 1005,  | Push 1005,   |
   |      | Fwd to E    | 1003(top), Fwd  | Fwd to E    | Fwd to E     |
   |      |             | to E            |             |              |
   +======+=============+=================+=============+==============+
   | R4   | Push 1005,  | Push 1005,      | Push 1005,  | Push 1005,   |
   |      | Fwd to R1   | 1003(top), Fwd  | Fwd to S1   | Fwd to S1    |
   |      |             | to R1           |             |              |
   +======+=============+=================+=============+==============+
   | S    | Push 1005,  | Push 1005, Fwd  | Push 1005,  | Push 1005,   |
   |      | Fwd to E    | to R3 *         | Fwd to R3 * | Fwd to R1    |
   |      +-------------+-----------------+-------------+---------- ---+
   |      | Push 1005,  |                 |             | Push 1005,   |
   |      | Fwd to R3 * |                 |             | Fwd to R3 *  |
   +======+=============+=================+=============+==============+
   | E    | Pop, Fwd to | Pop, Fwd to D1  | Pop, Fwd to | Pop, Fwd to  |
   |      | D1          |                 | D1          | D1           |
   +======+=============+=================+=============+==============+

                     * - Indicates backup path.
  
        

Figure 3: Sample MPLS ingress RIB

The corresponding MPLS transit routes at various nodes at specified time interval is shown below.

  

   +======+==========+==========+==============+===========+===========+
   | Node | Incoming | Before   | T0-T1        | T1-T2     | After T2  |
   |      | Label    | T0       |              |           |           |
   +======+==========+==========+==============+===========+===========+
   | S1   | 1005     | Push     | Push 1005,   | Push      | Push      |
   |      |          | 1005,    | 1003(top),   | 1005, Fwd | 1005, Fwd |
   |      |          | Fwd to   | Fwd to R1    | to R2     | to R2     |
   |      |          | R1       |              |           |           |
   |      |          +----------+--------------+-----------+-----------+
   |      |          | Push     | Push 1005,   |           |           |
   |      |          | 1005,    | 1003(top),   |           |           |
   |      |          | Fwd to   | Fwd to R4    |           |           |
   |      |          | R4       |              |           |           |
   |      +----------+----------+--------------+-----------+-----------+
   |      | 1003     | Push     | Push 1003,   | Push      | Push      |
   |      |          | 1003,    | Fwd to R1    | 1003, Fwd | 1003, Fwd |
   |      |          | Fwd to   |              | to R2     | to R2     |
   |      |          | R1       |              |           |           |
   +======+==========+==========+==============+===========+===========+
   | S2   | 1005     | Push     | Push 1005,   | Push      | Push      |
   |      |          | 1005,    | Fwd to R2    | 1005, Fwd | 1005, Fwd |
   |      |          | Fwd to   |              | to R2     | to R2     |
   |      |          | R2       |              |           |           |
   |      +----------+----------+--------------+-----------+-----------+
   |      | 1003     | Push     | Push 1003,   | Push      | Push      |
   |      |          | 1003,    | Fwd to R1    | 1003, Fwd | 1003, Fwd |
   |      |          | Fwd to   |              | to R2     | to R2     |
   |      |          | R1       |              |           |           |
   +======+==========+==========+==============+===========+===========+
   | R1   | 1005     | Push     | Push 1005,   | Push      | Push      |
   |      |          | 1005,    | Fwd to S     | 1005, Fwd | 1005, Fwd |
   |      |          | Fwd to S |              | to R4     | to R4     |
   |      |          +----------+--------------+-----------+-----------+
   |      |          |          |              | Push      | Push      |
   |      |          |          |              | 1005, Fwd | 1005, Fwd |
   |      |          |          |              | to S1     | to S1     |
   |      +----------+----------+--------------+-----------+-----------+
   |      | 1003     | Push     | Push 1003,   | Push      | Push      |
   |      |          | 1003,    | Fwd to S     | 1003, Fwd | 1003, Fwd |
   |      |          | Fwd to S |              | to S      | to S      |
   +======+==========+==========+==============+===========+===========+
   | R2   | 1005     | Push     | Push 1005,   | Push      | Push      |
   |      |          | 1005,    | 1003(top),   | 1005, Fwd | 1005, Fwd |
   |      |          | Fwd to   | Fwd to S1    | to R3     | to R3     |
   |      |          | S1       |              |           |           |
   |      +----------+----------+--------------+-----------+-----------+
   |      | 1003     | Push     | Push 1003,   | Push      | Push      |
   |      |          | 1003,    | Fwd to S1    | 1003, Fwd | 1003, Fwd |
   |      |          | Fwd to   |              | to S1     | to S1     |
   |      |          | S1       |              |           |           |
   +======+==========+==========+==============+===========+===========+
   | R3   | 1005     | Push     | Push 1005,   | Push      | Push      |
   |      |          | 1005,    | 1003(top),   | 1005, Fwd | 1005, Fwd |
   |      |          | Fwd to E | Fwd to E     | to E      | to E      |
   |      +----------+----------+--------------+-----------+-----------+
   |      | 1003     | Push     | Push 1003,   | Push      | Push      |
   |      |          | 1003,    | Fwd to R2    | 1003, Fwd | 1003, Fwd |
   |      |          | Fwd to   |              | to R2     | to R2     |
   |      |          | R2       |              |           |           |
   +======+==========+==========+==============+===========+===========+
   | R4   | 1005     | Push     | Push 1005,   | Push      | Push      |
   |      |          | 1005,    | 1003(top),   | 1005, Fwd | 1005, Fwd |
   |      |          | Fwd to   | Fwd to R1    | to S1     | to S1     |
   |      |          | R1       |              |           |           |
   |      +----------+----------+--------------+-----------+-----------+
   |      | 1003     | Push     | Push 1003,   | Push      | Push      |
   |      |          | 1003,    | Fwd to R1    | 1003, Fwd | 1003, Fwd |
   |      |          | Fwd to   |              | to R1     | to R1     |
   |      |          | R1       |              |           |           |
   +======+==========+==========+==============+===========+===========+
   | S    | 1005     | Push     | Push 1005,   | Push      | Push      |
   |      |          | 1005,    | Fwd to R3 *  | 1005, Fwd | 1005, Fwd |
   |      |          | Fwd to E |              | to R3 *   | to R1     |
   |      |          +----------+--------------+-----------+-----------+
   |      |          | Push     |              |           | Push      |
   |      |          | 1005,    |              |           | 1005, Fwd |
   |      |          | Fwd to   |              |           | to R3 *   |
   |      |          | R3 *     |              |           |           |
   |      +----------+----------+--------------+-----------+-----------+
   |      | 1003     | --       | --           | --        | --        |
   +======+==========+==========+==============+===========+===========+
   | E    | 1005     | Pop, Fwd | Pop, Fwd to  | Pop, Fwd  | Pop, Fwd  |
   |      |          | to D1    | D1           | to D1     | to D1     |
   +======+==========+==========+==============+===========+===========+

        *     - Indicates backup path.
   
        

Figure 4: Sample MPLS transit RIB

3.2. Link-up event

When a new-link is added to the network, the PLR needs to update the FIB before it announces the change. First the PLR converges, updates the FIB as per the new-link based topology and then announces the new-link addition to the rest of the network. The other network nodes SHOULD follow the procedure exactly same as described in sec 3.1. They SHOULD update their FIB to tunnel the traffic to the closest node corresponding to the change.After MAX_CONVERGENCE_DELAY the nodes SHOULD update the FIB with the shortest path next-hops.