Network Working Group                   Sudharsana Venkataraman (Ed)
Internet Draft                                      Juniper Networks
Intended status: Informational
Expires: January 6, 2016                                July 6, 2015


 Avoiding Repeated Preemption of Low-Priority TE LSPs - Recommendations
             draft-sudha-teas-rsvp-preemption-avoidance-00

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on January 6, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Sudharsana, et al.       Expires January 6, 2016               [Page 1]

Internet-Draft             Avoiding Preemption                July 2015
Abstract

   When a high priority LSP is being set up and there is reservation
   contention on a link along the path to the destination, any low
   priority LSP using that link may be preempted and eventually
   rerouted. Low priority LSPs can suffer preemption repeatedly when
   they are placed in succession on heavily used links that have very
   little remaining bandwidth. This document describes a solution to
   avoid repeated preemption of low priority LSPs in the network.

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC-2119
   [RFC2119].

Table of Contents

   1. Introduction
      1.1. Scenarios that can cause repeated preemption of low
           priority LSPs
   2. Recommendations
      2.1. Transit based approach
      2.2. Sample procedure at transit
         2.2.1. Inputs to the procedure:
         2.2.2. Periodic sub-procedure
         2.2.3. Advantages:
      2.3. Ingress based approach:
      2.4. Sample procedure at ingress
         2.4.1. Inputs to the procedure:
         2.4.2. Procedure at ingress
         2.4.3. Disadvantages over transit based approach
         2.4.4. Advantages:
   3. Use cases and applicability of approaches
      3.1. Pertaining to a particular link
      3.2. Setting up general LSP behavior in the network
   4. Security Considerations
   5. IANA Considerations
   6. Normative References
   7. Acknowledgments
   8. Authors' Addresses

1. Introduction

   When an LSP does not have configured hops as constraints, it is
   typically set up along the shortest path to the destination as
   long as the required bandwidth is available. Links along the
   shortest path are therefore mostly saturated, with very little
   remaining bandwidth to reserve. Low priority LSPs along such paths
   are very likely to suffer preemption when a higher priority LSP
   needs to be set up. (Hard preemption of TE LSPs is described in
   [RFC3209] and soft preemption is described in [RFC5712].)

   When a low priority LSP reroutes, it does not try to avoid links
   with little remaining bandwidth. It may therefore be set up again
   on another heavily used path, as the goal is to have an optimal
   metric, with a high chance of being preempted again. A low
   priority LSP may thus suffer preemption multiple times before it
   settles on a longer and less utilized path, which it finally does
   only because bandwidth is unavailable on the shorter, congested
   paths. This causes a lot of control plane churn in the network in
   the process.

                  +-----------+             +----------+
                  |           |        |\   |          |
                  | Router A  |--------+ \  | Router B |
        +---------|           |--------+ /  |          |--------+
        |         |           |        |/   |          |        |
        |         +-----------+             +----------+        |
   +---------+                                             +---------+
   |         |                                             |         |
   | Ingress |                                             | Egress  |
   |         |                                             |         |
   +---------+                                             +---------+
        |         +-----------+             +----------+        |
        |         |           |        |\   |          |        |
        +---------| Router C  |--------+ \  | Router D |--------+
                  |           |--------+ /  |          |
                  |           |        |/   |          |
                  +-----------+             +----------+

   Suppose the paths between Router A and Router B offer a lower cost
   for LSPs established from Ingress to Egress than the paths between
   Router C and Router D.
   As a result, the links along the paths from A to B become
   saturated, with very little remaining bandwidth, as usage
   increases. The majority of the LSPs set up along the paths from
   Router A to B are high priority LSPs. As there is a bandwidth
   shortage along these paths, low priority LSPs typically get moved
   out to the longer paths along Router C to D. This happens only
   when the low priority LSPs cannot be accommodated along any of the
   paths from Router A to B. As long as the low priority LSPs are
   able to reserve bandwidth along any of the paths from Router A to
   B, they are placed on one of the paths in that bundle, only to be
   preempted and moved to another path within the bundle when a new
   higher priority LSP is signaled.

   The repeated preemption and rerouting can be avoided if the low
   priority LSP is set up along C to D once the path along A to B
   becomes heavily used.

   The objective is to avoid hot links (ones with low remaining
   bandwidth) when placing low priority LSPs, so that they are less
   likely to be preempted repeatedly.

1.1. Scenarios that can cause repeated preemption of low priority LSPs

   When a member link is removed from an aggregate bundle, it is
   expected to create some congestion on the aggregate link. This
   could increase the probability of preemption for any low priority
   LSPs taking the aggregate link.

   When a member link is added to an aggregate bundle that is
   currently congested, or when a new link or router is added to a
   network that is running hot, it could result in an immediate spike
   in the available bandwidth, attracting LSPs waiting to be set up,
   which could include the low priority LSPs. But such a setup may be
   short-lived: given the congestion on the link, these LSPs could
   subsequently get preempted.
   When a particular link or router is to be taken down for
   maintenance, traffic is generally rerouted along other paths,
   causing congestion along those paths owing to the extra load from
   the path under maintenance. This increases the chances of
   preemption for any low priority LSPs taking the alternate paths.

   It should be noted that in the scenarios above, in which the
   probability of repeated preemption of low priority LSPs is high,
   the use of resource affinities such as link color may not be
   suitable, because the operator may want the low priority LSPs to
   use the shortest path when the links along it are not heavily
   utilized.

2. Recommendations

   The idea is to avoid placing low priority LSPs on hot links,
   rather than moving them away (by preempting them) when a higher
   priority LSP needs to be set up. Hot links are those that are
   running almost saturated, with little unreserved bandwidth. The
   shortest path(s) to the destination tend(s) to get hot under scale
   and heavy usage. The following outlines two possible options to
   accomplish this.

2.1. Transit based approach

   A transit router should monitor the remaining bandwidth on all
   attached links. When the remaining bandwidth on a link falls below
   a threshold, it is RECOMMENDED that the bandwidth subscription
   percentage for low priorities on that link be set to a value
   (i.e., reduced) such that it prevents further placement of low
   priority LSPs on that link. This subscription percentage change on
   the link for low priorities can be reversed when the remaining
   bandwidth at priority 0 increases by a reasonable amount.

2.2. Sample procedure at transit

   Here is a sample procedure to set per-priority bandwidth
   subscription on hot links. It is RECOMMENDED that the transit
   router execute the following procedure on each of its attached
   links.

2.2.1. Inputs to the procedure:

   a) rem_bw_threshold%:
      When the remaining (unreserved) bandwidth at the highest
      priority on the link falls below this percentage of the link
      capacity, action is to be taken.

   b) input_priority:
      For priorities inferior to this, action is to be taken.

   c) subscription_percent%:
      Subscription percentage to be used for setting the per-priority
      subscription, which is the action.

   d) igp_update_threshold%:
      Percentage by which bandwidth utilization on a link should
      change to qualify for an IGP TE update to be sent out of the
      box.

2.2.2. Periodic sub-procedure

   a) Find the remaining available bandwidth on the link at priority
      0 and see if it is below rem_bw_threshold%. Note that the
      available bandwidth should be the actual value that is known on
      the router, which may be different from the value advertised in
      IGP TE.

   b) If the remaining bandwidth (at priority 0) is below
      rem_bw_threshold% of the total link capacity, the link
      qualifies for action.

   c) Action: Set the subscription on the link, for priorities in the
      range between input_priority and 7, to the value given by
      subscription_percent%.

   d) If the remaining bandwidth percentage (at priority 0) is above
      rem_bw_threshold% of the total link capacity by at least
      igp_update_threshold% of the link's capacity, and the
      subscription is not 100% for lower priorities on that link, it
      should be set to 100% (or a configured maximum subscription
      value).

   Setting the per-priority bandwidth subscription will result in TE
   updates being advertised by the IGP for the link.

   The value of the subscription percentage SHOULD NOT cause
   immediate preemption of any of the low priority LSPs already
   taking the link. A link that gets selected for the subscription
   action has at least (100 - rem_bw_threshold)% of its capacity
   reserved. The subscription percentage that is set should be
   greater than the current actual reservation percentage, which is
   at least (100 - rem_bw_threshold)%, so that none of the low
   priority LSPs that have already reserved bandwidth on the link
   suffer preemption owing to the subscription.

2.2.3. Advantages:

   a) This approach does not rely on IGP TE updates to identify when
      a link qualifies as hot or ceases to be hot. The procedure
      therefore works even when the change in bandwidth usage that
      toggles the link's hotness state is less than
      igp_update_threshold%.

   b) When the per-priority bandwidth subscription is set, an IGP TE
      update is triggered, and this enables all nodes to avoid
      placing low priority LSPs on the given link.

2.3. Ingress based approach:

   The ingress should monitor all the links in its TE database. When
   the remaining bandwidth at priority 0 for any link falls below a
   given threshold, it is RECOMMENDED that the ingress instrument its
   view of the TE database to reflect less available bandwidth for
   lower priorities on that link than is actually available. This
   instrumented view can be reversed for a link when the remaining
   bandwidth at priority 0 increases by a reasonable amount.

2.4. Sample procedure at ingress

   Here is a sample procedure, local to the ingress, to create an
   instrumented view of the TE database that helps avoid saturated
   links when computing paths for low priority LSPs.

2.4.1. Inputs to the procedure:

   a) rem_bw_threshold%:
      When the remaining (unreserved) bandwidth at the highest
      priority on the link falls below this percentage of the link
      capacity, action is to be taken.

   b) input_priority:
      For priorities inferior to this, action is to be taken.

   c) subscription_percent%:
      Subscription percentage to be used for arriving at the
      available bandwidth at low priorities for the link.
   d) igp_update_threshold%:
      Percentage by which bandwidth utilization on a link should
      change to qualify for an IGP TE update to be sent out of the
      box.

2.4.2. Procedure at ingress

   For each link in the TE database, the following procedure is
   executed.

   a) Find the remaining available bandwidth on the link at priority
      0, from the data available in the TE database, and see if it is
      below rem_bw_threshold%.

   b) If the remaining bandwidth (at priority 0) is below
      rem_bw_threshold% of the total link capacity, the link
      qualifies for action.

   c) Action: Set the available bandwidth on the link in the TE
      database, for priorities in the range between input_priority
      and 7, to a value derived by applying subscription_percent% to
      the total link capacity, taking into account the existing
      reservations at that priority, as can be obtained from the TE
      database.

   d) If the remaining bandwidth percentage (at priority 0) is above
      rem_bw_threshold% of the total link capacity by at least
      igp_update_threshold% of the link's capacity, the instrumented
      view is reversed and the actual values received from IGP TE
      updates can be used.

   As in the transit based approach, the value of the subscription
   percentage SHOULD NOT cause immediate preemption of any of the low
   priority LSPs already taking the link.

   Every time a TE update for a link is received, if the link becomes
   hot or ceases to be hot owing to the update, the link capacity
   available to low priorities can be modified based on the
   configured subscription.

2.4.3. Disadvantages over transit based approach

   The local instrumented view is not sent in TE updates. Hence other
   ingress routers that do not follow the procedure will still use
   the congested link for low priority LSPs.
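
   As an illustration only, the per-link steps a) through d) of
   Section 2.4.2 could be sketched in Python as follows. The sketch
   is not part of the recommendation. The names LinkView and
   instrumented_unreserved are hypothetical. It assumes that the
   bandwidth reserved at a priority equals the link capacity minus
   the advertised unreserved bandwidth at that priority, that
   input_priority itself is included in the affected range, and that
   a link stays hot while its remaining bandwidth lies between the
   two thresholds.

```python
from dataclasses import dataclass
from typing import List

NUM_PRIORITIES = 8  # priorities 0 (highest) through 7 (lowest)


@dataclass
class LinkView:
    """Hypothetical per-link record taken from the TE database."""
    capacity: float          # total link capacity
    unreserved: List[float]  # advertised unreserved bandwidth per priority
    hot: bool = False        # hotness state from the previous evaluation


def instrumented_unreserved(link: LinkView,
                            rem_bw_threshold_pct: float,
                            input_priority: int,
                            subscription_pct: float,
                            igp_update_threshold_pct: float) -> List[float]:
    """Return the per-priority bandwidth the ingress uses for path computation."""
    # Steps a) and b): compare remaining bandwidth at priority 0 to the threshold.
    rem_pct = 100.0 * link.unreserved[0] / link.capacity
    if rem_pct < rem_bw_threshold_pct:
        link.hot = True
    # Step d): revert only after recovering by igp_update_threshold% as well.
    elif rem_pct >= rem_bw_threshold_pct + igp_update_threshold_pct:
        link.hot = False
    # Otherwise keep the previous state (assumed hysteresis).

    view = list(link.unreserved)
    if link.hot:
        # Step c): cap low priorities at subscription_percent% of capacity,
        # minus what is already reserved (assumption: capacity - unreserved).
        cap = link.capacity * subscription_pct / 100.0
        for prio in range(input_priority, NUM_PRIORITIES):
            reserved = link.capacity - link.unreserved[prio]
            # Never report more than what was actually advertised.
            view[prio] = max(0.0, min(link.unreserved[prio], cap - reserved))
    return view
```

   For example, for a link of capacity 100 with 8 units unreserved at
   priority 0, a rem_bw_threshold% of 10, an input_priority of 4 and
   a subscription_percent% of 97, priorities 4 through 7 that
   advertise 5 units each would show 2 units in the instrumented
   view, while priorities 0 through 3 are left unchanged.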
   When a link qualifies as hot owing to the placement of an LSP
   whose reservation did not cross igp_update_threshold%, none of the
   computing nodes other than the ingress that initiated that LSP
   learn about the hotness of the link, even if all of them follow
   the same procedure. This problem can be mitigated by having all
   routers use an adaptive igp_update_threshold% rather than a static
   one. This requires all routers to send more frequent updates when
   link utilization gets closer to a threshold.

2.4.4. Advantages:

   Because this is an ingress based solution, transit routers do not
   have to be configured or upgraded.

   If, owing to the local instrumented view obtained by applying the
   subscription, a certain low priority LSP does not come up after
   multiple retries, the ingress can choose to relax the subscription
   and try again to bring up the low priority LSP. Such a situation
   cannot be detected and fixed in the transit based approach.

3. Use cases and applicability of approaches

   The following sections discuss the applicability of the transit
   and ingress based approaches in various situations.

3.1. Pertaining to a particular link

   When a member link is removed from an aggregate bundle, it is
   expected to create some congestion on the aggregate link. Before
   removing the member link, the subscription percentage may be set
   for low priorities such that the remaining bandwidth is utilized
   by higher priority LSPs.

   When a member link is added to an aggregate bundle that is
   currently congested, or when a new link or router is added to a
   network that is running hot, it could result in an immediate spike
   in the available bandwidth, attracting LSPs waiting to be set up,
   which could include the low priority LSPs.
   Setting the subscription for low priorities accordingly, before
   adding the member link, could avoid low priority LSPs being placed
   on the new member link, given that the aggregate link is on a
   congested path.

   It should be noted that these use cases do not require any
   monitoring of links, and the link(s) requiring action are known in
   advance. The transit router based approach is better suited for
   these cases. That is, in the above situations the subscription
   percentage for lower priorities may be decreased on the link
   without checking for the conditions described in Section 2.2.2,
   and the procedure in Section 2.2.2 may be executed fully after a
   user-configured time period.

3.2. Setting up general LSP behavior in the network

   In a network, an admission control behavior that differentiates
   between LSPs based on their priority, overriding the actual
   bandwidth availability at that priority, can be set up in two
   ways.

   One approach is to start with a behavior where there is no
   differentiation based on priorities. Low priority LSPs are then
   denied a portion of the total capacity when links become hot, as
   discussed in earlier sections of this document. This approach
   assumes a steady state and no congestion in the network as a whole
   to start with.

   Another approach is to deny bandwidth to low priority LSPs on any
   link that comes up. Then, gradually with the passage of time, the
   available bandwidth for low priority LSPs can be increased if the
   utilization does not spike. With every preemption, the gradual
   increase in available bandwidth for low priorities can be backed
   off exponentially. This approach assumes congestion, and that the
   entire network is running hot, when any link is brought up.

   Another situation where the affected link is not precisely known
   in advance is the maintenance use case.
   When a particular link or router is to be taken down for
   maintenance, traffic is generally rerouted along other paths,
   causing congestion along those paths owing to the extra load from
   the path under maintenance. The subscription on the links of the
   alternate paths can be set to limit the low priority LSPs on those
   paths.

   For setting up general network behaviors and for handling
   situations where the affected link is not known in advance, the
   ingress based approach is better suited. The ingress could simply
   monitor all network links or a subset of them identified by
   policy. The transit based approach coupled with a management
   script can also be used, though.

4. Security Considerations

   The recommendations do not present any new security concerns.

5. IANA Considerations

   This document makes no request of IANA.

6. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan,
             V., and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
             Tunnels", RFC 3209, December 2001.

   [RFC5712] Meyer, M., Ed. and JP. Vasseur, Ed., "MPLS Traffic
             Engineering Soft Preemption", RFC 5712, January 2010.

7. Acknowledgments

   We are grateful to Yakov Rekhter for his contributions to the
   development of the idea and his thorough review of the content of
   the draft. Thanks to Vishnu Pavan Beeram and Harish Sitaraman for
   their comments and inputs.

8. Authors' Addresses

   Sudharsana Venkataraman
   Juniper Networks
   Email: sudharsana@juniper.net

   Chandra Ramachandran
   Juniper Networks
   Email: csekar@juniper.net

   Raveendra Torvi
   Juniper Networks
   Email: rtorvi@juniper.net