Network Working Group                                            J. Dong
Internet-Draft                                                  X. Zhang
Intended status: Informational                       Huawei Technologies
Expires: September 17, 2016                                        Z. Li
                                                            China Mobile
                                                          March 16, 2016


          OSPF Corrupted MaxAge LSA Flushing Problem Statement
           draft-dong-ospf-maxage-flush-problem-statement-00

Abstract

   In the OSPF protocol, Link State Advertisements (LSAs) are exchanged
   in Link State Update (LSU) packets to achieve link state database
   (LSDB) synchronization and consistent route calculation.  The "LS
   age" field is part of the LSA header but is excluded from the
   checksum calculation of the LSA.  Because of hardware or software
   faults, the LS age may be corrupted and reach MaxAge prematurely.
   Flushing of the corrupted MaxAge LSA may cause a flooding storm of
   OSPF packets and severely impact the services in the network.

   This document describes the problem of OSPF corrupted MaxAge LSA
   flushing, and specifies the requirements on potential solutions.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 17, 2016.

Dong, et al.           Expires September 17, 2016               [Page 1]

Internet-Draft    OSPF MaxAge Flush Problem Statement         March 2016

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  LS Age not Protected from Corruption
   3.  Consequence of Corrupted LS Age
     3.1.  LS Age Corrupted to MaxAge
     3.2.  LS Age Corrupted to a Value Close to MaxAge
   4.  Requirement on Potential Solutions
     4.1.  Solution for Impact Mitigation
     4.2.  Solution for Problem Localization
   5.  IANA Considerations
   6.  Security Considerations
   7.  Acknowledgements
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   In the OSPF protocol [RFC2328], Link State Advertisements (LSAs) are
   exchanged in Link State Update (LSU) packets to achieve link state
   database (LSDB) synchronization and consistent route calculation.
   The "LS age" field is part of the LSA header but is excluded from
   the checksum calculation of the LSA.  LSAs having age MaxAge are not
   used in the routing table calculation and MUST be reflooded so that
   they are flushed from the routing domain.  Because of hardware or
   software faults, the LS age may be corrupted and reach MaxAge
   prematurely.  Flushing of such a corrupted MaxAge LSA may cause a
   flooding storm of OSPF packets and severely impact the services in
   the network.  Because a MaxAge LSA may be flushed by any OSPF
   router, not only by its originator, locating the faulty router
   usually takes a long time, during which both the service provider
   and its customers can suffer significant damage.

2.  LS Age not Protected from Corruption

   As specified in [RFC2328], the LS age field is part of the LSA
   header and indicates the age of the LSA.  The LS age is set to 0
   when the LSA is originated, and must be incremented by InfTransDelay
   on every hop of the flooding procedure.  LSAs are also aged as they
   are held in each router's database.  When a router compares two
   instances of an LSA that have identical LS sequence numbers and LS
   checksums, the instance of age MaxAge is always accepted as most
   recent.

   Although there is an LS checksum field in the LSA header, the LS age
   field is excluded from the checksum calculation.  As a result, the
   LS age can be corrupted without the corruption being detected.

   Since cryptographic authentication is executed at the OSPF packet
   level, it only protects the assembled LSU packet for one hop.  It
   does not provide any additional protection against corruption of the
   LS age field that occurs before the packet is assembled, for example
   while the LSA is held in a router's database.

3.  Consequence of Corrupted LS Age

   This section evaluates the impact of corruption of the LS age field
   of an LSA.  Such corruption may be caused by either hardware or
   software problems of the router.

3.1.  LS Age Corrupted to MaxAge

   In this case, the LS age of an LSA is corrupted to MaxAge.
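   As a rough illustration of this case, the following Python sketch
   builds a toy 20-byte LSA header, overwrites its LS age with MaxAge,
   and shows that the Fletcher checksum still verifies and that the
   comparison rules of section 13.1 of [RFC2328] prefer the corrupted
   instance.  The helper names (build_lsa, with_age, checksum_ok,
   newer) and the header values (LS type 1, Router ID 10.0.0.1) are
   assumptions made for this example only; it is not taken from any
   OSPF implementation.

```python
import struct

MAX_AGE = 3600       # seconds; RFC 2328 architectural constant MaxAge
MAX_AGE_DIFF = 900   # seconds; RFC 2328 architectural constant MaxAgeDiff
CKSUM_OFF = 16       # offset of the LS checksum field in the LSA header


def fletcher_sums(data):
    """Return the two running Fletcher sums (mod 255) over data."""
    c0 = c1 = 0
    for byte in data:
        c0 = (c0 + byte) % 255
        c1 = (c1 + c0) % 255
    return c0, c1


def build_lsa(age, seq=-0x7FFFFFFF):
    """Build a toy 20-byte LSA header with a valid LS checksum (hypothetical)."""
    lsa = bytearray(struct.pack("!HBB4s4siHH", age, 0x02, 1,
                                bytes([10, 0, 0, 1]), bytes([10, 0, 0, 1]),
                                seq, 0, 20))
    data = bytes(lsa[2:])            # the LS age (bytes 0-1) is skipped
    c0, c1 = fletcher_sums(data)
    off = CKSUM_OFF - 2              # checksum offset within data
    # Choose the two check bytes so that both sums become 0 mod 255.
    x = ((len(data) - off - 1) * c0 - c1) % 255 or 255
    y = (-c0 - x) % 255 or 255
    lsa[CKSUM_OFF], lsa[CKSUM_OFF + 1] = x, y
    return bytes(lsa)


def with_age(lsa, age):
    """Return a copy of lsa with its LS age overwritten (simulated corruption)."""
    return struct.pack("!H", age) + lsa[2:]


def checksum_ok(lsa):
    """Verify the LS checksum; the LS age field is not covered."""
    return fletcher_sums(lsa[2:]) == (0, 0)


def newer(a, b):
    """Return 1 if a is more recent, -1 if b is, 0 if they are the same."""
    age_a, age_b = (struct.unpack_from("!H", l, 0)[0] for l in (a, b))
    seq_a, seq_b = (struct.unpack_from("!i", l, 12)[0] for l in (a, b))
    ck_a, ck_b = (struct.unpack_from("!H", l, CKSUM_OFF)[0] for l in (a, b))
    if seq_a != seq_b:
        return 1 if seq_a > seq_b else -1
    if ck_a != ck_b:
        return 1 if ck_a > ck_b else -1
    if (age_a == MAX_AGE) != (age_b == MAX_AGE):
        return 1 if age_a == MAX_AGE else -1      # MaxAge instance wins
    if abs(age_a - age_b) > MAX_AGE_DIFF:
        return 1 if age_a < age_b else -1         # smaller age wins
    return 0


good = build_lsa(age=120)
bad = with_age(good, MAX_AGE)                     # LS age corrupted to MaxAge
print(checksum_ok(good), checksum_ok(bad))        # prints: True True
print(newer(bad, good))                           # prints: 1
```

   In this sketch the corrupted copy passes the checksum test because
   the checksum is computed from byte 2 onward, and it wins the
   comparison because its sequence number and checksum are unchanged
   while its age is MaxAge.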
   According to section 14 of [RFC2328], this corrupted MaxAge LSA will
   be flushed by the router, regardless of whether the LSA is
   self-originated.  Depending on the flooding scope of the LSA, the
   MaxAge LSA is flooded either throughout the routing domain or within
   the specific area.  On every router that receives the corrupted LSA,
   the uncorrupted LSA instance is replaced, which triggers route
   computation and installation.

   When the corrupted MaxAge LSA is received by the originating router
   of this LSA, the originating router would increase the LSA's LS
   sequence number one past the received LS sequence number, and
   originate a new instance of the LSA.

   If the corruption is due to a systematic problem and cannot recover
   automatically, this flooding and processing would continue
   indefinitely, which severely impacts network stability and service
   availability.

3.2.  LS Age Corrupted to a Value Close to MaxAge

   In this case, the LS age of an LSA is corrupted to a large value
   that is close to MaxAge.  Until the corrupted LS age reaches MaxAge,
   the corrupted LSA will not be flushed.  During this time, the router
   may receive an uncorrupted version of this LSA from some other
   router.  According to section 13.1 of [RFC2328], if the LS age
   fields of the two instances differ by more than MaxAgeDiff, the
   instance having the smaller LS age is considered to be more recent,
   and the corrupted LSA will be replaced by the uncorrupted version.
   Thus, depending on the value of the corrupted LS age and the setting
   of MaxAgeDiff, the corrupted LSA may be fixed.  However, if the
   corruption is due to a systematic problem, the LS age will later be
   set to a large value again.

   If the corrupted LSA is not fixed by the above procedure, its LS age
   eventually reaches MaxAge, and the corrupted LSA will be flushed as
   described in section 3.1.

4.  Requirement on Potential Solutions

   In networks that use OSPF as the IGP, the problem of LS age
   corruption can severely impact both network stability and the
   services carried in the network, so it is important to find
   appropriate solutions to this problem.  This section classifies the
   potential solutions into two categories and specifies the
   requirements on them.

4.1.  Solution for Impact Mitigation

   Since corrupted MaxAge LSA flushing has a severe impact on network
   stability and the services carried in the network, it is critical to
   reduce this impact even before the root cause of the problem can be
   identified.  The impact mitigation solution also needs to support
   incremental deployment.  Preferably, the mitigation solution should
   not delay the route convergence caused by normal MaxAge LSA
   flushing.

4.2.  Solution for Problem Localization

   If the corruption of LS age is due to a systematic problem, it
   cannot be recovered automatically.  Since a router can flush MaxAge
   LSAs that were originated by other routers, it is necessary to
   provide a solution that helps operators identify the problem and
   locate the router causing the corruption quickly.

   [RFC6232] proposes to add the Purge Originator Identification (POI)
   TLV into IS-IS Purge LSPs to identify the originator of IS-IS
   Purges.  Although a similar TLV can be added into the extended LSAs
   as defined in [RFC7684] and [I-D.ietf-ospf-ospfv3-lsa-extend], most
   of the legacy OSPF LSAs defined in [RFC2328] are not TLV-based.  A
   problem localization solution that is applicable to all the LSA
   types is preferred.

5.  IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section may be removed on publication as an
   RFC.

6.  Security Considerations

   This document describes the problem of lack of integrity protection
   of the LS age field.
   The LS age field may be altered as a result of packet corruption,
   and such modification cannot be detected by either the LSA checksum
   or OSPF packet cryptographic authentication.  Corruption of the LS
   age field could have a severe impact on network stability and the
   services in the network.  This may be considered a security
   vulnerability.

7.  Acknowledgements

   The authors would like to thank Bruno Decraene, Acee Lindem and Les
   Ginsberg for the discussion on this topic.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2328]  Moy, J., "OSPF Version 2", STD 54, RFC 2328,
              DOI 10.17487/RFC2328, April 1998.

8.2.  Informative References

   [I-D.ietf-ospf-ospfv3-lsa-extend]
              Lindem, A., Mirtorabi, S., Roy, A., and F. Baker, "OSPFv3
              LSA Extendibility", draft-ietf-ospf-ospfv3-lsa-extend-09
              (work in progress), November 2015.

   [RFC6232]  Wei, F., Qin, Y., Li, Z., Li, T., and J. Dong, "Purge
              Originator Identification TLV for IS-IS", RFC 6232,
              DOI 10.17487/RFC6232, May 2011.

   [RFC7684]  Psenak, P., Gredler, H., Shakir, R., Henderickx, W.,
              Tantsura, J., and A. Lindem, "OSPFv2 Prefix/Link
              Attribute Advertisement", RFC 7684,
              DOI 10.17487/RFC7684, November 2015.

Authors' Addresses

   Jie Dong
   Huawei Technologies
   Huawei Campus, No.156 Beiqing Rd.
   Beijing  100095
   China

   Email: jie.dong@huawei.com


   Xudong Zhang
   Huawei Technologies
   Huawei Campus, No.156 Beiqing Rd.
   Beijing  100095
   China

   Email: zhangxudong@huawei.com


   Zhenqiang Li
   China Mobile
   No.32 Xuanwumenxi Ave., Xicheng District
   Beijing  100032
   China

   Email: li_zhenqiang@hotmail.com