Internet-Draft Node Liveness Protocol February 2022
Li Expires 7 August 2022
Workgroup:
LSR Working Group
Internet-Draft:
draft-li-lsr-liveness-02
Published:
Intended Status:
Standards Track
Expires:
Author:
Tony Li
Juniper Networks

Node Liveness Protocol

Abstract

Prompt notification of the loss of node liveness or reachability is useful for restoring services in tunneled topologies. IGP summarization prevents nodes from directly observing the status of remote nodes. This document proposes a service that, in conjunction with the IGP, provides prompt notifications without impacting IGP summarization.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 7 August 2022.

1. Introduction

Overlay services are increasingly common and are implemented by creating tunnels over a physical infrastructure. The failure of one of the tunnel endpoints implies that the traffic towards that endpoint will be lost until the other endpoint recognizes the situation and takes remedial action. Prompt notification of the failure of the other endpoint is useful in minimizing the duration of the outage.

Some network designs have come to rely on examining the IGP's Link State Database (LSDB) to determine node liveness and, through the IGP SPF computation, the node's reachability. However, if the network is to scale, some form of summarization must be employed, with the result that this information is no longer directly available. This document proposes a protocol that will provide prompt notification of changes in node liveness, even in networks that employ IGP summarization.

The service itself runs on OSPF [RFC2328] [RFC5340] Area Border Routers (ABRs) or IS-IS [ISO10589] L1-L2 routers. For brevity, we will use the term 'ABRs' for both cases.

This service uses a simple, hierarchical publish-subscribe architecture. Clients are nodes within non-backbone OSPF areas or L1 IS-IS areas. They register with their local ABRs. The ABRs are fully meshed, with the exception that ABRs of the same area need not interact. Notifications initiated by an ABR flow to other ABRs and from there to client nodes.

The availability of this service is advertised as part of the IGP, so that discovery of the service is automatic. Clients can automatically detect their local ABRs and ABRs can detect each other and automatically form the necessary hierarchy.

The protocol runs on top of TCP [RFC0793] and/or QUIC [RFC9000] for reliability. Security is provided by conventional transport protocol mechanisms, such as TLS [RFC5246].

Node liveness should not be confused with service liveness. If a node is alive, then a service may or may not be up. This protocol only tries to convey node liveness.

2. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

3. Node Liveness Capability Advertisement

The Node Liveness Protocol is run by ABRs and is advertised in the IGP for connections by clients and other ABRs. Advertisements are done both into the backbone (L2) and into non-backbone (L1) areas. The advertisements into the backbone allow ABRs to automatically mesh. The advertisements into the non-backbone areas allow clients to automatically determine where the service is available.

3.1. Node Liveness Advertisement in IS-IS

An ABR advertises the IS-IS Node Liveness sub-TLV as part of the IS-IS Router Capability TLV [RFC7981]. This is injected into the ABR's L1 and L2 LSPs. The format of the IS-IS Node Liveness sub-TLV is:

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |     Type      |     Length    |O|N|  Reserved |      TPI      |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |           Port Number         |         IPv4 Address          |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |           IPv4 Address        |         IPv6 Address...       |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The advertisement of this capability indicates that the node is providing the Node Liveness service on the designated port using the designated protocol. The TPI indicates the transport protocol to be used and the Port Number indicates the associated port to be used. The TPI and Port Number pair may be included multiple times to indicate that multiple protocols and port numbers are available. The length of the sub-TLV can be used to determine the number of TPI and Port Number pairs.

An IP address for the ABR MUST be included so that correspondents will know how to access the service. An ABR MUST provide an IPv4 address, an IPv6 address, or both.

3.2. Node Liveness Advertisement in OSPF

The availability of the Node Liveness service is advertised by the OSPF Node Liveness Sub-TLV. The OSPF Node Liveness Sub-TLV is used by both OSPFv2 and OSPFv3. The semantics are the same as the IS-IS Node Liveness Sub-TLV. The format of the OSPF Node Liveness Sub-TLV is:

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |             Type              |             Length            |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |O|N|  Reserved |      TPI      |           Port Number         |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                          IPv4 Address                         |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                          IPv6 Address...                      |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The TPI and Port Number fields are used in the same way as for IS-IS.

4. Node Liveness Protocol

4.1. Messages

The Node Liveness Protocol sends messages in a stream inside of the selected transport protocol. The protocol uses two message types: Registration Messages and Notification Messages. Each has a sub-TLV to carry specifics about the relevant prefix.

4.2. Client actions

The client may determine the set of ABRs that it wishes to communicate with by examination of its LSDB. The client SHOULD open connections to at least two ABRs for redundancy. If the client cannot open two connections, then the management system should be informed.

The client MAY send Registration Messages (with a Liveness Registration sub-TLV) on each of its ABR connections. A client MAY register for any number of prefixes, but it is expected that a client will send a registration for each of the tunnel endpoints that it will correspond with. A client may register for a host (a /32 or /128 prefix) or a shorter prefix. A client MUST NOT send overlapping registrations.
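As an illustration of the rule against overlapping registrations, the following is a minimal sketch of a check a client could run before sending. The function name find_overlaps is hypothetical and not part of this protocol; the sketch assumes that two prefixes overlap when one contains the other within the same address family.

```python
import ipaddress


def find_overlaps(prefixes):
    """Return pairs of prefixes where one contains the other.

    Hypothetical helper: 'prefixes' is a list of prefix strings that a
    client intends to register.
    """
    nets = [ipaddress.ip_network(p) for p in prefixes]
    overlaps = []
    for i, a in enumerate(nets):
        for b in nets[i + 1:]:
            # Prefixes from different address families never overlap.
            if a.version == b.version and a.overlaps(b):
                overlaps.append((str(a), str(b)))
    return overlaps
```

A client would send its registrations only if the returned list is empty.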

Clients never send Notification Messages and never receive Registration Messages.

The actions of the client on receiving a Notification Message are out of scope for this document.

4.3. ABR actions

Each ABR MUST advertise the availability of the Node Liveness service into the backbone (L2) area and into any non-backbone (L1) areas.

Each ABR MUST have a single connection to each other ABR that is part of a different non-backbone (L1) area. To prevent duplicate connections, only one ABR should initiate the connection. For IS-IS, the node with the lowest system ID should initiate the connection. For OSPFv2, the node with the lowest IPv4 router ID should initiate the connection. For OSPFv3, the node with the lowest IPv6 router ID should initiate the connection.
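The tie-break above can be sketched as a comparison of identifiers. This is a non-normative sketch: the function name should_initiate is hypothetical, and it assumes that identifiers are compared as unsigned integers in network byte order, which this document does not state explicitly.

```python
def should_initiate(local_id: bytes, remote_id: bytes) -> bool:
    """Return True if the local ABR should open the connection.

    Identifiers are packed bytes: 6 octets for an IS-IS system ID,
    4 octets for an OSPF router ID. For equal-length byte strings,
    lexicographic comparison matches big-endian numeric comparison.
    """
    if len(local_id) != len(remote_id):
        raise ValueError("identifiers must have the same length")
    if local_id == remote_id:
        raise ValueError("identifiers must be distinct")
    return local_id < remote_id
```

Because both ABRs evaluate the same comparison with the arguments swapped, exactly one of them initiates.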

Each ABR may receive Registration Messages, each containing a prefix. These are retained in a Registration Database (RDB) along with their associated connection information. If a transport connection closes, then all registrations associated with the connection should be removed from the RDB. If an ABR receives a Registration Message requesting that a prefix be unregistered, then the prefix should be removed from the RDB for that connection.
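One possible shape for the RDB is a map from connection to the set of prefixes registered on it. The sketch below is illustrative only; the class name, method names, and the use of an opaque connection identifier are assumptions, not part of this protocol.

```python
import ipaddress
from collections import defaultdict


class RegistrationDatabase:
    """Hypothetical in-memory RDB keyed by connection identifier."""

    def __init__(self):
        self._by_conn = defaultdict(set)

    def register(self, conn_id, prefix):
        """Add a registration; return False for a duplicate, which is ignored."""
        net = ipaddress.ip_network(prefix)
        if net in self._by_conn[conn_id]:
            return False
        self._by_conn[conn_id].add(net)
        return True

    def unregister(self, conn_id, prefix):
        """Remove one prefix for one connection, if present."""
        registered = self._by_conn.get(conn_id)
        if registered is not None:
            registered.discard(ipaddress.ip_network(prefix))

    def connection_closed(self, conn_id):
        """Drop every registration tied to a closed transport connection."""
        self._by_conn.pop(conn_id, None)

    def prefixes(self, conn_id):
        """Return the registered prefixes for a connection as sorted strings."""
        return sorted(str(n) for n in self._by_conn.get(conn_id, ()))
```

Keeping registrations per connection makes the cleanup on connection close a single removal.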

If an ABR receives a Registration Message for a prefix that is being injected by a non-attached area, then it SHOULD determine the set of ABRs that are advertising that prefix or less specifics and register with only those ABRs. The ABR MAY register for the prefix or any of the less specifics. It is RECOMMENDED that the ABR register for the most specific prefix that is less specific than the original prefix. If the ABR cannot find a matching prefix or less specific prefix, then the ABR MAY register for all of the prefixes that are more specific. Extreme caution should be used before registering for 0/0.
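The recommended selection is a longest-match lookup over the prefixes advertised by other ABRs. The following sketch is one way to express it; the function name, the shape of the 'advertised' mapping, and the allow_default switch are hypothetical. It assumes that an exact match counts as a candidate and that 0/0 is skipped unless explicitly allowed.

```python
import ipaddress


def choose_upstream_registration(requested, advertised, allow_default=False):
    """Pick the most specific advertised prefix covering 'requested'.

    'advertised' maps a prefix string to the ABRs advertising it.
    Returns (prefix, sorted ABR list), or None if nothing covers it.
    """
    req = ipaddress.ip_network(requested)
    best = None
    for prefix, abrs in advertised.items():
        net = ipaddress.ip_network(prefix)
        # Only same-family prefixes that contain the requested prefix qualify.
        if net.version != req.version or not req.subnet_of(net):
            continue
        # Registering for the default route needs an explicit opt-in.
        if net.prefixlen == 0 and not allow_default:
            continue
        if best is None or net.prefixlen > best[0].prefixlen:
            best = (net, sorted(abrs))
    if best is None:
        return None
    return str(best[0]), best[1]
```

For a request for 10.1.2.3/32 with 10.0.0.0/8 and 10.1.0.0/16 both advertised, the sketch selects 10.1.0.0/16 and the ABRs advertising it.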

If the ABR has registered for a prefix and that prefix is no longer advertised by another ABR, then the ABR MAY unregister, re-evaluate its registration, and register for a different prefix. In this way, if a summary prefix changes, the ABR can shift to the new summary prefix.

An ABR or client SHOULD NOT send duplicate registrations. If an ABR or client is already registered for a prefix, a duplicate registration SHALL be ignored by the receiving ABR.

Each ABR should monitor its IGP LSDB for changes in node liveness. If an ABR sees an addition to the LSDB, then it is considered an Up Event for that node. If an ABR sees an LSP/LSA time out or become unreachable, then it is considered a Down Event for that node. Up Events and Down Events for non-host prefixes are out of scope for this document.
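Event detection can be viewed as a difference between two views of the LSDB. The sketch below assumes, purely for illustration, that an implementation can summarize each view as the set of nodes that are currently present and reachable; how that set is derived from the LSDB and SPF results is implementation specific. The function name liveness_events is hypothetical.

```python
def liveness_events(previous, current):
    """Compare two sets of live node identifiers and list the events.

    A node that appears only in 'current' yields an Up Event; a node
    that appears only in 'previous' yields a Down Event.
    """
    up = [(node, "up") for node in sorted(current - previous)]
    down = [(node, "down") for node in sorted(previous - current)]
    return up + down
```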

If an ABR receives a Notification Message with an Up Event for a prefix, then it is considered an Up Event for the prefix. If an ABR receives a Notification Message with a Down Event for a prefix, then it is considered a Down Event for the prefix.

If an ABR observes an Up Event for a host, it examines its RDB for registrations for that node or for any less specific prefixes. If there are any, then the ABR sends a Notification Message (with a Liveness Notification sub-TLV) with an Up Event for that host to each node that registered. If there are no registrations, then the event MUST be ignored.

Similarly, if an ABR observes a Down Event for a host, it examines its RDB for registrations for that node or for any less specific prefixes. If there are any, then the ABR sends a Notification Message (with a Liveness Notification sub-TLV) with a Down Event for that host to each node that registered. If there are no registrations, then the event MUST be ignored.
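The RDB lookup for both Up and Down Events is the same: find every connection with a registration for the host or for a prefix that contains it. A minimal sketch follows, assuming a hypothetical RDB represented as a mapping from connection identifier to registered prefix strings.

```python
import ipaddress


def notification_targets(host, rdb):
    """Return the connections that should be notified about 'host'.

    'host' is a host prefix string (/32 or /128). 'rdb' maps a
    connection identifier to an iterable of registered prefix strings.
    An empty result means the event is ignored.
    """
    host_net = ipaddress.ip_network(host)
    targets = []
    for conn_id, prefixes in rdb.items():
        for prefix in prefixes:
            net = ipaddress.ip_network(prefix)
            # Match the host itself or any less specific prefix.
            if net.version == host_net.version and host_net.subnet_of(net):
                targets.append(conn_id)
                break  # one notification per connection
    return sorted(targets)
```

The ABR would then send one Notification Message carrying the host prefix on each returned connection.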

A client may be co-located with an ABR. In other words, an ABR may create registrations for its own purposes.

4.3.1. Autonomous Notification Mode

This section describes OPTIONAL ABR behavior.

An ABR MAY learn a set of prefixes from its management plane and enter those prefixes into its RDB. Upon an Up or Down Event for such a prefix, the ABR MAY send corresponding notification messages to all other ABRs.

This may cause ABRs to receive unexpected Notification Messages. If such a message does not match any client registration in the receiving ABR's RDB, it SHALL be ignored.

4.3.2. Proxy ABRs

Another node may perform ABR functions instead of the ABR itself. The alternate node is a 'proxy ABR' and performs all of the functions of the ABR with respect to this protocol, except for injecting capability advertisements into the LSDB. The proxy ABR should listen to the IGP within the area so that it can correctly generate notifications. One or more ABRs SHOULD advertise the availability of the proxy ABR in their capability advertisements. How the real ABRs learn about the proxy ABR is out of scope for this document.

4.4. Registration Messages

A Registration Message has the following format:

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |     Type      |     Length    |R|  Reserved   | Sub-TLVs ...
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4.4.1. Liveness Registration sub-TLV

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |     Type      |     Length    |              AFI              |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |  Prefix len   |    Prefix ...
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  • Type: 1 (Registration Message), 1 octet
  • Length: 3 + the number of octets for the prefix, 1 octet
  • AFI: Address Family Identifier [afireg], 2 octets
  • Prefix len: number of significant bits in the prefix, 1 octet
  • Prefix: The prefix to register/unregister, n octets
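The field list above can be turned into an encoder along the following lines. This is a sketch under stated assumptions: the function and constant names are hypothetical, the prefix is assumed to occupy the minimum number of whole octets needed for its significant bits, and the AFI values 1 (IPv4) and 2 (IPv6) are taken from the IANA Address Family Numbers registry [afireg].

```python
import ipaddress
import struct

AFI_IPV4 = 1  # IANA Address Family Numbers: IP (IP version 4)
AFI_IPV6 = 2  # IANA Address Family Numbers: IP6 (IP version 6)
REGISTRATION_TYPE = 1


def encode_registration_subtlv(prefix):
    """Encode a Liveness Registration sub-TLV for one prefix string."""
    net = ipaddress.ip_network(prefix)
    afi = AFI_IPV4 if net.version == 4 else AFI_IPV6
    # Assumption: only the octets covering the significant bits are sent.
    nbytes = (net.prefixlen + 7) // 8
    prefix_octets = net.network_address.packed[:nbytes]
    length = 3 + nbytes  # AFI (2) + Prefix len (1) + prefix octets
    header = struct.pack("!BBHB", REGISTRATION_TYPE, length, afi, net.prefixlen)
    return header + prefix_octets
```

For example, registering 192.0.2.1/32 yields nine octets: Type 1, Length 7, AFI 1, Prefix len 32, and the four address octets.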

4.5. Notification Messages

A Notification Message has the following format:

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |     Type      |     Length    |         Sub-TLVs ...
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4.5.1. Liveness Notification sub-TLV

The Liveness Notification sub-TLV has the format:

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |     Type      |     Length    |              AFI              |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |U|  Reserved   |  Prefix len   |         Prefix ...
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  • Type: 2 (Notification Message), 1 octet
  • Length: 4 + number of octets of prefix, 1 octet
  • AFI: Address Family Identifier [afireg], 2 octets
  • U: 1 bit
    0: Up Event
    1: Down Event
  • Reserved: must be sent as zero and ignored on receipt, 7 bits
  • Prefix len: number of significant bits in the prefix, 1 octet
  • Prefix: The prefix generating the notification, n octets
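A receiver could decode this sub-TLV as sketched below. The function name is hypothetical. The sketch assumes that the Length field counts every octet that follows it (AFI, the flag octet, Prefix len, and the prefix), that U is the most significant bit of the flag octet as drawn in the diagram, and that the prefix occupies the minimum number of whole octets.

```python
import ipaddress
import struct


def decode_notification_subtlv(data):
    """Decode one Liveness Notification sub-TLV into (event, prefix)."""
    if len(data) < 2:
        raise ValueError("truncated sub-TLV header")
    sub_type, length = data[0], data[1]
    value = data[2:2 + length]
    if sub_type != 2 or len(value) != length or length < 4:
        raise ValueError("malformed Liveness Notification sub-TLV")
    afi, flags, prefix_len = struct.unpack("!HBB", value[:4])
    if afi == 1:
        size, network_cls = 4, ipaddress.IPv4Network
    elif afi == 2:
        size, network_cls = 16, ipaddress.IPv6Network
    else:
        raise ValueError("unsupported AFI")
    nbytes = (prefix_len + 7) // 8
    if prefix_len > size * 8 or len(value) < 4 + nbytes:
        raise ValueError("prefix length does not fit the sub-TLV")
    # Pad the transmitted octets out to a full address.
    padded = value[4:4 + nbytes].ljust(size, b"\x00")
    network = network_cls((padded, prefix_len), strict=False)
    # U bit: 0 is an Up Event, 1 is a Down Event; reserved bits are ignored.
    event = "down" if flags & 0x80 else "up"
    return event, str(network)
```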

5. IANA Considerations

5.1. IS-IS

This document requests the following code points from the "IS-IS Sub-TLVs for IS-IS Router CAPABILITY TLV" registry.

5.2. OSPF

This document requests the following code points from the "OSPF Router Information (RI) TLVs" registry:

6. Security Considerations

This document creates no new security issues. Security of transport protocol connections is addressed by the use of conventional transport protocol security techniques, such as TLS. IGP advertisements are not expected to have privacy, so the advertisement of the service is not a security issue.

7. Normative References

[afireg]
IANA, "Address Family Numbers", <https://www.iana.org/assignments/address-family-numbers/address-family-numbers.xhtml#address-family-numbers-2>.
[ISO10589]
ISO, "Intermediate system to Intermediate system routing information exchange protocol for use in conjunction with the Protocol for providing the Connectionless-mode Network Service (ISO 8473)", ISO/IEC 10589:2002.
[RFC0793]
Postel, J., "Transmission Control Protocol", STD 7, RFC 793, DOI 10.17487/RFC0793, <https://www.rfc-editor.org/info/rfc793>.
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC2328]
Moy, J., "OSPF Version 2", STD 54, RFC 2328, DOI 10.17487/RFC2328, <https://www.rfc-editor.org/info/rfc2328>.
[RFC5246]
Dierks, T. and E. Rescorla, "The Transport Layer Security (TLS) Protocol Version 1.2", RFC 5246, DOI 10.17487/RFC5246, <https://www.rfc-editor.org/info/rfc5246>.
[RFC5340]
Coltun, R., Ferguson, D., Moy, J., and A. Lindem, "OSPF for IPv6", RFC 5340, DOI 10.17487/RFC5340, <https://www.rfc-editor.org/info/rfc5340>.
[RFC7981]
Ginsberg, L., Previdi, S., and M. Chen, "IS-IS Extensions for Advertising Router Information", RFC 7981, DOI 10.17487/RFC7981, <https://www.rfc-editor.org/info/rfc7981>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.
[RFC9000]
Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based Multiplexed and Secure Transport", RFC 9000, DOI 10.17487/RFC9000, <https://www.rfc-editor.org/info/rfc9000>.

Author's Address

Tony Li
Juniper Networks
1133 Innovation Way
Sunnyvale, California 94089
United States of America