Network Working Group                                              X. Xu
Internet-Draft                                                    Huawei
Intended status: Standards Track                              P. Francis
Expires: April 30, 2009                                       Cornell U.
                                                        October 27, 2008


Tunnel Endpoints in BGP
draft-xu-idr-tunnel-00.txt

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on April 30, 2009.

Abstract

Virtual Aggregation (VA) is a mechanism for shrinking the size of the DFZ FIB in routers [I-D.francis-idr-intra-va]. VA can result in longer paths and increased load on routers within the ISP that deploys VA. This document describes a mechanism that allows an AS that originates a route to associate a tunnel endpoint terminating at itself with the route. This allows routers in a remote AS to tunnel packets to the originating AS. If transit ASes between the remote AS and the originating AS install the prefixes associated with tunnel endpoints in their FIBs, then tunneled packets that transit through them will take the shortest path. This results in reduced load for the transit AS, and better performance for the customers at the source and destination.



Table of Contents

1.  Introduction
    1.1.  Requirements notation
2.  Endpoint Address Sub-TLV Definition
3.  Usage of the TE-Attribute with an Endpoint Address Sub-TLV
    3.1.  Originating AS
    3.2.  Non-Originating ASes
4.  IANA Considerations
5.  Security Considerations
6.  Normative References
§  Authors' Addresses
§  Intellectual Property and Copyright Statements




1.  Introduction

Virtual Aggregation (VA) [I-D.francis-idr-intra-va] is a mechanism for reducing FIB size for routers within the AS that deploys VA. This is done through "FIB Suppression", where certain routers in the AS may not install routes to certain prefixes in their FIB. The downside of using VA is that packets addressed to suppressed prefixes transiting the AS may take a longer path than otherwise necessary.

For instance, imagine a packet traversing AS-path S-A-B-C-D, where ASes S and D are the service providers for their respective customers. Further, assume that ASes A, C, and D are using VA, and that A and C are FIB-suppressing the prefix associated with the packet. In this case, when the packet transits A and C, there is a good chance that it will take an extra router hop within A and C. This increases load for A and C, and degrades performance for S's and D's customers.

The mechanism described in this draft allows D, for instance, to associate a tunnel endpoint with the prefixes that it originates. The tunnel endpoint is an anycasted address that terminates at all of D's routers. If A and C FIB-install the route to the prefix associated with the tunnel endpoint, then packets tunneled to the FIB-suppressed prefix will take the shortest path.

This draft describes a mechanism for advertising the tunnel endpoint address in BGP. It does so without changes to how BGP computes routes, and in such a way that packets always follow the expected AS path. In other words, a tunnel T to a prefix P is not used unless the AS-path of the tunnel route and the AS-path of the prefix route are the same.

This draft uses the Tunnel Encapsulation Attribute (TE-Attribute) defined in [I-D.ietf-softwire-encaps-safi] to encode the tunnel information. However, whereas [I-D.ietf-softwire-encaps-safi] couples the TE-Attribute with the "Encapsulation SAFI", this draft uses the TE-Attribute in normal BGP updates transmitted over multiple ASes across the Internet.

This draft extends the use of the optional, transitive TE-Attribute defined in section 4 of [I-D.ietf-softwire-encaps-safi]. Its purpose, as defined in [I-D.ietf-softwire-encaps-safi], is to allow a router acting as a tunnel endpoint to signal its tunnel type and tunnel parameters. The TE-Attribute does not convey the actual IPv4 or IPv6 address of the tunnel endpoint. Rather, this information is carried in the NEXT_HOP field of BGP. As such, the scope of [I-D.ietf-softwire-encaps-safi] is only within the set of routers that do not change the NEXT_HOP field.

This draft extends the use of the TE-Attribute so that it can be passed from AS to AS in normal BGP reachability updates.



1.1.  Requirements notation

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].



2.  Endpoint Address Sub-TLV Definition

This draft defines a new sub-TLV to be used with the TE-Attribute, the "Endpoint Address" sub-TLV. The sub-TLV Type is TBD. The sub-TLV Value field is defined as:

        +---------------------------------------------------------+
        | Address Family Identifier (2 octets)                    |
        +---------------------------------------------------------+
        | Reserved  (1 octet)                                     |
        +---------------------------------------------------------+
        | Length of Autonomous System Number (1 octet)            |
        +---------------------------------------------------------+
        | Autonomous System Number (variable)                     |
        +---------------------------------------------------------+
        | Endpoint Address (variable)                             |
        +---------------------------------------------------------+

The Autonomous System (AS) Number is that of the AS originating the route. The Endpoint Address is that of the tunnel endpoint.
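As an illustration only, the Value field layout above could be packed and parsed as in the following Python sketch. Several details are assumptions that the draft does not state: fields are in network byte order, the Reserved octet is sent as zero, the AS number length is counted in octets, and the Endpoint Address length is inferred from the Address Family Identifier (1 for IPv4 and 2 for IPv6, per the IANA address family numbers). The function names are hypothetical.

```python
import ipaddress
import struct

# IANA address family numbers for IPv4 and IPv6.
AFI_IPV4 = 1
AFI_IPV6 = 2
_ADDR_LEN = {AFI_IPV4: 4, AFI_IPV6: 16}


def encode_endpoint_value(asn, endpoint, asn_len=4):
    """Pack AFI, Reserved, AS number length, AS number, and address."""
    addr = ipaddress.ip_address(endpoint)
    afi = AFI_IPV4 if addr.version == 4 else AFI_IPV6
    # Assumption: the Reserved octet is transmitted as zero.
    header = struct.pack("!HBB", afi, 0, asn_len)
    return header + asn.to_bytes(asn_len, "big") + addr.packed


def decode_endpoint_value(value):
    """Unpack a Value field into (afi, asn, endpoint address)."""
    afi, _reserved, asn_len = struct.unpack("!HBB", value[:4])
    asn = int.from_bytes(value[4:4 + asn_len], "big")
    raw = value[4 + asn_len:]
    # Assumption: the address length is implied by the AFI.
    if afi not in _ADDR_LEN or len(raw) != _ADDR_LEN[afi]:
        raise ValueError("address length does not match AFI")
    return afi, asn, ipaddress.ip_address(raw)
```

With a 4-octet AS number and an IPv4 endpoint, the encoded Value is 12 octets long under these assumptions.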



3.  Usage of the TE-Attribute with an Endpoint Address Sub-TLV

The following usage rules apply only to TE-Attributes that are NOT associated with the Encapsulation SAFI defined by [I-D.ietf-softwire-encaps-safi] and that include an Endpoint Address Sub-TLV.



3.1.  Originating AS

Only the router originating a route may include a TE-Attribute. In other words, the TE-Attribute MUST NOT be added to received routes. The AS number in the Endpoint Address Sub-TLV MUST match that used as the first AS in the AS path. The Endpoint Address itself does not have to be from the same AF as the reachable NLRI in the update. The reachable NLRI may be both IPv4 and IPv6. However, there MUST be an NLRI in the UPDATE that contains the endpoint address.
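The two checks in the paragraph above that can be evaluated mechanically (the AS number match, and the presence of an NLRI covering the endpoint address) might be sketched as follows. This is a minimal illustration, not part of the specification; the function name and argument shapes are hypothetical, and the caller is assumed to supply the originating AS number directly rather than extract it from an encoded AS_PATH.

```python
import ipaddress


def origin_update_is_consistent(origin_asn, tlv_asn, endpoint, nlri):
    """Check an originated UPDATE against the two rules above.

    origin_asn -- the originating AS number (hypothetical input)
    tlv_asn    -- AS number carried in the Endpoint Address Sub-TLV
    endpoint   -- Endpoint Address as a string
    nlri       -- iterable of reachable prefixes as strings
    """
    # Rule: the Sub-TLV AS number must match the originating AS.
    if tlv_asn != origin_asn:
        return False
    addr = ipaddress.ip_address(endpoint)
    # Rule: some NLRI in the UPDATE must contain the endpoint address.
    # The NLRI may mix IPv4 and IPv6, so compare only within one family.
    for prefix in nlri:
        net = ipaddress.ip_network(prefix)
        if net.version == addr.version and addr in net:
            return True
    return False
```

For example, an UPDATE carrying 198.51.100.0/24 and 192.0.2.0/24 with endpoint 192.0.2.1 passes, while the same endpoint paired only with 2001:db8::/32 does not.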

An originating AS as defined here may be an AS that receives a route from a customer that uses a private AS number.

If a tunnel endpoint router receives a packet on the tunnel, and the only known route to the destination is via routes originated by other ASes (not including private ASes of customers), then the packet must be dropped. This prevents transient loops in which each of the ASes serving a multi-homed customer believes that the other AS can reach the customer. Once the route withdrawal reaches all other ASes, no more packets will be received via the tunnel.

All routers in the origin AS MUST use the same Endpoint Address, which is anycasted across all routers. The reason for imposing this restriction is as follows. Say that an origin AS used different endpoint addresses for different routers, and that an upstream AS that does not recognize the TE-Attribute decided to aggregate two UPDATEs with different endpoint addresses. The aggregating AS might drop one of the TE-Attributes but include the other, with the result that the tunnel endpoint in the resulting UPDATE would be undetectably incorrect with respect to some of the NLRI in the UPDATE.

The complete TE-attribute produced by all routers in the originating AS MUST be identical. The Protocol and Color Sub-TLV types are not used. If the encapsulation technique is GRE, and no key value is used, then the Endpoint Address Sub-TLV is the only one required. If the key value is used, or L2TPv3 is the tunnel type, then the Encapsulation Sub-TLV associated with the tunnel type is included.

Note that even though the above paragraph states that the TE-attribute produced by all routers must be identical, in practice this is not strictly possible. If an AS decides to modify the endpoint address it uses, or decides to modify the tunnel type or tunnel parameters it uses, then for some period of time different routers will in fact be producing different TE-Attributes (i.e. while routers are being reconfigured). When this is the case, all routers MUST be able to receive tunneled packets for every TE-Attribute being produced by any router in the AS. For example, assume that an AS wants to modify its TE-Attributes from tunnel A to tunnel B (where A and B have different endpoint addresses, different tunnel types, or different tunnel parameters). The network administrator would first configure all routers to accept both tunnels A and B. He or she would then modify routers to produce TE-Attributes for tunnel B. After that is complete, he or she would delete tunnel A from all routers.
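The ordering of the reconfiguration steps matters because of the MUST in the paragraph above. The following Python sketch models the three phases and checks, after every single-router change, that each router accepts every tunnel that any router is advertising. The data model (a dictionary per router with "accepts" and "advertises" entries) and the function names are purely illustrative assumptions.

```python
def invariant_holds(routers):
    """True when every router accepts every tunnel any router advertises."""
    advertised = {r["advertises"] for r in routers.values()}
    return all(advertised <= r["accepts"] for r in routers.values())


def migrate(routers, old, new):
    """Move all routers from tunnel `old` to tunnel `new` in three phases.

    Returns the number of single-router changes applied. Raises
    RuntimeError if the invariant is ever violated.
    """
    def check():
        if not invariant_holds(routers):
            raise RuntimeError("a router cannot receive an advertised tunnel")

    steps = 0
    for r in routers.values():      # phase 1: accept both tunnels
        r["accepts"].add(new)
        check()
        steps += 1
    for r in routers.values():      # phase 2: advertise the new tunnel
        r["advertises"] = new
        check()
        steps += 1
    for r in routers.values():      # phase 3: retire the old tunnel
        r["accepts"].discard(old)
        check()
        steps += 1
    return steps
```

If a router were switched to advertise tunnel B before the others accepted it, `invariant_holds` would return False, which corresponds to the window in which tunneled packets could arrive at a router unable to receive them.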

It is for further study whether IP-in-IP encapsulation is required. It is also for further study whether multiple encapsulation types are required for the same UPDATE (i.e. to allow a remote router that supports only a limited set of encapsulation types to select one that works for it).



3.2.  Non-Originating ASes

ASes that have deployed VA SHOULD FIB-install routes containing the Endpoint Address. This will prevent packets tunneled to Endpoint Addresses from taking any extra hops.

When a router in a non-originating AS receives a route with an associated Endpoint Address, it must decide whether or not to use the tunnel. The router always has the option of ignoring the tunnel (and will do so by default if it does not recognize the TE-attribute). This section describes the criteria that determine when the router may use the tunnel.

The router MUST NOT use the tunnel UNLESS the following criteria are met:

  1. The AS_PATH to the tunnel endpoint matches the AS path to the reachable prefix.
  2. The AS_PATH advertised by the AS, for all NLRI for which a tunnel is used, matches that of the tunnel.
  3. The first AS in the AS_PATH Attribute is in an AS-SEQUENCE (not an AS-SET), and this AS matches the AS in the TE-attribute. This prevents an error whereby an aggregating AS combines NLRI from different originating ASes, and throws away all but one of the TE-attributes, thus resulting in an Endpoint Address that is incorrect.
  4. If there are multiple TE-attributes in the update, they MUST all be identical. In this case, the AS SHOULD delete all but one of the TE-attributes from UPDATEs it passes on. If they are not all identical, then the AS MUST ignore them and remove all of them from any UPDATEs that it passes on.
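The four criteria above can be read as a single predicate. The sketch below is one possible reading, under assumptions that the draft does not make explicit: "first AS" is taken to mean the originating AS, paths are stored with the origin segment first, criterion 2 is modeled by comparing the path advertised for the prefix with the path advertised for the tunnel endpoint route, and each TE-attribute is reduced to an (AS number, endpoint address) pair. All type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# A segment is ("SEQUENCE" or "SET", AS numbers).
Segment = Tuple[str, Tuple[int, ...]]
# Assumption: the origin segment is stored first.
Path = Tuple[Segment, ...]


@dataclass(frozen=True)
class Route:
    received_path: Path
    advertised_path: Optional[Path] = None  # None if not passed on


def may_use_tunnel(prefix, endpoint, te_attributes):
    """Return True only if all four criteria are met.

    prefix, endpoint -- Route objects for the reachable prefix and for
                        the route covering the tunnel endpoint
    te_attributes    -- list of (asn, endpoint_address) pairs
    """
    # Criterion 4: multiple TE-attributes must all be identical.
    if not te_attributes or len(set(te_attributes)) != 1:
        return False
    te_asn, _endpoint_address = te_attributes[0]
    # Criterion 1: received AS paths must match.
    if prefix.received_path != endpoint.received_path:
        return False
    # Criterion 2: advertised AS paths must match.
    if prefix.advertised_path != endpoint.advertised_path:
        return False
    # Criterion 3: origin is in an AS-SEQUENCE and matches the TE AS.
    if not prefix.received_path:
        return False
    kind, asns = prefix.received_path[0]
    return kind == "SEQUENCE" and len(asns) > 0 and asns[0] == te_asn
```

A router for which this predicate is false simply forwards without the tunnel, which is always permitted.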

Note that the above rules have the characteristic that, if a transit AS decides to use one AS path to some prefixes from an origin AS, and another AS path to other prefixes from the origin AS, then only one of these paths can have a valid endpoint address associated with it. Packets transmitted to the other path cannot be tunneled. One way to fix this, which would require changes to this draft, would be to encode the tunnel endpoint as a block of addresses. In this case, a transit AS that wishes to use multiple paths to different prefixes from an origin AS can deaggregate the block of addresses, and associate one deaggregated tunnel endpoint block with each selected path. Whether this is a good idea is for further study.



4.  IANA Considerations

IANA must issue a new Sub-TLV type for the Tunnel Encapsulation Attribute for the Endpoint Address Sub-TLV.



5.  Security Considerations

Because there are no changes in the BGP route selection process, there are no changes to the security properties of BGP as a result of this draft.



6.  Normative References

[I-D.francis-idr-intra-va] Francis, P., Xu, X., and H. Ballani, “FIB Suppression with Virtual Aggregation and Default Routes,” draft-francis-idr-intra-va-01 (work in progress), September 2008.
[I-D.ietf-softwire-encaps-safi] Mohapatra, P. and E. Rosen, “BGP Encapsulation SAFI and BGP Tunnel Encapsulation Attribute,” draft-ietf-softwire-encaps-safi-03 (work in progress), June 2008.
[RFC2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997.


Authors' Addresses

  Xiaohu Xu
  Huawei Technologies
  No.3 Xinxi Rd., Shang-Di Information Industry Base, Hai-Dian District
  Beijing, Beijing 100085
  P.R.China
  Phone:  +86 10 82836073
  Email:  xuxh@huawei.com

  Paul Francis
  Cornell University
  4108 Upson Hall
  Ithaca, NY 14853
  US
  Phone:  +1 607 255 9223
  Email:  francis@cs.cornell.edu


Full Copyright Statement

Intellectual Property