Network Working Group                                    Nishit Vasavada
INTERNET DRAFT                                      Amber Networks, Inc.
                                                               July 2001


          Layer 3 VPNs using Encapsulation Services Protocol


1. Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

2. Abstract

   [RFC2547bis] defines a way to implement Layer 3 VPNs using BGP and
   MPLS. [GRE_IP_MPLS] shows a method to implement RFC 2547 style VPNs
   across a non-MPLS network. This document shows an alternative way of
   implementing Layer 3 VPNs in a non-MPLS network. Unlike
   [RFC2547bis], it also does not require BGP to be running on the PE.

3. Specification of Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [6].

4. Introduction

   [RFC2547bis] discusses in great detail the motivation and
   requirements for a Layer 3 Virtual Private Network (VPN). The goal
   of this document is the same as that of [RFC2547bis], and therefore
   the common details are not repeated here.

   [RFC2547bis] requires that the Service Provider's (SP's) network be
   MPLS-enabled. This means all the routers in the SP's core network
   MUST be able to support MPLS.
   [RFC2547bis] uses BGP for route distribution, which requires BGP to
   be running in the SP's core network, at least with an overlay
   topology. While this may be the case in a fair number of networks,
   a technology that does not require BGP for route distribution may be
   more suitable for networks not running BGP.

   [ES] defines a generic protocol to emulate and encapsulate Layer 1
   and Layer 2 circuits over a core network. We extend [ES] to carry
   the VPN Discovery Protocol (VDP) and the Route Distribution Protocol
   (RDP). VDP is an auto-discovery mechanism by which a Provider Edge
   (PE) router finds the other PE routers that are connected to sites
   of the VPNs it serves itself. RDP is an extensible mechanism to
   distribute route information for each VPN to all other PEs with
   sites belonging to that specific VPN.

   A special ES tunnel is set up between two PEs to carry L3 VPN
   traffic. It carries control and data traffic for all VPNs which are
   common to the two PEs. Inside each tunnel, each ES session
   represents a specific VPN.

    --------                                                 --------
   |        |                ES/L2TP Tunnel                 |        |
   |        |_______________________________________________|        |
   |        |                                               |        |
   |        |             Control Channel: VDP              |        |
   |        |<--------------------------------------------->|        |
   |        |                                               |        |
   |        |         Session 1: VPN A: RDP + data          |        |
   |  PE 1  |<--------------------------------------------->|  PE 2  |
   |        |                       :                       |        |
   |        |                       :                       |        |
   |        |                       :                       |        |
   |        |         Session n: VPN N: RDP + data          |        |
   |        |<--------------------------------------------->|        |
   |        |                                               |        |
   |        |_______________________________________________|        |
   |        |                                               |        |
    --------                                                 --------

             Figure 1. Two PEs running ES based L3 VPNs

   The ES tunnel is set up by following the process outlined in [ES].
   The access link type is "L3VPN". Service attributes are chosen
   according to the tunnel guiding parameters; for example, they may
   indicate the remote PE. The service type capability negotiated is
   ES, again as specified in [ES].
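
   One way to picture the layout in Figure 1 is as a per-peer table of
   sessions keyed by VPN. The Python sketch below is illustrative only:
   the Tunnel class, the build_tunnel helper, and the single-letter VPN
   names are hypothetical and are not defined by [ES] or by this
   document. It creates one session for each VPN that both PEs serve,
   which is the set of VPNs "common to the two PEs".

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Tunnel:
    """One ES tunnel between the local PE and a single peer PE (hypothetical model)."""
    peer: str
    # One session per VPN shared with the peer; the value is only a label.
    sessions: dict[str, str] = field(default_factory=dict)


def build_tunnel(peer: str, local_vpns: set[str], peer_vpns: set[str]) -> Tunnel:
    """Create one session for every VPN that both PEs have a site in."""
    tunnel = Tunnel(peer=peer)
    for vpn in sorted(local_vpns & peer_vpns):
        tunnel.sessions[vpn] = f"RDP + data for VPN {vpn}"
    return tunnel


# Assumed memberships: the local PE serves A-D, the peer serves B, D, E, F.
tunnel = build_tunnel("PE2", {"A", "B", "C", "D"}, {"B", "D", "E", "F"})
print(sorted(tunnel.sessions))  # ['B', 'D']
```

   VPNs that only one side serves (A, C, E and F here) get no session,
   so they consume no per-VPN state on the tunnel.
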

   Session 0 is reserved for carrying VDP traffic and is referred to as
   the control session in the rest of this document. The control
   session is set up as soon as the tunnel comes up. The PE with the
   numerically lower IP address initiates the session, and the PE with
   the numerically higher IP address passively awaits it.

   VPN-related route information (through RDP) and data traffic are
   carried in individual sessions, one session per VPN. Thus, one VDP
   session runs per pair of PEs (one per tunnel), while one RDP session
   runs per VPN per tunnel.

   The sessions map VPNs to an ES tunnel/session through VPN-IDs. Each
   VPN is assigned an 8-byte ID known as the VPN-ID. The VPN-ID is
   passed to the remote PE during L2TP session setup through the
   end-identifier AVP specified in [L2TPES]. The all-zeros VPN-ID is
   reserved, while the all-ones VPN-ID is used for the control session.
   Every other session carries a specific VPN-ID during session setup,
   and all the traffic through that session is mapped to the VPN
   represented by that VPN-ID.

5. VPN Discovery Protocol (VDP)

   The aim of the VPN Discovery Protocol is to dynamically determine
   VPN membership at the remote end of IP-VPN tunnels. The protocol
   communicates with its remote peer using the VPN control session (a
   special session) inside a VPN tunnel.

   The protocol is a simple, reasonably stateless, query-response
   protocol. It makes the following assumptions:

   - It runs over an unreliable transport (an ES/L2TP session in this
     case).
   - Fragmentation of protocol PDUs is handled at the underlying IP
     layer.
   - L2TP signaling is asymmetric in nature. VDP assumes that the PE
     with the numerically lower IP address always initiates the
     establishment of the underlying tunnels and sessions. The other
     end (with the numerically higher IP address) passively waits for
     incoming tunnel/session requests.

5.1. VDP Operation

   - The protocol is triggered as soon as the VPN control session
     becomes active between two PEs.
   - For each remote PE, the local PE maintains a state for each
     configured VPN on the system. The state of the VPN for the remote
     PE changes through the operation of VDP (the state transition is
     described in the next section).
   - Initially, every VPN is in the "Dirty" state for every remote PE.
     This state means that the membership information has not been
     conveyed to the remote PE.
   - A periodic timer, with a configurable timeout value, is used to
     send the list of Dirty VPNs to the remote PE as a
     VPN-Query-Request PDU.
   - The remote PE, upon receipt of the query request, sends back a
     VPN-Query-Response indicating, for each VPN listed, whether it is
     configured on the remote PE or not.
   - When the query response comes back to the local PE, each VPN
     contained in the response message goes to the "Clean" state (from
     the previous Dirty state). The Clean state means that the VPN
     membership has been conveyed to the remote PE.
   - All the VPNs in the query response that are not configured on the
     remote system remain in the Clean state until the VPN membership
     is removed from the local PE.
   - For each VPN in the query response that is configured on the
     remote PE, a VPN session request is either initiated or passively
     awaited, depending on which of the two PEs has the numerically
     lower IP address.
   - As stated earlier, the underlying transport is assumed not to
     guarantee reliable delivery. Hence, every time a new query request
     is sent, a (4-byte) sequence number is incremented and included in
     the PDU.
   - The receiver copies the sequence number from the query request to
     the query response. The sender of the query request does not
     accept any query response whose sequence number differs from the
     one included in the last query request sent.
   - A checksum is included to protect the integrity of the PDU.
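
   The steps above can be sketched as a small per-peer state tracker.
   The Python below is a minimal sketch under stated assumptions, not a
   reference implementation: the VdpPeer class, its method names, the
   "initiate"/"await" labels, the single-letter VPN names, and the
   example addresses are all hypothetical. It models only the
   Dirty/Clean bookkeeping, the sequence-number check, and the choice
   of session initiator; PDU encoding, the checksum, and the timer are
   left out.

```python
from __future__ import annotations

import ipaddress
from dataclasses import dataclass, field

DIRTY, CLEAN = "Dirty", "Clean"
SEQ_MODULUS = 2 ** 32  # the sequence number is a 4-byte field


@dataclass
class VdpPeer:
    """State the local PE keeps for one remote PE (hypothetical model)."""
    local_ip: str
    remote_ip: str
    local_vpns: dict[str, str] = field(default_factory=dict)  # VPN -> state
    next_seq: int = 0                  # assumed starting value
    last_sent_seq: int | None = None   # sequence number of the last request

    def configure_vpn(self, vpn: str) -> None:
        # Membership has not yet been conveyed to the remote PE.
        self.local_vpns[vpn] = DIRTY

    def build_request(self) -> tuple[int, list[str]] | None:
        """Run from the periodic timer: list every Dirty VPN, if any."""
        dirty = sorted(v for v, s in self.local_vpns.items() if s == DIRTY)
        if not dirty:
            return None
        seq = self.next_seq
        self.next_seq = (self.next_seq + 1) % SEQ_MODULUS
        self.last_sent_seq = seq
        return seq, dirty

    def answer_request(self, seq: int, vpns: list[str]) -> tuple[int, dict[str, bool]]:
        """Responder side: echo the sequence number, flag each VPN."""
        return seq, {v: v in self.local_vpns for v in vpns}

    def handle_response(self, seq: int, present: dict[str, bool]) -> dict[str, str]:
        """Mark answered VPNs Clean; return the session role per shared VPN."""
        if seq != self.last_sent_seq:
            return {}  # does not match the last request sent: ignore
        local = ipaddress.ip_address(self.local_ip)
        remote = ipaddress.ip_address(self.remote_ip)
        role = "initiate" if local < remote else "await"
        actions: dict[str, str] = {}
        for vpn, is_present in present.items():
            if vpn in self.local_vpns:
                self.local_vpns[vpn] = CLEAN
                if is_present:
                    actions[vpn] = role
        return actions


pe1 = VdpPeer("192.0.2.1", "192.0.2.2")
pe2 = VdpPeer("192.0.2.2", "192.0.2.1")
for name in "ABCD":
    pe1.configure_vpn(name)
for name in "BDEF":
    pe2.configure_vpn(name)

seq, vpns = pe1.build_request()
reply_seq, present = pe2.answer_request(seq, vpns)
print(pe1.handle_response(reply_seq, present))  # {'B': 'initiate', 'D': 'initiate'}
```

   Because a response with an unexpected sequence number is simply
   ignored, the affected VPNs stay Dirty and are picked up again by the
   next timer-driven request; no separate retransmission logic is
   needed in this model.
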

   At the end of the initial run of VDP, both PEs know whether each VPN
   on the local side has a member VPN site on the remote PE as well.
   The idea is to have a mechanism where the VPN memberships of a
   specific PE need not be configured at all other PEs in the network.

5.2. Message format

   The VDP message format is as shown below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Version    | Message Type  |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           VPN Membership Entries (variable number)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Version (one byte): Set to 1.

   Message Type (one byte):
      0 for VDP Request
      1 for VDP Response

   Length (two bytes): Length of the entire PDU, from the Version field
   through the Checksum field.

   Sequence Number (four bytes): Starts from 0 and wraps around after
   reaching the maximum value. The sender of a VDP Request MUST
   increment the number by one every time it sends a new request. The
   sender of a VDP Response MUST copy the Sequence Number from the
   Request it is responding to.

   VPN Membership Entries (nine bytes per entry): There can be multiple
   entries in this field. Each entry consists of two parts:

   - An 8-byte VPN ID
   - A 1-byte value that shows whether the VPN is present or not:
     1 denotes Present, 0 denotes Not Present. This value SHOULD be
     ignored on receiving the Request. The sender of the Response MUST
     set it to the proper value based on whether the VPN represented by
     the corresponding VPN ID is present or not.

   Checksum (two bytes): This is an IP-style checksum over the VDP
   message, from the Version field through the Checksum field.

5.3. Example

   For example, suppose PE1 in Figure 1 has member VPNs A, B, C and D,
   and PE2 has member VPNs B, D, E and F. The following exchange takes
   place, assuming PE2 receives PE1's query before it sends its own
   query:

   - PE1 sends a request with VPNs A, B, C and D in the VPN Membership
     Entries.
   - PE2 sends a response with VPNs A, B, C and D in the VPN Membership
     Entries. The Present/Not Present field reflects the membership
     status of the corresponding VPN at PE2. This means VPNs B and D
     are marked Present, while VPNs A and C are marked Not Present. The
     sequence number is the same as in the request received from PE1.
     The checksum is recomputed.
   - PE2 sends a request for VPNs E and F to PE1.
   - PE1 sends a response showing VPNs E and F Not Present.
   - PE1 and PE2 have one ES session set up for each of the VPNs B and
     D.

   VDP does not retransmit a PDU if no response is received. However,
   it periodically scans the database for "Dirty" entries, and sends a
   new VDP Request message if one or more such entries are found. These
   may be older entries that were never acknowledged through a VDP
   Response by the peer PE, or they may be VPNs newly configured since
   the last scan.

   If a VPN membership is removed from a PE, the PE tears down the
   ES/L2TP sessions corresponding to the VPN with each PE that had that
   VPN as a member. Thus, there is no need for an explicit VDP message
   to announce VPN membership removal.

6. Route Distribution Protocol (RDP)

   RDP is used to distribute subscriber addresses to other sites in the
   VPN. RDP is VPN specific, and therefore is carried in the session
   created for the specific VPN. In the case of ES over L2TP, the RDP
   for VPN A is sent to PE X via the L2TP tunnel set up with PE X,
   inside the ES/L2TP session set up for VPN A. This is shown in
   Figure 1.

6.1. Message format

   The RDP message format is as shown below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Version    | Message Type  |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               Route Entries (variable number)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Version (one byte): Set to 1.

   Message Type (one byte):
      0 for Route Advertise
      1 for Route Advertise Response

   Length (two bytes): Length of the entire PDU, from the Version field
   through the Checksum field.

   Sequence Number (four bytes): Starts from 0 and wraps around after
   reaching the maximum value. The sender of a Route Advertise message
   MUST increment the number by one every time it sends a new message.
   The sender of a Route Advertise Response MUST copy the Sequence
   Number from the Advertise message it is responding to.

   Route Entries (five bytes per entry): Each route entry is a 3-tuple:

   - A 32-bit prefix
   - A 4-bit mask-length, and
   - A 4-bit field that shows whether the route is being Added or
     Withdrawn. 0 denotes ### and 1 denotes ###

   Checksum (two bytes): This is an IP-style checksum over the RDP
   message, from the Version field through the Checksum field.

6.2. RDP Operation

   RDP maintains a "dirty" list of routes for each VRF. The list is
   maintained per peer PE; a route on the list is one whose status has
   not yet been confirmed by that peer PE.

   A route is added to the dirty list and put in the Dirty/Add state
   when it is added to the VRF. The route is sent to the peer PE in an
   RDP Advertise message with the add/withdraw value set to Add.
   When the peer PE acknowledges the message after processing the
   routes in the Advertise message, it includes the route in its Route
   Entries list. The local PE at this point removes the route from the
   dirty list.

   Similarly, when a route is deleted from the VRF for the VPN, it is
   added to the dirty list and put in the Dirty/Withdraw state. It is
   retained in the list until the peer PE acknowledges the Route
   Advertise message that announced the withdrawal of the route.

   When a PE receives an RDP Advertise message, it identifies the
   corresponding VRF by the ES tunnel/session. This ensures that the
   routes belonging to a specific VPN are injected only into the VRF
   corresponding to that VPN. This is essential for an L3 VPN, since it
   allows the SP's customers to have overlapping private address spaces
   without causing any confusion in the core.

   Once a VRF is identified by the receiving PE, the routes are added
   or deleted based on the Add/Withdraw field. When the PE is done
   processing the Route Advertise message, it sends the packet back to
   the PE that sent it. This serves as an acknowledgement of the Route
   Advertise. The Message Type field is set to Route Advertise
   Response, and the checksum is recomputed.

   The PE receiving the Route Advertise Response compares the sequence
   number of the Response message with that of the last Route Advertise
   message sent to the peer PE. If the sequence numbers do not match,
   the Route Advertise Response is silently discarded. If the sequence
   numbers match, the receiving PE finds all the routes listed in the
   Route Advertise Response and removes them from the dirty list.

7. Data plane operation

   When L3 VPN data is received from a CE, the VRF is chosen based on
   the interface. A lookup of the destination IP address in the VRF
   identifies the peer PE, as well as the ES tunnel/session
   corresponding to the peer PE and the VPN.
   The customer data packet is encapsulated in ES (and the lower
   transport protocol, such as L2TP) and sent to the peer PE.

   The peer PE identifies the outbound interface based on the ES
   tunnel/session information in the packet from the sending PE. The ES
   encapsulation is removed and the packet is sent out on the outbound
   interface.

8. Interface with ES

   VDP, RDP and data plane traffic is encapsulated in ES [ES]. If ES
   runs over L2TP as shown in [ES], all the sessions inside each tunnel
   between the PEs need to negotiate ES as the L2TP service type, as
   defined in [L2TPES].

9. Future Work

   - Modify the ES header to accommodate multiple types of traffic
     inside ES.
   - Assign unique VPN-IDs for inter-SP VPNs.

10. Security Considerations

   All the security considerations of the underlying ES protocol apply;
   this document introduces no new ones.

11. IANA Considerations

   None at present.

12. Intellectual Property Considerations

   Amber Networks may seek patent or other intellectual property
   protection for some or all of the technologies disclosed in this
   document. If any standards arising from this document are or become
   protected by one or more patents assigned to Amber Networks, Amber
   intends to disclose those patents and license them on reasonable and
   non-discriminatory terms.

13. Acknowledgments

   Many thanks to Himansu Sahu, Danny McPherson, Stanley Fong and
   Indira Mitchell for their help in reviewing this draft.

14. References

   [RFC2547bis]   Rosen, et al., "BGP/MPLS VPNs", Work in Progress,
                  February 2001.

   [GRE_IP_MPLS]  Rekhter, et al., "Use of PE-PE GRE or IP in RFC2547
                  VPNs", Work in Progress, June 2001.

   [ES]           Vasavada, N., "ESP: Encapsulation Services Protocol",
                  Work in Progress, July 2001.

   [L2TPES]       Vasavada, N., "Encapsulation Services Protocol
                  Service Type for L2TP",
                  draft-vasavada-l2tpext-es-svctype-00.txt, Work in
                  Progress, July 2001.

15. Author's Address

   Nishit Vasavada
   Amber Networks, Inc.
   48664 Milmont Drive
   Fremont, CA 94538
   Phone: +1 510.687.5200
   Email: nishit@ambernetworks.com