Network Working Group                                             X. Cui
Internet Draft                                                    Huawei
Intended status: Informational                                   X. Chen
Expires: August 2010                                        China Mobile
                                                       February 11, 2010


                 SCTP Association Changeover Guideline
                draft-cui-tsvwg-assoc-changeover-00.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 10, 2010.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.

Cui, et al.
Expires August 10, 2010                                         [Page 1]
Internet-Draft      Association Changeover Framework       February 2010

Abstract

   Sigtran specifies association-level reliable transport for signaling
   using SCTP, but in some scenarios the failure handling of a single
   association is not sufficient to achieve telco-grade reliability.
   This document specifies procedures for an SCTP association changeover
   solution that enables applications to reach a higher degree of
   availability. Two generic changeover solutions are presented. One is
   implemented inside the SCTP protocol stack; the other requires SCTP
   and the ULP to work in collaboration.

Conventions used in this document

   In examples, "C:" and "S:" indicate lines sent by the client and
   server respectively.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

Table of Contents

   1. Introduction and Problem Statement
   2. Terminology
   3. SCTP Association Failure Analysis
      3.1. Communication Loss
      3.2. Endpoint Terminates Association
   4. Applicability Consideration for Association Changeover
   5. Generic Requirements for SCTP Association Changeover
   6. New Chunks and Options
      6.1. Exchange TSN Notify (ETN) Chunk
      6.2. Exchange TSN Acknowledgement (ETA) Chunk
      6.3. SCTP-Organizing Mode Negotiation Options
   7. Configuration Variables
   8. Deployment Scenarios and Solution Approach
      8.1. Association Changeover Initiation
      8.2. ETN Processing
      8.3. ETA Processing
      8.4. Error Processing
      8.5. Association Failure with Cached ETN
   9. Solution Guideline
      9.1. ULP-Organizing Changeover
      9.2. SCTP-Organizing Changeover
      9.3. Negotiation for Association Changeover
      9.4. Alternative Association Selection
      9.5. Comparison with RFC4960
   10. Security Considerations
   11. IANA Considerations
   12. Acknowledgments
   13. References
      13.1. Normative References
      13.2. Informative References
   Authors' Addresses

1. Introduction and Problem Statement

   Sigtran is designed for the transport of signaling in IP-based
   networks. The goal of Sigtran is to provide a reliable transport
   service to application signaling. Most applications expect Sigtran to
   provide a lossless, duplicate-free bearer transport. Applications
   built on top of Sigtran usually do not implement any additional
   reliability mechanism of their own. But in fact, as stated in
   sections 3.1 and 3.3 of [I-D.sigtran-network-management-ps], a single
   SCTP association cannot meet crucial requirements of some current
   telecommunication networks.
   If an application uses multiple SCTP associations to compensate for
   this deficiency, the lack of inter-association changeover messages
   leads to concrete problems. Consider the following scenario:

   +-------+--------+                     +--------+-------+
   |       | Assoc1 |---------------------| Assoc1 |       |
   |       |--------|                     |--------|       |
   |  eNB  | Assoc2 |---------------------| Assoc2 |  MME  |
   | S1-AP |--------|                     |--------| S1-AP |
   |       | . . .  |---------------------| . . .  |       |
   +-------+--------+                     +--------+-------+

             Figure 1 S1-AP over SCTP Changeover Issue

   In this example, S1-AP is used on the S1-MME interface, through which
   the E-UTRAN accesses the EPC network, and SCTP provides the signaling
   bearer for S1-AP. If multiple associations (Assoc1 and Assoc2) are
   established and one of them is interrupted, problems arise.

   When Assoc1 is interrupted, S1-AP (acting as the ULP) may retrieve
   the unsent and unacknowledged signaling messages and retransmit them
   to the MME over the other association. But as stated in section 3.1
   of [I-D.sigtran-network-management-ps], existing mechanisms cannot
   handle this situation.

   A detailed example is the eNodeB-initiated E-RAB release procedure,
   which is specified in section 8.2.3.2.2 of [3GPP-TS-36413]. The UE ID
   and E-RAB ID elements are included in the message. The message flow
   is as follows:

          eNB                                         MME
   +----------------+                          +----------------+
   |     S1-AP      | E-RAB RELEASE INDICATION |     S1-AP      |
   |----------------|                          |----------------|
   |      SCTP      |------------------------->|      SCTP      |
   +----------------+                          +----------------+

           Figure 2 eNB Initiated E-RAB Release Indication

   If the eNB resends all the retrieved messages to the MME, the MME may
   receive duplicate signaling messages. The MME cannot tolerate these
   duplicated messages.
   Suppose that the association is interrupted after the S1-AP E-RAB
   RELEASE INDICATION message arrives at the MME, so that the eNB does
   not receive the corresponding SACK. A few seconds later, the eNB
   detects that the association is lost and resends the unacknowledged
   messages, either when the interrupted association is reestablished or
   over the other association. From the standpoint of the MME, however,
   after receiving the first S1-AP E-RAB RELEASE INDICATION message the
   MME releases the corresponding E-RAB and may later allocate the same
   E-RAB to a new session for the same UE. If the MME then receives the
   second (i.e., duplicated) E-RAB RELEASE INDICATION message, it
   wrongly releases the bearer that is now used by the new session.
   Because a bearer resource is identified only by UE ID and E-RAB ID,
   the MME cannot distinguish the two messages.

   If the eNB does not resend the retrieved messages to the MME, some
   messages may be lost. Suppose that the association is interrupted
   before the S1-AP E-RAB RELEASE INDICATION message arrives at the MME.
   The MME then never receives this message and the bearer resource is
   not released as needed.

   All signaling transactions without a handshake are affected by this
   problem, and signaling transactions with a handshake may run into
   similar trouble. For example:

          Endpoint A                          Endpoint B
      ULP         SCTP                    SCTP         ULP
       |  Request  |                        |           |
       |-----------|  DATA (Request)        |           |
       |           | ---------------------> |  Request  |
       |           |                 SACK   |-----------|
       |           |   (lost) <-------      | Response  |
       |           |  DATA (Response)       |-----------|
       |           | <-----------------     |           |
       | Response  |                        |           |
       |-----------|                        |           |

            Figure 3 Transaction Acknowledgement Issue

   In this case, the ULP of endpoint B receives the ULP Request message
   and responds with a ULP Response message.
   But at the IP layer, endpoint A may receive the DATA chunk containing
   the ULP Response without receiving the SACK chunk for the ULP Request
   chunk. (The IP layer gives no ordering guarantee.) The ULP
   transaction has completed successfully, but SCTP still regards the
   Request message as unacknowledged. If endpoint A resends the ULP
   Request at the SCTP layer, the duplicated Request message causes
   confusion at the peer endpoint. If the ULP of endpoint A retrieves
   the unacknowledged messages from SCTP, the SCTP layer returns this
   message as unacknowledged although it was in fact transmitted
   successfully. The ULP cannot recognize that this Request message
   belongs to a finished transaction either.

   So, the existing Sigtran solutions cannot meet the network
   requirements and an SCTP association changeover mechanism SHOULD be
   provided.

2. Terminology

   All the Sigtran related terms used in this document are to be
   interpreted as defined in Stream Control Transmission Protocol
   [RFC4960].

   This document also provides context-specific explanations for the
   following terms, which are defined in 3GPP specifications.

   eNodeB (eNB)

      The eNodeB is the radio station of 3GPP's future LTE wireless
      communication standard.

   Evolved Packet Core (EPC)

      EPC is the core network architecture of 3GPP's future LTE wireless
      communication standard.

   E-RAB

      An E-RAB uniquely identifies the concatenation of an S1 Bearer and
      the corresponding Data Radio Bearer.

   E-UTRAN

      E-UTRAN is the Evolved Universal Terrestrial Radio Access Network
      in the 3GPP standard.

   Mobility Management Entity (MME)

      MME is the key control plane entity within the Evolved Packet
      Core. MME functions include Non-Access-Stratum signaling, mobility
      management, bearer management, etc.

   S1

      Interface between an eNB and an EPC, providing an interconnection
      point between the E-UTRAN and the EPC.

   S1-MME

      The S1-MME is a reference point for the control plane protocol
      between E-UTRAN and MME in the 3GPP standard.

3. SCTP Association Failure Analysis

   SCTP associations can fail for many reasons. This document aims to
   provide robust association changeover solutions for most association
   failures, so that message loss and duplication are avoided.

3.1. Communication Loss

   Communication loss can be detected by SCTP endpoints. The failure
   detection mechanism uses DATA and HEARTBEAT chunks for this purpose.

   The reason for communication loss may be IP network trouble or
   endpoint issues. Since SCTP already uses multihoming, IP network
   trouble is usually a common obstacle for all associations.
   Association changeover can therefore overcome only some communication
   loss cases.

3.2. Endpoint Terminates Association

   Sometimes an endpoint terminates the association in a graceful (i.e.,
   shutdown) or ungraceful (i.e., abort) manner. Ungraceful termination
   can cause message loss. The following issues may lead to ungraceful
   association termination.

   o The endpoint host is out of resources. When the endpoint host
     becomes overloaded, it may send an ABORT chunk to the peer endpoint
     to terminate this association and offload the traffic.

   o There is a protocol violation, i.e., an offending packet. When the
     endpoint receives an offending packet, such as a DATA chunk without
     user data or a SACK acknowledging an invalid TSN, it will send an
     ABORT chunk to the peer endpoint.

   o A maintenance mistake happens. For example, if the administrator
     reconfigures and restarts the association with new addresses while
     the association is in ESTABLISHED state, the peer endpoint would
     send an ABORT chunk to abort the association.

   o The user initiates an abort at one endpoint. When the SCTP layer
     gets this indication it will send an ABORT chunk to the peer
     endpoint.
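
   As a non-normative illustration, the causes listed above can be
   related to the ABORT error cause codes of [RFC4960] (section 3.3.10).
   The event labels in the sketch below are hypothetical names chosen
   for this example, and mapping "SACK acknowledging an invalid TSN" to
   Protocol Violation is an assumption of the sketch, not a statement of
   this document.

```python
from enum import IntEnum


class AbortCause(IntEnum):
    """Subset of the RFC 4960 section 3.3.10 error cause codes."""
    OUT_OF_RESOURCE = 4
    NO_USER_DATA = 9
    RESTART_WITH_NEW_ADDRESSES = 11
    USER_INITIATED_ABORT = 12
    PROTOCOL_VIOLATION = 13


# Hypothetical event labels, one per situation named in the list above.
ABORT_CAUSE_FOR_EVENT = {
    "host_overloaded": AbortCause.OUT_OF_RESOURCE,
    "data_chunk_without_user_data": AbortCause.NO_USER_DATA,
    # Assumption: an invalid-TSN SACK is reported as Protocol Violation.
    "sack_acks_invalid_tsn": AbortCause.PROTOCOL_VIOLATION,
    "restart_with_new_addresses": AbortCause.RESTART_WITH_NEW_ADDRESSES,
    "user_abort": AbortCause.USER_INITIATED_ABORT,
}


def is_ungraceful(termination: str) -> bool:
    """'shutdown' is graceful; 'abort' is ungraceful and may lose messages."""
    return termination == "abort"
```

   Every entry in the table ends in an ABORT, which is the ungraceful
   case that association changeover is meant to repair.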

   The terminating endpoint may have prepared for the ungraceful
   termination but the peer endpoint has not, so the peer endpoint needs
   some repair mechanism.

4. Applicability Consideration for Association Changeover

   This section presents some considerations on the applicability of the
   association changeover mechanism.

   The association changeover mechanism is expected to enhance the
   reliability of signaling transport even in some extreme situations,
   and further to improve the performance of the signaling network.

   However, SCTP association changeover runs in the SCTP layer or upper
   layers, so it can hardly overcome problems of the IP layer. SCTP
   association changeover is designed for unexpected problems at the
   SCTP layer that are encountered in practice.

   For example, in the basic scenario where the "M3UA+SCTP" protocol
   stack is used, many ASP sites may connect to the SGP site by multiple
   SCTP associations, and the "n+k" model is applied in the M3UA layer.
   The network architecture is as follows:

                             --------------Site C (ASP)
                            /
   +---------+---------+   /
   |         | Assoc x |---     active     +---------+--------+
   |         | Assoc 1 |---------------------| Assoc 1 |        |
   |         |---------|       active        |---------|        |
   |         | Assoc 2 |---------------------| Assoc 2 |        |
   | Site A  |---------|       active        |---------| Site B |
   |  (SGP)  |  . . .  |---------------------|  . . .  | (ASP)  |
   |         |---------|      inactive       |---------|        |
   |         | Assoc k |---------------------| Assoc k |        |
   |         |---------|                     |---------|        |
   |         |  . . .  |---------------------|  . . .  |        |
   +---------+---------+                     +---------+--------+

                  Figure 4 Association Overload Issue

   In existing devices, associations are usually distributed over
   several boards/hosts. For example, in Site A, association A1 is on
   board/host A1, association A2 is on board/host A2, etc. In Site B,
   association B1 is on board/host B1, association B2 is on board/host
   B2, and so on.
   Furthermore, association x of Site A is also on board/host A1. (This
   is a very common case, where multiple associations are provided on
   one board/host.)

   From the standpoint of Site A, Site B and Site C may be connected to
   different board/host sets, because it is unreasonable to require the
   same boards/hosts and the same number of associations to be
   configured for different peer sites.

   In the setup stage, M3UA of Site B uses associations B1~Bn (i.e., the
   active ASPs) for load balancing, and M3UA of Site A uses the same
   set. Note that the association set is selected by Site B, since
   Site B is in the ASP role and sends the ASP UP / ASP ACTIVE messages.
   Site C connects to Site A in the same way as Site B does.

   Boards B1~Bn of Site B are balanced at this time, but unfortunately
   that is not true in Site A, because there is more traffic on
   board/host A1 (from and to Site C). So board/host A1 may be
   overloaded. The reason is that two intersecting sets lead to an
   unbalanced situation although each set is balanced in itself. (For
   example, in Site A, boards #1, #2, #3 and #4 serve Site B, while
   boards #1, #5, #6 and #7 serve Site C, so board #1 may be
   overloaded.)

   In this situation (i.e., board/host A1 is overloaded), association A1
   SHOULD send an "Out of Resource" indication to the peer node to
   offload part of the traffic from board/host A1.

   If we use the rwnd==0 indication (no receive buffer resource),
   association #1 is still established, the corresponding M3UA ASP is
   still active, and part of the user traffic is still delivered to the
   association at both Site A and Site B. The endpoints will buffer more
   and more messages (at least in the B to A direction), until the
   message buffer overflows or the association is terminated. This is
   not the desired result.

   If we use the Abort indication (Out of Resource, which means a lack
   of CPU, memory or other resources), association #1 between Site A and
   Site B is terminated immediately and the corresponding M3UA ASP
   becomes inactive too. The M3UA layer then activates the standby ASP k
   (e.g., the association on board/host #8). The active bearer role is
   thus transferred from #1 to #k, and the traffic is transferred at the
   same time. As a result, the load of board/host A1 decreases while
   association #x to Site C remains active. This provides a dynamic and
   robust load balancing mechanism for Site A.

5. Generic Requirements for SCTP Association Changeover

   As analyzed in [I-D.sigtran-network-management-ps], the essential
   requirement is that the endpoint pair can exchange accurate
   transmission information, so that each endpoint knows which messages
   were really not received by the peer endpoint and the unacknowledged
   messages can later be retransmitted to the peer endpoint.

   Two modes of association changeover are considered in this document.
   The first mode is ULP-Organizing mode. Its requirements include the
   following:

   o The ULP can retrieve exactly the unsent and unacknowledged
     messages, in sequence.

   o The ULP/SCTP can find the alternative association(s) over which to
     retransmit the unsent and unacknowledged messages.

   o If more than one alternative association is available for
     retransmission, only one association can be selected for the
     messages requiring sequenced delivery.

   o For the messages that do not require sequenced delivery, all the
     available alternative association(s) may be used and load balancing
     may be considered.

   The second mode is SCTP-Organizing mode. Its requirements include the
   following:

   o The ULP need not retrieve the messages that have been transferred
     to the SCTP layer.
     The ULP may treat the SCTP layer as an entirely reliable transport.

   o The SCTP layer can find the alternative association(s) over which
     to retransmit the unsent and unacknowledged messages.

   o The SCTP layer can exchange the acknowledgement information with
     the peer endpoint, retrieve the unsent and unacknowledged messages
     from the interrupted association and resend these messages over the
     alternative association(s).

   o If more than one alternative association is available for
     retransmission, only one association can be selected for the
     messages requiring sequenced delivery.

   o For the messages that do not require sequenced delivery, all the
     available alternative association(s) may be used and load balancing
     may be considered.

   The ULP and the SCTP layer of an endpoint can negotiate the expected
   association changeover mode by existing primitives, and the endpoint
   can also negotiate it with the peer endpoint during association
   initiation.

   One issue in SCTP-Organizing mode needs more consideration. While the
   SCTP layer is carrying out the association changeover, the unsent and
   unacknowledged messages (generated earlier by the ULP) are
   retransmitted over the alternative association(s). Meanwhile, the
   interrupted association may be reestablished and the ULP may begin to
   transmit new messages over it. It is therefore possible that ULP
   messages carried over the reestablished association arrive at the
   peer endpoint earlier than ULP messages carried over the alternative
   association(s). This mis-sequenced delivery may be avoided if the ULP
   does not begin to transmit ULP messages until the association has
   completed a period of transport testing. The details of this case are
   for further study.

6. New Chunks and Options

6.1. Exchange TSN Notify (ETN) Chunk

   The Exchange TSN Notify chunk is used for the following purposes:

   o To notify the sender's data acknowledgement information.

   o To request retransmission of the unsent and unacknowledged messages
     in the SCTP-Organizing mode. The receiver of ETN SHOULD retransmit
     the unsent and unacknowledged messages in the SCTP-Organizing mode.

   o To indicate the available alternative association(s) in the
     SCTP-Organizing mode. The receiver of ETN SHOULD retransmit the
     unsent and unacknowledged messages over the association(s) on which
     the Exchange TSN Notify chunk is received.

   The format of the Exchange TSN Notify chunk is shown below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Type = TBD1  |  Reserved   |R|         Chunk Length          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Address Type = 5/6       |     Address Length = 8/20     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                   Primary Source IP Address                   |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Address Type = 5/6       |     Address Length = 8/20     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                Primary Destination IP Address                 |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Source Port Number       |    Destination Port Number    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Corresponding Verification Tag                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Cumulative TSN Ack                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Reserved            | Number of Gap Ack Blocks = N  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Gap Ack Block #1 Start     |     Gap Ack Block #1 End      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                                                               /
   \                             . . .                             \
   /                                                               /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Gap Ack Block #N Start     |     Gap Ack Block #N End      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Type (Mandatory): 8 bits (unsigned integer)

      Set to TBD1 (allocated by IANA). The highest-order 2 bits of TBD1
      SHOULD be '11'. When the receiver receives this chunk but does not
      support it, the receiver SHOULD skip this chunk and continue
      processing, but report in an ERROR chunk using the 'Unrecognized
      Chunk Type' cause of error.

   Chunk Flags (Mandatory): 8 bits (unsigned integer)

      Reserved: 7 bits

         Should be set to all '0's and ignored by the receiver.

      R bit: 1 bit

         The (R)etransmission bit, if set to '1', indicates that the
         SCTP layer of the receiver is requested to retransmit the
         unsent and unacknowledged messages that belong to the
         interrupted association. The value of '0' is used for
         ULP-Organizing mode and the value of '1' is used for
         SCTP-Organizing mode.

   Chunk Length (Mandatory): 16 bits (unsigned integer)

      This field indicates the length of the Exchange TSN Notify chunk
      in bytes, from the beginning of the Type field to the end of the
      chunk.

      In an IPv4 network environment, an Exchange TSN Notify chunk
      without Gap Ack Blocks will have Length set to 36 and an Exchange
      TSN Notify chunk with N Gap Ack Blocks will have Length set to
      36 + N*4.

      In an IPv6 network environment, an Exchange TSN Notify chunk
      without Gap Ack Blocks will have Length set to 60 and an Exchange
      TSN Notify chunk with N Gap Ack Blocks will have Length set to
      60 + N*4.

   Address Type (Mandatory): 16 bits (unsigned integer)

      Set to 5 if the IP address is an IPv4 address or set to 6 if the
      IP address is an IPv6 address.

   Primary Source IP Address (Mandatory): 32 bits or 128 bits (unsigned
   integer, depending on the Address Type)

      Set to the primary source IP address of the interrupted
      association which needs changeover.
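
   As a non-normative illustration of the layout and Chunk Length rules
   given above, the following minimal Python sketch serializes an ETN
   chunk field by field as drawn in the figure. The names pack_etn,
   etn_chunk_length and ETN_TYPE_PLACEHOLDER are hypothetical, and the
   value 0xC1 merely stands in for the unassigned TBD1 (any value whose
   two highest-order bits are '11' would satisfy the Type rule).

```python
import ipaddress
import struct

# Hypothetical placeholder: TBD1 has not been assigned by IANA.
ETN_TYPE_PLACEHOLDER = 0xC1


def etn_chunk_length(ipv6: bool, n_gap_blocks: int) -> int:
    """Chunk Length per the text: 36 (IPv4) or 60 (IPv6), plus N*4."""
    return (60 if ipv6 else 36) + 4 * n_gap_blocks


def _address_param(addr) -> bytes:
    """Address Type (5 = IPv4, 6 = IPv6), Address Length (8/20), address."""
    addr_type = 5 if addr.version == 4 else 6
    return struct.pack("!HH", addr_type, 4 + len(addr.packed)) + addr.packed


def pack_etn(src, dst, sport, dport, vtag, cum_tsn_ack,
             gap_blocks=(), r_bit=False) -> bytes:
    """Serialize an ETN chunk in network byte order."""
    src, dst = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    if src.version != dst.version:
        raise ValueError("source and destination must share one IP version")
    length = etn_chunk_length(src.version == 6, len(gap_blocks))
    # Chunk Flags: 7 reserved bits (zero) followed by the R bit.
    chunk = struct.pack("!BBH", ETN_TYPE_PLACEHOLDER, int(r_bit), length)
    chunk += _address_param(src) + _address_param(dst)
    chunk += struct.pack("!HHII", sport, dport, vtag, cum_tsn_ack)
    chunk += struct.pack("!HH", 0, len(gap_blocks))  # Reserved, N
    for start, end in gap_blocks:
        chunk += struct.pack("!HH", start, end)
    return chunk
```

   For instance, pack_etn("192.0.2.1", "192.0.2.2", 2905, 2905, 0x1234,
   100, [(2, 3)], r_bit=True) yields a 40-byte chunk (36 + 1*4) whose
   Chunk Flags octet is 00000001, as used in SCTP-Organizing mode.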

   Primary Destination IP Address (Mandatory): 32 bits or 128 bits
   (unsigned integer, depending on the Address Type)

      Set to the primary destination IP address of the interrupted
      association which needs changeover.

   Source Port Number (Mandatory): 16 bits (unsigned integer)

      Set to the source port number of the interrupted association which
      needs changeover.

   Destination Port Number (Mandatory): 16 bits (unsigned integer)

      Set to the destination port number of the interrupted association
      which needs changeover.

   Corresponding Verification Tag (Mandatory): 32 bits (unsigned
   integer)

      Set to the Verification Tag of the interrupted association, which
      is specified in [RFC4960].

   The format and meaning of the Cumulative TSN Ack (Mandatory), Number
   of Gap Ack Blocks (Mandatory) and Gap Ack Block parameters
   (Conditional) are the same as for the Selective Acknowledgement chunk
   defined in section 3.3.4 of [RFC4960]. The exception is that some
   TSNs that were previously acknowledged via a SACK Gap Ack Block MAY
   be reneged by the sender of ETN. (See Section 8 for information on
   reneging.)

   Note that both endpoints of the interrupted association may send an
   Exchange TSN Notify chunk and may send it over different alternative
   association(s). The unacknowledged messages of the two directions may
   therefore be retransmitted over different association(s).

6.2. Exchange TSN Acknowledgement (ETA) Chunk

   The Exchange TSN Acknowledgement chunk is used for the following
   purposes:

   o To acknowledge the corresponding Exchange TSN Notify chunk.

   o To notify the ETA sender's data acknowledgement information.

   o To request the retransmission of the unsent and unacknowledged
     messages in SCTP-Organizing mode. The receiver of ETA SHOULD
     retransmit the unsent and unacknowledged messages in the
     SCTP-Organizing mode.

   o To indicate the alternative association(s) in the SCTP-Organizing
     mode. The receiver of ETA SHOULD retransmit the unsent and
     unacknowledged messages over the association(s) on which the ETA
     chunk is received.

   The format of the Exchange TSN Acknowledgement chunk is shown below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Type = TBD2  |   Status    |R|         Chunk Length          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Address Type = 5/6       |     Address Length = 8/20     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                   Primary Source IP Address                   |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Address Type = 5/6       |     Address Length = 8/20     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                Primary Destination IP Address                 |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Source Port Number       |    Destination Port Number    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Corresponding Verification Tag                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Cumulative TSN Ack                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Reserved            | Number of Gap Ack Blocks = N  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Gap Ack Block #1 Start     |     Gap Ack Block #1 End      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                                                               /
   \                             . . .                             \
   /                                                               /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Gap Ack Block #N Start     |     Gap Ack Block #N End      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Type (Mandatory): 8 bits (unsigned integer)

      Set to TBD2 (allocated by IANA). The highest-order 2 bits of TBD2
      SHOULD be '11'.
      When the receiver receives this chunk but does not support it, the
      receiver SHOULD skip this chunk and continue processing, but
      report in an ERROR chunk using the 'Unrecognized Chunk Type' cause
      of error.

   Status (Mandatory): 7 bits (unsigned integer)

      A 7-bit unsigned integer indicating the disposition of the TSN
      exchange and association changeover. Values of the Status field
      less than 64 indicate that the corresponding Exchange TSN Notify
      was accepted by the sender of the ETA. Values greater than or
      equal to 64 indicate that the corresponding Exchange TSN Notify
      was rejected by the sender of the ETA.

      0000000: Exchange TSN Notify chunk is received.

      0000001: Exchange TSN Notify chunk is received and local data
               acknowledgement information is provided.

      1xxxxxx: Reject acknowledgement.

      Other values: Reserved.

   R bit (Mandatory): 1 bit (unsigned integer)

      The (R)etransmission bit, if set to '1', indicates that the SCTP
      layer of the receiver is requested to retransmit the unsent and
      unacknowledged messages that belong to the interrupted
      association. The sender of ETA SHOULD set the 'R' bit to '1' only
      when the value of the Status field of the ETA chunk is 0000001 (in
      the current version). That is, the endpoint can request
      retransmission by the peer's SCTP layer via ETA only when it
      accepts the peer's ETN and provides its own data acknowledgement
      information. The value of '0' is used for ULP-Organizing mode and
      the value of '1' is used for SCTP-Organizing mode.

   The format and description of Chunk Length, Address Type, Primary
   Source IP Address, Primary Destination IP Address, Source Port
   Number, Destination Port Number, Corresponding Verification Tag,
   Cumulative TSN Ack, Number of Gap Ack Blocks and Gap Ack Block
   parameters are the same as for the Exchange TSN Notify chunk
   (section 6.1).
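
   As a non-normative illustration of the Status and R bit rules above,
   the following minimal Python sketch decodes the second octet of an
   ETA chunk. It assumes, as drawn in the ETA figure, that Status
   occupies the upper 7 bits of that octet and the R bit is the
   lowest-order bit. The names EtaFlags, decode_eta_flags and
   r_bit_allowed are hypothetical.

```python
from typing import NamedTuple


class EtaFlags(NamedTuple):
    status: int      # 7-bit Status field
    r_bit: bool      # (R)etransmission bit
    accepted: bool   # True when Status < 64 (ETN accepted)


def decode_eta_flags(flags_octet: int) -> EtaFlags:
    """Split the second octet of an ETA chunk into Status and R bit."""
    if not 0 <= flags_octet <= 0xFF:
        raise ValueError("flags must fit in one octet")
    status = flags_octet >> 1           # upper 7 bits
    r_bit = bool(flags_octet & 0x01)    # lowest-order bit
    return EtaFlags(status, r_bit, status < 64)


def r_bit_allowed(flags: EtaFlags) -> bool:
    """R may be '1' only together with Status 0000001 (current version)."""
    return (not flags.r_bit) or flags.status == 0b0000001
```

   A receiver could use such a check to flag an ETA that requests
   retransmission while rejecting the ETN, a combination the R bit rule
   above discourages.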

   As exceptions, the Cumulative TSN Ack, Number of Gap Ack Blocks and
   Gap Ack Block parameters are Conditional parameters in the ETA chunk.
   These parameters can be included in the ETA chunk only when the
   Status value of the ETA chunk is 0000001 (in the current version).

6.3. SCTP-Organizing Mode Negotiation Options

   TBD.

7. Configuration Variables

   Changeover Timer

      The SCTP layer SHOULD start this timer after the association is
      terminated unexpectedly or ungracefully. While this timer is
      running, the SCTP layer SHOULD keep all the parameters of the
      terminated association and all the unsent and unacknowledged
      messages. The default value of the Changeover Timer is 10 seconds.

8. Deployment Scenarios and Solution Approach

   In this deployment scenario, more than one association is established
   between the endpoint pair and the SCTP layer is assumed to provide a
   local management function (e.g., alternative association selection).

   +-------+--------+                     +--------+-------+
   |       | Assoc1 |---------------------| Assoc1 |       |
   |       |--------|                     |--------|       |
   |  ULP  | Assoc2 |---------------------| Assoc2 |  ULP  |
   |       |--------|                     |--------|       |
   |       | . . .  |---------------------| . . .  |       |
   +-------+--------+                     +--------+-------+

      Endpoint A      IP network                   Endpoint B
  ULP            SCTP      |                   SCTP           ULP
   | SEND (data)  |        |                    | SEND (data)  |
   |------------->|>       |                    |<-------------|
   |              | \      |  data              |              |
   |              |  >-----|------------------->|              |
   | Assoc#1 LOST |        |                    | Assoc#1 LOST |
   |<-------------|        |                    |------------->|
   |             /|        |                    |\             |
   |            / |        |                    | \            |
   |           || |        |                    | ||           |
   |           |\ | Exchange TSN of #1 in #2    | /|           |
   unacked data| \|<-------|------------------->|/ |unacked data
   |<--------  |  |        |                    |  |  -------->|
   |            \ |        |                    | /            |
   |             \| (reSEND unacknowledged data)|/             |
   |<------------>|<-------|------------------->|<------------>|

       Figure 5 SCTP TSN Exchange and Association Changeover

   The basic purpose of the ETN/ETA chunk exchange is to restore
   accurate data transmission information, so that the SCTP layer can
   update its status and either provide the right unacknowledged
   messages to the ULP (ULP-Organizing mode) or resend the messages
   itself (SCTP-Organizing mode).

   In SCTP-Organizing mode, the ETN/ETA can additionally request the
   peer endpoint to retransmit the unsent and unacknowledged messages.
   In this case, the SCTP layer may transfer all unsent and
   unacknowledged messages from the interrupted association to the
   alternative association(s) and resend these messages after receiving
   the retransmission request in the ETN/ETA chunk.

   The principle of alternative association selection is specified in
   section 9.4.

8.1. Association Changeover Initiation

   When the SCTP layer detects an association interruption, it sends a
   notification to the ULP as specified in [RFC4960]. If the association
   is terminated unexpectedly or ungracefully, and there is no cached
   ETN for the association, the SCTP layer SHOULD simultaneously perform
   the following operations:

   o The SCTP layer starts the Changeover Timer, caches the data
     transmission information of the interrupted association (e.g.
      Last Rcvd TSN and Mapping Array of the association TCB), and
      holds the unsent and unacknowledged messages until they are
      retrieved or retransmitted, or until the Changeover procedure is
      finished.

   o  The SCTP layer releases all the messages buffered in the
      Receiving Queue and Reasm Queue and reneges on them, treating
      them as unacknowledged.

   o  The SCTP layer chooses alternative association(s) for the
      interrupted association as specified in section 9.4.

   o  The SCTP layer broadcasts an Exchange TSN Notify (ETN) chunk to
      the peer endpoint in the selected alternative association(s).
      The IP address, port number and Corresponding Verification Tag
      of the interrupted association MUST be included in the ETN chunk
      to identify the corresponding association. The cumulative
      acknowledged TSN and the gap acknowledged TSN blocks are
      included in the chunk to indicate which DATA chunks have been
      successfully received by the sender of the ETN. In
      ULP-Organizing mode the Chunk Flags of the ETN SHOULD be set to
      00000000 (i.e., the 'R' bit is set to '0'), and in
      SCTP-Organizing mode the Chunk Flags of the ETN SHOULD be set to
      00000001 (i.e., the 'R' bit is set to '1').

   o  The ETN chunk is a control chunk, but a separate T3-rtx timer
      SHOULD be applied to it. This T3-rtx timer is stopped when the
      corresponding ETA or ERROR chunk is received.

   o  When the SCTP layer receives the corresponding ETA or ERROR
      chunk, it MUST stop the running Changeover Timer.

   o  All the buffered parameters and messages for the interrupted
      association MUST be released when the Changeover procedure is
      finished.

8.2. ETN Processing

   When the SCTP layer receives an Exchange TSN Notify chunk and the
   endpoint supports the SCTP Exchange/Changeover function, the SCTP
   layer SHOULD perform the following operations:

   o  The SCTP layer swaps the Primary Source IP Address and the
      Primary Destination Address included in the received ETN chunk.
   o  The SCTP layer swaps the Source Port Number and Destination Port
      Number included in the received ETN chunk.

   o  The SCTP layer checks whether the Primary Source IP Address,
      Primary Destination Address, Source Port Number and Destination
      Port Number identify an existing association and whether the
      Verification Tag matches. If there is no corresponding
      association, the SCTP layer MUST respond by rejecting the ETN
      and finish the changeover procedure.

   o  The SCTP layer chooses alternative association(s) for the
      interrupted association as specified in section 9.4. Only
      associations that are acceptable to both the remote endpoint
      (i.e., the associations carrying the ETN chunk) and the local
      endpoint can be selected as alternative association(s).

   o  The SCTP layer caches the ETN chunk and checks the status of the
      corresponding association.

   o  If the corresponding association is in the ESTABLISHED state,
      the SCTP layer responds with an ETA chunk with Status value
      0000000 for each received ETN chunk. No Cumulative TSN Ack,
      Number of Gap Ack Blocks or Gap Ack Block parameters are
      included in the ETA chunk. The 'R' bit of the ETA MUST be set to
      '0' in this case.

   o  If the corresponding association has been recognized as
      interrupted, the SCTP layer SHOULD respond with an ETA chunk
      with Status value 0000001 for each received ETN chunk, and in
      SCTP-Organizing mode additionally set the 'R' bit of the ETA
      chunk to '1'. The Cumulative TSN Ack, Number of Gap Ack Blocks
      and Gap Ack Block parameters (if there are Gap Ack Blocks)
      SHOULD be included in the ETA chunk in this case. The receiver
      of the ETN SHOULD release the acknowledged messages (both
      cumulatively acknowledged and gap acknowledged) from the
      unacknowledged message buffer queue. The receiver of the ETN
      SHOULD immediately retransmit the unsent and unacknowledged
      messages when the 'R' bit of the received ETN chunk is set to
      '1'.

8.3. ETA Processing

   When the SCTP layer receives an Exchange TSN Acknowledgement chunk
   and the endpoint supports the SCTP Exchange/Changeover function,
   the SCTP layer SHOULD perform the following operations:

   o  The SCTP layer checks whether there is a corresponding ETN
      chunk. If so, the SCTP layer MUST release the buffered ETN chunk
      and stop the Changeover Timer. Note that the unsent and
      unacknowledged messages that are buffered for the changeover are
      not released at this time.

   o  If the Status value of the received ETA is 0000001, the received
      ETA SHOULD be buffered.

   o  If the 'R' bit of the received ETA is set to '1', the receiver
      of the ETA SHOULD immediately update the unacknowledged message
      queue of the interrupted association and retransmit the unsent
      and unacknowledged messages.

   o  The SCTP layer MUST NOT send any chunk in response to the
      received ETA chunk.

8.4. Error Processing

   When the SCTP layer receives an ERROR chunk whose Cause Code is 6
   (Unrecognized Chunk Type) and whose Unrecognized Chunk field
   contains an ETN, the SCTP layer SHOULD immediately finish the
   changeover procedure and release all related resources.

8.5. Association Failure with Cached ETN

   When the SCTP layer detects an association failure and a
   corresponding ETN chunk has been cached, the SCTP layer SHOULD
   continue the changeover operations as if the ETN chunk had just
   been received (e.g. responding with an ETA chunk, releasing
   acknowledged messages and, in SCTP-Organizing mode, retransmitting
   the unsent and unacknowledged messages). Note that in this case the
   Status of the ETA SHOULD be set to 0000001 and the acknowledgement
   information SHOULD be included in the ETA.

9. Solution Guideline

9.1. ULP-Organizing Changeover

   ULP-Organizing mode is a mild changeover solution. The ULP MAY
   negotiate with the SCTP layer for this mode, or even rely on the
   basic implementation specified in [RFC4960] and other
   specifications. The ULP SHOULD use the retrieve primitives and
   manage the retrieved messages itself.
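
   In either mode, the SCTP layer first has to work out which of the
   buffered messages the peer has already received. The following
   Python sketch illustrates one way to apply the acknowledgement
   information of an ETA chunk to a buffered queue. It is not part of
   this specification: the names Eta, is_acked and prune_unacked are
   hypothetical, and the sketch assumes that Gap Ack Blocks in the ETA
   are start/end offsets from the Cumulative TSN Ack, as in the SACK
   chunk of [RFC4960], and that TSNs are compared with 32-bit serial
   number arithmetic.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

TSN_MOD = 2 ** 32  # TSNs are 32-bit values and wrap around


@dataclass
class Eta:
    """Hypothetical in-memory view of a received ETA chunk."""
    status: int                  # 1: peer also sees the association as interrupted
    cum_tsn_ack: int = 0         # only meaningful when status == 1
    # Assumption: (start, end) offsets relative to cum_tsn_ack, as in SACK.
    gap_blocks: Sequence[Tuple[int, int]] = ()


def is_acked(tsn: int, eta: Eta) -> bool:
    """Return True if the peer reported this TSN as received."""
    # Cumulatively acknowledged: tsn <= cum_tsn_ack in serial arithmetic.
    if (eta.cum_tsn_ack - tsn) % TSN_MOD < TSN_MOD // 2:
        return True
    # Gap acknowledged: the offset falls inside one of the reported blocks.
    offset = (tsn - eta.cum_tsn_ack) % TSN_MOD
    return any(start <= offset <= end for start, end in eta.gap_blocks)


def prune_unacked(queue: List[Tuple[int, bytes]], eta: Eta) -> List[Tuple[int, bytes]]:
    """Return the (tsn, message) pairs that are still unacknowledged.

    Only messages that were already assigned a TSN belong in `queue`;
    unsent messages are kept regardless of the ETA contents.
    """
    if eta.status != 1:
        # Status 0000000 carries no acknowledgement information.
        return list(queue)
    return [(tsn, msg) for tsn, msg in queue if not is_acked(tsn, eta)]
```

   For example, with buffered TSNs 10 to 14, a Cumulative TSN Ack of
   11 and one Gap Ack Block (2, 2), the messages with TSN 10, 11 and
   13 are released, and those with TSN 12 and 14 remain to be handed
   to the ULP or resent in the alternative association(s).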

   When an association is interrupted, the ULP SHOULD retrieve the
   unsent and unacknowledged messages as in the existing
   specifications. The SCTP layer then provides exactly the unsent and
   unacknowledged messages to the ULP. The only difference for the ULP
   is a small delay during the retrieve procedure, of about one RTT of
   the alternative association.

9.2. SCTP-Organizing Changeover

   SCTP-Organizing mode is an intensive changeover solution. For this
   mode, the endpoint SHOULD negotiate not only between its local
   layers but also with the peer endpoint. When an association is
   terminated, the SCTP layer can exchange the acknowledgement
   information with the peer endpoint, retrieve the unsent and
   unacknowledged messages from the interrupted association and resend
   these messages in the alternative association(s). Local management
   functionality in the SCTP layer is needed in this mode. The ULP
   need not track the messages that have already been handed to the
   SCTP layer, but SHOULD take into account the relative pace between
   the ULP and the SCTP layer.

9.3. Negotiation for Association Changeover

   TBD.

9.4. Alternative Association Selection

   When an abnormal association failure is detected, the available
   alternative association(s) may be searched for changeover purposes.
   The alternative association(s) MUST meet the following principles:

   1, The alternative association(s) MUST connect the same endpoint
      pair as the interrupted association. The source IP address and
      destination IP address of the associations MUST match.

   2, The alternative association(s) MUST provide service to the same
      ULP process as the interrupted association. Note that an
      endpoint can only check locally that the alternative
      association(s) and the interrupted association provide service
      to the same ULP process.

   3, The initiator of the changeover broadcasts the ETN chunk in all
      of its candidate association(s), which means that all the
      association(s) delivering the ETN are available to the initiator
      endpoint.

   4, The receiver of the ETN selects alternative association(s)
      according to items 1 and 2 of this section from among the
      associations in which the ETN chunk was delivered, and
      additionally responds with the ETA in the finally selected
      alternative association(s).

   5, The exchange of ETN/ETA identifies the alternative
      association(s) that are suitable for both endpoints. Each
      endpoint can only use as alternative association(s) those
      associations that delivered both the ETN and the ETA chunk.

   6, For messages requiring sequenced delivery, only one alternative
      association can be selected; for other messages, all alternative
      association(s) may be used.

9.5. Comparison with RFC4960

   This document specifies some enhancements to the basic SCTP
   specification. The extensions are the following:

   The status and buffered outbound messages (in the outbound
   transmission queue and retransmission queue) of an abnormally
   interrupted association SHOULD be kept for a certain duration.

   The buffered inbound messages (in the inbound receiving queue and
   reassembly queue) of an abnormally interrupted association SHOULD
   be released (the same as in [RFC4960]) and explicitly reneged.

   The SCTP layer SHOULD be able to run a timer for the changeover
   procedure.

   The SCTP layer SHOULD be able to exchange the accurate data
   acknowledgement status by ETN/ETA chunks in the changeover
   procedure.

   The SCTP layer SHOULD be able to select the alternative
   association(s) in the changeover procedure.

   In ULP-Organizing mode, when the SCTP layer receives retrieve
   primitives while the ETN is not yet acknowledged, the SCTP layer
   SHOULD be able to hold the retrieve request and respond after the
   ETA chunk is received.

   In SCTP-Organizing mode, the SCTP layer SHOULD be able to retrieve
   the unsent and unacknowledged messages from the interrupted
   association and resend these messages in the alternative
   association(s).

10. Security Considerations

   Security considerations regarding changeover are needed. The
   security solution SHOULD fulfill the requirements of all involved
   nodes and of the signaling traffic.

11. IANA Considerations

   TBD1 and TBD2 are new SCTP chunk types. IANA is requested to assign
   new type values for these two SCTP chunk types.

12. Acknowledgments

   The authors would like to especially thank Randall Stewart for his
   review and valuable comments.

13. References

13.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol",
             RFC 4960, September 2007.

13.2. Informative References

   [I-D.sigtran-network-management-ps]
             Cui, X. and X. Chen, "Problem Statement for Sigtran
             Network Management", draft-cui-tsvwg-snm-ps-00.txt (work
             in progress), January 2010.

   [3GPP-TS-36413]
             3GPP, "Evolved Universal Terrestrial Radio Access Network
             (E-UTRAN); S1 Application Protocol (S1AP) (Release 9)",
             December 2009.

Authors' Addresses

   Xiangsong Cui
   Huawei Technologies
   KuiKe Bld., No.9 Xinxi Rd.,
   Shang-Di Information Industry Base,
   Hai-Dian District, Beijing, P.R. China, 100085

   Email: Xiangsong.Cui@huawei.com

   Xu Chen
   China Mobile
   53A XiBianMennei Ave, XuanWu District,
   Beijing, China

   Phone: 86 10 15801696688 3163
   Email: chenxu@chinamobile.com