SIPPING Working Group                                            V. Hilt
Internet-Draft                                                I. Widjaja
Expires: August 26, 2008                       Bell Labs/Alcatel-Lucent
                                                                D. Malas
                                                 Level 3 Communications
                                                          H. Schulzrinne
                                                     Columbia University
                                                       February 23, 2008


Session Initiation Protocol (SIP) Overload Control
draft-hilt-sipping-overload-04

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on August 26, 2008.

Abstract

Overload occurs in Session Initiation Protocol (SIP) networks when SIP servers have insufficient resources to handle all SIP messages they receive. Even though the SIP protocol provides a limited overload control mechanism through its 503 (Service Unavailable) response code, SIP servers are still vulnerable to overload. This document proposes new overload control mechanisms for SIP.



Table of Contents

1.  Introduction
2.  Terminology
3.  Design Considerations
    3.1.  System Model
    3.2.  Degree of Cooperation
        3.2.1.  Local Overload Control
        3.2.2.  Hop-by-Hop
        3.2.3.  End-to-End
    3.3.  Topologies
    3.4.  Overload Control Method
        3.4.1.  Rate-based Overload Control
        3.4.2.  Loss-based Overload Control
        3.4.3.  Window-based Overload Control
    3.5.  Overload Control Algorithms
    3.6.  Self-Limiting
    3.7.  Load Status
    3.8.  SIP Mechanism
        3.8.1.  SIP Response Header
        3.8.2.  SIP Event Package
    3.9.  Backwards Compatibility
    3.10.  Interaction with Local Overload Control
4.  SIP Application Considerations
    4.1.  Responding to an Overload Indication
    4.2.  Message Prioritization
    4.3.  Privacy Considerations
5.  Via Header Parameters for Overload Control
    5.1.  The 'oc_accept' Parameter
    5.2.  Creating the 'oc' Parameter
    5.3.  Determining the 'oc' Parameter Value
    5.4.  Processing the 'oc' Parameter
    5.5.  Using the 'oc' Parameter Value
    5.6.  Rejecting Requests
    5.7.  Self-Limiting
    5.8.  Syntax
6.  'Overload-Control' Event Package
    6.1.  Event Package Name
    6.2.  Event Package Parameters
    6.3.  SUBSCRIBE Bodies
    6.4.  Subscription Duration
    6.5.  NOTIFY Bodies
    6.6.  Subscriber generation of SUBSCRIBE requests
    6.7.  Notifier processing of SUBSCRIBE requests
    6.8.  Notifier generation of NOTIFY requests
    6.9.  Subscriber processing of NOTIFY requests
    6.10.  Handling of forked requests
    6.11.  Rate of notifications
    6.12.  State Agents
    6.13.  Examples
7.  Security Considerations
8.  IANA Considerations
Appendix A.  Acknowledgements
9.  References
    9.1.  Normative References
    9.2.  Informative References
Authors' Addresses
Intellectual Property and Copyright Statements





1.  Introduction

As with any network element, a Session Initiation Protocol (SIP) [RFC3261] server can suffer from overload when the number of SIP messages it receives exceeds the number of messages it can process. Overload can pose a serious problem for a network of SIP servers. During periods of overload, the throughput of a network of SIP servers can be significantly degraded. In fact, overload may lead to a situation in which the throughput drops down to a small fraction of the original processing capacity. This is often called congestion collapse.

Overload is said to occur if a SIP server does not have sufficient resources to process all incoming SIP messages. These resources may include CPU processing capacity, memory, network bandwidth, input/output, or disk resources.

For overload control, we only consider failure cases where SIP servers are unable to process all SIP requests. There are other cases where a SIP server can successfully process incoming requests but has to reject them due to other failure conditions. For example, a PSTN gateway that runs out of trunk lines but still has plenty of capacity to process SIP messages should reject incoming INVITEs using a 488 (Not Acceptable Here) response [RFC3261]. Similarly, a SIP registrar that has lost connectivity to its registration database but is still capable of processing SIP messages should reject REGISTER requests with a 500 (Server Error) response [RFC3261]. Overload control does not apply to these cases and SIP provides appropriate response codes for them.

The SIP protocol provides a limited mechanism for overload control through its 503 (Service Unavailable) response code. However, this mechanism cannot prevent overload of a SIP server and it cannot prevent congestion collapse. In fact, the use of the 503 (Service Unavailable) response code may cause traffic to oscillate and to shift between SIP servers and thereby worsen an overload condition. A detailed discussion of the SIP overload problem, the problems with the 503 (Service Unavailable) response code and the requirements for a SIP overload control mechanism can be found in [I-D.rosenberg-sipping-overload-reqs].




2.  Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].




3.  Design Considerations

This section discusses key design considerations for a SIP overload control mechanism. The goal for this mechanism is to enable a SIP server to control the amount of traffic it receives from its upstream neighbors.




3.1.  System Model

The model shown in Figure 1 (System Model for Overload Control) identifies fundamental components of a SIP overload control system:

SIP Processor:
The SIP Processor processes SIP messages and is the component that is protected by overload control.
Monitor:
The Monitor measures the current load of the SIP processor on the receiving entity. It implements the mechanisms needed to determine the current usage of resources relevant for the SIP processor and reports load samples (S) to the Control Function.
Control Function:
The Control Function implements the overload control algorithm. The control function uses the load samples (S) and determines if overload has occurred and a throttle (T) needs to be set to adjust the load sent to the SIP processor on the receiving entity. The control function on the receiving entity sends load feedback (F) to the sending entity.
Actuator:
The Actuator implements the algorithms needed to act on the throttles (T) and to adjust the amount of traffic forwarded to the receiving entity. For example, a throttle may instruct the Actuator to reduce the traffic destined to the receiving entity by 10%. The algorithms in the Actuator then determine how the traffic reduction is achieved, e.g., by selecting the messages that will be affected and determining whether they are rejected or redirected.

The type of feedback (F) conveyed from the receiving to the sending entity depends on the overload control method used (i.e., loss-based, rate-based or window-based overload control; see Section 3.4 (Overload Control Method)), the overload control algorithm (see Section 3.5 (Overload Control Algorithms)) as well as other design parameters. In any case, the feedback (F) enables the sending entity to adjust the amount of traffic forwarded to the receiving entity to a level that is acceptable to the receiving entity without causing overload.



       Sending                Receiving
        Entity                  Entity
  +----------------+      +----------------+
  |    Server A    |      |    Server B    |
  |  +----------+  |      |  +----------+  |    -+
  |  | Control  |  |  F   |  | Control  |  |     |
  |  | Function |<-+------+--| Function |  |     |
  |  +----------+  |      |  +----------+  |     |
  |     T |        |      |       ^        |     | Overload
  |       v        |      |       | S      |     | Control
  |  +----------+  |      |  +----------+  |     |
  |  | Actuator |  |      |  | Monitor  |  |     |
  |  +----------+  |      |  +----------+  |     |
  |       |        |      |       ^        |    -+
  |       v        |      |       |        |    -+
  |  +----------+  |      |  +----------+  |     |
<-+--|   SIP    |  |      |  |   SIP    |  |     |  SIP
--+->|Processor |--+------+->|Processor |--+->   | System
  |  +----------+  |      |  +----------+  |     |
  +----------------+      +----------------+    -+

 Figure 1: System Model for Overload Control 




3.2.  Degree of Cooperation

A SIP request is often processed by more than one SIP server on its path to the destination. Thus, a design choice for overload control is where to place the components of overload control along the path of a request and, in particular, where to place the Monitor and Actuator. This design choice determines the degree of cooperation between the SIP servers on the path. Overload control can be implemented locally on a SIP server if Monitor and Actuator reside on the same server. Overload control can be implemented hop-by-hop with the Monitor on one server and the Actuator on its direct upstream neighbor. Finally, overload control can be implemented end-to-end with Monitors on all SIP servers along the path of a request and one Actuator on the sender. In this case, Monitors have to cooperate to jointly determine the current resource usage on this path. These three configurations are shown in Figure 2 (Degree of Cooperation between Servers).




                      +-+                    +---------+
                      v |           +------+ |         |
 +-+      +-+        +---+          |      | |        +---+
 v |      v |    //=>| C |          v      | v    //=>| C |
+---+    +---+ //    +---+       +---+    +---+ //    +---+
| A |===>| B |                   | A |===>| B |
+---+    +---+ \\    +---+       +---+    +---+ \\    +---+
                 \\=>| D |                   ^    \\=>| D |
                     +---+                   |        +---+
                      ^ |                    |         |
                      +-+                    +---------+

        (a) local                      (b) hop-by-hop

   +------(+)---------+
   |       ^          |
   |       |         +---+
   v       |     //=>| C |
+---+    +---+ //    +---+
| A |===>| B |
+---+    +---+ \\    +---+
   ^       |     \\=>| D |
   |       |         +---+
   |       v          |
   +------(+)---------+

      (c) end-to-end

 ==> SIP request flow
 <-- Overload feedback loop

 Figure 2: Degree of Cooperation between Servers 




3.2.1.  Local Overload Control

Servers can implement SIP overload control locally. This does not require any cooperation with neighboring SIP servers. All overload control components (Monitor, Control Function, Actuator) reside on the same SIP element. The idea of local overload control is to detect when a SIP server reaches a high load and, if overload occurs, to start rejecting requests with as little effort as possible, i.e., as early as possible in the processing. Since rejecting these messages requires less processing capacity than fully processing them, a server is able to gracefully reject excess messages instead of simply dropping them. However, once the number of incoming requests exceeds the server's capacity to reject them, the server will still become overloaded.

Local overload control does not require protocol support and is out of scope for this document.




3.2.2.  Hop-by-Hop

The idea of hop-by-hop overload control is to instantiate a separate control loop between all neighboring SIP servers that directly exchange traffic. I.e., the Actuator is located on the SIP server that is the direct upstream neighbor of the SIP server that has the corresponding Monitor. Each control loop between two servers is completely independent of the control loop with other servers further up- or downstream. In the example in Figure 2 (Degree of Cooperation between Servers)(b), three independent overload control loops are instantiated: A - B, B - C and B - D. Each loop only controls a single hop. Overload feedback received from a downstream neighbor is not forwarded further upstream. Instead, a SIP server acts on this feedback, for example, by re-routing or rejecting traffic if needed. If the upstream neighbor of a server also becomes overloaded, it will report this problem to its upstream neighbors, which again take action based on the reported feedback. Thus, in hop-by-hop overload control, overload is always resolved by the direct upstream neighbors of the overloaded server without the need to involve entities that are located multiple SIP hops away.

Hop-by-hop overload control reduces the impact of overload on a SIP network and, in particular, can avoid congestion collapse. In addition, hop-by-hop overload control is simple and scales well to networks with many SIP entities. It does not require a SIP entity to aggregate a large number of overload status values or keep track of the overload status of SIP servers it is not communicating with.




3.2.3.  End-to-End

End-to-end overload control implements an overload control loop along the entire path of a SIP request, from UAC to UAS. An end-to-end overload control mechanism consolidates overload information from all SIP servers on the path, including all proxies and the UAS, and uses this information to throttle traffic as far upstream as possible. An end-to-end overload control mechanism has to be able to frequently collect the overload status of all servers on the potential path(s) to a destination and combine this data into meaningful overload feedback.

A UA or SIP server only needs to throttle requests if it knows that these requests will eventually be forwarded to an overloaded server. For example, if D is overloaded in Figure 2 (Degree of Cooperation between Servers)(c), A should only throttle requests it forwards to B when it knows that they will be forwarded to D. It should not throttle requests that will eventually be forwarded to C, since server C is not overloaded. In many cases, it is difficult for A to determine which requests will be routed to C and D since this depends on the local routing decision made by B.

The main problem of end-to-end overload control is its inherent complexity, since a UAC or SIP server needs to monitor all potential paths to a destination in order to determine which requests should be throttled and which requests may be sent. In addition, the routing decisions of a SIP server depend on local policy, which can be difficult to infer for an upstream neighbor. Therefore, end-to-end overload control is likely to only work well in simple, well-known topologies (e.g., a server that is known to only have one downstream neighbor) or if a UA/server sends many requests to the exact same destination.




3.3.  Topologies

The following topologies describe four generic SIP server configurations, each of which poses specific challenges for an overload control mechanism.

In the "load balancer" configuration shown in Figure 3 (Topologies)(a) a set of SIP servers (D, E and F) receives traffic from a single source A. A load balancer is a typical example for such a configuration. In this configuration, overload control needs to prevent server A (i.e., the load balancer) from sending too much traffic to any of its downstream neighbors D, E and F. If one of the downstream neighbors becomes overloaded, A can direct traffic to the servers that still have capacity. If one of the servers serves as a backup, it can be activated once one of the primary servers reaches overload.

If A can reliably determine that D, E and F are its only downstream neighbors and all of them are in overload, it may choose to report overload upstream on behalf of D, E and F. However, if the set of downstream neighbors is not fixed or only some of them are in overload, then A should not report overload upstream, since A can still forward requests destined to non-overloaded downstream neighbors. These requests would be throttled as well if A used overload control towards its upstream neighbors.

In the "multiple sources" configuration shown in Figure 3 (Topologies)(b), a SIP server D receives traffic from multiple upstream sources A, B and C. Each of these sources can contribute a different amount of traffic, which can vary over time. The set of active upstream neighbors of D can change as servers may become inactive and previously inactive servers may start contributing traffic to D.

If D becomes overloaded, it needs to generate feedback to reduce the amount of traffic it receives from its upstream neighbors. D needs to decide by how much each upstream neighbor should reduce traffic. This decision can require the consideration of the amount of traffic sent by each upstream neighbor and it may need to be re-adjusted as the traffic contributed by each upstream neighbor varies over time.

An important goal for overload control is to achieve fairness across upstream neighbors. I.e., no upstream neighbor should be required to throttle more than another neighbor. In a fair system, each request that is routed to D has an equal chance of being processed, independent of the upstream neighbor it is coming from. A SIP server may have local policies that prefer some sources over others. For example, it can throttle a less preferred upstream neighbor more or earlier than a preferred neighbor.

In many configurations, SIP servers form a "mesh" as shown in Figure 3 (Topologies)(c). Here, multiple upstream servers A, B and C forward traffic to multiple alternative servers D and E. This configuration is a combination of the "load balancer" and "multiple sources" scenario.




                +---+              +---+
             /->| D |              | A |-\
            /   +---+              +---+  \
           /                               \   +---+
    +---+-/     +---+              +---+    \->|   |
    | A |------>| E |              | B |------>| D |
    +---+-\     +---+              +---+    /->|   |
           \                               /   +---+
            \   +---+              +---+  /
             \->| F |              | C |-/
                +---+              +---+

    (a) load balancer             (b) multiple sources

    +---+
    | A |---\                        a--\
    +---+=\  \---->+---+                 \
           \/----->| D |             b--\ \--->+---+
    +---+--/\  /-->+---+                 \---->|   |
    | B |    \/                      c-------->| D |
    +---+===\/\===>+---+                       |   |
            /\====>| E |            ...   /--->+---+
    +---+--/   /==>+---+                 /
    | C |=====/                      z--/
    +---+

          (c) mesh                   (d) edge proxy

 Figure 3: Topologies 

Overload control that is based on reducing the number of messages a sender is allowed to send is not suited for servers that receive requests from a very large population of senders, each of which only infrequently sends a request. This scenario is shown in Figure 3 (Topologies)(d). An edge proxy that is connected to many UAs is a typical example of such a configuration.

Since each UA typically only contributes a few requests, which are often related to the same call, it cannot decrease its message rate to resolve the overload. In such a configuration, a SIP server can resort to local overload control by rejecting a percentage of the requests it receives with 503 (Service Unavailable) responses. Since there are many upstream neighbors that contribute to the overall load, sending 503 (Service Unavailable) to a fraction of them can gradually reduce load without entirely stopping all incoming traffic. Using 503 (Service Unavailable) towards individual sources cannot, however, prevent overload if a large number of users places calls at the same time.

OPEN ISSUE: The requirements of the "edge proxy" topology are different from those of the other topologies, which may require a different method for overload control.




3.4.  Overload Control Method

The method used by an overload control mechanism to limit the amount of traffic forwarded to an element is an important aspect of the design. We discuss the following three different types of overload control: rate-based, loss-based and window-based overload control.




3.4.1.  Rate-based Overload Control

The key idea of rate-based overload control is to limit the rate at which an upstream element is allowed to forward requests to the downstream neighbor. If overload occurs, a SIP server instructs each upstream neighbor to send at most X requests per second. Each upstream neighbor can be assigned a different rate cap.

The rate cap ensures that the number of requests received by a SIP server never increases beyond the sum of all rate caps granted to upstream neighbors. It can protect a SIP server against overload even during load spikes if no new upstream neighbors start sending traffic. New upstream neighbors need to be factored into the rate caps assigned as soon as they appear. The current overall rate cap used by a SIP server is determined by an overload control algorithm, e.g., based on system load.

An algorithm the sending entity can use to implement a rate cap of X requests per second is request gapping. After transmitting a request to a downstream neighbor, a server waits for 1/X seconds before it transmits the next request to the same neighbor. Requests that arrive during the waiting period are not forwarded and are either redirected, rejected or buffered.
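
A minimal sketch of request gapping, assuming a fixed per-neighbor rate cap in requests per second and a monotonic clock (the class and variable names below are illustrative, not part of this specification):

   import time

   class RequestGapper:
       """Forwards at most rate_cap requests per second to one neighbor."""

       def __init__(self, rate_cap):
           self.gap = 1.0 / rate_cap    # minimum spacing between forwarded requests
           self.next_allowed = 0.0      # earliest time the next request may be sent

       def may_forward(self):
           now = time.monotonic()
           if now >= self.next_allowed:
               self.next_allowed = now + self.gap
               return True              # forward the request
           return False                 # redirect, reject or buffer the request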

The main drawback of this mechanism is that it requires a SIP server to assign a certain rate cap to each of its upstream neighbors based on its overall capacity. Effectively, a server assigns a share of its capacity to each upstream neighbor. The server needs to ensure that the sum of all rate caps assigned to upstream neighbors is not (significantly) higher than its actual processing capacity. This requires a SIP server to continuously evaluate the amount of load it receives from each upstream neighbor and assign a rate cap that is suitable for this neighbor without limiting it too much. For example, in a non-overloaded situation, it could assign a rate cap that is 10% higher than the current number of requests received from this neighbor. This rate cap needs to be adjusted if the number of requests generated by the upstream neighbor changes (e.g., the upstream neighbor wants to contribute more traffic). The cap also needs to be adjusted if a new upstream neighbor appears or an existing neighbor stops transmitting. If the cap assigned to an upstream neighbor is too high, the server may still experience overload. However, if the cap is too low, the upstream neighbors will reject requests even though they could be processed by the server.




3.4.2.  Loss-based Overload Control

A loss percentage enables a SIP server to ask an upstream neighbor to reduce the number of requests it would normally forward to this server by a percentage X. For example, a SIP server can ask an upstream neighbor to reduce the number of requests this neighbor would normally send by 10%. The upstream neighbor then redirects or rejects X percent of the traffic that is destined for this server. The loss percentage is determined by an overload control algorithm, e.g., based on current system load.

An algorithm for the sending entity to implement a loss percentage is to draw a random number between 1 and 100 for each request to be forwarded. The request is not forwarded to the server if the random number is less than or equal to X.
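
A minimal sketch of this algorithm, with X given as a loss percentage between 0 and 100 (illustrative code, not part of this specification):

   import random

   def may_forward(loss_percentage):
       """Returns True if the request is forwarded, False if it is throttled."""
       return random.randint(1, 100) > loss_percentage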

An advantage of loss-based overload control is that the receiving entity does not need to track the request rate it receives from each upstream neighbor. It is sufficient to monitor the overall system utilization. To reduce load, a server can ask its upstream neighbors to lower the traffic forwarded by a certain percentage. The server calculates this percentage by combining the loss percentage that is currently in use (i.e., the loss percentage the upstream neighbors are currently using when forwarding traffic), the current system utilization and the desired system utilization. For example, if the server load approaches 90% and the current loss percentage is set to a 50% traffic reduction, then the server can decide to increase the loss percentage to 55% in order to get to a system utilization of 80%. Similarly, the server can lower the loss percentage if permitted by the system utilization. This requires that system utilization can be accurately measured and that these measurements are reasonably stable. Loss-based overload control achieves fairness among incoming requests if all upstream neighbors are throttled by the same percentage. In this case, each request destined for an overloaded server has the same chance of being rejected by overload control.
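
The adjustment described in the example above can be sketched as follows. This is only one possible control law, assuming that system utilization scales roughly linearly with the admitted traffic; the function name is illustrative:

   def adjust_loss_percentage(current_loss, current_util, target_util):
       """Scales the admitted traffic so that utilization moves from
       current_util towards target_util (all values in percent)."""
       admitted = 1.0 - current_loss / 100.0             # fraction currently forwarded
       new_admitted = admitted * (target_util / float(current_util))
       return min(100.0, max(0.0, 100.0 * (1.0 - new_admitted)))

   # adjust_loss_percentage(50, 90, 80) returns about 55.6, close to the
   # 55% loss percentage used in the example above.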

The main drawback of percentage throttling is that the throttle percentage needs to be adjusted to the current number of requests received by the server. This is particularly important if the number of requests received fluctuates quickly. For example, if a SIP server sets a throttle value of 10% at time t1 and the number of requests offered increases by 20% between time t1 and t2 (t1<t2), then the server will still see an increase in traffic of approximately 8% between time t1 and t2 (1.2 * 0.9 = 1.08), even though all upstream neighbors have reduced traffic by 10% as instructed. Thus, percentage throttling requires an adjustment of the throttling percentage in response to the traffic received and may not always be able to prevent a server from encountering brief periods of overload in extreme cases.




3.4.3.  Window-based Overload Control

The key idea of window-based overload control is to allow an entity to transmit a certain number of messages before it needs to receive a confirmation for the messages in transit. Each sender maintains an overload window that limits the number of messages that can be in transit without being confirmed.

Each sender maintains an unconfirmed message counter for each downstream neighbor it is communicating with. For each message sent to the downstream neighbor, the counter is increased by one. For each confirmation received, the counter is decreased by one. The sender stops transmitting messages to the downstream neighbor when the unconfirmed message counter has reached the current window size.

A crucial parameter for the performance of window-based overload control is the window size. The window size together with the round-trip time between sender and receiver determines the effective message rate that can be achieved. Each sender has an initial window size it uses when first sending a request. This window size can be changed based on the feedback it receives from the receiver. The receiver can require a decrease in window size to throttle the sender or allow an increase to permit a higher message rate.

The sender adjusts its window size as soon as it receives the corresponding feedback from the receiver. If the new window size is smaller than the current unconfirmed message counter, the sender stops transmitting messages until more messages are confirmed and the current unconfirmed message counter is less than the window size.
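
A minimal sketch of the sender-side bookkeeping described above (illustrative code, not part of this specification; message confirmations and window updates are assumed to arrive via the overload feedback mechanism):

   class OverloadWindow:
       """Sender-side window towards one downstream neighbor."""

       def __init__(self, initial_window):
           self.window = initial_window   # current window size granted by the receiver
           self.in_transit = 0            # unconfirmed message counter

       def may_send(self):
           return self.in_transit < self.window

       def on_send(self):
           self.in_transit += 1           # one more message awaiting confirmation

       def on_feedback(self, confirmed=0, new_window=None):
           self.in_transit = max(0, self.in_transit - confirmed)
           if new_window is not None:     # the receiver may shrink or grow the window
               self.window = new_window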

A sender should not treat the reception of a 100 Trying response as an implicit confirmation for a message. 100 Trying responses are often created by a SIP server very early in processing and do not indicate that a message has been successfully processed and cleared from the input buffer. If the downstream neighbor is a stateless proxy, it will not create 100 Trying responses at all and instead pass through 100 Trying responses created by the next stateful server. Also, 100 Trying responses are typically only created for INVITE requests. Explicit message confirmations via an overload feedback mechanism do not have these problems.

The behavior and issues of window-based overload control are similar to rate-based overload control, in that the total available receiver buffer space needs to be divided among all upstream neighbors. However, unlike rate-based overload control, window-based overload control can ensure that the receiver buffer does not overflow under normal conditions. The transmission of messages by senders is effectively clocked by message confirmations received from the receiver. A buffer overflow can occur if a large number of new upstream neighbors arrives at the same time.




3.5.  Overload Control Algorithms

An important aspect of the design of an overload control mechanism is the overload control algorithm. The control algorithm determines when the amount of traffic a SIP server receives needs to be decreased and when it can be increased.

Overload control algorithms have been studied to a large extent and many different overload control algorithms exist. This specification does not mandate the use or implementation of a specific algorithm. However, algorithms that are used MUST be compliant with the semantics for overload feedback and the behavior for the upstream node defined in this specification.

OPEN ISSUE: With many different overload control algorithms available, it seems reasonable to define a baseline algorithm and allow the use of other algorithms if they don't violate the protocol semantics. This will also allow the development of future algorithms, which may lead to a better performance.




3.6.  Self-Limiting

An important design aspect for an overload control mechanism is that it is self-limiting, i.e., a sender should stop transmitting if it does not receive any feedback from the receiver. This prevents an overloaded server that has become unable to generate overload control feedback from being overwhelmed with requests.

Window-based overload control is inherently self-limiting since a sender cannot continue without receiving confirmations. Servers using rate- or loss-based overload control need to be configured to stop transmitting if they do not receive any feedback from the receiver.




3.7.  Load Status

It may be useful for a SIP server to frequently report its current load status to upstream neighbors. The load status indicates the degree to which the resources needed by a SIP server to process SIP messages are utilized. An upstream neighbor can use the load status to balance load between alternative SIP servers and to find under-utilized servers. Reporting load is not intended to replace specialized load balancing mechanisms.

OPEN ISSUE: reporting load status seems useful but somewhat orthogonal to overload control. Should this be a separate mechanism?




3.8.  SIP Mechanism

A SIP mechanism is needed to convey overload feedback from the receiving to the sending SIP entity. A number of different alternatives exist to implement such a mechanism.




3.8.1.  SIP Response Header

Overload control information can be transmitted using a new Via header field parameter for overload control. A SIP server can add this header parameter to the responses it is sending upstream to inform its upstream neighbors about the current overload status. A detailed description of this header is provided in Section 5 (Via Header Parameters for Overload Control).

This approach has the following characteristics:




3.8.2.  SIP Event Package

Overload control information can also be conveyed from a receiver to a sender using a new event package. This event package enables a sending entity to subscribe to the overload status of its downstream neighbors and receive notifications of overload control status changes in NOTIFY requests. A detailed description of this event package is provided in Section 6 ('Overload-Control' Event Package).

This approach has the following characteristics:

OPEN ISSUE: We need to decide about one SIP mechanism for conveying overload control information. Choosing a single transport mechanism seems beneficial for interoperability and simplicity purposes. Having two mechanisms (e.g., one for a closed network and one for SIP proxies receiving requests from many sources) might be an alternative. Section 5 (Via Header Parameters for Overload Control) and Section 6 ('Overload-Control' Event Package) provide details for the header and the event package alternative.




3.9.  Backwards Compatibility

A new overload control mechanism needs to be backwards compatible so that it can be gradually introduced into a network and functions properly if only a fraction of the servers support it.

Hop-by-hop overload control does not require that all SIP entities in a network support it. It can be used effectively between two adjacent SIP servers if both servers support overload control and does not depend on the support from any other server or user agent. The more SIP servers in a network support hop-by-hop overload control, the better protected the network is against occurrences of overload.

In topologies such as the ones depicted in Figure 3 (Topologies)(b) and (c), a SIP server has multiple neighbors, of which only some may support overload control. If a server simply used this overload control mechanism, only the neighbors that support it would reduce traffic. The others would keep sending at the full rate and benefit from the throttling performed by the servers that do support overload control. In other words, upstream neighbors that do not support overload control would be better off than those that do.

A SIP server should therefore use 5xx responses towards upstream neighbors that do not support overload control. The server should reject with 5xx responses the same fraction of requests that the upstream neighbor would otherwise reject or redirect if it supported overload control.




3.10.  Interaction with Local Overload Control

Local overload control can be used in conjunction with the mechanisms defined in this specification. It provides an additional layer of protection against overload, for example, when upstream servers do not support overload control. In general, servers should start using the mechanisms described here to throttle upstream neighbors before using local overload control to reject messages as a mechanism of last resort.




4.  SIP Application Considerations




4.1.  Responding to an Overload Indication

An element may receive overload control feedback indicating that it needs to reduce the traffic it sends to its downstream neighbor. An element can accomplish this task by sending some of the requests that would have gone to the overloaded element to a different destination. It needs to ensure, however, that this destination is not in overload and capable of processing the extra load. An element can also buffer requests in the hope that the overload condition will resolve quickly and the requests still can be forwarded in time. Finally, it can reject these requests.




4.2.  Message Prioritization

Overload control can require a SIP server to prioritize messages and select messages that need to be rejected or redirected. The selection is largely a matter of local policy.

A SIP server SHOULD honor the Resource-Priority header field as defined in [RFC4412] if it is present in a SIP request. The Resource-Priority header field enables a proxy to identify high-priority requests, such as emergency service requests, and preserve them as much as possible during times of overload.




4.3.  Privacy Considerations

Providers can set up boundaries in their networks, which enforce topology hiding, header filtering and other functions. These boundaries are often realized as proxies, back-to-back user agents (B2BUA), or session border controllers. These devices may have policies for disclosing overload control information based on location and level of privacy desired.



                               |
      External                 |                 Internal
     +--------+           +---------+           +--------+
     | ProxyA +-----------+  B2BUA  +-----------+ ProxyB |
     +--------+           +---------+           +--------+
                               |
                               | domain border
 Figure 4: Example configuration for a boundary B2BUA 

It should be noted that changing overload control feedback can have a significant adverse effect on the overload control mechanism. For example, the policy in a border device might be to remove overload control feedback until the feedback reaches a certain threshold. However, this intervention in the overload control feedback loop can cause an overload control algorithm to overreact, since the algorithm would not see any effects of the feedback generated. Once the feedback passes through the filter, it would likely reduce traffic too much and cause the control algorithm to steer in the opposite direction. For this reason, it is NOT RECOMMENDED that a border device change or partially remove overload control feedback.

A SIP service provider may choose to remove all overload control information from messages sent to the upstream external proxy. This is NOT RECOMMENDED as it will disable protection against overload.




5.  Via Header Parameters for Overload Control

This section defines new parameters for the SIP Via header for overload control. These parameters provide a SIP mechanism for conveying overload control information between SIP entities.




5.1.  The 'oc_accept' Parameter

A SIP server that supports this specification MUST add an "oc_accept" parameter to the Via headers it inserts into SIP requests. This provides an indication to downstream neighbors that this server supports overload control.
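
Using the syntax defined in Section 5.8 (Syntax), a Via header inserted by such a server could, for example, look as follows (host name and branch value are placeholders):

   Via: SIP/2.0/UDP p1.example.com:5060;branch=z9hG4bK77ef4c23;oc_accept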

OPEN ISSUE: To throttle upstream neighbors in a fair way, it is important that a SIP server can estimate the load each upstream neighbor receives for this server before it is throttled. This enables the server to throttle each upstream neighbor in the same way and thus provides each request the same chance of succeeding. In rate- and window-based overload control systems, a SIP server does not know how many messages each upstream neighbor had received for the server before throttling took place. A solution to this problem is to allow servers to report the load received for a downstream neighbor in the 'oc_accept' parameter.




5.2.  Creating the 'oc' Parameter

A SIP server can provide overload control feedback to its upstream neighbors by adding the 'oc' parameter to the topmost Via header field of a SIP response. The 'oc' parameter is a new Via header parameter defined in this specification. When an 'oc' parameter is added to a response, it MUST be inserted into the topmost Via header. It MUST NOT be added to any other Via header in the response. The topmost Via header is determined after the SIP server has removed its own Via header. It is the Via header that was generated by the next upstream neighbor.

Since the topmost Via header of a response will be removed by an upstream neighbor after processing it, overload control feedback contained in the 'oc' parameter will not travel beyond the next SIP server. A Via header parameter therefore provides hop-by-hop semantics for overload control feedback even if the next hop neighbor does not support this specification.

A SIP server SHOULD add an 'oc' parameter to those responses that contain an 'oc_accept' parameter in the topmost Via header. In this case, the SIP server MUST remove the 'oc_accept' parameter from the Via header and replace it with an 'oc' parameter.

The 'oc' parameter can be used in all response types, including provisional, success and failure responses. A SIP server MAY generally add the 'oc' parameter to all responses it is sending. A SIP server MUST add an 'oc' parameter to responses when the transmission of overload control feedback is required by the overload control algorithm to limit the traffic received by the server. I.e., a SIP server MUST insert the 'oc' parameter when the overload control algorithm sets the 'oc' parameter to a value different from the default value.

A SIP server that has added an 'oc' parameter to a Via header SHOULD also add an 'oc_validity' parameter to the same Via header. The 'oc_validity' parameter defines the time in milliseconds during which the content (i.e., the overload control feedback) of the 'oc' parameter is valid. The default value of the 'oc_validity' parameter is 500. A SIP server SHOULD use a shorter 'oc_validity' time if its overload status varies quickly and MAY use a longer 'oc_validity' time if this status is more stable. If the 'oc_validity' parameter is not present, its default value is used. The 'oc_validity' parameter MUST NOT be used in a Via header without an 'oc' parameter and MUST be ignored if it appears in a Via header without an 'oc' parameter.

A SIP server MAY forward the content of an 'oc' parameter it has received from a downstream neighbor on to its upstream neighbor. However, forwarding the content of the 'oc' parameter is generally NOT RECOMMENDED and should only be performed if permitted by the configuration of the SIP servers. For example, a SIP server that only relays messages between exactly two SIP servers could forward an 'oc' parameter. The 'oc' parameter is forwarded by copying it from the Via header in which it was received into the next Via header (i.e., the Via header that will be on top after processing the response). If an 'oc_validity' parameter is present, it MUST be copied along with the 'oc' parameter.

The 'oc' and 'oc_validity' Via header parameters are only defined in SIP responses and MUST NOT be used in SIP requests. These parameters are only useful to the upstream neighbor of a SIP server (i.e., the entity that is sending requests to the SIP server) since this is the entity that can offload traffic by redirecting/rejecting new requests. If requests are forwarded in both directions between two SIP servers (i.e., the roles of upstream/downstream neighbors change), there are also responses flowing in both directions. Thus, both SIP servers can exchange overload information. While adding 'oc' and 'oc_validity' parameters to requests may increase the frequency with which overload information is exchanged in these scenarios, this increase will rarely provide benefits and does not justify the added overhead and complexity.

A SIP server MAY decide to add 'oc' and 'oc_validity' parameters only to responses that are sent via a secured transport channel such as TLS. The SIP server can use transport-level authentication to identify the SIP servers to which responses with these parameters are sent. This enables a SIP server to protect overload control information and ensure that it is only visible to trusted parties. Since overload control protects a SIP server from overload, it is RECOMMENDED that a SIP server generally insert 'oc' and 'oc_validity' parameters into responses to all SIP servers.




5.3.  Determining the 'oc' Parameter Value

The value of the 'oc' parameter is determined by an overload control algorithm (see Section 3.5 (Overload Control Algorithms)). This specification does not mandate the use of a specific overload control algorithm. However, the output of an overload control algorithm MUST be compliant with the semantics of this parameter.

The 'oc' parameter value specifies the percentage by which the load forwarded to this SIP server should be reduced. Possible values range from 0 (the traffic forwarded is reduced by 0%, i.e., all traffic is forwarded) to 100 (the traffic forwarded is reduced by 100%, i.e., no traffic is forwarded). The default value of this parameter is 0. The 'oc' parameter value is determined by the overload control algorithm of the SIP server generating the 'oc' parameter.

OPEN ISSUE: the semantics of the 'oc' parameter depends on the overload control method used. It may contain a loss rate for loss-based overload control, a target rate for rate-based overload control or message confirmations and window-size for window-based overload control. It might be possible to allow multiple mechanisms to co-exist (e.g., by defining different parameters for the different feedback types). However, for interoperability purposes it seems preferable to agree on one mechanism.




5.4.  Processing the 'oc' Parameter

A SIP entity compliant with this specification SHOULD remove the 'oc' and 'oc_validity' parameters from all Via headers of a received response, except for the topmost Via header. This prevents 'oc'/'oc_validity' parameters that were accidentally or maliciously inserted into Via headers by a downstream SIP server from traveling upstream.

A SIP server maintains the 'oc' parameter values received along with the address of the SIP servers from which they were received for the duration specified in the 'oc_validity' parameter or the default duration. Each time a SIP server receives a response with an 'oc' parameter from a SIP server, it overwrites the 'oc' value it has currently stored for this server with the new value received. The SIP server restarts the validity period of an 'oc' parameter each time a response with an 'oc' parameter is received from this server. A stored 'oc' parameter value MUST be discarded once it has reached the end of its validity.
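
A minimal sketch of this bookkeeping, assuming loss-based feedback and a monotonic clock (illustrative code, not part of this specification):

   import time

   DEFAULT_OC_VALIDITY_MS = 500        # default validity defined in Section 5.2

   oc_state = {}                       # next-hop address -> (oc value, expiry time)

   def store_oc(server, oc, validity_ms=DEFAULT_OC_VALIDITY_MS):
       """Overwrites the stored value and restarts its validity period."""
       oc_state[server] = (oc, time.monotonic() + validity_ms / 1000.0)

   def current_oc(server):
       """Returns the valid 'oc' value for a server, or 0 (the default)
       if no valid value is stored."""
       oc, expiry = oc_state.get(server, (0, 0.0))
       if time.monotonic() >= expiry:
           oc_state.pop(server, None)   # discard once validity has ended
           return 0
       return oc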




5.5.  Using the 'oc' Parameter Value

A SIP server compliant with this specification MUST honor the 'oc' parameter values it receives from downstream neighbors. The SIP server MUST NOT forward more messages to a SIP server than allowed by the current 'oc' parameter value received from this server.

When forwarding a SIP request, a SIP entity uses the SIP procedures to determine the next hop SIP server as described, for example, in [RFC3261] and [RFC3263]. After selecting the next hop server, the SIP server MUST determine if it has an 'oc' parameter value for this server. If it has a non-expired 'oc' parameter value and this value is non-zero, the SIP server MUST determine if it can or cannot forward the current request within the current throttle conditions.

The SIP server MAY use the following algorithm to determine if it can forward the request. The SIP server draws a random number between 1 and 100 for the current request. If the random number is less than or equal to the 'oc' parameter value, the request is not forwarded. Otherwise, the request is forwarded as usual. Another algorithm for SIP entities that process a large number of requests is to reject/redirect the first X of every 100 requests processed. Other algorithms that lead to the same result may be used as well.

OPEN ISSUE: the mechanisms to throttle traffic depend on the type of feedback conveyed in the 'oc' parameter value. It needs to be adjusted if a rate-based or window-based feedback is used.

The treatment of SIP requests that cannot be forwarded to the selected SIP Server is a matter of local policy. A SIP entity MAY try to find an alternative target or it MAY reject the request (see Section 5.6 (Rejecting Requests)).




5.6.  Rejecting Requests

A SIP server that rejects a request because of overload MUST reject this request with the 5xx response code defined for overload control (e.g., 503 (Service Unavailable) or 507 (Server Overload) [I-D.hilt-sip-correction-503]). This response code indicates that the request did not succeed because the SIP servers processing the request are under overload.

A SIP server that is under overload and has started to throttle incoming traffic SHOULD use 5xx responses to reject a fraction of the requests from upstream neighbors that do not include the 'oc_accept' parameter in their Via headers. These neighbors do not support this specification and will not respond to overload control feedback in the 'oc' parameter. The fraction of requests rejected SHOULD be equivalent to the fraction of requests the upstream server would reject/redirect if it did support this specification. This is to ensure that SIP servers that do not support this specification do not receive an unfair advantage over those that do.

A SIP server that has reached overload (i.e., a load close to 100) SHOULD start using 5xx responses in addition to using the 'oc' parameter for all upstream neighbors. If the proxy has reached a load close to 100, it needs to protect itself against overload. Also, it is likely that upstream proxies have ignored overload feedback and do not support this specification.
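
The resulting rejection policy can be sketched roughly as follows, assuming loss-based feedback; the load threshold value and the function name are illustrative only, while the 5xx response and the two rejection cases follow the text above:

   import random

   OVERLOAD_THRESHOLD = 99             # "load close to 100"; illustrative value

   def action_for_incoming_request(upstream_supports_oc, local_load, loss_pct):
       """Returns 'process' or 'reject-5xx' (e.g., 503 or 507, see above)."""
       throttled = random.randint(1, 100) <= loss_pct
       if local_load >= OVERLOAD_THRESHOLD and throttled:
           return "reject-5xx"         # close to overload: 5xx for all upstream neighbors
       if not upstream_supports_oc and throttled:
           return "reject-5xx"         # neighbor ignores 'oc': reject an equivalent fraction
       return "process"                # oc-capable neighbors throttle upstream instead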




5.7.  Self-Limiting

In some cases, a SIP server may not receive a response from a downstream neighbor when sending a request. [RFC3261] defines that when a timeout error is received from the transaction layer, it MUST be treated as if a 408 (Request Timeout) status code has been received. If a fatal transport error is reported by the transport layer, it MUST be treated as a 503 (Service Unavailable) status code.

In these cases, a SIP server SHOULD stop sending requests to this downstream neighbor. The SIP server SHOULD occasionally forward a single request to probe if the downstream neighbor is alive. Once a SIP server has successfully transmitted a request to the downstream neighbor, it can resume normal transmission of requests. It should, of course, honor any 'oc' parameters it may receive. This prevents a SIP server that is unable to respond to incoming requests from being overloaded with additional requests.
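
A minimal sketch of this behavior, assuming a per-neighbor state object and a configurable probe interval (the names and the interval value are illustrative, not part of this specification):

   import time

   class NeighborState:
       """Tracks whether requests may be sent to a downstream neighbor."""

       PROBE_INTERVAL = 5.0             # seconds between probe requests (illustrative)

       def __init__(self):
           self.suspended = False       # set after a timeout or fatal transport error
           self.next_probe = 0.0

       def on_timeout_or_transport_error(self):
           self.suspended = True
           self.next_probe = time.monotonic() + self.PROBE_INTERVAL

       def on_success(self):
           self.suspended = False       # resume normal transmission of requests

       def may_send(self):
           if not self.suspended:
               return True
           if time.monotonic() >= self.next_probe:
               self.next_probe = time.monotonic() + self.PROBE_INTERVAL
               return True              # occasionally forward a single probe request
           return False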

OPEN ISSUE: waiting for a timeout to occur seems like a long time before starting to throttle back. It could make sense to throttle back earlier if no response is received for requests transmitted.




5.8.  Syntax

This section defines the syntax of three new Via header parameters: 'oc', 'oc_validity' and 'oc_accept'. These Via header parameters are used to implement an overload control feedback loop between neighboring SIP servers.

The 'oc' and 'oc_validity' parameters are only defined in the topmost Via header of a response. They MUST NOT be used in the Via headers of requests and MUST NOT be used in other Via headers of a response. The 'oc' and 'oc_validity' parameters MUST be ignored if received outside of the topmost Via header of a response. The 'oc_accept' parameter MAY appear in all Via headers.

The 'oc' Via header parameter contains a number between 0 and 100. It describes the percentage by which the traffic to the SIP server from which the response has been received should be reduced. The default value for this parameter is 0.

The 'oc_validity' Via header parameter contains the time in milliseconds during which the corresponding 'oc' Via header parameter is valid. The 'oc_validity' parameter can only be present in a Via header in conjunction with an 'oc' parameter.

The 'oc_accept' Via header parameter indicates that the SIP server, which has created this Via header, supports overload control.

  oc-throttle = "oc" [EQUAL 0-100]
  oc-validity = "oc_validity" [EQUAL delta-ms]
  oc-accept   = "oc_accept"

This extends the existing definition of the Via header field parameters, so that its BNF now looks like:

  via-params        =  via-ttl / via-maddr
                      / via-received / via-branch
                      / oc-throttle / oc-validity
                      / oc-accept / via-extension

Example:

  Via: SIP/2.0/TCP ss1.atlanta.example.com:5060;branch=z9hG4bK2d4790.1
    ;received=192.0.2.111
    ;oc=20;oc_validity=500



6.  'Overload-Control' Event Package

This section defines a new SIP event package for overload control. This event package provides a SIP mechanism for conveying overload control information between SIP entities.

The following sections provide the details for defining a SIP event package as required by RFC 3265 [RFC3265].




6.1.  Event Package Name

The name of this event package is "overload-control". This package name is carried in the Event and Allow-Events header fields, as defined in RFC 3265 [RFC3265].




6.2.  Event Package Parameters

No package specific Event header field parameters are defined for this event package.




6.3.  SUBSCRIBE Bodies

A SUBSCRIBE request for overload control information MAY contain a body. This body would serve the purpose of filtering the overload control subscription. The definition of such a body is outside the scope of this specification. For example, the body might provide a threshold for reporting overload control information or it might indicate that overload control information should be reported as a loss-percentage or a request rate.

A SUBSCRIBE request for the overload control package MAY be sent without a body. This implies that the default subscription filtering policy as described in Section 6.8 (Notifier generation of NOTIFY requests) has been requested.




6.4.  Subscription Duration

A subscription to the overload control event package is usually established when a SIP server first sends a request to another SIP server and terminated when this server stops sending requests and overload control is not needed any more.

The duration of a subscription is related to the time a signaling relationship exists between two servers. In a static SIP server configuration (e.g., two SIP servers are configured to exchange messages in a service provider's network) this relationship can last for days or weeks as long as both servers are running. In this scenario, the subscription duration is largely irrelevant.

In a dynamic configuration (e.g., two SIP servers in different domains) the duration of the signaling relationship can be in the range of minutes or hours and might only last for the duration of a single session. Since it is unknown a priori when the next SIP request will be transmitted from the subscriber to the notifier, subscriber and notifier MAY terminate a subscription to overload control after a period of inactivity.

The duration of a subscription to the overload control event package SHOULD be longer than the duration of a typical session. The default subscription duration for this event package is set to two hours.




6.5.  NOTIFY Bodies

In this event package, the body of a notification contains the current overload status of the notifier.

All subscribers and notifiers MUST support the format application/overload-info+xml. The SUBSCRIBE request MAY contain an Accept header field. If no such header field is present, it has a default value of application/overload-info+xml. If the header field is present, it MUST include application/overload-info+xml, and MAY include any other MIME type capable of representing overload status information. As defined in RFC 3265 [RFC3265], the body of notifications MUST be in one of the formats defined in the Accept header of the SUBSCRIBE request or in the default format.

TBD: A document format for the placeholder application/overload-info+xml used above needs to be defined. The following document snippet is an example of such a format:

    <overload-control>
      <rate-limit>200</rate-limit>
    </overload-control>
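
Since the document format itself is still TBD, the following Python sketch merely shows how a subscriber might extract the control value from the example snippet above; the element names are taken from the example and are not normative.

    # Illustrative sketch only: extracting the control value from the
    # example snippet above.
    import xml.etree.ElementTree as ET

    def parse_overload_info(body):
        root = ET.fromstring(body)
        elem = root.find("rate-limit")
        return int(elem.text) if elem is not None else None

    body = "<overload-control><rate-limit>200</rate-limit></overload-control>"
    assert parse_overload_info(body) == 200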



6.6.  Subscriber generation of SUBSCRIBE requests

The subscriber follows the general rules for generating SUBSCRIBE requests defined in RFC 3265 (Roach, A., “Session Initiation Protocol (SIP)-Specific Event Notification,” June 2002.) [RFC3265].




6.7.  Notifier processing of SUBSCRIBE requests

It is RECOMMENDED that a notifier provide overload control status information to all subscribers and accept all subscriptions to this event package. By denying a subscription to overload control, a notifier would disable overload control for this subscriber. Since this subscriber would not know the current overload status of the notifier, it would not reduce the traffic it forwards when the notifier enters an overload condition. Thus, denying a subscription to this event package can leave the notifier vulnerable to SIP overload.

A notifier MAY authenticate and authorize subscriptions to this event package. This is useful if the notifier wants to provide extended overload status information to certain subscribers. For example, a notifier can provide detailed resource usage information to authenticated subscribers and only provide the current throttle status to all other subscribers. The details of the authorization policy are at the discretion of the administrator.
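
One possible, non-normative realization of such a policy is sketched below in Python; the split between basic and extended information and all field names are hypothetical, since the actual policy is at the discretion of the administrator.

    # Illustrative sketch only: returning extended overload status to
    # authenticated subscribers and the throttle status to everyone else.
    def overload_status_for(subscriber_is_authenticated, current_status):
        basic = {"rate-limit": current_status["rate-limit"]}
        if subscriber_is_authenticated:
            extended = dict(basic)
            # Hypothetical extended information, e.g., resource usage.
            extended["cpu-utilization"] = current_status["cpu-utilization"]
            return extended
        return basic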




6.8.  Notifier generation of NOTIFY requests

A notifier sends a notification in response to SUBSCRIBE requests as defined in RFC 3265 (Roach, A., “Session Initiation Protocol (SIP)-Specific Event Notification,” June 2002.) [RFC3265]. In addition, a notifier MAY send a notification at any time during the subscription. Typically, the notifier will send a notification every time the overload control status changes. For example, the notifier can generate a NOTIFY every time the overload control value (e.g., the rate limit) changes.

Overload status information is expressed in the format negotiated for the NOTIFY body (e.g., "application/overload-info+xml"). The overload status in a NOTIFY body MUST be complete. Notifications that contain only deltas relative to a previous overload status or a partial overload status are not supported in this event package.

It is RECOMMENDED that the notifier return an initial NOTIFY that contains at least the current overload control value immediately after receiving a SUBSCRIBE request. It is RECOMMENDED that the notifier return such an initial NOTIFY even if it is still waiting for an authorization decision. Once the subscription is authorized, the notifier MAY send another notification that contains all information the subscriber is authorized to receive. It is RECOMMENDED that the notifier accept a subscription and create a NOTIFY with at least the current overload control value even if the subscriber is not authorized to receive more information.
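
The following Python sketch illustrates this recommended behavior under the assumption of a rate-based control value; the send_notify helper and all field names are hypothetical.

    # Illustrative sketch only: initial NOTIFY before the authorization
    # decision, followed by a fuller NOTIFY once authorization completes.
    def send_notify(subscription, status):
        # Placeholder transport: a real notifier would encode 'status' in
        # the negotiated format and send a SIP NOTIFY request.
        print("NOTIFY", subscription, status)

    def on_subscribe(subscription, current_control_value):
        # Report at least the current control value immediately, even
        # while the authorization decision is still pending.
        send_notify(subscription, {"rate-limit": current_control_value})

    def on_authorization_complete(subscription, authorized, full_status):
        if authorized:
            # Include everything the subscriber is authorized to receive.
            send_notify(subscription, full_status)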

Timely delivery of notifications is important for overload control. It is therefore RECOMMENDED that NOTIFY messages for this event package be sent with the highest priority; that is, the transmission of NOTIFY messages for this event package ought not to be delayed by other tasks.




6.9.  Subscriber processing of NOTIFY requests

A subscriber MUST use the overload control state contained in a NOTIFY body and apply this state to all subsequent SIP messages it intends to send to the respective SIP server. The subscriber MUST NOT forward more SIP messages to the server than the current overload control state allows. Details of how to apply overload control are discussed in Section 3.4 (Overload Control Method).

A subscriber MUST use the overload state it has received for a SIP server until the subscriber receives another NOTIFY with an updated state or until the subscription is terminated. The subscriber SHOULD stop using the reported overload state once the subscription is terminated.
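
As a non-normative illustration, the following Python sketch applies a rate-based control value received in the most recent NOTIFY to outgoing requests using a simple token bucket; rate-based control is only one of the methods discussed in Section 3.4, and all names are hypothetical.

    # Illustrative sketch only: applying the most recently notified rate
    # limit to outgoing SIP requests.
    import time

    class DownstreamThrottle:
        def __init__(self):
            self.rate = None          # allowed requests per second, or None
            self.tokens = 0.0
            self.last = time.monotonic()

        def update_from_notify(self, rate_limit):
            # The new value stays in effect until the next NOTIFY arrives
            # or the subscription is terminated.
            self.rate = rate_limit

        def on_subscription_terminated(self):
            self.rate = None

        def may_forward(self):
            if self.rate is None:
                return True           # no overload state available
            now = time.monotonic()
            self.tokens = min(self.rate,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return True
            return False              # this request must not be forwarded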

It is RECOMMENDED that the subscriber process incoming NOTIFY messages for this event package with the highest priority; that is, NOTIFY messages for this event package ought to be processed before other messages. This ensures that a subscriber can react quickly to changes in the overload control status even if it is currently receiving a high volume of messages.




6.10.  Handling of forked requests

This event package allows the creation of only one dialog as a result of an initial SUBSCRIBE request. The techniques to achieve this behavior are described in [RFC3265] (Roach, A., “Session Initiation Protocol (SIP)-Specific Event Notification,” June 2002.).




6.11.  Rate of notifications

Keeping the rate of notifications low is important for an overload control mechanism to avoid creating additional traffic in an overload condition. However, it is also important that an overload control algorithm can quickly adjust the overload control value as needed. Ideally, the overload control algorithm would generate a stable control value that rarely needs to be adjusted.

The notifier SHOULD NOT generate NOTIFY messages at a rate faster than once every second for notifications that are triggered by a change in the control value. The notifier SHOULD NOT generate NOTIFY messages at a rate faster than once every 5 seconds for all other notifications (i.e., for any additional information included in the subscription).
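
The following Python sketch shows one way a notifier could enforce these minimum intervals; the classification of notifications and all names are assumptions of this sketch.

    # Illustrative sketch only: pacing NOTIFY generation.
    import time

    MIN_INTERVAL = {"control-value": 1.0,   # change in the control value
                    "other": 5.0}           # all other notifications

    class NotifyPacer:
        def __init__(self):
            self.last_sent = {"control-value": 0.0, "other": 0.0}

        def may_send(self, kind):
            # kind is "control-value" for notifications triggered by a
            # change in the overload control value, "other" otherwise.
            now = time.monotonic()
            if now - self.last_sent[kind] >= MIN_INTERVAL[kind]:
                self.last_sent[kind] = now
                return True
            return False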




6.12.  State Agents

State agents play no role in this package.




6.13.  Examples

The following message flow illustrates how proxy A can subscribe to the overload control status of proxy B. The flow assumes that proxy A does not have an active subscription to the overload control status of proxy B and has received an INVITE request that it needs to forward to B.


  Proxy A             Proxy B
     |                   |
     |(1) SUBSCRIBE      |
     |------------------>|
     |(2) 200 OK         |
     |<------------------|
     |(3) NOTIFY         |
     |<------------------|
     |(4) 200 OK         |
     |------------------>|
     |(5) INVITE         |
     |------------------>|
     |(6) 200 OK         |
     |<------------------|
     |(7) ACK            |
     |------------------>|
     |                   |

   Message Details

      TBD.




7.  Security Considerations

Overload control mechanisms can be used by an attacker to conduct a denial-of-service attack on a SIP entity if the attacker can pretend that the SIP entity is overloaded. When such a forged overload indication is received by an upstream SIP entity, it will stop sending traffic to the victim. Thus, the victim is subject to a denial-of-service attack.

An attacker can create forged overload feedback by inserting itself into the communication between the victim and its upstream neighbors. The attacker would need to add overload feedback indicating a high load to the responses passed from the victim to its upstream neighbor. Proxies can prevent this attack by communicating via TLS. Since overload feedback has no meaning beyond the next hop, there is no need to secure the communication over multiple hops.

Another way to conduct an attack is to send a message containing a high overload feedback value through a proxy that does not support this extension. If this feedback is added to the second Via header (or to all Via headers), it will reach the next upstream proxy. If the attacker can make the recipient believe that the overload status was created by its direct downstream neighbor (and not by the attacker further downstream), the recipient stops sending traffic to the victim. A precondition for this attack is that the victim proxy does not support this extension, since otherwise it would not pass overload control feedback through.

A malicious SIP entity could gain an advantage by pretending to support this specification but never reducing the amount of traffic it forwards to its downstream neighbor. If the downstream neighbor receives traffic from multiple sources that correctly implement overload control, the malicious SIP entity would benefit, since all other sources of traffic to the downstream neighbor would reduce their load.

OPEN ISSUE: The solution to this problem depends on the overload control algorithm. For a fixed message rate and for window-based overload control, it is easy for a downstream entity to monitor whether the upstream neighbor throttles forwarded traffic as directed. For percentage throttling this is not always obvious, since the load forwarded depends on the load received by the upstream neighbor.
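
For rate-based control, such monitoring could look like the following non-normative Python sketch, which compares the request rate actually received from an upstream neighbor with the rate limit announced to it; the measurement window and all names are hypothetical.

    # Illustrative sketch only: checking whether an upstream neighbor
    # respects the announced rate limit.
    import time

    class ComplianceMonitor:
        def __init__(self, announced_rate, window=10.0):
            self.announced_rate = announced_rate  # requests/second announced
            self.window = window                  # measurement window (s)
            self.arrivals = []

        def on_request_received(self):
            now = time.monotonic()
            self.arrivals.append(now)
            # Keep only arrivals within the measurement window.
            self.arrivals = [t for t in self.arrivals if now - t <= self.window]

        def is_compliant(self):
            return len(self.arrivals) / self.window <= self.announced_rate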




8.  IANA Considerations

[TBD.]




Appendix A.  Acknowledgements

Many thanks to Rich Terpstra, Jonathan Rosenberg and Charles Shen for their contributions to this specification.




9.  References




9.1. Normative References

[I-D.hilt-sip-correction-503] Hilt, V. and I. Widjaja, “Essential Correction to the Session Initiation Protocol (SIP) 503 (Service Unavailable) Response,” draft-hilt-sip-correction-503-01 (work in progress).
[RFC2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997.
[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, “SIP: Session Initiation Protocol,” RFC 3261, June 2002.
[RFC3263] Rosenberg, J. and H. Schulzrinne, “Session Initiation Protocol (SIP): Locating SIP Servers,” RFC 3263, June 2002.
[RFC3265] Roach, A., “Session Initiation Protocol (SIP)-Specific Event Notification,” RFC 3265, June 2002.
[RFC4412] Schulzrinne, H. and J. Polk, “Communications Resource Priority for the Session Initiation Protocol (SIP),” RFC 4412, February 2006.



9.2. Informative References

[I-D.rosenberg-sipping-overload-reqs] Rosenberg, J., “Requirements for Management of Overload in the Session Initiation Protocol,” draft-rosenberg-sipping-overload-reqs-02 (work in progress), October 2006.



Authors' Addresses

  Volker Hilt
  Bell Labs/Alcatel-Lucent
  791 Holmdel-Keyport Rd
  Holmdel, NJ 07733
  USA
Email:  volkerh@bell-labs.com
  
  Indra Widjaja
  Bell Labs/Alcatel-Lucent
  600-700 Mountain Avenue
  Murray Hill, NJ 07974
  USA
Email:  iwidjaja@alcatel-lucent.com
  
  Daryl Malas
  Level 3 Communications
  1025 Eldorado Blvd.
  Broomfield, CO
  USA
Email:  daryl.malas@level3.com
  
  Henning Schulzrinne
  Columbia University/Department of Computer Science
  450 Computer Science Building
  New York, NY 10027
  USA
Phone:  +1 212 939 7004
Email:  hgs@cs.columbia.edu
URI:  http://www.cs.columbia.edu



Full Copyright Statement

Intellectual Property