IP Performance Metrics Working Group T. Anjali Internet Draft C. Scoglio Expiration Date: April 2003 I. F. Akyildiz Georgia Institute of Technology G. Uhl A. Sciuto Swales Aerospace J. A. Smith NASA Goddard Available Bandwidth Measurement in IP Networks draft-anjali-ippm-avail-band-measurement-00.txt Status of this Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or made obsolete by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. Abstract Available bandwidth along a path is an important metric that can be useful to provide insight into the state and performance of the network. Some methods have been proposed in literature for measurement of available bandwidth but they face the problems of scalability and high intrusiveness. In this draft, we propose a method to measure path available bandwidth using an active approach that probes the path. The probing mechanism used is based on TCP New Reno. Once the samples for the available bandwidth measurement are obtained, more accurate results are calculated using a prediction filter. Anjali et al Expires April 2003 Page 1 Available Bandwidth Measurement October 2002 Table of Contents Abstract 1. Introduction 2 1.1 Overview of available bandwidth measurement 3 1.2 Available tools 3 2. Description 4 2.1 TCP measurement phase 4 2.2 Estimation and prediction phase 5 3. Enhancements and Evaluations 6 4. References 7 5. Authors' Addresses 7 1. Introduction The efficiency of the QoS provided by an IP network depends on effective traffic management, performed at both small and large time-scales. Traffic management encompasses the need for dynamic configuration of network devices to adjust to the evolving traffic. This feedback loop is based on measurement of network characteristics. But observability of Internet traffic is difficult because of the network size, large traffic volumes, and distributed administration. Some QoS metrics can be measured in the core of the network and others at the edges. Some have local significance at each router while others are end-to-end metrics. They can be obtained by measurements from the various network elements. To obtain measured statistics from each network element is possible if individual users can monitor each such device. Due to security reasons, this is not possible in a network. Thus, common users can only measure the end-to-end metrics. The approaches to monitor a network are active or passive. The active approach has the advantage of measuring the desired quantities at the desired time. Passive measurements are carried out by observing normal network traffic, without the extra load. But the amount of data accumulated can be substantial because the network will be polled often for information. Available bandwidth in a network is an important metric that can be indicative of the network performance. The available bandwidth of a link can be defined as the unused portion of the link capacity. Then, the available bandwidth along a path is defined as the minimum of the available bandwidths of the links in the path. Based on the bandwidth available in various segments of the network, the network operator can obtain information about the congestion in the network, perform the admission control, routing, capacity provisioning etc. Anjali et al Expires April 2003 Page 2 Available Bandwidth Measurement October 2002 1.1 Overview of available bandwidth measurement Most approaches to available bandwidth measurement are active. However, SNMP based approaches are passive since they obtain router statistics from the router and do not observe the treatment received by the injected pseudo traffic. The active approaches have the advantage of being faster and they operate independently of any network support. SNMP based approaches need cooperation from the network as they need access to the network element MIBs. The available bandwidth can be measured for individual links of the network. The end-to-end performance of the network is what the users and network operators are really interested in. The end-to-end available bandwidth information can be obtained by a concatenation of the available bandwidth measurements of the individual links comprising the path. However, this approach can be very inefficient as the amount of data collected grows as the path size increases and a central data analysis station will be required. Thus, tools have to be devised that can measure the end-to-end available bandwidth directly and accurately from the path. 1.2 Available tools Currently, various tools and products are available that can be used to measure the total capacity in the links of a path in the network. They can be split into two families: pathchar-based and packet pair-based. Examples include Nettimer [1], Network Weather Service approach [2] etc. Among the tools to measure the available bandwidth of a path is an active approach given in [3]. It is based on transmission of self-loading periodic measurement streams. This scheme sends traffic at increasing rates from the source to the destination until the rate finally reaches the available bandwidth of the tight link after which the packets start experiencing increasing amounts of delay. Thus this scheme can be highly intrusive. Also, it takes a long time to converge to a measurement estimate. Another active approach to measure the throughput of a path is Iperf from NLANR that sends streams of TCP/UDP flows. It is another highly intrusive approach that sends traffic at high speeds and displays the throughput achieved. Cisco has introduced the NetFlow technology that provides IP flow information for a network. NetFlow provides detailed data collection with minimal impact on the performance on the routing device and no external probing device. But in a DiffServ environment, the core of a network is interested in aggregate rather than per-flow statistics, due to the scalability issues. MRTG is a tool, based on SNMP, that gives periodic measurements of the utilization of a particular link along the path. To obtain statistics of a link via MRTG, the SNMP query needs access to the router. Also, MRTG obtains available bandwidth estimates over periods of length 5 minutes. Anjali et al Expires April 2003 Page 3 Available Bandwidth Measurement October 2002 Most of the current bandwidth measurement techniques face the problems of poor accuracy, scalability, statistical robustness, agility in adapting to bandwidth changes and flexibility. In this draft, we propose a method to measure the available bandwidth along a path that is accurate, scalable, easy to implement and does not need any network support. The method can be enhanced under scenarios where network support is available to a highly accurate measurement procedure. 2. Description Our proposed method for available bandwidth measurement can be spilt into two phases. In the first phase, a TCP application is deployed to obtain rough estimates of the available bandwidth along the path. We call this the TCP measurement phase. The second phase analyzes the gathered estimates of the available bandwidth to obtain a more accurate and reliable measurement value. This is called the prediction phase. Next, we explain the individual phases in detail. 2.1 TCP measurement phase Our method employs a TCP application to detect and estimate the congestion level in the network. The basic functioning of TCP is described in [4] with modifications in [5]. TCP starts in slow start where the congestion window size is increased by one segment for each ACK for a successful packet delivery. The sender can transmit segments upto the congestion window size. Once the sender observes a packet loss (from a missing acknowledgement), the sender enters congestion avoidance state. A fast retransmit of the missing segment is performed if triple duplicate ACKs are received in a row and then the sender enters fast recovery (and not slow start). TCP NewReno suggests a modification to the fast recovery algorithm to incorporate a response to partial acknowledgements received during fast recovery. Thus, TCP constantly probes the network to find out the available capacity, and when it encounters congestion, it backs off. We use this feature of TCP in our proposed available bandwidth measurement method. If the TCP session is allowed to run for long durations, it fully occupies the unused portion of the path bandwidth, thus changing the network state. Thus, we define a time interval T for which the TCP session is run and then terminated. During this time interval, we gather various samples of bandwidth, which are then fed to the prediction phase. The bandwidth samples are calculated as follows. When the TCP session is started, it starts with the slow start algorithm. When the first signs of congestion are discovered, the TCP session goes into congestion avoidance. We note the value of congestion window just before it is reduced. That determines the rate at which the sender was transmitting and constitutes the first Anjali et al Expires April 2003 Page 4 Available Bandwidth Measurement October 2002 sample of available bandwidth estimate. Then the sender goes into congestion avoidance and reduces its transmission rate. The transmission rate increases again, faster due to the NewReno modifications to fast retransmit and fast recovery algorithms, till the next sign of congestion. At this point, the value of the congestion window parameter is noted again, along with the timestamp at which this happens. Continuing like this, we gather multiple samples during the time T of the TCP session. These samples provide an insight into the unused capacity on the path that is available to be used by new applications. These samples will be input to the prediction phase where we try to derive accurate measurements of the path available bandwidth. Note that this TCP measurement phase relies only on information readily available at the sender, using the current TCP header, and does not require any support from receivers or any network component. 2.2 Estimation and prediction phase Once the bandwidth estimates are obtained by the TCP measurement phase, we have to obtain more accurate values for the bandwidth samples. For this purpose, we propose to use a filter. Note that the samples gathered during the TCP measurement phase are not uniformly spaced in time. This is because the samples were gathered whenever the congestion was detected by the TCP sender. Assume that we obtained N samples of bandwidth estimates from the first phase. To use these samples in our filter, we interpolate them to obtain a sample distribution of M samples that is uniform in time. We denote by T_s the desired time between bandwidth samples. The first sample is assumed to be at time 0 and the successive (interpolated) samples at time i*T_s are obtained by linear interpolation between the two actual samples around the timestamp i*T_s, i = 1, ..., M. For example, suppose that we have a bandwidth estimate of x at timestamp 3*T_s - 5 and another estimate of y at timestamp 3*T_s + 10. Then the sample at 3*T_s will be interpolated as x + 5*(y-x)/15. The number of samples M after this interpolation will be less than or equal to N depending on the timestamps of the actual samples collected during the TCP measurement phase. These M samples will correspond to timestamps 0 to (M-1)*T_s and are denoted by U(i), i = 1 to M, respectively. Once we have these uniformly distributed samples, we apply a linear filter to them to forecast the next M samples. They are given as follows. U(M+i) is given as the sum of U(M-n)*c_i(n) for n from 0 to M-1, for i = 1 to M. We find the filter coefficients c_i(n) by using the covariance equations method. The covariance of the sequence U(i), i = 1 to M can be estimated from the samples. The covariance equations can be solved by Gaussian elimination or some other faster methods such as triangular decomposition etc. Substituting the c_i's back into the linear prediction, we find the next M samples, U(M+i), i = 1 to M. From these samples, we can obtain the estimate of the available bandwidth on the path. The estimate depends on the conservativeness Anjali et al Expires April 2003 Page 5 Available Bandwidth Measurement October 2002 requirements of the application needing the measurement. If the application is highly conservative, it needs an upper bound on the available bandwidth and then we estimate the available bandwidth as the minimum among U(M+i), i = 1 to M. On the other hand, if the application is not very conservative, we estimate the available bandwidth as the mean of U(M+i), i = 1 to M. This single value is the final measurement provided by our scheme. It is not just a measurement but also incorporates a prediction and this value will be valid for sometime in the future. An extended version of this prediction method based on MRTG measurements is given in [6]. 3. Enhancements and Evaluations Our proposed method, detailed in the previous section, does not need any support from the receiver and the network. It probes the network on its own. On the other hand, if network support is available for a path, we can modify our method to obtain more accurate measurements. The method can become less intrusive and more fast and accurate. If network support is available, we replace the TCP measurement phase with a network probe phase. In this phase, we send measurement packets from the source of the path to the destination that collect hop information along the path. More specifically, they collect the counter information about the number of octets that have crossed that hop till then, for each hop. This information can be obtained from the MIBs stored in the routers on the path. In future networks, information sharing seems desirable as the networks can provide the QoS support that users request by cooperation. We need to send a small number of such packets to obtain repeated samples to calculate the path available bandwidth. These samples can then be input to the prediction phase. Only modification is that the samples represent the utilization of the bottleneck link in the path, unlike the available bandwidth samples in the original proposal. Thus, for a conservative application, we estimate the available bandwidth of the path as the minimum among the set of link available bandwidths. The link available bandwidths can be calculated as the difference between the link capacity and the maximum among the predicted utilization samples for that link. The measurement samples can be obtained by utilizing MRTG instead of probing the network. But MRTG has the problem that it obtains a utilization sample every 5 minutes. This can be too long an averaging interval for some applications. For such cases, we have developed MRTG++ [6], our own enhanced version of MRTG. MRTG++ obtains utilization estimates by averaging over intervals of length anytime above 10 seconds. The proposed method is accurate, scalable, easy to implement and does not need any network support. It is accurate as it obtains measurements from the network using a TCP session. If the application desiring the measurement needs highly accurate information, the TCP measurement phase can be repeated a few times Anjali et al Expires April 2003 Page 6 Available Bandwidth Measurement October 2002 to obtain more samples. Our approach is scalable as the measurement samples are obtained by a simple TCP session between the end-points. The overhead will not be different if the number of hops in the path is increased. Our scheme is easy to implement as it involves a simple TCP session initiation, followed by the linear prediction. Also, our scheme does not require any network support for its operation. It has to be noted that our scheme will have difficulty in measuring bandwidth for paths with high capacity links. This is because of the limitations in TCP operation like buffer size, segment size etc. For paths with high capacity links, our method can be extended to include multiple TCP sessions to be initiated in parallel. These multiple sessions can probe the network and obtain fair share estimates of the path available bandwidth. These sub-estimates can then be added to obtain proper estimates for the path. 4. References [1] Lai, K. and Baker, M., "Nettimer: a tool for measuring bottleneck link bandwidth," USENIX symposium on Internet technologies and systems, 2001. [2] Wolski, R., "Dynamically forecasting network performance using the network weather service," Journal of cluster computing, January 1998. [3] Jain, M. and Dovrolis, C., "End-to-end available bandwidth: measurement methodology, dynamics and relation with TCP throughput," ACM Sigcomm, 2002. [4] Stevens, W., "TCP slow start, congestion avoidance, fast retransmit, and fast recovery algorithms," RFC 2001, January 1997. [5] Allman, M., Paxson, V. and Stevens, W., "TCP congestion control," RFC 2581, April 1999. [6] Anjali, T., Scoglio, C., Chen, L. C., Akyildiz, I. F. and Uhl, G., "ABEst: an available bandwidth estimator within an autonomous system," IEEE Globecom 2002. 5. Author's Addresses Tricha Anjali Georgia Institute of Technology 250 14th Street, Suite 556 Atlanta, GA 30318 USA Phone: +1 404 894 6616 Anjali et al Expires April 2003 Page 7 Available Bandwidth Measurement October 2002 Caterina Scoglio Georgia Institute of Technology 250 14th Street, Suite 556 Atlanta, GA 30318 USA Phone: +1 404 894 6616 Ian F. Akyildiz Georgia Institute of Technology 250 14th Street, Suite 568 Atlanta, GA 30318 USA Phone: +1 404 894 5141 George Uhl Swales Aerospace Phone: +1 301 614 5155 Agatino Sciuto Swales Aerospace Phone: +1 301 614 5155 Jeffrey A. Smith NASA Goddard Space Flight Center Phone: +1 301 614 5155 Full Copyright Statement Copyright (C) The Internet Society (2002). All Rights Reserved. This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English. The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns. This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Anjali et al Expires April 2003 Page 8