February 3, 2003
Louis Breit

SNMP Overhead and Performance Impact
draft-breit-snmp-overhead-00.txt

Status of This Document

This document is an Internet-Draft and is NOT offered in accordance with Section 10 of RFC2026, and the author does not provide the IETF with any rights other than to publish as an Internet-Draft.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

Abstract

This memo provides an overview of the network and device overhead associated with SNMP polling and common network management platforms. In particular, it describes a means to calculate this overhead and determine the level of system impact prior to deployment, or as part of a planning process.

Body

SNMP - How Much Overhead?

Many questions and concerns have been raised about the actual overhead SNMP polling places on network hardware, servers, and other components. In this draft, I hope to clear up some of the confusion.

A polling cycle can best be defined as the time in which each element on every device is monitored once. A shorter polling cycle will yield greater accuracy for forecasting, while a longer cycle is adequate for trending and fault monitoring.

What is an element? An element on a server may be CPU utilization, available memory, disk space, etc. A single device may also be an element, such as a modem's on/off state or a power supply temperature indicator. Any SNMP-compliant variable of interest can be an element.
Each element polled may require anywhere from 1 to 10 SNMP OIDs (Object Identifiers) to complete the request. This number may vary based on the management package, the configuration, and the device being monitored. For example, a disk array may take one OID per disk, while a typical MIB-II poll of interface utilization (packets per second) takes 2 to 4 OIDs. The OID identifies the information we wish to retrieve from the device; in other words, it defines the element.

While the number of OIDs required may vary, all are typically sent within a single SNMP GET request. This GET request is contained within a single packet, and the response from the device is almost always contained within a single packet as well. Therefore, one packet sent yields one packet back to the poller or management station.

The cycle can best be described as follows:

   1. An SNMP GET request is sent to the target device.

   2. The device accepts and processes the GET request via one interrupt.

   3. The stored value for the specific OID requested is inserted into the response.

   4. A response packet is sent back to the poller or SNMP management station.

Network equipment, servers, and dedicated devices are typically able to handle thousands of requests per second without degradation (the hardware manufacturer can provide exact numbers). Therefore, the effect of these GET requests and the associated interrupts is minimal.

Calculating Network Bandwidth Overhead

Let's assume a 60 second polling cycle, with 500 polled elements contained in 50 polled devices. These could be 50 database servers, each with 10 elements we wish to collect information on (processor utilization, idle time, block size, memory reads/writes, disk I/O, etc.), or something else entirely.

We can assume each IP packet is approximately 1500 bytes, the standard Ethernet maximum and therefore a conservative estimate (headers may change this a bit, but not by much). Assume a 100 Mbps Ethernet connection for our example.
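As a rough illustration, the assumptions above can be plugged into a short Python sketch. Python is not part of this draft, and the function and parameter names below are hypothetical, chosen only for this example. The sketch counts one packet per polled element per polling cycle, as the formula that follows does, and expresses the result as a percentage of link capacity.

```python
def bandwidth_utilization_percent(elements, packet_size_bytes,
                                  bandwidth_bps, polling_interval_s):
    """Return the percentage of link capacity used by polling traffic.

    Assumes one packet of packet_size_bytes per element per polling
    cycle (an assumption taken from the example in the text).
    """
    if bandwidth_bps <= 0 or polling_interval_s <= 0:
        raise ValueError("bandwidth and polling interval must be positive")
    # Bits placed on the wire during one polling cycle.
    bits_per_cycle = elements * packet_size_bytes * 8
    # Bits the link could carry during the same cycle.
    capacity_bits_per_cycle = bandwidth_bps * polling_interval_s
    return 100 * bits_per_cycle / capacity_bits_per_cycle


if __name__ == "__main__":
    # Example values from the text: 500 elements, 1500-byte packets,
    # 100 Mbps Ethernet, 60 second polling cycle.
    pct = bandwidth_utilization_percent(500, 1500, 100_000_000, 60)
    print(f"{pct:.3f}% of link bandwidth")
```

Changing any one argument shows how sensitive the result is to that assumption; for instance, halving the polling interval doubles the utilization.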
The following formula is used:

            #ELEMENTS x PACKET_SIZE x 8 BITS/BYTE
   %UTIL = --------------------------------------- x 100
            NETWORK_BANDWIDTH x POLLING_INTERVAL

So for our example:

              500 x 1500 x 8
   %UTIL = -------------------- x 100 = 0.1%
             100,000,000 x 60

or approximately 0.1% of the available bandwidth. This counts one packet per element; counting both the request and its response as traffic on the same link doubles the figure to about 0.2%, which is still small.

Calculating Device Overhead

The best way to determine the effect on individual devices is to calculate the number of requests per second.

              TOTAL ELEMENTS PER DEVICE
   REQ/SEC = ---------------------------
                  POLLING_INTERVAL

In our example:

   10 / 60 = 0.17, or less than one request per second on average.

A more likely scenario in a larger network is to have a much larger number of elements per device. Let's assume 100 elements per device with a polling interval of 20 seconds.

   100 / 20 = 5 requests per second on average.

Now we can apply this to a particular device. If a router can handle 5000 requests per second, then 5 requests per second adds approximately 0.1% of overhead to the router.

Conclusion

While it is necessary to balance the polling interval with the number of elements, the actual overhead on network bandwidth and hardware in most SNMP implementations is minimal and should be little cause for concern.

Author's Contact Information

Lou Breit
Seaford, NY 11783
Email: l.breit@att.net

END DRAFT