February 3, 2003
Louis Breit


                SNMP Overhead and Performance Impact

                  draft-breit-snmp-overhead-00.txt

                     Status of This Document

 
This document is an Internet-Draft and is NOT offered in accordance with 
Section 10 of RFC2026, and the author does not provide the IETF with any 
rights other than to publish as an Internet-Draft.

Internet-Drafts are working documents of the Internet Engineering Task Force 
(IETF), its areas, and its working groups.  Note that other groups may also 
distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may 
be updated, replaced, or obsoleted by other documents at any time.  It is 
inappropriate to use Internet-Drafts as reference material or to cite them 
other than as "work in progress."


     The list of current Internet-Drafts can be accessed at
     http://www.ietf.org/1id-abstracts.html


     The list of Internet-Draft Shadow Directories can be accessed at
     http://www.ietf.org/shadow.html


Abstract

This memo provides an overview of the network and device overhead associated 
with SNMP polling and common network management platforms. In particular, it 
describes a means to calculate this overhead and determine the level of system 
impact prior to deployment, or as part of a planning process.

Body

SNMP - How Much Overhead?

Many questions and concerns have been raised about the actual overhead SNMP 
polling places on network hardware, servers, and other components.  In this 
draft, I hope to clear up some of the confusion.

A polling cycle can best be defined as the time in which each element on every 
device is monitored once.   A shorter polling cycle will yield greater 
accuracy for forecasting, while a longer cycle is adequate for trending and 
fault monitoring.

What is an element?  An element on a server may be CPU utilization, available 
memory, disk space, etc.  A single device attribute may also be an element, 
such as a modem's on/off state or a power supply temperature reading.  Any 
SNMP-compliant variable of interest can be an element.

Each element polled may require anywhere from 1 to 10 SNMP Object Identifiers 
(OIDs) to complete the request.  This number may vary based on the management 
package, configuration, and the device being monitored.  For example, a disk 
array may take one OID per disk, while a typical MIB-II poll of interface 
utilization (packets per second) takes 2 to 4 OIDs.

The OID identifies the information we wish to retrieve from the device; in 
other words, it defines the element.

While the number of OIDs required may vary, all are typically sent within a 
single SNMP GET request.  This GET request is contained within a single 
packet, and the response from the device is almost always contained within a 
single packet as well.  Therefore, one packet sent yields one packet back to 
the poller or management station.

The cycle can best be described as follows:

1. An SNMP GET request is sent to the target device.
2. The device accepts and processes the GET request via one interrupt.
3. The stored value for each OID requested is inserted into the response.
4. A response packet is sent back to the poller or SNMP management station.
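
The cycle above can be sketched in a few lines of Python.  This is only a toy 
model, not a real SNMP implementation: the ToyAgent class, the poll_element 
function, and the counter values are hypothetical names and data chosen for 
illustration, and no packets are actually sent.  The two OIDs are the standard 
MIB-II ifInOctets and ifOutOctets columns for interface index 1.

```python
from dataclasses import dataclass


@dataclass
class ToyAgent:
    """Stands in for an SNMP agent: a table of stored values keyed by OID."""
    values: dict
    requests_handled: int = 0

    def handle_get(self, oids):
        # The device accepts the request and looks up each stored value.
        self.requests_handled += 1
        return {oid: self.values.get(oid) for oid in oids}


def poll_element(agent, oids):
    """Send one GET carrying every OID for an element; get one response."""
    request = list(oids)               # the single GET request "packet"
    return agent.handle_get(request)   # the single response "packet"


# Illustrative counter values only; a real agent would report live data.
agent = ToyAgent(values={
    "1.3.6.1.2.1.2.2.1.10.1": 123456,  # ifInOctets.1
    "1.3.6.1.2.1.2.2.1.16.1": 654321,  # ifOutOctets.1
})

response = poll_element(
    agent, ["1.3.6.1.2.1.2.2.1.10.1", "1.3.6.1.2.1.2.2.1.16.1"]
)
print(response)
print("requests handled:", agent.requests_handled)
```

Both OIDs travel in one request, so the agent handles a single request for 
this element regardless of how many OIDs the element needs.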

Network equipment, servers, and dedicated devices are typically able to handle 
thousands of requests per second without degradation (the hardware 
manufacturer can provide exact numbers).  Therefore, the effect of these GET 
requests and the associated interrupts is minimal.

Calculating Network Bandwidth Overhead

Let's assume a 60 second polling cycle, with 500 polled elements contained in 
50 polled devices.  These could be 50 database servers, each with 10 elements 
we wish to collect information on (processor utilization, idle time, block 
size, memory reads/writes, disk I/O, etc.), or something else entirely.

We can assume an IP packet is approximately 1500 bytes (headers may change 
this a bit, but not by much).  This is a conservative figure, since most SNMP 
requests and responses are considerably smaller.

Assume a 100Mbps Ethernet connection for our example.

The following formula is used:

	        #ELEMENTS (X) PACKET_SIZE (X) 8 BITS/BYTE
	%UTIL = ----------------------------------------- (X) 100
	        NETWORK_BANDWIDTH (X) POLLING_INTERVAL

So for our example:

	 500 (X) 1500 (X) 8
	-------------------- = 0.001, or approximately 0.1% of the available
	100,000,000 (X) 60     bandwidth.
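
The same arithmetic can be checked with a short Python sketch.  The function 
and parameter names are my own, chosen for illustration.  The 
packets_per_element parameter is an added assumption: it defaults to 1 to 
match the formula above, and can be set to 2 to count both the request and the 
response packet.

```python
def bandwidth_utilization_percent(elements, packet_size_bytes, bandwidth_bps,
                                  polling_interval_s, packets_per_element=1):
    """Return polling traffic as a percentage of the link's capacity."""
    if bandwidth_bps <= 0 or polling_interval_s <= 0:
        raise ValueError("bandwidth and polling interval must be positive")
    # Bits generated by one full polling cycle.
    bits_per_cycle = elements * packets_per_element * packet_size_bytes * 8
    # Bits the link could carry during the same cycle.
    bits_available = bandwidth_bps * polling_interval_s
    return 100.0 * bits_per_cycle / bits_available


# Values from the example: 500 elements, 1500-byte packets,
# a 100Mbps link, and a 60 second polling cycle.
pct = bandwidth_utilization_percent(500, 1500, 100_000_000, 60)
print(f"{pct:.3f}% of the available bandwidth")
```

For the example values the ratio is 0.001, which is 0.1% of the link.  Counting 
the response packet as well doubles the figure to 0.2%.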

Calculating Device Overhead

The best way to determine the effect on individual devices is to calculate the 
number of requests per second.

	          TOTAL ELEMENTS PER DEVICE
	REQ/SEC = -------------------------
	              POLLING_INTERVAL

In our example:	10/60 = 0.17, or less than one request per second on average.

A more likely scenario in a larger network is to have a much larger number of 
elements per device.  Let's assume 100 elements per device with a polling 
interval of 20 seconds.

100/20 = 5 Requests Per Second on Average.
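
Both request-rate examples can be reproduced with a minimal Python sketch.  
The function name requests_per_second is hypothetical, and the result is an 
average: it assumes requests are spread evenly across the polling interval, 
which a given management package may or may not do.

```python
def requests_per_second(elements_per_device, polling_interval_s):
    """Average SNMP requests per second arriving at one device."""
    if polling_interval_s <= 0:
        raise ValueError("polling interval must be positive")
    return elements_per_device / polling_interval_s


# 10 elements per device, polled every 60 seconds.
small = requests_per_second(10, 60)
# 100 elements per device, polled every 20 seconds.
large = requests_per_second(100, 20)

print(f"small network: {small:.2f} requests/sec")
print(f"large network: {large:.2f} requests/sec")
```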

Now we can apply this to a particular device.  If a router can handle 5000 
requests per second, then 5 requests per second adds approximately 0.1% of 
overhead to the router.
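
As a rough sketch, the device overhead is the request rate divided by the 
rate the device can sustain.  The function name is my own, and the capacity of 
5000 requests per second is the assumed figure from the example; the real 
number should come from the hardware manufacturer.

```python
def device_overhead_percent(request_rate, device_capacity):
    """Share of a device's request-handling capacity used by polling."""
    if device_capacity <= 0:
        raise ValueError("device capacity must be positive")
    return 100.0 * request_rate / device_capacity


# 100 elements polled every 20 seconds gives 5 requests per second.
rate = 100 / 20
overhead = device_overhead_percent(rate, 5000)
print(f"{overhead:.1f}% of the router's request capacity")
```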

Conclusion

While it is necessary to balance the polling interval with the number of 
elements, the actual overhead on network bandwidth and hardware in most SNMP 
implementations is minimal and should be little cause for concern.


Author's Contact Information

Lou Breit
Seaford, NY 11783
Email:	l.breit@att.net

END DRAFT