INTERNET DRAFT                                                S. Donelan
                                                                     DRA
                                                            October 1996
Expire in six months

Responsible Network Management Guidelines

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress.''

To learn the current status of any Internet-Draft, please check the ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).

Abstract

This document provides Responsible Network Management personnel of Internet Service Providers (ISPs) and Internet Service Customers (ISCs) with guidelines for network management when the following conditions arise: Routine Maintenance Activity, Problem Reporting and Referral, Escalation, End-to-End Testing, Customer Notification, Emergency Communications, and Network Outage Measurement.

Specific procedures will require negotiations between the organizations involved. These guidelines do not replace or supersede contracts or any other legally binding documents.

Responsible Internet Service Provider

A more familiar term in Internet Standards is an Autonomous System. Because this document imposes requirements beyond those of an entity represented by an Autonomous System or Systems, it defines a new entity. The Responsible Internet Service Provider (RISP) has overall responsibility for Internet service between its Internet Service Customers and the other Internet Service Providers making up the Internet.
An Internet Network, Autonomous System or group of Autonomous Systems may designate another entity to act on its behalf as its Responsible Internet Service Provider. In this document, Internet Service Customer (ISC) shall refer to the collective network, Autonomous System or Systems which designated the Responsible Internet Service Provider as their agent.

The Responsible Internet Service Provider is responsible for:

-- Providing a contact that is readily accessible 24 hours a day, 7 days a week.

-- Providing trained personnel.

-- Acting as the Internet Service Customer's primary contact in all matters involving Internet Service between Internet Providers.

-- Accepting problem reports from Internet Service Customers, casual end users, and other parties that receive Internet Service problem reports. The RISP may prioritize problem reports from its own ISCs, or refer casual end users to their primary RISP, if known. Nevertheless, the RISP should accept problem referrals from other sources.

-- Advising the ISC when there is an ISP failure affecting the ISC.

-- Isolating problems to determine if the reported trouble is in the ISP's facilities or in other providers' service.

-- Testing cooperatively, when necessary, with other providers to further identify a problem when it has been isolated to another provider's service.

-- Keeping its Internet Service Customer advised of the status of the trouble repair.

-- Maintaining complete and accurate records of its own customers and inter-provider gateways.

Routine Maintenance Activity

Responsible Internet Service Providers should perform routine maintenance work during hours of minimum traffic so that the fewest customers are affected. In most areas, the period of lowest Internet traffic is between midnight and 6am local time. For trans-continental and inter-continental connections, providers should consider the local time at each end of the connection.
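The effect of considering local time at both ends of a connection can be illustrated with a short calculation. The following is a minimal sketch, not part of these guidelines: it assumes the midnight-to-6am window mentioned above, fixed UTC offsets for each end (daylight saving time is ignored), and hypothetical function names chosen for illustration only.

```python
from datetime import date, datetime, timedelta, timezone

# Assumption: the low-traffic window is midnight to 6am local time,
# as suggested in the text. Real traffic patterns may differ.
WINDOW_START_HOUR = 0
WINDOW_END_HOUR = 6


def in_local_window(utc_time, utc_offset_hours):
    """Return True if utc_time falls inside the local low-traffic window."""
    # Assumption: a fixed UTC offset stands in for the site's time zone.
    local = utc_time.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return WINDOW_START_HOUR <= local.hour < WINDOW_END_HOUR


def shared_window_hours(day, offset_a, offset_b):
    """List the UTC hours of `day` that are low-traffic at both ends."""
    start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    hours = []
    for h in range(24):
        t = start + timedelta(hours=h)
        if in_local_window(t, offset_a) and in_local_window(t, offset_b):
            hours.append(h)
    return hours
```

Under these assumptions, two ends at UTC-6 and UTC-8 share four low-traffic hours, while ends at UTC-5 and UTC+1 share none. In the second case the providers would need to negotiate which end's customers bear the impact.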
Activities which may affect other Internet Service Providers should be coordinated with the affected providers.

Problem Reporting and Referral

The Responsible Internet Service Provider is responsible for performing all the tests necessary to determine the nature of a problem that it detects, that its customers report, or that other ISPs refer to it. If the trouble is isolated to an ISC or another ISP, the RISP will report the trouble to the appropriate ISC or ISP point of contact.

An example of the information exchanged in the problem report:

-- Description of the problem, and any other useful information such as source and destination IP addresses, circuit numbers, etc.

-- The name and contact information of the person referring the problem

-- The date and time of the report

-- Trouble ticket number and the name or initials of the person accepting the report

Periodic status reports shall occur when the problem has been isolated, when there is a significant change in the status of the problem, and when negotiated time intervals expire. Escalation will be according to negotiated procedures.

Problem isolation may require cooperative testing between the ISC and ISP(s), which shall be provided when requested. The provider making the test is responsible for coordination.

When the problem has been cleared, the ISP/ISP or ISP/ISC parties shall advise each other that the problem has been cleared.

When closing a problem report between ISP/ISP or ISP/ISC, the disposition should be furnished by the organization closing the ticket. An example of the information exchanged in the problem disposition:

-- Trouble ticket number

-- Referral date and time

-- Returned date and time

-- Trouble identified as

-- Resolution details

-- Service charges, if the ticket resulted in a service charge

If there is a disagreement about the disposition of a problem ticket, the parties involved should document their respective positions and the names of the individuals involved.
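The example fields of the problem report and the problem disposition listed in this section can be pictured as two simple records. The following is a minimal sketch under stated assumptions: the class names, field names, and the close_report helper are hypothetical, and these guidelines do not define a record format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ProblemReport:
    """Fields from the example problem report (names are hypothetical)."""
    ticket_number: str
    description: str       # includes source/destination IPs, circuit numbers
    referred_by: str       # name and contact information of the referrer
    reported_at: datetime  # date and time of the report
    accepted_by: str       # name or initials of the person accepting it


@dataclass
class Disposition:
    """Fields from the example problem disposition (names are hypothetical)."""
    ticket_number: str
    referred_at: datetime
    returned_at: datetime
    trouble_identified_as: str
    resolution_details: str
    service_charge: Optional[str] = None  # set only if a charge resulted


def close_report(report, returned_at, trouble, resolution, service_charge=None):
    """Build the disposition furnished by the organization closing the ticket."""
    if returned_at < report.reported_at:
        raise ValueError("ticket cannot be returned before it was referred")
    return Disposition(
        ticket_number=report.ticket_number,
        referred_at=report.reported_at,
        returned_at=returned_at,
        trouble_identified_as=trouble,
        resolution_details=resolution,
        service_charge=service_charge,
    )
```

Carrying the ticket number and referral time from the report into the disposition lets both organizations match the closing record against the original referral.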
Escalation will be made according to each organization's escalation procedures.

Escalation

Each ISP and ISC shall establish procedures for timely escalation of problems to successive levels of management. The procedures should include the provision of status reports to the other provider or customer regarding the ticket status. Both technical and management contacts should be included in the escalation procedures.

End-to-End Testing

Networks may experience problems which cannot be isolated by each provider individually testing and maintaining its own services. Each provider's service may appear to perform correctly, yet trouble appears on the end-to-end service.

The ISC's RISP should coordinate end-to-end testing with each sectional provider by problem referral through their Responsible Internet Service Provider. Each Internet Service Provider should accept the referral request for end-to-end testing coordination, and provide the contact information for the next sectional provider to the original requestor.

Customer Notification

During a major outage, potential concerns are the loss of customer goodwill and the network congestion caused by repeated customer attempts to access the down network. Keeping customers informed can reduce both customer frustration and network congestion. Pre-planning for quick notification can be most beneficial in alerting customers.

Some example methods to notify customers include:

-- If operational, network access equipment can display an alert when customers connect. The alert should be displayed before the customer logs into the network. If the network fails during or after attempting to validate the access information, the alert should not compromise any authentication done.

-- Customer service calls increase dramatically during network failures. An informed customer representative can advise the customer on the best course of action. A method to quickly instruct customer service representatives on the options available is needed.
-- The media, such as radio or television, can be used to inform the public. Pre-arrangement and planning are needed to ensure that only designated contacts communicate with the media.

-- Other automated announcements, such as World Wide Web pages or e-mail distribution lists with backup through other providers, recorded telephone status lines, or broadcast FAX notifications.

Public notifications, when utilized, should not refer by name to a network or organization as the cause of a problem unless the network causing the problem has been identified. Internet network troubles can be difficult to isolate, and can give misleading indications of their true origin.

Emergency Communications

Recognizing that all Responsible Internet Service Providers have a responsibility to provide an adequate level of support for their service and/or products, it is recommended that they participate in an emergency communications system.

Each RISP is responsible for providing an Emergency Point of Contact (POC). It is recommended that each Emergency POC have at least one out-of-band contact method, such as an internationally dialable (non-800) voice and/or fax telephone number.

Each RISP shall update the Emergency POC information whenever it changes. Each RISP shall regularly, and no less than once a year, test and verify that its own Emergency POC procedures are accurate and functioning.

Network Outage Measurement

Each ISP/ISC should maintain accurate records of its network outages so that it can measure and analyze those outages and identify trends.

Security Considerations

Security-relevant information may be reported through a wide variety of contacts with the ISP: calls to the NOC, calls to customer service, and even calls to the provider's general business office. Each Responsible Internet Service Provider is responsible for training all its personnel on its internal guidelines for reporting security-relevant information to its security point of contact.
Emergency points of contact should exchange procedures for verifying each other's identity that do not depend on access to the Internet.

Author's Address

Sean Donelan
Data Research Associates, Inc.
1276 North Warson Road
Saint Louis, MO 63132

Phone: +1-314-432-1100
EMail: sean@DRA.COM