Network Working Group P. Hallam-Baker
Internet-Draft May 3, 2019
Intended status: Informational
Expires: November 4, 2019

The Devil is in the Deployment
draft-hallambaker-iab-deployment-00

Abstract

The defining feature of a standard is that it is widely, and preferably ubiquitously, used. The deployment strategies of previous protocol standardization efforts are compared, and suggested best practices for application and infrastructure protocol deployment are described. Recommendations for enabling deployment of specific protocols and for future IETF working practices are made.

This draft is a generalization of the principles used to develop the deployment strategy for the Mathematical Mesh. Many documents describing deployment considerations were produced during work on the Mesh, and these motivated many changes to the design along the way.

The Mesh is consciously and deliberately modeled on the strategies that succeeded for the Web. Some of these strategies are well known.

Other parts of the Web strategy have not been widely discussed. This paper presents some parts of the strategy most relevant to the IAB workshop program.

This document is also available online at http://mathmesh.com/Documents/draft-hallambaker-iab-deployment.html.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on November 4, 2019.

Copyright Notice

Copyright (c) 2019 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


1. Lessons from History

When the Internet was first being developed, the number of hosts was zero and the user community was highly motivated to adopt new technologies because they were developing them. Today the Internet has four billion users and forty years of legacy infrastructure. If we are going to improve our record of deploying new developments we must look past the earliest pioneering days and focus on deployment of technologies developed since the Internet had grown in size to the point where deployment was a primary design constraint.

If we are going to succeed in being relevant, we must design for deployment. We also have to have the courage to learn from past mistakes. Reminding people of past mistakes is never popular but learning from them is the only way to avoid them.

1.1. The World Wide Web

Contrary to subsequent histories, the success of the World Wide Web was neither inevitable nor an accidental consequence of the design. The Web was neither the first network hypertext system proposed nor the best funded. When the Web was demonstrated in public for the first time late in 1992, there were two dozen competing schemes and the only developers paid to work on the Web full time were Tim Berners-Lee himself and one intern.

Having read and re-read the www-talk archives many times while researching prior art, I find it clear that deployment was of paramount concern in the design of URLs, HTTP and HTML. The scheme prefix was added to the original HTTP locator in response to Dan Connolly's suggestion that the Web should permit access to any resource regardless of the access protocol. The port field was added after developers complained that they could not run a Web server because they lacked the system privileges then required to bind to port numbers lower than 1024. SGML was adopted as the basis for the markup language in spite of, rather than because of, its technical merits.
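
Both fields survive in every URL today. The following minimal sketch uses Python's standard library to pull them out; the URL and the port number are arbitrary examples chosen for illustration.

    from urllib.parse import urlsplit

    # The scheme prefix and optional port field discussed above, as they
    # appear in URLs today. The URL itself is an arbitrary example.
    parts = urlsplit("http://info.cern.ch:8080/hypertext/WWW/TheProject.html")

    print(parts.scheme)  # 'http' -- lets a client dispatch on access protocol
    print(parts.port)    # 8080   -- lets a server run without binding port 80
    print(parts.path)    # '/hypertext/WWW/TheProject.html'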

The success of the Web was in part due to the fact that it was designed to solve a specific set of problems rather than to realize Ted Nelson's vision. Stripping out difficult to implement features (search, payments, referential transparency) from the core allowed them to be solved separately as demand and resources permitted.

But more important than the design was the fact that the Web offered a dramatically lower cost of deployment than any of its rivals. At the time, 'free software' generally came at the cost of several days' effort trying to get the source code to compile. Pricing for commercial software was based on a fraction of the price of the machine on which it ran, which ranged from tens of thousands to millions of dollars.

The Web clients and servers were free for non-commercial use and the NCSA implementations had been developed with US government grants. This last consideration was of key importance in the Clinton administration's decision to use the Web as the basis for realizing Al Gore's vision of an 'Information superhighway'.

Work on gaining the endorsement of the White House began before the 1992 presidential election. The MIT AI lab had begun its Intelligent Information Infrastructures project which put material from all the campaigns online a year earlier. I made contact with Jock Gill who was then in charge of the Clinton-Gore campaign and proposed that the White House deploy a Web site.

The launch of the IBM PC a decade earlier had validated the use of the 'microcomputer' as a business tool. Before the IBM PC, the priority of most MIS departments was to maintain their monopoly. As an intern at ICI, I spent four months writing code to screen-scrape reports from the IBM mainframe so that my boss could analyze them with Lotus 123. My predecessor had had to work with a machine hidden in a cupboard so the MIS department couldn't find and confiscate it. IBM's endorsement legitimized the microcomputer, and I wanted a similar endorsement for the Web.

The strategy paid off. Before the launch of whitehouse.gov, we could persuade almost no businesses outside the computing industry to adopt the Web, despite considerable effort and despite fevered press reporting on the thousand-percent-per-week growth rate of the Web. After whitehouse.gov was launched, there was no need for persuasion. The Web was growing of its own accord.

Another endorsement that was aggressively pursued was that of Microsoft. This initiative was taken by Robert Cailliau over the course of many months in 1993. Subsequent attacks on Microsoft for 'stealing' the Web technologies have always rankled me. The truth of the matter is that we gave them the technology and pleaded with them to distribute it.

In summary, the Web succeeded because it was designed for deployment and we aggressively pursued key endorsements to market it. A necessary part of the design for deployment approach was abandoning approaches that were regarded as sacrosanct in the field for reasons that turned out to be grounded in ideology rather than technology.

1.2. IPv6

Despite work on IPv6 starting at the same time as the Web, and despite projections that the growth of the Internet would exhaust the IPv4 address space in 1998, deployment of IPv6 continues to fall short of expectations.

It is arguable that the delay in adoption of IPv6 is in part due to the success of the Web. Wide Area Networking was growing at an exponential rate before the Web appeared on the scene, but the Web caused the Internet to kill off the competing WAN protocols. Had that occurred in 1997, the transition to IPv6 might conceivably have been completed first. But Internet supremacy became inevitable in 1995 instead, meaning that it was IPv4, not IPv6, that became ubiquitous.

Deployment of the Internet has been driven by two killer applications: Email and the Web. And here there is an ironic twist. Over 1993 the proportion of Internet users who were Web users rose from ~0% to ~100% because the URL scheme field delivered interoperability. But from the start of 1994 to the end of 1995 the percentage of global WAN traffic that was Internet traffic grew from under 20% to over 80% and this change was likely driven by the Web's lack of interoperability.

In 1992, the primary applications for academic computer networks were remote access, email and file transfer. As a graduate student in the UK, the primary WAN networks I made use of were HEPNET, which ran on DECNET phase 4, and JANET, which ran a protocol stack called the Coloured Books. I did not have direct access to the Internet from my Oxford University machine but I could access Internet machines in Germany via HEPNET. I could also exchange email with Internet users via a mail gateway which used the heuristic that email addresses beginning with com. or edu. were Internet addresses and reversed the big-endian addressing convention adopted by JANET. Remote access and file transfer could be achieved using similar (but less reliable) techniques.
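
The following sketch illustrates the kind of rewriting such a gateway might have performed; the function and the exact label test are assumptions for illustration, not a description of the actual gateway software.

    def to_internet_order(address: str) -> str:
        """Rewrite a JANET-style big-endian mail domain into Internet order.

        JANET wrote domains most-significant-label first (uk.ac.ox.unit),
        the Internet least-significant-label first (unit.ox.ac.uk). A
        gateway of the kind described above might guess that a domain
        beginning with 'com.' or 'edu.' is an Internet address written in
        JANET order and simply reverse the labels.
        """
        local, _, domain = address.partition("@")
        labels = domain.split(".")
        if labels and labels[0] in ("com", "edu"):
            labels.reverse()  # edu.mit.ai -> ai.mit.edu
        return local + "@" + ".".join(labels)

    # Illustrative: 'user@edu.mit.ai' -> 'user@ai.mit.edu'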

Before the Web, attempts to secure funding for access to any computer network other than JANET were an exercise in futility. The only person with decision making power was the Secretary of State for Education who was most unlikely to fund a rival to the system his own department was paying to build. Not least because the advisory board was staffed by the people who were developing it. Any attempt to propose use of a rival technology was easily defeated by pointing out that JANET afforded access to the exact same resources.

The Web changed this calculus because even though the Web itself could run on other protocols, it was the content that the users wanted access to and that was only available on the Internet (at least as far as the Minister was aware).

The first point of this apparent digression is that while these political considerations may sound petty and short-sighted, they are the exact considerations that gave rise to a particular interpretation of the 'end-to-end' principle, one which insists that the IP address remain constant end-to-end, an interpretation that appears nowhere in the original papers. The point of 'IP end-to-end' was that applications work a lot better without unnecessary translations from one protocol stack to another and back. Government sponsorship can be a powerful driver of early adoption. It can also lead to a situation in which the market is deadlocked in a format war.

The second point of the digression is that 'IP end-to-end' became ideology, a slogan. Any suggestion that NAT was beneficial was attacked using exclusionary tactics. I was attacked in very unpleasant terms for suggesting that I had no intention of paying my ISP $10/month for each device added to my home network when I could do the same thing for free using NAT. Since I now have 200 devices on my home network, this would cost me $24,000 a year.

To the extent there was a deployment strategy for IPv6, it was to raise concern that exhaustion of the address space was imminent and make dire predictions of the consequences of this happening, while discouraging the use of NAT or other techniques that might mitigate the problem. This approach was unsustainable. The main party that would suffer if the IPv4 address pool were exhausted was the ISPs. In 1998, I was amused when a neighbor reported that the broadband provider I had abandoned because their TOS did not permit use of NAT had shipped him a new router with NAT enabled by default. I was further amused when I read the new TOS to find that use of NAT was still prohibited.

The argument made against NAT was that users were not going to move to IPv6 unless the new protocol offered new features. The reverse is in fact the case. Even today, IPv6 is not a choice for most Internet users. It is a feature their ISP does or does not provide. Or more accurately, it is a feature that some of the multiple ISPs that a user might rely on in a single day might provide. Differentiating IPv6 by offering additional features to application developers is doomed to failure because IP is a network layer capability that the application protocol designer does not and indeed cannot rely on.

The OSI layered model is a poor guide to the Internet protocol stack but the principles of abstraction and encapsulation are not. An application layer protocol has no business dealing in network layer addresses. We should regard application protocols that rely on IP addresses being constant end-to-end as being poorly architected rather than attempting to police global provision of Internet services to enforce a misfeature.

1.2.1. Recommendations

Rather than trying to drive deployment of IPv6 by limiting the functionality of the IPv4 Internet, we should attempt to eliminate as many differences as possible. Our end goal should be to improve network capabilities as quickly as possible rather than to achieve IPv4 sunset as quickly as possible.

What is important is that we have enough addresses to allow the Internet to continue to grow. NAT has allowed the number of devices connected to the Internet to exceed the number of IPv4 addresses by at least an order of magnitude by allowing multiple devices at the same site to share the same address. The effectiveness of this strategy will inevitably decline as the number of sites begins to approach the number of available addresses.

The priority therefore should be to make access to the IPv6 backbone as ubiquitous as possible, including access using devices that are not IPv6 capable and never will be.

I have 200 devices on my home network of which only 20 are configured for use of IPv6. I am not replacing my 36" plotter just so that I can connect via IPv6. Nor do I plan to open up walls or climb ladders to do so. Nor is anyone else in a similar position going to do so. But every one of those devices could function as if it were IPv6 capable, as far as the rest of the Internet was concerned, if the NAT box connecting my home network to the Internet was appropriately configured.
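
One way a translating home router could present such devices is to embed each internal IPv4 address in an IPv6 address under a /96 prefix drawn from the site's delegation, in the style of the RFC 6052 address format used by NAT64 translators. The sketch below is illustrative only; the prefix and device address are placeholders and no particular product is claimed to work exactly this way.

    import ipaddress

    def embed_ipv4(site_prefix: str, device_v4: str) -> ipaddress.IPv6Address:
        """Map an IPv4-only device to an IPv6 address the world can reach.

        Embeds the 32-bit IPv4 address in the low 32 bits of a /96 IPv6
        prefix (RFC 6052 style). A translating router advertising the
        prefix could accept IPv6 traffic for that address and translate
        it to IPv4 on the local network.
        """
        prefix = ipaddress.IPv6Network(site_prefix)
        if prefix.prefixlen != 96:
            raise ValueError("a /96 prefix is needed to hold an IPv4 address")
        v4 = int(ipaddress.IPv4Address(device_v4))
        return ipaddress.IPv6Address(int(prefix.network_address) | v4)

    # Illustrative: a plotter at 192.168.1.40 behind the (placeholder)
    # prefix 2001:db8:0:1::/96 would appear externally as
    # 2001:db8:0:1::c0a8:128
    print(embed_ipv4("2001:db8:0:1::/96", "192.168.1.40"))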

Deployment cannot be advanced by withholding features, but it can be advanced by offering better performance. Users demand gigabit connection speeds because they believe they will deliver performance. Users are likely to demand compliance with an IETF-specified suite of RFCs if they believe that this will provide better performance. But the industry can only follow the IETF lead if the IETF recommendation is actionable. 'Stop using IPv4' is not actionable today and will not be actionable as far as the home user is concerned within our lifetimes. A recommendation that ISPs provide IPv6 to the home/enterprise and that every home router support a feature set that allows every device connected to the local network to make full and transparent use of that capability is actionable.

1.3. DNSSEC and DANE

Like IPv6, DNSSEC was proposed at roughly the same time as the Web, and it is generally argued that deployment of DNSSEC was eclipsed by the rise of a Web technology. The introduction of SSL (now TLS) led to the deployment of what is now known as the WebPKI, one of only two Internet security protocols that have approached ubiquitous deployment in their field of use (the other being the closely related SSH).

This is a misconception: the WebPKI was developed to provide an accountability infrastructure sufficient to enable Internet commerce, while DNSSEC was never intended to provide a form of authentication sufficient for accountability. But neither is the WebPKI capable of supporting what should have been the primary objective of DNSSEC: authenticated distribution of security policy.

One of the chief problems faced in deployment of DNSSEC was that until critical mass is reached, the network effect works against deployment. DNS services were typically consumed through operating system services and no major operating platform provider was going to provide support for DNSSEC until there was customer demand. No customer was going to demand support for DNSSEC in the operating system until they could register their keys in their domain and the registries were not going to support registration of keys until there was some means of using them.

The first time the CEO of any major Internet technology provider mentioned DNSSEC was when Stratton Sclavos cited the potential for deployment of DNSSEC as one of the major benefits of the acquisition of Network Solutions in 2000. At that point, VeriSign was the only major stakeholder in the DNS infrastructure endorsing deployment of DNSSEC.

The endorsement was not appreciated by the DNS community. One of the DNSEXT chairs who was also a member of the IESG repeatedly demonstrated open hostility to the name 'VeriSign' and anyone associated with the company.

In 2001, detailed examination of the DNSSEC deployment requirements revealed that the NSEC record as it was then specified would require the entire zone to be signed, increasing the size of the zone file by more than an order of magnitude. This would in turn require substantial changes to the architecture of the ATLAS infrastructure then being developed. Storing the complete zone file at every node would require more than 4GB of memory and thus require the use of 64-bit machines, which would add an estimated $30 million to the cost. Re-engineering the system to partition the database would delay deployment by at least a year.
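
A back-of-envelope calculation shows how such numbers stack up. The figures below are assumptions chosen for illustration, not measurements from the time.

    # Rough sketch of the memory argument above; all inputs are assumed.
    delegations = 20_000_000   # assumed order of magnitude for .com circa 2001
    bytes_per_name = 30        # assumed: NS delegation plus glue
    signing_overhead = 300     # assumed: per-name denial-of-existence
                               # and signature records

    unsigned_zone = delegations * bytes_per_name
    signed_zone = delegations * (bytes_per_name + signing_overhead)

    print(f"unsigned: ~{unsigned_zone / 1e9:.1f} GB")        # ~0.6 GB
    print(f"signed:   ~{signed_zone / 1e9:.1f} GB")          # ~6.6 GB, beyond 32-bit reach
    print(f"growth:   ~{signed_zone / unsigned_zone:.0f}x")  # ~11x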

These facts and a technical proposal that addressed the issue were presented. One of the responses to the proposal was that if the .com zone was too large to be signed using DNSSEC, the correct solution was to reduce the size of .com, not to change DNSSEC. While I was not surprised the statement was made, it should perhaps have been surprising that nobody laughed.

Besides delaying the start of actual DNSSEC deployment by a decade, the situation came very close to litigation that could have bankrupted the IETF. When Sclavos resigned in 2007, one of the principal complaints about his performance made by the board was the failure to show synergies between the businesses he had acquired, in particular the failure to deploy DNSSEC.

Attempts to deploy DANE and DPRIV have fallen victim to similarly blinkered thinking.

1.3.1. DANE

DANE was an attempt to use the DNS to provide certification of server keys and distribute security policy using the DNS. Despite repeated warnings, the working group never recognized that attempting to achieve both goals in one system would introduce constraints that doomed deployment.
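
For reference, the certification half of DANE takes the form of a TLSA record that binds a TLS server's certificate or public key to a DNS name (RFC 6698). The sketch below composes the presentation form of such a record; the host name and key bytes are placeholders.

    import hashlib

    def tlsa_record(host: str, port: int, spki_der: bytes) -> str:
        """Compose the presentation form of a DANE TLSA record (RFC 6698).

        Uses certificate usage 3 (DANE-EE), selector 1 (SubjectPublicKeyInfo)
        and matching type 1 (SHA-256).
        """
        owner = f"_{port}._tcp.{host}."
        digest = hashlib.sha256(spki_der).hexdigest()
        return f"{owner} IN TLSA 3 1 1 {digest}"

    # Placeholder input: a real record hashes the server's actual
    # SubjectPublicKeyInfo, and the zone must be DNSSEC signed before a
    # client can safely act on the policy the record implies.
    print(tlsa_record("www.example.com", 443, b"placeholder-spki-bytes"))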

At the time DANE was proposed, most DNS registrars operated their domain name registration businesses as a loss leader for their other services, the major profit center for most being the sale of TLS certificates. The same DNS registrars were the gatekeepers for deployment of DNSSEC.

Had the scope of DANE been limited to the issue of free certificates, DNSSEC need not have been an essential requirement, the registrars would not have been gatekeepers for the deployment of DANE, and the fact that DANE would eliminate their main source of earnings would not have mattered. But DANE was also intended to be a means of publishing security policy information and in particular to tell clients that they must use TLS. This meant that deployment of DANE was necessarily dependent on deployment of DNSSEC.

Had deployment of DANE and DNSSEC been decoupled so that one could be used without the other if necessary, a virtuous cycle of deployment might have been realized in which the success of one encouraged the other.

To this day, very few DNS registrars advertise support for DNSSEC and none that I am aware of facilitate use of DANE TLSA records.

1.3.2. DPRIV

DPRIV was an attempt to provide confidentiality for DNS protocol communications between end user clients and resolution services.

As with DANE and DNSSEC, the deployment constraints of the Web browser providers were ignored and the design was predicated on an undeployed technology.

DPRIV did have the backing of VeriSign, the primary DNS operator, but the Web browser providers did not express interest. Recognizing the urgent need to protect the confidentiality of DNS traffic, the working group decided to complete its work in a year. This in turn constrained the choice of cryptographic protection to TLS, and since TLS is layered on TCP, it meant DPRIV could only come close to meeting the latency requirements set by the browser providers if TCP Fast Open, an experimental technology, was used.
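
A rough count of network round trips illustrates the latency concern; the handshake counts below are the standard ones for the protocol versions available at the time, not measurements.

    # Round trips before a resolver answer arrives, cold start; illustrative.
    RTT = 1

    udp_dns = 1 * RTT                # query and response
    tcp_handshake = 1 * RTT          # SYN / SYN-ACK before data can flow
    tls12_handshake = 2 * RTT        # full TLS 1.2 handshake
    query_response = 1 * RTT

    dns_over_tls = tcp_handshake + tls12_handshake + query_response  # 4 RTT
    dns_over_tls_tfo = dns_over_tls - 1 * RTT  # TCP Fast Open carries the
                                               # ClientHello in the SYN: 3 RTT

    print(udp_dns, dns_over_tls, dns_over_tls_tfo)   # 1 4 3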

2. Recommendations

Almost any advice on deployment strategy is likely to prove useful. The one counterexample is the advice most frequently given: to give up on any hope of making changes because the scale of the Internet has made all new infrastructure deployment impossible.

The Internet can and does change. New protocols and protocol features are developed and successfully deployed every day. Only a small fraction of that work takes place in the IETF and for every project that succeeds, ten or perhaps a hundred fail. But even work that is a failure is worthwhile if at least one other person learns a lesson that allows another project to succeed.

2.1. Purpose of the IETF

The first comment I received when I announced the Mesh 3.0 documentation set was a question asking whether anyone else was planning to implement the specification, since the primary focus of IETF work is to enable interoperability between implementations.

While developing clear specification documents that describe compelling designs facilitating widespread interoperability is a noble and important goal, it is not one that the IETF is designed to serve. If I wanted to develop elegant and compelling specifications, I would hardly want to do so in a hundred-person committee.

It doesn't take a hundred people to design a specification, but it frequently requires that number or even more to represent the interests of all the stakeholders and gatekeepers whose requirements must be met if deployment is to be successful.

The main if not the sole reason I attend standards organization meetings is to build the constituency necessary for adoption. If I had already established that deployment constituency, I would have no need of coming to the IETF in the first place. Equally, it would be surprising to say the least if anyone else had begun an implementation of a set of specifications when I have been discouraging anyone from doing that until the work was nearly complete.

If our work has any importance, it is that it improves the Internet. It is the consensus outside the IETF in the user community that ultimately counts. Over the years I have seen many attempts to short circuit the process and march a specification through at breakneck speed. While this is an effective strategy for impressing managers, it is not an effective strategy for building a deployment community. It is much easier to form a strong consensus among a dozen people who start with strongly aligned views and interests than to form such a consensus among a hundred people representing twenty different deployment constituencies. But it is the latter that is necessary if we are to achieve deployment.

Many people try to accelerate the IETF process; I have frequently preferred a slower pace when I believed it might allow endorsement by a key stakeholder or gatekeeper.

Recommendation: The IAB should recognize that at least one purpose of the IETF is to help technology developers build a deployment constituency and that this is properly a result of rather than a precondition for considering work.

2.2. Design for Deployment

The need to design for deployment is argued in the previous section. The question for the IAB is when and how that approach should be required.

Rather than clutter up every RFC with a mandatory 'Deployment Considerations' section, the intended outcome is more likely to be achieved if this is an exceptional rather than a routine requirement and takes the form of a deliverable required of a working group during the chartering process. As is frequently the case with use cases and requirements documents, a document describing deployment considerations need not necessarily be published as an RFC but would serve to inform and facilitate the design process.

Recommendation: The IESG should request presentation of a 'Deployment Considerations' section as a deliverable when chartering or re-chartering work where success requires widespread adoption.

2.3. Identify Stakeholders and Gatekeepers

When deployment of a specification depends on adoption by a particular community of stakeholders, the opinions expressed by that community must be considered when designing for deployment. When one or more stakeholders have an influence so strong that it amounts to a veto power over particular forms of deployment, they should be recognized as gatekeepers and the deployment strategy designed accordingly.

It is important to note, however, that recognizing stakeholders and gatekeepers is not the same as affording them veto power. What is important is that their views must be considered by the Working Group even if the stakeholders and gatekeepers themselves are not present. If a proposal to improve Internet security is critically dependent on adoption by Web browser providers, their deployment criteria must be determined and respected. Or, if it is impossible to reconcile the objectives with those criteria, the design must be changed so that it does not depend on adoption by the Web browser providers.

It is of course necessary that the IETF operate under the fiction that every participant participates in their personal capacity alone and is not speaking for their employer. And this is certainly true to the extent that of the 1,500 attendees at a typical meeting, few if any have direct authority to speak on behalf of an organization of more than fifty people. And those that do are less likely to speak because of that fact.

But this polite fiction should not prevent Working Groups from soliciting and receiving direct input on the issue of deployment criteria from constituencies identified as stakeholders and gatekeepers.

Recommendation: When the IESG requires deployment considerations be produced, these should specify the key stakeholders and gatekeepers and the positions these parties have expressed.

2.4. Realistic Schedules

Some of the standards efforts I have been involved in have succeeded and some have failed. While it is never possible to know with certainty that an effort will succeed, it is often possible to predict with a high degree of certainty that an effort will fail. There is no more certain sign that an effort is doomed than when the introductory slides assert that the urgency of the problem is so great that a solution must be found and deployed within 12 months.

There are very few problems that are currently being addressed in IETF Working Groups that have not been recognized as issues for at least five years. Most had been understood as issues for a decade or more. The claim that any issue has suddenly become so urgent that there is insufficient time to consider it properly should therefore bear a heavy burden of proof.

The fact that a Working Group charter sets an unrealistic schedule is not of course any guarantee that it will be met. And it is usually apparent from the start that this was never the intention. Setting an unrealistic schedule allows the scope of work to be controlled to exclude unwanted use cases, requirements and constraints and thus ensure that the Working Group selects a particular approach that allows the party controlling the process to re-use existing code rather than write new code.

Recommendation: The IESG should reject Working Group schedules that leave insufficient time to discuss the use cases, requirements and appropriate technology.

2.5. Eliminate Deployment Dependencies

One of the most useful, and certainly the most frequent, pieces of advice offered by Jim Schadd when I shared an office with him at W3C was to avoid 'error 22': do not build research on research.

It is not uncommon for one Working Group to attempt to force deployment of their work by persuading another working group to make it an essential requirement. Rather than encouraging such dependencies, they should be vigorously discouraged.

Another form of deployment dependency is the requirement that a standard be designed in a particular way so that a particular stakeholder can re-use existing code. While re-use of existing code is an advantage, it is very rarely as decisive an advantage as the proposer imagines. The fact that a designer was able to lash together a prototype using an existing 500K line library does not necessarily suggest that this will provide a short cut to development of a production version of the same system.

Recommendation: When the IESG requires deployment considerations be produced, these should specify all the technologies that the proposal is dependent on, the status of each and a justification given for reliance on any technology that is not already ubiquitously deployed.

2.6. Recognize Failure

Probably the hardest step for the IETF to take as an institution is to recognize when an approach has failed and to stop investing resources in that effort.

One of the most important decisions that the IETF took in the deployment of end-to-end secure mail was to recognize that PEM had failed to win adoption and to clear the field for S/MIME and OpenPGP.

It is now time for the IETF to have the courage to recognize that S/MIME and OpenPGP have failed to thrive. They have both established significant user bases and serve important functions. But neither has made appreciable progress in adoption in the past two decades and neither is likely to achieve ubiquity. Recognizing that these legacy protocols have failed to thrive would not render them obsolete but would clear the field, allowing alternative approaches to be proposed.

Recommendation: The IAB should be tasked with performing periodic reviews of IETF standards and identifying those that have 'failed to thrive'.

Author's Address

Phillip Hallam-Baker EMail: phill@hallambaker.com