Internet Engineering Task Force                          A. Burness, Ed.
Internet-Draft                                                P. Eardley
Intended status: Informational                                        BT
Expires: August 21, 2008                               February 18, 2008


                     Locater ID proposal evaluation
                    draft-burness-locid-evaluate-00

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 21, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2008).

Abstract

   There are many proposals for improving the inter-domain routing
   system, most of which involve a form of locator-identity split.
   A means is needed to reason about the strengths of the different
   proposals against the design criteria without requiring large-scale
   implementations.  This document aims to start this process by
   drawing parallels with existing systems.  It identifies a number of
   questions that need to be thought through more fully whilst we press
   ahead with system development.


Burness & Eardley        Expires August 21, 2008                [Page 1]

Internet-Draft       Locater ID proposal evaluation        February 2008


Table of Contents

   1.  Introduction
   2.  Design Goals
     2.1.  Router Scalability
     2.2.  Traffic Engineering
     2.3.  Multi-Homing
     2.4.  Mobility
     2.5.  Ease of changing providers
     2.6.  Routing Quality
     2.7.  Routing Security
     2.8.  Deployability
     2.9.  Unclear Requirements
     2.10. Address Shortage
     2.11. Failure Management
   3.  Related Working Options
     3.1.  NAT
     3.2.  Mobile networks and directory systems
       3.2.1.  3G Systems
       3.2.2.  Mobile IP
       3.2.3.  DNS
       3.2.4.  Summary
     3.3.  The routing system
   4.  Map and Encap Schemes
     4.1.  Routing System Scalability
     4.2.  Traffic Engineering
     4.3.  Multi-Homing
     4.4.  Mobility
     4.5.  Changing Provider
     4.6.  Route Quality
     4.7.  Routing Security
     4.8.  Deployability
     4.9.  Address Shortage
     4.10. Failure Handling
   5.  Translation Schemes
     5.1.  Routing System Scalability
     5.2.  Traffic Engineering
     5.3.  Multi-Homing
     5.4.  Mobility
     5.5.  Changing Provider
     5.6.  Route Quality
     5.7.  Deployability
     5.8.  Address Shortage
     5.9.  Failure Handling
   6.  Mapping System Design
     6.1.  Push
     6.2.  Pull
     6.3.  Route Through
   7.  Conclusions
   8.  Acknowledgements
   9.  IANA Considerations
   10. Security Considerations
   11. References
     11.1. Normative References
     11.2. Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements


1.  Introduction

   The Internet routing system has problems with scalability and
   stability.  These problems are made worse by the need to support
   functionality such as multi-homing and traffic engineering [IAB].
   There have been a multitude of proposals that involve some form of
   locator-identity split, all aiming to solve the problem of routing
   scalability.  However, without large-scale implementations it is
   very difficult to assess the relative strengths of these different
   proposals.  On the other hand, it should be possible to characterize
   the proposals against the requirements.  Further, by comparing the
   proposals against existing systems, we may also be able to start to
   understand the likely processing, storage and communications
   requirements.

   Whittle [whittle] has made a study of this type to compare some of
   the specific locator-ID split proposals.  Here, instead of studying
   specific proposals, we group proposals into simple categories (map
   and encap schemes, which were the focus of the previous study,
   translation schemes, and directory systems) to enable us to
   understand the likely behaviour of whole groups of proposals at a
   more generic level.

   This document aims to start a process of evaluation.  It is written
   not as settled truth, but as the perception of the authors, which
   should be challenged.

   We begin by reviewing the requirements against which proposals should
   be assessed.  Then we highlight some existing systems which may have
   processing, communications or memory requirements similar to those of
   the proposed schemes.  Their behavior might help to guide us in
   assessing the proposals.  This is essentially trying to learn from
   history.  We appeal here in particular to equipment manufacturers,
   who may have a better grasp of equipment capabilities: which limits
   are fundamental and which are based on market requirements.  We then
   assess the generic schemes against the requirements.  There are
   essentially two main approaches to routing, commonly known as map
   and encap, and translation.  The critique of map and encap is based
   primarily on an understanding of LISP (draft 5) [LISP], the apparent
   current leader in that set of schemes; similarly, the translation
   section is based upon 6/1 [six-one].  We consider directory systems,
   which form part of any map and encap solution, as a separate entity.
   The aim is to be as critical as possible in order to stimulate future
   activity before drawing some conclusions.




2.  Design Goals

   In order to compare the solutions we need to understand the full
   breadth of requirements for a future routing proposal.  The first
   eight are direct echoes of the requirements in [Goals]; the later
   requirements are, we feel, not sufficiently highlighted in that
   draft.  Although this is the list of requirements for the new
   routing architecture, there is no need for all these features to be
   implemented within one protocol.  For example, making it easy for
   networks to change provider may mean that the edge network addresses
   need to be decoupled from those in the core.  However, an
   alternative approach is to develop automated tools that can smoothly
   manage address changes of hosts, routers and other elements (access
   control lists for example) within an edge network.  Multi-homing may
   be managed by the routing system, or the routing system might
   simply(!) expose multiple paths that can be used by another
   mechanism to support multi-homing.  However, we feel that any
   routing proposal should make clear how well the additional features
   could be supported in order to assess the whole solution.

2.1.  Router Scalability

   Memory and processing requirements are growing all the time; already
   routers need to be upgraded every 2 to 5 years.  Many people believe
   the rate of growth is faster than Moore's law, meaning that cost
   could go up significantly or technology could start to fail.  The
   reason behind this growth appears to be a decreasing reliance on
   address aggregation rather than the absolute growth of the system
   itself.  Also, it is not necessarily the actual memory requirements
   that are the problem, but the need to be able to read and write those
   memories quickly, because there is a high rate of churn in the
   routing system.  The churn in the system also adds a processing
   requirement.  Churn appears to increase with increasing
   deaggregation.

2.2.  Traffic Engineering

   Traffic engineering is the ability to direct traffic along non-
   default path(s).  The ability to control the path taken by inbound
   traffic is as important as the ability to control the outbound path.
   Both of these are non-trivial today: control of the inbound data path
   requires manipulation of BGP messages.  Control of the outbound path
   can be made difficult as a result of ingress filtering blocking data
   which appears to have been spoofed.




2.3.  Multi-Homing

   A multi-homed site can connect to the Internet via more than one
   network provider.  Today this is done by injecting multiple, more
   specific address prefixes into the global routing table, which
   therefore impacts on BGP's scalability.  Any solution should
   therefore have a simple and effective means to manage multi-homing.
   Since one reason for multi-homing is to improve resilience, the
   multi-homing solution must make clear how failures are detected and
   repaired.  This type of edge network failure management should
   ideally not impact on the convergence and stability of the global
   routing system.  Recovery-time requirements vary tremendously, from
   a few seconds to as small as possible (the ms range).  These
   different recovery times are associated with different types of
   application using the network; in general the recovery time should
   ensure running sessions are not affected.

2.4.  Mobility

   Increasingly nodes and sites will be mobile.  An efficient, scalable
   means is needed to support mobility.  When a host moves, hosts and
   routers that are not in communication with the mobile host should not
   need to be informed of the mobility.  When a network moves, the
   number of routers informed of the change should be minimized.

2.5.  Ease of changing providers

   This is often cited as a key reason behind the increasing use of
   provider independent (PI) addresses, and hence a key reason behind
   routing scalability issues.  Using PI addresses, end-sites can change
   providers without renumbering (or at least with much less
   disruption).  Customers may want to change service provider on a
   yearly basis.  A future routing system should make it easy for
   customers to change provider with minimal configuration requirements
   on the customer.  The process should be as simple as possible, almost
   certainly automated.

2.6.  Routing Quality

   Quality of routes includes convergence time, stability of path, loss,
   delay and stretch.  The first four parameters are of interest to the
   network user.  The last parameter gives an indication of the
   efficiency of use of network transmission resources.

2.7.  Routing Security

   The new architecture should be at least as secure as the existing
   system.




2.8.  Deployability

   The Internet is stagnating; it is amazingly difficult to get new
   networking solutions deployed [Handley].  The solution MUST be:

   o  Technically deployable

   o  Incrementally deployable

   o  All aspects of operation with legacy systems must be well
      understood.  Applications (that have not hard-coded addresses into
      themselves) would see no changes.  An updated or legacy node in a
      part of the Internet that uses the new system should be (where the
      local security policy permits) reachable by legacy or updated
      nodes operating within legacy or updated networks.  (There is no
      need to try and enable an updated node in a NATed legacy network
      to be reachable; it may in fact remain necessary that this node
      should not be reachable in such circumstances.)

   o  There must be a motivation for the person or organisation to
      deploy the system, as serving the greater good is not sufficient.
      This benefit (or a subset) should exist for an isolated
      deployment.  It seems probable that new functionality (rather than
      faster or even cheaper) is most likely to motivate deployment.
      This is because any new technology always has hidden costs, such
      as training people to install and manage it.  Examples of
      new functionality could be a security improvement, in and out-
      bound reliable traffic engineering, or visibility of alternative,
      low delay or highly reliable data paths.  However, it is difficult
      to predict what new features or services will attract users.

   o  Flexible service models should be supported, in other words a
      user, edge site or ISP should be able to deploy the service on
      behalf of others.

   o  Key players must not be disadvantaged, or they may try to obstruct
      standards or restrict deployment.  A specific aspect of this to
      highlight is how network providers today use policy control.
      Providers are unlikely to support any scheme which makes policy
      management more difficult than today.  They are likely to require
      the ability to check that routes are as diverse as possible, to
      choose routes based on cost and performance and to avoid routes
      leaving or entering a specific country or domain.

   If the constraints of operation with legacy systems and flexibility
   in location of functionality are met, then host upgradeability is a
   non-issue.  However, host upgrades are not impossible, and recent
   history suggests they might be easier than network evolution.  Recent
   host upgrades for ECN, IPv6 and RSVP-based QoS are not in general
   being supported by similar network evolution.

2.9.  Unclear Requirements

   Two other requirements are mentioned in [Goals].

   The first is that mechanisms used must be first class elements within
   the architecture.  We are not totally sure what this means.

   The second requirement is that location and identification should be
   able to be decoupled.  It is required that a solution for scalable
   routing is compatible with (but does not require) a solution that
   separates the host identification from the host location name-space.
   This separation should improve the flexibility of the Internet.  The
   significance of this requirement is unclear, perhaps because none of
   the proposed solutions fails to meet it.  It may only become clearer
   if further assumptions or requirements are placed on the
   identification, such as cryptographic authentication or the need to
   be able to reverse map from location to identifier.

   Less often mentioned are two other requirements that we believe are
   nevertheless critical:

2.10.  Address Shortage

   Current predictions are that the unallocated IPv4 address space will
   soon be used up, with suggestions [Huston] that IANA will run out of
   addresses by 2011, with the RIRs running out by 2012.  Routing and
   addressing are closely related, and the impact of the scheme on the
   address shortage problem should be considered.  It may be easier if
   one major network overhaul is required rather than two.

2.11.  Failure Management

   If the routing system never encountered any changes, then it is
   likely that there would be no scalability issues.  Minimizing
   connectivity disruption in the presence of failures is critical as
   failure recovery is one of the drivers behind multi-homing.  Some
   end-sites have a target of no more than 10 ms downtime - although it
   is not clear that this would ever be achievable!  Other sites may be
   happier with a few seconds disruption.  Any scheme should make it
   clear how failures are handled, and should be no less robust to
   failure than today's systems.




3.  Related Working Options

   What can we learn from running systems today?  This section is by no
   means complete, but rather presented as a starting point.  In
   particular, we have no hard data on systems that exactly match
   anything that any of the new systems is trying to do.  We are simply
   placing a stake in the ground at a loosely justifiable point, and
   asking people to move it.  Note, however, that at this stage we are
   looking at order-of-magnitude figures and hence order-of-magnitude
   movement of the stake!

3.1.  NAT

   NAT solves the problems of address shortage and provider
   independence.  Hence, whatever we may feel about the architectural
   violations of NAT, we could imagine simply promoting the greater use
   of NAT to reduce the scalability problem as provider-dependent
   addresses can then be more easily promoted.  It is after all clearly
   deployable.  Many edge sites, mobile operators and even some ISPs are
   already using NAT, often claiming increased security in addition to
   the other benefits.  For example, by hiding the addresses of servers
   and routers inside the network, it makes it a bit harder for an
   attacker to try and establish a session with these devices.

   A NAT box can control traffic flows over different links if it is
   multi-homed, thus providing some traffic engineering capabilities.
   In particular, for sessions that are started behind the NAT, the
   inbound and outbound data paths can be controlled by the choice of
   address that the NAT box uses for the session.  One could imagine
   enhancements to NAT that would enable widely separated NAT boxes to
   communicate to support different multi-homing architectures.

   Of course, there are issues with NAT, which is why it has never been
   proposed as a solution to the routing system scalability problem;
   most significantly it breaks the end to end semantics of the
   Internet.

   However, it is interesting to note that NAT is typically not used by
   the larger sites, and it appears to be performance rather than any
   purist objection that leads to this.

   The performance limitations come from the fact that NAT requires a
   high level of per-flow tracking and per-packet modifications.
   Because there are so many flavors of NAT, it is hard to get
   quantifiable information on the performance.  For NAT-PT, we should
   expect to map one IP address to 65,000 different sessions using the
   port identifier.  Cisco's web site [Cisco] suggests that a typical
   NAT router would not need to support more than 10,000 translations,
   and based on the same source, 128,000 such sessions would take
   40 MBytes of DRAM.

   Assuming we can easily support 128,000 NAT sessions, we can then
   estimate how many users this corresponds to.  Each TCP flow is
   mapped to a different NAT session.  A peer to peer application may
   run 100 concurrent sessions.  Perhaps only 10% of an ISP's customers
   are peer to peer users; the remaining 90% will typically have a low
   number of concurrent connections, say 5.  So on average a customer
   has 14.5 active TCP sessions, meaning that the NAT as described can
   handle 8,827 users.  This might mean that universities and medium
   enterprises could all be placed behind NAT devices, but larger
   corporate bodies and large ISPs would need something a little
   different, or very many co-ordinated NAT boxes.  If we assume that
   the mappings maintained are between pairs of IP addresses rather than
   individual TCP sessions, then we may be able to handle 3 times
   that number of users behind a single device.
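
   The estimate above can be reproduced with a few lines of arithmetic.
   The following Python sketch is only a worked version of that
   calculation; the constant and function names are ours, and every
   figure is one of the illustrative assumptions in the text rather
   than a measurement.

```python
# Worked version of the NAT capacity estimate.  All figures are the
# illustrative assumptions used in the text, not measurements.
NAT_SESSIONS = 128_000   # sessions one NAT device is assumed to hold
P2P_FRACTION = 0.10      # share of customers running peer to peer
P2P_SESSIONS = 100       # concurrent sessions per peer to peer user
OTHER_SESSIONS = 5       # concurrent sessions per other user


def mean_sessions_per_user(p2p_fraction, p2p_sessions, other_sessions):
    """Average number of concurrent TCP sessions per customer."""
    return p2p_fraction * p2p_sessions + (1 - p2p_fraction) * other_sessions


def users_supported(nat_sessions, sessions_per_user):
    """Whole number of users that fit into the NAT session table."""
    return int(nat_sessions // sessions_per_user)


avg = mean_sessions_per_user(P2P_FRACTION, P2P_SESSIONS, OTHER_SESSIONS)
users = users_supported(NAT_SESSIONS, avg)
print(f"{avg:.1f} sessions per user, {users} users per NAT device")
```

   Changing the peer to peer fraction or the per-user session counts
   shows how sensitive the user estimate is to these guesses.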

   NetFlow is another networking tool that supports per-flow packet
   processing.  Cisco [Cisco] claims that its NetFlow accounting tool
   can support 128,000 simultaneous connections - similar in scale to
   our NAT estimations.

   In summary, per-flow processing of each packet is likely to lead to
   limitations on how fast edge devices can operate, putting a limit on
   how many users could be behind such a box.  Routers work fast today
   because they are highly optimised towards a single simple forwarding
   duty.

3.2.  Mobile networks and directory systems

3.2.1.  3G Systems

   GSM and 3G cellular systems already have a locator-identity split.
   The phone number acts as an identity.  The Home Location Register
   (HLR) contains a mapping of the phone number to its current location,
   identified as a routing area.  A routing area will contain tens or
   even hundreds of cells, which range in size from a few metres (pico
   cells in buildings in cities) to several kilometres in the
   countryside.  To find a user, the HLR is used to discover which
   routing area was last known to contain the phone.  All nodes in that
   routing area then receive a paging message in an attempt to discover
   the actual location of the user.  This temporary location mapping is
   then held by the router responsible for that routing area.

   The HLR system will typically have up to tens of millions of users in
   this centralized database.  The HLR does not know if the end node is
   reachable.  This is discovered during the paging process, which means
   that it can take 5 to 10 seconds in order to make initial contact
   with a mobile device.  When a user moves around a routing area, it is
   not necessary to update the HLR of the location unless the node
   changes routing area.  The size of the routing area depends not on
   how fast the HLR can be updated but on how much paging is expected as
   paging wastes the resources (battery power) of all phones in the
   area.
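
   The two-level lookup described in this section (a coarse directory
   entry plus local paging) can be sketched as follows.  This is a toy
   model under our own assumptions: the class, method and cell names
   are hypothetical and are not taken from any GSM or 3G specification.

```python
# Toy model of the HLR-plus-paging lookup.  All names are hypothetical
# and chosen for illustration only.
class CellularDirectory:
    def __init__(self, cell_to_area):
        self.cell_to_area = dict(cell_to_area)  # cell id -> routing area
        self.hlr = {}            # phone number -> last known routing area
        self.current_cell = {}   # true position; not stored in the HLR
        self.hlr_updates = 0     # how often the central database changed

    def move(self, number, cell):
        """Attach to a cell; update the HLR only on a routing area change."""
        area = self.cell_to_area[cell]
        self.current_cell[number] = cell
        if self.hlr.get(number) != area:
            self.hlr[number] = area
            self.hlr_updates += 1

    def locate(self, number):
        """Look up the routing area, then page every cell in that area.

        Returns (cell that answered or None, number of cells paged).
        """
        area = self.hlr[number]
        paged = [c for c, a in self.cell_to_area.items() if a == area]
        answered = [c for c in paged if self.current_cell.get(number) == c]
        return (answered[0] if answered else None), len(paged)


directory = CellularDirectory({"c1": "A", "c2": "A", "c3": "B"})
directory.move("subscriber-1", "c1")
directory.move("subscriber-1", "c2")  # same routing area: HLR untouched
directory.move("subscriber-1", "c3")  # new routing area: HLR updated
print(directory.hlr_updates, directory.locate("subscriber-1"))
```

   The sketch makes the trade-off visible: larger routing areas mean
   fewer HLR updates but more cells paged on each lookup.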

   Handover (without session disruption) is only possible within one
   service provider network, as much as anything due to the time taken
   to manage security associations.  Handover is managed locally with
   co-ordination between the different base stations.  A make before
   break system is used to minimize service disruption.

   Roaming occurs when a node changes service provider network.  Here,
   the HLR will be updated to point to a Visitor Location Register (VLR)
   in the visited network, which is updated with the routing area
   associated with the node.  The (hand-crafted) peering arrangements
   to allow roaming are sorted out by management processes.

3.2.2.  Mobile IP

   Mobile IP (MIP) uses a different scheme.  Data is directed via a home
   agent which is updated with the current location of the mobile
   device.  (In one sense this is similar to schemes where the data is
   re-directed via the mapping system).  Mobile IP is much less widely
   deployed.  Reasons for this could include the performance
   implications of the tunneling process and the amount of per-node
   state management at the home agent.  Designing adequate security
   mechanisms has also troubled MIP development.

3.2.3.  DNS

   Within the Internet, we already have experience with a large
   distributed database for mapping from name to address.  DNS works
   well, so well in fact that people are loath to change it in case it
   gets disrupted; after all it is a critical piece of the
   communications infrastructure and a user is unlikely to care if it is
   a routing or DNS problem that disrupts their on-line shopping trip -
   it will be equally broken.

   It is usually stated that DNS works well because of the hierarchy in
   the name space (although the structure is relatively flat at about 3
   levels) and the aggressive use of caching.  Time to live (TTL) values
   are typically set at about an hour.  However, recent studies [DNS]
   are beginning to suggest that cache hit rates are lower than
   thought.  This implies that caching is not as vital as previously
   thought and that much shorter TTLs, of the order of a few hundred
   seconds, would not noticeably degrade DNS performance.  This is in
   part because a DNS update message is only processed locally; there
   is no attempt to keep all DNS servers up to date.
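
   The interaction between TTL and cache hit rate can be explored with
   a minimal sketch such as the one below.  It is not a DNS
   implementation: TTLCache, its resolver callback and the injected
   clock are hypothetical names of our own, and the address used is
   from a documentation range.

```python
# Minimal TTL cache sketch.  TTLCache and its parameters are
# hypothetical names, not part of any DNS library.
import time


class TTLCache:
    def __init__(self, resolve, ttl_seconds, clock=time.monotonic):
        self.resolve = resolve   # function: name -> address (upstream query)
        self.ttl = ttl_seconds
        self.clock = clock       # injectable so that time can be simulated
        self.entries = {}        # name -> (address, expiry time)
        self.hits = 0
        self.misses = 0

    def lookup(self, name):
        now = self.clock()
        cached = self.entries.get(name)
        if cached is not None and cached[1] > now:
            self.hits += 1
            return cached[0]
        # Entry is absent or has expired: ask upstream and cache the answer.
        self.misses += 1
        address = self.resolve(name)
        self.entries[name] = (address, now + self.ttl)
        return address


fake_now = [0.0]  # simulated clock, in seconds
cache = TTLCache(lambda name: "192.0.2.1", ttl_seconds=300,
                 clock=lambda: fake_now[0])
cache.lookup("www.example.com")  # first query: miss
fake_now[0] = 200.0
cache.lookup("www.example.com")  # within the TTL: hit
fake_now[0] = 400.0
cache.lookup("www.example.com")  # past the TTL: miss again
print(cache.hits, cache.misses)
```

   Replaying a real query trace through such a cache with different
   TTL values is one way to test the claim that shorter TTLs cost
   little.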

   A host typically begins each transport layer session with a DNS
   lookup.  This can take up to 2 seconds to resolve, although it is
   usually much quicker.

   The DNS system is held together by IP addresses that are hand-coded
   into the system.  A question to answer is what happens if the IP
   address is replaced by a persistent identifier and a transient
   locator.  If the DNS servers need to be identified and their current
   location found before a DNS query can be resolved, then the
   performance of the identifier resolution system will have a big
   impact here, as several DNS servers often need to be found to achieve
   a single name resolution.  Further, if the DNS system is the
   identifier resolution system, we would have a nasty circular
   dependency.

3.2.4.  Summary

   We can make some observations about the systems that work well.  They
   seem to have extremely low functionality with low rates of change of
   the data.  These changes are effectively confined (localized).  Data
   changes are not propagated around the system.  Hard-wiring of
   directory associations is commonplace.  Perhaps an automated
   discovery and topology building protocol would cause more problems
   than it is worth for this type of system?  It is possible that
   automation is only required for systems with large amounts of change.

3.3.  The routing system

   So having considered systems that work well, what are the
   characteristics of the routing system?  This section is incomplete.

   A mid-tier ISP network may contain twice as many prefixes as the
   core of the Internet - thus we must be careful of designs that
   move complexity from the core to the periphery of the network.

   Open questions include: how many prefixes, how many ASes, how many
   nodes, how many end sites, how many transits, and how big is the
   DFZ?  We have not found definitive answers to these questions.

   The churn rate is very high and very variable.  If a network receives
   on average 400 BGP messages a minute, it may easily expect to have
   8,000 or even 80,000 updates at peak periods of intense instability.
   Churn is typically slowed by the introduction of timers to delay
   sending of messages.  Often, however, these timers are turned as low
   as possible (to the point where the processing capability of the
   routers becomes noticeable) to try and maximise network availability.
   Therefore, ideally, failure repair should be localised.

   Many of these messages do not really indicate true physical
   problems.  A site may rapidly flap its links in an effort to
   manipulate the flow of data between different multi-paths.  A site
   may be performing mild policy updates.

   Really, the whole section needs more on policy, and more on why the
   routing system is abused as it is.


4.  Map and Encap Schemes

4.1.  Routing System Scalability

   These schemes aim to encourage use of provider dependent addresses
   thus leading to aggregation and removing the load from the core
   routing system.  This is achieved by making addresses in the edge
   network independent from those in the core transit system, so that
   provider lock-in is avoided.

   All these schemes require a mapping system to translate between edge
   and core network locators.  The scalability of mapping system is
   uncertain.  We shall assume that the mapping system holds essentially
   static information.  We further assume that (using LISP terminology)
   End Point Identfiers (EID) are aggregatable so a system of required
   size could be built.  It is probable that this system could be built
   to store and return all locators associated with an end point
   identifier prefix range.  Issues that would impact the probable
   scalability of this system are

   o  whether the system needs to propagate this information globally,
      in which case it would become very sensitive to churn rate and
      bandwidth and could not sensibly be used for mobility management,
      for example

   o  whether the system is used to propagate policy or traffic
      engineering information, as all the evidence is that this
      information changes very rapidly
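
   To make the assumed behaviour concrete, the following minimal sketch
   returns all locators registered for the longest EID prefix covering
   an address.  The table contents, the (locator, priority) tuples and
   the function name are hypothetical; none of the proposals is being
   quoted here.

```python
import ipaddress

# Hypothetical mapping database: EID prefix -> list of (locator, priority).
MAPPINGS = {
    ipaddress.ip_network("10.1.0.0/16"): [("192.0.2.1", 1), ("198.51.100.1", 2)],
    ipaddress.ip_network("10.1.2.0/24"): [("203.0.113.1", 1)],
}


def lookup(eid):
    """Return all locators for the longest EID prefix covering `eid`."""
    addr = ipaddress.ip_address(eid)
    matches = [net for net in MAPPINGS if addr in net]
    if not matches:
        return []                      # no mapping known for this EID
    best = max(matches, key=lambda net: net.prefixlen)
    return MAPPINGS[best]


more_specific = lookup("10.1.2.5")     # covered by both prefixes, /24 wins
aggregate = lookup("10.1.9.9")         # only the /16 covers this EID
unknown = lookup("10.2.0.1")           # no covering prefix
```

   A static table of this kind is easy to scale by aggregation; the two
   issues listed above arise once its entries have to change quickly.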

   The third item to be considered is the edge routers, which may need
   to do per-flow packet processing.  This processing may be required to
   manage reachability information (is it sufficient to hold a mapping
   to the core locator of the edge router and to know that the lower
   layer routing system thinks this address is still valid, or do we
   need to know that the higher layer functionality is alive?  MPLS
   experience suggests the latter is useful.)

   Further, the LISP description of multi-homing management seems to
   imply per-flow packet processing (for example, processing of the
   headers on return packets of a flow to discover which of the possible
   edge routers are prepared to handle this session).  If per-flow
   packet processing is required, we may run into scalability problems
   as in NAT routers today.  Is the per-flow assumption fair?  If we
   were considering all flows to a specific tunnel end-point, perhaps
   there may be some way to aggregate information?  This would depend on
   the location of the tunnel end points.  If they are near to the
   network edge it is quite likely that there will be a limited number
   of flows heading towards a specific tunnel router.  The low cache
   hit rates on DNS support the idea that flows are widely distributed.
   If the edge routers are near the core, we then introduce a scaling
   problem behind the edge routers, where all networks now have provider
   independent address spaces.  Since the absolute size of the mid-tier
   networks is greater than that of the DFZ, adding scaling pressure
   here is unlikely to be a good idea.
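
   The difference between per-flow state and state aggregated per
   tunnel end point can be illustrated with a small count.  The flows,
   addresses and ETR names below are invented for the example, and the
   helper is only a sketch of the bookkeeping, not of any real ITR.

```python
from collections import namedtuple

Flow = namedtuple("Flow", "src dst sport dport proto")


def state_entries(flows, etr_for_dst):
    """Count the entries an ITR would hold per flow and per tunnel end point."""
    per_flow = len(set(flows))
    per_etr = len({etr_for_dst[f.dst] for f in flows})
    return per_flow, per_etr


# Hypothetical destination-to-ETR assignment and traffic sample.
etr_for_dst = {"10.1.0.5": "ETR-A", "10.1.0.6": "ETR-A", "10.2.0.9": "ETR-B"}
flows = [
    Flow("10.9.0.1", "10.1.0.5", 40000, 80, "tcp"),
    Flow("10.9.0.1", "10.1.0.5", 40001, 80, "tcp"),
    Flow("10.9.0.2", "10.1.0.6", 40002, 443, "tcp"),
    Flow("10.9.0.3", "10.2.0.9", 40003, 53, "udp"),
]
per_flow, per_etr = state_entries(flows, etr_for_dst)
```

   Aggregation only helps when many flows share a tunnel end point.  If
   flows are widely distributed, the two counts approach each other.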

   Of course, another way to consider these schemes is to assume that
   they do nothing apart from append a new packet header at the edge
   router.  In this case, a better analogy would be with MPLS, where
   the primary scalability worry to date comes from lack of labels (the
   label field is only 20 bits wide).  The main issue with MPLS is the
   ability to verify reachability, rather than processing and memory
   requirements.  Certainly MPLS has yet to be implemented inter-domain
   and is not suggested as a solution itself.

   In summary, the main scalability questions may become clear only
   once there is a better understanding of how multi-homing with
   traffic engineering is to be managed.

4.2.  Traffic Engineering

   Traffic engineering and policy controls may require co-ordination
   between two layers.  It requires the ITR to respect ETR instructions.
   It is probable that some policy opaqueness is lost.  One interesting
   question, for example, is how peering relationships are managed.  To
   be reachable by any node, the ETR must be advertised openly in the
   mapping system; once this is done, how is it ensured that only
   networks with distinct peering relationships use the more expensive
   links?

4.3.  Multi-Homing

   The mapping system may return many possible locators.  The edge
   routers manage multi-homing using edge-to-edge communications.  In
   LISP it is described how an ITR will spray packets from a flow across
   the different possible ETRs, according to the weights associated with
   the ETR devices.  The ETRs communicate back to the ITR(s) which
   addresses they would prefer to see used.  This is used for traffic
   engineering as well as simply reachability purposes.  If this
   information is piggybacked onto a data session (which may raise
   security questions [Bagnulo]), how is this managed for UDP
   applications which may have the return control channel as a different
   session to the data channel?  This also breaks the model that TCP
   has, of packets typically following a single path, which may have
   unfortunate implications both for congestion control and for TCP
   performance.  If we assume that a TCP flow is kept together, but that
   packets destined to the same end site are spread amongst the edge
   routers, we now definitely have per-flow state and, unlike ECMP,
   associated packet processing (adding the correct outer header).
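
   One possible way to keep a flow together while still honouring ETR
   weights is to hash the flow identifier onto the weighted set of
   ETRs.  The sketch below is an assumption about how this could be
   done, not a description of LISP itself; the function name, the use
   of SHA-256 and the example weights are all illustrative.

```python
import hashlib


def pick_etr(flow, etrs):
    """Map a flow 5-tuple onto one ETR, in proportion to the ETR weights.

    `etrs` is a list of (locator, weight) pairs with positive integer weights.
    """
    total = sum(weight for _, weight in etrs)
    digest = hashlib.sha256(repr(flow).encode()).digest()
    point = int.from_bytes(digest[:8], "big") % total
    for locator, weight in etrs:
        if point < weight:
            return locator             # same flow always lands here
        point -= weight


etrs = [("192.0.2.1", 3), ("198.51.100.1", 1)]      # hypothetical 3:1 weights
flow = ("10.9.0.1", "10.1.0.5", 40000, 80, "tcp")
choice = pick_etr(flow, etrs)
```

   The hash avoids an explicit per-flow table only while the ETR set
   and weights stay unchanged.  As soon as an ETR signals a new
   preference, existing flows would move to a different path unless the
   ITR remembers its earlier choice, which is exactly the per-flow
   state discussed above.  The hash and the outer header must in any
   case be computed for every packet.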

4.4.  Mobility

   It may be possible to manage simple portability by updating the
   mapping system so that new sessions would start correctly.  This
   assumes that the mapping system operates like DNS today, without the
   information needing to be distributed globally.  In-session mobility,
   however, requires the mappings to be updated directly and
   dynamically.  Discussions on the mailing lists [MailList] to date
   imply that this is difficult, with suggestions that this
   functionality should rely on application-specific signalling.  In
   that case, it is likely that, should source and destination move
   simultaneously, the session will be dropped, unless the edge routers
   offer a forwarding functionality.

4.5.  Changing Provider

   This is by design extremely simple as only the mapping system needs
   updating.  However, there may still be issues ensuring packet
   filters and firewalls are correctly configured.  These have been
   covered to some extent for IPv6 in RFC 4192 [RFC4192], where
   make-before-break techniques have been described, but these may not
   be suitable on the whole for IPv4.

4.6.  Route Quality

   Since multiple edge routers can be associated with a name, the
   network system may have a greater choice of routes to use to reach a
   specific device (although it is not clear that this control could be
   passed back to the data sources).

   If the mapping replies take a long time, a TCP session start up may
   be disrupted.  Similarities with ARP are not necessarily relevant:
   ARP is an extremely local process that can resolve very quickly, and
   ARP entries are normally within a cache because they are used
   frequently.

   Since multi-homing requires a flow to be sent along diverse paths,
   TCP may see lots of out of sequence packets and congestion control
   mechanisms may not work as expected.

   It is not clear how easy it is to solve the problem of tunnel
   overheads and packet fragmentation, or if indeed that is a major
   issue.
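
   The size of the overhead is simple arithmetic once the encapsulation
   is fixed.  The sketch below assumes an outer IPv4 header without
   options (20 bytes), an optional UDP header (8 bytes) and a shim
   header whose 8 byte default is purely an assumption for the example;
   the real figures depend on the scheme chosen.

```python
IPV4_HEADER = 20   # bytes, IPv4 header without options
UDP_HEADER = 8     # bytes


def inner_mtu(path_mtu, shim_bytes=8, use_udp=True):
    """Largest inner packet that fits in one outer packet without fragmentation."""
    overhead = IPV4_HEADER + (UDP_HEADER if use_udp else 0) + shim_bytes
    return path_mtu - overhead


def needs_fragmentation(inner_len, path_mtu, **encap):
    """True if an inner packet of `inner_len` bytes no longer fits after encapsulation."""
    return inner_len > inner_mtu(path_mtu, **encap)


full_size = needs_fragmentation(1500, 1500)    # a full-sized inner packet
trimmed = needs_fragmentation(1464, 1500)      # inner packet reduced by the overhead
```

   Under these assumptions any host that sends full-sized 1500 byte
   packets across a 1500 byte path will trigger fragmentation or path
   MTU discovery at the tunnel router.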

4.7.  Routing Security

   Pending further thought.  The security analysis so far performed
   [Bagnulo] was on LISP version 1.

4.8.  Deployability

   o  Technically deployable

   o  It is not clear how incrementally deployable this is.  If it is
      required that (PI) EID space is advertised in the legacy routing
      system to enable communication with legacy nodes, then the scaling
      pressures on the routing system will shoot up dramatically during
      the early stages of deployment.

   o  Operation with legacy systems is not well understood

   o  There is no clear motivation why an edge system should deploy this
      scheme.  Since provider lock-in can be avoided today using
      existing well known techniques, there is no motivation for an end
      site to choose LISP over the familiar technology.  Traffic
      engineering and multi-homing control have been mentioned as
      possibilities to motivate a deployment, but to date are too poorly
      described to be able to judge if they meet all requirements well.

   o  There may be opposition as traffic engineering and policy control
      require communications between ITR and ETR devices, which may
      reduce the opaqueness of the policy control compared with existing
      techniques.  Policy control may become more complicated.

4.9.  Address Shortage

   Although described for IPv4, which is seen as an advantage, these
   schemes are essentially IP version agnostic.  Unlike the NAT
   solutions of today, the EIDs in any domain must have global
   uniqueness for the mapping system, thus potentially making the
   problem worse.  Although better allocations of addresses may become
   possible, it is unlikely that addresses can be easily recovered.

4.10.  Failure Handling

   These schemes always require an additional global database
   infrastructure.  This is therefore as critical a resource as the
   current DNS system is.  All things being equal, the addition of this
   would decrease the resilience of the overall Internet.  Further,
   fault tracing would become yet more complex.  The underlying routing
   system takes care of path failures between the tunnel routers.
   However tunnel routers become critical points of failure if they hold
   state.


5.  Translation Schemes

5.1.  Routing System Scalability

   This approach aims to encourage use of provider dependent addresses,
   so removing the load from the core routing system.  It does this by
   providing a different way to manage multi-homing.  Since the edge
   routers are not state holding, and only need to tamper with the first
   few packets of a flow, the scalability of these edge routers should
   be better than that of current NAT devices.

5.2.  Traffic Engineering

   In-bound and out-bound traffic engineering is managed through either
   the node or the egress router setting the routing portion of the
   locator.  For in-bound sessions, this only works when both ends are
   translation aware.  Existing policy control is possible, although
   there is motivation to move to alternative ways to achieve the same
   goal.  For example, AS prepending to indicate that a route should be
   avoided could be replaced with a translation to the preferred route.
   Since this could work more reliably than AS prepending, there is a
   driver for change.

5.3.  Multi-Homing

   For multi-homed edge networks (as opposed to multi-homed hosts) this
   can be controlled by edge networks but is visible to end hosts.
   Applications bind only to the identifier part of the address.  It is
   assumed that each multi-homed route is identified by a different
   locator.
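
   The idea of rewriting only the routing portion of an address can be
   sketched as follows.  The 64/64 bit split between locator and
   identifier, the function names and the documentation prefixes used
   are assumptions made for illustration; they are not taken from any
   of the translation proposals.

```python
import ipaddress

ID_MASK = (1 << 64) - 1   # assumed split: the low 64 bits are the identifier


def identifier_of(addr):
    """Return the identifier part that applications would bind to."""
    return int(ipaddress.IPv6Address(addr)) & ID_MASK


def translate(addr, new_prefix):
    """Replace the routing (locator) part of `addr`, keeping the identifier."""
    identifier = identifier_of(addr)
    prefix = int(ipaddress.IPv6Network(new_prefix).network_address) & ~ID_MASK
    return str(ipaddress.IPv6Address(prefix | identifier))


via_provider_a = "2001:db8:a::1234"
via_provider_b = translate(via_provider_a, "2001:db8:b::/64")
```

   Both addresses carry the same identifier, so a session bound to the
   identifier survives the change of locator, while each locator
   selects a different multi-homed route.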


5.4.  Mobility

   Since applications can tolerate the address changing, mobility should
   be simplified.  Many of the functions to support multi-homing are
   like those required to support mobility but it is not clear that the
   details and overlaps have been fully identified, especially with
   regard to security.

5.5.  Changing Provider

   This will be complicated, and additional protocol support will be
   required.  As well as DNS updates, DHCP reconfiguration of hosts,
   and firewall and filter settings, the intra-domain routing system
   may be affected.  This latter problem may be made more manageable if
   internal routers can mask out the network address portion within the
   internal routing system.  However, this masking may make it harder
   to do efficient routing inside the network or to manage edge node
   failures.

5.6.  Route Quality

   The scheme adds minimal additional delays.  All data translations are
   based only on locally held, locally visible material.  Alternative
   routes, as indicated by different address pairings, are visible to
   the end devices.

5.7.  Deployability

   o  Technically deployable

   o  Proxy support, to avoid upgrading of hosts, may look very like
      NAT, with a break in the end-to-end semantics

   o  Some of the new benefits over the existing system (specifically
      in-bound TE) are only evident when there is a large deployed base.

   o  Operation with legacy hosts is possible provided all Six/One
      elements can identify them as legacy hosts

   o  Motivation is based on the additional feature of in-bound TE.  The
      ability to see and use different routes, as identified through
      different addresses may also be valuable.

   o  Hosts, edge devices and possibly internal networks all need to be
      upgraded.


5.8.  Address Shortage

   This approach forces an upgrade to IPv6.

5.9.  Failure Handling

   Since the edge devices are expected only to translate on the first
   packets of a flow (relying on the end host to use the correct address
   once it is made aware), the edge devices become less critical as they
   are not state holding.  It has been suggested that should the edge
   router or access link fail, a local mechanism (similar to handover in
   a cellular system) can be used to achieve fast recovery.

   The scheme relies on the DNS system to provide the locator mapping.
   Currently DNS servers are found through the hard-coding of related
   DNS server addresses.  If addresses become transient, what does this
   mean for the DNS system?  Thus, although a separate resolution system
   is not required, some consideration of DNS use would still be needed.
   Would DNS servers need to be logically within the transit (provider
   independent address) zone?


6.  Mapping System Design

   The concept of tunnelling IP data packets across a large scale
   network is not new.  Many years ago there was much activity put into
   the design of networks that could run IP over ATM clouds.  This
   activity failed because of the difficulty of managing the mapping
   process - hence the design of MPLS which uses a single IP control
   plane across the entire network.  Are there any lessons to be learnt
   from this experience?

   There appear to be three basic options: push, pull or route through.

6.1.  Push

   If the full database is pushed to all tunnel routers, these devices
   may end up with larger storage requirements than current routers
   because all end sites now have provider independent addresses and so
   no aggregation is possible here.  There is also the problem of
   keeping the database securely up to date.  This is the way that name
   to address mappings were originally managed, before DNS was
   introduced.  This new database could however be smaller than DNS
   because you have a locator associated with an EID prefix (ie roughly
   equivalent to having a locator associated with bt.com, not one for
   www.bt.com, mail.bt.com etc).  There have been claims that this
   mapping system would be easier to manage than the current routing
   system because it can be the same everywhere, whereas a routing table
   varies according to the router.  However, link state protocols
   actually distribute a topology database which is the same everywhere,
   and they are not used for very large scale networks because there is
   no localisation of changes and they are considered un-scalable.

6.2.  Pull

   DNS is an example of a pull system.  It enables localisation of
   changes so could be used to carry more dynamically varying
   information, although updates should occur less often than cached
   entries expire.  The disadvantage of this scheme, when used
   mid-flight, is the additional delay that will be introduced.  This,
   as well as being annoying, may also upset protocols such as TCP.
   Further, as the query is performed by a network element, this opens
   up the potential for a DoS attack where a source simply sends
   initial packets to unknowable destinations.
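
   The behaviour of a pull system with cached answers can be sketched
   as below.  The class, its resolver callback, the 60 second lifetime
   and the addresses are hypothetical and only serve to show the two
   effects just described: stale answers within the cache lifetime, and
   one query per previously unseen destination.

```python
class PullCache:
    """Cache of mapping lookups in a tunnel router, filled on demand."""

    def __init__(self, resolver, ttl):
        self.resolver = resolver   # callable: eid -> list of locators (hypothetical)
        self.ttl = ttl             # cache lifetime in seconds
        self.entries = {}          # eid -> (locators, expiry time)
        self.queries = 0           # number of queries sent to the mapping system

    def get(self, eid, now):
        hit = self.entries.get(eid)
        if hit and hit[1] > now:
            return hit[0]          # answered locally, possibly stale
        self.queries += 1          # the first packet waits for this query
        locators = self.resolver(eid)
        self.entries[eid] = (locators, now + self.ttl)
        return locators


backend = {"10.1.2.5": ["192.0.2.1"]}
cache = PullCache(lambda eid: backend.get(eid, []), ttl=60)
cache.get("10.1.2.5", now=0)             # miss: one query to the mapping system
backend["10.1.2.5"] = ["198.51.100.1"]   # the site changes its locator
stale = cache.get("10.1.2.5", now=30)    # still the old locator until expiry
fresh = cache.get("10.1.2.5", now=61)    # re-queried after the lifetime ends
```

   A source that sends initial packets to many distinct destinations
   forces one query per destination, whether or not a mapping exists,
   which is the attack surface noted above.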

6.3.  Route Through

   Route-through systems will increase the work expected from name
   resolution servers.  It may lead to inefficient routing.  If this is
   only used for the start of a data flow (and for all short sessions of
   course), then TCP flow rates will frequently be incorrect (too fast
   or slow for the path they have been changed to).  Applications such
   as voice also seem to struggle to cope with large path changes
   because of the delay variation seen.  This might also make fault
   tracing much more complex.


7.  Conclusions

   1.   There is no obvious correct solution.  The two classes of
        solution both aim to increase the use of aggregatable addresses
        and essentially differ in the driver they assume is the more
        critical, ie provider lock-in or multi-homing support.  The
        working assumption should be that both problems must be
        adequately solved by any solution, unless one requirement can be
        proven to be irrelevant.

   2.   We are not really sure if there is a problem, although it could
        be major and if we leave it until we are certain it is likely to
        be too late to solve it.  More importantly, the exact nature of
        the problem (FIB size, RIB size, processing churn, writing FIB
        updates etc) has escaped definition.  A simpler solution may be
        possible.

   3.   Each of the different approaches deserves further research.


   4.   The area that has received least real attention is legacy
        inter-working and partial deployment.

   5.   The mapping system is a real crunch point and needs some serious
        analysis.

   6.   We are focusing on the locator-ID split, but have in reality two
        types of split, one which is recognizable as a locator-identity
        split and the other which could be termed a locator-locator
        split which involves splitting the addressing regions into core
        and edge.  The addition of an identifier has been proposed in
        other quarters for security and authentication reasons.  What
        are the wider implications of a locator space split?

   7.   Compact routing is a completely different routing algorithm that
        essentially trades path stretch for router state.  At present
        there is no way to implement a distributed dynamic version of
        compact routing so this particular protocol may be very far out.
        Nevertheless, there is no apparent study of the potential of
        different routing algorithms.

   8.   Schemes such as HRA which simply look at how we organize the
        routing system are not included.

   9.   ROFL assumes that there is really no need for any locator at all
        and it may be correct.  It assumes that using modern techniques
        (based on DHTs) we could build an adequate system based on
        semantic-free identifiers.  It may be that the problems we face
        are caused by things other than scalability (eg lack of
        accountability means that we get endless pointless update
        messages, and means that there is no back-pressure on
        deaggregation).

   10.  We are looking at the simple schemes; complex schemes such as
        NODE-ID and HRA are not considered.  However, in considering
        small scale changes, are we missing the point that we should
        first have a long term target architecture that any point
        solution should be compliant with?


8.  Acknowledgements

   A preliminary version of this document was prepared for Chinacom with
   help from Sheng Jiang and Xiaohu Xu.

   We are grateful to Olivier Bonaventure and Simon Schuetz for very
   useful comments.


9.  IANA Considerations

   This memo includes no request to IANA.


10.  Security Considerations


11.  References

11.1.  Normative References

   [min_ref]  authSurName, authInitials., "Minimal Reference", 2006.

11.2.  Informative References

   [Bagnulo]  Bagnulo, M., "Preliminary LISP Threat Analysis", 2007,
              <https://datatracker.ietf.org/drafts/
              draft-bagnulo-lisp-threat/>.

   [Cisco]    Cisco, "NAT FAQ", 2008,
              <http://www.cisco.com/warp/public/556/nat-faq>.

   [DNS]      Jung, J., "DNS performance and the effectiveness of
              caching", 2001, <SIGCOMM workshop on Internet
              Measurement>.

   [Goals]    Li, T., "Design Goals for Scalable Internet Routing",
              2007, <Internet draft draft-irtf-rrg-design-goals-01>.

   [Handley]  Handley, M., "Why the Internet only just works", 200, <BT
              Technology Journal>.

   [LISP]     Farinacci, D., "Locator/ID Separation Protocol (LISP)",
              2007, <Internet draft draft-farinacci-lisp-05.txt>.

   [IAB]      Meyer, D., "Report from the IAB workshop on Routing and
              Addressing", 2007, <Internet draft
              draft-iab-raws-report-02.txt>.

   [Huston]   Huston, G., "IPv4 address report", 2007,
              <http://www.potaroo.net/tools.ipv4/index.html>.

   [MailList]
              Farinacci, D., "e-mail thread", 2007,
              <http://www.ops.ietf.org/lists/rrg/2008/msg00232.html>.

   [RFC4192]  Baker, F., Lear, E., and R. Droms, "Procedures for
              Renumbering an IPv6 Network without a Flag Day", RFC 4192,
              September 2005.

   [six-one]  Vogt, C., "Six/one: A solution for routing and addressing
              in IPv6", 2007, <Internet draft
              draft-vogt-rrg-six-one-01>.

   [whittle]  Whittle, R., "Comparing LISP-NERD/CONS, eFIT-APT and
              Ivip", 2007, <http://www.firstpr.com.au/ip/ivip/comp/>.


Authors' Addresses

   Louise Burness (editor)
   BT
   BT Adastral Park
   Martlesham Heath, Suffolk
   UK

   Phone: +44 1473 646504
   Email: louise.burness@bt.com


   Philip Eardley
   BT
   BT Adastral Park
   Martlesham Heath, Suffolk
   UK

   Phone:
   Email: philip.eardley@bt.com


Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).

