Provider Provisioned VPN WG                    N. Finn      (Cisco)
Internet-draft                                 M. Seaman    (Consultant)
Expires: December 2002                         A. Smith     (Consultant)
                                               A. Romanow       (Cisco)


                           Bridging and VPLS
                   draft-finn-ppvpn-bridging-vpls-00


Status of this Memo

     This document is an Internet-Draft and is in full conformance with
     all provisions of Section 10 of RFC2026.

     Internet-Drafts are working documents of the Internet Engineering
     Task Force (IETF), its areas, and its working groups.  Note that
     other groups may also distribute working documents as Internet-
     Drafts.

     Internet-Drafts are draft documents valid for a maximum of six
     months and may be updated, replaced, or obsoleted by other
     documents at any time.  It is inappropriate to use Internet-Drafts
     as reference material or to cite them other than as "work in
     progress."

     The list of current Internet-Drafts can be accessed at
     http://www.ietf.org/ietf/1id-abstracts.txt

     The list of Internet-Draft Shadow Directories can be accessed at
     http://www.ietf.org/shadow.html.

Copyright Notice

     Copyright (C) The Internet Society (2002). All Rights Reserved.


Abstract

     Layer 2 techniques based on IEEE 802.1Q bridges are in widespread
     use by Ethernet MAN Service Providers.  It is possible to implement
     the data plane functionality of Service Provider Backbone as
     described in the framework draft [ANDERSSON] using bridges as the
     Provider Edge (PE) equipment.  There are three small but
     significant changes to the [LASSERRE-VKOMPELLA] VPLS draft which
     would make the Service Provider Backbone much more compatible with
     a bridge-based PE implementation, and which would improve the
     efficiency of all L2VPN implementations.


Finn and others           Expires December 2002                 [Page 1]

Internet-Draft              Bridging and VPLS               21 June 2002


Table of Contents

     1.   Introduction . . . . . . . . . . . . . . . . . . . . . . .   2
     2.   Signaling the Need to Unlearn MAC Addresses  . . . . . . .   3
     2.1.   Requirement for Unlearning MAC Addresses . . . . . . . .   3
     2.2.   How Bridges Forget MAC Addresses . . . . . . . . . . . .   4
     2.3.   Improved L2VPN Flush Messages  . . . . . . . . . . . . .   5
     3.   Send Flush on Recovery, Not Failure  . . . . . . . . . . .   5
     4.   Independent vs. Shared Address Learning  . . . . . . . . .   8
            Acknowledgements . . . . . . . . . . . . . . . . . . . .   9
            References . . . . . . . . . . . . . . . . . . . . . . .  10
            Authors' Addresses . . . . . . . . . . . . . . . . . . .  10
            Full Copyright Statement . . . . . . . . . . . . . . . .  11


1.  Introduction

     A number of Ethernet Service Providers are currently building their
     networks using purely L2 technologies, based around bridges.  When
     such an L2-oriented network provider looks at the architecture of
     [ANDERSSON], the similarity of Provider Edges (PEs) and bridges is
     unavoidable, and the desirability of constructing a PE based on a
     bridge is attractive.  A bridge-based PE allows the Provider to
     make use of the extensive capabilities of the current generation of
     bridges.

     Trying to base a PE implementation on a bridge raises certain
     issues with the specification and implementation of L2VPNs which
     have not been addressed in the drafts, to date.  Resolving these
     issues requires some minor changes to [LASSERRE-VKOMPELLA]. These
     are:

     1.   Two new "forget MAC addresses" L2VPN control packets are
          needed.

     2.   "Forget MAC addresses" L2VPN control packets should be sent
          when backup links become activated, not when links fail.

     3.   It must be possible to configure associations among L2VPN
          instances such that a group of L2VPNs share a single MAC
          address database in any given attached device, but different
          groups of L2VPNs use different databases.

     This present document assumes the validity of the [ANDERSSON] and
     [LASSERRE-VKOMPELLA] drafts.  We use the terminology of
     [ANDERSSON], borrowing from [LASSERRE-VKOMPELLA] and [SAJASSI] when
     needed.


Finn and others           Expires December 2002                 [Page 2]

Internet-Draft              Bridging and VPLS               21 June 2002


     Sections 2 and 3 explain the need to have forms of the [LASSERRE-
     VKOMPELLA] flush message that are compatible with the operation of
     bridges, and the necessary timing of sending those messages.
     Section 4 describes a requirement to allow some L2VPNs to share a
     common MAC address database.


2.  Signaling the Need to Unlearn MAC Addresses

     The "flush" message of [LASSERRE-VKOMPELLA] is inadequate to the
     needs of bridges either serving as PEs or as part of an Access
     Network [ANDERSSON] in a Provider Network.  The two forms of the
     flush message are, "forget all MAC addresses in this list," and
     "forget all the MAC addresses you learned from me."  There is no
     way for a bridge to generate an accurate list of MAC addresses for
     the first message, and no circumstances under which a bridge would
     issue the second message.

     The following sub-sections describe the need for unlearning, how
     the two major spanning tree protocols handle the deliberate
     forgetting of MAC address information, and the bridge requirements
     for flush messages.

2.1.  Requirement for Unlearning MAC Addresses

     If bridges operate over Psuedo Wires (PWs) such that redundant PEs
     are provided to improve the availability of the Provider Network,
     then the possibility arises that changes in an Ethernet Service
     Provider's Access Network will require packets to take different
     paths through the Provider Backbone.  An obvious example is shown
     in Figure 1:

       +-------------+ .1Q +----------+ PW1  +----------+   +-------------+
       + PE-bridge 1 +-----+   PE 1   +------+   PE 3   +---+ PE-bridge 2 |
       +-+---------+-+     +--------+-+      +-+--------+   +-----------+-+
         | Ether    \          PW2 | Backbone /                   Ether |
         |           \ .1Q +--------+-+      /                          |
       Customer       '----+   PE 3   +-----'                      Customer
       Station A           +----------+ PW3                        Station B

          FIGURE 1:  Unlearning MAC addresses in L2VPN due to L2 changes

     In this diagram, we have five Provider bridges, PE-bridge 1 and 2,
     and PEs 1 through 3.  The PWs of one (data) VLPLS are shown.  Keep
     in mind, however that we are not using spanning tree to select
     among PWs; the PWs form a full mesh, and split horizon is used to
     prevent loops within the L2VPN.


Finn and others           Expires December 2002                 [Page 3]

Internet-Draft              Bridging and VPLS               21 June 2002


     Spanning tree may, however, be running over the whole network, in
     which case spanning tree Bridge Protocol Data Units (BPDUs) may be
     carried as ordinary multicast data over one or more of the PWs.

     Suppose that the normal path between stations A and B goes through
     PE 1.  PE 3 has learned this fact, and directs its packets destined
     for station A to PE 1.  If the link between PE-bridge 1 and PE 1
     fails, then it is possible that the redundancy/failover algorithm
     in use will select PE 2, rather than PE 1, to be the portal to the
     Backbone.  In that case, PE 3 needs to "unlearn" its association
     between MAC address A and PE 1.

2.2.  How Bridges Forget MAC Addresses

     The spanning tree algorithms in [802.1D] and [802.1w] provide two
     methods for notifying bridges when MAC address information needs to
     be unlearned.  The "classic" Spanning Tree Protocol, STP, defined
     in [802.1D] describe one way, and the new Rapid Spanning Tree
     Protocol (RSTP) in [802.1w] behaves another way.

     In STP, when a topology change occurs, notification of the change
     is transmitted to the root bridge, which in turn relays the fact to
     all bridges.  The notification is not directional; the root places
     all bridges in "topology change mode" for a certain length of time,
     then returns all bridges to normal mode.  While in topology change
     mode, all MAC address information is timed out over a much shorter
     period than is normally the case.

     In normal times, the default timeout period for MAC address
     information is five minutes.  During a topology change, that time
     shortens to a default value of 15 seconds.  This time is comparable
     to the time during which service may be interrupted by a topology
     change.  The shortened timeout period ensures that stale
     directionality information will not survive the topology change.
     Note that, if the MAC addresses were instantly forgotten, instead
     of timed out rapidly, a great deal of traffic otherwise unaffected
     by the topology change would be unnecessarily flooded.

     In RSTP, a Topology Change Notification (TCN) is initiated only by
     a bridge port that has just transitioned from standby to
     operational status.  It is generated only on that newly operational
     port, and is then relayed along the spanning tree by the other
     bridges.  Thus, RSTP TCNs have a direction of propagation, and
     bridges "behind" the new link do not receive it.  Since RSTP can
     converge in milliseconds after a topology change, receipt of a TCN
     causes a bridge to instantly forget many MAC addresses.  A bridge
     does not forget MAC addresses associated with the port on which the
     TCN was received; one can prove that the MAC address information on


Finn and others           Expires December 2002                 [Page 4]

Internet-Draft              Bridging and VPLS               21 June 2002


     that port cannot be affected by any topology change.

2.3.  Improved L2VPN Flush Messages

     To be consistent with bridges, the L2VPN learning and forgetting
     rules must be compatible with the bridge rules described in the
     previous section.  The two specific L2VPN messages required for
     full compatibility with all standard bridges are:
          v.IP 1.  STP TOPOLOGY CHANGE START/END.  Set the timeout
          period of all learned MAC addresses on this list of L2VPNs to
          this number of seconds.  This message is transmitted with a
          shortened timeout value at the beginning of a "classic" STP
          topology change event, and transmitted with the default
          timeout period after the event ends.

     2.   RSTP TOPOLOGY CHANGE.  Immediately delete all MAC address
          information on this list of L2VPNs except that information
          learned from the sender of this message.  (Note that this is
          exactly the opposite from the current "flush all you learned
          from the sender" message.)

     The first message, which is compatible with the old STP, is of
     questionable utility, simply because it is needed by a form of STP
     which takes 10s of seconds to converge after the failure or
     recovery of a node or link.  The Rapid Spanning tree converges much
     more quickly, and uses the second form.

     To the best of the authors' knowledge, these two actions are
     sufficient to meet the needs of all proprietary Layer 2
     failure/recovery protocols based on spanning tree.  If not,
     suggestions for additional actions are encouraged.


3.  Send Flush on Recovery, Not Failure

     In [LASSERRE-VKOMPELLA], a device that notices the failure of a
     non-PW link is required to transmit the flush message(s) over the
     Backbone.  It is much better to transmit the flush messages at
     recovery time, that is, when an alternate to the failed link
     becomes operational, or when a new link is added.  In brief, one
     reason is that frames are best flooded when there is a good chance
     that their destinations are reachable, and therefore will elicit
     the replies that will terminate the flooding.  The second reason is
     that link failures cannot reliably be detected, whereas recovery
     events are sure and certain.

     In RSTP [802.1w], it is the device that starts using the backup
     link that generates the flush message(s).


Finn and others           Expires December 2002                 [Page 5]

Internet-Draft              Bridging and VPLS               21 June 2002


         "backdoor" _________________________________
            link   /                                 \
               +--+-+     +----+  Y--+----+        +--+-+
               | B1 |-----| B2 |-----| B3 |........| B4 +--Z
               +----+  ___+--+-+     +--+-+,..   ..+----+
                     \/      |   ___/   |   : \ /    : (Backbone)
                     /\___   |  /       |   :  :     :
               +----+     +--+-+     +--+-+;../ \..+----+
           X --+ B5 |-----| B6 |-----| B7 |........| B8 |
               +----+     +----+     +----+        +----+

                 Figure 2: L2VPN Flush Messages

     In Figure 2, consider a PE B3 with some form of Ethernet
     connections on one side, and the Pseudowire world on the other
     side.  If that PE discovers that a link has gone down, say the
     B5-B2 link, the B2-B3 link, or the Y-B3 link, there are several
     possibilities for what can be wrong or right about learned MAC
     address information:

     1.   The link is permanently down.  Those MAC addresses will be
          unreachable for a very long time.

     2.   The link is momentarily down.  Those MAC addresses will again
          be reachable through the same link in a very short time.

     3.   The link is down, but a backup link to this same PE, carrying
          those same MAC addresses, will very shortly be activated.

     4    The link is down, but a backup link to another PE, carrying
          those same MAC addresses, will very shortly be activated.

     In all of these cases, forgetting the MAC addresses immediately
     upon failure of the link does not help anything, especially if
     there are a large number of them associated with the failed link.
     If the MAC addresses "never" come back, they will eventually time
     out and be deleted everywhere.  Assuming that the link was running
     at full speed when it failed, all of the traffic already in, queued
     for transmission to, or about to be sent by the user to the
     network, will be flooded throughout the L2VPN.  Furthermore, this
     burst of flooding is useless, as it will occur at the very moment
     when it is the least likely that the flooded frames can reach their
     destinations.  This means, of course, that the out-of-touch
     stations cannot respond to the flooded frames and put a stop the
     flooding.  The flooding continues until the upper layers decide to
     wait for responses.

     In fact, it is when a link transitions from backup to operational


Finn and others           Expires December 2002                 [Page 6]

Internet-Draft              Bridging and VPLS               21 June 2002


     status that MAC addresses can be profitably forgotten.  Unless or
     until an alternate path to the lost MAC addresses becomes
     available, the packets destined for those MAC addresses is best
     black-holed.  If the MAC addresses never come back, then in the
     usual case, the bridge MAC address timeouts are such that the upper
     layers will give up on the conversation before the bridges forget
     the MAC addresses and start flooding the doomed traffic.  If the
     MAC addresses do come back, then old information is deleted as the
     first thing, and the flooded frames have a good chance of reaching
     their (new) destinations.  This is why RSTP waits until a link
     comes up to generate a Topology Change Notification.  Only when an
     alternate path is available is there any reason to flood frames
     everywhere.

     Looking at Figure 2, suppose that links B5-B2, B2-B3, and the
     Backbone PWs are the primary links between B5 and B4, and
     specifically that B5-B6 is kept in reserve.  In RSTP parlance, B5's
     port to B2 is a (forwarding) root port, and B5's port to B6 is a
     (discarding) alternate port.  If the B5-B2 link fails, B5's port to
     B6 becomes the forwarding root port, and a TCN is sent.  Note that,
     because of the direction of propagation of the TCN, B4 does not
     forget address Y, because that address cannot possibly have
     changed; the new link came up "behind" the Y-B3 connection.
     Interestingly, B3 does have to forget address Z, because it cannot
     know for certain that the "backdoor" link shown in the diagram has
     not come into use to deliver frames from Z via B1 and B2.  In other
     words, RSTP does not assume that it knows the global topology.
     However, any such needlessly flooded frames will reach their
     destinations and be answered, and so will very quickly be
     relearned.

     Many bridges treat MAC addresses attached to configured local
     access ports, rather than inter-bridge links, as sure knowledge.
     Those MAC addresses are not deleted from the owning bridge.

     If the device that brings up the new link knows what MAC addresses
     it is serving, perhaps by configuration, then the best solution is
     to transmit a "learn the following MAC addresses" control message.
     This is ideal, and is provided by [LASSERRE-VKOMPELLA].  In
     general, however, a bridge does not know this.

     One should also consider what happens if the failure occurs in a
     link not directly connected to the PE?  What happens if the PE is
     connected to a shared medium Ethernet, so that the loss of another
     device's connection is invisible to the PE?  (Shared media still
     exist, new ones such as packet rings are being created, and
     wireless hubs are very similar in nature.)  What happens if only
     one side of the connection gets a "glitch" and believes that the


Finn and others           Expires December 2002                 [Page 7]

Internet-Draft              Bridging and VPLS               21 June 2002


     link has momentarily gone down?

     The RSTP solution is known to handle all of these cases correctly.
     It would be better for any L2VPN solution to employ that same
     technique.  In other words:

     1.   As "classic" STP enters and leaves the topology change mode,
          the bridge responsible for transmitting that fact to the
          Backbone should also transmit the "accelerate timeouts" L2VPN
          control message for all affected L2VPNs.  {This is not needed
          if classic STP is not to be supported.)

     2.   When a "rapid" RSTP bridge transmits a TCN BPDU over the
          Backbone, it should also transmit the "forget all MAC
          addresses not learned from me" L2VPN control message for the
          Pseudowires associated with the affected spanning tree
          instance.

     3.   When some L2 device not utilizing spanning trees, such as the
          PE-CLE of [SAJASSI], switches to a new PE, it should transmit
          a control message towards the new PE.  This message causes the
          PE to issue a "forget all MAC addresses not learned from me"
          L2VPN control message only for the L2VPNs connected to the
          affected PE-CLE.


4.  Independent vs. Shared Address Learning

     In order to be consistent with the current capabilities of [802.1Q]
     bridges, it must be possible to configure any number of L2VPNs
     either to use separate MAC address databases, as specified in
     [LASSERRE-VKOMPELLA], or to use the same MAC address database. If
     not, we remove a standard bridge capability that is not only
     expected by a significant fraction of the user community, but one
     which is actually more useful in the Ethernet Provider space than
     in the enterprise space.

     In particular, a bridge which is conformant to [802.1Q] can be
     configured, for any given pair of VLANs, to store those two VLANs'
     MAC address information in two different Filtering Databases
     (Independent VLAN Learning, IVL), or to use the same Filtering
     Database for both (Shared VLAN Learning, SVL).  (It also permits
     the user to specify "don't care", and let the bridge decide.)  That
     is, MAC addresses learned on one VLAN may or may not, according to
     the specific configuration of the bridge, be used to forward frames
     on another VLAN.  This fact has been utilized in a great many ways,
     by a number of bridge vendors and bridge customers, in order to
     implement a number of useful features.  It is a behavior that has


Finn and others           Expires December 2002                 [Page 8]

Internet-Draft              Bridging and VPLS               21 June 2002


     been in IEEE 802.1Q since 1998, and in various vendors' bridges
     long before that date.  It cannot be removed without a serious
     impact on the users of bridged LANs.

     Given that one VLAN in the Access Network is associated with one
     L2VPN, we are lead to the conclusion that two or more L2VPNs may be
     similarly configured to use either IVL or SVL.  In other words, two
     L2VPNs A and B may be configured such that a {MAC address, PW)
     association learned by a PE on L2VPN A is used when that PE
     transmits packets to a PW on L2VPN B.  Another way to phrase this
     is that bridges do not remember {MAC, VLAN-ID, port} triplets, but
     instead, remember {MAC, FID, port} triplets, where "FID" is the
     Filtering ID, identifying a Filtering Database.  The FID is derived
     from VLAN-ID through a statically configured mapping table.
     Similarly, L2VPN interfaces must learn {MAC, FID, PW} triplets
     instead of {MAC, L2VPN-ID, PW} triplets, by means of a configured
     L2VPN-to-Filtering-Database-ID table.

     A concrete example of Shared VLAN Learning is the "Spanning Tree
     Per Bridge" technique which is most useful in a ring of bridges.
     (This is a more common topology in Ethernet MAN Provider Networks
     than in enterprise networks.)  In a ring with one spanning tree,
     one link must be blocked, in order to prevent a closed forwarding
     loop.  Traffic between bridges on opposite sides of the blocked
     link must go the long way around the ring.  To avoid this
     inefficiency, the "Spanning Tree Per Bridge" technique employs n
     spanning trees, and splits each VLAN into a group of n VLANs, one
     for each spanning tree.  Each bridge associates itself with exactly
     one spanning tree, selected so that the blocked link of that
     spanning tree is near the opposite side of the ring.  Every frame
     it sends for a given VLAN group is sent only on the particular VLAN
     associated with that bridge's spanning tree.  Clearly, for this to
     work, all of the VLANs in one group must share the same Filtering
     Database.

     If the user of this "Spanning Tree Per Bridge" technique is a
     customer of an Ethernet MAN Service Provider, purchasing multiple
     L2VPNs to make the connections between the customer's bridges, then
     the provider's L2VPNs must share their learned MAC addresses.


Acknowledgements

     The authors wish to thank Ali Sajassi, Joel Halpern, Steve Phillips
     and Adam Sweeney for their valuable suggestions, both technical and
     editorial, for correcting and improving this document, as well as a
     number of IEEE P802.1 voting members who reviewed it.


Finn and others           Expires December 2002                 [Page 9]

Internet-Draft              Bridging and VPLS               21 June 2002


References

[ANDERSSON]
     "PPVPN L2 Framework", draft-andersson-ppvpn-l2-framework-00.txt
     (Work in Progress)

[LASSERRE-VKOMPELLA]
     "Virtual Private LAN Services over MPLS", draft-lasserre-vkompella-
     ppvpn-vpls-01.txt (Work in Progress)

[SAJASSI]
     "VPLS Architectures", draft-sajassi-vpls-architectures-00.txt (Work
     in Progress)

[802.1D]
     "Information technology.  Telecommunications and information
     exchange between systems.  Local and metropolitan area networks.
     Common specifications.  Part 3: Media Access Control (MAC)
     Bridges", ANSI/IEEE Std 802.1D-1998.

[802.1Q]
     "IEEE Standards for Local and Metropolitan Area Networks: Virtual
     Bridged Local Area Networks", IEEE Std 802.1Q-1998.

[802.1w]
     "IEEE Standard for Local and metropolitan area networks.  Common
     specifications Part 3: Media Access Control (MAC) Bridges.
     Amendment 2: Rapid Reconfiguration", IEEE Std 802.1w-2001.

Authors' Addresses


     Norman Finn
     Cisco Systems
     170 W Tasman Drive
     San Jose, CA 95134
     USA

     Phone: +1.408.526.4495
     Email: nfinn@cisco.com


Finn and others           Expires December 2002                [Page 10]

Internet-Draft              Bridging and VPLS               21 June 2002


     Mick Seaman
     Consultant
     160 Bella Vista Ave
     Belvedere
     CA 94920

     mick_seaman@ieee.org


     Andrew Smith
     Consultant

     Email: ah_smith@acm.org
     Fax:  +1.415.345.1827


     Allyn Romanow
     Cisco Systems
     170 W Tasman Drive
     San Jose, CA 95134
     USA

     Phone +1.408.525.8836
     Email: allyn@cisco.com


Full Copyright Statement

     Copyright (C) The Internet Society (2002). All Rights Reserved.

     This document and translations of it may be copied and furnished to
     others, and derivative works that comment on or otherwise explain
     it or assist in its implementation may be prepared, copied,
     published and distributed, in whole or in part, without restriction
     of any kind, provided that the above copyright notice and this
     paragraph are included on all such copies and derivative works.
     However, this document itself may not be modified in any way, such
     as by removing the copyright notice or references to the Internet
     Society or other Internet organizations, except as needed for the
     purpose of developing Internet standards in which case the
     procedures for copyrights defined in the Internet Standards process
     must be followed, or as required to translate it into languages
     other than English.

     The limited permissions granted above are perpetual and will not be
     revoked by the Internet Society or its successors or assigns.


Finn and others           Expires December 2002                [Page 11]

Internet-Draft              Bridging and VPLS               21 June 2002


     This document and the information contained herein is provided on
     an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET
     ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR
     IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
     THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
     WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Finn and others           Expires December 2002                [Page 12]