Internet-Draft                              Ellen J. Stokes
LDAP Duplication/Replication/Update         Tivoli Systems
Protocols WG                                Russel F. Weiser
Intended Category: Informational            Digital Signature Trust
Expires: March 2001                         Ryan D. Moats
                                            Coreon, Inc.
                                            Richard V. Huber
                                            AT&T Laboratories
                                            September 2000

LDAPv3 Replication Requirements
draft-ietf-ldup-replica-req-04.txt

Status of This Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Drafts Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Copyright Notice

Copyright (C) The Internet Society (2000). All Rights Reserved.

Abstract

This document discusses the fundamental requirements for replication of data accessible via the LDAPv3 [RFC2251] protocol. It is intended to be a gathering place for general replication requirements needed to provide interoperability between informational directories.

Stokes, et al Expires February 2001 [Page 1]
Internet-Draft LDAPv3 Replication Requirements August 2000

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

Table of Contents

1 Introduction
2 Terminology
3 The Possible Models
4 Requirements
  4.1 General
  4.2 Model
  4.3 Protocol
  4.4 Schema
  4.5 Single Master
  4.6 Multi-Master
  4.7 Administration and Management
  4.8 Security
5 Security Considerations
6 Acknowledgements
7 References
A. APPENDIX A - Usage Scenarios
  A.1. Extranet Example
  A.2. Consolidation Example
  A.3. Replication Heterogeneous Deployment Example
  A.4. Shared Name Space Example
  A.5. Supplier Initiated Replication
  A.6. Consumer Initiated Replication
  A.7. Prioritized attribute replication
  A.8. Bandwidth issues
  A.9. Interoperable Administration and Management
  A.10. Enterprise Directory Replication Mesh
  A.11. Failure of the Master in a Master-Slave
        Replicated Directory
  A.12. Failure of a Directory Holding Critical Service Information
B. APPENDIX B - Rationale
  B.1. Meta-Data Implications
  B.2. Order of Transfer for Replicating Data
  B.3. Schema Mismatches and Replication
  B.4. Detecting and Repairing Inconsistencies Among Replicas
  B.5. Some Test Cases for Conflict Resolution in Multi-Master Replication
  B.6. Data Privacy During Replication
  B.7. Failover in Single-Master Systems
  B.8. Including Operational Attributes in Atomic Operations
Authors' Addresses
Full Copyright Statement

1 Introduction

The ability to distribute directory information throughout the network provides a two-fold benefit to the network: (1) increasing the reliability of the directory through fault tolerance, and (2) bringing the directory content closer to the clients using the data. LDAP's acceptance as an access protocol for directory information is driving the need to distribute LDAP directory content among servers within enterprise and Internet environments. Currently, LDAP does not define a replication mechanism and mentions LDAP shadow servers (see [RFC2251]) only in passing. A standard mechanism for replication that operates in a multi-vendor directory environment is critical to the successful deployment and acceptance of LDAP in the marketplace.

This document sets out the requirements for replication between multiple LDAP servers.
While RFC 2251 and RFC 2252 [RFC2252] set forth the standards for communication between LDAP clients and servers, there are additional requirements for server-to-server communication. Some of these are covered here.

This document first introduces the terminology to be used, then presents the different replication models being considered. The actual requirements follow, along with security considerations. The reasoning that leads to the requirements is presented in Appendix B. This was done to provide a clean separation of the requirements from their justification.

2 Terminology

The following terms are used in this document:

Area of replication - A whole or portion of a Directory Information Tree (DIT) that makes up a distinct unit of data to be replicated. This may also be known as "unit of replication".

Atomic operation - A set of changes to directory data which the LDAP standards guarantee will be treated as a unit; all changes will be made or all the changes will fail.

Atomicity Information - Information about atomic operations passed as part of replication.

Conflict - A situation that arises when changes are made to the same directory data on different directory servers before replication can synchronize the data on the servers. When the servers do synchronize, they have inconsistent data - a conflict.

Conflict resolution - Deterministic procedures used to resolve change information conflicts that may arise during replication.

Critical OID - Attributes or object classes defined in the replication agreement as being critical to the operation of the system. Changes affecting critical OIDs cause immediate initiation of a replica cycle. An example of a critical OID might be a password or certificate.

Fractional replication - The capability to replicate a subset (as opposed to the full set) of attributes of those entries being replicated.

Incremental Update - A replica update that contains only those attributes or objects that have changed.

Master Replica - In a Master-Slave Replication system, the Master Replica is the only updateable replica in the replica ring. It is the supplier in all replication sessions.

Master-Slave, or Single Master Replication - A replication model that assumes only one server, the master, allows write access to the replicated data. Note that Master-Slave replication can be considered a proper subset of multi-master replication.

Meta-Data - Data collected by the replication system that describes the status/state of replication.

Multi-Master Replication - A replication model where entries can be written and updated on any of several updateable replica copies without requiring communication with other updateable replicas before the write or update is performed.

Naming Context - Suffix of a sub-tree of entries held in a single server [X.501].

One-way Replication - The process of synchronization in a single direction where the authoritative source information is provided to a replica.

Partial Replication - Partial Replication is Fractional Replication, Sparse Replication, or both.

Propagation behavior - The general behavior of the actual synchronization process between a consumer and a provider of replication information.

Read-only Replica - A read-only copy of a replicated directory. A read-only replica is assumed to be a slave replica in the single master replication definition.

Replica - A single instance of a whole or portion of the DIT as defined by the area of replication.

Replica Ring - A set of servers that hold in common the same DIT information, as defined by "Area of replication". These servers may be managed under a single replication agreement that handles all members of the set of servers as a group.

Replica (or Replication) Cycle - A replica cycle is the communication of a change or group of changes that need to be propagated to other members of a replica ring. The process of contacting a replica member is considered the beginning of a replication cycle, while the termination of communications with a replica is the end of the cycle. Termination can occur due to either an error or the successful exchange of update records.

Replication - The process of copying portions of naming context information and content between multiple LDAP servers, such that certain predefined portions of the information are available from different servers. The replication process is neither implementation nor platform specific.

Replication Agreement - A collection of information describing the parameters of replication between two or more servers in a replica ring.

Replication Initiation Conflict - In multi-master replication, a Replication Initiation Conflict is a situation where two masters want to update the same replica at the same time.

Replication Session - A session set up between two servers in a replica ring to pass update information as part of a Replica Cycle.

Slave (or Read-Only) Replica - A replica that cannot be directly updated. Changes may only be made via replication from a master replica.

Sparse Replication - The capability to replicate some subset of entries (other than a complete naming context) of a naming context.

Topology - The shape of the directed graph describing the relationships between replicas.

Two-way Replication - The process of synchronization where change information flows bi-directionally between two replicas.

Update Propagation - Protocol-based process by which directory replicas are reconciled.

Updateable Replica - A read-writeable copy of the replicated information.
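
The distinction drawn in this section between Fractional, Sparse, and Partial Replication can be illustrated with a minimal sketch. It is not part of any LDAP standard or of this draft's requirements: the function name, the in-memory dictionary representation of entries, and the example DNs are all assumptions made purely for illustration.

```python
def select_for_replica(entries, entry_filter=None, attrs=None):
    """Return the subset of `entries` that a partial replica would hold.

    entries:      dict mapping DN -> dict of attribute name -> list of values
                  (a hypothetical in-memory stand-in for an area of replication).
    entry_filter: callable(dn, entry) -> bool; models Sparse Replication.
    attrs:        set of attribute names to keep; models Fractional Replication.
    Supplying either or both corresponds to Partial Replication.
    """
    selected = {}
    for dn, entry in entries.items():
        if entry_filter is not None and not entry_filter(dn, entry):
            continue  # sparse: this entry is not replicated at all
        if attrs is None:
            selected[dn] = dict(entry)  # full entry
        else:
            # fractional: keep only the listed attributes
            selected[dn] = {a: v for a, v in entry.items() if a in attrs}
    return selected


# Hypothetical area of replication with two entries.
area = {
    "cn=Ann,ou=Sales,o=Example": {
        "cn": ["Ann"], "mail": ["ann@example.com"], "userPassword": ["s3cret"],
    },
    "cn=Bob,ou=Eng,o=Example": {
        "cn": ["Bob"], "mail": ["bob@example.com"], "userPassword": ["hunter2"],
    },
}


def in_sales(dn, entry):
    """Hypothetical entry filter: only entries under ou=Sales."""
    return ",ou=Sales," in dn


sparse = select_for_replica(area, entry_filter=in_sales)
fractional = select_for_replica(area, attrs={"cn", "mail"})
partial = select_for_replica(area, entry_filter=in_sales, attrs={"cn", "mail"})
```

Here `sparse` holds complete entries for a subset of the DNs, `fractional` holds every DN with a reduced attribute set, and `partial` combines both restrictions.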

3 The Possible Models

The major objective is to provide an interoperable LDAPv3 directory synchronization protocol that is simple, highly efficient and flexible enough to support both multi-master and master-slave replication operations. Such a protocol would meet the needs of both the Internet and enterprise environments.

Generally, replication can be characterized by looking at data consistency models across existing technologies. This provides insight into LDAPv3 replication requirements. The following is a brief examination of data consistency models.

Model 1: Transactional Consistency -- Environments that exhibit all four of the ACID properties (Atomicity, Consistency, Isolation, Durability) [ACID].

Model 2: Eventual Consistency or Transient Consistency -- Environments where definite knowledge of the global replica topology is provided through predetermined replication agreements. Examples include X.500 Directories, Bayou [XEROX], and NDS (Novell Directory Services) [NDS]. In this model, every update propagates to every replica that it can reach via a path of stepwise eventual connectivity.

Model 3: Limited Effort Eventual Consistency -- Environments that provide a statistical probability of convergence with global knowledge of replica topology. An example is the Xerox Clearinghouse [XEROX]. This model is similar to "Eventual Consistency", except that replicas may purge updates. Purging drops propagation changes when some replica time boundary is exceeded, thus leaving some changes replicated to only a portion of the replica topology. Transactional consistency is not preserved, though some weaker constraints on consistency are available.

Model 4: Loosest Consistency -- Environments where information is provided from an opportunistic or simple cache until stale.

Model 5: Ad hoc -- A copy of a data store where no follow-up checks are made for the accuracy/freshness of the data.

Consistency models 2 and 3 involve the use of prearranged replication agreements between cooperating servers. The added complexity of 2-phase commit required for model 1 is significant enough that model 1 will not be considered at this time. Models 4 and 5 involve unregistered replicas that "pull" updates from another directory server without that server's knowledge. These models can be considered to violate a directory's security policies.

Further review of models 2 and 3 reveals two example application areas that LDAP replication must be able to handle. These are policy configuration through security management parameters (model 2) and white-pages environments that contain fairly static data and address information (model 3). Therefore, replication requirements are presented for models 2 and 3.

4 Requirements

4.1 General

G1. LDAP Replication MUST support models 2 (Eventual Consistency) and 3 (Limited Effort Eventual Consistency) above.

G2. LDAP Replication SHOULD NOT preclude support for model 1 (Transactional Consistency) in the future.

G3. The act of replication SHOULD have minimal impact on both the system and network performance.

G4. An LDAP Replication Standard SHOULD NOT limit the transaction rate of a replication session.

G5. The replication standard SHOULD NOT limit the size of a replica.

G6. Any meta-data collected by the LDAP replication mechanism MUST NOT grow without bound.

G7. All policy and state data pertaining to replication MUST be accessible via LDAP.

4.2 Model

M1.
The model MUST support the following triggers for initiation of a replica cycle:

a) A configurable set of scheduled times
b) Periodically, with a configurable period between replica cycles
c) A configurable maximum amount of time between replica cycles
d) A configurable number of accumulated changes
e) Change in the value of a critical OID
f) As the result of an automatic rescheduling after a replication initiation conflict
g) Administrative request for replication

With the exception of administrative request, the specific trigger(s) and related parameters in effect for a given server MUST be identified in a well-known place, e.g. the Replication Agreement(s).

M2. The replication model MUST support both master-slave and multi-master relationships.

M3. All replicated information between the master database and its replica databases MUST be identical, including all non-user modify operational attributes such as time stamps. Note that this does not imply that the entire database is identical from replica to replica, but that the subset of data chosen for replication is identical from replica to replica. Some operational attributes may be dynamically evaluated; these attributes will not necessarily appear to be identical.

M4. LDAP replication MUST encompass schema objects, attributes, access control, and name space information.

M5. LDAP replication MUST NOT require all copies of the replicated information to be complete copies of the replicated object. The model MUST support Fractional, Partial, and Sparse Replicas.

M6. Sub-tree Replication MUST be defined to allow for greater flexibility in replication topologies of the DIT as defined by partial replication.

M7. The determination of which OIDs are critical MUST be configurable in the replication agreement.

M8. Replication activities MUST occur within the context of a predefined replication agreement that addresses proper knowledge of access requirements and credentials between the synchronizing directories.

M9.
The replication agreements SHOULD accommodate multiple servers receiving the same replica under a single predefined agreement.

M10. LDAP replication MUST provide scalability to both enterprise and Internet environments, e.g. an LDAP server must be able to provide replication services to replicas within an enterprise as well as across the Internet.

M11. While different directory implementations can support different/extended schema, schema mismatches between two replicating servers MUST be handled. One way of handling such mismatches might be to raise an error condition.

M12. The LDAP replication model MUST allow for full update to facilitate replica initialization and reset loading utilizing a standardized format such as LDIF [RFC2849].

4.3 Protocol

P1. The replication protocol MUST provide for recovery and rescheduling of a replication session due to replication initiation conflicts (e.g. consumer busy replicating with other servers) and/or loss of connection (e.g. supplier cannot reach a replica).

P2. The replication protocol MUST allow a restart at the last acknowledged update prior to interruption rather than re-sending updates it had already sent to a consuming replica.

P3. The LDAP replication protocol MUST allow for full update to facilitate replica initialization and reset loading utilizing a standardized format such as LDIF [RFC2849].

P4. Incremental replication MUST be allowed.

P5. The replication protocol MUST allow either a master or slave replica to initiate the replication process.

P6. The protocol MUST support propagation of atomicity information.

P7. The protocol SHOULD NOT preclude future support of Transactional Consistency (model 1).

P8. The protocol MUST support a mechanism to report schema mismatches between replicas discovered during a replication session.

4.4 Schema

SC1.
A standard way to determine what replicas are held on a server MUST be defined.

SC2. A standard schema for representing replication agreements MUST be defined.

SC3. The semantics associated with modifying the attributes of replication agreements MUST be defined.

SC4. A standard method for determining the location of replication agreements MUST be defined.

SC5. A standard schema for publishing state information about a given replica MUST be defined.

SC6. A standard method for determining the location of replica state information MUST be defined.

SC7. It MUST be possible for authorized administrators, regardless of their network location, to access replication agreements in the DIT.

SC8. Replication agreements of all servers containing replicated information MUST be accessible via LDAP.

SC9. All objects MUST be uniquely identifiable throughout the object lifetime.

4.5 Single Master

SM1. A Single Master system SHOULD provide a fast method of promoting a slave replica to become the master replica.

SM2. The master replica in a Single Master system SHOULD send all changes to read-only replicas in the order in which they were applied on the master.

4.6 Multi-Master

MM1. Replica synchronization SHOULD be handled in such a manner as to not saturate the network with repetitive entry replication from supplier replicas.

MM2. The initiator MUST be allowed to determine whether it will become a consumer or supplier during the synchronization startup process.

MM3. During a replication session, it MUST be possible for the two servers to switch between the consumer and supplier roles.

MM4.
When multiple master replicas want to begin a replication session with the same replica at the same time, the model MUST have a deterministic mechanism for resolving the resulting replication initiation conflict with no operator intervention.

MM5. Multi-master replication MUST NOT lose information during replication. If conflict resolution would result in the loss of directory information, the replication process MUST store that information, notify the administrator of the nature of the conflict and the information that was lost, and provide a mechanism for possible override by the administrator.

MM6. Multi-master replication MUST support convergence of the values of attributes and objects. Convergence may result in an event as described in MM5.

4.7 Administration and Management

AM1. Replication agreements MUST allow the initiation of a replica cycle to be administratively postponed to a more convenient period.

AM2. Each copy of a replica MUST maintain audit history information of which servers it has replicated with and which servers have replicated with it.

AM3. Access to replication agreements, topologies, and policy attributes MUST be provided through LDAP access.

AM4. The capability to check the differences between two replicas for the same information SHOULD be provided.

AM5. A mechanism to fix differences between replicas without triggering new replica cycles SHOULD be provided.

AM6. The deletion of sensitive data MUST be handled in an orderly manner so that at no time will that data be available without proper access control. That is, access control information (ACI) associated with sensitive data must be deleted after or simultaneously with the deletion of the sensitive data. Likewise, when adding sensitive data, ACI MUST be added first or simultaneously with the addition of that data.

4.8 Security

S1.
During initiation of a replication session, authentication and verification of authorization of both the replica and the source directory MUST be allowed before any data is transferred.

S2. The transport for LDAP synchronization MUST permit assurance of the integrity and privacy of all data transferred.

S3. To promote interoperability, there MUST be a mandatory-to-implement data privacy mechanism.

S4. The transport for administrative access MUST permit assurance of the integrity and privacy of all data transferred.

5 Security Considerations

This document includes security requirements (listed in section 4.8 above) for the replication model and protocol.

6 Acknowledgements

This document is based on input from IETF members interested in LDUP Replication.

7 References

[ACID] T. Haerder, A. Reuter, "Principles of Transaction-Oriented Database Recovery", Computing Surveys, Vol. 15, No. 4 (December 1983), pp. 287-317.

[NDS] Novell, "NDS Technical Overview", 104-000223-001, http://developer.novell.com/ndk/doc/docui/index.htm#../ndslib/dsov_enu/data/h6tvg4z7.htm, September, 2000.

[RFC2119] S. Bradner, "Key Words for Use in RFCs to Indicate Requirement Levels", RFC 2119, March 1997.

[RFC2251] M. Wahl, T. Howes, S. Kille, "Lightweight Directory Access Protocol (v3)", RFC 2251, December 1997.

[RFC2252] M. Wahl, A. Coulbeck, T. Howes, S. Kille, "Lightweight Directory Access Protocol (v3): Attribute Syntax Definitions", RFC 2252, December 1997.

[RFC2849] G. Good, "The LDAP Data Interchange Format (LDIF)", RFC 2849, June 2000.

[X.501] ITU-T Recommendation X.501 (1993), | ISO/IEC 9594-2: 1993, Information Technology - Open Systems Interconnection - The Directory: Models.

[XEROX] Hauser, C. "Managing update conflicts in Bayou, a weakly connected replicated storage system".
Palo Alto, CA: Xerox PARC, Computer Science Laboratory; 1995 August; CSL-95-4. [CSL-95-04]

A. APPENDIX A - Usage Scenarios

The following directory deployment examples are intended to substantiate and validate our replication requirements. It is assumed in all cases that directory implementations from different vendors are involved. This material is intended as background; no requirements are presented in this Appendix.

A.1. Extranet Example

A company has a trading partner to whom it wishes to provide directory information. This information may be as simple as a corporate telephone directory, or as complex as an extranet workflow application. For performance reasons, the company may wish to have a replica of its directory within the partner company, rather than simply exposing its directory beyond its firewall.

The requirements that follow from this scenario are:

. One-way replication, single mastered.
. Authentication of clients.
. Common access control and access control identification.
. Secure transmission of updates.
. Selective attribute replication (Fractional Replication), so that only partial entries can be replicated.

A.2. Consolidation Example

Company A acquires company B. In the transition period, whilst the organizations are merged, both directory services must coexist. Company A may wish to attach company B's directory to its own.

The requirements that follow from this scenario are:

. Multi-Master replication.
. Common access control model and access control model identification.
. Secure transmission of updates.
. Replication between DITs with potentially differing schema.

A.3. Replication Heterogeneous Deployment Example

An organization may deliberately deploy multiple directory services within its enterprise to employ the differing benefits of each service.
In this case, multi-master replication will be required to ensure that the multiple updateable replicas of the DIT are synchronized. Some vendors may provide directory clients that are tied to their own directory service.

The requirements that follow from this scenario are:

. Multi-Master replication.
. Common access control model and access control model identification.
. Secure transmission of updates.
. Replication among DITs with potentially differing schemas.

A.4. Shared Name Space Example

Two organizations may choose to cooperate on some venture and need a shared name space to manage their operation. Both organizations will require administrative rights over the shared name space.

The requirements that follow from this scenario are:

. Multi-Master replication.
. Common access control model and access control model identification.
. Secure transmission of updates.

A.5. Supplier Initiated Replication

This is a single master environment that maintains a number of replicas of the DIT by pushing changes based on a defined schedule.

The requirements that follow from this scenario are:

. Single-master environment.
. Supplier-initiated replication.
. Secure transmission of updates.

A.6. Consumer Initiated Replication

This is again a single-master replication topology, but the replica initiates the replication exchange rather than the master. An example of this is a replica that resides on a laptop computer that may run disconnected for a period of time.

The requirements that follow from this scenario are:

. Single-master environment.
. Consumer-initiated replication.
. Open scheduling (anytime).

A.7. Prioritized attribute replication

The password attribute can provide an example of the requirement for prioritized attribute replication. A user is working in Utah and the administrator resides in California. The user has forgotten his password.
So the user calls or emails the administrator to request a new password. The administrator provides the updated password (a change).

Under normal conditions, the directory replicates to a number of different locations overnight. But corporate security policy states that passwords are critical and the new value must be available immediately (i.e., shortly) after any change. Replication needs to occur immediately for critical attributes/objects.

The requirements that follow from this scenario are:

. Incremental replication of changes.
. Immediate replication on change of certain attributes.
. Replicate based on time/attribute semantics.

A.8. Bandwidth issues

The replication of Server (A) R/W replica (a) in Kathmandu is handled via a dial-up phone link to Paris, where server (B) R/W replica of (a) resides. Server (C) R/W replica of (a) is connected by a T1 connection to server (B). Each connection has a different performance characteristic.

The requirements that follow from this scenario are:

. Minimize repetitive updates when replicating from multiple replication paths.
. Incremental replication of changes.
. Provide replication cycles to delay and/or retry when connections cannot be reached.
. Allowances for consumer initiated or supplier initiated replication.

A.9. Interoperable Administration and Management

The administrator with administrative authority over the corporate directory, which is replicated by numerous geographically dispersed LDAP servers from different vendors, notices that the replication process is not completing correctly, as the change log is continuing to grow and/or an error message informs him. The administrator uses his $19.95 RepCo LDAP directory replication diagnostics tools to look at Root DSE replica knowledge on server 17 and determines that server 42 made by LDAP'RUS Inc. is not replicating properly due to an object conflict.
Using his RepCo remote repair tools, he connects to server 42 and resolves the conflict on the remote server.

The requirements that follow from this scenario are:

. Provide replication audit history.
. Provisions for managing conflict resolution.
. Provide LDAP access to predetermined agreements, topology and policy attributes.
. Provide operations for comparing replicas' content for validity.
. Provide LDAP access to status and audit information.

A.10. Enterprise Directory Replication Mesh

A corporation builds a mesh of directory servers within the enterprise utilizing LDAP servers from various vendors. Five servers are holding the same area of replication. The predetermined replication agreement(s) for the enterprise mesh are under a single management, and the security domain allows a single predetermined replication agreement to manage replication among the 5 servers.

The requirements that follow from this scenario are:

. Predefined replication agreements that manage more than a single area of replication that is held on numerous servers.
. Common support of replication management knowledge across vendor implementation.
. Rescheduling and continuation of a replication cycle when one server in a replica ring is busy and/or unavailable.

A.11. Failure of the Master in a Master-Slave Replicated Directory

A company has a corporate directory that is used by the corporate email system. The directory is held on a mesh of servers from several vendors. A corporate relocation results in the closing of the location where the master copy of the directory is located. Employee information (such as mailbox locations and employee certificate information) must be kept up to date or mail cannot be delivered.

The requirements that follow from this scenario are:

. An existing slave replica must be "promote-able" to become the new master.
. The "promotion" must be done without significant downtime, since updates to the directory will continue.

A.12. Failure of a Directory Holding Critical Service Information

An ISP uses a policy management system that uses a directory as the policy data repository. The directory is replicated in several different sites on different vendors' products to avoid single points of failure. It is imperative that the directory be available and be updateable even if one site is disconnected from the network. Changes to the data must be traceable, and it must be possible to determine how changes made from different sites interacted.

The requirements that follow from this scenario are:

. Multi-master replication.
. Ability to reschedule replication sessions.
. Support for manual review and override of replication conflict resolution.

B. APPENDIX B - Rationale

This Appendix gives some of the background behind the requirements. It is included to help the protocol designers understand the thinking behind some of the requirements and to present some of the issues that should be considered during design. With the exception of section B.8, which contains a suggested requirement for the update to RFC 2251, this Appendix does not state any formal requirements.

B.1. Meta-Data Implications

Requirement G6 states that meta-data must not grow without bound. This implies that meta-data must, at some point, be purged from the system. This, in turn, raises concerns about stability. Purging meta-data before all replicas have been updated may lead to incomplete replication of change information and inconsistencies among replicas. Therefore, care must be taken in setting up the rules for purging meta-data from the system while still ensuring that meta-data will not grow forever.

B.2. Order of Transfer for Replicating Data

Situations may arise where it would be beneficial to replicate data out-of-order (i.e.,
send data to consumer replicas in a different order than it was processed at the supplier replica). One such case might occur if a large bulk load was done on the master server in a single-master environment and then a single change to a critical OID (a password change, for example) was made. Rather than wait for all the bulk data to be sent to the replicas, the password change might be moved to the head of the queue and be sent before all the bulk data was transferred. Other cases where this might be considered are schema changes or changes to critical policy data stored in the directory.

While there are practical benefits to allowing out-of-order transfer, there are some negative consequences as well. Once out-of-order transfers are permitted, all receiving replicas must be prepared to deal with data and schema conflicts that might arise.

As an example, assume that schema changes are critical and must be moved to the front of the replication queue. Now assume that a schema change deletes an attribute for some object class. It is possible that some of the operations ahead of the schema change in the queue are operations to delete values of the soon-to-be-deleted attribute so that the schema change can be done with no problems. If the schema change moves to the head of the queue, the consumer servers might have to delete an attribute that still has values, and then receive requests to delete the values of an attribute which is no longer defined.

In the multi-master case, similar situations can arise when simultaneous changes are made to different replicas. Thus, multi-master systems must have conflict resolution algorithms in place to handle such situations. But in the single-master case conflict resolution is not needed unless the master is allowed to send data out-of-order.
This is the reasoning behind requirement SM2, which recommends that data always be sent in order in single-master replication.

Note that even with this restriction, the concept of a critical OID is still useful in single-master replication. An example of its utility can be found in section A.7.

B.3. Schema Mismatches and Replication

Multi-vendor environments are the primary area of interest for LDAP replication standards. Some attention must thus be paid to the issue of schema mismatches, since they can easily arise when vendors deliver slightly different base schema with their directory products. Even when both products meet the requirements of the standards [RFC2252], the vendors may have included additional attributes or object classes with their products. When two different vendors' products attempt to replicate, these additions can cause schema mismatches. Another potential cause of schema mismatches is discussed in section A.3.

There are only a few possible responses when a mismatch is discovered.

. Raise an error condition and ignore the data. This should always be allowed and is the basis for requirement P8 and the comment on M11.
. Map/convert the data to the form required by the consuming replica. A system may choose this course; requirement M11 is intended to allow this option. The extent of the conversion is up to the implementation; in the extreme it could support use of the replication protocol in meta-directories.
. Quietly ignore (do not store on the consumer replica and do not raise an error condition) any data that does not conform to the schema at the consumer.

Requirement M11 is intended to exclude the last option.

Normal IETF practice in protocol implementation suggests that one be strict in what one sends and be flexible in what one receives.
The parallel in this case is that a supplier should be prepared to receive an error notification for any schema mismatch, but a consumer may choose to do a conversion instead.

The other option that can be considered in this situation is the use of fractional replication. If replication is set up so only the common attributes are replicated, mismatches can be avoided.

One additional consideration here is replication of the schema itself. M4 requires that it be possible to replicate schema. If a consumer replica is doing conversion, extreme care should be taken if schema elements are replicated since some attributes are intended to have different definitions on different replicas.

For fractional replication, the protocol designers and implementors should give careful consideration to the way they handle schema replication. Some options for schema replication include:

. All schema elements are replicated.
. Schema elements are replicated only if they are used by attributes that are being replicated.
. Schema are manually configured on the servers involved in fractional replication; schema elements are not replicated via the protocol.

B.4. Detecting and Repairing Inconsistencies Among Replicas

Despite the best efforts of designers, implementors, and operators, inconsistencies will occasionally crop up among replicas in production directories. Tools will be needed to detect and to correct these inconsistencies.

A special client may accomplish detection through periodic comparisons of replicas. This client would typically read two replicas of the same naming context and compare the answers, possibly by BINDing to each of the two replicas to be compared and reading them both. In cases where the directory automatically reroutes some requests (e.g. chaining), mechanisms to force access to a particular replica should be supplied.
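
The client-side comparison described above can be sketched as follows. This is only an illustration under stated assumptions: each replica's contents are taken to have already been read (for example by binding to each replica separately and searching the naming context) into an in-memory mapping from DN to attribute values, DNs are assumed to be normalized identically on both sides, and the names Snapshot and diff_replicas are hypothetical rather than part of any LDAP API.

```python
from typing import Dict, List, Set

# Hypothetical snapshot shape (an assumption for this sketch):
# DN -> attribute name -> set of values, as read from one replica.
Entry = Dict[str, Set[str]]
Snapshot = Dict[str, Entry]


def diff_replicas(a: Snapshot, b: Snapshot) -> Dict[str, List[str]]:
    """Report DNs missing from either snapshot and DNs whose attributes differ."""
    only_in_a = sorted(dn for dn in a if dn not in b)
    only_in_b = sorted(dn for dn in b if dn not in a)
    # Entries present on both sides are compared attribute by attribute;
    # values are sets, so ordering differences are not reported.
    differing = sorted(dn for dn in a if dn in b and a[dn] != b[dn])
    return {"only_in_a": only_in_a, "only_in_b": only_in_b, "differing": differing}


# Made-up example data standing in for two replicas of one naming context.
replica_a: Snapshot = {
    "cn=alice,dc=example,dc=com": {"mail": {"alice@example.com"}},
    "cn=bob,dc=example,dc=com": {"mail": {"bob@example.com"}},
}
replica_b: Snapshot = {
    "cn=alice,dc=example,dc=com": {"mail": {"alice@example.org"}},
    "cn=carol,dc=example,dc=com": {"mail": {"carol@example.com"}},
}

report = diff_replicas(replica_a, replica_b)
```

A report of this kind only says that the replicas differ. Deciding which side is correct still needs the replication history, and a comparison taken while replication is in progress may flag differences that are merely not yet propagated.
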
Alternatively, the server could support a special request to handle this situation. A client would invoke an operation at some server. It would cause that server to extract the contents from some other server it has a replication agreement with and report the differences back to the client as the result.

If an inconsistency is found, it needs to be repaired. To determine the appropriate repair, the administrator will need access to the replication history to figure out how the inconsistency occurred and what the correct repair should be.

When a repair is made, it should be restricted to the replica that needs to be fixed; the repair should not cause new replication events to be started. This may require special tools to change the local data store without triggering replication.

Requirements AM2, AM4, and AM5 address these needs.

B.5. Some Test Cases for Conflict Resolution in Multi-Master Replication

Use of multi-master replication inevitably leads to the possibility that incompatible changes will be made simultaneously on different servers. In such cases, conflict resolution algorithms must be applied.

As a guiding principle, conflict resolution should avoid surprising the user. One way to do this is to adopt the principle that, to the extent possible, conflict resolution should mimic the situation that would happen if there were a single server where all the requests were handled.

While this is a useful guideline, there are some situations where it is impossible to implement. Some of these cases are examined in this section. In particular, there are some cases where data will be "lost" in multi-master replication that would not be lost in a single-server configuration.

In the examples below, assume that there are three replicas, A, B, and C. All three replicas are updateable. Changes are made to replicas A and B before replication allows either replica to see the change made on the other.
In discussion of the multi-master cases, we assume that the change to A takes precedence using whatever rules are in force for conflict resolution.

B.5.1. Create-Create

A user creates a new instance of an object with distinguished name DN on A. At the same time, a different user adds an object with the same distinguished name on B.

In the single-server case, one of the create operations would have occurred before the other, and the second request would have failed.

In the multi-master case, each create was successful on its originating server. The problem is not detected until replication takes place. When a replication request to create a DN that already exists arrives at one of the servers, conflict resolution is invoked. (Note that the two requests can be distinguished even though they have the same DN because every object has some sort of unique identifier per requirement SC9.)

As noted above, in these discussions we assume that the change from replica A has priority based on the conflict resolution algorithm. Whichever change arrives first, requirement MM6 says that the values from replica A must be those in place on all replicas at the end of the replication cycle. Requirement MM5 states that the system cannot quietly ignore the values from replica B.

The values from replica B might be logged with some notice to the administrators, or they might be added to the DIT with a machine generated DN (again with notice to the administrators). If they are stored with a machine generated DN, the same DN must be used on all servers in the replica ring (otherwise requirement M3 would be violated). Note that in the case where the object in question is a container object, storage with a machine generated DN provides a place where descendant objects may be stored if any descendants were generated before the replication cycle was completed.
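
One possible shape for the create-create handling discussed above is sketched below. It is not a proposed algorithm: the precedence rule (earlier timestamp, ties broken by replica name), the conflictId attribute used to build the machine generated DN, and all type and function names are assumptions made for the example. The sketch only illustrates two points from the text: the losing values are kept and reported rather than dropped, and the machine generated DN is derived from the losing entry's unique identifier so that every replica computes the same name regardless of arrival order.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Create:
    dn: str
    unique_id: str                       # per-entry unique identifier (cf. requirement SC9)
    origin: str                          # replica that accepted the create
    timestamp: int                       # hypothetical ordering stamp
    attrs: Tuple[Tuple[str, str], ...]   # (attribute, value) pairs


def resolve_create_create(x: Create, y: Create,
                          dit: Dict[str, Create], notices: List[str]) -> str:
    """Keep the preferred entry at the contested DN and park the other one."""
    # Assumed precedence rule: earlier timestamp wins, ties go to the
    # lexically smaller replica name. The draft leaves the rule open.
    winner, loser = sorted((x, y), key=lambda c: (c.timestamp, c.origin))
    dit[winner.dn] = winner
    # Build the parking DN from the loser's unique id only, so the result does
    # not depend on which create arrived first. Splitting on the first comma
    # is a simplification that ignores escaped commas in RDN values.
    rdn, _, parent = loser.dn.partition(",")
    parked_dn = f"{rdn}+conflictId={loser.unique_id},{parent}"
    dit[parked_dn] = loser
    notices.append(
        f"create conflict on {loser.dn}: entry from {loser.origin} kept as {parked_dn}"
    )
    return parked_dn


a = Create("cn=printer,dc=example,dc=com", "id-a", "A", 100, (("location", "floor 1"),))
b = Create("cn=printer,dc=example,dc=com", "id-b", "B", 100, (("location", "floor 2"),))

dit_ab: Dict[str, Create] = {}
notices_ab: List[str] = []
resolve_create_create(a, b, dit_ab, notices_ab)   # A's create seen first

dit_ba: Dict[str, Create] = {}
notices_ba: List[str] = []
resolve_create_create(b, a, dit_ba, notices_ba)   # B's create seen first
```

Both arrival orders leave the same two entries in place, which is the property the text asks for when it says the same machine generated DN must be used on all servers in the replica ring.
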
In any case, some mechanism must be provided to allow the administrator to reverse the conflict resolution algorithm and force the values originally created on B into place on all replicas if desired.

B.5.2. Rename-Rename

On replica A, an object with distinguished name DN1 is renamed to DN. At the same time on replica B, an object with distinguished name DN2 is renamed to DN.

In the single-server case, one rename operation would occur before the other and the second would fail since the target name already exists.

In the multi-master case, each rename was successful on its originating server. Assuming that the change on A has priority in the conflict resolution sense, DN will be left with the values from DN1 in all replicas and DN1 will no longer exist in any replica. The question is what happens to DN2 and its original values.

Requirement MM5 states that these values must be stored somewhere. They might be logged, they might be left in the DIT as the values of DN2, or they might be left in the DIT as the values of some machine generated DN. Leaving them as the values of DN2 is attractive since it is the same as the single-server case, but if a new DN2 has already been created before the replication cycle finishes, there are some very complex cases to resolve. Any of the solutions described in this paragraph would be consistent with requirement MM5.

B.5.3. Locking Based on Atomicity of ModifyRequest

There is an object with distinguished name DN which contains attributes X, Y, and Z. The value of X is 1. On replica A, a ModifyRequest is processed which includes modifications to change the value of X from 1 to 0 and to set the value of Y to "USER1". At the same time, replica B processes a ModifyRequest which includes modifications to change the value of X from 1 to 0 and to set the value of Y to "USER2" and the value of Z to 42.
The application in this case is using X as a lock and is depending on the atomic nature of ModifyRequests to provide mutual exclusion for lock access.

In the single-server case, the two operations would have occurred sequentially. Since a ModifyRequest is atomic, the entire first operation would succeed. The second ModifyRequest would fail, since the value of X would be 0 when it was attempted, and the modification changing X from 1 to 0 would thus fail. The atomicity rule would cause all other modifications in the ModifyRequest to fail as well.

In the multi-master case, it is inevitable that at least some of the changes will be reversed despite the use of the lock. Assuming the changes from A have priority per the conflict resolution algorithm, the value of X should be 0 and the value of Y should be "USER1". The interesting question is the value of Z at the end of the replication cycle. If it is 42, the atomicity constraint on the change from B has been violated. But for it to revert to its previous value, grouping information must be retained and it is not clear when that information can be safely discarded. Thus, requirement G6 may be violated.

B.5.4. General Principles

With multi-master replication there are a number of cases where a user or application will complete a sequence of operations with a server but those actions are later "undone" because someone else completed a conflicting set of operations at another server.

To some extent, this can happen in any multi-user system. If a user changes the value of an attribute and later reads it back, intervening operations by another user may have changed the value. In the multi-master case, the problem is worsened, since techniques used to resolve the problem in the single-server case won't work, as shown in the examples above.

The major question here is one of intended use.
In LDAP standards work, it has long been said that replication provides "loose consistency" among replicas. At several IETF meetings and on the mailing list, usage examples from finance where locking is required have been declared poor uses for LDAP. Requirement G1 is consistent with this history. But if loose consistency is the goal, the locking example above is an inappropriate use of LDAP, at least in a replicated environment.

B.5.5. Avoiding the Problem

The examples above discuss some of the most difficult problems that can arise in multi-master replication. While they can be dealt with, dealing with them is difficult and can lead to situations that are quite confusing to the application and to users. The common characteristics of the examples are:

. Several directory users/applications are changing the same data
. They are changing the data at the same time
. They are using different directory servers to make these changes
. They are changing data that are parts of a distinguished name or they are using ModifyRequest to both read and write a given attribute value in a single atomic request

If any one of these conditions is reversed, the types of problems described above will not occur. There are many useful applications of multi-master directories where at least one of the above conditions does not occur. For cases where all four do occur, application designers should be aware of the possible consequences.

B.6. Data Privacy During Replication

Directories will frequently hold proprietary information. Policy information, name and address information, and customer lists can be quite proprietary and are likely to be stored in directories. Such data must be protected during replication.

In some cases, the network environment (e.g. a private network) may provide sufficient privacy for the application.
In other cases, the data in the directory may be public and not require protection. For these reasons data privacy was not made a requirement for all replication sessions. But there are a substantial number of applications that will need data privacy, so there is a requirement (S2) that the protocol allow for data privacy in those cases where it is needed.

This leaves the question of what privacy mechanism(s) to use. While this is ultimately a design/implementation decision, replication across different vendors' directory products is an important goal of the LDAP replication work at the IETF. If different vendors choose to support different data privacy mechanisms, the advantages of a standard replication protocol would be lost. Thus there is a requirement (S3) for a mandatory-to-implement data privacy mechanism.

B.7. Failover in Single-Master Systems

In a single-master system, all modifications must originate at the master. The master is therefore a single point of failure for modifications. This can cause concern when high availability is a requirement for the directory system.

One way to reduce the problem is to provide a failover process that converts a slave replica to master when the original master fails. The time required to execute the failover process then becomes a major factor in availability of the system as a whole.

Factors that designers and implementors should consider when working on failover include:

. If the master replica contains control information or meta-data that is not part of the slave replica(s), this information will have to be inserted into the slave which is being "promoted" to master as part of the failover process. Since the old master is presumably unavailable at this point, it may be difficult to obtain this data.
For example, if the master holds the status information of all replicas, but each slave replica only holds its own status information, failover would require that the new master get the status of all existing replicas, presumably from those replicas. Similar issues could arise for replication agreements if the master is the only system that holds a complete set.
. If data privacy mechanisms (e.g. encryption) are in use during replication, the new master would need to have the necessary key information to talk to all of the slave replicas.
. It is not only the new master that needs to be reconfigured. The slaves also need to have their configurations updated so they know where updates should come from and where they should refer modifications.
. The failover mechanism should be able to handle a situation where the old master is "broken" but not "dead". The slave replicas should ignore updates from the old master after failover is initiated.
. The old master will eventually be repaired and returned to the replica ring. It might join the ring as a slave and pick up the changes it has "missed" from the new master, or there might be some mechanism to bring it into sync with the new master and then let it take over as master. Some resynchronization mechanism will be needed.
. Availability would be maximized if the whole failover process could be automated (e.g. failover is initiated by an external system when it determines that the original master is not functioning properly).

B.8. Including Operational Attributes in Atomic Operations

LDAPv3 [RFC2251] declares that some operations are atomic (e.g. all of the modifications in a single ModifyRequest). It also defines several operational attributes that store information about when changes are made to the directory (createTimestamp, etc.) and which ID was responsible for a given change (modifiersName, etc.).
Currently, there is no statement in RFC2251 requiring that changes to these operational attributes be atomic with the changes to the data. It is RECOMMENDED that this requirement be added during the revision of RFC2251. In the interim, replication SHOULD treat these operations as though such a requirement were in place.

Authors' Addresses

Russel F. Weiser
Digital Signature Trust Co.
1095 East 2100 South
Suite #201
Salt Lake City, Utah 84106
USA
E-mail: rweiser@digsigtrust.com
Telephone: +1 801 246 4323
Fax: +1 801 246 4361

Ellen J. Stokes
Tivoli Systems
6300 Bridgepoint Parkway
Austin, Texas 78731
USA
E-mail: estokes@tivoli.com
Telephone: +1 512 436 9098
Fax: +1 512 436 1199

Ryan D. Moats
Coreon, Inc.
15621 Drexel Circle
Omaha, NE 68135
USA
E-mail: rmoats@coreon.com
Telephone: +1 402 894 9456

Richard V. Huber
Room C3-3B30
AT&T Laboratories
200 Laurel Avenue South
Middletown, NJ 07748
USA
E-mail: rvh@att.com
Telephone: +1 732 420 2632
Fax: +1 732 368 1690

Full Copyright Statement

Copyright (C) The Internet Society (2000). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works.
However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.