Network Working Group N. Barry Internet-Draft Stellar Development Foundation Intended status: Experimental D. Mazieres Expires: September 6, 2018 Stanford University J. McCaleb Stellar Development Foundation S. Polu Stripe Inc. March 5, 2018 The Stellar Consensus Protocol (SCP) draft-mazieres-dinrg-scp-00 Abstract SCP is a protocol for Internet-level Byzantine agreement. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on September 6, 2018. Copyright Notice Copyright (c) 2018 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of Barry, et al. Expires September 6, 2018 [Page 1] Internet-Draft scp March 2018 the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 2. The Model . . . . . . . . . . . . . . . . . . . . . . . . . . 3 2.1. Configuration . . . . . . . . . . . . . . . . . . . . . . 3 2.2. Input and output . . . . . . . . . . . . . . . . . . . . 3 3. Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . 4 3.1. Basic types . . . . . . . . . . . . . . . . . . . . . . . 4 3.2. Quorum slices . . . . . . . . . . . . . . . . . . . . . . 5 3.3. Nomination . . . . . . . . . . . . . . . . . . . . . . . 5 3.4. Prepare messages . . . . . . . . . . . . . . . . . . . . 7 3.5. Confirm messages . . . . . . . . . . . . . . . . . . . . 9 3.6. Externalize messages . . . . . . . . . . . . . . . . . . 10 3.7. Message envelopes . . . . . . . . . . . . . . . . . . . . 11 4. Security considerations . . . . . . . . . . . . . . . . . . . 12 5. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 12 6. References . . . . . . . . . . . . . . . . . . . . . . . . . 13 6.1. Normative References . . . . . . . . . . . . . . . . . . 13 6.2. Informative References . . . . . . . . . . . . . . . . . 13 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13 1. Introduction Various aspects of Internet infrastructure depend on irreversible and transparent updates to data sets such as authenticated mappings [cite Li-Man-Watson draft]. Examples include public key certificates mapping domain names to public keys, transparency logs [RFC6962], HSTS [RFC6797] and HPKP [RFC7469] preload lists. The Stellar Consensus Protocol (SCP) specified in this draft allows various Internet infrastructure stakeholders to collaborate in applying irreversible transactions to public state. SCP is an open Byzantine agreement protocol that resists Sybil attacks by allowing individual parties to specify minimum quorum memberships in terms of specific trusted peers. Hence, participants can choose specific peers they know to be trustworthy and enjoy safety so long as the peers they depend on honestly obey the protocol. An more detailed abstract description of the SCP protocol and English-language proof of safety is available in [SCP]. This document specifies the end-system logic and wire format of the messages. Barry, et al. Expires September 6, 2018 [Page 2] Internet-Draft scp March 2018 2. The Model 2.1. Configuration Each participant or _node_ in the SCP protocol has a digital signature key and is named by the corresponding public key, which we term a "NodeID". Each node also selects one or more groups of nodes (each of which includes itself) called _quorum slices_. A quorum slice represents a large or important enough set of peers that the node selecting the quorum slice believes the slice collectively speaks for the whole network. A _quorum_ is a non-empty set of nodes containing at least one quorum slice of each of its members. For instance, suppose "v1" has the single quorum slice "{v1, v2, v3}", and each of "v2", "v3", and "v4" has the single quorum slice "{v2, v3, v4}". In this case, "{v2, v3, v4}" is a quorum, because it contains a slice for each member. On the other hand "{v1, v2, v3}" is not a quorum, because it does not contain a quorum slice for "v2" or "v3". The smallest quorum including "v1" in this example is the set of all nodes "{v1, v2, v3, v4}". Unlike traditional Byzantine agreement protocols, nodes in SCP only care about quorums to which they belong themselves (and hence which contain at least one of their quorum slices). Intuitively, this is what protects nodes from Sybil attacks. In the example above, if "v3" deviates from the protocol maliciously invents 96 Sybils "v5, v6, ..., v100", the honest nodes quorums will all still include one another. Every message in the SCP protocol specifies the sender's quorum slices. Hence, by collecting messages a node dynamically learns what constitutes a quorum, and can decide when a particular message has been sent by a quorum to which it belongs. (Again, nodes to not care about quorums to which they do not belong themselves.) 2.2. Input and output The SCP protocol outputs a series of _values_ each associated with a consecutively numbered _slot_. 5 seconds after SCP outputs the value for one slot, nodes restart the protocol to select a value for the next slot. The time between slots is used to construct candidate values for the next slot, as well as to amortize the cost of consensus over an arbitrary-sized batch of operations. Barry, et al. Expires September 6, 2018 [Page 3] Internet-Draft scp March 2018 There must exist a deterministic combining function that reduces multiple candidate values into a single _composite_ value. Hence, should nodes nominate multiple values, they can still deterministically converge on the same composite value. Note that this does not mean that every node has a say in the composite value for every slot. Whether or not a node can in practice influence a given slot depends on how many other nodes include that node in their quorum slices as well as how the current slot number perturbs a cryptographic hash of nodes' public keys. 3. Protocol The protocol consists of exchanging digitally-signed messages containing nodes' quorum slices. When multiple nodes send the same messages, two thresholds have significance throughout the protocol for each node "v". o quorum threshold: When "v" node is a member of a quorum in which every member (including "v") has issued some signed statement. o blocking threshold: When every one of "v"'s quorum slices has a member issuing some signed statement (which doesn't necessarily include "v"). Message formats are specified using XDR [RFC4506]. 3.1. Basic types SCP employs 32- and 64-bit integers, as defined below. typedef unsigned int uint32; typedef int int32; typedef unsigned hyper uint64; typedef hyper int64; SCP uses the SHA-256 cryptograhpic hash function [RFC6234], and represents hash values as a simple array of 32 bytes. typedef opaque Hash[32]; SCP employes the Ed25519 digital signature algorithm [RFC8032]. For purposes of cryptographic agility, however, public keys are represented as a union type that can be compatibly extended with other key types. Barry, et al. Expires September 6, 2018 [Page 4] Internet-Draft scp March 2018 typedef opaque uint256[32]; enum PublicKeyType { PUBLIC_KEY_TYPE_ED25519 = 0 }; union PublicKey switch (PublicKeyType type) { case PUBLIC_KEY_TYPE_ED25519: uint256 ed25519; }; // variable size as the size depends on the signature scheme used typedef opaque Signature<64>; Nodes are public keys, while values are simply opaque arrays of byes. typedef PublicKey NodeID; typedef opaque Value<>; 3.2. Quorum slices Theoretically a quorum slice can be an arbitrary set of sets of nodes. However, it would not be possible to represent an arbitrary predicate on sets concisely. Instead we specify quorum slices as any set of k-of-n members, where each of the n members can either be an individual node ID, or, recursively, another k-of-n predicate. // supports things like: A,B,C,(D,E,F),(G,H,(I,J,K,L)) // only allows 2 levels of nesting struct SCPQuorumSet { uint32 threshold; // the n in k-of-n PublicKey validators<>; SCPQuorumSet innerSets<>; }; 3.3. Nomination Nomination message are sent in the first phase of each slot in an attempt to select a value on which to try to agree. Barry, et al. Expires September 6, 2018 [Page 5] Internet-Draft scp March 2018 struct SCPNomination { Hash quorumSetHash; // D Value votes<>; // X Value accepted<>; // Y }; As with all messages, the nomination protocol includes a SHA-256 hash of the sender's XDR-encoded SCPQuorumSet. It also includes two monotonically-growing sets of values. "votes" consists of candidate values nominated by the sender. Each node progresses through a series of nomination _rounds_ in which it potentially adds more and more values to "votes" by adding "votes" for valid values received from a growing set of peers to its own set of "votes". Each node computes the set of peers whose nomination "votes" it should echo based on its quorum slices as follows for round "n" of slot "i": o Let "Gi(m) = SHA-256(i || output[i-1] || m)", where "output[i-1]" is the consensus output of slot i-1 or the zero-byte value for slot 0. (Recall values are encoded as an XDR opaque vector, with a 32-byte length followed by bytes padded to a multiple of 4 bytes.) Treat the output of "Gi" as a 256-bit binary number in big-endian format. o For each peer "v", define "weight(v)" as the faction of quorum slices containing "v". o Define the set of nodes "neighbors(n)" as the set of nodes v for which "Gi("N" || n || v) < 2^{256}-1 * weight(v)". o Define "priority(n, v)" as "Gi("P" || n || v)". For each round "n" until nomination starts, a node picks the available peer "v" with the highest value of priority in the given round from among the nodes in "neighbors(n)", and adds any values in that node's "votes" set to its own "votes" set. Round "n" lasts for "2+n" seconds, after which if nomination has not finished (see below), a node proceeds to round "n+1". If a particular valid value "x" reaches quorum threshold in the messages sent by peers (meaning that every node in a quorum contains "x" either in the "votes" or the "accepted" field), then a node moves "x" from its "votes" field to its "accepted" field and broadcasts a new "SCPNomination" message. Similarly, if "x" reaches blocking threshold in a node's peers' "accepted" field (meaning every one of a node's quorum slices contains at least one node with "x" in its Barry, et al. Expires September 6, 2018 [Page 6] Internet-Draft scp March 2018 "accepted" field), then the node adds "x" to its "accepted" field (removing it from "votes" if applicable). The nomination process terminates at a node when any value "x" reaches quorum threshold in the "accepted" fields, meaning every node in a quorum that includes the node has sent a signed "SCPNomination" message in which the "accepted" field contains "x". Once nomination terminates, nodes stop adding new values to their "votes" sets, but potentially continues adding new values to "accepted" as previously described. 3.4. Prepare messages Once the nomination process is complete, meaning at least one "accepted" value has reached quorum threshold, a node moves on to the balloting phase of the protocol. A ballot is a pair of a counter ("n") and candidate value ("x"): struct SCPBallot { uint32 counter; // n Value value; // x }; The counter beings at 1, and is incremented to higher numbers if ballot 1 fails to reach consensus on an output value. The value "x" begins as the output of the deterministic combining function on all values that have achieved quorum threshold in the "accepted" fields of "SCPNomination" messages (meaning all values for which the local node is part of a quorum in which every node includes the value in the "accepted" field of an "SCPNomination" message). Note, however, that "x" may change in subsequent ballots if more values reach this quorum threshold (since nomination keeps running in the background, even though a node does not add new values to its "votes" field in "SCPNomination"). Moreover, as described below, "x" may change if a particular ballot makes it a certain way through the protocol before failing. Once they have selected "x" for a particular ballot, nodes attempt to "prepare" the ballot by sending the following message: Barry, et al. Expires September 6, 2018 [Page 7] Internet-Draft scp March 2018 struct SCPPrepare { Hash quorumSetHash; // D SCPBallot ballot; // b SCPBallot* prepared; // p SCPBallot* preparedPrime; // p' uint32 nC; // c.n uint32 nH; // h.n }; The fields have the following meaning: o "quorumSetHash" - the SHA-256 hash of the sending node's "SCPQuorumSet" structure. o "ballot" - the current ballot (see below) o "prepared" - the highest ballot "" for which one of the following two thresholds has been met: * The sending node is part of a quorum in which each node included "" for "n' >= n" in one of the "ballot", "prepared" or "preparedPrime" fields of an "SCPPrepare" message. * Every one of the sending node's quorum slices contains at least one node that sent an "SCPPrepare" containing some "" with "n' >= n" in either the "prepared" or "preparedPrime" field. If no such ballot exists, then "prepared" is "NULL". o "preparedPrime" - the highest ballot "" that satisfies the same criteria as "prepared", but has a different value "x" from that in the "prepared" field. "NULL" if no such ballot exists. o "nH" - the counter from the highest ballot "" for which the sender is in a quorum in which each member has sent an "SCPPrepare" message with one of the following properties (or "nH = 0" if no such ballot exists): * The "prepared" field contains "" where "n' >= n", or * The "preparedPrime" field contains "" where "n' >= n" and "x'" can be any value. o "nC" - the counter for the lowest ballot the sender is attempting to confirm (see below), otherwise 0. Barry, et al. Expires September 6, 2018 [Page 8] Internet-Draft scp March 2018 The "value" field "x" of each ballot is selected as follows. If "nH = 0", then use the deterministic combination function on all values that have made it all the way through the nomination protocol to produce a candidate value. Otherwise, use the value "x" associated with the ballot that produced "nH". The "counter" field "n" of each ballot is selected as follows. o Initially, "n = 1". o If a node is still sending "SCPPrepare" or "SCPConfirm" messages (meaning it has not yet output a value for a particular slot), then a node arms a timer whenever "n" reaches quorum threshold of ballots (meaning the node is a member of a quorum in which each member is sending explicit or implicit "SCPPrepare" messages with ballot "counter" >= "n"). o If the timer fires, a node increments "n" and determines a new value depending on the latest state of the nomination protocol and "nH", as discussed above. o If a "counter" value greater than "n" ever reaches a blocking threshold, then a node immediately disables any pending timer and increases "n" to the blocking "counter" value, recomputing "value" in the usual way. The value "nC" is maintained based on an internally-maintained "commit ballot" "c", where initially "c = cNULL = <0, NULL>" (for some arbitrary or invalid value "NULL"). o If either "prepared" or "preparedPrime" has "counter" greater than "c"'s and a different "value", then reset "c = cNULL". o If "c = cNULL" and the ballot determining "nH" hasn't been aborted by "prepared" or "preparedPrime", then set "c" to the lowest ballot containing the value of that ("nH") ballot that is between "ballot" and the "nH" ballot. o If some ballot with the same value reaches quorum threshold betwen "c" and the "nH" ballot, move to the confirm phase. 3.5. Confirm messages Barry, et al. Expires September 6, 2018 [Page 9] Internet-Draft scp March 2018 struct SCPConfirm { SCPBallot ballot; // b uint32 nPrepared; // p.n uint32 nCommit; // c.n uint32 nH; // h.n Hash quorumSetHash; // D }; This message conveys an implicit "SCPPrepare" message with the following fields: o "quorumSetHash" - same o "ballot" - "" where "x" is the value in the "SCPConfirm message's"ballot`. o "prepared" - same o "preparedPrime" - "NULL" o "nC" - same o "nH" - "infinity" (Note that "infinity" here just represents 2^{32} -- an integer one greater than the largest representable value on the wire.) 3.6. Externalize messages struct SCPExternalize { SCPBallot commit; // c uint32 nH; // h.n Hash commitQuorumSetHash; // D used before EXTERNALIZE } externalize; An "SCPExternalize" message conveys two implicit "SCPConfirm" messages. The first one has the following fields: o "ballot" - "" where "x" is the value from the "SCPExternalize" ballot. o "nPrepared" - "infinity" o "nCommit" - the "counter" field from the "SCPExternalize" messages "commit" field. Barry, et al. Expires September 6, 2018 [Page 10] Internet-Draft scp March 2018 o "nH" - "infinity" o "quorumSetHash" - the "commitQuorumSetHash" from the "SCPExternalize" message. The second one has the same fields as the first, except for the last two: o "nH" - the "nH" value from the "SCPExternalize" message o "quorumSetHash" - a trivial "SCPQuorumSet" that contains a single slice whose only member is the sender of the "SCPExternalize" message 3.7. Message envelopes In order to provide full context for each signed message, all signed messages are part of an "SCPStatement" union type that includes the "slotIndex" naming the slot to which the message applies, as well as the "type" of the message. A signed message and its signature are packed together in an "SCPEnvelope" structure. Barry, et al. Expires September 6, 2018 [Page 11] Internet-Draft scp March 2018 enum SCPStatementType { SCP_ST_PREPARE = 0, SCP_ST_CONFIRM = 1, SCP_ST_EXTERNALIZE = 2, SCP_ST_NOMINATE = 3 }; struct SCPStatement { NodeID nodeID; // v (node signing message) uint64 slotIndex; // i union switch (SCPStatementType type) { case SCP_ST_PREPARE: SCPPrepare prepare; case SCP_ST_CONFIRM: SCPConfirm confirm; case SCP_ST_EXTERNALIZE: SCPExternalize externalize; case SCP_ST_NOMINATE: SCPNomination nominate; } pledges; }; struct SCPEnvelope { SCPStatement statement; Signature signature; }; 4. Security considerations If nodes do not pick quorum slices well, the protocol will not be safe. 5. Acknowledgments The Stellar development foundation supported development of the protocol and produced the first production deployment of SCP. The IRTF DIN group including Dirk Kutscher, Sydney Li, Colin Man, Melinda Shore, and Jean-Luc Watson helped with the framing and motivation for this specification. Barry, et al. Expires September 6, 2018 [Page 12] Internet-Draft scp March 2018 6. References 6.1. Normative References [RFC4506] Eisler, M., Ed., "XDR: External Data Representation Standard", STD 67, RFC 4506, DOI 10.17487/RFC4506, May 2006, . [RFC6234] Eastlake 3rd, D. and T. Hansen, "US Secure Hash Algorithms (SHA and SHA-based HMAC and HKDF)", RFC 6234, DOI 10.17487/RFC6234, May 2011, . [RFC8032] Josefsson, S. and I. Liusvaara, "Edwards-Curve Digital Signature Algorithm (EdDSA)", RFC 8032, DOI 10.17487/RFC8032, January 2017, . 6.2. Informative References [RFC6797] Hodges, J., Jackson, C., and A. Barth, "HTTP Strict Transport Security (HSTS)", RFC 6797, DOI 10.17487/RFC6797, November 2012, . [RFC6962] Laurie, B., Langley, A., and E. Kasper, "Certificate Transparency", RFC 6962, DOI 10.17487/RFC6962, June 2013, . [RFC7469] Evans, C., Palmer, C., and R. Sleevi, "Public Key Pinning Extension for HTTP", RFC 7469, DOI 10.17487/RFC7469, April 2015, . [SCP] Mazieres, D., "The Stellar Consensus Protocol: A Federated Model for Internet-level Consensus", Stellar Development Foundation whitepaper , 2015, . Authors' Addresses Nicolas Barry Stellar Development Foundation 170 Capp St., Suite A San Francisco, CA 94110 US Email: nicolas@stellar.org Barry, et al. Expires September 6, 2018 [Page 13] Internet-Draft scp March 2018 David Mazieres Stanford University 353 Serra Mall, Room 290 Stanford, CA 94305 US Email: dm@uun.org Jed McCaleb Stellar Development Foundation 170 Capp St., Suite A San Francisco, CA 94110 US Email: jed@stellar.org Stanislas Polu Stripe Inc. 185 Berry Street, Suite 550 San Francisco, CA 94107 US Email: stan@stripe.com Barry, et al. Expires September 6, 2018 [Page 14]