Independent Stream                                       M. Schanzenbach
Internet-Draft                                          Fraunhofer AISEC
Intended status: Informational                               C. Grothoff
Expires: 1 July 2023                               Berner Fachhochschule
                                                                  B. Fix
                                                             GNUnet e.V.
                                                        28 December 2022


                     The R5N Distributed Hash Table
                         draft-schanzen-r5n-01

Abstract

   This document contains the R^5N DHT technical specification.  R^5N is
   a secure distributed hash table (DHT) routing algorithm and data
   structure for decentralized applications.  It features an open peer-
   to-peer overlay routing mechanism which supports ad-hoc
   permissionless participation and supports topologies in
   restricted-route environments.

   This document defines the normative wire format of protocol messages,
   routing algorithms, cryptographic routines and security
   considerations for use by implementers.

   This specification was developed outside the IETF and does not have
   IETF consensus.  It is published here to guide implementation of R^5N
   and to ensure interoperability among implementations including the
   pre-existing GNUnet implementation.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 1 July 2023.


Schanzenbach, et al.       Expires 1 July 2023                  [Page 1]

Internet-Draft       The R5N Distributed Hash Table        December 2022


Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
     1.1.  Requirements Notation . . . . . . . . . . . . . . . . . .   4
   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   4
   3.  Overview  . . . . . . . . . . . . . . . . . . . . . . . . . .   6
   4.  Underlay  . . . . . . . . . . . . . . . . . . . . . . . . . .   7
   5.  Routing . . . . . . . . . . . . . . . . . . . . . . . . . . .   9
     5.1.  Routing Table . . . . . . . . . . . . . . . . . . . . . .  10
     5.2.  Peer Discovery  . . . . . . . . . . . . . . . . . . . . .  10
     5.3.  Peer Bloom Filter . . . . . . . . . . . . . . . . . . . .  11
     5.4.  Routing Functions . . . . . . . . . . . . . . . . . . . .  12
     5.5.  Pending Table . . . . . . . . . . . . . . . . . . . . . .  13
   6.  Message Processing  . . . . . . . . . . . . . . . . . . . . .  14
     6.1.  Message components  . . . . . . . . . . . . . . . . . . .  14
       6.1.1.  Header  . . . . . . . . . . . . . . . . . . . . . . .  14
       6.1.2.  Flags . . . . . . . . . . . . . . . . . . . . . . . .  15
       6.1.3.  Path Element  . . . . . . . . . . . . . . . . . . . .  15
     6.2.  HelloMessage  . . . . . . . . . . . . . . . . . . . . . .  20
       6.2.1.  Wire Format . . . . . . . . . . . . . . . . . . . . .  20
       6.2.2.  Processing  . . . . . . . . . . . . . . . . . . . . .  21
     6.3.  PutMessage  . . . . . . . . . . . . . . . . . . . . . . .  21
       6.3.1.  Wire Format . . . . . . . . . . . . . . . . . . . . .  22
       6.3.2.  Processing  . . . . . . . . . . . . . . . . . . . . .  23
     6.4.  GetMessage  . . . . . . . . . . . . . . . . . . . . . . .  25
       6.4.1.  Wire Format . . . . . . . . . . . . . . . . . . . . .  25
       6.4.2.  Result Filter . . . . . . . . . . . . . . . . . . . .  27
       6.4.3.  Processing  . . . . . . . . . . . . . . . . . . . . .  27
     6.5.  ResultMessage . . . . . . . . . . . . . . . . . . . . . .  29
       6.5.1.  Wire Format . . . . . . . . . . . . . . . . . . . . .  29
       6.5.2.  Processing  . . . . . . . . . . . . . . . . . . . . .  31
   7.  Blocks  . . . . . . . . . . . . . . . . . . . . . . . . . . .  33
     7.1.  Block Operations  . . . . . . . . . . . . . . . . . . . .  33
     7.2.  HELLO Blocks  . . . . . . . . . . . . . . . . . . . . . .  35
     7.3.  Persistence . . . . . . . . . . . . . . . . . . . . . . .  38
       7.3.1.  Approximate Search Considerations . . . . . . . . . .  39
       7.3.2.  Caching Strategy Considerations . . . . . . . . . . .  39
   8.  Security Considerations . . . . . . . . . . . . . . . . . . .  40
     8.1.  Approximate Result Filtering  . . . . . . . . . . . . . .  40
     8.2.  Access control  . . . . . . . . . . . . . . . . . . . . .  40
   9.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  40
     9.1.  GNUnet URI Scheme Registration  . . . . . . . . . . . . .  41
     9.2.  R5N URI Scheme Registration . . . . . . . . . . . . . . .  41
   10. GANA Considerations . . . . . . . . . . . . . . . . . . . . .  41
     10.1.  Block Type Registry  . . . . . . . . . . . . . . . . . .  41
     10.2.  GNUnet URI schema Subregistry  . . . . . . . . . . . . .  42
     10.3.  GNUnet Signature Purpose Registry  . . . . . . . . . . .  42
     10.4.  GNUnet Message Type Registry . . . . . . . . . . . . . .  42
   11. Test Vectors  . . . . . . . . . . . . . . . . . . . . . . . .  43
   12. Normative References  . . . . . . . . . . . . . . . . . . . .  43
   13. Informative References  . . . . . . . . . . . . . . . . . . .  44
   Appendix A.  Bloom filters in R^5N  . . . . . . . . . . . . . . .  44
   Appendix B.  Overlay Operations . . . . . . . . . . . . . . . . .  46
     B.1.  GET operation . . . . . . . . . . . . . . . . . . . . . .  46
     B.2.  PUT operation . . . . . . . . . . . . . . . . . . . . . .  47
   Appendix C.  HELLO URLs . . . . . . . . . . . . . . . . . . . . .  48
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  49

1.  Introduction

   This specification describes the protocol of R^5N.  R^5N is a
   Distributed Hash Table (DHT).  The name is an acronym for "randomized
   recursive routing for restricted-route networks" and its first
   academic description can be found in [R5N].

   DHTs are a key data structure for the construction of decentralized
   applications.  They generally provide a robust and efficient means to
   distribute the storage and retrieval of key-value pairs.

   The core idea behind R^5N is to combine an initial randomized routing
   algorithm with an efficient, classical closest-peer algorithm.  This
   allows us to construct an algorithm that is able to escape and
   circumvent restricted-route environments while at the same time
   allowing for O(log n) routing complexity.

   R^5N also includes advanced features like tracing the paths that
   messages take through the network, response filters and on-path
   application-specific data validation.

   This document defines the normative wire format of peer-to-peer
   messages, routing algorithms, cryptographic routines and security
   considerations for use by implementors.


1.1.  Requirements Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  Terminology

   Address  is a UTF-8 [RFC3629] URI [RFC3986] which can be used as an
      address to contact a peer.  An example of an addressing scheme
      used in this document is "r5n+ip+tcp", which refers to a standard
      TCP/IP socket connection.  The "hier"-part of the URI must provide
      a suitable address for the given addressing scheme.  The following
      are non-normative examples of address strings:

      r5n+ip+udp://1.2.3.4:6789/
      gnunet+tcp://12.3.4.5/

                       Figure 1: Example Address URIs.

   Applications  Applications are components which directly use the DHT
      overlay interfaces.  Possible applications include the GNU Name
      System [I-D.schanzen-gns] and the CADET transport system [cadet].

   Application API  The application API exposes the core operations of
      the DHT overlay to applications.  This includes storing blocks in
      the DHT and retrieving blocks from the DHT.

   Block  Variable-size unit of payload stored in the DHT under a Key.
      Commonly also called a "value" when talking about a DHT as a "key-
      value store".

   Block Storage  The Block Storage component is used to persist and
      manage Block data by peers.  It includes logic for enforcing
      storage quotas, caching strategies and data validation.

   Block-Type  A unique 32-bit value identifying the data format of a
      Block.  Block-Types are either private or registered in the GANA
      block type registry (see Section 10.1).

   Initiator  The peer that initially creates and sends a message
      (Section 6.2, Section 6.3, Section 6.4, Section 6.5).

   HELLO block  A HELLO block is a block with a dedicated block type and
      is specified in this document.  The HELLO block is used to store
      and retrieve Peer addresses.  In this document, HELLO blocks are
      used by the peer discovery mechanism.

   HELLO URL  HELLO URLs are URL-formatted HELLO blocks.  They can be
      used for out-of-band exchanges of peer information and are used
      for address update signalling messages to neighbours.

   Key  512-bit identifier of a location in the DHT.  Multiple Blocks
      can be stored under the same key.  Peer Addresses are valid keys.

   Message Processing  The Message Processing component processes
      requests from and generates responses to applications and the
      underlay network.

   Neighbor  A neighbor is a peer which is directly able to communicate
      with our peer via the Underlay Interface.

   Peer  A host that is participating in the overlay.  Peers are
      responsible for holding some portion of the data that has been
      stored in the overlay, and they are responsible for routing
      messages on behalf of other hosts as needed by the Routing
      Algorithm.

   Peer Address  The Peer Address is the identifier used on the Overlay
      to address a peer.  It is a SHA-512 hash of the Peer ID.

   Peer ID  The Peer ID is the public key which is used to authenticate
      a peer in the underlay.  The Peer ID is the public key of the
      corresponding Ed25519 [ed25519] peer private key.

   Routing  The Routing component includes the routing table as well as
      routing and peer selection logic.  It facilitates the R^5N routing
      algorithm with required data structures and algorithms.

   Responsible Peer  The peer N that is responsible for a specific key
      K, as defined by the SelectClosestPeer(K, P) algorithm (see
      Section 5).

   Underlay Interface  The Underlay Interface is an abstraction layer on
      top of the supported links of a peer.  Peers may be linked by a
      variety of different transports, including "classical" protocols
      such as TCP, UDP and TLS or advanced protocols such as GNUnet, I2P
      or Tor.
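
   As a non-normative illustration of the Peer ID, Peer Address and Key
   definitions above, the following Python sketch derives a 512-bit
   Peer Address from a 32-byte Ed25519 public key.  The function name
   peer_address and the placeholder key bytes are assumptions made for
   this example only.

```python
import hashlib

PEER_ID_BYTES = 32   # an Ed25519 public key is 32 bytes long
KEY_BYTES = 64       # Keys and Peer Addresses are 512 bits long


def peer_address(peer_id: bytes) -> bytes:
    """Return the overlay address of a peer: SHA-512 over its Peer ID."""
    if len(peer_id) != PEER_ID_BYTES:
        raise ValueError("an Ed25519 public key must be 32 bytes")
    return hashlib.sha512(peer_id).digest()


# Placeholder bytes standing in for a real public key (illustration only).
example_id = bytes(range(32))
example_address = peer_address(example_id)
```

   Because the SHA-512 output is 512 bits long, the resulting Peer
   Address can be used directly as a Key.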


3.  Overview

   In R^5N peers communicate with each other in order to realize and
   maintain two basic operations of a distributed hash table:

   *  PUT: This operation stores a block payload on one or more peers
      with the goal of making the block available for queries using the
      GET operation.  In the classical definition of a dictionary
      interface, this operation would be called "insert".

   *  GET: This operation queries the network of peers for blocks
      previously stored under or near the key.  In the classical
      definition of a dictionary interface, this operation would be
      called "find".

   A peer or its implementation does not necessarily need to expose the
   above operations to applications but it commonly will.  For example,
   the peer could be a server purely used for bootstrapping, routing or
   supporting the overlay network with resources.  Examples of possible
   semantics of the above operations, provided as an API to
   applications by an implementation, are outlined in Appendix B.

   In a trivial scenario where there is only one peer (the local host),
   R^5N operates in a very similar fashion to a dictionary data
   structure.  However, the default use case is one where nodes
   communicate directly and indirectly in order to realize a distributed
   storage mechanism.  This communication requires a lower-level peer
   addressing and message transport mechanism such as TCP/IP.  R^5N is
   agnostic to the underlying transport protocol which is why this
   document defines a common addressing and messaging interface in
   Section 4.  The interface provided by this underlay is used across
   the specification of the R^5N protocol.  It also serves as a set of
   requirements for transport mechanisms that can be used to implement
   R^5N.  That being said, common transport protocols such as TCP/IP or
   UDP/IP and their interfaces are suitable R^5N underlays used by
   existing implementations.

   Specifics about the protocols of the underlays providing connectivity
   or the applications using the DHT are out of the scope of this
   document.  However, we note that peers implementing disjoint sets of
   underlay protocols may experience difficulties communicating (unless
   other peers bridge the respective underlays).  Similarly, peers that
   do not support a particular application will not be able to validate
   application-specific payloads and may thus be tricked into storing or
   forwarding corrupt blocks.


   In order to establish an initial connection to a network of R^5N
   peers, an initial, addressable bootstrap peer is required.  Further
   peers, including neighbors, are then learned via a peer discovery
   process as defined in Section 5.2.

   Across this document, the functional components of an R^5N
   implementation are divided into routing (Section 5), message
   processing (Section 6) and block processing (Section 7).
   Applications that require application-specific block payloads are
   expected to register a block type in the GANA block type registry
   (Section 10.1) and provide a specification of the associated block
   operations (Section 7.1) to implementors of R^5N.  Figure 2
   illustrates the architectural overview of R^5N.

                |  +-----------------+  +-------+
   Applications |  | GNU Name System |  | CADET |  ...
                |  +-----------------+  +-------+
   -------------+------------------------------------ Application API
                |  ^
                |  |   +---------------+
                |  |   | Block Storage |
                |  |   +---------------+
                |  |    ^
   R5N          |  v    v
                | +--------------------+    +---------+
                | | Message Processing |<-->| Routing |
                | +--------------------+    +---------+
                |  ^                          ^
                |  v                          v
   -------------+------------------------------------ Underlay Interface
                | +--------+  +--------+
                | |GNUnet  |  |IP      |  ...
   Connectivity | |Underlay|  |Underlay|
                | |Link    |  |Link    |
                | +--------+  +--------+


                      Figure 2: The R5N architecture.

4.  Underlay

   In the network underlay, a peer is addressable by traditional means
   out of scope of this document.  For example, the peer may have a
   TCP/IP address or an HTTPS endpoint.  While the specific addressing
   options and mechanisms are out of scope for this document, it is
   necessary to define a universal addressing format in order to
   facilitate the distribution of connectivity information to other
   peers in the DHT overlay.  This format is the "HELLO" Block
   (described in Section 7.2), which contains URIs.  The scheme of each
   URI indicates which underlay understands the respective address given
   in the rest of the URI.

   It is expected that the underlay provides basic mechanisms to manage
   peer connectivity and addressing.  The required functionalities can
   be represented by the following API:

   TRY_CONNECT(N, A)  A function which allows the local peer to attempt
      the establishment of a connection to another peer N using an
      address A.  When the connection attempt is successful, information
      on the new peer is offered through the PEER_CONNECTED signal.

   HOLD(P)  A function which tells the underlay to keep hold of a
      connection to a peer P.  Underlays are usually limited in the
      number of active connections.  With this function the DHT can
      indicate to the underlay which connections should preferably be
      preserved.

   DROP(P)  A function which tells the underlay to drop the connection
      to a peer P.  This function is only there for symmetry and used
      during the peer's shutdown to release all of the remaining HOLDs.
      As R^5N always prefers the longest-lived connections, it would
      never drop an active connection that it has called HOLD() on
      before.  Nevertheless, underlay implementations should not rely on
      this always being true.  A call to DROP() also does not imply that
      the underlay must close the connection: it merely removes the
      preference to preserve the connection that was established by
      HOLD().

   SEND(P, M)  A function that allows the local peer to send a protocol
      message M to a peer P.

   L2NSE = ESTIMATE_NETWORK_SIZE()  A procedure that provides an
      estimate of the network size.  The result, L2NSE, must be the
      base-2 logarithm of the estimated number of peers in the network.
      It is used by the routing algorithm.  If the underlay does not
      support a protocol for network size estimation (such as NSE), the
      value is assumed to be provided as a configuration parameter to
      the implementation.

   The above procedures are meant to be actively executed by the
   implementation as part of the peer-to-peer protocol.  In addition,
   the underlay is expected to emit the following signals (usually
   implemented as callbacks) based on network events observed by the
   underlay implementation:

   PEER_CONNECTED -> P  is a signal that allows the DHT to react to a
      newly connected peer P.  Such an event triggers, for example,
      updates in the routing table and gossiping of HELLOs to that peer.

   PEER_DISCONNECTED -> P  is a signal that allows the DHT to react to a
      recently disconnected peer.  Such an event triggers, for example,
      updates in the routing table.

   ADDRESS_ADDED -> A  This underlay signal indicates that an address A
      was added for our local peer and that henceforth the peer may be
      reachable under this address.  This information is used to
      advertise connectivity information about the local peer to other
      peers.  A must be a URI suitable for inclusion in a HELLO payload
      (Section 7.2).

   ADDRESS_DELETED -> A  This underlay signal indicates that an address
      A was removed from the set of addresses the local peer is possibly
      reachable under.  Addresses must have been added before they may
      be deleted.  This information is used to no longer advertise this
      address to other peers.

   RECEIVE -> (P, M)  This signal informs the local peer that a protocol
      message M was received from a peer P.

   These signals then drive updates of the routing table, local storage
   and message transmission.
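
   The procedures and signals above could be mirrored in an
   implementation roughly as follows.  This is a non-normative Python
   sketch: the class names Underlay and SignalHub, the method names and
   the callback registration style are assumptions of this example and
   are not defined by this document.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List


class Underlay(ABC):
    """Procedures the DHT actively calls on the underlay."""

    @abstractmethod
    def try_connect(self, peer_id: bytes, address: str) -> None:
        """TRY_CONNECT(N, A): attempt to connect to peer N at address A."""

    @abstractmethod
    def hold(self, peer_id: bytes) -> None:
        """HOLD(P): ask the underlay to prefer keeping this connection."""

    @abstractmethod
    def drop(self, peer_id: bytes) -> None:
        """DROP(P): withdraw the preference expressed by HOLD(P)."""

    @abstractmethod
    def send(self, peer_id: bytes, message: bytes) -> None:
        """SEND(P, M): transmit protocol message M to peer P."""

    @abstractmethod
    def estimate_network_size(self) -> float:
        """ESTIMATE_NETWORK_SIZE(): return L2NSE (a base-2 logarithm)."""


class SignalHub:
    """Delivers underlay signals to callbacks registered by the DHT."""

    SIGNALS = ("PEER_CONNECTED", "PEER_DISCONNECTED", "ADDRESS_ADDED",
               "ADDRESS_DELETED", "RECEIVE")

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[..., None]]] = {
            name: [] for name in self.SIGNALS}

    def on(self, signal: str, handler: Callable[..., None]) -> None:
        self._handlers[signal].append(handler)  # KeyError if unknown

    def emit(self, signal: str, *args) -> None:
        for handler in self._handlers[signal]:
            handler(*args)


# Usage: keep track of the addresses the local peer may advertise.
hub = SignalHub()
local_addresses = set()
hub.on("ADDRESS_ADDED", local_addresses.add)
hub.on("ADDRESS_DELETED", local_addresses.discard)
hub.emit("ADDRESS_ADDED", "r5n+ip+udp://1.2.3.4:6789/")
```

   Separating the actively called procedures from the emitted signals
   keeps the DHT logic independent of any particular transport.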

5.  Routing

   In order to select peers which are suitable destinations for routing
   messages, R^5N uses a hybrid approach: Given an estimated network
   size N, the peer selection for the first N hops is random.  After the
   initial N hops, peer selection follows an XOR-based peer distance
   calculation.

   To enable routing, any R^5N implementation must keep information
   about its current set of neighbors.  Upon receiving a connection
   notification from the Underlay through PEER_CONNECTED, information on
   the new neighbor MUST be added to the routing table.  Peers added to
   the routing table SHOULD be signalled to the Underlay as important
   connections using HOLD.  Similarly, when a disconnect is indicated by
   the Underlay through PEER_DISCONNECTED signals for all addresses of
   the peer, the peer MUST be removed from the routing table.

   In order to achieve O(log n) routing performance, the data structure
   for managing neighbors and their metadata MUST be implemented using
   the k-buckets concept of [Kademlia] as defined in Section 5.1.
   Maintenance of the routing table (after bootstrapping) is described
   in Section 5.2.


   Unlike [Kademlia], routing decisions in R^5N are also influenced by a
   Bloom filter in the message that prevents routing loops.  This data
   structure is discussed in Section 5.3.  Section 5.4 describes the key
   functions provided on top of these data structures.

5.1.  Routing Table

   Whenever a PEER_CONNECTED signal is received from the Underlay, the
   respective peer is considered for insertion into the routing table.
   The routing table consists of an array of k-buckets.  Each k-bucket
   contains a list of neighbors.  The i-th k-bucket stores neighbors
   whose peer IDs are between distance 2^i and 2^(i+1) from the local
   peer.  System constraints will typically force an implementation to
   impose some upper limit on the number of neighbors kept per k-bucket.
   Upon insertion, the implementation MUST call HOLD on the respective
   connection.

   Implementations SHOULD try to keep at least 5 entries per k-bucket.
   Embedded systems that cannot manage this number of connections MAY
   use connection-level signalling to indicate that they are merely a
   client utilizing a DHT and not able to participate in routing.  DHT
   peers receiving such connections MUST NOT include connections to such
   restricted systems in their k-buckets, thereby effectively excluding
   them when making routing decisions.

   If a system hits constraints with respect to the number of active
   connections, an implementation MUST evict peers from those k-buckets
   with the largest number of neighbors.  The eviction strategy MUST be
   to drop the shortest-lived connections first.

   The implementation MAY cache valid HELLOs of disconnected peers
   outside of the routing table and sporadically or periodically try to
   (re-)establish connection to the peer by issuing TRY_CONNECT requests
   on the Underlay.
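
   The bucket assignment and eviction rules of this section can be
   sketched as follows.  This non-normative Python example assumes that
   identifiers are equal-length byte strings compared as big-endian
   integers; the names bucket_index, Neighbor, RoutingTable and
   evict_one are hypothetical.  Calls to HOLD and DROP on the Underlay
   are omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


def bucket_index(local: bytes, remote: bytes) -> int:
    """Return i such that 2^i <= XOR-distance(local, remote) < 2^(i+1)."""
    if len(local) != len(remote):
        raise ValueError("identifiers must have equal length")
    distance = int.from_bytes(local, "big") ^ int.from_bytes(remote, "big")
    if distance == 0:
        raise ValueError("the local peer is not stored in its own table")
    return distance.bit_length() - 1


@dataclass
class Neighbor:
    ident: bytes
    connected_at: float  # time at which PEER_CONNECTED was signalled


class RoutingTable:
    def __init__(self, local: bytes) -> None:
        self.local = local
        self.buckets: Dict[int, List[Neighbor]] = {}

    def add(self, ident: bytes, connected_at: float) -> None:
        index = bucket_index(self.local, ident)
        self.buckets.setdefault(index, []).append(
            Neighbor(ident, connected_at))

    def remove(self, ident: bytes) -> None:
        for bucket in self.buckets.values():
            bucket[:] = [n for n in bucket if n.ident != ident]

    def evict_one(self) -> Optional[Neighbor]:
        """Drop the shortest-lived neighbor of the fullest k-bucket."""
        non_empty = [b for b in self.buckets.values() if b]
        if not non_empty:
            return None
        fullest = max(non_empty, key=len)
        # The shortest-lived connection is the most recently established.
        victim = max(fullest, key=lambda n: n.connected_at)
        fullest.remove(victim)
        return victim
```

   Evicting the most recently connected neighbor from the fullest
   bucket reflects the preference for long-lived connections while
   keeping sparsely populated buckets intact.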

5.2.  Peer Discovery

   Initially, the implementation depends upon either the Underlay
   providing at least one initial connection to a peer (signalled
   through PEER_CONNECTED), or the application/end-user providing at
   least one working HELLO which is then in turn used to call
   TRY_CONNECT on the Underlay in order to trigger a subsequent
   PEER_CONNECTED signal from the Underlay.  This is commonly achieved
   through the configuration of hardcoded bootstrap peers or bootstrap
   servers either for the Underlay or the R^5N implementation.  While
   details on how the first connection is established MAY depend on the
   specific implementation, this SHOULD usually be done by an out-of-
   band exchange of the information from a HELLO block.


   Appendix C specifies a URL format for encoding HELLO blocks as text
   strings.  This provides a portable, human-readable, text-based
   serialization format that can, for example, be encoded into a QR code
   for dissemination.  HELLO URLs SHOULD be supported by implementations
   for both import and export of HELLOs.

   To discover peers for its routing table, a peer will initiate
   GetMessage requests (Section 6.4) asking for blocks of type HELLO
   using its own peer address as QUERY_HASH.  The PEER_BF is initialized
   and set using the peer's own peer address as well as the addresses of
   all currently connected peers.  These requests MUST use the
   FindApproximate and DemultiplexEverywhere flags.  FindApproximate
   will ensure that other peers will reply with keys they merely
   consider close-enough, while DemultiplexEverywhere will cause each
   peer on the path to respond, which is likely to yield HELLOs of
   peers that are useful somewhere in the routing table.  The
   RECOMMENDED replication level set in the REPL_LVL field is 4.  The
   size and format of the result filter is specified in Section 7.2.
   The XQUERY is empty.

   In order to facilitate the above, the Underlay is expected to provide
   the implementation with one or more addresses signalled through
   ADDRESS_ADDED.  Zero addresses MAY be provided if a peer can only
   establish outgoing connections and is otherwise unreachable.  An
   implementation MUST advertise its addresses periodically to its
   neighbors through HelloMessages.  The advertisement interval and
   expiration should be configurable or chosen at the discretion of the
   implementation based on external factors such as DHCP leases.  The
   specific frequency of advertisements MAY depend on available
   bandwidth, the set of already connected neighbors, the workload of
   the system and other factors which are at the discretion of the
   developer, but SHOULD be a fraction of the expiration period.
   Whenever a peer receives such a HELLO message from another peer that
   is already in the routing table, it must cache it as long as that
   peer is in its routing table (or until the HELLO expires) and serve
   it in response to GET requests for HELLO blocks (see Section 6.4.3).
   This behaviour makes it unnecessary for the implementation to
   initiate dedicated PutMessages containing HELLO blocks.

5.3.  Peer Bloom Filter

   As DHT GetMessages and PutMessages traverse a random path through the
   network for the first N hops, it is essential that routing loops are
   avoided.  The peer Bloom filter is constant in size at L=1024 buckets
   (128 bytes) and k=16 buckets per element.  The peer Bloom filter is
   part of the routing metadata in messages in order to prevent circular
   routes and is updated at each hop with that hop's peer
   identity.  For the next hop selection in both the random and the
   deterministic case, any peer which is in the Bloom filter for the
   respective message is not included in the peer selection process.

   Any peer which is forwarding GetMessages or PutMessages (Section 6)
   adds its own peer ID to the peer Bloom filter.  This allows other
   peers to (probabilistically) exclude already traversed peers when
   searching for the next hops in the routing table.

   The peer Bloom filter follows the definition in Appendix A.  The set
   of elements E consists of all possible 256-bit peer IDs.  The
   mapping function M is defined as follows:

   M(e) -> SHA-512 (e) as uint32[]

   The element e is hashed using SHA-512.  The resulting byte string is
   interpreted as a string of k=16 32-bit integers in network byte order
   which are used to set and check the bucket bits in B using BF-SET and
   BF-TEST.

   We note that the peer Bloom filter may exclude peers due to false-
   positive matches.  This is acceptable as routing should nevertheless
   terminate (with high probability) in close vicinity of the key.
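
   The mapping function M and the resulting set and test operations can
   be sketched in Python as follows.  This example is non-normative.
   Reducing each 32-bit value modulo L and the placement of bits within
   each byte are assumptions of this sketch; Appendix A remains the
   authoritative definition of BF-SET and BF-TEST.  The names
   bf_indices and PeerBloomFilter are hypothetical.

```python
import hashlib
import struct
from typing import List

BF_BITS = 1024  # L: number of buckets (128 bytes)
BF_K = 16       # k: buckets per element


def bf_indices(element: bytes) -> List[int]:
    """M(e): SHA-512 digest read as 16 uint32 in network byte order."""
    digest = hashlib.sha512(element).digest()
    values = struct.unpack(">16I", digest)
    # Assumption: each value is reduced modulo L to select a bucket.
    return [value % BF_BITS for value in values]


class PeerBloomFilter:
    def __init__(self) -> None:
        self.bits = bytearray(BF_BITS // 8)

    def add(self, element: bytes) -> None:
        for i in bf_indices(element):
            # Assumption: bit i lives in byte i // 8 at position i % 8.
            self.bits[i // 8] |= 1 << (i % 8)

    def test(self, element: bytes) -> bool:
        return all(self.bits[i // 8] & (1 << (i % 8))
                   for i in bf_indices(element))
```

   A peer forwarding a message would call add() with its own peer ID
   and skip any neighbor for which test() returns true.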

5.4.  Routing Functions

   Using the data structures described so far, the R^5N routing
   component provides the following functions for message processing
   (Section 6):

   GetDistance(A, B) -> Distance as Integer  This function calculates
      the binary XOR between A and B.  The resulting distance is
      interpreted as an integer where the leftmost bit is the most
      significant bit.

   SelectClosestPeer(K, B) -> N  This function selects the neighbor N
      from our routing table with the shortest XOR-distance to the key
      K.  This means that for all other peers N' in the routing table
      GetDistance(N, K) < GetDistance(N', K).  Peers with a positive
      test against the peer Bloom filter B are not considered.

   SelectRandomPeer(B) -> N  This function selects a random peer N from
      all neighbors.  Peers with a positive test in the peer Bloom
      filter B are not considered.

   SelectPeer(K, H, B) -> N  This function selects a neighbor N
      depending on the hop count parameter H.  If H <
      NETWORK_SIZE_ESTIMATE this function MUST return
      SelectRandomPeer(B) and SelectClosestPeer(K, B) otherwise.


   IsClosestPeer(N, K, B) -> true | false  This function checks if N is
      the closest peer for K (cf. SelectClosestPeer(K, B)).  Peers with
      a positive test in the Bloom filter B are not considered.

   ComputeOutDegree(REPL_LVL, HOPCOUNT, L2NSE) -> Number  This function
      computes the number of neighbors that a message should be
      forwarded to.  The arguments are the desired replication level
      (REPL_LVL), the HOPCOUNT of the message so far and the current
      network size estimate (L2NSE) as provided by the underlay.  The
      result is the non-negative number of next hops to select.  The
      following figure gives the pseudocode for computing the number of
      neighbors the peer should attempt to forward the message to.

      function ComputeOutDegree(REPL_LVL, HOPCOUNT, L2NSE)
      BEGIN
        if (HOPCOUNT > L2NSE * 4)
          return 0;
        if (HOPCOUNT > L2NSE * 2)
          return 1;
        if (0 = REPL_LVL)
          REPL_LVL = 1
        if (REPL_LVL > 16)
          REPL_LVL = 16
        RM1 = REPL_LVL - 1
        return 1 + (RM1 / (L2NSE + RM1 * HOPCOUNT))
      END

                 Figure 3: Computing the number of next hops.

      The above calculation may yield values that are not integers.
      Hence, the result MUST be rounded probabilistically to one of the
      two nearest integers, using the fractional part as the probability
      for rounding up.  This probabilistic rounding is necessary to
      achieve the statistically expected value of the replication level
      and of the average number of peers a message is forwarded to.
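
   The routing functions above can be sketched in Python as follows.
   This is a minimal illustration, not a normative implementation.  It
   assumes that keys and peer addresses are equal-length byte strings,
   that the peer Bloom filter is represented by a hypothetical
   callable in_filter(peer) returning true on a positive test, and that
   the network size estimate is positive.  All function and parameter
   names are chosen for this example only.

```python
import math
import random
from typing import Callable, Optional, Sequence


def get_distance(a: bytes, b: bytes) -> int:
    """XOR of a and b read as an integer, leftmost bit most significant."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    return int.from_bytes(bytes(x ^ y for x, y in zip(a, b)), "big")


def select_closest_peer(key: bytes, neighbors: Sequence[bytes],
                        in_filter: Callable[[bytes], bool]) -> Optional[bytes]:
    """Neighbor with the shortest XOR distance to key, skipping filtered peers."""
    candidates = [n for n in neighbors if not in_filter(n)]
    if not candidates:
        return None
    return min(candidates, key=lambda n: get_distance(n, key))


def select_random_peer(neighbors: Sequence[bytes],
                       in_filter: Callable[[bytes], bool],
                       rng=random) -> Optional[bytes]:
    """Random neighbor among those that do not test positive in the filter."""
    candidates = [n for n in neighbors if not in_filter(n)]
    return rng.choice(candidates) if candidates else None


def select_peer(key, hops, neighbors, in_filter, network_size_estimate,
                rng=random):
    """Random walk for early hops, greedy (closest peer) routing afterwards."""
    if hops < network_size_estimate:
        return select_random_peer(neighbors, in_filter, rng)
    return select_closest_peer(key, neighbors, in_filter)


def compute_out_degree(repl_lvl: int, hopcount: int, l2nse: float,
                       rng=random) -> int:
    """Number of next hops, following Figure 3 plus probabilistic rounding."""
    if hopcount > l2nse * 4:
        return 0
    if hopcount > l2nse * 2:
        return 1
    repl_lvl = min(max(repl_lvl, 1), 16)  # clamp to the range 1..16
    rm1 = repl_lvl - 1
    if rm1 == 0:
        return 1
    denominator = l2nse + rm1 * hopcount
    if denominator <= 0:
        # Assumption of this sketch: the estimate is positive.
        raise ValueError("network size estimate must be positive")
    expected = 1 + rm1 / denominator
    base = math.floor(expected)
    # Round up with probability equal to the fractional part.
    return base + (1 if rng.random() < expected - base else 0)
```

   Passing a seeded random.Random instance as rng makes the random
   choices reproducible, which may be convenient in tests.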

5.5.  Pending Table

   R^5N performs stateful routing where the messages only carry the
   query hash and do not encode the ultimate source or destination of
   the request.  Routing a request towards the key is done hop-by-hop
   using the routing table and the query hash.  The pending table is
   used to route responses back to the originator.  In the pending
   table, each peer primarily associates a query hash with the
   originator of the request.  The pending table MUST store entries for
   the last MAX_RECENT requests the peer has encountered.  To ensure
   that the peer does not run out of memory, information about older
   requests is discarded.  The value of MAX_RECENT MAY be configurable
   and SHOULD be at least 128 * 10^3.


   For each entry in the pending table, the DHT MUST track not only the
   query key and the origin, but also the extended query, requested
   block type and flags, and the result filter.  If the query did not
   provide a result filter, a fresh result filter MUST still be created
   to filter duplicate replies.  Details of how a result filter works
   depend on the type, as described in Section 7.1.

   When a second query from the same origin for the same query hash is
   received, the DHT MUST attempt to merge the new request with the
   state for the old request.  If this is not possible, the existing
   result filter MUST be discarded and replaced with the result filter
   of the incoming message.

   We note that for local applications, a fixed limit on the number of
   concurrent requests may be problematic.  Hence, it is RECOMMENDED
   that implementations track requests from local applications
   separately and preserve the information until the application
   explicitly stops the request.
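
   A bounded pending table along these lines could be sketched as
   follows.  The sketch is illustrative only: the class and field names
   are hypothetical, the result filter is modeled as a plain set rather
   than a block-type-specific structure (Section 7.1), and the merge
   rule (union of filters when block type and extended query agree) is
   an assumption of this example, since the text does not define how
   merging is performed.

```python
from collections import OrderedDict
from dataclasses import dataclass, field
from typing import List, Set, Tuple

MAX_RECENT = 128 * 10**3  # minimum value suggested by the text


@dataclass
class PendingEntry:
    origin: bytes       # peer the request was received from
    query_hash: bytes
    xquery: bytes
    block_type: int
    flags: int
    # Stand-in for the block-type-specific result filter.
    result_filter: Set[bytes] = field(default_factory=set)


class PendingTable:
    def __init__(self, max_recent: int = MAX_RECENT) -> None:
        self._max_recent = max_recent
        self._entries: "OrderedDict[Tuple[bytes, bytes], PendingEntry]" = OrderedDict()

    def add(self, entry: PendingEntry) -> None:
        key = (entry.origin, entry.query_hash)
        old = self._entries.pop(key, None)
        if old is not None and (old.block_type, old.xquery) == (
                entry.block_type, entry.xquery):
            # Hypothetical merge rule: keep what either filter excludes.
            entry.result_filter |= old.result_filter
        # Otherwise the old filter is dropped and the incoming one is used.
        self._entries[key] = entry
        while len(self._entries) > self._max_recent:
            self._entries.popitem(last=False)  # discard the oldest request

    def origins(self, query_hash: bytes) -> List[bytes]:
        """Peers to which a result for query_hash should be routed back."""
        return [e.origin for e in self._entries.values()
                if e.query_hash == query_hash]
```

   Requests from local applications would be kept in a separate
   structure without the MAX_RECENT bound, as recommended above.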

6.  Message Processing

   An implementation MAY act as an initiator of messages.  If
   instructed through an application-facing API such as the one outlined
   in Appendix B, the peer may act as an initiator of GetMessages or
   PutMessages.  The status of initiator is relevant for peers when
   processing ResultMessages and the potential handover of results to
   the application.

   The implementation MUST listen for RECEIVE(P, M) signals from the
   Underlay and respond to the respective messages sent by the peer P.

   Whether initiated locally or received from a neighbor, the
   implementation processes the messages according to the wire formats
   and the required validations detailed in the following.  Where
   required, the local peer's ID is referred to as SELF.

6.1.  Message components

   This section describes some data structures and fields shared by
   various message types.

6.1.1.  Header

   A message header that identifies the message length and type is
   shared across all messages used in the R^5N protocol.


   0     8     16    24
   +-----+-----+-----+-----+
   |  MSIZE    |   MTYPE   |
   +-----+-----+-----+-----+

                    Figure 4: The common message header.

   where:

   MSIZE  denotes the 16-bit size of this message in bytes, in network
      byte order.

   MTYPE  is the 16-bit message type.  Message types are registered in
      the GANA "GNUnet Message Type" registry (Section 10.4).
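
   As an illustration, the common header can be decoded with a few
   lines of Python.  The sketch assumes that MSIZE counts the whole
   message including the four header bytes and that MTYPE is also
   encoded in network byte order; both are assumptions of this example.
   The function name is hypothetical.

```python
import struct
from typing import Tuple

# Two unsigned 16-bit integers in network (big-endian) byte order.
HEADER = struct.Struct(">HH")


def parse_header(data: bytes) -> Tuple[int, int]:
    """Return (MSIZE, MTYPE) from the first four bytes of a message."""
    if len(data) < HEADER.size:
        raise ValueError("message shorter than the common header")
    msize, mtype = HEADER.unpack_from(data)
    # Assumption: MSIZE covers the entire message, header included.
    if msize < HEADER.size or msize > len(data):
        raise ValueError("MSIZE inconsistent with the received data")
    return msize, mtype
```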

6.1.2.  Flags

   Flags is a 16-bit vector representing binary options.  Each flag is
   represented by a bit in the field starting from 0 as the rightmost
   bit to 15 as the leftmost bit.

   0: DemultiplexEverywhere  This bit indicates that each peer along the
      way should process the request.  If the bit is not set,
      intermediate peers only route the message and only peers which
      consider themselves closest to the key look for answers in their
      local storage for GetMessages and cache the block in their local
      storage for PutMessages and ResultMessages.

   1: RecordRoute  This bit indicates to keep track of the path that the
      message takes in the P2P network.

   2: FindApproximate  This bit allows results where the key does not
      match exactly.

   3: Truncated  This is a special flag which is set if a peer truncated
      the path and thus the first hop on the path is given without a
      signature to enable checking of the next signature.  This flag
      MUST NOT be set in a query.

   4-15: Reserved  The remaining bits are reserved for future use and
      MUST be set to 0 when initiating an operation.  If non-zero bits
      are received, implementations MUST preserve these bits when
      forwarding messages.
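
   The bit assignments above translate into simple masks.  The
   following sketch is illustrative; the constant and function names
   are hypothetical and not part of the protocol.

```python
# Bit positions as listed above, bit 0 being the rightmost bit.
DEMULTIPLEX_EVERYWHERE = 1 << 0
RECORD_ROUTE = 1 << 1
FIND_APPROXIMATE = 1 << 2
TRUNCATED = 1 << 3
RESERVED_MASK = 0xFFF0  # bits 4-15


def new_query_flags(*flags: int) -> int:
    """Combine flags for a newly initiated query."""
    value = 0
    for flag in flags:
        value |= flag
    if value & RESERVED_MASK:
        raise ValueError("reserved bits must be zero when initiating")
    if value & TRUNCATED:
        raise ValueError("Truncated must not be set in a query")
    return value


def has_flag(flags: int, flag: int) -> bool:
    """Test a single flag; unknown reserved bits are left untouched."""
    return (flags & flag) != 0
```

   A forwarding peer would pass the received 16-bit value on unchanged,
   so that any reserved bits set by a newer implementation are
   preserved.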

6.1.3.  Path Element

   A Path Element represents a hop in the path a message has taken
   through the network.  The wire format of a Path Element is
   illustrated in Figure 5.


   0     8     16    24    32    40    48    56
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  SIGNATURE                    |
   |                  (64 byte)                    |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  PEER ID                      |
   |                  (32 byte)                    |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+

                Figure 5: The Wire Format of a Path Element.

   where:

   SIGNATURE  is a 64 byte EdDSA signature using the current hop's
      private key affirming the previous and next hops.

   PEER ID  is the EdDSA public key of the peer on the path.

   An ordered list of Path Elements may be appended to any routed
   PutMessages or ResultMessages.  The signature of a Path Element is
   created by the current hop after it has made its routing decision
   identifying the successor peer.
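
   Since each Path Element has a fixed size of 96 bytes, a list of
   untruncated Path Elements can be encoded and decoded as sketched
   below.  The type and function names are hypothetical and serve only
   to illustrate the layout of Figure 5.

```python
from typing import List, NamedTuple, Sequence

SIGNATURE_LEN = 64
PEER_ID_LEN = 32
PATH_ELEMENT_LEN = SIGNATURE_LEN + PEER_ID_LEN  # 96 bytes


class PathElement(NamedTuple):
    signature: bytes  # 64-byte EdDSA signature
    peer_id: bytes    # 32-byte EdDSA public key


def encode_path(path: Sequence[PathElement]) -> bytes:
    """Concatenate the elements in path order."""
    out = bytearray()
    for element in path:
        if (len(element.signature) != SIGNATURE_LEN
                or len(element.peer_id) != PEER_ID_LEN):
            raise ValueError("malformed path element")
        out += element.signature + element.peer_id
    return bytes(out)


def decode_path(data: bytes, count: int) -> List[PathElement]:
    """Split data into exactly count Path Elements."""
    if len(data) != count * PATH_ELEMENT_LEN:
        raise ValueError("path length does not match element count")
    return [
        PathElement(data[i:i + SIGNATURE_LEN],
                    data[i + SIGNATURE_LEN:i + PATH_ELEMENT_LEN])
        for i in range(0, len(data), PATH_ELEMENT_LEN)
    ]
```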

   Figure 6 shows the wire format of an example path from Peers A over B
   and C as it would be received by D in the PUTPATH of a PutMessage or
   the combined PUTPATH and GETPATH of a ResultMessage.  The wire format
   of the Path Elements allows a natural extension of the PUTPATH along
   the route of the ResultMessage to the destination forming the
   GETPATH.  The PutMessage would indicate in the PATH_LEN field a
   length of 3.  The ResultMessage would indicate a path length of 3 as
   the sum of the field values in PUTPATH_L and GETPATH_L.

   0     8     16    24    32    40    48    56
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  SIGNATURE A                  |
   |                  (64 byte)                    |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  PEER A                       |
   |                  (32 byte)                    |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  SIGNATURE B                  |
   |                  (64 byte)                    |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  PEER B                       |
   |                  (32 byte)                    |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  SIGNATURE C                  |
   |                  (64 byte)                    |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  PEER C                       |
   |                  (32 byte)                    |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  SIGNATURE D                  |
   |                  (64 byte)                    |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+

           Figure 6: Example of a path as found in PutMessages or
                        ResultMessages from A to D.


   A path may be truncated, in which case the signature of the truncated
   Path Element is omitted, leaving only the Peer ID required for the
   verification of the subsequent Path Element signature.  Such a
   truncated path is indicated with the respective flag (Section 6.1.2).
   The Peer ID of the last Path Element is omitted as it must be that of
   the sender of the PutMessage or ResultMessage.  The wire format of a
   truncated example path from Peers B over C as it would be received by
   D in a PutMessage or ResultMessage is illustrated in Figure 7.  A
   PutMessage would indicate in the PATH_LEN field a length of 1.  A
   ResultMessage would indicate a length of 1 as the sum of the
   PUTPATH_L and GETPATH_L fields.

   0     8     16    24    32    40    48    56
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  PEER B                       |
   |                  (32 byte)                    |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  SIGNATURE C                  |
   |                  (64 byte)                    |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  PEER C                       |
   |                  (32 byte)                    |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  SIGNATURE D                  |
   |                  (64 byte)                    |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+

        Figure 7: Example of a truncated path from Peer B to Peer D.


   The SIGNATURE field in a Path Element covers a 64-bit
   contextualization header, the block expiration, a hash of the block
   payload, as well as the predecessor peer ID and the peer ID of the
   successor that the peer making the signature is routing the message
   to.  Thus, the signature made by SELF states that SELF received the
   block payload from PEER PREDECESSOR and has forwarded it to PEER
   SUCCESSOR.  The wire format is illustrated in Figure 8.

   0     8     16    24    32    40    48    56
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |         SIZE          |       PURPOSE         |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                   EXPIRATION                  |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  BLOCK HASH                   |
   |                  (64 byte)                    |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  PEER PREDECESSOR             |
   |                  (32 byte)                    |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  PEER SUCCESSOR               |
   |                  (32 byte)                    |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+

         Figure 8: The Wire Format of the Path Element for Signing.

   SIZE  A 32-bit value containing the length of the signed data in
      bytes in network byte order.  The length of the signed data MUST
      be 144 bytes.

   PURPOSE  A 32-bit signature purpose flag.  This field MUST be 6 (in
      network byte order).

   EXPIRATION  denotes the absolute 64-bit expiration date of the block.
      In microseconds since midnight (0 hour), January 1, 1970 UTC in
      network byte order.


   BLOCK HASH  a SHA-512 hash over the block payload.

   PEER PREDECESSOR  the Peer ID of the previous hop.  If the signing
      peer initiated the PUT, this field is set to all zeroes.

   PEER SUCCESSOR  the Peer ID of the next hop (not of the signer).
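
   The data to be signed can be assembled as sketched below.  The
   example only builds the 144-byte structure of Figure 8; producing
   the EdDSA signature itself requires a cryptographic library and is
   not shown.  The function name and its parameters are hypothetical.

```python
import hashlib
import struct
from typing import Optional

SIGNED_SIZE = 144       # total length of the signed data in bytes
SIGNATURE_PURPOSE = 6   # signature purpose for path elements
PEER_ID_LEN = 32


def path_signing_payload(expiration_us: int, block: bytes,
                         predecessor: Optional[bytes],
                         successor: bytes) -> bytes:
    """Build the data covered by a Path Element signature.

    predecessor is None if the signing peer initiated the PUT, in
    which case the field is set to all zeroes.
    """
    if predecessor is None:
        predecessor = bytes(PEER_ID_LEN)
    if len(predecessor) != PEER_ID_LEN or len(successor) != PEER_ID_LEN:
        raise ValueError("peer IDs must be 32 bytes")
    payload = (
        # SIZE, PURPOSE (32 bit each) and EXPIRATION (64 bit), big-endian.
        struct.pack(">IIQ", SIGNED_SIZE, SIGNATURE_PURPOSE, expiration_us)
        + hashlib.sha512(block).digest()  # BLOCK HASH, 64 bytes
        + predecessor                     # PEER PREDECESSOR
        + successor                       # PEER SUCCESSOR
    )
    assert len(payload) == SIGNED_SIZE
    return payload
```

   The field sizes add up as 4 + 4 + 8 + 64 + 32 + 32 = 144 bytes, which
   matches the required value of the SIZE field.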

6.2.  HelloMessage

   When the Underlay notifies the implementation of added or removed
   addresses through ADDRESS_ADDED and ADDRESS_DELETED it MAY
   disseminate those changes to neighbors using HelloMessages.
   Initiation of HelloMessages by the implementation itself is
   RECOMMENDED.  HelloMessages are used to inform neighbors of a peer
   about the sender's available addresses.  The recipients use these
   messages to inform their respective Underlays about ways to sustain
   the connections and to generate HELLO blocks (see Section 7.2) to
   answer peer discovery queries from other peers.

6.2.1.  Wire Format

   0     8     16    24    32    40    48    56
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |        HEADER         | RESERVED  | URL_CTR   |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                    SIGNATURE                  /
   /                   (64 byte)                   |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                    EXPIRATION                 |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   / ADDRESSES (variable length)                   /
   +-----+-----+-----+-----+-----+-----+-----+-----+

                  Figure 9: The HelloMessage Wire Format.

   where:

   HEADER  the common message header.  Its MTYPE field must be set to
      the value 157 in network byte order.

   RESERVED  is a 16-bit field that must be zero.

   URL_CTR  is a 16-bit number that gives the total number of addresses
      encoded in the ADDRESSES field.  In network byte order.

   SIGNATURE  is a 64 byte EdDSA signature using the sender's private
      key affirming the information contained in the message.  The
      signature is signing exactly the same data that is being signed in
      a HELLO block as described in Section 7.2.

   EXPIRATION  denotes the absolute 64-bit expiration date of the
      content.  The value specified is microseconds since midnight (0
      hour), January 1, 1970, but must be a multiple of one million (so
      that it can be represented in seconds in a HELLO URL).  Stored in
      network byte order.

   ADDRESSES  A sequence of exactly URL_CTR addresses (Section 2) which
      can be used to contact the peer.  Each address MUST be
      0-terminated.  The set of addresses MAY be empty.
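
   A parser for this wire format could look as follows.  It is a sketch
   under several assumptions: MSIZE is taken to cover the entire
   message, addresses are decoded as UTF-8 text, and signature
   verification is left out.  The type and function names are
   hypothetical.

```python
import struct
from typing import List, NamedTuple

# HEADER (MSIZE, MTYPE), RESERVED, URL_CTR, SIGNATURE, EXPIRATION
HELLO_FIXED = struct.Struct(">HHHH64sQ")  # 80 bytes
HELLO_MTYPE = 157


class Hello(NamedTuple):
    signature: bytes
    expiration_us: int
    addresses: List[str]


def parse_hello(msg: bytes) -> Hello:
    if len(msg) < HELLO_FIXED.size:
        raise ValueError("message too short")
    msize, mtype, reserved, url_ctr, signature, expiration = \
        HELLO_FIXED.unpack_from(msg)
    if mtype != HELLO_MTYPE or reserved != 0:
        raise ValueError("not a valid HelloMessage header")
    if msize != len(msg):  # assumption: MSIZE covers the whole message
        raise ValueError("MSIZE does not match message length")
    if expiration % 1_000_000 != 0:
        raise ValueError("expiration must be a multiple of one million")
    rest = msg[HELLO_FIXED.size:]
    addresses = []
    for _ in range(url_ctr):
        end = rest.find(b"\x00")
        if end < 0:
            raise ValueError("address is not 0-terminated")
        addresses.append(rest[:end].decode("utf-8"))
        rest = rest[end + 1:]
    if rest:
        raise ValueError("trailing data after URL_CTR addresses")
    return Hello(signature, expiration, addresses)
```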

6.2.2.  Processing

   If the initiator of a HelloMessage is SELF, the message is simply
   sent to all neighbors P currently in the routing table using SEND.

   Otherwise, upon receiving a HelloMessage from a peer P an
   implementation MUST process it step by step as follows:

   1.  If P is not in its routing table, the message is discarded.

   2.  The signature is verified, including a check that the expiration
       time is in the future.  If the signature is invalid, the message
       is discarded.

   3.  The information contained in the HelloMessage can be used to
       synthesize a block of type HELLO (Section 7.2).  The block is
       cached in the routing table until it expires, the peer is removed
       from the routing table, or the information is replaced by another
       message from the peer.  The implementation SHOULD instruct the
       Underlay to connect to all now available addresses using
       TRY_CONNECT in order to make the underlay aware of alternative
       addresses for this connection and to maintain optimal
       connectivity.

   4.  Received HelloMessages MUST NOT be forwarded.

6.3.  PutMessage

   PutMessages are used to store information at other peers in the DHT.
   Any API which allows applications to initiate PutMessages needs to
   provide sufficient, implementation-specific information needed to
   construct the initial PutMessage.  For example, implementations
   supporting multiple applications and blocks will have block type and
   message flag parameters in addition to the actual data payload and
   key.

6.3.1.  Wire Format

   0     8     16    24    32    40    48    56
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |        HEADER         |         BTYPE         |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |   FLAGS   | HOPCOUNT  | REPL_LVL  | PATH_LEN  |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                    EXPIRATION                 |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                   PEER_BF                     /
   /                 (128 byte)                    |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  BLOCK_KEY                    /
   /                 (64 byte)                     |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   /       TRUNCATED ORIGIN (0 or 32 bytes)        /
   +-----+-----+-----+-----+-----+-----+-----+-----+
   /              PUTPATH (variable length)        /
   +-----+-----+-----+-----+-----+-----+-----+-----+
   /      LAST HOP SIGNATURE (0 or 64 bytes)       /
   +-----+-----+-----+-----+-----+-----+-----+-----+
   /              BLOCK (variable length)          /
   +-----+-----+-----+-----+-----+-----+-----+-----+

                   Figure 10: The PutMessage Wire Format.

   where:

   HEADER  is the common message header.  Its MTYPE field is set by the
      initiator to the value 146 in network byte order.  Read-only.

   BTYPE  is a 32-bit block type.  The block type indicates the content
      type of the payload.  Set by the initiator.  Read-only.  In
      network byte order.

   FLAGS  is a 16-bit vector with binary options (see Section 6.1.2).
      Set by the initiator.  Read-only.

   HOPCOUNT  is a 16-bit number indicating how many hops this message
      has traversed so far.  Set by the initiator to 0.  Incremented by
      processing peers.  In network byte order.

   REPL_LVL  is a 16-bit number indicating the desired replication level
      of the data.  Set by the initiator.  Read-only.  In network byte
      order.


   PATH_LEN  is a 16-bit number indicating the number of Path Elements
      recorded in PUTPATH.  As PUTPATH is optional, this value may be
      zero.  If the PUTPATH is enabled, set initially to 0 by the
      initiator.  Incremented by processing peers.  In network byte
      order.

   EXPIRATION  denotes the absolute 64-bit expiration date of the
      content.  Set by the initiator.  Read-only.  In microseconds since
      midnight (0 hour), January 1, 1970 in network byte order.

   PEER_BF  A peer Bloom filter to stop circular routes (see
      Section 5.3).  Set by the initiator to contain the local peer and
      all neighbors it is forwarded to.  Modified by processing peers to
      include their own peer ID using BF-SET.

   BLOCK_KEY  The key under which the PutMessage wants to store
      content.  Set by the initiator.  Read-only.

   TRUNCATED ORIGIN  is only provided if the TRUNCATED flag is set in
      FLAGS.  If present, this is the public key of the peer just before
      the first entry on the PUTPATH and the first peer on the PUTPATH
      is not the actual origin of the message.  Thus, to verify the
      first signature on the PUTPATH, this public key must be used.
      Note that due to the truncation, this last hop cannot be verified
      to exist.  Value is modified by processing peers.

   PUTPATH  the variable-length PUT path.  The path consists of a list
      of PATH_LEN Path Elements.  Initially empty.  Extended by
      processing peers.

   LAST HOP SIGNATURE  is only provided if the RecordRoute flag is set
      in FLAGS.  If present, this is an EdDSA signature of the sender of
      this message (using the same format as the signatures in PUTPATH)
      affirming that the sender forwarded the message from the
      predecessor (all zeros if PATH_LEN is 0, otherwise the last peer
      in PUTPATH) to the target peer.  Modified by processing peers (if
      flag is set).

   BLOCK  the variable-length block payload.  The contents are
      determined by the BTYPE field.  The length is determined by MSIZE
      minus the size of all of the other fields.  Set by the initiator.
      Read-only.
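
   The field layout above can be decoded as sketched below.  The
   example assumes that MSIZE covers the entire message and that FLAGS
   is encoded in network byte order like the neighboring fields; it
   performs no block validation or signature checks.  All type,
   function and constant names are hypothetical.

```python
import struct
from typing import NamedTuple, Optional

# HEADER, BTYPE, FLAGS, HOPCOUNT, REPL_LVL, PATH_LEN, EXPIRATION,
# PEER_BF (128 bytes), BLOCK_KEY (64 bytes)
PUT_FIXED = struct.Struct(">HHIHHHHQ128s64s")  # 216 bytes
PUT_MTYPE = 146
FLAG_RECORD_ROUTE = 1 << 1
FLAG_TRUNCATED = 1 << 3
PATH_ELEMENT_LEN = 96


class PutMessage(NamedTuple):
    btype: int
    flags: int
    hopcount: int
    repl_lvl: int
    path_len: int
    expiration_us: int
    peer_bf: bytes
    block_key: bytes
    truncated_origin: Optional[bytes]
    putpath: bytes
    last_hop_signature: Optional[bytes]
    block: bytes


def parse_put(msg: bytes) -> PutMessage:
    if len(msg) < PUT_FIXED.size:
        raise ValueError("message too short")
    (msize, mtype, btype, flags, hopcount, repl_lvl, path_len,
     expiration, peer_bf, block_key) = PUT_FIXED.unpack_from(msg)
    if mtype != PUT_MTYPE or msize != len(msg):
        raise ValueError("not a well-formed PutMessage")
    pos = PUT_FIXED.size

    def take(n: int) -> bytes:
        nonlocal pos
        if pos + n > len(msg):
            raise ValueError("message truncated")
        chunk = msg[pos:pos + n]
        pos += n
        return chunk

    # Optional fields appear only if the respective flag is set.
    origin = take(32) if flags & FLAG_TRUNCATED else None
    putpath = take(path_len * PATH_ELEMENT_LEN)
    last_sig = take(64) if flags & FLAG_RECORD_ROUTE else None
    block = msg[pos:]  # everything that remains is the block payload
    return PutMessage(btype, flags, hopcount, repl_lvl, path_len,
                      expiration, peer_bf, block_key, origin, putpath,
                      last_sig, block)
```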

6.3.2.  Processing

   Upon receiving a PutMessage from a peer P, or when one is created
   through initiation by an overlay API, an implementation MUST process
   it step by step as follows:


   1.  The EXPIRATION field is evaluated.  If the message is expired, it
       MUST be discarded.

   2.  If the BTYPE is not supported by the implementation, no
       validation of the block payload is performed and processing
       continues at (5).  If the BTYPE is ANY, then the message MUST be
       discarded.  Else, the block MUST be validated as defined in (3)
       and (4).

   3.  The message is evaluated using the block validation functions
       matching the BTYPE.  First, the client attempts to derive the key
       using the respective DeriveBlockKey procedure as described in
       Section 7.1.  If a key can be derived and does not match, the
       message MUST be discarded.

   4.  Next, the ValidateBlockStoreRequest procedure for the BTYPE as
       described in Section 7.1 is used to validate the block payload.
       If the block payload is invalid, the message MUST be discarded.

   5.  The peer address of the sender peer P SHOULD be in PEER_BF.  If
       not, the implementation MAY log an error, but MUST continue.

   6.  If the RecordRoute flag is not set, the PATH_LEN MUST be set to
       zero.  If the flag is set and PATH_LEN is non-zero, the local
       peer SHOULD verify the signatures from the PUTPATH.  Verification
       MAY involve checking all signatures or any random subset of the
       signatures.  It is RECOMMENDED that peers adapt their behavior to
       available computational resources so as to not make signature
       verification a bottleneck.  If an invalid signature is found, the
       PUTPATH MUST be truncated to only include the elements following
       the invalid signature.

   7.  If the local peer is the closest peer (cf.  IsClosestPeer(SELF,
       BLOCK_KEY, PeerFilter)) or the DemultiplexEverywhere flag is
       set, the message SHOULD be stored locally in the block storage if
       possible.  The implementation MAY choose not to store the block
       if external factors or configurations prevent this, such as
       limited (allotted) disk space.

   8.  If the BTYPE of the message indicates a HELLO block, the peer
       MUST be considered for the local routing table by using the peer
       address in BLOCK_KEY.  If the peer is not already connected and
       the respective k-bucket is not already full, the local peer MUST
       try to establish a connection to the peer indicated in the HELLO
       block using the address information from the HELLO block and the
       Underlay function TRY_CONNECT.  The implementation MUST instruct
       the Underlay to try to connect to all provided addresses using
       TRY_CONNECT in order to make the Underlay aware of multiple
       addresses for this connection.  When a connection is established,
       the signal PEER_CONNECTED will cause the peer to be added to the
       respective k-bucket of the routing table (Section 5).

   9.  Given the value in REPL_LVL, HOPCOUNT and the result of
       IsClosestPeer(SELF, BLOCK_KEY, PeerFilter) the number of peers to
       forward to MUST be calculated using ComputeOutDegree().  The
       implementation SHOULD select up to this number of peers to
       forward the message to.  The implementation MAY forward to fewer
       or no peers in order to handle resource constraints such as
       limited bandwidth.  For each selected peer with peer address P, a
       dedicated PutMessage_P is created containing the original (and
       where applicable already updated) fields of the received
       PutMessage.  In each message, all selected addresses and the
       local peer MUST be added to the PEER_BF and the HOPCOUNT is
       incremented by 1.  If the RecordRoute flag is set, a new Path
       Element is created using the predecessor peer ID and the
       signature of the current peer.  The Path Element is added to the
       PUTPATH field and the PATH_LEN field is incremented by 1.  When
       creating the Path Element signature, the successor must be set to
       the recipient peer P of the PutMessage_P.  Finally, the messages
       are sent using SEND(P, PutMessage_P) to each recipient.

6.4.  GetMessage

   GetMessages are used to request information from other peers in the
   DHT.  Any overlay API which allows applications to initiate
   GetMessages needs to provide sufficient, implementation-specific
   information needed to construct the initial GetMessage.  For example,
   implementations supporting multiple applications and blocks will have
   block type and message flag parameters.

6.4.1.  Wire Format


   0     8     16    24    32    40    48    56
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |        HEADER         |         BTYPE         |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |   FLAGS   |  HOPCOUNT | REPL_LVL  |  RF_SIZE  |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                 PEER_BF                       /
   /                 (128 byte)                    |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                 QUERY_HASH                    /
   /                 (64 byte)                     |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                 RESULT_FILTER                 /
   /                 (variable length)             /
   +-----+-----+-----+-----+-----+-----+-----+-----+
   /                 XQUERY (variable length)      /
   +-----+-----+-----+-----+-----+-----+-----+-----+

                   Figure 11: The GetMessage Wire Format.

   where:

   HEADER  is the common message header.  Its MTYPE field is set by the
      initiator to the value 147 in network byte order.  Read-only.

   BTYPE  is a 32-bit block type field.  The block type indicates the
      content type of the payload.  Set by the initiator.  Read-only.
      In network byte order.

   FLAGS  is a 16-bit vector with binary options (see Section 6.1.2).
      Set by the initiator.  Read-only.

   HOPCOUNT  is a 16-bit number indicating how many hops this message
      has traversed so far.  Set by the initiator to 0.  Incremented by
      processing peers.  In network byte order.

   REPL_LVL  is a 16-bit number indicating the desired replication level
      of the data.  Set by the initiator.  Read-only.  In network byte
      order.

   RF_SIZE  is a 16-bit number indicating the length of the result
      filter RESULT_FILTER.  Set by the initiator.  Read-only.  In
      network byte order.

   PEER_BF  A peer Bloom filter to stop circular routes (see
      Section 5.3).  Set by the initiator to include itself and all
      connected neighbors in the routing table.  Modified by processing
      peers to include their own peer address.


   QUERY_HASH  The key under which the initiator is looking for blocks
      with this request.  The block type may use a different evaluation
      logic to determine applicable result blocks.  Set by the
      initiator.  Read-only.

   RESULT_FILTER  the variable-length result filter, described in
      Section 6.4.2.  Set by the initiator.  Modified by processing
      peers.

   XQUERY  the variable-length extended query.  Optional.  Set by the
      initiator.  Read-only.
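
   The layout in Figure 11 can be illustrated with a short Python
   sketch.  It is not normative.  It assumes that HEADER consists of a
   16-bit total message size followed by the 16-bit MTYPE; the names
   GetMessage, encode_get and decode_get are hypothetical and chosen
   only for this example.

```python
import struct
from dataclasses import dataclass

MTYPE_GET = 147  # MTYPE value for a GetMessage, per the field list above

# Assumption: HEADER = 16-bit message size + 16-bit MTYPE.  The other
# fields follow Figure 11; "!" selects network byte order, no padding.
_FIXED = struct.Struct("!HHIHHHH128s64s")  # 208 bytes before RESULT_FILTER


@dataclass
class GetMessage:
    btype: int
    flags: int
    hopcount: int
    repl_lvl: int
    peer_bf: bytes      # exactly 128 bytes
    query_hash: bytes   # exactly 64 bytes
    result_filter: bytes = b""
    xquery: bytes = b""


def encode_get(msg: GetMessage) -> bytes:
    if len(msg.peer_bf) != 128 or len(msg.query_hash) != 64:
        raise ValueError("PEER_BF must be 128 bytes, QUERY_HASH 64 bytes")
    size = _FIXED.size + len(msg.result_filter) + len(msg.xquery)
    if size > 0xFFFF:
        raise ValueError("message does not fit a 16-bit size field")
    fixed = _FIXED.pack(size, MTYPE_GET, msg.btype, msg.flags,
                        msg.hopcount, msg.repl_lvl,
                        len(msg.result_filter),  # RF_SIZE
                        msg.peer_bf, msg.query_hash)
    return fixed + msg.result_filter + msg.xquery


def decode_get(data: bytes) -> GetMessage:
    if len(data) < _FIXED.size:
        raise ValueError("truncated GetMessage")
    (size, mtype, btype, flags, hopcount, repl_lvl, rf_size,
     peer_bf, query_hash) = _FIXED.unpack_from(data)
    rf_end = _FIXED.size + rf_size
    if mtype != MTYPE_GET or size != len(data) or rf_end > len(data):
        raise ValueError("malformed GetMessage")
    # RF_SIZE delimits RESULT_FILTER; whatever remains is the XQUERY.
    return GetMessage(btype, flags, hopcount, repl_lvl, peer_bf,
                      query_hash, data[_FIXED.size:rf_end], data[rf_end:])
```

   Since XQUERY carries no length field of its own, the sketch derives
   its extent from the overall message size.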

6.4.2.  Result Filter

   The result filter is used to indicate to other peers which results
   are not of interest when processing a GetMessage (Section 6.4).  Any
   peer which is processing GetMessages and has a result which matches
   the query key MUST check the result filter and only send a reply
   message if the result does not test positive under the result filter.
   Before forwarding the GetMessage, the result filter MUST be updated
   using the result of the BTYPE-specific FilterResult (see Section 7.1)
   function to filter out all results already returned by the local
   peer.

   How a result filter is implemented depends on the block type as
   described in Section 7.1.  Result filters may be probabilistic data
   structures.  Thus, it is possible that a desirable result is
   filtered by a result filter because of a false-positive test.

   How exactly a block result is added to a result filter is specified
   as part of the definition of a block type (cf.  Section 7.2).

6.4.3.  Processing

   Upon receiving a GetMessage from a peer P, or when one is created
   through initiation by the overlay API, an implementation MUST process
   it step by step as follows:

   1.  If the BTYPE is supported, the QUERY_HASH and XQUERY fields are
       validated as defined by the respective ValidateBlockQuery
       procedure for this type.  If the result yields REQUEST_INVALID,
       the message MUST be discarded and processing ends.  If the BTYPE
       is not supported, the message MUST be forwarded (Skip to step 4).
       If the BTYPE is ANY, the message is processed further without
       validation.


   2.  The peer address of the sender peer P SHOULD be in the PEER_BF
       Bloom filter.  If not, the implementation MAY log an error, but
       MUST continue.

   3.  The local peer SHOULD try to produce a reply in any of the
       following cases: (1) if the local peer is the closest peer (cf.
       IsClosestPeer(SELF, QueryHash, PeerFilter)), (2) if the
       DemultiplexEverywhere flag is set, or (3) if the local peer is
       not the closest and a previously cached ResultMessage also
       matches this request (Section 6.5.2).

       The reply is produced (if one is available) using the following
       steps:

       a)  If the BTYPE is HELLO, the implementation MUST only consider
           synthesizing its own addresses and the addresses it has
           cached for the peers in its routing table as HELLO block
           replies.  Otherwise, if the BTYPE does not indicate a request
           for a HELLO block or ANY, the implementation MUST only
           consider blocks in the local block storage and previously
           cached ResultMessages.

       b)  If the FLAGS field includes the flag FindApproximate, the
           peer SHOULD respond with the closest block (smallest value of
           GetDistance(QUERY_HASH, BLOCK_KEY)) it can find that is not
           filtered by the RESULT_FILTER.  Otherwise, the peer MUST
           respond with the block with a BLOCK_KEY that matches the
           QUERY_HASH exactly and that is not filtered by the
           RESULT_FILTER.

       c)  Any resulting (synthesized) block is encapsulated in a
           ResultMessage.  The ResultMessage SHOULD be transmitted to
           the neighbor from which the request was received.

       Implementations MAY choose not to reply if they are
       resource-constrained.  However, ResultMessages MUST be given the
       highest priority among competing transmissions.

       If the BTYPE is supported and FilterResult for the given query
       has yielded a status of FILTER_LAST, processing MUST end and not
       continue with forwarding of the request to other peers.

   4.  The implementation SHOULD create (or merge) an entry in the
       pending table (Section 5.5) for the query represented by this
       GetMessage.  If the peer is unable to handle an additional entry
       in the table, the message MUST be discarded and processing ends.


   5.  Using the value in REPL_LVL, the number of peers to forward to
       MUST be calculated using ComputeOutDegree().  If there is at
       least one peer to forward to, the implementation SHOULD select up
       to this number of peers to forward the message to.  The
       implementation MAY forward to fewer or no peers in order to
       handle resource constraints such as bandwidth.  The peer Bloom
       filter PEER_BF MUST be updated with the local peer address SELF
       for any forwarded message.  For all peers with peer address P
       chosen to forward the message to, SEND(P, GetMessageP) is called.
       Here, GetMessageP is the original message with the updated fields
       for HOPCOUNT (incremented by 1), PEER_BF and RESULT_FILTER.
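
   The block selection in step 3 b) can be sketched as follows.  This is
   a minimal illustration, not a definitive implementation: it assumes
   that GetDistance is the XOR of the two keys interpreted as integers,
   and the names get_distance, select_reply and is_filtered are
   hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple


def get_distance(a: bytes, b: bytes) -> int:
    # Assumption: distance is the XOR of both keys read as big-endian
    # integers; smaller values mean "closer".
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def select_reply(query_hash: bytes,
                 candidates: Iterable[Tuple[bytes, object]],
                 is_filtered: Callable[[object], bool],
                 find_approximate: bool) -> Optional[Tuple[bytes, object]]:
    """Pick the (BLOCK_KEY, block) pair to answer with, or None.

    candidates  -- (BLOCK_KEY, block) pairs from local storage or cache
    is_filtered -- True if the block tests positive in the result filter
    """
    usable = [(k, b) for k, b in candidates if not is_filtered(b)]
    if not find_approximate:
        # Without FindApproximate only exact key matches qualify.
        usable = [(k, b) for k, b in usable if k == query_hash]
    if not usable:
        return None
    return min(usable, key=lambda kb: get_distance(query_hash, kb[0]))
```

   With FindApproximate set, a block that is filtered out simply makes
   room for the next closest unfiltered block.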

6.5.  ResultMessage

   ResultMessages are used to return information to other peers in the
   DHT or to applications using the overlay API that previously
   initiated a GetMessage.  The initiator of a ResultMessage is a peer
   triggered through the processing of a GetMessage.

6.5.1.  Wire Format

   0     8     16    24    32    40    48    56
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |        HEADER         |        BTYPE          |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |  RESERVED |   FLAGS   | PUTPATH_L | GETPATH_L |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                   EXPIRATION                  |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                  QUERY_HASH                   /
   /                 (64 byte)                     |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   /       TRUNCATED ORIGIN (0 or 32 bytes)        /
   +-----+-----+-----+-----+-----+-----+-----+-----+
   /                  PUTPATH                      /
   /                 (variable length)             /
   +-----+-----+-----+-----+-----+-----+-----+-----+
   /                  GETPATH                      /
   /                 (variable length)             /
   +-----+-----+-----+-----+-----+-----+-----+-----+
   /      LAST HOP SIGNATURE (0 or 64 bytes)       /
   +-----+-----+-----+-----+-----+-----+-----+-----+
   /                  BLOCK                        /
   /                 (variable length)             /
   +-----+-----+-----+-----+-----+-----+-----+-----+

                  Figure 12: The ResultMessage Wire Format


   where:

   HEADER  is the common message header.  Its MTYPE field must be set to
      the value 148 in network byte order.  Set by the initiator.  Read-
      only.

   BTYPE  is a 32-bit block type field.  The block type indicates the
      content type of the payload.  Set by the initiator.  Read-only.
      In network byte order.

   RESERVED  is a 16-bit value.  Implementations MUST set this value to
      zero when originating a result message.  Implementations MUST
      forward this value unchanged even if it is non-zero.

   FLAGS  is a 16-bit vector with binary options (see Section 6.1.2).
      Set by the initiator.

   PUTPATH_L  is a 16-bit number indicating the number of Path Elements
      recorded in PUTPATH.  As PUTPATH is optional, this value may be
      zero even if the message has traversed several peers.  Set by the
      initiator to the PATH_LEN of the PutMessage from which the block
      originated.  Modified by processing peers in case of path
      truncation.  In network byte order.

   GETPATH_L  is a 16-bit number indicating the number of Path Elements
      recorded in GETPATH.  As GETPATH is optional, this value may be
      zero even if the message has traversed several peers.  Set by the
      initiator to 0.  Modified by processing peers.  In network byte
      order.

   EXPIRATION  denotes the absolute 64-bit expiration date of the
      content.  In microseconds since midnight (0 hour), January 1, 1970
      in network byte order.  Set by the initiator to the expiration
      value as recorded from the PutMessage from which the block
      originated.  Read-only.

   QUERY_HASH  the query hash corresponding to the GetMessage which
      caused this reply message to be sent.  Set by the initiator using
      the value of the GetMessage.  Read-only.

   TRUNCATED ORIGIN  is only provided if the TRUNCATED flag is set in
      FLAGS.  If present, this is the public key of the peer just before
      the first entry on the PUTPATH and the first peer on the PUTPATH
      is not the actual origin of the message.  Thus, to verify the
      first signature on the PUTPATH, this public key must be used.
      Note that due to the truncation, this last hop cannot be verified
      to exist.  Set by processing peers.


   PUTPATH  the variable-length PUT path.  The path consists of a list
      of PUTPATH_L Path Elements.  Set by the initiator to the
      PUTPATH of the PutMessage from which the block originated.
      Modified by processing peers in case of path truncation.

   GETPATH  the variable-length GET path.  The path consists of a list
      of GETPATH_L Path Elements.  Set by processing peers.

   LAST HOP SIGNATURE  is only provided if the RecordRoute flag is set
      in FLAGS.  If present, this is an EdDSA signature of the sender of
      this message (using the same format as the signatures in PUTPATH)
      affirming that the sender forwarded the message from the
      predecessor (all zeros if PATH_LEN is 0, otherwise the last peer
      in PUTPATH) to the target peer.

   BLOCK  the variable-length resource record data payload.  The
      contents are defined by the respective type of the resource
      record.  Set by the initiator.  Read-only.

6.5.2.  Processing

   Upon receiving a ResultMessage from a connected peer or triggered by
   the processing of a GetMessage, an implementation MUST process it
   step by step as follows:

   1.  First, the EXPIRATION field is evaluated.  If the message is
       expired, it MUST be discarded.

   2.  If the BTYPE is supported, then the BLOCK MUST be validated
       against the requested BTYPE.  To do this, the peer checks that
       the block is valid using ValidateBlockStoreRequest.  If the
       result is BLOCK_INVALID, the message MUST be discarded.

   3.  If the PUTPATH_L or the GETPATH_L is non-zero, the local peer
       SHOULD verify the signatures from the PUTPATH and the GETPATH.
       Verification MAY involve checking all signatures or any random
       subset of the signatures.  It is RECOMMENDED that peers adapt
       their behavior to available computational resources so as to not
       make signature verification a bottleneck.  If an invalid
       signature is found, the path MUST be truncated to only include
       the elements following the invalid signature.  In particular, any
       invalid signature on the GETPATH will cause PUTPATH_L to be set
       to 0.

   4.  The peer also attempts to compute the key using DeriveBlockKey.
       This may result in NONE.  The result is used later.  Note that
       even if a key was computed, it does not have to match the
       QUERY_HASH.


   5.  If the BTYPE of the message indicates a HELLO block, the peer
       SHOULD be considered for the local routing table by using the
       peer address computed from the block using DeriveBlockKey.  An
       implementation MAY choose to ignore the HELLO, for example
       because the routing table or the respective k-bucket is already
       full.  If the peer is a suitable candidate for insertion, the
       local peer MUST try to establish a connection to the peer
       indicated in the HELLO block using the address information from
       the HELLO block and the Underlay function TRY_CONNECT.  The
       implementation MUST instruct the Underlay to connect to all
       provided addresses using TRY_CONNECT in order to make the
       underlay aware of multiple addresses for this connection.  When a
       connection is established, the signal PEER_CONNECTED will cause
       the peer to be added to the respective k-bucket of the routing
       table (Section 5).

   6.  If the QUERY_HASH of this ResultMessage does not match an entry
       in the pending table (Section 5.5), then the message is discarded
       and processing ends.  Otherwise, processing continues for each
       entry in the table as follows.

       a)  If the FindApproximate flag was not set in the query and the
           BTYPE allowed the implementation to compute the key from the
           block, the computed key must exactly match the QUERY_HASH,
           otherwise the result does not match the pending query and
           processing continues with the next pending query.

       b)  If the BTYPE is supported, the result block MUST be
           validated against the specific query using the respective
           FilterResult function.  This function MUST update the
           result filter if a result is returned to the originator of
           the query.

       c)  If the BTYPE is not supported, filtering of exact duplicate
           replies MUST still be performed before forwarding the reply.
           Such duplicate filtering MAY be implemented
           probabilistically, for example using a Bloom filter.  The
           result of this duplicate filtering is always either
           FILTER_MORE or FILTER_DUPLICATE.

       d)  If the RecordRoute flag is set in FLAGS, the local peer
           address MUST be appended to the GETPATH of the message and
           the respective signature MUST be set using the query origin
           as the PEER SUCCESSOR and the response origin as the PEER
           PREDECESSOR.  If the flag is not set, the GETPATH_L and
           PUTPATH_L MUST be set to zero when forwarding the result.


       e)  If the filter evaluation result is either FILTER_MORE or
           FILTER_LAST, the message is forwarded to the origin of the
           query as defined in the entry, which may be either the local
           peer or a remote peer.  If this is a query of the local
           peer, the result may have to be provided to applications
           through the overlay API.  Otherwise, the result is forwarded
           using SEND(P, ResultMessage') where ResultMessage' is the now
           modified message.  If the result was FILTER_LAST, the query
           is removed from the pending table.

   7.  Finally, the implementation SHOULD cache ResultMessages in order
       to provide already seen replies to future GetMessages.  The
       implementation MAY choose not to cache any ResultMessages, or to
       cache only a limited number, for reasons such as resource
       limitations.
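
   The path truncation rule from step 3 can be sketched as follows.
   The sketch treats PUTPATH followed by GETPATH as one sequence and
   keeps only the elements after the last invalid signature.  The
   callback is_valid stands in for the actual signature check, and
   recording the TRUNCATED ORIGIN is left out; both are simplifications
   made for this example, and the name truncate_paths is hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def truncate_paths(put_path: Sequence, get_path: Sequence,
                   is_valid: Callable[[object], bool]
                   ) -> Tuple[List, List]:
    """Return the (PUTPATH, GETPATH) that remain after truncation."""
    combined = list(put_path) + list(get_path)
    last_bad = -1
    for i, element in enumerate(combined):
        if not is_valid(element):
            last_bad = i
    if last_bad < 0:
        return list(put_path), list(get_path)  # nothing to truncate
    kept = combined[last_bad + 1:]
    n_put = len(put_path)
    if last_bad >= n_put:
        # Invalid signature on the GETPATH: the PUTPATH is dropped
        # entirely, so PUTPATH_L becomes 0.
        return [], kept
    remaining_put = n_put - last_bad - 1
    return kept[:remaining_put], kept[remaining_put:]
```

   An implementation that verifies only a random subset of signatures
   would apply the same rule to the subset it actually checked.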

7.  Blocks

   This section describes considerations that R^5N implementations must
   take into account with respect to blocks.  Specifically,
   implementations SHOULD be able to validate and persist blocks.
   Implementations are not required to support validation for all types
   of blocks.  On some devices, storing blocks may also be impossible
   due to lack of storage capacity.

   Applications can and should define their own block types.  The block
   type determines the format and handling of the block payload by peers
   in PutMessages and ResultMessages.  Block types MUST be registered
   with GANA (see Section 10.1).


7.1.  Block Operations

   Block validation may be necessary for all types of DHT messages.  To
   enable these validations, any block type specification MUST define
   the following functions:

   ValidateBlockQuery(Key, XQuery) -> RequestEvaluationResult  is used
      to evaluate the request for a block as part of GetMessage
      processing.  Here, the block payload is unknown, but if possible
      the XQuery and Key SHOULD be verified.  Possible values for the
      RequestEvaluationResult are:

      REQUEST_VALID  Query is valid.

      REQUEST_INVALID  Query format does not match block type.  For
         example, a mandatory XQuery was not provided, or the size of
         the XQuery is not appropriate for the block type.


   DeriveBlockKey(Block) -> Key | NONE  is used to synthesize the block
      key from the block payload as part of PutMessage and ResultMessage
      processing.  The special return value of NONE implies that this
      block type does not permit deriving the key from the block.  A Key
      may be returned for a block that is ill-formed.

   ValidateBlockStoreRequest(Block) -> BlockEvaluationResult  is used to
      evaluate a block payload as part of PutMessage and ResultMessage
      processing.  Possible values for the BlockEvaluationResult are:

      BLOCK_VALID  Block is valid.

      BLOCK_INVALID  Block payload does not match the block type.

   SetupResultFilter(FilterSize, Mutator) -> RF  is used to set up an
      empty result filter.  The arguments are the set of results that
      must be filtered at the initiator, and a MUTATOR value which MAY
      be used to deterministically re-randomize probabilistic data
      structures.  The specification MUST also include the wire format
      for RF.

   FilterResult(Block, Key, RF, XQuery) -> (FilterEvaluationResult,
   RF')  is used to filter results against specific queries.  This
      function does not check the validity of Block itself or that it
      matches the given key, as this must have been checked earlier.
      Thus, locally stored blocks from previously observed
      ResultMessages and PutMessages use this function to perform
      filtering based on the request parameters of a particular GET
      operation.  Possible values for the FilterEvaluationResult are:

      FILTER_MORE  Valid result, and there may be more.

      FILTER_LAST  Last possible valid result.

      FILTER_DUPLICATE  Valid result, but duplicate (was filtered by the
         result filter).

      FILTER_IRRELEVANT  Block does not satisfy the constraints imposed
         by the XQuery.

      If the main evaluation result is FILTER_MORE, the function also
      returns an updated result filter where the block is added to the
      set of filtered replies.  An implementation is not expected to
      actually differentiate between the FILTER_DUPLICATE and
      FILTER_IRRELEVANT return values: in both cases the block is
      ignored for this query.
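
   One possible way to express the functions above in code is an
   abstract interface that each block type implements.  The Python
   sketch below is illustrative only; the class and method names are
   hypothetical renderings of the functions listed in this section, and
   None is used to represent the NONE return value.

```python
from abc import ABC, abstractmethod
from enum import Enum, auto
from typing import Optional, Tuple


class RequestEvaluationResult(Enum):
    REQUEST_VALID = auto()
    REQUEST_INVALID = auto()


class BlockEvaluationResult(Enum):
    BLOCK_VALID = auto()
    BLOCK_INVALID = auto()


class FilterEvaluationResult(Enum):
    FILTER_MORE = auto()
    FILTER_LAST = auto()
    FILTER_DUPLICATE = auto()
    FILTER_IRRELEVANT = auto()


class BlockType(ABC):
    """Operations every block type specification has to define."""

    @abstractmethod
    def validate_block_query(self, key: bytes, xquery: bytes
                             ) -> RequestEvaluationResult: ...

    @abstractmethod
    def derive_block_key(self, block: bytes) -> Optional[bytes]: ...

    @abstractmethod
    def validate_block_store_request(self, block: bytes
                                     ) -> BlockEvaluationResult: ...

    @abstractmethod
    def setup_result_filter(self, filter_size: int, mutator: int
                            ) -> bytes: ...

    @abstractmethod
    def filter_result(self, block: bytes, key: bytes, rf: bytes,
                      xquery: bytes
                      ) -> Tuple[FilterEvaluationResult, bytes]: ...


def is_reply_candidate(result: FilterEvaluationResult) -> bool:
    # FILTER_DUPLICATE and FILTER_IRRELEVANT are treated alike: the
    # block is ignored for this query.
    return result in (FilterEvaluationResult.FILTER_MORE,
                      FilterEvaluationResult.FILTER_LAST)
```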


7.2.  HELLO Blocks

   For bootstrapping and peer discovery, the DHT implementation uses its
   own block type called "HELLO".  HELLO blocks are the only type of
   block that MUST be supported by every R^5N implementation.  A block
   with this block type contains the peer ID of the peer that published
   the HELLO together with a set of addresses of this peer.  The key of
   a HELLO block is the SHA-512 of the peer ID and thus the peer's
   address in the DHT.

   The HELLO block type wire format is illustrated in Figure 13.  A
   query for a block of type HELLO MUST NOT include extended query data
   (XQuery).  Any implementation encountering a request for a HELLO with
   non-empty XQuery data MUST consider the request invalid and ignore
   it.

   0     8     16    24    32    40    48    56
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                     PEER-ID                   |
   |                    (32 byte)                  |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                    SIGNATURE                  |
   |                    (64 byte)                  |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   |                                               |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   |                   EXPIRATION                  |
   +-----+-----+-----+-----+-----+-----+-----+-----+
   /                   ADDRESSES                   /
   /               (variable length)               /
   +-----+-----+-----+-----+-----+-----+-----+-----+

                     Figure 13: The HELLO Block Format.

   PEER-ID  is the Peer-ID of the peer which has generated this HELLO.

   EXPIRATION  denotes the absolute 64-bit expiration date of the
      content.  The value specified is microseconds since midnight (0
      hour), January 1, 1970, but must be a multiple of one million (so
      that it can be represented in seconds in a HELLO URL).  Stored in
      network byte order.


   ADDRESSES  is a list of UTF-8 addresses (Section 2) which can be used
      to contact the peer.  Each address MUST be 0-terminated.  The set
      of addresses MAY be empty.

   SIGNATURE  is the signature of the HELLO.  It covers an 80-byte
      pseudo header derived from the information in the HELLO block.
      The pseudo header includes the expiration time, a constant that
      uniquely identifies the purpose of the signature, and a hash over
      the addresses.  The wire format is illustrated in Figure 14.

      0     8     16    24    32    40    48    56
      +-----+-----+-----+-----+-----+-----+-----+-----+
      |         SIZE          |       PURPOSE         |
      +-----+-----+-----+-----+-----+-----+-----+-----+
      |                   EXPIRATION                  |
      +-----+-----+-----+-----+-----+-----+-----+-----+
      |                   H_ADDRS                     |
      |                  (64 byte)                    |
      |                                               |
      |                                               |
      |                                               |
      |                                               |
      |                                               |
      |                                               |
      +-----+-----+-----+-----+-----+-----+-----+-----+

             Figure 14: The Wire Format of the HELLO for Signing.

      SIZE  A 32-bit value containing the length of the signed data in
         bytes in network byte order.  The length of the signed data
         MUST be 80 bytes.

      PURPOSE  A 32-bit signature purpose flag.  This field MUST be 7
         (in network byte order).

      EXPIRATION  denotes the absolute 64-bit expiration date of the
         HELLO.  In microseconds since midnight (0 hour), January 1,
         1970 in network byte order.

      H_ADDRS  a SHA-512 hash over the addresses in the HELLO.  H_ADDRS
         is generated over the ADDRESSES field as provided in the HELLO
         block using SHA-512 [RFC4634].
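
   The two layouts in Figure 13 and Figure 14 can be illustrated with
   the following Python sketch, which parses a HELLO block, rebuilds
   the 80-byte data that the signature covers, and derives the block
   key.  The function names are hypothetical, and the EdDSA
   verification itself is omitted because it needs a cryptographic
   library outside the standard library.

```python
import hashlib
import struct

HELLO_FIXED = 32 + 64 + 8  # PEER-ID + SIGNATURE + EXPIRATION


def parse_hello(block: bytes):
    if len(block) < HELLO_FIXED:
        raise ValueError("truncated HELLO block")
    peer_id = block[:32]
    signature = block[32:96]
    (expiration,) = struct.unpack_from("!Q", block, 96)
    raw_addresses = block[HELLO_FIXED:]
    if raw_addresses and raw_addresses[-1] != 0:
        raise ValueError("every address must be 0-terminated")
    # Splitting on the terminator leaves one empty trailing item.
    addresses = [a.decode("utf-8")
                 for a in raw_addresses.split(b"\0")[:-1]]
    return peer_id, signature, expiration, raw_addresses, addresses


def hello_signed_data(expiration: int, raw_addresses: bytes) -> bytes:
    # H_ADDRS is SHA-512 over the ADDRESSES field as it appears in the
    # block; SIZE is 80 and PURPOSE is 7, both in network byte order.
    h_addrs = hashlib.sha512(raw_addresses).digest()
    return struct.pack("!IIQ64s", 80, 7, expiration, h_addrs)


def hello_block_key(block: bytes) -> bytes:
    # The key of a HELLO block is the SHA-512 hash of the PEER-ID.
    return hashlib.sha512(block[:32]).digest()
```

   A verifier would pass the output of hello_signed_data, the SIGNATURE
   and the PEER-ID to an EdDSA verification routine.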

   The HELLO block functions MUST be implemented as follows:

   ValidateBlockQuery(Key, XQuery) -> RequestEvaluationResult  To
      validate a block query for a HELLO is to simply check that the
      XQuery is empty.  If it is empty, REQUEST_VALID is returned.
      Otherwise, REQUEST_INVALID.

   DeriveBlockKey(Block) -> Key | NONE  To derive a block key for a
      HELLO is to simply hash the peer ID from the HELLO.  The result of
      this function is always the SHA-512 hash over the PEER-ID.

   ValidateBlockStoreRequest(Block) -> BlockEvaluationResult  To
      validate a block store request is to verify the EdDSA SIGNATURE
      over the hashed ADDRESSES against the public key from the peer ID
      field.  If the signature is valid, BLOCK_VALID is returned.
      Otherwise, BLOCK_INVALID is returned.

   SetupResultFilter(FilterSize, Mutator) -> RF  The RESULT_FILTER for
      HELLO blocks is implemented using a Bloom filter following the
      definition from Appendix A and consists of a variable number of
      buckets L.  L depends on the number of connected peers |E| known
      to the peer creating a HELLO block from its own addresses: L is
      set to the minimum of 2^18 bits (2^15 bytes) and the lowest power
      of 2 that is strictly larger than 2*K*|E| bits (K*|E|/4 bytes).

      The k-value for the Bloom filter is 16.  The elements used in the
      Bloom filter consist of an XOR between the H_ADDRS field (as
      computed using SHA-512 over the ADDRESSES) and the SHA-512 hash of
      the MUTATOR field from a given HELLO block.  The mapping function
      M(H_ADDRS XOR MUTATOR) is defined as follows:

      M(e = H_ADDRS XOR MUTATOR) -> e as uint32[]

      M is an identity function and returns the 512-bit XOR result
      unmodified.  This resulting byte string is interpreted as k=16
      32-bit integers in network byte order which are used to set and
      check the bucket bits in B using BF-SET and BF-TEST.  The 32-bit
      Mutator is prepended to the L-bit Bloom filter bucket field
      HELLO_BF containing B to create the result filter for a HELLO
      block:

      0     8     16    24    32    40    48    56
      +-----+-----+-----+-----+-----+-----+-----+-----+
      |        MUTATOR        |  HELLO_BF             /
      +-----+-----+-----+-----+  (variable length)    /
      /                                               /
      +-----+-----+-----+-----+-----+-----+-----+-----+

                  Figure 15: The HELLO Block Result Filter.

      where:


      MUTATOR  The 32-bit mutator for the result filter.

      HELLO_BF  The L-bit Bloom filter buckets byte array.

      The MUTATOR value is used to additionally "randomize" the
      computation of the Bloom filter while remaining deterministic
      across peers.  It is only ever set by the peer initiating the GET
      request, and changed every time the GET request is repeated.
      Peers forwarding GET requests MUST NOT change the mutator value
      included in the RESULT_FILTER as they might not be able to
      recalculate the result filter with a different MUTATOR value.

      Consequently, repeated requests have statistically independent
      probabilities of creating false-positives in a result filter.
      Thus, even if for one request a result filter may exclude a result
      as a false-positive match, subsequent requests are likely to not
      have the same false-positives.

      HELLO result filters can be merged if the Bloom filters have the
      same size and MUTATOR by setting all bits to 1 that are set in
      either Bloom filter.  This is done whenever a peer receives a
      query with the same MUTATOR, predecessor and Bloom filter size.

   FilterResult(Block, Key, RF, XQuery) -> (FilterEvaluationResult,
   RF')  The H_ADDRS field is XORed with the SHA-512 hash of the MUTATOR
      field from the HELLO block and the resulting value is checked
      against the Bloom filter in RF.  Consequently, HELLOs with
      completely identical sets of addresses will be filtered and
      FILTER_DUPLICATE is returned.  Any small variation in the set of
      addresses will cause the block to no longer be filtered (with high
      probability) and FILTER_MORE is returned.
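
   The HELLO result filter logic can be sketched in Python as shown
   below.  Several details are assumptions of this sketch rather than
   statements of this section: the MUTATOR is hashed as four bytes in
   network byte order, each 32-bit value selects the bit at position
   (value mod L), and bits are numbered from the least significant bit
   of each byte.  The authoritative BF-SET and BF-TEST definitions are
   those of Appendix A.  All function names are hypothetical.

```python
import hashlib
import struct


def bf_element(h_addrs: bytes, mutator: int) -> bytes:
    # Assumption: SHA-512 is computed over the 32-bit MUTATOR in
    # network byte order, then XORed with the 64-byte H_ADDRS.
    m = hashlib.sha512(struct.pack("!I", mutator)).digest()
    return bytes(a ^ b for a, b in zip(h_addrs, m))


def _bit_positions(element: bytes, nbits: int):
    # k=16 indices: the 512-bit element read as 16 big-endian uint32.
    return [v % nbits for v in struct.unpack("!16I", element)]


def bf_test(bf: bytes, element: bytes) -> bool:
    return all(bf[i // 8] & (1 << (i % 8))
               for i in _bit_positions(element, len(bf) * 8))


def bf_set(bf: bytearray, element: bytes) -> None:
    for i in _bit_positions(element, len(bf) * 8):
        bf[i // 8] |= 1 << (i % 8)


def filter_hello(addresses: bytes, mutator: int, bf: bytes):
    """Return (result, RF') for a HELLO with the given ADDRESSES."""
    h_addrs = hashlib.sha512(addresses).digest()
    element = bf_element(h_addrs, mutator)
    if bf_test(bf, element):
        return "FILTER_DUPLICATE", bf
    updated = bytearray(bf)
    bf_set(updated, element)
    return "FILTER_MORE", bytes(updated)


def merge_filters(a: bytes, b: bytes) -> bytes:
    # Only valid for filters of equal size and equal MUTATOR.
    if len(a) != len(b):
        raise ValueError("Bloom filters differ in size")
    return bytes(x | y for x, y in zip(a, b))
```

   Because the element depends on the MUTATOR, a repeated request with
   a fresh MUTATOR maps the same HELLO to different bit positions.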

7.3.  Persistence

   An implementation SHOULD provide a local persistence mechanism for
   blocks.  Embedded systems that lack storage capability MAY use
   connection-level signalling to indicate that they are merely a client
   utilizing a DHT and are not able to participate with storage.  The
   local storage MUST provide the following functionality:

   Store(Key, Block)  Stores a block under the specified key.  If a
      block with identical payload exists already under the same key,
      the metadata should be set to the maximum expiration time of both
      blocks and use the corresponding PUTPATH (and if applicable
      TRUNCATED ORIGIN) of that version of the block.

   Lookup(Key) -> List of Blocks  Retrieves blocks stored under the
      specified key.


   LookupApproximate(Key) -> List of Blocks  Retrieves the blocks stored
      under the specified key and any blocks under keys close to the
      specified key, in order of decreasing proximity.
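
   The merge rule of Store can be illustrated with a small in-memory
   sketch.  The class BlockStore and its method signatures are
   hypothetical; a real implementation would typically use persistent
   storage and also keep the TRUNCATED ORIGIN alongside the PUTPATH.

```python
from typing import Dict, List, Sequence, Tuple


class BlockStore:
    """Minimal in-memory store: key -> {payload: (expiration, path)}."""

    def __init__(self) -> None:
        self._by_key: Dict[bytes, Dict[bytes, Tuple[int, tuple]]] = {}

    def store(self, key: bytes, payload: bytes, expiration: int,
              put_path: Sequence = ()) -> None:
        entries = self._by_key.setdefault(key, {})
        existing = entries.get(payload)
        # For identical payloads keep the later expiration together
        # with the PUTPATH of the version that carried it.
        if existing is None or expiration > existing[0]:
            entries[payload] = (expiration, tuple(put_path))

    def lookup(self, key: bytes) -> List[Tuple[bytes, int, tuple]]:
        return [(payload, exp, path)
                for payload, (exp, path)
                in self._by_key.get(key, {}).items()]
```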

7.3.1.  Approximate Search Considerations

   Over time a peer may accumulate a significant number of blocks which
   are stored locally in the persistence layer.  Due to the expected
   high number of blocks, the method to retrieve blocks close to the
   specified lookup key in the LookupApproximate API must be implemented
   with care with respect to efficiency.

   It is RECOMMENDED to limit the number of results from the
   LookupApproximate procedure to a result size which is easily
   manageable by the local system.

   In order to efficiently find a suitable result set, the
   implementation SHOULD follow the following procedure:

   1.  Sort all blocks by the block key in ascending (descending) order.
       The block keys are interpreted as integers.

   2.  Alternately select a block with a key larger and smaller from
       the sortings.  The resulting set is sorted by XOR distance.  The
       selection process continues until the upper bound for the result
       set is reached and both sortings do not yield any closer blocks.

   An implementation MAY decide to use a custom algorithm in order to
   find the closest blocks in the local storage.  However, more
   primitive approaches, such as comparing the XOR distances of all
   blocks in the storage, may become inefficient for large storages.
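
   The recommended procedure can be sketched as follows.  This is a
   minimal illustration, not a normative algorithm: the function name
   lookup_approximate, the in-memory sorted list of integer keys, and
   the choice to stop as soon as the result bound is reached are all
   assumptions made for this example.

```python
from bisect import bisect_left


def lookup_approximate(sorted_keys, query, limit):
    """Return up to `limit` keys near `query`, ordered by XOR distance.

    `sorted_keys` is an ascending list of block keys as integers
    (hypothetical in-memory stand-in for the block storage).
    """
    hi = bisect_left(sorted_keys, query)  # index of first key >= query
    lo = hi - 1                           # index of last key < query
    candidates = []
    take_larger = True
    while len(candidates) < limit and (lo >= 0 or hi < len(sorted_keys)):
        # Alternate between larger and smaller keys; fall back to the
        # other direction once one side is exhausted.
        if (take_larger and hi < len(sorted_keys)) or lo < 0:
            candidates.append(sorted_keys[hi])
            hi += 1
        else:
            candidates.append(sorted_keys[lo])
            lo -= 1
        take_larger = not take_larger
    # Decreasing proximity means increasing XOR distance.
    return sorted(candidates, key=lambda k: k ^ query)
```

   For example, with the keys 1, 4, 8, 9 and 15, a query for key 8 with
   a bound of 3 selects the keys 8, 9 and 4, in that order.  A single
   sorted list walked in both directions stands in for the two
   sortings.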

7.3.2.  Caching Strategy Considerations

   An implementation MUST implement an eviction strategy for blocks
   stored in the block storage layer.

   In order to ensure the freshness of blocks, an implementation MUST
   evict expired blocks in favor of new blocks.

   An implementation MAY preserve blocks which are often requested.
   This approach can be expensive as it requires the implementation to
   keep track of how often a block is requested.

   An implementation MAY preserve blocks which are close to the local
   peer ID.


   An implementation MAY provide configurable storage quotas and adapt
   its eviction strategy based on the current storage size or other
   constrained resources.
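
   One possible eviction policy combining these considerations is
   sketched below.  It is an assumption-laden example, not a required
   design: StoredBlock, evict and the capacity parameter are
   hypothetical, and proximity to the local peer ID is used as the only
   tie-breaker once expired blocks are gone.

```python
import time
from dataclasses import dataclass


@dataclass
class StoredBlock:
    key: int           # block key interpreted as an integer
    expiration: float  # seconds since the UNIX Epoch
    payload: bytes


def evict(blocks, local_id, capacity, now=None):
    """Return the blocks to keep, at most `capacity` of them."""
    now = time.time() if now is None else now
    # Expired blocks are always dropped first.
    fresh = [b for b in blocks if b.expiration > now]
    # If still over capacity, keep the blocks closest (by XOR distance)
    # to the local peer ID.
    fresh.sort(key=lambda b: b.key ^ local_id)
    return fresh[:capacity]
```

   A request counter could be added to the sort key to also preserve
   popular blocks, at the bookkeeping cost noted above.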

8.  Security Considerations

   If the upper bound on the number of neighbors in a k-bucket is
   reached, the implementation MUST prefer to preserve the oldest
   working connections instead of new connections.  This makes Sybil
   attacks less effective, as an adversary would have to invest more
   resources over time to mount an effective attack.
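
   A minimal sketch of this preference is shown below.  The KBucket
   class and its method names are hypothetical and only illustrate the
   rule that a full bucket never displaces a working connection in
   favor of a new one.

```python
class KBucket:
    """Bounded set of neighbors that never displaces a working peer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.peers = []  # ordered oldest connection first

    def offer(self, peer_id):
        """Try to add a newly connected peer; return True if accepted."""
        if peer_id in self.peers:
            return True
        if len(self.peers) >= self.capacity:
            return False  # bucket full: keep the older connections
        self.peers.append(peer_id)
        return True

    def disconnected(self, peer_id):
        """Remove a peer whose connection no longer works."""
        if peer_id in self.peers:
            self.peers.remove(peer_id)
```

   A new peer is only accepted once an existing connection has stopped
   working and has been removed.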

   The ComputeOutDegree function limits the REPL_LVL to a maximum of 16.
   This imposes an upper limit on bandwidth amplification an attacker
   may achieve for a given network size and topology.

8.1.  Approximate Result Filtering

   When a FindApproximate request is encountered, a peer will try to
   respond with the closest block it has that is not filtered by the
   result Bloom filter.  Implementations MUST ensure that the cost of
   evaluating any such query is reasonably small.  For example,
   implementations MAY avoid an exhaustive search of their database.
   An unbounded search can lead to denial-of-service attacks, as there
   could be cases where too many local results are filtered by the
   result filter.

8.2.  Access Control

   By design, R^5N does not rely on strict admission control through
   the use of either centralized enrollment servers or pre-shared keys.
   This is a key distinction from protocols that do rely on this kind
   of access control, such as [RFC6940], which, like R^5N, provides a
   peer-to-peer (P2P) signaling protocol with extensible routing and
   topology mechanisms.  Some decentralized applications, such as the
   GNU Name System ([I-D.schanzen-gns]), require a more open system
   that enables ad-hoc participation and other means to prevent common
   attacks on P2P overlays.  GNS, for example, would conflict with its
   goals of providing a solution to the issues of a "Single Hierarchy
   with a Centrally Controlled Root" and "Distribution and Management
   of Root Servers" in DNS as raised in [RFC8324] if it depended on
   such admission control.

9.  IANA Considerations

   IANA maintains a registry called the "Uniform Resource Identifier
   (URI) Schemes" registry.


9.1.  GNUnet URI Scheme Registration

   IANA maintains the "Uniform Resource Identifier (URI) Schemes"
   registry.  The registry should be updated to include an entry for the
   'gnunet' URI scheme.  IANA is requested to update that entry to
   reference this document when published as an RFC.

9.2.  R5N URI Scheme Registration

   IANA maintains the "Uniform Resource Identifier (URI) Schemes"
   registry.  The registry should be updated to include an entry for the
   'r5n+udp+ip' URI scheme.  IANA is requested to update that entry to
   reference this document when published as an RFC.

10.  GANA Considerations

10.1.  Block Type Registry

   GANA [GANA] is requested to create a "DHT Block Types" registry.  The
   registry shall record for each entry:

   *  Name: The name of the block type (case-insensitive ASCII string,
      restricted to alphanumeric characters)

   *  Number: 32-bit

   *  Comment: Optionally, a brief English text describing the purpose
      of the block type (in UTF-8)

   *  Contact: Optionally, the contact information of a person to
      contact for further information

   *  References: Required, references (such as an RFC) specifying the
      block type and its block functions

   The registration policy for this sub-registry is "First Come First
   Served", as described in [RFC8126].  GANA created the registry as
   follows:

   Number| Name           | References | Description
   ------+----------------+------------+-------------------------
   0       ANY              [This.I-D]   Reserved
   13      DHT_HELLO        [This.I-D]   Address data for a peer

   Contact: r5n-registry@gnunet.org

                    Figure 16: The Block Type Registry.


10.2.  GNUnet URI Scheme Subregistry

   GANA [GANA] is requested to create a "gnunet://" sub-registry.  The
   registry shall record for each entry:

   *  Name: The name of the subsystem (case-insensitive ASCII string,
      restricted to alphanumeric characters)

   *  Comment: Optionally, a brief English text describing the purpose
      of the subsystem (in UTF-8)

   *  Contact: Optionally, the contact information of a person to
      contact for further information

   *  References: Optionally, references describing the syntax of the
      URL (such as an RFC or LSD)

   The registration policy for this sub-registry is "First Come First
   Served", as described in [RFC8126].  GANA created this registry as
   follows:

   Name           | References | Description
   ---------------+------------+-------------------------
   HELLO            [This.I-D]   How to contact a peer.
   ADDRESS          N/A          Network address.

   Contact: gnunet-registry@gnunet.org

                   Figure 17: GNUnet scheme Subregistry.

10.3.  GNUnet Signature Purpose Registry

   GANA amended the "GNUnet Signature Purpose" registry as follows:

   Purpose | Name            | References | Description
   --------+-----------------+------------+---------------
   6         DHT PATH Element  [This.I-D]   DHT message routing data
   7         HELLO Payload     [This.I-D]   Peer contact information

             Figure 18: The Signature Purpose Registry Entries.

10.4.  GNUnet Message Type Registry

   GANA is requested to amend the "GNUnet Message Type" registry as
   follows:


   Type    | Name            | References | Description
   --------+-----------------+------------+---------------
   146       DHT PUT          [This.I-D]    Store information in DHT
   147       DHT GET          [This.I-D]    Request information from DHT
   148       DHT RESULT       [This.I-D]    Return information from DHT
   157       HELLO Message    [This.I-D]    Peer contact information

               Figure 19: The Message Type Registry Entries.

11.  Test Vectors

12.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
              2003, <https://www.rfc-editor.org/info/rfc3629>.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, DOI 10.17487/RFC3986, January 2005,
              <https://www.rfc-editor.org/info/rfc3986>.

   [RFC4634]  Eastlake 3rd, D. and T. Hansen, "US Secure Hash Algorithms
              (SHA and HMAC-SHA)", RFC 4634, DOI 10.17487/RFC4634, July
              2006, <https://www.rfc-editor.org/info/rfc4634>.

   [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234,
              DOI 10.17487/RFC5234, January 2008,
              <https://www.rfc-editor.org/info/rfc5234>.

   [RFC6940]  Jennings, C., Lowekamp, B., Ed., Rescorla, E., Baset, S.,
              and H. Schulzrinne, "REsource LOcation And Discovery
              (RELOAD) Base Protocol", RFC 6940, DOI 10.17487/RFC6940,
              January 2014, <https://www.rfc-editor.org/info/rfc6940>.

   [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
              Writing an IANA Considerations Section in RFCs", BCP 26,
              RFC 8126, DOI 10.17487/RFC8126, June 2017,
              <https://www.rfc-editor.org/info/rfc8126>.


   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

   [RFC8324]  Klensin, J., "DNS Privacy, Authorization, Special Uses,
              Encoding, Characters, Matching, and Root Structure: Time
              for Another Look?", RFC 8324, DOI 10.17487/RFC8324,
              February 2018,
              <https://www.rfc-editor.org/info/rfc8324>.

   [I-D.schanzen-gns]
              Schanzenbach, M., Grothoff, C., and B. Fix, "The GNU Name
              System", Work in Progress, Internet-Draft, draft-schanzen-
              gns-21, 7 August 2022, <https://www.ietf.org/archive/id/
              draft-schanzen-gns-21.txt>.

   [ed25519]  Bernstein, D., Duif, N., Lange, T., Schwabe, P., and B.
              Yang, "High-Speed High-Security Signatures", 2011,
              <http://link.springer.com/
              chapter/10.1007/978-3-642-23951-9_9>.

   [GANA]     GNUnet e.V., "GNUnet Assigned Numbers Authority (GANA)",
              April 2020, <https://gana.gnunet.org/>.

13.  Informative References

   [R5N]      Evans, N. S. and C. Grothoff, "R5N: Randomized recursive
              routing for restricted-route networks", 2011,
              <https://doi.org/10.1109/ICNSS.2011.6060022>.

   [Kademlia] Maymounkov, P. and D. Mazieres, "Kademlia: A peer-to-peer
              information system based on the xor metric.", 2002,
              <http://css.csail.mit.edu/6.824/2014/papers/kademlia.pdf>.

   [cadet]    Polot, B. and C. Grothoff, "CADET: Confidential ad-hoc
              decentralized end-to-end transport", 2014,
              <https://doi.org/10.1109/MedHocNet.2014.6849107>.

Appendix A.  Bloom filters in R^5N

   R^5N uses Bloom filters in several places.  This section gives some
   general background on Bloom filters and defines functions on this
   data structure shared by the various use-cases in R^5N.


   A Bloom filter (BF) is a space-efficient probabilistic data
   structure to test if an element is part of a set of elements.
   Elements are identified by an element ID.  Since a BF is a
   probabilistic data structure, it is possible to have false
   positives: when asked if an element is in the set, the answer from a
   BF is either "no" or "maybe".

   Bloom filters are defined as a string of L bits called "buckets".
   The buckets are initially always empty, meaning that the bits are set
   to zero.  There are two functions which can be invoked on the Bloom
   filter "bf": BF-SET(bf, e) and BF-TEST(bf, e) where "e" is an element
   that is to be added to the Bloom filter or queried against the set.

   A mapping function M is used to map each ID of each element from the
   set to a subset of k buckets.  In the original proposal by Bloom, M
   is non-injective and can thus map the same element multiple times to
   the same bucket.  The type of the mapping function can thus be
   described by the following mathematical notation:

           ------------------------------------
           # M: E->B^k
           ------------------------------------
           # L = Number of buckets
           # B = 0,1,2,3,4,...L-1 (the buckets)
           # k = Number of buckets per element
           # E = Set of elements
           ------------------------------------
           Example: L=256, k=3
           M('element-data') = {4,6,255}


                 Figure 20: Bloom filter mapping function.

   When adding an element to the Bloom filter bf using BF-SET(bf,e),
   each integer n of the mapping M(e) is interpreted as a bit offset n
   mod L within bf, and the bit at that offset is set to 1.

   When testing if an element may be in the Bloom filter bf using BF-
   TEST(bf,e), the bit at offset n mod L within bf MUST be set to 1 for
   every integer n of the mapping M(e).  Otherwise, the element is not
   considered to be in the Bloom filter.
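
   The two functions can be sketched as follows.  This appendix does
   not fix a concrete mapping function, so the M used here (the first k
   big-endian 32-bit integers of the SHA-512 hash of the element), the
   default sizes, the bit order within a byte, and the class and method
   names are assumptions chosen for illustration only.

```python
import hashlib
import struct


class BloomFilter:
    def __init__(self, num_bits=1024, k=16):
        if num_bits % 8 != 0 or not 1 <= k <= 16:
            raise ValueError("unsupported Bloom filter parameters")
        self.L = num_bits
        self.k = k
        self.bits = bytearray(num_bits // 8)  # all buckets start at 0

    def _mapping(self, element: bytes):
        # Hypothetical M(e): first k 32-bit integers of SHA-512(e).
        digest = hashlib.sha512(element).digest()  # 64 bytes
        return struct.unpack(">16I", digest)[: self.k]

    def bf_set(self, element: bytes) -> None:
        """BF-SET: set the bit at offset n mod L for every n in M(e)."""
        for n in self._mapping(element):
            offset = n % self.L
            self.bits[offset // 8] |= 1 << (offset % 8)

    def bf_test(self, element: bytes) -> bool:
        """BF-TEST: False means "no", True means "maybe"."""
        return all(
            self.bits[(n % self.L) // 8] & (1 << ((n % self.L) % 8))
            for n in self._mapping(element)
        )
```

   An element that was added always tests as "maybe"; an element that
   was never added usually tests as "no", but can collide with bits set
   by other elements.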


Appendix B.  Overlay Operations

   An implementation of this specification commonly exposes the two
   overlay operations "GET" and "PUT".  The following are non-normative
   examples of APIs for those operations.  Their behaviour is described
   in prose in order to give implementers a fuller picture of the
   protocol.

B.1.  GET operation

   A basic GET operation interface may be exposed as:

   GET(Query-Key, Block-Type) -> Results as List

   The procedure typically takes at least two arguments to initiate a
   lookup:

   Query-Key:  is the 512-bit key to look for in the DHT.

   Block-Type:  is the type of block to look for, possibly "any".

   The GET procedure may allow a set of optional parameters in order to
   control or modify the query:

   Replication-Level:  is an integer which controls how many nearest
      peers the request should reach.

   Flags:  is a 16-bit vector which indicates certain processing
      requirements for messages.  Any combination of flags as defined in
      Section 6.1.2 may be specified.

   eXtended-Query (XQuery):  is metadata which may be required
      depending on the respective Block-Type.  A Block-Type must define
      if the XQuery can or must be used and what the specific format of
      its contents should be.  Extended queries are in general used to
      implement domain-specific filters.  These might be particularly
      useful in combination with FindApproximate to add a well-defined
      filter by an application-specific distance.  Regardless, the DHT
      does not define any particular semantics for an XQuery.  See also
      Section 7.

   Result-Filter:  is data for a Block-Type-specific filter which allows
      applications to indicate results which are not relevant anymore to
      the caller (see Section 6.4.2).


   The GET procedure should be implemented as an asynchronous operation
   that returns individual results as they are found in the DHT.  It
   should terminate only once the application explicitly cancels the
   operation.  A single result commonly consists of:

   Block-Type:  is the desired type of block in the result.

   Block-Data:  is the application-specific block payload.  Contents are
      specific to the Block-Type.

   Block-Expiration:  is the expiration time of the block.  After this
      time, the result should no longer be used.

   Key:  is the key under which the block was stored.  This may be
      different from the key that was queried if the flag
      FindApproximate was set.

   GET-Path:  is a signed path of the IDs of peers which the query
      traversed through the network.  The DHT will try to make the path
      available if the RecordRoute flag was set for the GET procedure.
      The reported path may have been silently truncated from the
      beginning.

   PUT-Path:  is a signed path of the IDs of peers which the PUT
      message traversed.  The DHT will try to make the path available if
      the RecordRoute flag was set by the application calling the PUT
      procedure.  The reported path may have been silently truncated
      from the beginning.  As the block was cached by the node at the
      end of this path, this path is more likely to be stale compared to
      the GET-Path.
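
   The asynchronous nature of this interface might be sketched as
   follows.  This is a non-normative illustration: GetResult, get, demo
   and the results_queue parameter are hypothetical names, and the
   queue merely stands in for the DHT and is assumed to deliver only
   results for this query.  Block type 0 is treated as "any".

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class GetResult:
    block_type: int
    block_data: bytes
    block_expiration: int  # assumed unit: seconds since the UNIX Epoch
    key: bytes             # key under which the block was stored
    get_path: list = field(default_factory=list)
    put_path: list = field(default_factory=list)


async def get(results_queue: asyncio.Queue, query_key: bytes,
              block_type: int):
    """Yield results as they arrive until the caller stops iterating."""
    if len(query_key) != 64:
        raise ValueError("Query-Key must be 512 bits")
    while True:
        result = await results_queue.get()
        if block_type in (0, result.block_type):  # 0 matches any type
            yield result


async def demo():
    queue = asyncio.Queue()
    await queue.put(GetResult(13, b"hello", 1647134480, bytes(64)))
    async for result in get(queue, bytes(64), 13):
        return result  # the application cancels after one result
```

   Running asyncio.run(demo()) returns the single queued result.  The
   generator itself never terminates; the lookup ends only when the
   caller stops iterating.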

B.2.  PUT operation

   A PUT operation interface may be exposed as:

   PUT(Key, Block-Type, Block-Expiration, Block-Data)

   The procedure typically takes at least four parameters:

   Key:  is the key under which to store the block.

   Block-Type:  is the type of the block to store.

   Block-Expiration:  specifies when the block should expire.

   Block-Data:  is the application-specific payload of the block to
      store.


   The PUT procedure may allow a set of optional parameters in order to
   control or modify the request:

   Replication-Level:  is an integer which controls how many nearest
      peers the request should reach.

   Flags:  is a 16-bit vector which indicates certain processing
      requirements for messages.  Any combination of flags as defined in
      Section 6.1.2 may be specified.

   The PUT procedure does not necessarily yield any information.

Appendix C.  HELLO URLs

   The general format of a HELLO URL uses "gnunet://" as the scheme,
   followed by "hello/" for the name of the GNUnet subsystem, followed
   by "/"-separated values with the GNS Base32 encoding
   ([I-D.schanzen-gns]) of the Peer ID, a Base32-encoded EdDSA
   signature, and an expiration time in seconds since the UNIX Epoch in
   decimal format.  After this a "?" begins a list of key-value pairs
   where the key is the URI scheme of one of the peer's addresses and
   the value is the URL-escaped payload of the address URI without the
   "://".

   For example, consider the following URL:

   gnunet://hello/RH1M20EPK834M6MHZ72\
   G3CMBSF3ECKNY4W0T9VAQP9Z7SZEM6Y3G/\
   NGRTAH6RA04X467CGCH7M7CEXR5F9CV5HT\
   ZFK0G9BWETY3CCE2QWGVT4WA7JN5M9HMWG\
   60A00R71F1PJP8N5628EKGHHBAGA7M8JW3\
   0/1647134480?udp=127.0.0.1%3A2086

   Note: the signature in this example is not valid.


                                 Figure 21

   It specifies that the peer with the ID "RH1M...6Y3G" is reachable via
   "udp" at 127.0.0.1 on port 2086 until 1647134480 seconds after the
   Epoch.  Note that "udp" here is underspecified and just used as a
   simple example.  In practice, the key (addr-name) refers to a scheme
   supported by a DHT Underlay.

   The general syntax of HELLO URLs specified using Augmented Backus-
   Naur Form (ABNF) of [RFC5234] is:


   hello-URL = "gnunet://hello/" meta [ "?" addrs ]
   meta = pid "/" sig "/" exp
   pid = *bchar
   sig = *bchar
   exp = *DIGIT
   addrs = addr *( "&" addr )
   addr = addr-name "=" addr-value
   addr-name = scheme
   addr-value = *pchar
   bchar = ALPHA / DIGIT


                                 Figure 22

   'scheme' is defined in [RFC3986] in Section 3.1.  'pchar' is defined
   in [RFC3986], Appendix A.
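
   A parser for this syntax could be sketched as follows.  The function
   name parse_hello_url and its return shape are assumptions for this
   example.  The sketch is slightly stricter than the grammar in that
   it rejects empty components, and it neither decodes the Base32
   values nor verifies the signature.

```python
from urllib.parse import unquote

PREFIX = "gnunet://hello/"


def _is_bchars(text: str) -> bool:
    # One or more ASCII letters or digits.
    return text.isascii() and text.isalnum()


def parse_hello_url(url: str):
    """Split a HELLO URL into (pid, sig, exp, addrs)."""
    if not url.startswith(PREFIX):
        raise ValueError("not a HELLO URL")
    meta, _, query = url[len(PREFIX):].partition("?")
    parts = meta.split("/")
    if len(parts) != 3:
        raise ValueError("expected pid/sig/exp")
    pid, sig, exp = parts
    if not (_is_bchars(pid) and _is_bchars(sig)):
        raise ValueError("malformed peer ID or signature")
    if not (exp.isascii() and exp.isdigit()):
        raise ValueError("malformed expiration time")
    addrs = []
    for item in filter(None, query.split("&")):
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError("address without '='")
        # The value is URL-escaped; undo the escaping.
        addrs.append((name, unquote(value)))
    return pid, sig, int(exp), addrs
```

   Applied to the example URL with its line continuations removed, the
   expiration time is parsed as 1647134480 and the only address is the
   pair ("udp", "127.0.0.1:2086").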

Authors' Addresses

   Martin Schanzenbach
   Fraunhofer AISEC
   Lichtenbergstrasse 11
   85748 Garching
   Germany
   Email: martin.schanzenbach@aisec.fraunhofer.de


   Christian Grothoff
   Berner Fachhochschule
   Hoeheweg 80
   CH-2501 Biel/Bienne
   Switzerland
   Email: grothoff@gnunet.org


   Bernd Fix
   GNUnet e.V.
   Boltzmannstrasse 3
   85748 Garching
   Germany
   Email: fix@gnunet.org

