TRANS                                                        L. Nordberg
Internet-Draft                                                  NORDUnet
Intended status: Experimental                                 D. Gillmor
Expires: April 22, 2016                                             ACLU
                                                               T. Ritter
                                                        October 20, 2015

                            Gossiping in CT
                       draft-ietf-trans-gossip-01

Abstract

   The logs in Certificate Transparency are untrusted in the sense that the users of the system don't have to trust that they behave correctly since the behaviour of a log can be verified to be correct.

   This document tries to solve the problem with logs presenting a "split view" of their operations. It describes three gossiping mechanisms for Certificate Transparency: SCT Feedback, STH Pollination and Trusted Auditor Relationship.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 22, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.

Nordberg, et al.         Expires April 22, 2016                 [Page 1]
Internet-Draft               Gossiping in CT                October 2015

   Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Defining the problem
   3.  Overview
   4.  Terminology and data flow
   5.  Who gossips with whom
   6.  What to gossip about and how
   7.  Gossip Mechanisms
     7.1.  SCT Feedback
       7.1.1.  HTTPS client to server
       7.1.2.  HTTPS server to auditors
       7.1.3.  SCT Feedback data format
     7.2.  STH pollination
       7.2.1.  HTTPS Clients and Proof Fetching
       7.2.2.  STH Pollination without Proof Fetching
       7.2.3.  Auditor and Monitor Action
       7.2.4.  STH Pollination data format
     7.3.  Trusted Auditor Stream
       7.3.1.  Trusted Auditor data format
   8.  3-Method Ecosystem
     8.1.  SCT Feedback
     8.2.  STH Pollination
     8.3.  Trusted Auditor Relationship
     8.4.  Interaction
   9.  Security considerations
     9.1.  Censorship/Blocking considerations
     9.2.  Privacy considerations
       9.2.1.  Privacy and SCTs
       9.2.2.  Privacy in SCT Feedback
       9.2.3.  Privacy for HTTPS clients performing STH Proof Fetching
       9.2.4.  Privacy in STH Pollination
       9.2.5.  Privacy in STH Interaction
       9.2.6.  Trusted Auditors for HTTPS Clients
       9.2.7.  HTTPS Clients as Auditors
   10. Policy Recommendations
     10.1.  Mixing Recommendations
     10.2.  Blocking Recommendations
       10.2.1.  Frustrating blocking
       10.2.2.  Responding to possible blocking
   11. IANA considerations
   12. Contributors
   13. ChangeLog
     13.1.  Changes between ietf-00 and ietf-01
     13.2.  Changes between -01 and -02
     13.3.  Changes between -00 and -01
   14. Normative References
   Authors' Addresses

1.  Introduction

   The purpose of the protocols in this document, collectively referred to as CT Gossip, is to detect certain misbehavior by CT logs. In particular, CT Gossip aims to detect logs that are providing inconsistent views to different log clients and logs failing to include submitted certificates within the time period stipulated by MMD.

   [TODO: enumerate the interfaces used for detecting misbehaviour?]

   One of the major challenges of any gossip protocol is limiting damage to user privacy.
   The goal of CT gossip is to publish and distribute information about the logs and their operations, but not to leak any additional information about the operation of any of the other participants. Privacy of consumers of log information (in particular, of web browsers and other TLS clients) should not be undermined by gossip.

   This document presents three different, complementary mechanisms for non-log elements of the CT ecosystem to exchange information about logs in a manner that preserves the privacy of HTTPS clients. They should provide protective benefits for the system as a whole even if their adoption is not universal.

2.  Defining the problem

   When a log provides different views of the log to different clients this is described as a partitioning attack. Each client would be able to verify the append-only nature of the log but, in the extreme case, each client might see a unique view of the log.

   The CT logs are public, append-only and untrusted and thus have to be monitored for consistency, i.e., they should never rewrite history. Additionally, monitors and other log clients need to exchange information about monitored logs in order to be able to detect a partitioning attack (as described above).

   Gossiping about log responses to queries helps address the problem of detecting malicious or compromised logs with respect to a partitioning attack. We want some side of the partitioned tree, and ideally both sides, to see the other side.

   Disseminating information about a log poses a potential threat to the privacy of end users. Some data of interest (e.g. SCTs) are linkable to specific log entries and thereby to specific sites, which makes sharing them with others privacy-sensitive. Gossiping about this data has to take privacy considerations into account in order not to leak associations between users of the log (e.g., web browsers) and certificate holders (e.g., web sites).
   Even sharing STHs (which do not link to specific log entries) can be problematic - user tracking by fingerprinting through rare STHs is one potential attack (see Section 7.2).

3.  Overview

   SCT Feedback enables HTTPS clients to share Signed Certificate Timestamps (SCTs) (Section 3.3 of [RFC-6962-BIS]) with CT auditors in a privacy-preserving manner by sending SCTs to originating HTTPS servers which in turn share them with CT auditors.

   In STH Pollination, HTTPS clients use HTTPS servers as pools sharing Signed Tree Heads (STHs) (Section 3.6 of [RFC-6962-BIS]) with other connecting clients in the hope that STHs will find their way to auditors and monitors.

   HTTPS clients in a Trusted Auditor Relationship share SCTs and STHs with trusted auditors or monitors directly, with expectations of privacy sensitive data being handled according to whatever privacy policy is agreed on between client and trusted party.

   Despite the privacy risks with sharing SCTs there is no loss in privacy if a client sends SCTs for a given site to the site corresponding to the SCT. This is because the site's logs would already indicate that the client is accessing that site. In this way a site can accumulate records of SCTs that have been issued by various logs for that site, providing a consolidated repository of SCTs that could be shared with auditors. Auditors can use this information to detect logs that misbehave by not including certificates within the time period stipulated by the MMD metadata.

   Sharing an STH is considered reasonably safe from a privacy perspective as long as the same STH is shared by a large number of other log clients. This "safety in numbers" can be achieved by requiring gossiping of STHs only of a certain "freshness" while also refusing to gossip about STHs from logs with too high an STH issuance frequency (see Section 7.2).

4.  Terminology and data flow

   This document relies on terminology and data structures defined in [RFC-6962-BIS], including STH, SCT, Version, LogID, SCT timestamp, CtExtensions, SCT signature, Merkle Tree Hash.

   The following picture shows how certificates, SCTs and STHs flow through a CT system with SCT Feedback and STH Pollination. It does not show what goes in the Trusted Auditor Relationship stream.

      +- Cert ---- +----------+
      |            |    CA    | ----------+
      |   + SCT -> +----------+           |
      v   |                          Cert [& SCT]
   +----------+                           |
   |   Log    | ---------- SCT -----------+
   +----------+                           v
      |  ^                          +----------+
      |  |          SCT & Certs --- | Website  |
      |  |[1]           |           +----------+
      |  |[2]          STH            ^     |
      |  |[3]           v             |     |
      |  |          +----------+      |     |
      |  +--------> | Auditor  |      |  HTTPS traffic
      |             +----------+      |     |
      |               /               |    SCT
      |              /    SCT & Certs |     |
   Log entries      /                 |     |
      |            /                 STH   STH
      v           /[4]                |     |
   +----------+                       |     v
   | Monitor  |                    +----------+
   +----------+                    | Browser  |
                                   +----------+

   #   Auditor                        Log
   [1] |--- get-sth ------------------->|
       |<-- STH ------------------------|
   [2] |--- leaf hash + tree size ----->|
       |<-- index + inclusion proof --->|
   [3] |--- tree size 1 + tree size 2 ->|
       |<-- consistency proof ----------|
   [4] SCT, cert and STH among multiple Auditors and Monitors

5.  Who gossips with whom

   o  HTTPS clients and servers (SCT Feedback and STH Pollination)

   o  HTTPS servers and CT auditors (SCT Feedback)

   o  CT auditors and monitors (Trusted Auditor Relationship)

   Additionally, some HTTPS clients may engage with an auditor who they trust with their privacy:

   o  HTTPS clients and CT auditors (Trusted Auditor Relationship)

6.  What to gossip about and how

   There are three separate gossip streams:

   o  SCT Feedback - transporting SCTs and certificate chains from HTTPS clients to CT auditors/monitors via HTTPS servers.

   o  STH Pollination - HTTPS clients and CT auditors/monitors using HTTPS servers as STH pools for exchanging STHs.

   o  Trusted Auditor Stream - HTTPS clients communicating directly with trusted CT auditors/monitors sharing SCTs, certificate chains and STHs.

7.  Gossip Mechanisms

7.1.  SCT Feedback

   The goal of SCT Feedback is for clients to share SCTs and certificate chains with CT auditors and monitors while still preserving the privacy of the end user. The sharing of SCTs contributes to the overall goal of detecting misbehaving logs by providing auditors and monitors with SCTs from many vantage points, making it possible to catch a higher number of violations of MMD and also catch logs presenting inconsistent views.

   SCT Feedback is the most privacy-preserving gossip mechanism, as it does not directly expose any links between an end user and the sites they've visited to any third party.

   [Here's an alternative to that paragraph: SCT Feedback is the most privacy-preserving gossip mechanism, as it does not create any potential cross-origin tracking mechanisms. ]

   HTTPS clients store SCTs and certificate chains they see, and later send them to the originating HTTPS server by posting them to a well-known URL (associated with that server), as described in Section 7.1.1. Note that clients will send the same SCTs and chains to servers multiple times with the assumption that a potential man-in-the-middle attack eventually will cease, and an honest server will receive collected malicious SCTs and certificate chains.

   HTTPS servers store SCTs and certificate chains received from clients and later share them with CT auditors by either posting them to auditors or making them available via a well-known URL. This is described in Section 7.1.2.

7.1.1.  HTTPS client to server

   When an HTTPS client connects to an HTTPS server, the client receives a set of SCTs as part of the TLS handshake.
   The client MUST discard SCTs that are not signed by a log known to the client and SHOULD store the remaining SCTs together with the corresponding certificate chain for later use in SCT Feedback.

   When the client later reconnects to any HTTPS server for the same domain, it again receives a set of SCTs. The client MUST add new SCTs from known logs to its store of SCTs for the server. The client MUST send to the server any SCTs in the store that are associated with that server but which were not received from that server.

   [TODO: fix the above paragraph - it is vague and confusing. maybe an example including a client caching at most one SCT per host+log would clarify]

   [TODO: define "same domain"]

   Note that the SCT store also contains SCTs received in certificates.

   The client MUST NOT send the same set of SCTs to the same server more often than TBD.

   [benl says: "sent to the server" only really counts if the server presented a valid SCT in the handshake and the certificate is known to be unrevoked (which will be hard for a MitM to sustain)]

   [TODO: expand on rate/resource limiting motivation]

   Refer to Section 10.1 for recommendations about strategies.

   An SCT MUST NOT be sent to any HTTPS server other than one serving the domain to which the certificate that the SCT covers refers. Not following this constraint would lead to two types of privacy leaks. First, the server receiving the SCT would learn about other sites visited by the HTTPS client. Secondly, auditors or monitors receiving SCTs from the HTTPS server would learn information about the other HTTPS servers visited by its clients.

   If the HTTPS client has configuration options for not sending cookies to third parties, SCTs of third parties MUST be treated as cookies with respect to this setting. This prevents third party tracking through the use of SCTs/certificates, which would bypass the cookie policy.

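   The client-side rules above can be illustrated with a minimal Python sketch. It is not part of the protocol: the class name SctStore, its method names and the (log_id, sct_bytes) tuples are hypothetical, and it assumes one possible reading of "not received from that server", namely SCTs that the server did not present in the current handshake. Rate limiting (the "TBD" above) is left out. The JSON layout follows Section 7.1.3.

```python
import base64
import json
from collections import defaultdict


def b64(data):
    """Base64-encode bytes into an ASCII string for JSON."""
    return base64.b64encode(data).decode("ascii")


class SctStore:
    """Hypothetical per-domain store of certificate chains and their SCTs."""

    def __init__(self, known_log_ids):
        self.known_log_ids = set(known_log_ids)
        # domain -> {certificate chain (tuple of DER bytes) -> set of SCTs}
        self._by_domain = defaultdict(lambda: defaultdict(set))

    def observe(self, domain, chain, scts):
        """Record SCTs from a handshake; scts is a list of (log_id, sct_bytes)."""
        for log_id, sct in scts:
            if log_id not in self.known_log_ids:
                continue  # SCTs from unknown logs are discarded
            self._by_domain[domain][tuple(chain)].add(sct)

    def feedback_for(self, domain, just_received):
        """Build the POST body for `domain` only, omitting SCTs the server
        itself presented in the current handshake (an assumed reading)."""
        presented = {sct for _, sct in just_received}
        entries = []
        # Only SCTs stored under this domain are ever considered, so
        # nothing is sent to a server for a different domain.
        for chain, scts in self._by_domain.get(domain, {}).items():
            pending = sorted(scts - presented)
            if pending:
                entries.append({
                    "x509_chain": [b64(cert) for cert in chain],
                    "sct_data": [b64(sct) for sct in pending],
                })
        return json.dumps({"sct_feedback": entries})
```

   Keying the store by domain makes the "MUST NOT be sent to any other HTTPS server" rule a property of the data structure rather than a check that could be forgotten.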
   SCTs and corresponding certificates are POSTed to the originating HTTPS server at the well-known URL:

      https:///.well-known/ct/v1/sct-feedback

   The data sent in the POST is defined in Section 7.1.3.

   HTTPS servers perform a number of sanity checks on SCTs from clients before storing them:

   1.  if a bit-wise compare of an SCT plus chain matches a pair already in the store, this SCT and chain pair MAY be discarded

   2.  if the SCT can't be verified to be a valid SCT for the accompanying leaf cert, issued by a known log, the SCT SHOULD be discarded

   3.  if the leaf cert is not for a domain for which the server is authoritative, the SCT MUST be discarded

   Check number 1 is for detecting duplicates and minimizing processing and storage by the server. It's important to note that the check should be on pairs of SCT and chain in order to catch different chains accompanied by the same SCT. This mis-matched chain information may be useful as a diagnostic tool for HTTPS server operators.

   Check number 2 is to prevent denial of service (DoS) attacks, both those where an adversary fills up the store prior to attacking a client and those aimed at exhausting the server's storage space.

   Check number 3 is to help prevent malfunctioning clients from leaking which sites they visit and additionally to prevent DoS attacks.

   Note that an HTTPS server MAY choose to store a submitted SCT and the accompanying certificate chain even when the SCT can't be verified according to check number 2. One such case would be when a certificate chain validation is performed and the chain ends in a trust anchor configured on the server. In this instance, the server could also be configured to not bother with known-to-be-good (i.e. administratively-vetted) leaf certificates, and only store unknown leaf certificates that chain to a known trust anchor.

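   The three server-side checks can be sketched as follows. This is an illustration only: FeedbackStore and its two callbacks are hypothetical names, and real SCT verification and domain matching would need a CT and X.509 library that is deliberately left out here. The sketch runs check 3 before check 2 because check 3 is the MUST-level rule, and it takes the simple option of discarding unverifiable SCTs rather than the MAY-store exception described above.

```python
from dataclasses import dataclass, field
from typing import Callable, Set, Tuple


@dataclass
class FeedbackStore:
    """Hypothetical in-memory store applying the three sanity checks."""

    # Assumed callbacks supplied by the deployment:
    #   sct_is_valid(sct, leaf_cert): SCT verifies for leaf, from a known log
    #   is_authoritative(leaf_cert): leaf names a domain this server serves
    sct_is_valid: Callable[[bytes, bytes], bool]
    is_authoritative: Callable[[bytes], bool]
    seen: Set[Tuple[bytes, Tuple[bytes, ...]]] = field(default_factory=set)

    def submit(self, sct: bytes, chain: Tuple[bytes, ...]) -> str:
        """Return what happened to one submitted (SCT, chain) pair."""
        if not chain:
            return "malformed"      # x509_chain must hold at least the leaf
        pair = (sct, tuple(chain))
        if pair in self.seen:
            return "duplicate"      # check 1: exact pair already stored
        leaf = chain[0]
        if not self.is_authoritative(leaf):
            return "wrong-domain"   # check 3: MUST be discarded
        if not self.sct_is_valid(sct, leaf):
            return "unverified"     # check 2: SHOULD be discarded
        self.seen.add(pair)
        return "stored"
```

   Because the duplicate check compares the SCT together with its chain, the same SCT arriving with a different chain is stored as a separate entry, which preserves the mis-matched chain information mentioned under check number 1.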
   The risk of spamming and denial of service can be mitigated by configuring the server with all known acceptable certificates (or certificate hashes) applicable to this server. This information may enable an HTTPS server operator to detect attacks or unusual behavior of Certificate Authorities even outside the Certificate Transparency ecosystem.

7.1.2.  HTTPS server to auditors

   HTTPS servers receiving SCTs from clients SHOULD share SCTs and certificate chains with CT auditors by either serving them on the well-known URL:

      https:///.well-known/ct/v1/collected-sct-feedback

   or by HTTPS POSTing them to a set of preconfigured auditors. This allows an HTTPS server to choose between an active push model or a passive pull model.

   The data received in a GET of the well-known URL or sent in the POST is defined in Section 7.1.3.

   HTTPS servers SHOULD share all SCTs and accompanying certificate chains they see that pass the checks in Section 7.1.1. If this is an infeasible amount of data, the server may choose to expire submissions according to an undefined policy. Suggestions for such a policy can be found in Section 10.1.

   HTTPS servers MUST NOT share any other data that they may learn from the submission of SCT Feedback by HTTPS clients, like the HTTPS client IP address or the time of submission.

   Auditors SHOULD provide the following URL accepting HTTPS POSTing of SCT feedback data:

      https:///ct/v1/sct-feedback

   Auditors SHOULD regularly poll HTTPS servers at the well-known collected-sct-feedback URL. The frequency of the polling and how to determine which domains to poll is outside the scope of this document. However, the selection MUST NOT be influenced by potential HTTPS clients connecting directly to the auditor. For example, if a poll to example.com occurs directly after a client submits an SCT for example.com, an adversary observing the auditor can trivially infer the activity of the client.

7.1.3.  SCT Feedback data format

   The data shared between HTTPS clients and servers, as well as between HTTPS servers and CT auditors/monitors, is a JSON object [RFC7159] with the following content:

   o  sct_feedback: An array of objects consisting of

      *  x509_chain: An array of base64-encoded X.509 certificates. The first element is the end-entity certificate, the second chains to the first and so on.

      *  sct_data: An array of objects consisting of the base64 representation of the binary SCT data as defined in [RFC-6962-BIS] Section 3.3.

   The 'x509_chain' element MUST contain at least the leaf certificate and SHOULD contain the full chain to a root accepted by all of the logs in the set of logs issuing all the SCTs in the 'sct_data' element.

   Some clients have trust anchors that are locally added (e.g. by an administrator or by the user themselves). A local trust anchor is potentially privacy-sensitive since it may carry information about the specific computer or user. If a certificate is covered by SCTs issued by publicly trusted logs, but it chains to a privacy-sensitive local trust anchor, the client SHOULD submit it as an 'x509_chain' consisting only of the leaf certificate.

   [TBD: Be strict about what sct_data may contain or is this sufficiently implied by previous sections?]

   [TBD: There was discussion about including a new field for client->server reporting, which is the exact set and order of certificates sent by the HTTPS server to the client. This is additional diagnostic information that an HTTPS server could use to check its deployment... but is pretty much useless to CT or gossip. Right now we're not including this, but we're polling server operators to see if they would welcome this data.]

7.2.  STH pollination

   The goal of sharing Signed Tree Heads (STHs) through pollination is to share STHs between HTTPS clients, CT auditors, and monitors while still preserving the privacy of the end user. The sharing of STHs contributes to the overall goal of detecting misbehaving logs by providing CT auditors and monitors with STHs from many vantage points, making it possible to detect logs that are presenting inconsistent views.

   HTTPS servers supporting the protocol act as STH pools. HTTPS clients and CT auditors and monitors in the possession of STHs should pollinate STH pools by sending STHs to them, and retrieving new STHs to send to other STH pools. CT auditors and monitors should perform their auditing and monitoring duties by retrieving STHs from pools.

   STH Pollination is carried out by sending STHs to HTTPS servers supporting the protocol, and retrieving new STHs. In the case of HTTPS clients, STHs SHOULD be sent in an already established TLS session. This makes it hard for an attacker to disrupt STH gossiping without also disturbing ordinary secure browsing (https://). This is discussed more in Section 10.2.1.

   HTTPS clients send STHs to HTTPS servers by POSTing them to the well-known URL:

      https:///.well-known/ct/v1/sth-pollination

   The data sent in the POST is defined in Section 7.2.4.

   The response contains zero or more STHs in the same format, described in Section 7.2.4.

   An HTTPS client may acquire STHs by several methods:

   o  in replies to pollination POSTs;

   o  asking logs that it recognises for the current STH, either directly (v2/get-sth) or indirectly (for example over DNS)

   o  resolving an SCT and certificate to an STH via an inclusion proof

   o  resolving one STH to another via a consistency proof

   HTTPS clients (who have STHs), CT auditors, and monitors SHOULD pollinate STH pools with STHs.
   Which STHs to send and how often pollination should happen is regarded as undefined policy with the exception of privacy concerns explained in the next section. Suggestions for the policy may be found in Section 10.1.

   An HTTPS client could be tracked by giving it a unique or rare STH. To address this concern, we place restrictions on different components of the system to ensure an STH will not be rare.

   o  HTTPS clients silently ignore STHs from logs with an STH issuance frequency of more than one STH per hour. Logs use the STH Frequency Count metadata to express this ([RFC-6962-BIS] sections 3.6 and 5.1).

   o  HTTPS clients silently ignore STHs which are not fresh.

   An STH is considered fresh iff its timestamp is less than 14 days in the past. Given a maximum STH issuance rate of one per hour, an attacker has 14 * 24 = 336 unique STHs per log for tracking. Clients MUST ignore STHs older than 14 days. We consider STHs within this validity window to be personally identifiable data, and STHs outside this window not personally identifiable.

   A log may cease operation, in which case there will soon be no STH within the validity window. Clients SHOULD perform all three methods of gossip about a log that has ceased operation - it is possible the log was still compromised and gossip can detect that. STH Pollination is the one mechanism where a client must know about a log shutdown. A client who does not know about a log shutdown MUST NOT attempt any heuristic to detect a shutdown. Instead the client MUST be informed about the shutdown from a verifiable source (e.g. a software update). The client SHOULD be provided the final STH issued by the log and SHOULD resolve SCTs and STHs to this final STH. If an SCT or STH cannot be resolved to the final STH... XXX?

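   The freshness and issuance-frequency rules for STHs, and the figure of 336 STHs per log, can be checked with a short Python sketch. The function names and the log_sths_per_hour parameter are hypothetical; how a client reads the STH Frequency Count metadata is not shown, and the sketch assumes the STH timestamp is expressed in milliseconds since the epoch.

```python
from datetime import datetime, timedelta, timezone

FRESHNESS_WINDOW = timedelta(days=14)  # STHs this old or older are ignored
MAX_STHS_PER_HOUR = 1                  # logs issuing faster are ignored


def trackable_sths(num_logs=1):
    """Upper bound on distinct fresh STHs an attacker could hand out."""
    hours_in_window = FRESHNESS_WINDOW // timedelta(hours=1)
    return hours_in_window * MAX_STHS_PER_HOUR * num_logs


def is_fresh(sth_timestamp_ms, now):
    """True iff the STH timestamp is less than 14 days in the past."""
    issued = datetime.fromtimestamp(sth_timestamp_ms / 1000, tz=timezone.utc)
    return now - issued < FRESHNESS_WINDOW


def accept_for_pollination(sth_timestamp_ms, log_sths_per_hour, now):
    """Apply both client-side restrictions; rejected STHs are dropped silently."""
    if log_sths_per_hour > MAX_STHS_PER_HOUR:
        return False
    return is_fresh(sth_timestamp_ms, now)
```

   For example, trackable_sths() gives 336 for a single log and trackable_sths(num_logs=5) gives 1680, which shows how the tracking surface scales with the number of logs a client accepts.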
   When multiplied by the number of logs from which a client accepts STHs, the number of unique STHs available for tracking (336 per log) grows, and the negative privacy implications grow with it. It's important that this is taken into account when logs are chosen for default settings in HTTPS clients. This concern is discussed further in Section 9.2.5.

7.2.1.  HTTPS Clients and Proof Fetching

   There are two types of proofs a client may retrieve.

   An HTTPS client will retrieve SCTs from an HTTPS server, and must obtain an inclusion proof to an STH in order to verify the promise made by the SCT.

   An HTTPS client may receive an SCT bundled with an inclusion proof to a historical STH via an unspecified future mechanism. Because this historical STH is considered personally identifiable information per above, the client must obtain a consistency proof to a more recent STH.

   If a client requested either proof directly from a log or auditor, it would reveal the client's browsing habits to a third party. To mitigate this risk, an HTTPS client MUST retrieve the proof in a manner that disguises the client.

   Depending on the client's DNS provider, DNS may provide an appropriate intermediate layer that obfuscates the linkability between the user of the client and the request for inclusion (while at the same time providing a caching layer for oft-requested inclusion proofs).

   [TODO: Add a reference to Google's DNS mechanism more proper than http://www.certificate-transparency.org/august-2015-newsletter]

   Anonymity networks such as Tor also present a mechanism for a client to anonymously retrieve a proof from an auditor or log.

7.2.2.  STH Pollination without Proof Fetching

   An HTTPS client MAY participate in STH Pollination without fetching proofs. In this situation, the client receives STHs from a server, applies the same validation logic to them (signed by a known log, within a validity window) and will later pass them to an HTTPS server.

   When operating in this fashion, the HTTPS client is promoting gossip for Certificate Transparency, but derives no direct benefit itself. In comparison, a client who resolves SCTs or historical STHs to recent STHs and pollinates them is assured that if it was attacked, there is a probability that the ecosystem will detect and respond to the attack (by distrusting the log).

7.2.3.  Auditor and Monitor Action

   Auditors and Monitors participate in STH pollination by retrieving STHs from HTTPS servers. They verify that the STH is valid by checking the signature, and requesting a consistency proof from the STH to the most recent STH.

   After retrieving the consistency proof to the most recent STH, they SHOULD pollinate this new STH among participating HTTPS Servers. In this way, as STHs "age out" and are no longer fresh, their "lineage" continues to be tracked in the system.

7.2.4.  STH Pollination data format

   The data sent from HTTPS clients and CT monitors and auditors to HTTPS servers is a JSON object [RFC7159] with the following content:

   o  sths - an array of 0 or more fresh SignedTreeHead structures as defined in [RFC-6962-BIS] Section 3.6.1.

   [XXX An STH is considered fresh iff TBD.]

7.3.  Trusted Auditor Stream

   HTTPS clients MAY send SCTs and cert chains, as well as STHs, directly to auditors. Note that there are privacy implications in doing so; these are outlined in Section 9.2.1 and Section 9.2.6.

   The most natural trusted auditor arrangement arguably is a web browser that is "logged in to" a provider of various internet services. Another equivalent arrangement is a trusted party like a corporation to which an employee is connected through a VPN or by other similar means. A third might be individuals or smaller groups of people running their own services. In such a setting, retrieving proofs from that third party could be considered reasonable from a privacy perspective.
   The HTTPS client does its own auditing and might additionally share SCTs and STHs with the trusted party to contribute to herd immunity. Here, the ordinary [RFC-6962-BIS] protocol is sufficient for the client to do the auditing while SCT Feedback and STH Pollination can be used in whole or in parts for the gossip part.

   Another well established trusted party arrangement on the internet today is the relation between internet users and their providers of DNS resolver services. DNS resolvers are typically provided by the internet service provider (ISP) used, which by the nature of name resolving already know a great deal about which sites their users visit. As mentioned in Section XXX, in order for HTTPS clients to be able to retrieve proofs in a privacy preserving manner, logs could expose a DNS interface in addition to the ordinary HTTPS interface. An informal writeup of such a protocol can be found at XXX.

7.3.1.  Trusted Auditor data format

   [TBD specify something here or leave this for others?]

8.  3-Method Ecosystem

   The use of three distinct methods for monitoring logs may seem excessive, but each represents a needed component in the CT ecosystem. To understand why, the drawbacks of each component must be outlined. In this discussion we assume that an attacker knows which mechanisms an HTTPS client and HTTPS server implement.

8.1.  SCT Feedback

   SCT Feedback requires the cooperation of HTTPS clients and more importantly HTTPS servers. Although SCT Feedback does require a significant amount of server-side logic to respond to the corresponding APIs, this functionality does not require customization, so it may be pre-provided and work out of the box.
   However, to take full advantage of the system, an HTTPS server would wish to perform some configuration to optimize its operation:

   o  Minimize its disk commitment by whitelisting known SCTs and certificate chains

   o  Maximize its chance of detecting a misissued certificate by configuring a trust store of CAs

   o  Establish a "push" mechanism for POSTing SCTs to Auditors and Monitors

   These configuration needs, and the simple fact that it would require some deployment of software, mean that some percentage of HTTPS servers will not deploy SCT Feedback.

   If SCT Feedback were the only mechanism in the ecosystem, any server that did not implement the feature would open itself and its users to attack without any possibility of detection.

   If SCT Feedback were not deployed, users who wished to have the strongest measure of privacy protection (by disabling STH Pollination Proof Fetching and forgoing a Trusted Auditor) could be attacked without risk of detection.

8.2.  STH Pollination

   STH Pollination requires the cooperation of HTTPS clients, HTTPS servers, and logs.

   For a client to fully participate in STH Pollination, and have this mechanism detect attacks against it, the client must have a way to safely perform Proof Fetching in a privacy preserving manner. The client may pollinate STHs it receives without performing Proof Fetching, but we do not consider this option in this section.

   HTTPS Servers must deploy software (although, as in the case with SCT Feedback, this logic can be pre-provided) and commit some configurable amount of disk space to the endeavor.

   Logs must provide access to clients to query proofs in a privacy preserving manner, most likely through DNS.

   Unlike SCT Feedback, the STH Pollination mechanism is not hampered if only a minority of HTTPS servers deploy it.
However, it assumes that an HTTPS client performs anonymized Proof Fetching (such as the DNS mechanism discussed). Any manner that is anonymous for some (such as clients who use shared DNS services such as a large ISP) may not be anonymous for others. For instance, DNS leaks a considerable amount of information (including what data is already present in the cache) in plaintext over the network.

For this reason, some percentage of HTTPS clients may choose not to enable the Proof Fetching component of STH Pollination. (Although they can still request and send STHs among participating HTTPS servers, as mentioned earlier this affords them no direct benefit.)

If STH Pollination were the only mechanism deployed, users who disable it could be attacked without risk of detection.

If STH Pollination were not deployed, HTTPS Clients visiting HTTPS Servers that did not deploy SCT Feedback could be attacked without risk of detection.

8.3. Trusted Auditor Relationship

The Trusted Auditor Relationship is expected to be the rarest gossip mechanism, as an HTTPS Client is providing an unadulterated report of its browsing history to a third party. While there are valid and common reasons for doing so, there is no appropriate way to enter into this relationship without obtaining informed consent from the user.

However, the Trusted Auditor Relationship mechanism still provides value to a class of HTTPS Clients. For example, web crawlers have no concept of a "user" and no expectation of privacy. Organizations already performing network monitoring for anomalies or attacks can run their own Trusted Auditor for the same purpose with only a marginal increase in privacy concerns.

The ability to change one's Trusted Auditor is a form of Trust Agility that allows a user to choose whom to trust, and to revise that decision later without consequence.
A Trusted Auditor connection can be made more confidential than DNS (through the use of TLS), and can even be made (somewhat) anonymous through the use of anonymity services such as Tor. (Note that this does ignore the de-anonymization possibilities available from viewing a user's browsing history.)

If the Trusted Auditor relationship were the only mechanism deployed, users who do not enable it (the majority) could be attacked without risk of detection.

If the Trusted Auditor relationship were not deployed, crawlers and organizations would build it themselves for their own needs. By standardizing it, users who wish to opt in (for instance those unwilling to participate fully in STH Pollination) can have an interoperable standard they can use to choose and change their trusted auditor.

8.4. Interaction

The interactions of the mechanisms are outlined as follows:

HTTPS Clients can be attacked without risk of detection if they do not participate in any of the three mechanisms.

HTTPS Clients are afforded the greatest chance of detecting an attack when they either participate in STH Pollination with Proof Fetching or have a Trusted Auditor relationship. Participating in SCT Feedback enables an HTTPS Client to assist in detecting the exact target of an attack, although the client does not gain any direct benefit from it.

HTTPS Servers that omit SCT Feedback may never learn about targeted attacks against them, even if the attack occurred and the log was distrusted. They do gain some herd immunity, enabling them to detect attacks, through their clients participating in STH Pollination or a Trusted Auditor Relationship.

When HTTPS Servers omit SCT Feedback, this allows a portion of their users to be attacked without detection; the vulnerable users are those who do not participate in STH Pollination with Proof Fetching and who do not have a Trusted Auditor relationship.

9.
Security considerations

9.1. Censorship/Blocking considerations

We assume a network attacker who is able to fully control the client's internet connection for some period of time - including selectively blocking requests to certain hosts and truncating TLS connections based on information observed or guessed about client behavior. In order to successfully detect log misbehavior, the gossip mechanisms must still work even in these conditions.

There are several gossip connections that can be blocked:

1.  Clients sending SCTs to servers in SCT Feedback
2.  Servers sending SCTs to auditors in SCT Feedback (server push mechanism)
3.  Servers making SCTs available to auditors (auditor pull mechanism)
4.  Clients fetching proofs in STH Pollination
5.  Clients sending STHs to servers in STH Pollination
6.  Servers sending STHs to clients in STH Pollination
7.  Clients sending SCTs to Trusted Auditors

If a party cannot connect to another party, it can be assured that the connection did not succeed. While it may not have been maliciously blocked, it knows the transaction did not succeed. Mechanisms which result in a positive affirmation from the recipient that the transaction succeeded allow confirmation that a connection was not blocked. In this situation, the party can factor this into the strategies suggested in Section 10.1 and in Section 10.2.2.

The connections that allow positive affirmation are 1, 2, 4, 5, and 7.

More insidious is blocking the connections that do not allow positive confirmation: 3 and 6. An attacker may truncate or drop a response from a server to a client, such that the server believes it has shared data with the recipient when it has not. However, in both scenarios (3 and 6), the server cannot tell whether the client is a cooperating member of the CT ecosystem or an attacker performing a Sybil attack, aiming to flush the server's data store.
Therefore, the fact that these connections can be undetectably blocked does not actually alter the threat model of servers responding to these requests. The choice of algorithm to release data is crucial to protect against these attacks; strategies are suggested in Section 10.1.

Handling censorship and network blocking (which is indistinguishable from network error) is relegated to the implementation policy chosen by clients. Suggestions for client behavior are specified in Section 10.2.

9.2. Privacy considerations

CT Gossip deals with HTTPS Clients which are trying to share indicators that correspond to their browsing history. The most sensitive relationships in the CT ecosystem are the relationships between HTTPS clients and HTTPS servers. Client-server relationships can be aggregated into a network graph with potentially serious implications for correlative de-anonymisation of clients and relationship-mapping or clustering of servers or of clients.

There are, however, certain clients that do not require privacy protection. Examples of these clients are web crawlers or robots, but even in this case, the method by which these clients crawl the web may in fact be considered sensitive information. In general, it is better to err on the side of safety, and not assume a client is okay with giving up its privacy.

9.2.1. Privacy and SCTs

An SCT contains information that links it to a particular web site. Because the client-server relationship is sensitive, gossip between clients and servers about unrelated SCTs is risky. Therefore, a client with an SCT for a given server should transmit that information in only two channels: to a server associated with the SCT itself; and to a trusted CT auditor, if one exists.

9.2.2. Privacy in SCT Feedback

SCTs introduce yet another mechanism for HTTPS servers to store state on an HTTPS client, and potentially track users.
HTTPS clients which allow users to clear history or cookies associated with an origin MUST clear stored SCTs associated with the origin as well.

Auditors should treat all SCTs as sensitive data. SCTs received directly from an HTTPS client are especially sensitive, because the auditor is trusted by the client not to reveal the client's associations with servers. Auditors MUST NOT share such SCTs in any way, including sending them to an external log, without first mixing them with multiple other SCTs learned through submissions from multiple other clients. Suggestions for mixing SCTs are presented in Section 10.1.

There is a possible fingerprinting attack where a log issues a unique SCT for targeted log client(s). A colluding log and HTTPS server operator could therefore be a threat to the privacy of an HTTPS client. Given all the other opportunities for HTTPS servers to fingerprint clients - TLS session tickets, HPKP and HSTS headers, HTTP Cookies, etc. - this is acceptable.

The fingerprinting attack described above would be mitigated by a requirement that logs MUST use a deterministic signature scheme when signing SCTs ([RFC-6962-BIS] Section 2.1.4). A log signing using RSA is not required to use a deterministic signature scheme.

Since logs are allowed to issue a new SCT for a certificate already present in the log, mandating deterministic signatures does not stop this fingerprinting attack altogether. It does, though, make the attack harder to pull off without being detected.

There is another similar fingerprinting attack where an HTTPS server tracks a client by using a variation of cert chains. The risk of this attack is accepted on the same grounds as the unique SCT attack described above. [XXX any mitigations possible here?]

9.2.3.
Privacy for HTTPS clients performing STH Proof Fetching

An HTTPS client performing Proof Fetching should only request proofs from a CT log that it accepts SCTs from. An HTTPS client should regularly [TBD how regularly? This has operational implications for log operators] request an STH from all logs it is willing to accept, even if it has seen no SCTs from that log.

The actual mechanism by which Proof Fetching is done carries considerable privacy concerns. Although out of scope for this document, DNS is a mechanism currently discussed. DNS leaks data in plaintext over the network (including what sites the user is visiting and what sites they have previously visited) - thus it may not be suitable for some.

9.2.4. Privacy in STH Pollination

An STH linked to an HTTPS client may indicate the following about that client:

o  that the client gossips;
o  that the client has been using CT at least until the time that the timestamp and the tree size indicate;
o  that the client is talking, possibly indirectly, to the log indicated by the tree hash;
o  which software and software version is being used.

There is a possible fingerprinting attack where a log issues a unique STH for a targeted HTTPS client. This is similar to the fingerprinting attack described in Section 9.2.2, but can operate cross-origin. If a log (or HTTPS Server cooperating with a log) provides a unique STH to a client, the targeted client will be the only client pollinating that STH cross-origin.

This attack is partially mitigated because the log is limited in the number of STHs it can issue. It must 'save' one of its STHs each MMD to perform the attack.

9.2.5. Privacy in STH Interaction

An HTTPS client may pollinate any STH within the last 14 days. An HTTPS Client may also pollinate an STH for any log that it knows about. When a client pollinates STHs to a server, it will release more than one STH at a time.
It is unclear whether a server may 'prime' a client and be able to reliably detect the client at a later time. It's clear that a single site can track a user any way it wishes, but this attack works cross-origin and is therefore more concerning.

Two independent sites A and B want to collaborate to track a user cross-origin. A feeds a client Carol some N specific STHs from the M logs Carol trusts, chosen to be older and less common, but still in the validity window. Carol visits B and chooses to release some of the STHs she has stored, according to some policy.

By modeling how common older STHs are in the pools of clients, and examining that model against a given policy for choosing which of those STHs to send to B, it should be possible to calculate statistics about how unique Carol looks when talking to B and how useful/accurate such a tracking mechanism is.

Building such a model is likely impossible without some real-world data, and requires a given implementation of a policy. Suggestions for minimizing this attack are provided in Section 10.1, but follow-up testing with real-world deployments to improve the policy will be required.

9.2.6. Trusted Auditors for HTTPS Clients

Some HTTPS clients may choose to use a trusted auditor. This trust relationship leaks a large amount of information from the client to the auditor. In particular, it will identify the web sites that the client has visited to the auditor. Some clients may already share this information with a third party, for example, when using a server to synchronize browser history across devices in a server-visible way, or when doing DNS lookups through a trusted DNS resolver. For clients with such a relationship already established, sending SCTs to a trusted auditor run by the same organization does not appear to leak any additional information to the trusted third party.
Clients who wish to contact an auditor without associating their identities with their SCTs may wish to use an anonymizing network like Tor to submit SCT Feedback to the auditor. Auditors SHOULD accept SCT Feedback that arrives over such anonymizing networks.

Clients sending feedback to an auditor may prefer to reduce the temporal granularity of the history leakage to the auditor by caching and delaying their SCT Feedback reports. This is elaborated upon in XXX Mixing. This strategy is only as effective as the granularity of the timestamps embedded in the SCTs and STHs.

9.2.7. HTTPS Clients as Auditors

Some HTTPS Clients may choose to act as Auditors themselves. A Client taking on this role needs to consider the following:

o  an Auditing HTTPS Client potentially leaks its history to the logs that it queries. Querying the log through a cache or a proxy with many other users may avoid this leakage, but may leak information to the cache or proxy, in the same way that a non-Auditing HTTPS Client leaks information to a trusted auditor.
o  an effective Auditor needs a strategy about what to do in the event that it discovers misbehavior from a log. Misbehavior from a log involves the log being unable to provide either (a) a consistency proof between two valid STHs or (b) an inclusion proof for a certificate to an STH any time after the log's MMD has elapsed from the issuance of the SCT. The log's inability to provide either proof will not be externally cryptographically verifiable, as it may be indistinguishable from a network error.

10. Policy Recommendations

This section is intended as suggestions to implementors of HTTPS Clients, HTTPS Servers, and Auditors. It does not mandate any particular implementation technique, so long as the privacy considerations established above are obeyed.

10.1.
Mixing Recommendations

In several components of the CT Gossip ecosystem, the recommendation is made that data from multiple sources be ingested, mixed, provided to a third party, stored for an indeterminate period of time, and eventually deleted. The instances of these recommendations in this draft are:

o  When a client receives SCTs during SCT Feedback, it should store the SCTs and Certificates for some amount of time, provide some of them back to the server at some point, and eventually remove them from its store
o  When a client receives STHs during STH Pollination, it should store them for some amount of time, mix them with other STHs, release some of them to various servers at some point, resolve some of them to new STHs, and eventually remove them from its store
o  When a server receives SCTs during SCT Feedback, it should store them for some period of time, provide them to auditors some number of times, and may eventually remove them
o  When a server receives STHs during STH Pollination, it should store them for some period of time, mix them with other STHs, provide some of them to connecting clients, may resolve them to new STHs via Proof Fetching, and eventually remove them from its store
o  When a Trusted Auditor receives SCTs or historical STHs from clients, it should store them for some period of time, mix them with SCTs received from other clients, and act upon them at some point in time

Each of these instances has specific requirements for user privacy, and each has options that may not be invoked. As one example, an HTTPS client should not mix SCTs from server A with SCTs from server B and release server B's SCTs to server A. As another example, an HTTPS server may choose to resolve several STHs to a single more current STH via Proof Fetching, but it is under no obligation to do so.
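
The first example above (never releasing server B's SCTs to server A) can be illustrated with a minimal sketch of a client-side store. The class name OriginBoundSctStore, its method names, and the retention period are hypothetical and chosen only for illustration; this document does not define such an interface, and a real client would also store the certificate chains alongside the SCTs.

```python
from collections import defaultdict


class OriginBoundSctStore:
    """Hypothetical client-side SCT store keyed by origin (not defined by this draft)."""

    def __init__(self, retention_seconds=14 * 24 * 3600):
        # retention_seconds is an assumed policy value, not taken from the draft.
        self.retention_seconds = retention_seconds
        self._by_origin = defaultdict(list)  # origin -> [(stored_at, sct)]

    def add(self, origin, sct, now):
        """Remember an SCT observed on a connection to `origin`."""
        self._by_origin[origin].append((now, sct))

    def expire(self, now):
        """Drop every SCT held for at least the retention period."""
        for origin in list(self._by_origin):
            kept = [(t, s) for (t, s) in self._by_origin[origin]
                    if now - t < self.retention_seconds]
            if kept:
                self._by_origin[origin] = kept
            else:
                del self._by_origin[origin]

    def release_for(self, origin, now):
        """Return only the SCTs that were originally received from `origin`."""
        self.expire(now)
        return [s for (_, s) in self._by_origin.get(origin, [])]

    def clear_origin(self, origin):
        """Forget all SCTs for `origin`, e.g. when the user clears its history."""
        self._by_origin.pop(origin, None)


store = OriginBoundSctStore(retention_seconds=100)
store.add("a.example", b"sct-a", now=0)
store.add("b.example", b"sct-b", now=50)
print(store.release_for("a.example", now=60))   # [b'sct-a']
print(store.release_for("a.example", now=120))  # [] (expired)
```

Keying the store by origin makes the cross-origin leak structurally impossible rather than a matter of filtering at release time, and the clear_origin method corresponds to the requirement in Section 9.2.2 that stored SCTs be cleared together with an origin's history or cookies.
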
These requirements should be met, but the general problem of aggregating multiple pieces of data, choosing when and how many to release, and when to remove them is shared. This problem has previously been considered in the case of Mix Networks and Remailers, including papers such as [X], [Y], and [Z].

Certain common recommendations can be made:

o  When choosing how many times to release data before expiring it from a cache, use a random number chosen from a distribution, rather than a fixed number. This prevents an adversary from knowing with certainty that it has successfully flushed a cache of a potentially incriminating piece of data.
o  [TODO Enumerating the problems of different types of mixes vs Cottrell Mix]
o  [TODO Integrating the IP address into the algorithm for releasing data]
o  [TODO Prefer aggregating multiple pieces of data into a single STH when possible]
o  [TODO The importance of Flushing Attacks, and tying in network connection, and time interval]

10.2. Blocking Recommendations

10.2.1. Frustrating blocking

When making gossip connections to HTTPS Servers or Trusted Auditors, it is desirable to minimize the plaintext metadata in the connection that can be used to identify the connection as a gossip connection and therefore be of interest to block. Additionally, introducing some randomness into client behavior may be important - we assume that the adversary is able to inspect the behavior of the HTTPS client and understand how it makes gossip connections.

As an example, if a client, after establishing a TLS connection (and receiving an SCT, but not making its own HTTPS request yet), immediately opens a second TLS connection for the purpose of gossip, the adversary can reliably block this second connection to block gossip without affecting normal browsing.
For this reason it is recommended to run the gossip protocols over an existing connection to the server, making use of connection multiplexing such as HTTP Keep-Alives or SPDY.

Truncation is also a concern - if a client always establishes a TLS connection, makes a request, receives a response, and then always attempts a gossip communication immediately following the first response, truncation will allow an attacker to block gossip reliably.

10.2.2. Responding to possible blocking

[Not sure here. Maybe this section will get folded up into the above. Or maybe it relates to the escape valve. -tjr]

11. IANA considerations

[TBD]

12. Contributors

The authors would like to thank the following contributors for valuable suggestions: Al Cutter, Ben Laurie, Benjamin Kaduk, Karen Seo, Magnus Ahltorp, Stephen Kent, Yan Zhu.

13. ChangeLog

13.1. Changes between ietf-00 and ietf-01

o  Improve language and readability based on feedback from Stephen Kent.
o  STH Pollination Proof Fetching defined and indicated as optional.
o  3-Method Ecosystem section added.
o  Cases with Logs ceasing operation handled.
o  Text on tracking via STH Interaction added.
o  Section with some early recommendations for mixing added.
o  Section detailing blocking connections, frustrating it, and the implications added.

13.2. Changes between -01 and -02

o  STH Pollination defined.
o  Trusted Auditor Relationship defined.
o  Overview section rewritten.
o  Data flow picture added.
o  Section on privacy considerations expanded.

13.3. Changes between -00 and -01

o  Add the SCT feedback mechanism: Clients send SCTs to originating web server which shares them with auditors.
o  Stop assuming that clients see STHs.
o  Don't use HTTP headers but instead .well-known URLs - avoid that battle.
o  Stop referring to trans-gossip and trans-gossip-transport-https - too complicated.
o  Remove all protocols but HTTPS in order to simplify - let's come back and add more later.
o  Add more reasoning about privacy.
o  Do specify data formats.

14. Normative References

[RFC-6962-BIS] Laurie, B., Langley, A., Kasper, E., Messeri, E., and R. Stradling, "Certificate Transparency", October 2015.

[RFC7159] Bray, T., "The JavaScript Object Notation (JSON) Data Interchange Format", RFC 7159, March 2014.

Authors' Addresses

Linus Nordberg
NORDUnet
Email: linus@nordu.net

Daniel Kahn Gillmor
ACLU
Email: dkg@fifthhorseman.net

Tom Ritter
Email: tom@ritter.vg