Internet-Draft Replay Resistant ARC January 2023
Chuang, et al. Expires 24 July 2023
Workgroup:
Independent Stream
Internet-Draft:
draft-chuang-replay-resistant-arc-05
Published:
Intended Status:
Experimental
Expires:
Authors:
W. Chuang
Google, Inc.
B. Gondwana
Fastmail Pty Ltd
A. Robin
Google, Inc.

Replay Resistant Authenticated Receiver Chain

Abstract

DKIM (RFC6376) is an IETF standard for the cryptographic protocol to authenticate email at the domain level and protect the integrity of messages during transit. Section 8.6 defines a vulnerability called DKIM Replay as a spam message sent through an SMTP MTA DKIM signer, that then is sent to many more recipients, leveraging the reputation of the signer. We propose two Replay Resistant cryptographic domain based protocols that leverage ARC (RFC8617). The first technique discloses all SMTP recipients as signed RFC822 headers by the sender, which allows a receiver to verify that the envelope recipients are among those the sender declared. The second technique defines an SMTP extension that allows both the sender and receiver to participate in signing the message signature. It includes a path building technique that accurately defines the SMTP forwarding path of the message. Both techniques permit a receiver to detect DKIM and ARC replay attacks and other attacks. This specification also defines a relay flow identifier to prevent spammers from using relay forwarding. We do this by having relays categorize their authenticated traffic flows and publish to receivers identifiers associated with those flows. Receivers can use this identifier to help categorize traffic through the relay and to apply fine-grained anti-abuse policies to individual flows instead of to the entire traffic through the relay.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 24 July 2023.

1. Introduction

This protocol provides two different replay resistant techniques to authenticate email by domain. It leverages the features of ARC to name the ADMDs in the email forwarding path and their intermediate results. The first technique discloses all SMTP recipients as signed RFC822 headers by the sender, which allows a receiver to verify that the envelope recipients are among those the sender declared. The second technique defines an SMTP extension that allows both the sender and receiver to participate in signing the message signature, and can be built into a path of signing ADMDs called chaining. The results of the two techniques MAY be used by spam filtering to apply some local policy, and/or applied to DMARC policy evaluation as one of its input email authenticators. These two techniques are independent of each other, with different methods, benefits and limitations in tackling replay. Both are presented in case the limitations of one preclude using it. This also depends upon ARC improvements to ensure that the results of the first two techniques are propagated correctly to the receiver.

Spammers utilize relays to obfuscate their identities and often to spoof some other identity with email receivers. For example, a spammer may exploit the shared tenancy vulnerability of SPF to spoof some identity as follows. The spammer finds a relay that hosts many different enterprise customers who include the relay's IPs in their SPF policies. The spammer then sends traffic through the relay assuming the identity of one of those customers, i.e. it spoofs the MAIL FROM identity of the victim domain. While the SPF validation (if done) of the initial send by the spammer to the relay fails, a subsequent SPF validation when the message is forwarded from the relay to some other victim receiver will pass because the relay's IPs are contained in the victim domain's SPF policy. At some point, the receiver notices the spam arriving via the relay and wants to apply anti-abuse countermeasures. With existing authentication methods, this policy would impact all mail flows through that relay, both innocent and malicious. A better approach would be to selectively apply anti-abuse countermeasures to the spammer's flow, which is what this proposal enables.

1.1. Terminology and Definitions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

Acronyms

  • Authenticated Received Chain (ARC) [RFC8617]- A protocol meant to resolve the problem of DMARC [RFC7489] policy rejections caused by mail forwarding. It is based on DKIM, and suffers from issues similar to DKIM replay. ARC defines digital signatures and Authentication-Results by ADMD and a versioning mechanism for them.
  • Authentication-Results [RFC8601]- A header containing a list of validation results and comments.
  • DomainKeys Identified Mail (DKIM) [RFC6376]- IETF standard for the cryptographic protocol to authenticate email at the domain level and protect the integrity of messages during transit.

    • DKIM replay- [RFC6376] section 8.6 defines a vulnerability called DKIM Replay as a spam message sent through a SMTP MTA DKIM signer, that then is sent to many more recipients, leveraging the reputation of the signer.
  • Sender Policy Framework (SPF) [RFC7208]- IETF standard for authenticating sending servers typically based on IP address.
  • ADministrative Management Domain (ADMD) [RFC5598] defines Mail Submission Agent (MSA), Mail Transfer Agent (MTA) and Mail Delivery Agent (MDA).
  • Domain-based Message Authentication, Reporting, and Conformance (DMARC) [RFC7489]- Defines a sender's domain identity and from that a sender's message handling policy when messages are being spoofed. It defines using SPF and DKIM as methods to determine the authenticity of messages.
  • Signed RFC822 header recipients- These identities are defined by To, Cc and Forwarded-to headers, and MUST be signed headers present in the ARC-Seal.
  • SMTP recipients- The RFC5321 RCPT TO recipients that are disclosed during the SMTP transmission. These identities define the inboxes that the message is intended for.

2. ARC Set Improvements

This protocol leverages the concepts and features of ARC [RFC8617] for propagating authentication results and protecting the integrity of the headers and message body. It adds new ARC-Seal tag/values to describe protocol settings and new ARC-Authentication-Results status and comments as described in later sections. As the ARC chain identifies the message traversal over the forwarding path, this protocol uses the ARC set number as the per ADMD versioning. Unlike ARC, this proposal mandates that the aware ADMDs explicitly identify themselves as an ARC set in the identified path, or make explicit when the message exits the identified path into some naive, unaware ADMD as described later.

The identified path traverses ADMDs starting with the MSA, optionally traverses one or more MTAs, and terminates with delivery into the inbox at the MDA. The MSA ADMD, i.e. the responsible originating sender of the message, is identified as the initial ARC set "i=1". This protocol requires that the From header domain MUST be the same as the ARC-Seal "d=" domain, i.e. tying the sender's identity to the cryptographic signer that claims it. As the originator has no email authentication results, the ARC-Authentication-Results MUST be empty. Similarly, when the message is delivered to the inbox by the MDA ADMD at ARC set "i=N", the MDA alters the ARC set to make termination identifiable and to make it more difficult to replay the ARC sets. The MDA strips the ARC-Seal and ARC-Message-Signature but leaves behind the ARC-Authentication-Results before sending the message to the MUA. A message lacking the ARC-Seal and ARC-Message-Signature has been delivered to the inbox, and the last ARC set's ARC-Authentication-Results indicates the MDA ARC set. ADMDs that act as an MTA will, upon receiving a message, forward it to some new destination. Note that an ADMD MAY be both an MTA and an MDA, i.e. it MAY forward the message and deliver it to some inbox.

To prevent reuse of ARC headers from one message in another, this protocol mandates generating the ARC-Message-Signature upon any outbound traversal from one ADMD to a different ADMD. In addition there MAY be ARC signing internal to the ADMD. Having this outbound message body signing invariant permits the receiver to verify the integrity of the message as sent by the prior ADMD. To verify the integrity of the ARC sets, a receiver MUST verify the previous ARC set's ARC-Message-Signature and verify each ARC set's ARC-Seal signature from "i=N" (sender's ARC set number) to "i=1", as well as the presence of all headers in the ARC set. Failure from the immediate sender at "i=N" is treated differently than failures from prior senders at "i=N-1" or earlier, with the intention that verifiers MAY use the verification results thus differentiated. If the receiver sees a verification failure from the immediate sender's ARC-Message-Signature or ARC-Seal, or from missing headers, that is a "hard" fail. Prior ARC-Seal failures from "i=N-1" to "i=1" are treated as softfail. The result of the verification is published in the Authentication-Results and the ARC-Authentication-Results with a tag "arc=". Even if ARC verification fails, this specification asks the receiver to continue ARC generation and verification to provide forensic evidence for subsequent receivers via the authentication results. For example, the SPF authentication results of the potentially malicious sender MAY help identify that sender to some subsequent receiver. The propagated ARC verification failure will help prevent inadvertent use of the authentication results by subsequent receivers.
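
The hard fail versus softfail distinction above can be sketched as follows. This is only an illustrative sketch: the function name, its arguments, and the choice to treat a missing prior ARC-Seal result as a failure are assumptions, not part of this specification. Signature verification itself is assumed to have been done elsewhere.

```python
from typing import Dict


def classify_arc_chain(seal_ok: Dict[int, bool],
                       prev_ams_ok: bool,
                       headers_complete: bool) -> str:
    """Return "pass", "softfail" or "fail" for a received ARC chain.

    seal_ok maps ARC set number i to the ARC-Seal verification outcome.
    prev_ams_ok is the outcome for the immediate sender's
    ARC-Message-Signature; headers_complete says whether every ARC set
    header is present. All names here are hypothetical.
    """
    n = max(seal_ok)  # immediate sender's ARC set number, "i=N"
    # Any failure attributable to the immediate sender is a hard fail.
    if not prev_ams_ok or not headers_complete or not seal_ok[n]:
        return "fail"
    # Failures at "i=N-1" down to "i=1" are only a softfail.
    for i in range(n - 1, 0, -1):
        if not seal_ok.get(i, False):  # assumption: missing result = failure
            return "softfail"
    return "pass"
```

The returned string would be what the receiver publishes under the "arc=" tag; the receiver keeps generating its own ARC set regardless of the outcome.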

3. Declare All Recipients and Affirm (DARA)

3.1. Concepts

We can harden the protocol against replay attacks by explicitly identifying all recipients in the headers, including when the recipient is "hidden" such as with Bcc or mailing lists. That way when a signed message arrives, the receiver can check whether the RCPT TO recipients are a subset of the recipients in the signed message headers. If not, then the message MAY be part of a replay attack. For blind carbon copy, while a Bcc header MAY be added, it can be stripped by subsequent forwarders. Instead we create a new Forwarded-to header that includes an ARC set versioning number to indicate which ADMD sent the message to a new recipient.

Forwarded-to: i=1; user@example.com

The Forwarded-to header MUST be signed by the ARC-Message-Signature, i.e. be present in the "h=", then prepended to the headers of the message. If there are multiple recipients, the message MUST be split and signed exclusively for each Forwarded-to recipient to maintain privacy between recipients. Subsequent forwarders MUST NOT strip the Forwarded-to header from the message. To handle the email forwarder and mailing list scenario, we also use the Forwarded-to header to indicate that a message is sent to a new recipient. Messages sent to a new ADMD but with the same recipient identity disclosed by Forwarded-to MAY reuse the prior header.

Senders and receivers MAY variously support the Declare All Recipients and Affirm (DARA) protocol or not, so the protocol needs to be tolerant of naive ADMDs. For example, a naive mailing list sender sending to a protocol aware receiver SHOULD NOT have traffic rejected simply because it didn't follow the protocol. Yet simultaneously, the DARA protocol needs to discourage abuse by spammers seeking to use the naive ADMD path for DKIM replay. In this protocol, the sender publishes its capability in the ARC-Seal as a "dara=" tag/value, which also states whether the receiver SHOULD validate recipients. A value of "v" indicates that the receiver MUST validate the recipients and, if verification fails, treat the message as DARA unauthenticated with the implication that the message is being replayed. As with other email authentication methods, the verifier is free to apply a locally defined policy against unauthenticated email. A value of "d" indicates that the receiver MAY validate the recipients at its discretion. If a receiver validates the recipients in this mode, it SHOULD treat recipient verification failure as neutral and SHOULD treat success as pass. The discretionary validation mode supports the scenario of sending to a naive ADMD that does not support ARC or the DARA protocol. Because such naive forwarders might not add any indication of their presence, e.g. by adding an ARC set, the sender MUST protect subsequent DARA aware receivers from misinterpreting prior settings while allowing for recipient updates that MAY otherwise trigger false positive verification failures. All ADMDs supporting the DARA protocol MUST publish a DNS TXT policy record as described below. The sender fetches the receiver's policy record to determine which mode to select: the required verification "dara=v" when the receiver supports the DARA protocol, otherwise the optional "dara=d" validation profile.

In addition, when the receiver does not support the protocol, the sender always identifies the individual signed recipient. This MAY be needed when the recipient is in the To or Cc headers; in this case the sender also adds a Forwarded-to header per recipient, then signs the message only for that recipient. Unique identification of the recipient and the receiving domain allows a receiver to adjust the reputation system in case there is a replay attack. Instead of penalizing the sender that is DARA aware, the receiver MAY elect to apply the reputation penalty to the receiving domain that is naive to DARA.

The receiver's verification process is to collect all the recipients in the To, Cc and Forwarded-to headers. It verifies that they are signed appropriately in the sender's ARC-Message-Signature and if so, put them into a set of signed headers. The receiver then collects all the RCPT TO recipients as the envelope recipients. The receiver then verifies that the envelope recipients are a subset of the signed headers. It applies the policy depending on the sender's capabilities as described in the ARC-Seal "dara=" tag/value. The result of this check SHOULD be published in the ARC-Authentication-Results as "dara" [RFC8601] method as pass or fail or neutral.
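
The receiver's recipient check can be sketched as below, assuming the ARC-Message-Signature has already been verified and its "h=" list is available as a set of lower-cased header names. The function names are hypothetical, and lower-casing whole addresses is a simplifying assumption (local parts are formally case sensitive).

```python
from email import message_from_string
from email.utils import getaddresses
from typing import Iterable, Set


def signed_recipients(raw_message: str, signed_header_names: Set[str]) -> Set[str]:
    """Collect To, Cc and Forwarded-to recipients that are covered by the
    sender's ARC-Message-Signature "h=" list (passed in pre-parsed)."""
    msg = message_from_string(raw_message)
    recipients = set()
    for name in ("To", "Cc", "Forwarded-to"):
        if name.lower() not in signed_header_names:
            continue  # unsigned headers do not count
        for value in msg.get_all(name, []):
            if name == "Forwarded-to":
                # Drop the leading "i=N;" ARC set number.
                value = value.split(";", 1)[-1]
            for _, addr in getaddresses([value]):
                if addr:
                    recipients.add(addr.lower())
    return recipients


def dara_result(envelope_rcpts: Iterable[str], signed: Set[str], mode: str) -> str:
    """Apply the "dara=" mode from the sender's ARC-Seal ("v" or "d")."""
    if {r.lower() for r in envelope_rcpts} <= signed:
        return "pass"
    # Mandatory mode reports a failure; discretionary mode stays neutral.
    return "fail" if mode == "v" else "neutral"
```

For the replay example in Section 3.3.2, an envelope recipient of john.doe@example.com against a signed To of user@example.com would yield "fail" under "dara=v".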

The DARA DNS policy record identifies whether an ADMD supports the protocol. It is a TXT DNS record located at the same domain name as the MX record. Quite likely it will share the policy record with SPF. Such a policy record starts with a DARA version number "dara_version=", which MUST be set to "ver1.0" to indicate that the ADMD supports DARA. While usually the sender looks up the DARA TXT DNS record, a receiver MAY elect to check the sender's policy if it suspects that a MiTM has stripped off the sender's DARA declaration. If it detects a DARA declaration in the DNS policy, but not in the message, the receiver MAY elect to treat the message as spam.
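
A sender-side sketch of choosing the "dara=" mode from the receiver's TXT records follows. The exact record syntax is not defined in this document, so the parser assumes whitespace or semicolon separated tag=value tokens; that syntax, and both function names, are assumptions for illustration only. The DNS lookup itself is left out.

```python
from typing import Dict, Iterable


def parse_policy(txt: str) -> Dict[str, str]:
    """Split a TXT record into tag/value pairs.

    Assumption: tokens are separated by whitespace or ';' and look like
    tag=value. Tokens without '=' (e.g. SPF mechanisms) are skipped.
    """
    tags = {}
    for token in txt.replace(";", " ").split():
        key, sep, value = token.partition("=")
        if sep:
            tags[key.lower()] = value
    return tags


def select_dara_mode(receiver_txt_records: Iterable[str]) -> str:
    """Return "v" when the receiver declares DARA support, else "d"."""
    for record in receiver_txt_records:
        if parse_policy(record).get("dara_version") == "ver1.0":
            return "v"
    return "d"
```

The returned value is what the sender would place in its ARC-Seal "dara=" tag.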

3.2. Definitions

DNS TXT Policy tags

  • "dara_version=": Value of "ver1.0" if the ADMD supports the DARA verification protocol.

ARC-Seal tags

  • "dara=": Value of "v" if the sender mandates that the receiver verify the recipients. Value "d" if the sender asks the receiver to optionally verify the recipients, and writes a pass if the recipient verification passes.

ARC-Authentication-Results method

  • "dara=": Value of pass if recipient validation passes, otherwise fail. In some circumstances this tag/value may be unset or be treated as neutral.

3.3. Header Examples

3.3.1. MBP ==> Mailing-List ==> MBP

First MBP outbound (after ARC seal)

ARC-Seal: i=1; dara=v
ARC-Authentication-Results: i=1
To: mailing.list@example.com

Mailing-List inbound (after ARC seal)

ARC-Seal: i=2; dara=v
ARC-Authentication-Results: i=2; dara=pass (rcpt.to
    mailing.list@example.com matches signed header)
ARC-Seal: i=1; dara=v
ARC-Authentication-Results: i=1
To: mailing.list@example.com

Mailing-List outbound (after ARC seal)

Forwarded-to: i=2; user@example.com
ARC-Seal: i=2; dara=v
ARC-Authentication-Results: i=2; dara=pass (rcpt.to
    mailing.list@example.com matches signed header)
ARC-Seal: i=1; dara=v
ARC-Authentication-Results: i=1
To: mailing.list@example.com

Final MBP inbound (after ARC seal)

ARC-Seal: i=3; dara=v;
ARC-Authentication-Results: i=3; dara=pass (rcpt.to
    user@example.com matches signed header)
Forwarded-to: i=2; user@example.com
ARC-Seal: i=2; dara=v;
ARC-Authentication-Results: i=2; dara=pass (rcpt.to
    mailing.list@example.com matches signed header)
ARC-Seal: i=1; dara=v
ARC-Authentication-Results: i=1
To: mailing.list@example.com

3.3.2. MBP ==> MBP-Replay ==> MBP

First MBP outbound (after ARC seal)

ARC-Seal: i=1; dara=v
ARC-Authentication-Results: i=1
To: user@example.com

Second MBP inbound (after ARC seal)

ARC-Seal: i=2; dara=v;
ARC-Authentication-Results: i=2; dara=pass (rcpt.to
    user@example.com matches signed header)
ARC-Seal: i=1; dara=v
ARC-Authentication-Results: i=1
To: user@example.com

The above message is captured by a spammer, modified (by adding additional headers) and then resent. A spammer might send the message to john.doe@example.com, which would be unspecified in the headers.

Victim (last) MBP inbound (after ARC seal)

ARC-Seal: i=3; dara=v
ARC-Authentication-Results: i=3; dara=fail (rcpt.to
    john.doe@example.com mismatches signed header);
ARC-Seal: i=2; dara=v
ARC-Authentication-Results: i=2; dara=pass (rcpt.to
    user@example.com matches signed header);
ARC-Seal: i=1; dara=v
ARC-Authentication-Results: i=1
To: user@example.com

3.3.3. MBP ==> Naive-Forwarder ==> MBP

This describes a forwarder that does not support DARA.

First MBP outbound (after ARC seal)

Forwarded-to: i=1; user@naive.example.com
ARC-Seal: i=1; dara=d
ARC-Authentication-Results: i=1
To: user@example.com

Forwarder headers will be the same as above as the forwarder is naive to the protocol.

Final MBP inbound (after ARC seal). In this case the envelope recipient will change to weihaw@example.com. The declared recipient user@naive.example.com will mismatch the envelope recipient, and fail DARA. However the protocol is set to optional verification with "dara=d", and so does not report the failure.

ARC-Seal: i=2; dara=v
ARC-Authentication-Results: i=2
Forwarded-to: i=1; user@naive.example.com
ARC-Seal: i=1; dara=d
ARC-Authentication-Results: i=1
To: user@naive.example.com

4. Sender Receiver Co-Signing (SeRCi)

4.1. Concepts

We can create a challenge response system using cryptographic signing orchestrated between the sender and receiver of an SMTP transaction. The receiver challenges the sender to sign a mutually agreed upon value with the sender's secret key, which can demonstrate a proof of that SMTP client-server relationship to third parties. One problem is that the receiver can't proactively issue the challenge, so the server issues the challenge in its response to the EHLO as an optional SMTP extension argument. The sender can respond with the signature incorporating the shared value as an SMTP extension verb. Another problem is preventing a malicious party from intercepting a message and trying to replicate the challenge. We propose using a timestamp that can't be reused in the future, i.e. both parties make sure the timestamp reasonably represents the current time. This cryptographic challenge needs to sign a hash that ties the signature back to the message, and for this proposal, we take the whole message hash from the ARC-Message-Signature. In addition the destination domain is specified to reduce the risk of this signature being intercepted and reused for other communications with other destination domains.

Such a protocol can help authenticate to a receiver that some sender sent a message without risk of replay via some third party. Sender Receiver Co-Signing (SeRCi pronounced Cersei as in the GoT character) could be used similarly to SPF [RFC7208] but without the risk of shared tenancy attack, IP reuse attack, and BGP vulnerabilities. Moreover a third party can independently verify the result that some sender and receiver exchanged the given message at the given time. This obviates the need to trust the ARC-Authentication-Results. Later we use SeRCi metadata to describe the forwarding path of the message.

This protocol signs the message's content at the exit from the ADMD to protect the SMTP transaction and yet be insensitive to message body or header modifications by the ADMD. This is necessary to tolerate the changes that a legitimate forwarder may make, such as a mailing list adding a footer or adding the name of the mailing list to the Subject header. Other forwarders may alter the Content-Transfer-Encoding or delete attachments, which this protocol also tolerates. However malicious forwarders may add malicious content to, or replace content in, otherwise benign messages and this must be detected. SeRCi identifies message body changes via different body hashes between the originator and the destination. If a message is unchanged between the originator and the destination, then malicious content is attributed to the originator. If a message is changed and there is malicious content, then the originator and the mutating ADMDs are assigned responsibility. Potentially the attribution can affect the reputation the receiver gives to the ADMDs. The existing ARC protocol can do this, however it is a risky endeavor due to the potential for ARC replay and the looseness around when ARC generates its ARC-Message-Signature.

The SMTP extension for SeRCi, which generates the hash and then publishes it, is meant to prove that the sender and receiver collaborated to create the hash. The protocol is advertised as an SMTP extension in the SMTP EHLO response named SERCI with a timestamp argument. That timestamp will be in UTC seconds. If the timestamp is acceptable to the sender, then it SHOULD sign a tuple of url-safe base64 [RFC4648] message hash used in the outbound ARC-Message-Signature, destination domain as defined in the next paragraph, and timestamp. (Subsequent base64 operations are assumed to be url-safe encoded base64 [RFC4648] to avoid quoted-string.) That signature then SHOULD be base64 encoded and disclosed to the receiver as:

SERCI-SIGNATURE <sender domain> <selector> <header-body hash>
<message-body hash> <signature>

where the signature is computed over a hash of the formatted SeRCi result comment string to be presented by the receiver, minus the signature. Note there are no white spaces in the hashed string: to create the canonical version, whitespace is removed. Thus the signature is:

base64(signsender(sha256(<sender domain><selector><header hash>
<message body hash><timestamp>)))

where domain corresponds to the sender's DKIM domain and selector that is used to find the DKIM public key DNS record. It also discloses the header hash and body hash that is used to compute the message hash, and are present to allow detection of differences between ARC sets. If the timestamp is not acceptable, the sender can report this as SERCI-SIGNATURE "out-of-time" and potentially the receiver will return a new timestamp. The sender is allowed to do this once, and after that the receiver MUST report an error. To prevent eavesdropping and potential spoofing, this protocol MUST be secured by SMTP TLS. Upon obtaining the signature, the receiver MUST then validate the SeRCi signature. It looks at the sender's ARC-Message-Signature hash to see if that is acceptable, meaning matches a hash the receiver generates of the message. Next it checks if the timestamp is the same as provided to the sender, and if the destination domain is the same as the receiver's ARC-Seal "d=" domain. The SERCI-SIGNATURE command returns OK on success, otherwise some error code.
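
A minimal sketch of how a sender might assemble the SERCI-SIGNATURE arguments is shown below. It assumes that the header and body hashes enter the hashed string in their url-safe base64 form, which this document does not state explicitly, and it takes the private-key operation as a caller supplied function (the hypothetical `sign` parameter) rather than choosing a signature algorithm. All function names are illustrative.

```python
import base64
import hashlib
from typing import Callable


def b64url(data: bytes) -> str:
    """url-safe base64 per RFC 4648."""
    return base64.urlsafe_b64encode(data).decode("ascii")


def serci_digest(sender_domain: str, selector: str, header_hash: bytes,
                 body_hash: bytes, timestamp: int) -> bytes:
    """sha256 over the concatenated fields with all whitespace removed."""
    fields = [sender_domain, selector, b64url(header_hash),
              b64url(body_hash), str(timestamp)]
    canonical = "".join("".join(fields).split())  # strip any whitespace
    return hashlib.sha256(canonical.encode("utf-8")).digest()


def serci_signature_line(sign: Callable[[bytes], bytes], sender_domain: str,
                         selector: str, header_hash: bytes, body_hash: bytes,
                         timestamp: int) -> str:
    """Build the command line; `sign` wraps the sender's DKIM private key."""
    digest = serci_digest(sender_domain, selector, header_hash,
                          body_hash, timestamp)
    return " ".join(["SERCI-SIGNATURE", sender_domain, selector,
                     b64url(header_hash), b64url(body_hash),
                     b64url(sign(digest))])
```

Because the timestamp is part of the digest, a signature captured from one SMTP session would not verify against a different timestamp issued by another receiver.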

An example SMTP transaction might look like:

EHLO sender.example.com
250-sender.example.com at your service, [1.2.3.4]
250-SIZE 157286400
...
250-SMTPUTF8
250 SERCI <timestamp>
MAIL FROM:<sender>
RCPT TO:<recipient>
DATA <message>
SERCI-SIGNATURE <sender domain> <selector> \
base64(<header hash>) base64(<message body hash>) \
base64(signsender(sha256(<sender domain><selector><header hash> \
<message body hash><timestamp>)))

The sender discovers the receiver's support for this protocol by a DNS TXT policy lookup upon the recipient email address domain. Within this policy record MAY be a tag/value indicating the SeRCi version number, "serci_version=", which MUST be set to "ver1.0" when that ADMD indicates it supports SeRCi. The lookup also discovers the normalized destination domain name, and that destination domain MUST match the ADMD ARC-Seal "d=" domain [RFC8617], which enables tracing this domain from sender to receiver as described later. The domain name is specified as "serci=<domain>" in the DNS policy record. Once discovered, this domain is put in the sender's ARC-Seal as "serci=<domain>", which indicates support by the receiver for SeRCi as well as identifying the intended receiver domain. If no such DNS TXT policy record is found, then the receiver does not support the SeRCi protocol. This is indicated in the ARC-Seal by a SeRCi naive receiver tag/value of "snr=" and From header domain for path building described later. Further, the "snr=" tag indicates to subsequent SeRCi aware receivers that there was an intermediate naive forwarder. If a domain advertises an SMTP SERCI-SIGNATURE extension but does not publish a DNS TXT policy, the sender MUST NOT call the SERCI-SIGNATURE command as the receiver is declaring their intent to not participate in SeRCi.

The SeRCi aware receiver will verify the signature after the SERCI-SIGNATURE verb. If the receiver agrees with the signature (i.e. verifies it), the receiver will add to the ARC-Authentication-Results a new authentication-results method "serci" with a pass result, or a fail result otherwise. It also adds as authentication-results [RFC8601] properties the values needed for the signature verification. The [RFC8601] ptype is "smtp". The sender domain property is "sd". The selector is "s". The message body hash is "bh" and the value is encoded in base64. The header hash is "hh" and the value is encoded in base64. The timestamp is "t". The signature is "b" and the value is encoded in base64. This is illustrated as:

ARC-Authentication-Results: i=1; serci=<pass|fail> (<comment>)
     smtp.sd=<sender domain>
     smtp.s=<selector>
     smtp.bh=base64(<message body hash>)
     smtp.hh=base64(<header hash>)
     smtp.t=<timestamp>
     smtp.b=base64(<signature>)
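
A receiver could render that result with a small helper such as the sketch below. The function name and its parameters are hypothetical; the layout simply mirrors the illustration above, with one property per folded line.

```python
import base64


def b64url(data: bytes) -> str:
    """url-safe base64 per RFC 4648."""
    return base64.urlsafe_b64encode(data).decode("ascii")


def format_serci_aar(instance: int, passed: bool, comment: str,
                     sender_domain: str, selector: str, body_hash: bytes,
                     header_hash: bytes, timestamp: int,
                     signature: bytes) -> str:
    """Render the "serci" method and its smtp.* properties as one
    folded ARC-Authentication-Results header."""
    result = "pass" if passed else "fail"
    props = [
        f"smtp.sd={sender_domain}",
        f"smtp.s={selector}",
        f"smtp.bh={b64url(body_hash)}",
        f"smtp.hh={b64url(header_hash)}",
        f"smtp.t={timestamp}",
        f"smtp.b={b64url(signature)}",
    ]
    first = f"ARC-Authentication-Results: i={instance}; serci={result} ({comment})"
    # Continuation lines start with whitespace so the header stays folded.
    return "\n".join([first] + ["     " + p for p in props])
```

Publishing every input to the signature lets a later verifier recompute the hash and check the signature independently instead of trusting the recorded result.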

4.1.1. Definitions

DNS TXT Policy tags

  • "serci_version=": Value of "ver1.0" if the ADMD supports the SeRCi verification protocol.
  • "serci=": Value of the receiver's ARC-Seal "d=" domain

ARC-Seal tags

  • "serci=": Value of the receiver's ARC-Seal "d=" domain when the receiver is SeRCi capable.
  • "snr=": Value of RFC822 recipient's domain name when the receiver is naive of SeRCi.

ARC-Authentication-Results method and ptype-properties

  • "serci=": Value of "pass" if sender/recipient signing validation succeeds, otherwise "fail".
  • "smtp.sd=": sender domain
  • "smtp.s=": selector
  • "smtp.bh=": body hash in base64
  • "smtp.hh=": header hash in base64
  • "smtp.t=": timestamp in UTC seconds
  • "smtp.b=": signature in base64

4.2. Header Example

ARC-Seal: i=2; d=destination.example.com
ARC-Authentication-Results: i=2; serci=pass (comment)
     smtp.sd=source.example.com smtp.s=selector
     smtp.bh=message_body_hash_base64 smtp.t=1664511950175
     smtp.b=signature_base64
ARC-Seal: i=1; d=source.example.com; serci=destination.example.com
ARC-Authentication-Results: i=1
To: user@destination.example.com

5. Relay Flow Identifier

This specification defines an identifier name for mail traversing a relay. Typically the relay uses password authentication such as the methods provided for in [RFC4954], but other methods MAY be possible. This identifier MAY also be used for authenticated forwarding flows such as mailing lists, and with other authentication methods such as DKIM or SPF that verify who the sender is. Because some traffic may have originated at the relay, which traditionally may be DKIM signed, this document provides a specification for DKIM [RFC6376]. In other instances, the relay forwards traffic originated elsewhere, and these messages are typically not DKIM signed by the relay, so instead this document provides a specification using ARC [RFC8617].

Email Service Providers can delegate relay and forwarding services to enterprise customers, typically associated with some customer domain. Spammers utilize these features either by acting as an enterprise customer or by hijacking accounts. This specification proposes naming flows by enterprise customer to help the email receiver with categorization and application of anti-abuse countermeasures. As some mechanisms for mail forwarding such as mailing lists are often opaque after the message is sent and problematic for debugging and abuse protection, this offers a naming scheme to help identify those mechanisms.

5.1. Flow Identification Token

The relaying service choosing to use this specification MUST categorize and name relayed traffic flows such that receivers can do anti-abuse analysis upon them if necessary. In order for the identifier to be effective, it SHOULD be persistent in time and uniquely named across all flows through the relay. As a relayed traffic flow is often associated with a delegated domain, the first part of the identifier MUST either include a domain associated url-safe base64 [RFC4648] token, or be empty if no such delegated domain is present. It MAY include a local part url-safe base64 [RFC4648] token after the domain token, separated by a period '.'. This local part token can help describe the mail forwarding mechanism. Combined, the domain token and the optional local token form the relay flow identifier name. If a message is associated with more than one flow, the relay SHOULD select the more specific flow based on local policy. That name MUST NOT be any relay internal name, though it MAY be a secure cryptographic hash of such. Also that name MUST NOT contain or be associated with any Personally Identifiable Information (PII). The parser should ignore plus signs '+', whose use may be specified in the future.

Example valid names:

0123456789
0123456789.abcdwxyz
.abcdwxyz
<empty>
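
The naming rules above can be checked with a short parser such as the following sketch. The function name is hypothetical, and two points are assumptions: "ignoring" a plus sign is implemented as dropping it before parsing, and trailing '=' padding is tolerated in each token.

```python
import re
from typing import Optional, Tuple

# Domain token (may be empty), then an optional '.' and local token.
# Alphabet is url-safe base64 (RFC 4648) with optional '=' padding.
_RFID = re.compile(r"([A-Za-z0-9_-]*={0,2})(?:\.([A-Za-z0-9_-]+={0,2}))?")


def parse_rfid(name: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (domain_token, local_token) or None if the name is invalid."""
    name = name.replace("+", "")  # reserved character, ignored for now
    match = _RFID.fullmatch(name)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

With this sketch the four example names parse as ("0123456789", None), ("0123456789", "abcdwxyz"), ("", "abcdwxyz") and ("", None) respectively.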

5.2. ARC Authentication-Result Method

This proposes a new ARC [RFC8617] ARC-Authentication-Results defined method [RFC8601] that identifies the presence of a relay flow, together with a property that carries the relay flow identifier name. The defined method is "relay", which when present takes a single result value of "pass" that indicates the relay was authenticated. The relay method will have a propspec tag-value with a policy ptype and an "rfid" property, i.e. "policy.rfid", that takes a single token value. That token value consists of a domain url-safe base64 token and the optional local url-safe base64 token separated by a period. The token parsers MUST ignore a reserved plus sign '+' that may be further specified in the future.

Example:

ARC-Authentication-Results: i=1; auth.example.com;
     relay=pass (comments) policy.rfid=0123456789.abcdwxyz
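
A receiver might pull the identifier out of such a header value as sketched below. This is a deliberately simplified reader, not a full [RFC8601] parser: it assumes comments are not nested and that the property value is unquoted, and the function name is hypothetical.

```python
import re
from typing import Optional


def extract_rfid(aar_value: str) -> Optional[str]:
    """Return the policy.rfid value attached to "relay=pass", if any."""
    # Drop parenthesised comments so they are not mistaken for tokens.
    without_comments = re.sub(r"\([^()]*\)", " ", aar_value)
    for resinfo in without_comments.split(";"):
        tokens = resinfo.split()
        if "relay=pass" not in tokens:
            continue
        for token in tokens:
            if token.startswith("policy.rfid="):
                return token[len("policy.rfid="):]
    return None
```

The returned name can then serve as the key under which the receiver tracks reputation for that single flow rather than for the whole relay.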

6. Chaining

The local results of SeRCi can be combined into a path of verified ADMD domains as defined by the ARC-Seal "d=" domains. Path building can help detect if replay potentially occurred, that is, a receiver MAY check that a message was forwarded from the originator to it without verification errors. If there are Chain building verification errors, then either there is a protocol unaware forwarder, or there is a malicious sender attempting to take the message and reinject it along a new path outside the intent of the originator. A verifier can then check for some prior sender SeRCi declaration of "snr=" vs "serci=", which clarifies definitively which of these two scenarios applies. At that point, it is up to the receiver's local policy to determine what the receiver does with the result. The protocol for this verification is described in more detail in subsequent paragraphs.

The verified path that the message traverses can be used as the message flow identifier in a reputation system. Unlike purely domain based reputation systems, a path based one can help differentiate benign message flows from malicious ones to help identify replay or other abuse by identifying the spammer forwarding malicious content. In the Header Examples we describe a scenario where the spammer exploits some gullible but otherwise benign intermediate forwarder in an attempt to hide their tracks and path based reputation can be particularly helpful in uncovering them.

6.1. Chain Building Algorithm

The following defines an algorithm for path building using SeRCi identifiers. We define the nodes of a path as the ARC-Seal "d=" identities, and the edges as pairs of those domain identities. Because building the edges of a path is a repeated process across edges that resemble links, we call this Chain building. It starts at the destination at ARC set "i=N", and walks through the ARC headers to the originator ARC set "i=1". An edge is defined as a pair of nodes (dN, dN-1), where the sender's ARC-Seal "serci=" domain is dN and MUST match the receiver's ARC-Seal "d=" domain d'N. If they match, edge building considers this a local pass. If the "serci=" tag is missing, the verifier checks whether there is instead a "snr=" tag at this or a prior ARC set; if so it assigns an edge result of neutral, otherwise fail. Next, the receiver's ARC set number MUST be "i=N-1", and if so the ARC-Seal "d=" domain of that set is defined as dN-1. This is extended recursively for (dN-1, dN-2), i.e. for ARC set "i=N-2", and so forth for each "i=n" down to d1.
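
The edge rule can be sketched as follows. This is a non-normative illustration: the ArcSet type and edge_result function are hypothetical names, a present but mismatched "serci=" is treated as fail, and "this or a prior ARC set" is read as the sender's set or any earlier one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ArcSet:
    i: int                       # ARC instance number
    d: str                       # ARC-Seal "d=" domain
    serci: Optional[str] = None  # ARC-Seal "serci=" domain, if declared
    snr: Optional[str] = None    # ARC-Seal "snr=" domain, if declared


def edge_result(sets: List[ArcSet], n: int) -> str:
    """Edge result between ARC set i=n (receiver) and i=n-1 (sender)."""
    by_i = {s.i: s for s in sets}
    receiver, sender = by_i.get(n), by_i.get(n - 1)
    if receiver is None or sender is None:
        return "fail"  # ARC set numbering must be contiguous
    if sender.serci is not None:
        # The sender's "serci=" must name the receiver's "d=" domain.
        matched = sender.serci.lower() == receiver.d.lower()
        return "pass" if matched else "fail"
    # No "serci=": neutral if "snr=" appears at the sender's or an earlier set.
    if any(s.snr for s in sets if s.i <= n - 1):
        return "neutral"
    return "fail"
```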

Local Chain verification is done for each ARC set n, following the above edge building from "i=N" to "i=1", and builds two vectors. One vector keeps the local Chain results and the other the ARC-Seal "d=" domains. The verifier assumes that ARC header and message-body signature verification, SeRCi verification and potentially DARA verification have already run, and that their results already populate the ARC-Authentication-Results. For ARC set "i=N" to ARC set "i=2", the verifier MUST evaluate the local result, meaning the ARC result (i.e. from ARC seal verification and sometimes ARC message-signature verification), the edge building result, the SeRCi verification result, and optionally the DARA verification result, and take the AND of those results. If all of them are pass, the local Chain result is pass. Otherwise, if any of them is neutral or the SeRCi result is softfail, and the rest are pass, the result is neutral. Otherwise the result is fail. Further, local policy MAY modify the ARC message-signature result (perhaps due to future work around this draft). As with ARC improvements, this protocol recommends continuing Chain verification even if the sender's Chain result is fail or neutral, to provide forensic evidence for subsequent receivers. Receivers SHOULD independently verify the SeRCi signature rather than taking the result from ARC-Authentication-Results, which would require trusting an externally generated result. At the originator ARC set "i=1" corresponding to d1, the verifier first verifies alignment between the header From domain and the ARC-Seal "d=" domain. That domain defines d1, and the verifier looks up the SeRCi policy associated with the domain, which MUST exist. If they are not aligned, then the message is not considered to have originated at "i=1", and local Chain verification is considered neutral since the message was likely forwarded to some SeRCi aware domain. In addition, the ARC seal validation for "i=1" MUST pass or local Chain verification is considered fail.
Once these checks pass, Chain building for "i=1" is considered to pass. For every index, the local Chain result is added to the result vector at that index, and similarly the ARC-Seal "d=" domain to the domain vector. If a relay flow identifier is available for that ARC set, it may be used instead of the domain per local policy.
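
The combination rule for ARC sets "i=N" to "i=2" can be sketched as below. This is a non-normative illustration; the function name local_chain_result and the use of lowercase result strings are assumptions of the sketch.

```python
from typing import Optional


def local_chain_result(arc: str, edge: str, serci: str,
                       dara: Optional[str] = None) -> str:
    """Combine per-set results into a local Chain result."""
    checks = [("arc", arc), ("edge", edge), ("serci", serci)]
    if dara is not None:
        checks.append(("dara", dara))  # DARA is optional
    downgraded = False
    for name, result in checks:
        if result == "pass":
            continue
        # neutral from any check, or softfail from SeRCi only, downgrades.
        if result == "neutral" or (name == "serci" and result == "softfail"):
            downgraded = True
        else:
            return "fail"  # anything else fails the local Chain result
    return "neutral" if downgraded else "pass"
```

Note that softfail is tolerated only for the SeRCi result; a softfail reported by any other check is treated as fail in this sketch.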

To compute the global Chain result, the verifier walks over the vector of results. The global Chain result is initialized to pass. Starting from the "i=N" index and moving to "i=1", if the local result is fail then the global result is fail, else if the local result is neutral then the global result is neutral. If the local result is fail, then the domain entries are cleared from that index to "i=1". The verifier then attempts to extend the domain list by looking at the prior ARC set's SPF result. If that is an SPF pass, then the SPF domain is placed at that index, otherwise a failure indication with the cause, e.g. "arc-fail", is inserted at that index. If there are multiple failures, the most specific error is chosen as the cause, e.g. "serci-fail" over "arc-fail". The cleared domain entries are then truncated from the domain list. If the local result is fail, the walk halts. If the local result is neutral and there is a "snr=" tag, then the "snr=" domain is inserted in the domain list after the current index, which helps identify it in the constructed path. A synthetic neutral result is also inserted in the result vector. The path is similarly extended when "i=1" and the message does not originate at that domain (missing alignment between the From header domain and the ARC-Seal "d=" domain) to better identify the flow. In that case, if the SPF result is a pass, the SPF domain and a synthetic result are prepended to the respective vectors. If there is a non-passing SPF result, an SPF status string such as "spf-neutral" is prepended to the domain vector and the status to the result vector. The global Chain result is published in ARC-Authentication-Results as a "chain=" result. If the result is pass, then the message is considered to be authenticated by SeRCi, otherwise unauthenticated.
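
The core of this walk can be sketched as follows. It is a non-normative, deliberately partial illustration: the function name global_chain and its parameters are hypothetical, the caller is assumed to supply the most specific failure cause and any SPF-passing domain, and the "snr=" insertion and the "i=1" SPF prepending steps are omitted.

```python
from typing import List, Optional, Tuple


def global_chain(results: List[str], domains: List[str],
                 spf_domain: Optional[str] = None,
                 cause: str = "arc-fail") -> Tuple[str, List[str]]:
    """results[k] and domains[k] describe ARC set i=k+1.

    Returns the global Chain result and the path, originator first.
    """
    global_result = "pass"
    path = list(domains)
    for k in range(len(results) - 1, -1, -1):  # walk i=N down to i=1
        if results[k] == "fail":
            global_result = "fail"
            # Clear entries from this index down to i=1, then leave either
            # the SPF domain or the failure cause at the failing index.
            marker = spf_domain if spf_domain else cause
            path = [marker] + path[k + 1:]
            break  # the walk halts on fail
        if results[k] == "neutral":
            global_result = "neutral"
    return global_result, path
```

For a replay detected at the final hop, every domain entry is cleared and only the failure cause remains in the path.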

6.2. Modified Body Algorithm

The protocol can detect when a message is modified along the forwarding path by comparing the current and previous message body hashes to find changes. If the message content is considered spammy or phishy, then ADMDs that may have contributed to that problematic message body content will have their per domain reputation reduced. Other ADMDs that are proven not to have contributed message content SHOULD NOT be affected. A per domain reputation that is sensitive to message modification is useful when the path based reputation has not yet been established; the per domain reputation can then initialize the reputation of the sender. For this we keep a reputation table indexed by domain. We collect the domains that modify the message along a forwarding path, including the sender, and update their reputation collectively and equally based on the spam and phishing scores. Alternatively, the path identifier can be further specialized by adding an indicator of whether a forwarding ADMD modified the message. That differentiates path sensitive reputation by whether a forwarder modified the message or not.
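
The collection and update steps can be sketched as below. This is a non-normative illustration: the function names, the idea that each hop's body hash is taken from something like the ARC-Message-Signature "bh=" tag, and the subtractive scoring are all assumptions of the sketch, since this document does not define a scoring formula.

```python
from typing import Dict, List, Tuple


def contributing_domains(hops: List[Tuple[str, str]]) -> List[str]:
    """hops holds (domain, body_hash) pairs in forwarding order.

    Returns the sender plus every ADMD whose body hash differs from
    the hash recorded by the previous hop.
    """
    if not hops:
        return []
    contributors = [hops[0][0]]  # the sender always contributes content
    for (_, prev_hash), (domain, cur_hash) in zip(hops, hops[1:]):
        if cur_hash != prev_hash and domain not in contributors:
            contributors.append(domain)
    return contributors


def update_reputation(table: Dict[str, float], domains: List[str],
                      spam_score: float) -> None:
    """Apply the same penalty to every contributing domain (hypothetical)."""
    for domain in domains:
        table[domain] = table.get(domain, 0.0) - spam_score
```

ADMDs whose hash matches the previous hop are left out of the returned list, so their table entries are not touched.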

6.3. Definitions

ARC-Authentication-Results tags

  • "chain=": Value of pass if local results and prior nodes are all passes, otherwise if "snr=" was present in the flow then neutral, else fail.

7. Chaining Header Examples

The following two examples illustrate working SeRCi/Chain-Building verification. They are followed by an example of a DKIM replay attack. The fourth example illustrates how this protocol behaves under an SPF upgrade attack. The fifth example demonstrates a forwarder that modifies the message body (the other examples do not have a forwarder that modifies the message). The last example demonstrates a relay that publishes a relay flow identifier.

7.1. MBP ==> Mailing-List ==> MBP

This is an example of mail being sent from one Mail-Box-Provider to another through a Mailing-List where all ADMDs participate in SeRCi. In this illustrative example, we show the construction of the headers.

First MBP outbound (after ARC seal)

ARC-Seal: i=1; d=originator.example.com;
    serci=mailinglist.example.com
ARC-Authentication-Results: i=1
To: mailing.list@mailinglist.example.com

Mailing-List outbound (after ARC seal)

ARC-Seal: i=2; d=mailinglist.example.com;
    serci=destination.example.com
ARC-Authentication-Results: i=2; serci=pass; chain=pass
ARC-Seal: i=1; d=originator.example.com;
    serci=mailinglist.example.com
ARC-Authentication-Results: i=1
To: mailing.list@mailinglist.example.com

Final MBP inbound (after ARC seal)

ARC-Seal: i=3; d=destination.example.com
ARC-Authentication-Results: i=3; serci=pass; chain=pass
ARC-Seal: i=2; d=mailinglist.example.com;
    serci=destination.example.com
ARC-Authentication-Results: i=2; serci=pass; chain=pass
ARC-Seal: i=1; d=originator.example.com;
    serci=mailinglist.example.com
ARC-Authentication-Results: i=1
To: mailing.list@mailinglist.example.com

The global Chain verification result is pass and the message is considered SeRCi authenticated. The constructed path is [originator.example.com, mailinglist.example.com, destination.example.com].

7.2. MBP ==> Naive-Forwarder ==> Aware-Forwarder ==> MBP

This demonstrates a naive forwarder naive.example.com that does not support SeRCi. The headers represent what would be seen after inbound delivery to the destination MBP.

ARC-Seal: i=3; d=destination.example.com
ARC-Authentication-Results: i=3; serci=pass; chain=neutral
ARC-Seal: i=2; d=intermediate.example.com;
    serci=destination.example.com
ARC-Authentication-Results: i=2; chain=neutral
ARC-Seal: i=1; d=source.example.com; snr=naive.example.com
ARC-Authentication-Results: i=1
To: user@destination.example.com

The global Chain verification result is neutral and the message is considered SeRCi unauthenticated. The constructed path is [source.example.com, naive.example.com, intermediate.example.com, destination.example.com].

7.3. MBP ==> Spammer ; Replay ==> MBP

Final headers as seen by the victim ADMD after replay injection to the victim.example.com domain.

ARC-Seal: i=3; d=victim.example.com
ARC-Authentication-Results: i=3; chain=fail
ARC-Seal: i=2; d=destination.example.com
ARC-Authentication-Results: i=2; serci=pass; chain=pass
ARC-Seal: i=1; d=source.example.com;
    serci=destination.example.com
ARC-Authentication-Results: i=1
To: user@destination.example.com

Due to the path discontinuity, the global Chain verification result is fail and the message is considered SeRCi unauthenticated. The constructed path is [serci-fail].

7.4. Spammer ==> Gullible Forwarder ==> MBP

In this example the spammer does not participate in ARC or this protocol. The spammer forwards a message through a permissive cloud provider gullible.forwarder.com to reach the inbox of some user at destination.example.com. The spammer selects a victim domain that uses the email services of gullible.forwarder.com and therefore includes the IPs of gullible.forwarder.com in its SPF policy. While the spammer cannot SPF authenticate at inbound to gullible.forwarder.com, they can SPF authenticate at inbound to destination.example.com, hence the SPF upgrade attack.

ARC-Seal: i=2; d=destination.example.com
ARC-Authentication-Results: i=2; spf=pass; serci=pass; chain=pass
ARC-Seal: i=1; d=gullible.forwarder.com;
    serci=destination.example.com
ARC-Authentication-Results: i=1; spf=neutral
To: user@destination.example.com
From: spoofed_user@victim.example.com

While SPF and consequently DMARC pass at the destination, the SeRCi/Chain verification result is neutral because the message did not originate at victim.example.com. A DMARC evaluation would likely pick the SPF result. A better approach might instead be to use the path based reputation system. The spammy forwarding path is [spf-neutral, gullible.forwarder.com, destination.example.com], which includes evidence of the spammer. Contrast that with the path from a normal message delivery by victim.example.com using their cloud provider, which would look like either [victim.example.com, destination.example.com] or [victim.example.com, gullible.forwarder.com, destination.example.com]. Both would be distinct from the spammer forwarding flow in a path aware reputation system.

The spammer may attempt to confuse the receiver by replaying ARC headers before forwarding to gullible.forwarder.com. This would change the SeRCi/Chain verification result to fail and make the constructed path look very much like [arc-fail, gullible.forwarder.com, destination.example.com]. As gullible.forwarder.com is ARC and SeRCi aware, it would indicate that the replayed ARC headers do not pass ARC verification.

7.5. MBP ==> Modifying Forwarder ==> MBP

This demonstrates a spammy message where the forwarder modifies the message content, representing for example a mailing list adding a footer.

ARC-Seal: i=3; d=destination.example.com
ARC-Authentication-Results: i=3; serci=pass; chain=pass
ARC-Seal: i=2; d=modifying.example.com;
    serci=destination.example.com
ARC-Authentication-Results: i=2; chain=pass
ARC-Seal: i=1; d=source.example.com;
    serci=modifying.example.com
ARC-Authentication-Results: i=1
To: user@destination.example.com

While the global Chain verification result is pass and the message is considered SeRCi authenticated, the message body change is visible via the modified body algorithm. The constructed path is [source.example.com, modified-message-body, modifying.example.com, destination.example.com], where we embellish the path with the modification result. The set of contributing domains associated with the spammy message is {source.example.com, modifying.example.com}.

A different message may travel along the same forwarding path without being modified by the forwarder. The constructed path for that non-modifying forwarder is [source.example.com, modifying.example.com, destination.example.com], which is distinct from the above. The set of contributing domains associated with the message content is now {source.example.com}.
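
The embellished path can be derived from per-hop body hashes, as in the following non-normative sketch. The function name path_with_modifications is hypothetical, and the per-hop body hash is assumed to be available to the verifier (for instance from each ARC set's message signature).

```python
from typing import List, Tuple


def path_with_modifications(hops: List[Tuple[str, str]]) -> List[str]:
    """hops holds (domain, body_hash) pairs, originator first.

    A "modified-message-body" marker is placed before each domain whose
    body hash differs from the previous hop's hash.
    """
    path: List[str] = []
    prev_hash = None
    for domain, body_hash in hops:
        if prev_hash is not None and body_hash != prev_hash:
            path.append("modified-message-body")
        path.append(domain)
        prev_hash = body_hash
    return path
```

When every hop records the same hash, the marker is absent and the path reduces to the plain domain list, so the two flows form distinct reputation keys.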

7.6. Spammer ==> Relay ==> MBP

This demonstrates a spammer sending a message through a relay to a destination MBP.

ARC-Seal: i=2; d=destination.example.com
ARC-Authentication-Results: i=2; spf=pass; serci=pass; chain=pass
ARC-Seal: i=1; d=relay.forwarder.com;
    serci=destination.example.com
ARC-Authentication-Results: i=1; spf=neutral; relay=pass
    policy.rfid=relay+flow+id
To: user@destination.example.com
From: spoofed_user@victim.example.com

As with the above, a better approach might be to use the path based reputation system, where the relay flow identifier replaces the domain in the path. The spammy forwarding path is [spf-neutral, relay+flow+id, destination.example.com]. Reputation analysis using the relay flow identifier will be more specific than the domain based approach.
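
The substitution can be sketched as follows. This is a non-normative illustration: the function name path_identifier is hypothetical, and falling back to the domain when the identifier is absent or empty is one possible local policy.

```python
from typing import List, Optional, Tuple


def path_identifier(hops: List[Tuple[str, Optional[str]]],
                    spf_status: Optional[str] = None) -> List[str]:
    """hops holds (domain, relay_flow_id) pairs, first hop first.

    The relay flow identifier replaces the domain where one is present.
    A non-passing SPF status, if given, is prepended as "spf-<status>".
    """
    path = [rfid if rfid else domain for domain, rfid in hops]
    if spf_status is not None:
        path.insert(0, "spf-" + spf_status)
    return path
```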

8. DMARC

These protocols can act as authenticators for DMARC [RFC7489]. In particular, SeRCi acts similarly to the existing DMARC authenticators in that it requires alignment between the From header domain and the SeRCi originator's ARC-Seal "d=" domain at ARC set "i=1". Assuming From alignment, a SeRCi/Chain building global pass on a message indicates a DMARC pass. As noted in the prior examples, when available SeRCi/Chain can provide more accurate authentication than SPF or DKIM, and it is up to local policy to determine the preference or exclusion of results.
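
A receiver's DMARC decision based on SeRCi/Chain might be sketched as follows. This is a non-normative illustration: the function name serci_dmarc_result is hypothetical, and only strict (exact match) alignment is shown, since relaxed alignment needs an organizational domain lookup that is outside the scope of this sketch.

```python
def serci_dmarc_result(from_domain: str, seal_d_i1: str,
                       chain_result: str) -> str:
    """Return "pass" when From aligns with the i=1 ARC-Seal "d=" domain
    and the global Chain result is pass, otherwise "fail"."""

    def normalize(domain: str) -> str:
        # Compare case-insensitively and ignore a trailing root dot.
        return domain.strip().rstrip(".").lower()

    aligned = normalize(from_domain) == normalize(seal_d_i1)
    return "pass" if aligned and chain_result == "pass" else "fail"
```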

9. Privacy Considerations

10. Security Considerations

11. IANA Considerations

This document has no IANA actions yet.

12. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/rfc/rfc2119>.
[RFC4648]
Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 4648, DOI 10.17487/RFC4648, <https://www.rfc-editor.org/rfc/rfc4648>.
[RFC4954]
Siemborski, R., Ed. and A. Melnikov, Ed., "SMTP Service Extension for Authentication", RFC 4954, DOI 10.17487/RFC4954, <https://www.rfc-editor.org/rfc/rfc4954>.
[RFC5598]
Crocker, D., "Internet Mail Architecture", RFC 5598, DOI 10.17487/RFC5598, <https://www.rfc-editor.org/rfc/rfc5598>.
[RFC6376]
Crocker, D., Ed., Hansen, T., Ed., and M. Kucherawy, Ed., "DomainKeys Identified Mail (DKIM) Signatures", STD 76, RFC 6376, DOI 10.17487/RFC6376, <https://www.rfc-editor.org/rfc/rfc6376>.
[RFC7208]
Kitterman, S., "Sender Policy Framework (SPF) for Authorizing Use of Domains in Email, Version 1", RFC 7208, DOI 10.17487/RFC7208, <https://www.rfc-editor.org/rfc/rfc7208>.
[RFC7489]
Kucherawy, M., Ed. and E. Zwicky, Ed., "Domain-based Message Authentication, Reporting, and Conformance (DMARC)", RFC 7489, DOI 10.17487/RFC7489, <https://www.rfc-editor.org/rfc/rfc7489>.
[RFC8601]
Kucherawy, M., "Message Header Field for Indicating Message Authentication Status", RFC 8601, DOI 10.17487/RFC8601, <https://www.rfc-editor.org/rfc/rfc8601>.
[RFC8617]
Andersen, K., Long, B., Ed., Blank, S., Ed., and M. Kucherawy, Ed., "The Authenticated Received Chain (ARC) Protocol", RFC 8617, DOI 10.17487/RFC8617, <https://www.rfc-editor.org/rfc/rfc8617>.

Appendix A. Acknowledgments

Thanks go to Brandon Long, John R. Levine, Murray S. Kucherawy, Emanuel Schorsch and Bruce Nan for their knowledgeable feedback.

Authors' Addresses

Weihaw Chuang
Google, Inc.
Bron Gondwana
Fastmail Pty Ltd
Allen Robin
Google, Inc.