Network Working Group                               M. S. Kucherawy, Ed.
Internet-Draft                                            1 October 2022
Intended status: Experimental                                           
Expires: 4 April 2023


     Replay-Resistant DomainKeys Identified Mail (DKIM) Signatures
                  draft-kucherawy-dkim-anti-replay-01

Abstract

   DomainKeys Identified Mail (DKIM) provides a digital signature
   mechanism for Internet messages, allowing a domain name owner to
   affix its domain name in a way that can be cryptographically
   validated.

   DKIM signatures protect the integrity of the message header and body
   only.  By design, DKIM is decoupled from the transport and storage
   mechanisms used to handle messages.  This gives rise to a possible
   replay attack, but the original DKIM specification fell short of
   providing a mitigation strategy.  This document presents an optional
   method for binding a signature to a specific recipient or set of
   recipients so that broader replay attacks can be mitigated.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 4 April 2023.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.


Kucherawy                 Expires 4 April 2023                  [Page 1]

Internet-Draft      DKIM Anti-Replay Canonicalization       October 2022


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Definitions . . . . . . . . . . . . . . . . . . . . . . . . .   3
     2.1.  Recommended Reading . . . . . . . . . . . . . . . . . . .   3
     2.2.  Requirements Language . . . . . . . . . . . . . . . . . .   3
   3.  The 'e' Tag . . . . . . . . . . . . . . . . . . . . . . . . .   3
     3.1.  Syntax  . . . . . . . . . . . . . . . . . . . . . . . . .   3
     3.2.  General Definition  . . . . . . . . . . . . . . . . . . .   3
       3.2.1.  Modified Algorithm  . . . . . . . . . . . . . . . . .   4
     3.3.  Example . . . . . . . . . . . . . . . . . . . . . . . . .   5
   4.  Discussion  . . . . . . . . . . . . . . . . . . . . . . . . .   6
     4.1.  Recipient Mutations . . . . . . . . . . . . . . . . . . .   7
     4.2.  Envelope Splitting  . . . . . . . . . . . . . . . . . . .   7
   5.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   7
   6.  Security Considerations . . . . . . . . . . . . . . . . . . .   8
   7.  References  . . . . . . . . . . . . . . . . . . . . . . . . .   8
     7.1.  Normative References  . . . . . . . . . . . . . . . . . .   8
     7.2.  Informative References  . . . . . . . . . . . . . . . . .   8
   Appendix A.  Acknowledgments  . . . . . . . . . . . . . . . . . .   9
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .   9

1.  Introduction

   DomainKeys Identified Mail (DKIM) provides a digital signature
   mechanism for Internet messages, allowing a domain name owner to
   affix its domain name to a message in a way that can be
   cryptographically validated.

   [RFC4686] presents the original threat model DKIM was meant to
   address, and the environment in which it was expected to work.
   Notably, DKIM decoupled itself from the transport of the message.
   The theory suggests it should be possible to validate a signature
   whether a message is in situ (i.e., in an inbox on disk), in transit
   between mail servers, or being retrieved through a mailbox access
   protocol.


   In particular, a DKIM signature can validate irrespective of what is
   in the SMTP [RFC5321] envelope accompanying the message, or even when
   there is no envelope to consider.  As a result, a message and its
   signature can be re-sent to anyone simply by changing the set of
   recipients in the envelope and passing the message back to a Message
   Transfer Agent (MTA) or Mail Submission Agent (MSA).  As the message
   itself is unaltered, any DKIM signature(s) on it will continue to
   validate.  This is a form of replay attack, and it relies for its
   success on the perceived value (i.e., reputation) of the domain(s)
   named in the signature(s).

   This document describes a mechanism by which a signature and a
   message can be coupled such that successful replays to other
   recipient sets are not possible, as the signature will no longer
   validate.

2.  Definitions

2.1.  Recommended Reading

   Several terms used in this document are based on their definitions in
   [RFC5598].

   The term "envelope recipient" is, using the notation proposed in that
   document, an RFC5321.RcptTo address.

2.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  The 'e' Tag

3.1.  Syntax

   Using ABNF [RFC5234], the syntax for the new tag is:

       sig-e-tag = %x65 [FWS] "=" [FWS] %x79
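
   As a non-normative illustration, a verifier might detect the tag
   with a small tag-list scan such as the Python sketch below.  The
   function name "has_e_tag" is hypothetical, and the sketch assumes
   the DKIM-Signature field value is already available as a string.  It
   does not validate the remainder of the tag-list.

```python
def has_e_tag(sig_value: str) -> bool:
    """Return True if the DKIM-Signature tag-list carries "e=y".

    Hypothetical helper: only looks for the one tag, and treats tag
    names as case-sensitive, as DKIM tag names are.
    """
    for tag_spec in sig_value.split(";"):
        name, sep, value = tag_spec.partition("=")
        if not sep:
            continue  # empty or malformed tag-spec; skip it
        # str.strip() removes the folding whitespace (FWS) that may
        # surround the tag name and the tag value.
        if name.strip() == "e" and value.strip() == "y":
            return True
    return False


print(has_e_tag("v=1; a=rsa-sha256; d=example.net; e=y; h=from:to"))
```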

3.2.  General Definition

   This section introduces the "e" (for "envelope") tag, a new DKIM
   signature tag that a signer can use to indicate that the signature
   will validate only for a specific envelope recipient set, namely the
   one associated with the message at the time it was signed.


   To date, DKIM signers and verifiers have had no reason to be
   interested in any aspect of the envelope used to transport a message.
   Verification of a signature bearing this tag is not possible unless
   the envelope is available, which may prove to be a challenge in some
   operating environments.  In particular, it is impossible to validate
   such a signature in a context where no envelope exists, such as when
   retrieving a message from a mailbox.

   The expected value of the tag is simply the character "y", though
   other values may be introduced by future work.  The value has no
   particular meaning; the presence of the tag is the important signal.

   [FOR DISCUSSION] Maybe this should be "r", indicating "recipients",
   to allow later extensions to include other parts of the envelope that
   might be helpful to include.

   The presence of this tag in a DKIM signature indicates that the
   signer executed a modified version of the algorithm described in
   Section 3.7 of [RFC6376], and the verifier MUST do the same.  The
   modification inserts the envelope recipients available at signing or
   verification time into the data fed to the hash algorithm to either
   produce or verify the DKIM signature.

3.2.1.  Modified Algorithm

   This section specifies the modified version of the algorithm defined
   in Section 3.7 of [RFC6376].

   The pseudo-code of "data-hash" is replaced as follows:

     OLD:

       data-hash = hash-alg (h-headers, D-SIG, body-hash)

     NEW:

       data-hash = hash-alg (recipients, h-headers, D-SIG, body-hash)

   The definition of "data-hash" is replaced as follows:


     OLD:

       data-hash: is the output from using the hash-alg algorithm, to
                  hash the header including the DKIM-Signature header,
                  and the body hash.

     NEW:

       data-hash: is the output from using the hash-alg algorithm to
                  hash the recipients, the header including the
                  DKIM-Signature header, and the body hash.

   "recipients" is determined as follows:

   1.  Collect all envelope recipients into a list.

   2.  Sort the list in ascending lexical order by ASCII octet value.

   3.  Format the list by concatenating them all in this sorted order,
       separated by CRLF strings (ASCII 13 followed by ASCII 10), and
       with the last one terminated by a CRLF.

   The signing and verifying processes defined for DKIM are otherwise
   unmodified.
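
   The three steps above, and the modified "data-hash" computation, can
   be sketched in Python as follows.  This is a minimal, non-normative
   sketch under stated assumptions: the function names are hypothetical,
   the header and signature inputs are assumed to be already
   canonicalized octet strings, envelope recipients are assumed to be
   ASCII-only, and the arguments are simply fed to the hash in the
   order shown in the pseudo-code.

```python
import hashlib

CRLF = b"\r\n"


def format_recipients(envelope_rcpts):
    """Build the "recipients" octet string (hypothetical helper).

    Step 1: collect the recipients; step 2: sort by ASCII octet value;
    step 3: join them, each one terminated by CRLF.
    """
    encoded = sorted(addr.encode("ascii") for addr in envelope_rcpts)
    return b"".join(addr + CRLF for addr in encoded)


def data_hash(envelope_rcpts, h_headers, d_sig, body_hash,
              hash_alg="sha256"):
    """Mirror the NEW pseudo-code: recipients are hashed first.

    Assumption: h_headers, d_sig and body_hash are bytes prepared by
    an existing DKIM implementation; this sketch does no
    canonicalization of its own.
    """
    h = hashlib.new(hash_alg)
    h.update(format_recipients(envelope_rcpts))
    h.update(h_headers)
    h.update(d_sig)
    h.update(body_hash)
    return h.digest()
```

   Because the list is sorted before hashing, calling "data_hash" with
   the same recipients in a different order yields the same digest,
   while adding or removing a recipient changes it.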

3.3.  Example

   Consider the following SMTP transaction, wherein "C" denotes
   something sent by an SMTP client, "S" denotes something sent by an
   SMTP server, and terminating CRLFs in both directions are omitted:

     C: MAIL FROM:<msk@example.net>
     S: 250 Sender OK
     C: RCPT TO:<bob@example.com>
     S: 250 Recipient OK
     C: RCPT TO:<alice@example.com>
     S: 250 Recipient OK
     C: DATA
     S: 354 Go ahead
     [message header omitted]

     [message body omitted]
     .
     S: 250 Message delivered


   For this transaction, signing and verification proceed exactly as
   they would in the absence of this tag, except that the content fed
   to the hash algorithm is preceded by:

     alice@example.com<CR><LF>
     bob@example.com<CR><LF>
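
   The prefix shown above can be reproduced with a few lines of Python.
   This is an illustrative sketch only; the helper "extract_rcpt" is
   hypothetical and assumes well-formed RCPT TO commands with no
   parameters after the address.

```python
# RCPT TO commands in the order they appear in the transaction above.
rcpt_commands = [
    "RCPT TO:<bob@example.com>",
    "RCPT TO:<alice@example.com>",
]


def extract_rcpt(command: str) -> str:
    """Return the address between the angle brackets."""
    start = command.index("<") + 1
    end = command.index(">", start)
    return command[start:end]


recipients = [extract_rcpt(c) for c in rcpt_commands]

# Sort, then terminate every address (including the last) with CRLF.
prefix = b"".join(a.encode("ascii") + b"\r\n" for a in sorted(recipients))

print(prefix)  # b'alice@example.com\r\nbob@example.com\r\n'
```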

4.  Discussion

   Use of this tag guarantees that a signature will not verify unless
   the message is sent to exactly the same set of envelope recipients
   as was present in the envelope when the message was signed.  Because
   the recipient set is sorted, verifiers tolerate any reordering of
   the envelope that may be done in transit.  However, if any original
   recipient is removed, or any new recipient added, the signature will
   not validate because the content passed to the hash step at the
   verifier will differ from the content hashed by the signer.  Thus,
   in the replay scenario described in Section 1, the signature no
   longer validates.

   Anecdotal evidence suggests that the bulk of Internet message traffic
   is already single-recipient traffic, which would favor the success
   of this proposal.  However, since the messaging standards both
   permit and even encourage the "common factoring" of traffic (that
   is, one envelope carrying several recipients), and this evidence has
   not been broadly verified, it is appropriate to consider all
   possibilities.

   In the absence of an SMTP envelope in the verification environment,
   the DKIM implementation SHOULD indicate that the signature cannot be
   verified, as distinct from considering such validation to have
   failed.
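
   One way a verifier might keep these outcomes distinct is sketched
   below in Python.  The names "Outcome" and "evaluate" are
   hypothetical, and "verify" stands in for whatever signature check
   the implementation already performs; none of these come from
   [RFC6376].

```python
from enum import Enum
from typing import Callable, Optional, Sequence


class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    CANNOT_VERIFY = "cannot verify"


def evaluate(uses_e_tag: bool,
             envelope_rcpts: Optional[Sequence[str]],
             verify: Callable[[Sequence[str]], bool]) -> Outcome:
    """Classify one signature (hypothetical helper).

    envelope_rcpts is None when no SMTP envelope is available, for
    example when the message is read back from a mailbox.
    """
    if uses_e_tag and envelope_rcpts is None:
        # No envelope: report "cannot verify" rather than a failure.
        return Outcome.CANNOT_VERIFY
    # Signatures without the tag never consult the envelope.
    rcpts = envelope_rcpts if uses_e_tag else ()
    return Outcome.PASS if verify(rcpts) else Outcome.FAIL
```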

   If the ability to validate a signature from storage (without an
   envelope) needs to be preserved, the signer can still add a second
   signature that does not use this tag and therefore does not need the
   envelope context to verify.  This, however, requires the verifier to
   understand when it is appropriate to use which signature.

   Since [RFC6376] stipulates that unknown tags are to be ignored, there
   will be a possibly substantial time period during which the tag is
   unknown to receivers.  A verifier that ignores the tag computes the
   unmodified hash, which cannot match a signature generated over the
   modified input.  Operators should therefore expect these signatures
   to fail broadly during any early deployment period, even for
   non-replay messages, and it may be some time before meaningful signal
   begins to appear.
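
   The reason for these early failures can be seen in a small Python
   sketch.  The octet strings below are placeholders chosen for
   illustration, not real canonicalized DKIM input, and SHA-256 is
   assumed as the hash algorithm.

```python
import hashlib

# Placeholder inputs (assumptions for illustration only).
recipients = b"alice@example.com\r\n"
signed_input = b"<canonicalized header fields and DKIM-Signature>"

# A signer using the tag hashes the recipients first.
signer_hash = hashlib.sha256(recipients + signed_input).digest()

# A verifier that ignores the unknown tag hashes the usual input only.
legacy_hash = hashlib.sha256(signed_input).digest()

print(signer_hash == legacy_hash)  # False
```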


   Note that this mechanism is fragile in the modern Internet message
   ecosystem.  Some scenarios that will yield false negatives with this
   method are described below.

4.1.  Recipient Mutations

   If a receiving MTA notes that one of the envelope recipients refers
   to a mailbox in a domain for which it has administrative authority,
   but is known to be an alias, it may rewrite that recipient address
   into its canonical form.  For instance, if a receiving MTA is
   officially known as the mail server for "example.com", but also
   accepts mail for its users when addressed to "example.net", it may
   alter the latter address in the envelope to refer to its canonical
   name.  This alters the recipient list, and thus alters the content
   passed to the hash algorithm when validating the signature, leading
   to a failure.

   Since domain names are case-insensitive on the Internet, a relay MTA
   might (improperly) fold the domain part of a recipient address to
   lowercase.  This too would invalidate a signature making use of this
   protocol.
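
   Both mutations can be illustrated with a short Python sketch.  The
   helper "recipients_blob" is hypothetical and follows the formatting
   steps of Section 3.2.1; the addresses are examples only.

```python
def recipients_blob(rcpts):
    """Sorted, CRLF-terminated recipient list as octets."""
    return b"".join(a.encode("ascii") + b"\r\n" for a in sorted(rcpts))


# Alias rewriting: signed for example.net, rewritten to example.com.
signed_alias = recipients_blob(["bob@example.net"])
after_rewrite = recipients_blob(["bob@example.com"])

# Case folding: only the case of the domain part changes.
signed_mixed_case = recipients_blob(["bob@Example.NET"])
after_folding = recipients_blob(["bob@example.net"])

print(signed_alias == after_rewrite)       # False
print(signed_mixed_case == after_folding)  # False
```

   In both cases the octets fed to the hash differ from those the
   signer used, so the signature cannot validate.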

4.2.  Envelope Splitting

   If a message's envelope contains recipients at domains served by
   separate MTAs, [RFC5321] compels the handling MTA to split the
   message, creating two envelopes containing identical content.  The
   first of these will be addressed to one recipient and sent on its
   way; the second will be addressed to the other and sent via its own
   route.

   Upon arrival at either DKIM verifier, the recipient list has
   effectively been altered since signing.  This alters the content
   passed to the hash algorithm when validating the signature, leading
   to a failure.
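
   The effect of splitting can be sketched in Python as follows.  The
   helpers are hypothetical, and the sketch assumes, for simplicity,
   that each recipient domain is served by its own MTA.

```python
from collections import defaultdict


def recipients_blob(rcpts):
    """Sorted, CRLF-terminated recipient list as octets."""
    return b"".join(a.encode("ascii") + b"\r\n" for a in sorted(rcpts))


def split_by_domain(rcpts):
    """Group recipients into one envelope per domain (assumption)."""
    envelopes = defaultdict(list)
    for addr in rcpts:
        domain = addr.rpartition("@")[2].lower()
        envelopes[domain].append(addr)
    return dict(envelopes)


signed_rcpts = ["alice@example.com", "bob@example.org"]
signed_blob = recipients_blob(signed_rcpts)

# After splitting, each verifier sees only its own recipient.
envelopes = split_by_domain(signed_rcpts)
mismatches = {d: recipients_blob(r) != signed_blob
              for d, r in envelopes.items()}

print(mismatches)  # {'example.com': True, 'example.org': True}
```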

   This can be avoided by arranging that no envelope ever has more than
   a single recipient, but this renders useless an important "common
   factoring" feature of SMTP.  In the case of a mailing list server
   that may need to distribute a single message to a very large number
   of recipients, this approach can impose significant compute or storage
   costs.

5.  IANA Considerations

   IANA is asked to make the following entry in the "DKIM-Signature Tag
   Specifications" sub-registry of the "DKIM Parameters" registry group:

   Type:  e


   Reference:  [this document]

   Status:  active

6.  Security Considerations

   All of the security considerations of [RFC6376] apply when applying
   the modification described here.

   A signer that is forced to generate independently signed messages for
   each recipient in a situation where large recipient lists are common
   could be exploited to mount a denial-of-service attack, simply
   because the work being done is amplified.

   The loss of the ability to verify messages signed using this tag when
   extracted from their mailboxes has an unknown security impact.
   Although DKIM intentionally supports this capability, it is not known
   whether it is widely used.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234,
              DOI 10.17487/RFC5234, January 2008,
              <https://www.rfc-editor.org/info/rfc5234>.

   [RFC5321]  Klensin, J., "Simple Mail Transfer Protocol", RFC 5321,
              DOI 10.17487/RFC5321, October 2008,
              <https://www.rfc-editor.org/info/rfc5321>.

   [RFC6376]  Crocker, D., Ed., Hansen, T., Ed., and M. Kucherawy, Ed.,
              "DomainKeys Identified Mail (DKIM) Signatures", STD 76,
              RFC 6376, DOI 10.17487/RFC6376, September 2011,
              <https://www.rfc-editor.org/info/rfc6376>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

7.2.  Informative References


   [RFC4686]  Fenton, J., "Analysis of Threats Motivating DomainKeys
              Identified Mail (DKIM)", RFC 4686, DOI 10.17487/RFC4686,
              September 2006, <https://www.rfc-editor.org/info/rfc4686>.

   [RFC5598]  Crocker, D., "Internet Mail Architecture", RFC 5598,
              DOI 10.17487/RFC5598, July 2009,
              <https://www.rfc-editor.org/info/rfc5598>.

Appendix A.  Acknowledgments

   The author wishes to thank Dave Crocker for his contributions to this
   work.

Author's Address

   Murray S. Kucherawy (editor)
   Email: superuser@gmail.com

