EXTRA B. Gondwana, Ed.
Internet-Draft FastMail
Intended status: Standards Track April 20, 2018
Expires: October 22, 2018

IMAP Extension for object identifiers
draft-ietf-extra-imap-objectid-00

Abstract

This document adds new properties to IMAP mailboxes and messages to allow clients to more efficiently re-use cached data for resources which have changed location on the server.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on October 22, 2018.

Copyright Notice

Copyright (c) 2018 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


Table of Contents

1. Introduction

IMAP stores are often used by many clients. Each client may cache data from the server so that they don't need to re-download information. [RFC3501] defines that a mailbox can be uniquely referenced by its name and UIDVALIDITY, and a message within that mailbox can be uniquely referenced by its mailbox (name + UIDVALIDITY) and UID. The triple of mailbox name, UIDVALIDITY and UID is guaranteed to be immutable.

[RFC4315] defines a COPYUID response which allows a client which copies messages to know the mapping between the UIDs in the source and destination mailboxes, and hence update its local cache.

If a mailbox is successfully renamed, the client knows that the same messages exist in the destination mailbox name as previously existed in the source mailbox name.

The result is that the client which copies (or [RFC6851] moves) messages or renames a mailbox can update its local cache, but any other client connected to the same store can not know with certainty that the messages are identical, and so will re-download everything.

This extension adds new properties to a message (EMAILID) and mailbox (MAILBOXID) which allow a client to quickly identify messages or mailboxes which have been renamed by another client.

This extension also adds an optional thread identifier (THREADID) to messages, which can be used by the server to indicate messages which it has identified to be related.

2. Conventions Used In This Document

In examples, "C:" indicates lines sent by a client that is connected to a server. "S:" indicates lines sent by the server to the client.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119] when they appear in ALL CAPS. These words may also appear in this document in lower case as plain English words, absent their normative meanings.

3. CAPABILITY Identification

IMAP servers that support this extension MUST include "OBJECTID" in the response list to the CAPABILITY command.

4. MAILBOXID object identifier

The MAILBOXID is a server-allocated unique identifer for each mailbox.

The server SHOULD return the same MAILBOXID for a server with the same mailbox name and UIDVALIDITY. This is almost MUST, but weakened to allow for the possibility of loss of OBJECTID data during disaster recovery while still keeping the name and UIDVALIDITY the same.

The server MUST NOT report the same MAILBOXID for two mailboxes at the same time.

The server MUST NOT reuse the same MAILBOXID for a mailbox which does not obey all the invarients that [RFC3501] defines for a mailbox which does not change name or UIDVALIDITY.

The server MAY choose to create a MAILBOXID value in a way that does not survive RENAME (e.g. a digest of name + uidvalidity could be used), however the client then loses the major benefit of having an identifier.

4.1. New response code for CREATE

This document extends the CREATE command to have the response code MAILBOXID on successful mailbox creation.

Syntax: "MAILBOXID" SP "(" <objectid> ")"

Response code in tagged OK for successful CREATE command.

Example:

C: 3 create foo
S: 3 OK [MAILBOXID (F2212ea87-6097-4256-9d51-716686338625)] Completed
C: 4 create bar
S: 4 OK [MAILBOXID (F6352ae03-b7f5-463c-896f-d47cf8b48ee3)] Completed
C: 5 create foo
S: 5 NO Mailbox already exists

4.2. New OK Untagged Response for SELECT and EXAMINE

This document adds a new untagged response code to any successful SELECT or EXAMINE command. A server advertising OBJECTID capability MUST return this untagged response to all successful SELECT and EXAMINE commands.

Syntax: "OK" SP "[" "MAILBOXID" SP "(" <objectid> ")" "]" text

Untagged OK response to SELECT or EXAMINE.

Example:

C: 27 select "foo"
[...]
S: * OK [UIDVALIDITY 1524195797] Ok
S: * OK [MAILBOXID (F2212ea87-6097-4256-9d51-716686338625)] Ok
[...]
S: 27 OK [READ-WRITE] Completed

4.3. New attribute for STATUS

This document adds the MAILBOXID attribute to the STATUS command using the extended syntax defined in [RFC4466].

Syntax: "MAILBOXID"

The attribute in the STATUS command.

Syntax: "MAILBOXID" SP "(" <objectid> ")"

The response item in the STATUS response contains the objectid
assigned by the server for this mailbox.

Example:

C: 6 status foo (mailboxid)
S: * STATUS foo (MAILBOXID (F2212ea87-6097-4256-9d51-716686338625))
S: 6 OK Completed
C: 7 status bar (mailboxid)
S: * STATUS bar (MAILBOXID (F6352ae03-b7f5-463c-896f-d47cf8b48ee3))
S: 7 OK Completed
C: 8 rename foo renamed
S: * OK rename foo renamed
S: 8 OK Completed
C: 9 status renamed (mailboxid)
S: * STATUS renamed (MAILBOXID (F2212ea87-6097-4256-9d51-716686338625))
S: 9 OK Completed
C: 10 status bar (mailboxid)
S: * STATUS bar (MAILBOXID (F6352ae03-b7f5-463c-896f-d47cf8b48ee3))
S: 10 OK Completed

When the LIST-STATUS IMAP capability [RFC5819] is also available, the STATUS command can be combined with the LIST command to further improve efficiency. This way, the MAILBOXIDs of many mailboxes can be queried with just one LIST command.

Example:

C: 11 list "" "*" return (status (mailboxid))
S: * LIST (\HasNoChildren) "." INBOX
S: * STATUS INBOX (MAILBOXID (Ff8e3ead4-9389-4aff-adb1-d8d89efd8cbf))
S: * LIST (\HasNoChildren) "." bar
S: * STATUS bar (MAILBOXID (F6352ae03-b7f5-463c-896f-d47cf8b48ee3))
S: * LIST (\HasNoChildren) "." renamed
S: * STATUS renamed (MAILBOXID (F2212ea87-6097-4256-9d51-716686338625))
S: 11 OK Completed (0.001 secs 3 calls)

5. EMAILID object identifier and THREADID correlator

This document adds two additional data items to individual email messages within a mailbox, EMAILID and THREADID.

5.1. EMAILID identifier for identical messages

The EMAILID data item is an objectid which uniquely identifies the content of a single message. Anything which must remain immutable on a {name, uidvalidity, uid} triple must also be the same between messages with the same EMAILID.

The server SHOULD return the same EMAILID for the same UID triple. As with MAILBOXID above, this is almost a MUST, but allows for the possibility of loss of OBJECTID data in disaster recovery without having to change UIDVALIDITY.

The server SHOULD return the same EMAILID for the exact same message content in different folders after a COPY or [RFC6851] MOVE command.

The server MAY us content detection to assign the same EMAILID upon APPEND of an identical message.

5.2. THREADID identifer for related messages

The THREADID data item is an objectid which uniquely identifies as set of messages which the server believes are related in some way.

THREADID calculation is generally based on some combination of References, In-Reply-To and Subject, but the exact logic is left up to the server implementation.

The server MUST return the same THREADID for all messages with the same EMAILID.

The server SHOULD return the same THREADID for related messages even if they are in different mailboxes.

THREADID is optional, if the server is unable to calculate relationships between messages, it MUST return NIL to in all FETCH responses for the THREADID data item, and it MUST NOT match any messages for any value of THREADID.

The server MAY use the same value for both EMAILID and THREADID, for example the THREADID could be the EMAILID of the first message seen in each thread.

5.3. New Message Data Items in FETCH and UID FETCH Commands

This document specifies two new FETCH items for messages:

Syntax: EMAILID

The EMAILID message data item causes the server to return EMAILID
FETCH response data items.

Syntax: THREADID

The THREADID message data item causes the server to return THREADID
FETCH response data items.

And the following responses:

Syntax: EMAILID ( <objectid> )

The EMAILID response data item contains the server-assigned objectid
for each message.

Syntax: THREADID ( <objectid> )

The THREADID response data item contains the server-assigned objectid
for the set of related messages to which this message belongs.

Syntax: THREADID NIL

The NIL value to the THREADID response data item is returned when
the server mailbox does not support THREADID calculation.

Example:

C: 5 append inbox "20-Mar-2018 03:07:37 +1100" {733}
[...]
Subject: Message A
Message-ID: <fake.1521475657.54797@hotmail.com>
[...]
S: 5 OK [APPENDUID 1521475658 1] Completed

C: 11 append inbox "20-Mar-2018 03:07:37 +1100" {793}
[...]
Subject: Re: Message A
Message-ID: <fake.1521475657.21213@gmail.com>
References: <fake.1521475657.54797@hotmail.com>
[...]
S: 11 OK [APPENDUID 1521475658 2] Completed

C: 17 append inbox "20-Mar-2018 03:07:37 +1100" {736}
[...]
Subject: Message C
Message-ID: <fake.1521475657.60280@hotmail.com>
[...]
S: 17 OK [APPENDUID 1521475658 3] Completed

C: 22 fetch 1:* (emailid threadid)
S: * 1 FETCH (EMAILID (Md8976d99ac3275bb4e) THREADID (T4964b478a75b7ea9))
S: * 2 FETCH (EMAILID (Mdd3c288836c4c7a762) THREADID (T4964b478a75b7ea9))
S: * 3 FETCH (EMAILID (Mf2e25fdc09b49ea703) THREADID (T6311863d02dd95b5))
S: 22 OK Completed (0.000 sec)

C: 23 move 2 foo
S: * OK [COPYUID 1521475659 2 1] Completed
S: * 2 EXPUNGE
S: 23 OK Completed

C: 24 fetch 1:* (emailid threadid)
S: * 1 FETCH (EMAILID (Md8976d99ac3275bb4e) THREADID (T4964b478a75b7ea9))
S: * 2 FETCH (EMAILID (Mf2e25fdc09b49ea703) THREADID (T6311863d02dd95b5))
S: 24 OK Completed (0.000 sec)
C: 25 select "foo"

C: 25 select "foo"
[...]
S: 25 OK [READ-WRITE] Completed
C: 26 fetch 1:* (emailid threadid)
S: * 1 FETCH (EMAILID (Mdd3c288836c4c7a762) THREADID (T4964b478a75b7ea9))
S: 26 OK Completed (0.000 sec)

Example: (no THREADID support)

C: 26 fetch 1:* (emailid threadid)
S: * 1 FETCH (EMAILID (M00000001) THREADID NIL)
S: * 2 FETCH (EMAILID (M00000002) THREADID NIL)
S: 26 OK Completed (0.000 sec)

6. New Filters on SEARCH command

This extension defines the EMAILID and THREADID filters for the SEARCH command.

EMAILID <objectid>

Messages whose EMAILID is exactly the specified objectid.

THREADID <objectid>

Messages whose THREADID is exactly the specified objectid.

Example: (as if run before the MOVE above when the mailbox had 3 messages)

C: 27 search emailid Md8976d99ac3275bb4e918af4
S: * SEARCH 1
S: 27 OK Completed (1 msgs in 0.000 secs)
C: 28 search threadid T4964b478a75b7ea9
S: * SEARCH 1 2
S: 28 OK Completed (2 msgs in 0.000 secs)

7. Formal syntax

The following syntax specification uses the Augmented Backus-Naur Form (ABNF) [RFC4234] notation. Elements not defined here can be found in the formal syntax of the ABNF [RFC4234], IMAP [RFC3501], and IMAP ABNF extensions [RFC4466] specifications.

Except as noted otherwise, all alphabetic characters are case- insensitive. The use of upper- or lowercase characters to define token strings is for editorial clarity only. Implementations MUST accept these strings in a case-insensitive fashion.

capability =/ "OBJECTID"

fetch-att =/ "EMAILID" / "THREADID"

fetch-emailid-resp = "EMAILID" SP "(" objectid ")" ;; follows tagged-ext production from [RFC4466]

fetch-threadid-resp = "THREADID" SP "(" objectid ")" ;; follows tagged-ext production from [RFC4466]

objectid = 1*255(ALPHA / DIGIT / "_" / "-") ;; characters in object identifiers are case ;; significant

resp-text-code =/ "MAILBOXID" SP "(" objectid ")" ;; incorporated before the expansion rule of ;; atom [SP 1*<any TEXT-CHAR except "]">] ;; that appears in [RFC3501]

search-key =/ "EMAILID" SP objectid / "THREADID" SP objectid

status-att =/ "MAILBOXID"

status-att-value =/ "MAILBOXID" SP "(" objectid ")" ;; follows tagged-ext production from [RFC4466]

8. Implementation considerations

8.1. Assigning object identifiers

All objectid values are allocated by the server.

In the interests of reducing the possibilities of encoding mistakes, objectids are restricted to a safe subset of possible byte values, and in order to allow clients to allocate storage, they are restricted in length.

An objectid is a string of 1 to 255 characters from the following set of 64 codepoints. a-z, A-Z, 0-9, '_', '-'. These characters are safe to use in almost any context (e.g. filesystems, URIs, IMAP atoms).

For maximum safety, servers SHOULD also follow defensive allocation strategies to avoid creating risks where glob completion or data type detection may be present (e.g. on filesystems or in spreadsheets). In particular it is wise to avoid:

A good solution to these issues is to prefix every ID with a single alphabetical character.

8.2. Interaction with special cases

The case of RENAME INBOX may need special handling for unique ids.

It is advisable (though not required) to have MAILBOXID be globally unique, but they it is only required to be unique within a single server.

8.3. Ideas for implementing object identifiers

This section contains non-normative suggestions.

Ideas for implementing MAILBOXID:

Ideas for implementing EMAILID:

Ideas for implementing THREADID:

There is a need to index and look up reference/in-reply-to data at message creation to efficiently find matching messages for threading. Threading may be either across folders, or within each folder only. The server has significant leeway here.

9. Future considerations

This extension is intentionally defined to be compatible with the data model in [I-D.ietf-jmap-mail].

A future extension could be proposed to give a way to SELECT a mailbox by MAILBOXID rather than name.

An extension to allow fetching message content directly via EMAILID and message listings by THREADID could be proposed.

10. IANA Considerations

The IANA is requested to add "OBJECTID" to the "IMAP Capabilities" registry located at <http://www.iana.org/assignments/imap-capabilities>.

11. Security Considerations

It is strongly advised that servers generate OBJECTIDs which are safe to use as filesystem names, and unlikely to be auto-detected as numbers. See implementation considerations.

The use of a digest for ID generation may be used as proof that a particular sequence of bytes was seen by the server.

12. Changes

To be removed by the editor before publication

12.1. 01

13. Acknowledgments

The EXTRA working group at IETF.

The gmail team's X-GM-THRID and X-GM-MSGID implementation.

14. References

14.1. Normative References

[I-D.ietf-jmap-mail] Jenkins, N., "JMAP for Mail", Internet-Draft draft-ietf-jmap-mail-04, March 2018.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.
[RFC3501] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION 4rev1", RFC 3501, DOI 10.17487/RFC3501, March 2003.
[RFC4234] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", RFC 4234, DOI 10.17487/RFC4234, October 2005.
[RFC4315] Crispin, M., "Internet Message Access Protocol (IMAP) - UIDPLUS extension", RFC 4315, DOI 10.17487/RFC4315, December 2005.
[RFC4466] Melnikov, A. and C. Daboo, "Collected Extensions to IMAP4 ABNF", RFC 4466, DOI 10.17487/RFC4466, April 2006.
[RFC5819] Melnikov, A. and T. Sirainen, "IMAP4 Extension for Returning STATUS Information in Extended LIST", RFC 5819, DOI 10.17487/RFC5819, March 2010.
[RFC6851] Gulbrandsen, A. and N. Freed, "Internet Message Access Protocol (IMAP) - MOVE Extension", RFC 6851, DOI 10.17487/RFC6851, January 2013.

14.2. Informative References

[RFC4122] Leach, P., Mealling, M. and R. Salz, "A Universally Unique IDentifier (UUID) URN Namespace", RFC 4122, DOI 10.17487/RFC4122, July 2005.

Author's Address

Bron Gondwana (editor) FastMail Level 2, 114 William St Melbourne, VIC 3000 Australia EMail: brong@fastmailteam.com URI: https://www.fastmail.com