Internet-Draft | Linearized Matrix API | June 2023 |
Ralston & Hodgson | Expires 8 December 2023 |
Matrix is an existing, openly specified, decentralized, secure communications protocol able to provide a framework for instant messaging interoperability. However, the existing model can be complex to reason about for simple interoperability use cases. With modifications to the room model, Matrix can support those simpler use cases more easily.¶
This document explores "Linearized Matrix": the modified room model still backed by Matrix.¶
This note is to be removed before publishing as an RFC.¶
The latest revision of this draft can be found at https://turt2live.github.io/ietf-mimi-linearized-matrix/draft-ralston-mimi-linearized-matrix.html. Status information for this document may be found at https://datatracker.ietf.org/doc/draft-ralston-mimi-linearized-matrix/.¶
Discussion of this document takes place on the More Instant Messaging Interoperability Working Group mailing list (mailto:mimi@ietf.org), which is archived at https://mailarchive.ietf.org/arch/browse/mimi/. Subscribe at https://www.ietf.org/mailman/listinfo/mimi/.¶
Source for this draft and an issue tracker can be found at https://github.com/turt2live/ietf-mimi-linearized-matrix.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 8 December 2023.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Alongside messaging, Matrix operates as an openly federated communications protocol for VoIP, IoT, and more. The existing Matrix network uses fully decentralized access control within rooms (conversations) and is highly extensible in its structure. These features are not critically important to a strict focus on messaging interoperability, however.¶
This document describes "Linearized Matrix": a modified room model based upon Matrix's existing room model. This document does not explore how to interconnect Linearized Matrix with the existing Matrix room model - interested readers may wish to review MSC3995 [MSC3995] within the Matrix Specification process.¶
This document uses [I-D.ralston-mimi-terminology] where possible.¶
This document additionally uses the following definitions:¶
Further terms are introduced in-context within this document.¶
TODO: We should move/copy those definitions up here anyways.¶
For a given conversation/room:¶
In this diagram, Server A is acting as a hub for the other two servers. Servers B and C do not converse directly when sending events to the room: those events are instead sent to the hub which then distributes them back out to all participating servers.¶
Clients are shown in the diagram here for demonstrative purposes only. No client-server API is specified as part of Linearized Matrix, and the clients can be pre-existing or newly created for messaging. The objects given to clients are implementation-dependent, though for simplicity may be events.¶
This leads to two distinct roles:¶
OPEN QUESTION: Should we support having multiple hubs for increased trust between participant and hub? (participant can pick the hub it wants to use rather than being forced to use a single hub)¶
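The hub-and-participant topology described above can be sketched as a list of delivery hops. This is only an illustration: the function name `delivery_hops` and the server names are hypothetical, and the sketch assumes a single hub that echoes the event back to the originating participant as well.

```python
def delivery_hops(origin: str, hub: str, servers: set[str]) -> list[tuple[str, str]]:
    """Return the (from, to) hops an event takes to reach every server in the room."""
    hops = []
    # A participant never talks to other participants directly: it sends to the hub.
    if origin != hub:
        hops.append((origin, hub))
    # The hub distributes the event to all participating servers, including the
    # origin, which receives it as an echo of its own event.
    hops.extend((hub, server) for server in sorted(servers) if server != hub)
    return hops
```

For example, with `a.example` as hub, an event from `b.example` first travels to `a.example` and is then sent from `a.example` to both `b.example` and `c.example`.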
Throughout this document servers are referred to as having a "domain name" or "server name". A server name MUST be compliant with RFC 1123 (Section 2.1) [RFC1123].¶
TODO: Should we incorporate Matrix's IPv6 extension, or are we able to assume that everyone will be using non-literal hostnames?¶
TODO: Do we really need to make this case sensitive? Matrix does, but is that correct?¶
A room is a conceptual place where users send and receive events. Events are sent to a room, and all users who have sufficient access will receive that event.¶
Rooms have a single internal "Room ID" to distinguish them from other rooms:¶
!<opaque>:<domain>¶
For example, !abc:example.org.¶
The opaque portion of the room ID, called the localpart, must not be empty and must consist entirely of the characters [0-9a-zA-Z._~-].¶
The domain portion of a room ID does NOT indicate the room is "hosted" or served by that domain. The domain is used as a namespace to prevent another server from maliciously taking over a room. The server represented by that domain may no longer be participating in the room.¶
The total length (including the sigil and domain) of a room ID MUST NOT exceed 255 characters.¶
Room IDs are case sensitive.¶
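The room ID grammar above can be checked with a few lines of code. The following is a minimal sketch, not a normative parser: the name `parse_room_id` is hypothetical, and the domain is only checked for presence rather than validated against RFC 1123.

```python
import re

# Localpart alphabet taken from the text above: [0-9a-zA-Z._~-]
_ROOM_LOCALPART = re.compile(r"[0-9a-zA-Z._~-]+")
MAX_ID_LENGTH = 255  # includes the sigil and the domain


def parse_room_id(room_id: str) -> tuple[str, str]:
    """Split a room ID into (localpart, domain), raising ValueError if malformed."""
    if len(room_id) > MAX_ID_LENGTH:
        raise ValueError("room ID exceeds 255 characters")
    if not room_id.startswith("!"):
        raise ValueError("room ID must start with the '!' sigil")
    # The localpart alphabet has no ':', so the first ':' ends the localpart.
    localpart, sep, domain = room_id[1:].partition(":")
    if not sep or not domain:
        raise ValueError("room ID must contain a domain")
    if not _ROOM_LOCALPART.fullmatch(localpart):
        raise ValueError("room ID localpart is empty or has invalid characters")
    return localpart, domain
```

No case folding is applied, because room IDs are case sensitive.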
As described by [I-D.ralston-mimi-terminology], a user is typically a human who operates a client. In Linearized Matrix, all users have a User ID to distinguish them:¶
@<localpart>:<domain>¶
The localpart portion of the user ID is expected to be human-readable, MUST NOT be empty, and MUST consist solely of [0-9a-z._=-/] characters. Note that user IDs cannot contain uppercase letters in the localpart.¶
The domain portion indicates which server allocated the ID, or would allocate the resource if the user doesn't exist yet. @alice:first.example.org is a different user on a different server from @alice:second.example.org, for example.¶
The total length (including the sigil and domain) of a user ID MUST NOT exceed 255 characters.¶
User IDs are case sensitive.¶
Note: User IDs are sometimes informally referenced as "MXIDs", short for "Matrix User IDs".¶
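A user ID check following the rules above might look like the sketch below. The name `parse_user_id` is hypothetical, the domain is not validated beyond being non-empty, and the hyphen is moved to the end of the character class so that a regex engine reads it as a literal character rather than as a range.

```python
import re

# Localpart alphabet from the text above: digits, lowercase letters, and . _ = - /
# The hyphen is placed last so it is treated literally, not as a range operator.
_USER_LOCALPART = re.compile(r"[0-9a-z._=/-]+")


def parse_user_id(user_id: str) -> tuple[str, str]:
    """Split a user ID into (localpart, domain), raising ValueError if malformed."""
    if len(user_id) > 255:  # the limit includes the sigil and the domain
        raise ValueError("user ID exceeds 255 characters")
    if not user_id.startswith("@"):
        raise ValueError("user ID must start with the '@' sigil")
    # The localpart alphabet has no ':', so the first ':' ends the localpart.
    localpart, sep, domain = user_id[1:].partition(":")
    if not sep or not domain:
        raise ValueError("user ID must contain a domain")
    if not _USER_LOCALPART.fullmatch(localpart):
        raise ValueError("user ID localpart is empty or has invalid characters")
    return localpart, domain
```

Two IDs with the same localpart but different domains parse to different tuples, matching the statement that they are different users on different servers.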
Author's note: This draft assumes that an external system will resolve a phone number to a user ID, somehow, or that @18005552222:example.org will resolve to +1 800 555 2222 on a given server, or similar.¶
Each user can have zero or more devices/active clients. These devices are intended to be members of the MLS group and thus have their own key package material associated with them.¶
TODO: Do we need to define grammar and such for device IDs, or is that covered by MLS already?¶
All data exchanged over Linearized Matrix is expressed as an "event". Each client action (such as sending a message) correlates with exactly one event. All events have a type to distinguish them, and use reverse domain name notation to namespace custom events (for example, org.example.appname.eventname). Event types specified by Linearized Matrix itself use m. as their namespace.¶
When events are traversing a transport to another server they are often referred to as a Persistent Data Unit or PDU.¶
An event has many other fields:¶
* room_id (string; required) - The room ID for where the event is being sent.¶
* type (string; required) - A UTF-8 [RFC3629] string to distinguish different data types being carried by events. All event types use a reverse domain name notation to namespace themselves (for example, org.example.appname.eventname). Event types specified by Linearized Matrix itself use m as their namespace (for example, m.room.member).¶
* state_key (string; optional) - A UTF-8 [RFC3629] string to further distinguish an event type from other related events. Only specified on State Events (discussed later). Can be empty.¶
* sender (string; required) - The user ID which is sending this event.¶
* origin_server_ts (integer; required) - The milliseconds since the Unix epoch for when this event was created.¶
* hub_server (string; technically optional) - The domain name of the hub server which is sending this event to the remainder of the room. Note that all events created within Linearized Matrix will have this field set.¶
* content (object; required) - The event content. The schema of this is specific to the event type, and should be considered untrusted data until verified otherwise. Malicious servers and clients can, for example, exclude important fields, use invalid value types, or otherwise attempt to disrupt a client - receivers should treat the event with care while processing.¶
* hashes (object; required) - Keyed by hash algorithm, the content hash for the event.¶
* signatures (object; required) - Keyed first by domain name then by key ID, the signatures for the event.¶
* auth_events (array of strings; required) - The event IDs which prove the sender is able to send this event in the room. Which specific events are put here is defined by the auth events selection algorithm.¶
* prev_events (array of strings; required) - The event IDs which precede the event. Note that all events generated within Linearized Matrix will only ever have a single event ID here.¶
* unsigned (object; optional) - Additional metadata not covered by the signing algorithm.¶
Note that an event ID is not specified on the schema. Event IDs are calculated to ensure accuracy and consistency between servers. To calculate an event ID, calculate the reference hash of the event, encode it using URL-safe Unpadded Base64, and prefix it with the event ID sigil, $.¶
If both the sender and receiver are implementing the algorithms correctly, the event ID will be the same. When different, the receiver will have issues accepting the event (none of the auth_events will make sense, for example). Both sender and receiver should review their algorithm implementation to verify everything is according to the specification in this case.¶
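The final encoding step of the event ID calculation can be sketched as follows. How the reference hash itself is computed is not defined in this part of the document, so the sketch takes the raw hash bytes as input; the function name `event_id_from_reference_hash` is hypothetical.

```python
import base64


def event_id_from_reference_hash(reference_hash: bytes) -> str:
    """Encode a reference hash as an event ID: '$' + URL-safe unpadded Base64."""
    # urlsafe_b64encode uses '-' and '_' in place of '+' and '/'; stripping the
    # trailing '=' characters yields the unpadded form.
    encoded = base64.urlsafe_b64encode(reference_hash).rstrip(b"=").decode("ascii")
    return "$" + encoded
```

Because the encoding is deterministic, two servers that agree on the reference hash bytes will always derive the same event ID.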
Events are treated as JSON [RFC8259] within the protocol, but can be encoded and represented by any binary-compatible format. Additional overhead may be introduced when converting between formats, however.¶
An example may be:¶
{
  "room_id": "!abc:example.org",
  "type": "m.room.member",
  "state_key": "@alice:first.example.org",
  "sender": "@bob:second.example.org",
  "origin_server_ts": 1681340188825,
  "hub_server": "first.example.org",
  "content": {
    "membership": "invite"
  },
  "hashes": {
    "sha256": "<unpadded base64>"
  },
  "signatures": {
    "first.example.org": {
      "ed25519:1": "<unpadded base64 for signature covering whole event>"
    },
    "second.example.org": {
      "ed25519:1": "<unpadded base64 for signature covering LPDU>"
    }
  },
  "auth_events": ["$first", "$second"],
  "prev_events": ["$parent"],
  "unsigned": {
    "arbitrary": "fields"
  }
}
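A receiver could run a structural check of the listed fields before doing any further processing. The sketch below only checks presence and JSON types as given in the field list; it does not verify hashes, signatures, or authorization, and the names `REQUIRED_FIELDS`, `OPTIONAL_FIELDS`, and `check_event_shape` are hypothetical.

```python
# Field names and JSON types as listed above (object -> dict, array -> list).
REQUIRED_FIELDS = {
    "room_id": str, "type": str, "sender": str, "origin_server_ts": int,
    "content": dict, "hashes": dict, "signatures": dict,
    "auth_events": list, "prev_events": list,
}
OPTIONAL_FIELDS = {"state_key": str, "hub_server": str, "unsigned": dict}


def check_event_shape(event: dict) -> list[str]:
    """Return a list of structural problems; an empty list means the shape is fine."""
    problems = []
    for name in REQUIRED_FIELDS:
        if name not in event:
            problems.append(f"missing required field: {name}")
    for name, kind in {**REQUIRED_FIELDS, **OPTIONAL_FIELDS}.items():
        if name not in event:
            continue
        value = event[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, kind):
            problems.append(f"wrong type for field: {name}")
    for name in ("auth_events", "prev_events"):
        value = event.get(name)
        if isinstance(value, list) and not all(isinstance(i, str) for i in value):
            problems.append(f"{name} must contain only strings")
    return problems
```

The check is deliberately permissive about `content`, since its schema depends on the event type and must be treated as untrusted data either way.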
The hub server is responsible for ensuring events are linearly added to the room from all participants, which means participants cannot set fields such as prev_events on their events. Additionally, participant servers are not expected to store past conversation history or even "current state" for the room, further making participants unable to reliably populate auth_events and prev_events.¶
To avoid these problems, the participant server does not populate the following fields on events they are sending to the hub:¶
* auth_events - the participant cannot reliably determine what allows it to send the event.¶
* prev_events - the participant cannot reliably know what event precedes theirs.¶
* hashes - the hashes cover the above two fields.¶
The participant server will receive an echo of the fully-formed event from the hub once appended. To ensure authenticity, the participant server signs this "Linearized PDU" or "LPDU" using the normal event signing algorithm.¶
TODO: While a signature is great, it doesn't cover the content. We need to fix hashes to actually support an LPDU hash alongside a full-blown content hash.¶
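The set of fields a participant omits can be expressed as a small helper. This sketch only shows which fields are absent from an LPDU; signing is not shown because the signing algorithm is described elsewhere in the document, and the names `HUB_POPULATED_FIELDS` and `to_lpdu` are hypothetical.

```python
# Fields the participant leaves for the hub to populate, per the list above.
HUB_POPULATED_FIELDS = ("auth_events", "prev_events", "hashes")


def to_lpdu(event: dict) -> dict:
    """Return a shallow copy of the event without the hub-populated fields."""
    return {key: value for key, value in event.items() if key not in HUB_POPULATED_FIELDS}
```

The input is left unmodified, so a caller can keep a draft event around while sending only the LPDU form to the hub.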
State events track metadata for the room, such as name, topic, and members. State is keyed by a tuple of type and state_key, noting that an empty string is a valid state key. State in the room with the same key-tuple will be overwritten.¶
State events are otherwise processed like regular events in the room: they're appended to the room history and can be referenced by that room history.¶
"Current state" is the state at the time being considered (which is often the implied HEAD of the room). In Linearized Matrix, a simple approach to calculating current state is to iterate over all events in order, overwriting the key-tuple for state events in an adjacent map. That map becomes "current state" when the loop is finished.¶
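The loop described above can be written directly. This is a minimal sketch that assumes the events are already in room order and that a state event is recognized by the presence of a `state_key` field; the function name `current_state` is hypothetical.

```python
def current_state(events: list[dict]) -> dict[tuple[str, str], dict]:
    """Fold an ordered list of events into a map keyed by (type, state_key)."""
    state: dict[tuple[str, str], dict] = {}
    for event in events:
        # Only state events carry a state_key; an empty string is a valid key.
        if "state_key" in event:
            state[(event["type"], event["state_key"])] = event
    return state
```

Passing a prefix of the room's history yields the state at that earlier point, since the map reflects only the events seen so far.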
Linearized Matrix defines the following event types:¶
m.room.create
The very first event in the room. It MUST NOT have any auth_events or prev_events, and the domain of the sender MUST be the same as the domain in the room_id. The state_key MUST be an empty string.¶
The content for a create event MUST have at least a room_version field to denote what set of algorithms the room is using. This document as a whole describes a single room version identified as I.1.¶
Implementation note: Currently I.1 is not a real thing. Use org.matrix.i-d.ralston-mimi-linearized-matrix.00 when testing against other Linearized Matrix implementations. This room version may be updated later.¶
TODO: Describe room versions more?¶
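The requirements on the create event can be collected into a single check. The sketch below covers only the rules stated above and treats an empty array the same as an absent one for auth_events and prev_events; the name `check_create_event` is hypothetical.

```python
def check_create_event(event: dict) -> list[str]:
    """Return the stated m.room.create rules that the event violates."""
    problems = []
    if event.get("type") != "m.room.create":
        problems.append("type is not m.room.create")
    if event.get("auth_events"):
        problems.append("create event must not have auth_events")
    if event.get("prev_events"):
        problems.append("create event must not have prev_events")
    if event.get("state_key") != "":
        problems.append("state_key must be an empty string")
    # Everything after the first ':' is the domain for both ID types.
    sender_domain = event.get("sender", "").partition(":")[2]
    room_domain = event.get("room_id", "").partition(":")[2]
    if not sender_domain or sender_domain != room_domain:
        problems.append("sender domain must match room_id domain")
    if "room_version" not in event.get("content", {}):
        problems.append("content must have a room_version")
    return problems
```

An implementation testing against other Linearized Matrix implementations would additionally compare `room_version` against the version string it supports.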
m.room.join_rules
Defines whether users can join without an invite and other similar conditions. The state_key MUST be an empty string.¶
The content for a join rules event MUST have at least a join_rule field to denote the join policy for the room. Allowable values are:¶
* public - anyone can join without an invite.¶
* knock - users must receive an invite to join, and can request an invite (knock) too.¶
* invite - users must receive an invite to join.¶
TODO: Describe restricted (and knock_restricted) rooms?¶
m.room.member
Defines the membership for a user in the room. If the user does not have a membership event then they are presumed to be in the leave state.¶
The state_key MUST be a non-empty string denoting the user ID the membership is affecting.¶
The content for a membership event MUST have at least a membership field to denote the membership state for the user. Allowable values are:¶
* leave - not participating in the room. If the state_key and sender do not match, this was a kick rather than a voluntary leave.¶
* join - participating in the room.¶
* knock - requesting an invite to the room.¶
* invite - invited to participate in the room.¶
* ban - implies kicked/not participating. Cannot be invited or join the room without being unbanned first (moderator sends a kick, essentially).¶
The auth rules define how these membership states interact and what legal transitions are possible. For example, preventing users from unbanning themselves falls under the auth rules.¶
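Two of the membership rules above lend themselves to short helpers: the implied leave state and the distinction between a kick and a voluntary leave. This sketch assumes room state is held as a map keyed by (type, state_key), and the names `membership_of` and `is_kick` are hypothetical. It does not implement the auth rules that govern legal transitions.

```python
def membership_of(state: dict, user_id: str) -> str:
    """Look up a user's membership, presuming 'leave' when no event exists."""
    event = state.get(("m.room.member", user_id))
    if event is None:
        return "leave"
    return event["content"]["membership"]


def is_kick(member_event: dict) -> bool:
    """A leave whose sender differs from the affected user is a kick."""
    return (
        member_event["content"]["membership"] == "leave"
        and member_event["state_key"] != member_event["sender"]
    )
```

A ban is not reported as a kick here, since it is a separate membership value.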
m.room.power_levels
Defines what given users can and can't do, as well as which event types they are able to send. The enforcement of these power levels is determined by the auth rules.¶
The state_key MUST be an empty string.¶
The content for a power levels event SHOULD have at least the following:¶
* ban (integer) - the level required to ban a user. Defaults to 50 if unspecified.¶
* kick (integer) - the level required to kick a user. Defaults to 50 if unspecified.¶
* invite (integer) - the level required to invite a user. Defaults to 0 if unspecified.¶
* redact (integer) - the level required to redact an event sent by another user. Defaults to 50 if unspecified.¶
* events (map) - keyed by event type string, the level required to send that event type to the room. Defaults to an empty map if unspecified.¶
* events_default (integer) - the level required to send events in the room. Overridden by the events map. Defaults to 0 if unspecified.¶
* state_default (integer) - the level required to send state events in the room. Overridden by the events map. Defaults to 50 if unspecified.¶
* users (map) - keyed by user ID, the level of that user. Defaults to an empty map if unspecified.¶
* users_default (integer) - the level for users. Overridden by the users map. Defaults to 0 if unspecified.¶
TODO: Include notifications for at-room here too?¶
Note that if no power levels event is specified in the room then the room creator (sender of the m.room.create state event) has a default power level of 100.¶
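The defaults and overrides listed above can be resolved with a few lookups. This is a sketch of the lookups only, not of the auth rules that enforce them. The function names are hypothetical, and one point is an assumption rather than something stated above: when the room has no power levels event, users other than the creator are given level 0.

```python
from typing import Optional

# Defaults for unspecified fields, as listed above.
DEFAULTS = {
    "ban": 50, "kick": 50, "invite": 0, "redact": 50,
    "events_default": 0, "state_default": 50, "users_default": 0,
}


def user_level(power_levels: Optional[dict], user_id: str, creator_id: str) -> int:
    """Resolve a user's level from the content of m.room.power_levels."""
    if power_levels is None:
        # No power levels event: the creator defaults to 100.
        # Assumption: everyone else falls back to 0.
        return 100 if user_id == creator_id else 0
    users = power_levels.get("users", {})
    if user_id in users:
        return users[user_id]
    return power_levels.get("users_default", DEFAULTS["users_default"])


def required_level(power_levels: dict, event_type: str, is_state: bool) -> int:
    """Level needed to send an event type; the events map overrides the defaults."""
    events = power_levels.get("events", {})
    if event_type in events:
        return events[event_type]
    key = "state_default" if is_state else "events_default"
    return power_levels.get(key, DEFAULTS[key])


def action_level(power_levels: dict, action: str) -> int:
    """Level needed for 'ban', 'kick', 'invite', or 'redact'."""
    return power_levels.get(action, DEFAULTS[action])
```

A sender is then permitted to send an event when `user_level(...) >= required_level(...)`, subject to whatever additional conditions the auth rules impose.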
TODO: m.room.name, m.room.topic, m.room.avatar, m.room.encryption, m.room.history_visibility¶
TODO: Drop m.room.encryption and pack it into the create event instead?¶
MIMI has a chartered requirement to use MLS for encryption, and MLS requires that all group members (devices) know of all other devices. If we consider each Matrix room to have an MLS group, we encounter scenarios where the room and group membership might diverge or otherwise not be equivalent.¶
In a traditional Matrix room, membership is not managed at a per-device level but rather a per-user level. Devices are authenticated to use the room by being attached to a user. This model doesn't work in MLS, though.¶
A couple of options present themselves:¶
At this stage of drafting, it is not clear which would be preferred. Both are explored.¶
In this model, servers handle the room state on behalf of devices. This gives the server an ability to apply access control at a user level, and instruct other devices on when/how to add or remove devices from the underlying MLS group. The server does not have an ability to participate in the MLS group directly.¶
This is how traditional Matrix rooms work by handling state changes (user membership, etc) in cleartext for everyone to see. A user's devices would be tracked and added/removed from the MLS group as needed.¶
The exact rules for how a user's devices become engaged with the MLS group are not yet defined.¶
An advantage of this model over the client-side model is that the server can reduce the client's traffic by rejecting events earlier and can deal with conflicts that may arise, keeping the conversation as linear as possible for the client.¶
A clear disadvantage is that, without cross-signing or another cryptographic mechanism, the server would be able to add malicious devices to its users and therefore to the MLS group. A precise mitigation strategy is not yet defined by this document, but would involve building verifiable trust in a device before it is allowed to participate in the MLS group.¶
The existing model used by Linearized Matrix is covered by "Event Signing & Authorization" later in this document.¶
TODO: We might also need DMLS to handle some of the server-side conflicts?¶
Here, the room's state is completely managed within the MLS group. This provides a key advantage where servers become message-passing nodes (in essence), but increases implementation complexity on the clients/devices.¶
Much of this model is based around the server-side model discussed above: event authorization rules, redactions, etc still behave the same, but on the client-side instead. The server would likely be responsible for ensuring incoming events are properly signed, but otherwise leave it up to clients to accept or reject them into their internal linked list.¶
A potential consequence of this model is clients needing to implement a conflict resolution algorithm despite having linear room history. This is due to clients receiving MLS messages out of guaranteed order.¶
TODO: This could be DMLS, state res, or both.¶
TODO: This section, if we want a single canonical hub in the room. Some expected problems in this area are: who signs the transfer event? who sends the transfer event? how does a transfer start?¶
TODO: Is this section better placed in the MSC for now?¶
TODO: This section, though this is likely (should be?) to be a dedicated I-D.¶
Topics:¶
* Server discovery¶
* Publishing of signing keys¶
* Sending events between servers¶
* Media handling¶
* etc¶
Matrix currently uses an HTTPS+JSON transport for this.¶
TODO: Expand upon this section.¶
The m.* namespace likely needs formal registration in some capacity.¶
Thank you to the Matrix Spec Core Team (SCT), and in particular Richard van der Hoff, for exploring how Matrix rooms could be represented as a linear structure, leading to this document.¶