XCON WG C. Jennings
Internet-Draft Cisco Systems
Expires: August 9, 2004 B. Rosen
Marconi
February 9, 2004
Media Mixer Control for XCON
draft-jennings-xcon-media-control-00
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on August 9, 2004.
Copyright Notice
Copyright (C) The Internet Society (2004). All Rights Reserved.
Abstract
Conference mixers have many controls that change how the media is
combined for each participant in the conference. There is a need to
describe these to the clients connected to a centralized
conference so that the clients can render a user interface and allow
the user to manipulate them.
This work is very early and far from complete. This draft sketches
the outline of a solution for consideration. It is being discussed on
the xcon@ietf.org mailing list.
Jennings & Rosen Expires August 9, 2004 [Page 1]
Internet-Draft Media Mixer Control February 2004
Table of Contents
1. Conventions
2. Introduction to the Problem
2.1 Non Problems
3. Overview
3.1 Semantic information in a Conference
3.2 The Protocol
3.3 Templates
3.4 Parameters
3.5 Controls
3.6 Roles
4. Introductory Example
4.1 Simple Audio
5. Names and terminology
5.1 Templates
5.2 Participants
5.3 Streams
5.3.1 Stream Types
5.3.2 Stream URLs
5.3.3 Stream Priority
5.4 Roles
5.5 Controls
5.6 Parameters
6. Solution
6.1 Templates
6.1.1 Parameters
6.1.2 Roles
6.1.3 Streams
6.1.4 Controls
6.1.5 Conference State
6.1.6 Transport Protocol
6.2 Controls
6.2.1 Requirements
6.2.2 Strings
6.2.3 Integer
6.2.4 Boolean
6.2.5 Selection
6.2.6 Multiple Selection
6.2.7 Frame
7. Examples
7.1 Audio Video Presentation
8. Template Registry
9. Comparison to other solutions
10. CPCP vs. MPCP vs. CCP vs. MCP
11. IANA
12. Security
13. Acknowledgments
Normative References
Informative References
Authors' Addresses
Intellectual Property and Copyright Statements
1. Conventions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC-2119 [1].
2. Introduction to the Problem
This work tries to solve the problem of allowing a conference
participant to manipulate the media flow in a mixer. It defines a
protocol between the end user's software manipulating the conference
and the centralized conference mixer. This needs to be rich enough
for a mixer to express what information it wants from a client yet
simple enough to allow the client to render a useful user interface
to the user. This work takes into account that real mixers have
constraints on what media flows are possible and that UIs have
buttons, knobs, etc., that users manipulate. The goal is for a
conferencing end point made by one vendor to work with mixers or
conference systems made by another vendor.
2.1 Non Problems
There are several topics that are completely internal to the
conference systems and are out of scope for this work. These
include:
How the focus manipulates the mixer.
How one describes what a mixer is capable of doing.
3. Overview
When a conference is created, it is instantiated from a template. The
template describes what controls are available for the client to
manipulate the media. The conference also describes roles that the
client can take on, such as Moderator. The template can have
parameters that are set when it is instantiated to allow one template
to describe variations of similar flow models.
This document describes the templates and ways for the client to
understand and manipulate the media in the conference. It allows for
the following:
A conference consists of several participants and multiple streams
of media flowing between the participants and the mixer.
Sidebars are mini conferences that are just like conferences
except that a sidebar cannot itself contain sidebars.
Clients can discover the template chosen for use in a conference,
and the values of the parameters set for the conference.
Clients can discover the available streams in a conference.
Clients can send media on a participant stream and receive media
on a mixer stream.
Clients can discover the Participants in a conference and their
role (this is more conference policy than media policy).
Clients can join a conference as a participant and assume a
particular role.
Conferences, Streams, and Participants can have controls that
manipulate the media sent and received.
The role of the participant will control what view of the
conference they have and which media streams they can manipulate.
3.1 Semantic information in a Conference
The conference has a list of Participants. Each Participant has a set
of Controls that he can manipulate. Each conference has a list of
sidebars. Each conference has a list of Streams. Each Stream has
attributes such as name, type, priority and list of contributing
participants.
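
The relationships above can be sketched as a small data model. This
is only an illustration: the class and field names are hypothetical
and are not defined by this draft, and the contents of a Control are
reduced to a name and a value.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical names throughout; the draft does not define this model.

@dataclass
class Stream:
    name: str
    type: str                      # e.g. "audio", "video", "text"
    priority: float                # 0 to 1, see Section 5.3.3
    contributors: List[str] = field(default_factory=list)

@dataclass
class Participant:
    name: str
    # Control name -> current value, e.g. {"mute": False}
    controls: Dict[str, object] = field(default_factory=dict)

@dataclass
class Conference:
    participants: List[Participant] = field(default_factory=list)
    streams: List[Stream] = field(default_factory=list)
    # Sidebars are conferences that contain no further sidebars.
    sidebars: List["Conference"] = field(default_factory=list)

conf = Conference(
    participants=[Participant("Alice", {"mute": False}),
                  Participant("Bob", {"mute": False})],
    streams=[Stream("primary", "audio", 1.0, ["Alice", "Bob"])],
)
```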
3.2 The Protocol
The protocol between the client and the conference server allows the
client to get the semantic information in the conference, find out
when it changes, and make changes to it. It's probably something like
XCAP. [TODO add ref]
3.3 Templates
Templates define a model for the reception, manipulation and
transmission of streams. A template provides enough information that
the client can intelligently render a useful GUI to the end user to
manipulate the model. There is a registry of well known templates,
but a conference server can define new ones. A convener can find all
the templates a conference server supports and select one to use when
creating the conference.
A template for a very basic audio conference, for example, may
indicate that there is one audio stream for each participant, and one
output mixer stream named "primary". Each participant in the stream
has a single binary control called "Mute". There is only one Role
that can be used, called "participant".
3.4 Parameters
Parameters are variables in the template that are set when the
conference is created. For example, in the audio conference, the
maximum number of participants might be a parameter. If the value
is set to 10 when the conference is instantiated, then up to 10
participant streams can be accepted into the mixer. The template can
indicate the valid range for max number of participants, perhaps from
2 to 128.
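
As a rough sketch of how such a parameter might behave, the following
models a ranged integer parameter that is fixed at instantiation and
then used for admission. The class and function names are
hypothetical; only "max-participants" and the 2 to 128 range come
from the text above.

```python
from dataclasses import dataclass

@dataclass
class IntParameter:
    """Hypothetical model of a ranged integer template parameter."""
    name: str
    min: int
    max: int

    def instantiate(self, value: int) -> int:
        # The value is checked once, when the conference is created.
        if not (self.min <= value <= self.max):
            raise ValueError(
                f"{self.name}={value} outside {self.min}..{self.max}")
        return value

def can_accept(current_count: int, limit: int) -> bool:
    """True if one more participant stream fits under the limit."""
    return current_count < limit

max_participants = IntParameter("max-participants", 2, 128)
limit = max_participants.instantiate(10)
```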
3.5 Controls
Controls are variables participants may manipulate to control the
media streams of the conference. Conferences can have controls,
participants in a conference can have controls, and streams in a
conference can have controls. Controls can also be implicitly created
by stream action, for example a selector control based on the loudest
speaker. Controls have a name and a value. Controls are defined in
the template.
3.6 Roles
Participants in a conference can take on different Roles that change
what controls they may manipulate. The template defines what Roles
are available for the client. The moderator (which itself is a role)
can change the role of a particular participant.
4. Introductory Example
4.1 Simple Audio
The client selects the basic audio template that looks like:
The client retrieves this template and uses it to create a conference
where it sets the max-participants to 10. Alice and Bob join this
conference and the conference server tells Bob about the state of the
conference media. There is only one role "participant". Each
participant contributes one input stream. There is also an output
stream per participant. There is a single control, called mute, for
each participant.
After Alice and Bob have joined, the conference server informs Bob
that the current state of the conference is as shown in the XML
below.
10
0 0
There are two participants, Alice and Bob, who both contribute input
streams and receive Mix streams and neither is muted.
Bob's client decides to change the Mute state for its audio stream
and sends the following to the conference server to change the state
of the conference.
1
A key part of this is that Bob's client may have known about this
basic audio template and what the semantics of the "mute" control
implied. The client may have connected this up with a button of the
client's that was labeled mute. On the other hand, Bob's client may
not have known anything about this template and simply rendered a
button on the screen and labeled it "mute" with no idea what this
would do. A third client may not have been able to deal with the
control at all and may have just ignored it. Clearly the user
interface can be better if the client understands the semantics of
what the template means, but the user interface is still functional
when the client does not.
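
The three client behaviors described above could be sketched as a
simple lookup with fallbacks. The template name "basic-audio", the
widget labels, and the set of renderable types are assumptions made
for illustration and are not defined by this draft.

```python
# Hypothetical table of (template, control) pairs whose semantics this
# client understands, mapped to a native user interface element.
KNOWN_SEMANTICS = {("basic-audio", "mute"): "native-mute-button"}

# Hypothetical set of control types this client can draw generically.
RENDERABLE_TYPES = {"boolean", "integer", "string"}

def render_plan(template: str, control_name: str, control_type: str) -> str:
    """Decide how a client presents a control it has been told about."""
    native = KNOWN_SEMANTICS.get((template, control_name))
    if native is not None:
        # Case 1: the client knows what the control means.
        return native
    if control_type in RENDERABLE_TYPES:
        # Case 2: unknown semantics, but the type can still be drawn
        # and labeled with the control's name.
        return f"generic-{control_type}-widget:{control_name}"
    # Case 3: the client cannot handle the control and ignores it.
    return "ignored"
```

A client that recognizes the template gets the better interface, but
the generic fallback keeps the control usable.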
5. Names and terminology
5.1 Templates
Templates contain a list of streams, roles for participants,
parameters that need to be set, and controls for the conference.
5.2 Participants
Participants are the logical user entities participating in a
conference.
5.3 Streams
A stream is a named stream of media. An example is a simple audio
conference with 6 participants and a mixer that mixes the loudest
three. Each participant contributes an input stream. There is a
single logical output stream, but every participant gets a "custom"
version of this stream, because, in normal mixers, each participant
can hear all inputs except his own. This is commonly referred to as
"mix-minus". If the output stream also has a control (mute), the
output streams for each participant may also vary depending on the
state of the control.
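
The six-participant example can be worked through in a few lines.
This sketch assumes one possible reading of the text: the mixer picks
the three loudest inputs overall and then removes each listener's own
input from that listener's output. A real mixer might instead pick
the three loudest inputs other than the listener. The participant
names and level values are made up.

```python
from typing import Dict, List

def loudest(levels: Dict[str, float], n: int = 3) -> List[str]:
    """Return the names of the n loudest contributors, loudest first."""
    return sorted(levels, key=lambda name: levels[name], reverse=True)[:n]

def mix_minus(levels: Dict[str, float], n: int = 3) -> Dict[str, List[str]]:
    """For each participant, list the inputs mixed into their output."""
    selected = loudest(levels, n)
    return {p: [s for s in selected if s != p] for p in levels}

# Made-up input levels for six participants.
levels = {"Alice": 0.9, "Bob": 0.2, "Carol": 0.7,
          "Dave": 0.1, "Eve": 0.5, "Frank": 0.3}
outputs = mix_minus(levels)
```

Here Alice, who is among the loudest three, hears only Carol and Eve,
while Bob hears all three selected inputs.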
Streams all have a type, a name, a direction (in or out), one or more
URLs, and a priority. The URL is the source or sink of the stream.
The priority indicates how important this particular stream is to the
conference and the type indicates the type of media carried in this
stream.
Streams have types. These correspond to the major MIME types of the
media they send.
5.3.1 Stream Types
5.3.1.1 Audio
Streams originate as participant contributions (dir="in") that are
mixed using some kind of algorithm. Intermediate streams may be
created, which are subsequently mixed with other streams yielding
streams which are sent to participants (dir="out"). Controls
commonly available on audio streams include input or output faders
(volume controls), stereo balance, and mute.
5.3.1.2 Video
Streams originate as participant contributions (dir="in") that are
combined with some kind of algorithm. Intermediate streams may be
created, which are subsequently combined with other streams yielding
streams which are sent to participants (dir="out"). Controls
commonly available on video streams might include selectors for
choosing a tiling format, selectors which input streams appear on
output tiles, and video mutes.
5.3.1.3 Text
Streams originate as participant contributions (dir="in") (Instant
Messages). Messages from all participants are combined using some
algorithm. Intermediate streams may be created, which are
subsequently combined with other text streams yielding streams which
are sent to participants (dir="out").
5.3.1.4 Application
At a minimal level, this consists of a URL that defines the
application. Many systems will simply update an http URL that fetches
an HTML page that shows the current presentation.
5.3.2 Stream URLs
Streams have URLs that specify the source or sink of the stream.
These would typically be a SIP, H.323 or XMPP URL.
5.3.3 Stream Priority
Streams have a priority from 0 to 1. Zero indicates that a client, by
default, should not play/display this stream unless the user
specifically requests it. A priority of 1 indicates that, by default,
the client should render this stream and should warn the user if it
cannot. Other values only define an ordering, and clients should
attempt to use their resources to display the higher priority streams
before the lower.
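
One way a client might apply these rules is sketched below. The
function name, the tuple representation of a stream, and the notion
of a fixed rendering "capacity" are assumptions for illustration
only.

```python
from typing import List, Tuple

def plan_rendering(streams: List[Tuple[str, float]],
                   capacity: int) -> Tuple[List[str], List[str]]:
    """Pick which streams to render by default.

    streams  -- (name, priority) pairs, priority between 0 and 1
    capacity -- how many streams this client can render at once
    Returns (rendered names, names to warn the user about).
    """
    # Priority 0 streams are shown only on explicit user request.
    candidates = [s for s in streams if s[1] > 0]
    # Higher priority first; equal priorities keep their given order.
    candidates.sort(key=lambda s: s[1], reverse=True)
    rendered = [name for name, _ in candidates[:capacity]]
    # Priority 1 streams that did not fit should trigger a warning.
    warnings = [name for name, p in streams
                if p == 1 and name not in rendered]
    return rendered, warnings
```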
5.4 Roles
Roles are defined as part of Conference Policy but are used here so
that the Media Policy can define separate streams and controls
depending on role. Roles are defined in the template. Some
templates may allow a participant to take on more than one role at a
time. Each template must define a role named "participant", which is
the default role. "Moderator" is a typical role, as is
"Floor-Holder", but templates do not intrinsically define or require
such roles.
5.5 Controls
Controls manipulate the state of the conference while it is
instantiated. All controls have a name, a type, a current value and
permissions that indicate whether or not the current client can
modify them. They may also have, optionally, a min and max value.
A control can be defined as being part of a role. In that case, all
participants who assume that role have an instance of the control. A
control may also be defined as part of a stream, in which case all
contributors of that stream (dir="in") have an instance of the
control, or all sinks of the stream (dir="out") have an instance of
the control. There can be global controls, which are available to
all participants. Implicit controls extract values from streams (or
other controls), such as choosing video inputs based on loudest
speakers.
5.6 Parameters
Parameters are variables that modify the function of the template.
They are fixed when the conference is instantiated. Parameters allow
a single template definition to describe a range of possible mixer
capabilities.
Parameters have a name, a type, a value and, optionally, a min and
max value.
6. Solution
6.1 Templates
A template is an XML document. The template definition includes a
name, which is a string, for example:
6.1.1 Parameters
The parameters in the templates customize a generic template for a
specific conference. Parameters have name, type, value, and
optionally min/max. Parameters are defined in the template
description. Only conveners can set template parameters.
One typical template parameter is "max-participants". When the CS
generates the template for the client, it can customize the min and
max value of this parameter to match what it is capable of. When the
client instantiates the template and creates the conference, it can
specify the value that has been requested. The value typically
represents the limits the mixer is capable of. Resource availability
may limit the actual value that can be achieved.
Parameter names are strings.
Parameter Types:
Integer
Real
Enumeration
String
Values must conform to the declared type. Min and Max, if defined,
must also conform to the declared type.
Example:
75
6.1.2 Roles
Templates define all the Roles that a participant can take and
(optionally) the max number of participants of each role. Each role
is defined in a role element. A Role element includes a name and
optionally a "max-participants" value. Role elements may also
contain stream elements, which define per-participant-in-role
streams.
Example:
6.1.3 Streams
Templates also define all the streams available. A stream element has
a name, a type, a direction ("in" or "out"), priority and URL.
Certain streams may actually be a set of streams, for example, one
per participant. A specific member of the set can be referenced
using an array notation with square brackets. For example, if an
input stream is available named foo, and there is a participant named
"Bob", then foo["Bob"] would be the name of the foo stream Bob
contributes. If a stream is defined within a role element, the stream
is a set of streams, one per participant in the role. If a stream is
defined in more than one role with the same name, the stream set is
the same, and participants in any roles that have that stream defined
with that name contribute/sink a stream to the set.
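
The stream-set rule can be illustrated with a small sketch that
builds the sets from role definitions. The role "moderator" and the
stream name "cue" are hypothetical; "foo" and "Bob" follow the
example in the text. The data structures are assumptions, not part of
this draft.

```python
from collections import defaultdict
from typing import Dict, List

def build_stream_sets(role_streams: Dict[str, List[str]],
                      participant_roles: Dict[str, List[str]]
                      ) -> Dict[str, Dict[str, str]]:
    """Map stream name -> participant -> member reference.

    role_streams      -- role name -> stream names defined in that role
    participant_roles -- participant name -> roles the participant holds
    """
    sets: Dict[str, Dict[str, str]] = defaultdict(dict)
    for participant, roles in participant_roles.items():
        for role in roles:
            for stream in role_streams.get(role, []):
                # Same stream name in several roles means one shared set.
                sets[stream][participant] = f'{stream}["{participant}"]'
    return dict(sets)

role_streams = {"participant": ["foo"], "moderator": ["foo", "cue"]}
participant_roles = {"Bob": ["participant"], "Alice": ["moderator"]}
stream_sets = build_stream_sets(role_streams, participant_roles)
```

Because "foo" is defined in both roles, Alice and Bob land in the
same set, while "cue" contains only Alice.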
The URL is typically not given a value in the template definition.
The mixer assigns URL values as participants assume roles. Most
implementations would not allow the URL to be changed by the media
policy mechanism. The value of the URL would be included in the
media policy conference state document.
Example:
6.1.4 Controls
A control can be inside the template, participant, or stream. The
control will apply to the appropriate context. By including stream
definitions in multiple roles that have the same name, different
controls can be provided to different roles affecting streams
contributed or sunk from multiple roles. For example, a moderator
may be given a set of input volume controls controlling a mix, and
every participant can be given an output master mix control for the
output stream sent to him.
6.1.5 Conference State
Conference state can be requested by any participant. A document
will be returned elucidating the complete current conference state,
which would contain all the participants, all the streams, and the
values of all the controls. The form of the document mirrors the
template definition. The conference can also contain sidebars.
6.1.5.1 Conference State Update
The client can attempt to change the state of various controls in the
CS by sending a document that contains just the things it wants to
change.
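
A partial update of this kind can be thought of as a recursive merge
of the update into the current state. The sketch below uses nested
dictionaries as a stand-in for the state document; the key names are
hypothetical, and permission checks (the server may refuse a change)
are deliberately left out.

```python
import copy

def apply_update(state: dict, update: dict) -> dict:
    """Return a new state in which only the keys named in update change."""
    result = copy.deepcopy(state)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Descend so that sibling entries are left untouched.
            result[key] = apply_update(result[key], value)
        else:
            result[key] = value
    return result

# Hypothetical state: two participants, neither muted.
state = {"participants": {"Alice": {"mute": 0}, "Bob": {"mute": 0}}}
# Bob's client sends only the value it wants to change.
update = {"participants": {"Bob": {"mute": 1}}}
new_state = apply_update(state, update)
```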
6.1.5.2 Change Notification
The client can request that conference state be automatically sent
when it changes.
6.1.6 Transport Protocol
TODO: Need to define how the information is sent between the client
and the conference server. XCAP?
6.2 Controls
6.2.1 Requirements
Controls need to collect information, which can be classified into
several types. It should be possible to provide default values, a
name for the control and the text it displays, help text, an
indication of whether a value is required, and an indication of
whether the value is editable. It should be possible to express
constraints on the form an input can take by specifying a minimum or
maximum for types where that makes sense, or by specifying a regular
expression that must be satisfied. For numeric values in a
constrained range, it should be possible to provide an increment
value used by the control. For strings, it should be possible to
indicate that they should not be displayed as they are entered, for
things like passwords. It should be possible to internationalize any
text that is displayed to the user.
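
Several of these requirements can be sketched as input checks on two
control types. All class and field names are hypothetical. Treating
the increment as a constraint on accepted values is an assumption of
this sketch; the text only says the increment is used by the control,
which may mean no more than a step size in the user interface.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class IntegerControl:
    """Hypothetical integer control with optional range and step."""
    name: str
    minimum: Optional[int] = None
    maximum: Optional[int] = None
    increment: Optional[int] = None
    editable: bool = True

    def accepts(self, value: int) -> bool:
        if not self.editable:
            return False
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        if self.increment:
            # Assumption: values must sit on the step grid from minimum.
            base = self.minimum or 0
            return (value - base) % self.increment == 0
        return True

@dataclass
class StringControl:
    """Hypothetical string control with an optional pattern."""
    name: str
    pattern: Optional[str] = None
    required: bool = False
    secret: bool = False    # do not echo input, e.g. for passwords

    def accepts(self, value: str) -> bool:
        if value == "":
            return not self.required
        if self.pattern is not None:
            return re.fullmatch(self.pattern, value) is not None
        return True

volume = IntegerControl("volume", minimum=0, maximum=100, increment=5)
pin = StringControl("pin", pattern=r"\d{4}", required=True, secret=True)
```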
There are control types for:
Strings
Multi-line Strings
Integer
Real
Boolean
Date
Time
Date Time
URI
File Selection
Select Single
Select Multiple
If an unknown control is encountered, it should be treated as a
string type. The