slim G. Hellstrom
Internet-Draft Omnitor
Intended status: Standards Track June 6, 2017
Expires: December 8, 2017

Negotiating Modality in Real-Time Communications


When negotiating language for a real-time session, users may have very specific preferences for using one modality (spoken, written or signed) over other possible but less preferred modalities. This specification introduces an indication of modality preference for use in session negotiation, in combination with an earlier specified mechanism for language preference negotiation.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on December 8, 2017.

Copyright Notice

Copyright (c) 2017 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

1. Introduction

A mechanism for negotiating human language for real-time communication is specified in [I-D.ietf-slim-negotiating-human-language]. The indication of language preference is expressed per media and specified in the SDP [RFC4566] attributes 'hlang-send' and 'hlang-recv'. Negotiation of language can take place by the answering party selecting from the language, media and direction alternatives expressed by the offering party. Languages are expressed by using language tags as specified in BCP 47 [RFC5646].

When starting a conversation in a media-rich environment, the users may have very specific preferences for using one modality (spoken, written or signed) over other possible but less preferred modalities. In traditional call establishment, the answering party is expected to start the conversation with a greeting. In the media-rich environment, the modality and language of this greeting set the expectations for which modality and language will mainly be used in the session. Deviation from this initial expectation is usually possible during the session by mutual agreement between the participants, but it may be time-consuming and cause uncertainty.

A way for the parties to indicate not only alternative languages and modalities for each communication direction in the session, but also a preference for specific modalities per direction, makes it possible to describe the desired language communication for a session more exactly while still providing information about less preferred alternatives. This specification extends [I-D.ietf-slim-negotiating-human-language] with a mechanism for indicating modality preference through a condensed notation integrated with the syntax of the language indications of [I-D.ietf-slim-negotiating-human-language].

The expected application area is wide. Traditionally, the most common modality for real-time interaction is spoken communication. In some settings, e.g. where silence is required, it may be desirable to express a preference for written communication while still leaving the possibility of traditional spoken communication open by indicating it at a lower preference level. Persons who are fully able to use both sign language and spoken language, but who do not want to force the other party to bring a sign language interpreter into the call, may find it important to indicate the sign language capability at a lower preference level and the spoken language capability at a higher level. Some persons with disabilities may strongly prefer to conduct a written conversation, while still wanting to express that a spoken conversation is possible as a last resort. Many other situations exist in the media-rich communication environment in which the modality preference indication is of value for a smooth initiation of a real-time session.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

3. Modality Preference Indication

This specification extends the use of the asterisk in the 'hlang-send' and 'hlang-recv' SDP [RFC4566] attributes introduced by [I-D.ietf-slim-negotiating-human-language].

In [I-D.ietf-slim-negotiating-human-language], the asterisk appended at the end of the attribute value indicates a preference to not get the call denied if no languages match.

This specification adds the following meaning of the asterisk:

In an offer or answer, a 'hlang-send' or 'hlang-recv' attribute value MAY have an asterisk appended as the final token. An asterisk appended to a value in an offer indicates that the caller has a higher preference for using the corresponding modality in the specified direction than for using other modalities whose values for that direction carry no asterisk. In an answer, the asterisk indicates a modality that the callee prefers to use in the session.

A user may have a clear preference for using one specific modality in a direction, while other modalities may be acceptable but less preferred. This condition MAY be indicated by appending an asterisk as the last parameter in the corresponding 'hlang-' value. Note that an asterisk appended at the end of a 'hlang-' attribute value should also be seen as a preference to not have the call denied even if no indicated languages are in common, as specified in [I-D.ietf-slim-negotiating-human-language].

When negotiating language use for a direction, languages and modalities specified together with the asterisk should be given preference in the selection.

If there is no specific preference between modalities in the same direction, this condition should be indicated by appending an asterisk to all or none of the 'hlang-' values for that direction.
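The interpretation of the asterisk described in this section can be illustrated with a minimal, non-normative Python sketch. It assumes that a 'hlang-' value is a whitespace-separated list of language tags optionally followed by an asterisk, and it also tolerates an asterisk attached directly to the last tag; the exact value syntax is defined in [I-D.ietf-slim-negotiating-human-language], not here. The names HlangValue, parse_hlang and preferred_media are hypothetical and used for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HlangValue:
    languages: tuple  # language tags in the order given
    starred: bool     # True if the value ends with an asterisk


def parse_hlang(value):
    """Parse the value part of a 'hlang-send' or 'hlang-recv' attribute.

    Assumption: tags are whitespace separated and the asterisk, if
    present, is the final token (or is attached to the final tag).
    """
    tokens = value.split()
    starred = False
    if tokens and tokens[-1] == "*":
        starred = True
        tokens = tokens[:-1]
    elif tokens and tokens[-1].endswith("*"):
        starred = True
        tokens[-1] = tokens[-1][:-1]
    return HlangValue(tuple(tokens), starred)


def preferred_media(values_by_media):
    """Return the media preferred for ONE direction.

    values_by_media maps a media name to its 'hlang-' value for that
    direction. If all or none of the values carry an asterisk, no
    modality is preferred over the others and all media are returned.
    """
    parsed = {m: parse_hlang(v) for m, v in values_by_media.items()}
    starred = [m for m, p in parsed.items() if p.starred]
    if not starred or len(starred) == len(parsed):
        return list(parsed)
    return starred
```

With this sketch, preferred_media({"audio": "en *", "video": "ase"}) yields only "audio", while a direction in which every value, or no value, carries an asterisk yields all media, reflecting the absence of a specific modality preference.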

4. Interaction with Call Denial Indication

If no 'hlang-' attribute carries an asterisk, and thus no modality preference is indicated, this should also be taken as a preference by the caller to have the call denied if the caller and the callee have no languages in common.

A caller with language capabilities in multiple media but no specific modality preference should attach the asterisk to all 'hlang-' attributes in at least one direction to indicate that the call should not be denied.

If there is a preference for denying the call when no languages match, no asterisk should be appended to any 'hlang-' attribute value. In that case, it is not possible to indicate a preferred modality at the same time.
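The combined reading of the asterisk described in this section can be summarized in a short, non-normative Python sketch. The function name interpret_offer and the dictionary layout are hypothetical, and the sketch assumes only that an asterisk, when present, is the last character of a 'hlang-' value.

```python
def has_asterisk(value):
    """True if a 'hlang-' value ends with an asterisk."""
    return value.rstrip().endswith("*")


def interpret_offer(offer):
    """Interpret the asterisks in an offer.

    offer maps (media, direction) to the 'hlang-' value string, where
    direction is "send" or "recv" from the caller's point of view.
    """
    starred = {key for key, value in offer.items() if has_asterisk(value)}

    # No asterisk anywhere: the caller prefers call denial when no
    # languages are in common, and no modality preference is expressed.
    deny_if_no_match = not starred

    preferred = {}
    for direction in ("send", "recv"):
        keys = [k for k in offer if k[1] == direction]
        marked = [k[0] for k in keys if k in starred]
        # A modality preference exists only when some, but not all,
        # values in the direction carry an asterisk.
        preferred[direction] = marked if 0 < len(marked) < len(keys) else []

    return {"deny_if_no_match": deny_if_no_match, "preferred": preferred}
```

An offer with asterisks on every value in one direction is thus read as "do not deny the call, no modality preferred", while an offer without any asterisk is read as "deny the call if no languages match".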

5. Interaction with Simultaneity Indication

- - Interaction with simultaneity indication - -

6. Examples

An offer requesting the following media streams: audio for the caller to send and receive spoken English (most preferred modality), video for the caller to send and receive American Sign Language (less preferred modality), and supplemental text. The offer also requests that the call proceed even if the callee does not support any of the languages. The offer is likely from a hearing person with knowledge of sign language.

An answer to the above offer, indicating video in which the callee will send and receive American Sign Language, because the callee has no capability for spoken English. The text and audio streams are opened as supplementary streams.

An offer requesting the following media streams: audio for the caller to send spoken French (most preferred modality), and text for the caller to send written French (less preferred modality) and to receive written French. The offer also requests that the call proceed even if the callee does not support any of the languages. Video is supplemental. The offer is likely from a hard-of-hearing person who cannot make use of received spoken language and who prefers speaking French to typing it.

An answer to the above offer, indicating text in which the callee will send written French, and audio in which the callee is prepared to receive spoken French. The video stream is opened as a supplementary stream.
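The selection behind the offer/answer pairs described in this section can be sketched in Python. This is a non-normative illustration under stated assumptions: the attribute values ("en *", "ase", "fr *", "fr") are assumed renderings of the described offers, language matching is simplified to exact tag equality, and the names build_answer, MIRROR and the data layout are hypothetical.

```python
# A caller's "send" direction corresponds to the callee's "recv" direction.
MIRROR = {"send": "recv", "recv": "send"}


def has_star(value):
    return value.rstrip().endswith("*")


def language_tags(value):
    """Language tags of a 'hlang-' value, with any asterisk dropped."""
    return value.replace("*", " ").split()


def build_answer(offer, callee):
    """Select one media and language per direction.

    offer:  list of (media, direction, value), caller's point of view.
    callee: direction -> media -> set of language tags, callee's view.
    Returns a list of (media, direction, tag) from the callee's view.
    """
    answer = []
    for caller_dir in ("send", "recv"):
        callee_dir = MIRROR[caller_dir]
        # Values with an asterisk are tried first; sorted() is stable,
        # so the offered order is otherwise kept.
        entries = sorted(
            (e for e in offer if e[1] == caller_dir),
            key=lambda e: not has_star(e[2]),
        )
        for media, _, value in entries:
            usable = callee.get(callee_dir, {}).get(media, set())
            match = next((t for t in language_tags(value) if t in usable), None)
            if match is not None:
                answer.append((media, callee_dir, match))
                break
    return answer


# First example: spoken English preferred, ASL as the alternative;
# the callee only has ASL capability.
offer_1 = [("audio", "send", "en *"), ("audio", "recv", "en *"),
           ("video", "send", "ase"), ("video", "recv", "ase")]
callee_1 = {"send": {"video": {"ase"}}, "recv": {"video": {"ase"}}}

# Second example: spoken French preferred for sending, written French
# as the alternative, written French for receiving.
offer_2 = [("audio", "send", "fr *"), ("text", "send", "fr"),
           ("text", "recv", "fr")]
callee_2 = {"send": {"text": {"fr"}},
            "recv": {"audio": {"fr"}, "text": {"fr"}}}

for media, direction, tag in build_answer(offer_2, callee_2):
    print(f"{media}: a=hlang-{direction}:{tag}")
```

For the first pair the sketch falls back to video with American Sign Language in both directions, and for the second pair it selects audio for the callee to receive spoken French and text for the callee to send written French, in line with the answers described above.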

7. Acknowledgements

Thanks to Randall Gellens for providing the background for this extension, and to Brian Rosen and Paul Kyzivat for thorough discussions and guidance.

8. IANA Considerations

9. Security Considerations

10. References

10.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.
[RFC4566] Handley, M., Jacobson, V. and C. Perkins, "SDP: Session Description Protocol", RFC 4566, DOI 10.17487/RFC4566, July 2006.
[RFC5646] Phillips, A. and M. Davis, "Tags for Identifying Languages", BCP 47, RFC 5646, DOI 10.17487/RFC5646, September 2009.

10.2. Informative References

[I-D.ietf-slim-negotiating-human-language] Gellens, R., "Negotiating Human Language in Real-Time Communications", Internet-Draft draft-ietf-slim-negotiating-human-language-10, May 2017.

Author's Address

Gunnar Hellstrom
Omnitor
Hammarby Fabriksvag 23
Stockholm, 120 30
Sweden
Phone: +46 708 204 288
EMail: