Applications Area Working Group F. Echtler Internet-Draft Munich Univ. of Applied Sciences Intended status: Informational March 7, 2011 Expires: September 8, 2011 GISpL: Gestural Interface Specification Language draft-echtler-gispl-specification-00 Abstract This document introduces GISpL, the Gestural Interface Specification Language. GISpL enables the unambiguous description of gestures used in human-computer interfaces. This includes gestures on touch and multi-touch screens, with digital pens, with hand-held controllers or in free air. A matching engine analyzes motion data produced by the input device(s) and triggers registered gestures. Status of this Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on September 8, 2011. Copyright Notice Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as Echtler Expires September 8, 2011 [Page 1] Internet-Draft GISpL March 2011 described in the Simplified BSD License. Echtler Expires September 8, 2011 [Page 2] Internet-Draft GISpL March 2011 1. Introduction This document describes GISpL (Gestural Interface Specification Language), a formal language for describing human-computer interfaces which use gestures. The term "gesture" is used in a very wide sense here, meaning any motion of the user which can be captured by an input device. 1.1. Motivation As novel types of human-computer interfaces such as multi-touch screens, digital pens, hand-held controllers or even free-air gestures become more and more common, so does the number of special- purpose applications built to interact with these devices. Using a dedicated formal language to describe the various types of gestural interaction which are possible with an application has advantages for various distinct groups: Users End users of gestural applications benefit in two ways: the behavior of such applications becomes more consistent across different platforms and the users are given the option to modify the behavior to suit their needs. Researchers When the behavior of a gestural interface is specified formally, it is easier for user interface researchers to compare different interfaces. Also, prototypical implementations of such interfaces are easier to modify and adapt during the design, development and evaluation cycle. Developers Since the formal specification enables a dedicated matching engine to perform the actual detection and matching of gestures, developers are spared tedious tasks such as constant reimplementation of basic gesture matching algorithms. 1.2. Design Considerations GISpL serves two purposes: o describe what motions the users have to execute in order to trigger a certain response o deliver information about the actual execution of these motions to the application Moreover, GISpL should be usable across a very wide range of platforms, even unconventional ones. One important example is the Echtler Expires September 8, 2011 [Page 3] Internet-Draft GISpL March 2011 Firefox web browser which has recently acquired the capability to deliver multi-touch events to web applications. Consequently, the requirements regarding GISpL are: o must be human-readable o must be machine-readable o must be supported on a wide range of platforms o must be lightweight for low-latency network transmission Therefore, the decision was made to base GISpL on JSON [RFC4627]. Compared to XML, JSON gives a better balance between readability for humans and code size. It is supported in nearly all programming languages and platforms. Echtler Expires September 8, 2011 [Page 4] Internet-Draft GISpL March 2011 2. Specification GISpL consists of three core elements: regions, gestures and features. For the full formal ABNF specifictation, please see the appendix. This section will use a relaxed syntax for easier readability, with rule names enclosed by angle braces < >. JSON-related markup ( { } , [ ] "" ) will be reproduced verbatim. 2.1. Definitions In the context of GISpL, the terms _input object_ and _input event_ will be used. An _input object_ is any physical object which is detectable by the input hardware, such as a mouse, a Wiimote or the user's hand. Input objects are classified into one of 32 categories as given in the appendix. Every input object is given a unique ID number according to the capabilities of the hardware. I.e., a uniquely identifiable tangible object always has the same ID, while anonymous touch points on an interactive surface will have IDs so that each touch point is uniquely identifiable during the duration of the touch. An _input event_ is a single spatial measurement regarding the position (and optionally, orientation, dimensions etc.) of an input object as captured by one or more of the sensors used. 2.2. Region Regions define spatial areas in which a certain set of gestures is valid and which capture motion data that falls within their boundaries. Regions are defined in one of two reference coordinate systems: o A 2D screen coordinate system where the x and y coordinates range from 0.0 to 1.0 across the physical extension of the user's screen, with the origin being in the lower left corner. This is indicated by the region flag "poly". The list of boundary points is interpreted as a closed polygon in the x-y plane. The z coordinate is not used. o A 3D metric coordinate system with an arbitrary reference point as origin (usually defined by the input device). This is indicated by the region flag "hull". The list of boundary points is interpreted as defining the convex hull of the region. Regions are managed in an ordered list. Arriving input events are checked against this list, starting from the first element. If the position of the input event falls within the boundaries of the region _and_ if the bit in the filter bitmask which corresponds to the Echtler Expires September 8, 2011 [Page 5] Internet-Draft GISpL March 2011 category of the input event is set, the the input event is captured by the region. Captured events and their history are subsequently used to check whether one or more of the gestures attached to this region have been triggered. point = [ , , ] pointlist = [ null/ *( , ) ] regionflags = "poly" / "hull" region = { "id":, "flags":, "filters":, "points":, "gestures": } 2.3. Gesture Gestures appear in two variants: either as gesture specifications (describing what motions the user has to execute for a certain effect) or as gesture events (describing a) the fact that a matching motion event just took place and b) the motion data which triggered the event). Whether a gesture object is a gesture event can be determined from looking at the flags. Should they contain the string "result", then it is a gesture event and the features composing this gesture will contain valid data in their "result" array. Otherwise, the object is a gesture specification and the features will contain valid data in their "constraints" array. When a gesture has the "oneshot" flag, then it can only be triggered once by a given set of input IDs. Repeated triggering is only possible when the set of IDs captured by the containing region changes. When a gesture has the "default" flag set, it is added to a pool of default gestures. When a gesture specification with an empty feature list is encountered, then the name is looked up in the default gesture pool and if a match is found, the feature list is copied from the default gesture. Otherwise, the empty gesture is ignored. Echtler Expires September 8, 2011 [Page 6] Internet-Draft GISpL March 2011 gesturelist = [ null/ *( , ) ] gestureflag = "oneshot" / "default" / "result" gestureflags = [ null/ (* , ) ] gesture = { "name":, "flags":, "features": } 2.4. Feature are the building blocks of gestures and are atomic mathematical properties of the raw motion data, such as the average motion vector or the diameter of the convex hull of all motion points. featurelist = [ null/ *( , ) ] featureitem = / itemlist = [ null/ *( , ) ] feature = { "type":, "filters":, "constraints":, "result": } A feature is classified by its type, a filter bitmask for input events as described in Section 2.2, a type-specific list of constraints and a type-specific list of results. Depending on whether the containing gesture is a "result" gesture or not (see Section 2.3), either the constraint list or the result list will be empty. All temporal relations (such as motion vector) are expressed relative to a time unit of one sensor frame, i.e. for a sensor setup running at 60 Hz, the temporal unit will be 16.66 ms. Features are divided into two groups. The first one are _single- match_ features, which will only generate a single result instance regardless of the number of input objects. Echtler Expires September 8, 2011 [Page 7] Internet-Draft GISpL March 2011 +----------+-------------+----------+-------------------------------+ | Feature | Constraint | Result | Description | | Type | Types | Types | | +----------+-------------+----------+-------------------------------+ | Motion | , | | Average motion vector, lower | | | | | and upper limits per | | | | | component | | | | | | | Rotation | , | | Relative rotation angle in | | | | | rad, lower and upper limits | | | | | | | Scale | , | | Relative size change, lower | | | | | and upper limits | | | | | | | Path | *(, | | Match for template path | | | ) | | | | | | | | | Count | , | | Number of input objects, | | | | | lower and upper limits | | | | | | | Delay | , | | Number of frames since first | | | | | object entered, lower and | | | | | upper limits | +----------+-------------+----------+-------------------------------+ Single-Match Features Table 1 The second group of features are _multi-match_ features, which will potentially generate one result instance for each input object (depending on the constraint values). Echtler Expires September 8, 2011 [Page 8] Internet-Draft GISpL March 2011 +-----------------+--------------+-----------+----------------------+ | Feature Type | Constraint | Result | Description | | | Types | Types | | +-----------------+--------------+-----------+----------------------+ | ObjectID | , | | Match range of | | | | | unique IDs (e.g. for | | | | | tangible objects) | | | | | | | ObjectParent | , | | Match range of | | | | | unique parent IDs | | | | | (e.g. for fingers | | | | | belonging to | | | | | specific user) | | | | | | | ObjectPosition | , | | Position, lower and | | | | | upper limits per | | | | | component | | | | | | | ObjectDimension | , | , | Object dimensions, | | | , | , | lower and upper | | | , | | limits per component | | | , | | (major axis, minor | | | , | | axis, size) | | | | | | | | | | | | ObjectGroup | , | , | Match group of input | | | , | | objects (min. count, | | | | | max. count, max. | | | | | radius). Result = | | | | | count/centroid | +-----------------+--------------+-----------+----------------------+ Multi-Match Features Table 2 Echtler Expires September 8, 2011 [Page 9] Internet-Draft GISpL March 2011 3. Runtime Behavior This section details the behavior of the gesture matching engine. The following steps are performed every time an input event has been received: 1. ... 2. ... The following steps are performed every time after a full set of input events has been received: 1. ... 2. ... Echtler Expires September 8, 2011 [Page 10] Internet-Draft GISpL March 2011 4. Examples For illustration purposes, this section gives some usage examples of GISpL. TBD Echtler Expires September 8, 2011 [Page 11] Internet-Draft GISpL March 2011 5. Security Considerations Gesture events, if intercepted while in transit over the network, may disclose private data about the users' actions. Consequently, they should preferably be transmitted over secure channels such as private networks, VPNs or SSL-encrypted links. Echtler Expires September 8, 2011 [Page 12] Internet-Draft GISpL March 2011 6. IANA Considerations This document has no actions for IANA. Echtler Expires September 8, 2011 [Page 13] Internet-Draft GISpL March 2011 7. References 7.1. Normative References [RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, January 2008. 7.2. Informative References [RFC4627] Crockford, D., "The application/json Media Type for JavaScript Object Notation (JSON)", RFC 4627, July 2006. Echtler Expires September 8, 2011 [Page 14] Internet-Draft GISpL March 2011 Appendix A. Formal GISpL Specification The following formal specification of GISpL is given in ABNF [RFC5234]. JSON-related rule definitions (e.g. , ...) are to be found in [RFC4627]. Echtler Expires September 8, 2011 [Page 15] Internet-Draft GISpL March 2011 quote = quotation-mark point = begin-array number value-separator number value-separator number end-array item = point / number itemlist = begin-array item *( value-separator item ) end-array pointlist = begin-array point *( value-separator point ) end-array gesturelist = begin-array gesture *( value-separator gesture ) end-array featurelist = begin-array feature *( value-separator feature ) end-array filters = quote "filter" quote name-separator number value-separator regionid = quote "id" quote name-separator string value-separator regionflags = quote "flags" quote name-separator ( ( quote "poly" quote ) / ( quote "hull" quote ) ) value-separator gesturename = quote "name" quote name-separator string value-separator gestureflag = ( quote "oneshot" quote ) / ( quote "default" quote ) gestureflags = quote "flags" quote name-separator begin-array gestureflag *( value-separator gestureflag ) end-array value-separator featuretype = quote "type" quote name-separator string value-separator results = quote "result" quote name-separator itemlist constraints = quote "constraints" quote name-separator itemlist value-separator feature = begin-object featuretype constraints results end-object gesture = begin-object gesturename gestureflags featurelist end-object region = begin-object regionid regionflags filters pointlist value-separator gesturelist end-object Figure 1 Echtler Expires September 8, 2011 [Page 16] Internet-Draft GISpL March 2011 Appendix B. Input Event Types TBD - see TUIO 2.0 spec Echtler Expires September 8, 2011 [Page 17] Internet-Draft GISpL March 2011 Author's Address Florian Echtler Munich Univ. of Applied Sciences Lothstr. 64 Munich 80336 Germany Phone: +49 89 1265 3736 Email: floe@butterbrot.org URI: http://floe.butterbrot.org/ Echtler Expires September 8, 2011 [Page 18]