T2TRG                                                  Hong, Choong Seon
Internet-Draft                                      Kyung Hee University
Intended status: Standards Track                        Minh, Nguyen H N
Expires: August 09, 2020                            Kyung Hee University
                                                      Pandey, Shashi Raj
                                                    Kyung Hee University
                                                         Chit Wutyee Zaw
                                                    Kyung Hee University
                                                           Seung Il Moon
                                                    Kyung Hee University
                                                           December 2018

             Distributed fault management for IoT Networks
                       draft-hongcs-t2trg-dfm-00

Abstract

   Recent advances in the Internet of Things (IoT) have increased the
   use of sensing technologies for IoT applications. However,
   monitoring sensor nodes remains challenging in distributed remote
   environments, especially wireless ones. In contrast to the
   conventional centralized mechanism, Fog Computing plays an essential
   role in a scalable IoT system. A Fog Node can control and monitor
   the devices in its subdomain and perform aggregation tasks to
   support the central server in the cloud. Since node faults can
   strongly affect the performance and accuracy of most IoT analysis
   applications, a fault detection mechanism should be integrated into
   IoT Networks. Accordingly, faulty nodes can be detected from their
   sensory values in a given domain and replaced by other available
   nodes in the same domain through a distributed fault detection and
   node replacement mechanism.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

Hong, et al.
Expires August 09, 2020                                         [Page 1]

Internet-Draft          Fault Management for IoT           December 2018

   This Internet-Draft will expire on August 09, 2020.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology and Requirements Language
   2.  Communication Process
     2.1.  Constrained Application Protocol (CoAP) message exchange
     2.2.  Network Setup
     2.3.  Message description in Fiesta-IOT
   3.  Distributed Fault Management
     3.1.  Abnormal Value Fault Detection
     3.2.  Sensor Replacement for Fault Sensors
   4.  IANA Considerations
   5.  Security Considerations
   6.  References
     6.1.  Normative References
     6.2.  Informative References
   Authors' Addresses

1.  Introduction

   IoT Networks are composed of massive numbers of small, low-cost
   sensor nodes deployed in a scattered manner.
   Using IoT nodes, the sensory data can be collected for IoT
   applications through fog nodes. Accordingly, the central server and
   the fog node need a fault detection mechanism to monitor the sensor
   nodes in their domain. Failed nodes may affect the quality of
   service (QoS) of IoT analysis applications. Fault detection is an
   important feature of IoT management systems, since faults in IoT
   Networks occur often for the following reasons:

   .  Sensor nodes can fail because massive numbers of low-cost sensor
      nodes are often deployed on low-cost IoT platforms.

   .  Critical applications are very sensitive to the quality of
      sensory values.

   .  Faults can occur due to battery depletion in battery-powered
      nodes.

   .  The wireless link can be disconnected, so that the sensory values
      cannot be updated at the central server.

   Faults in the IoT domain can be classified into two types, as in
   [a]:

   .  A 'hard fault' occurs when a sensor node cannot communicate with
      the monitoring server (e.g., because of a failure of the
      communication module, energy depletion of the node, the node
      moving out of the communication range of the entire mobile
      network, and so on).

   .  A 'soft fault' means that the failed node can still communicate
      with the monitoring server, but the data sensed or transmitted is
      not correct.

1.1.  Terminology and Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Communication Process

   The machine-to-machine (M2M) interaction model of the Constrained
   Application Protocol (CoAP), which is similar to the client/server
   model of HTTP, is adopted. This method provides a flexible
   interaction environment for handling message exchanges between
   client and server nodes. Unlike HTTP, messages are exchanged
   asynchronously over UDP.
   One complete message exchange for the application is handled in
   three stages. In Stage-1, the sensor node (client) initiates the
   device registration process by forwarding its device information in
   JSON format (see Figure 1). In Stage-2, the server stores the device
   information in TDB and triggers an application-driven control
   message (Start/Stop observation). In Stage-3, if the sensor node is
   not already sending observation data, it publishes the sensory data
   (e.g., temperature, humidity) to the server upon receiving a Start
   control message. The sensor node may also be required to stop
   sending observation data; this is triggered by the application
   running on the server.

          +---------------+                     +------------+
          |  Sensor Node  |                     |   Server   |
          |   (Client)    |                     |            |
          +---------------+                     +------------+
                  |                                   |
                 ,|                                   |
                 ||      +---------------------+      |
     Stage-1     ||      | device registration |      |
     process     ||      |       process       |      |
                 ||      +---------------------+      |
                 `|                                   |
                  |                                   |
          ----------------------------------------------------
                  |                                   |
                  |       +------------------+        |`
                  |       | Control message  |        || Stage-2
                  |       |     exchange     |        ||
                  |       +------------------+        |,
                  |                                   |
          ----------------------------------------------------
                 ,|                                   |
                 ||      +---------------------+      |
     Stage-3     ||      | observation message |      |
     process     ||      |       passing       |      |
                 ||      +---------------------+      |
                 `|                                   |

                 Figure 1: Basic communication process

2.1.  Constrained Application Protocol (CoAP) message exchange

   CoAP messages are exchanged asynchronously between CoAP endpoints
   [b]. In M2M interaction with a CoAP implementation, nodes act in
   both server and client roles. Using a Method Code, a client sends a
   CoAP request for a resource (identified by a URI) on a server.
   Correspondingly, the server uses a Response Code to send a response,
   which may include a resource representation. The CoAP message
   exchange with a JSON payload is illustrated in Figure 2. A Fog Node
   acts as an agent to facilitate the distributed scenario.
   The interaction between the Sensor Node and the Server is managed by
   the Fog Node, or can proceed directly as in Figure 1. The Sensor
   Node sends a registration message to register itself. It then waits
   for the control message before it starts sending the observation
   data.

       +---------------+        +------------+        +------------+
       |  Sensor Node  |        |  Fog Node  |        |   Server   |
       |               |        |            |        |            |
       +---------------+        +------------+        +------------+
               |                       |                     |
  Device off-->|                       |                     |
               |                       |                     |
  Device on -->|                       |                     |
              ,|                       |                     |
              || +-------------------+ |                     |
   Stage-1    || |device registration| |    Relay Message    |
   process    || |      process      | |-------------------->|
              || +-------------------+ |                     |
              `|                       |                     |
       -------------------------------------------------------------
               |                       | +-----------------+ |`
               |                       | | Control message | || Stage-2
               |    Forward Control    | |    exchange     | ||
               |        Message        | +-----------------+ |,
               |<----------------------|                     |
               |                       |                     |
       -------------------------------------------------------------
              ,|                       |                     |
              || +-------------------+ |                     |
   Stage-3    || |observation message| |                     |
   process    || |      passing      | |  Store and forward  |
              || +-------------------+ |-------------------->|
              `|                       |                     |

                    Figure 2: CoAP message exchange

2.2.  Network Setup

   A tree topology is used, as shown in Figure 3.

   (Sensor Node a)------+
                         \
   (Sensor Node b)--------+(Fog Node1)          (Fog Node2)
                         /            \        /
   (Sensor Node c)------+              \      /
                                        \    /
                                         \  /
                    (Sensor Node d)-------+(Server)

                       Figure 3: A tree topology

   All nodes MUST be CoAP endpoints for message exchange. A CoAP
   endpoint is capable of both client and server roles. A sensor node
   can interact with the server directly or via a Fog Node.

2.3.  Message description in Fiesta-IOT

   The message format complies with the Fiesta-IOT ontology [c]. It
   maintains three levels of service description, as shown in Figure 4.
   The observation data is carried in the sensor-level description.
   [Figure: three nested boxes, one per level of service description]

                     Figure 4: Message description

3.  Distributed Fault Management

   After collecting sensory data from sensors through the Fog Node and
   the Central Server, the distributed fault management MUST run the
   abnormal value fault detection algorithm to detect the faulty
   sensors at a particular location. Then, the detected faulty sensor
   nodes SHOULD be replaced with supplementary sensor nodes that are
   currently off. These algorithms MAY be implemented in a centralized
   server (domain) and in a Fog Node at a particular location, such as
   a room or a building.

3.1.  Abnormal Value Fault Detection

   Abnormal value fault detection is an important form of fault
   detection in the IoT domain. To determine whether the values
   observed from IoT sensors are faulty, the observation values of a
   particular sensor SHOULD be compared with the values from the
   neighboring sensor nodes. A flow chart for abnormal value fault
   detection at a particular location is shown in Figure 5.

   Abnormal value fault detection MAY be performed based on two
   parameters: the current distance between observation values, and the
   difference between the previous and current distances between
   observation values. If the observation value of a particular sensor
   is similar to the observation values of the majority of sensors,
   meaning that the distances are within the predefined threshold, that
   sensor MAY be marked as normal. Otherwise, it is a faulty sensor.
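   As a non-normative illustration, the majority comparison described
   above can be sketched in Python. The sketch assumes that the
   distance between two sensors is the absolute difference of their
   current observation values, and that "half of the number of
   sensors" counts all sensors at the location; this document fixes
   neither choice. The function names, sample readings and threshold
   are hypothetical, and the second parameter (the change in distance
   between rounds) is not modeled.

```python
from itertools import combinations


def pairwise_distances(readings):
    """Return {(a, b): |value_a - value_b|} for every unordered sensor pair.

    Assumption: "distance" is the absolute difference of current values.
    """
    return {
        (a, b): abs(readings[a] - readings[b])
        for a, b in combinations(sorted(readings), 2)
    }


def detect_faults(readings, threshold):
    """Label each sensor "Normal" or "Fault" by comparison with its peers."""
    distances = pairwise_distances(readings)
    status = {}
    for sensor in readings:
        # Count the other sensors whose distance to this one is above
        # the threshold.
        count = sum(
            1
            for (a, b), d in distances.items()
            if sensor in (a, b) and d > threshold
        )
        # Normal when the count is below half of the number of sensors
        # (assumption: all sensors at the location are counted).
        status[sensor] = "Normal" if count < len(readings) / 2 else "Fault"
    return status


if __name__ == "__main__":
    # Hypothetical temperature readings from one room.
    room = {"a": 21.0, "b": 21.4, "c": 20.8, "d": 35.2}
    print(detect_faults(room, threshold=2.0))
```

   With these sample readings, sensor "d" differs from all three of
   its neighbors by more than the threshold and is labeled "Fault",
   while the others each disagree with only one neighbor and remain
   "Normal".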
          +----------------------+
          | Calculate distances  |
          |   between sensors    |
          +----------------------+
                      |
                      v
                     /\
                    /  \
                   /    \
                  /      \
         Yes     / Are all\
   (End) <------/ sensors  \ <--------------------------------------+
                \ detected?/                                        |
                 \        /                                         |
                  \      /                                          |
                   \    /                                           |
                    \  /                                            |
                     \/                                             |
                      |No                                           |
                      v                                             |
       +-----------------------------+                              |
       |For each sensor, count the   |                              |
       |number of sensors where their|                              |
       |distances are above threshold|                              |
       +-----------------------------+                              |
                      |                                             |
                      v                                             |
                     / \                                            |
                    /   \                                           |
                   /     \                                          |
                  / Is the\       +---------------------------+     |
                 /  count  \  No  | Update the sensor status  |     |
                / below the \ --->|         as Fault          |---->|
                \half of the/     +---------------------------+     |
                 \number of/                                        |
                  \sensors/                                         |
                   \  ?  /                                          |
                    \   /                                           |
                     \ /                                            |
                      |                                             |
                      |Yes                                          |
                      v                                             |
        +----------------------------+                              |
        | Update the sensor status   |------------------------------+
        |         as Normal          |
        +----------------------------+

          Figure 5: Flow chart of abnormal value fault detection

3.2.  Sensor Replacement for Fault Sensors

   The detected faulty sensors SHOULD be replaced with supplementary
   sensor nodes that are currently off, for further analysis or
   monitoring. A flow chart for sensor replacement is shown below. For
   each faulty sensor, an off sensor SHOULD be turned on as its
   replacement. If the off sensor can be turned on, it replaces the
   faulty sensor. Otherwise, the off sensor SHOULD be marked as a
   malfunctioning sensor. Finally, the faulty sensor MAY be turned off.

                     /\
                    /  \
                   /    \
                  /      \
         Yes     / Are all\
   (End) <------/  fault   \ <--------------------------------------+
                \ sensors  /                                        |
                 \checked?/                                         |
                  \      /                                          |
                   \    /                                           |
                    \  /                                            |
                     \/                                             |
                      |No                                           |
                      v                                             |
           +--------------------+                                   |
           |      Turn on       |                                   |
           |   an off sensor    |                                   |
           +--------------------+                                   |
                      |                                             |
                      v                                             |
                     / \                                            |
                    /   \                                           |
                   /     \        +---------------------------+     |
                  /Was the\   No  | Update the sensor status  |     |
                 / sensor  \ ---->|      as Malfunction       |---->|
                 \ turned  /      +---------------------------+     |
                  \  on?  /                                         |
                   \     /                                          |
                    \   /                                           |
                     \ /                                            |
                      |                                             |
                      |Yes                                          |
                      v                                             |
           +--------------------+                                   |
           |      Turn off      |-----------------------------------+
           |  the faulty sensor |
           +--------------------+

4.  IANA Considerations

   There are no IANA considerations related to this document.

5.  Security Considerations

   This document touches on communication security as it applies to
   M2M communications and the CoAP protocol.

6.  References

6.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [a]        Elhadef, M., Boukerche, A., and Elkadiki, H.,
              "Performance analysis of a distributed comparison based
              self-diagnosis protocol for wireless ad hoc networks",
              Proceedings of the 9th ACM International Symposium on
              Modeling, Analysis and Simulation of Wireless and Mobile
              Systems, ACM, NY, USA, pp. 165-172, 2006.

   [b]        Shelby, Z., Hartke, K., and Bormann, C., "The Constrained
              Application Protocol (CoAP)", RFC 7252, June 2014.

   [c]        Agarwal, Rachit, David Gomez Fernandez, Tarek Elsaleh,
              Amelie Gyrard, Jorge Lanza, Luis Sanchez, Nikolaos
              Georgantas, and Valerie Issarny, "Unified IoT ontology to
              enable interoperability and federation of testbeds",
              2016 IEEE 3rd World Forum on Internet of Things (WF-IoT),
              IEEE, pp. 70-75, 2016.

6.2.  Informative References
Authors' Addresses

   Choong Seon Hong
   Computer Science and Engineering Department, Kyung Hee University
   Yongin, South Korea
   Phone: +82 (0)31 201 2532
   Email: cshong@khu.ac.kr

   Minh N H Nguyen
   Computer Science and Engineering Department, Kyung Hee University
   Yongin, South Korea
   Phone: +82 (0)31 201 2987
   Email: minhnhn@khu.ac.kr

   Shashi Raj Pandey
   Computer Science and Engineering Department, Kyung Hee University
   Yongin, South Korea
   Phone: +82 (0)31 201 2987
   Email: shashiraj@khu.ac.kr

   Chit Wutyee Zaw
   Computer Science and Engineering Department, Kyung Hee University
   Yongin, South Korea
   Phone: +82 (0)31 201 2987
   Email: cwyzaw@khu.ac.kr

   Seung Il Moon
   Computer Science and Engineering Department, Kyung Hee University
   Yongin, South Korea
   Phone: +82 (0)31 201 2987
   Email: moons85@khu.ac.kr