IPPM H. Song, Ed.
Internet-Draft T. Zhou
Intended status: Standards Track Z. Li
Expires: April 23, 2019 Huawei
October 20, 2018

Export User Flow Telemetry Data by Postcard Packets
draft-song-ippm-postcard-based-telemetry-00

Abstract

This document describes a proposal which allows network OAM applications to collect telemetry data about any user packet. Unlike similar techniques such as INT and in-situ OAM, Postcard-Based Telemetry (PBT) does not require inserting telemetry data into user packets, but directly exports the telemetry data to a collector through separate OAM packets called postcards. Two variations of PBT are described in this document: one requires inserting an instruction header into user packets to guide the data collection, and the other only marks the user packets to invoke the data collection. Each has its own pros and cons. Either way, the postcards for a single user packet come from multiple devices and need to be correlated at the collector, which is a unique issue pertaining to PBT. Nevertheless, PBT provides an alternative to INT and addresses several implementation and deployment challenges of the INT-based solutions.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on April 23, 2019.

Copyright Notice

Copyright (c) 2018 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


1. Motivation

In order to gain detailed data plane visibility to support effective network OAM, it is important to be able to examine the trace of user packets along their forwarding paths. Such path-associated data reflect the state and status of each user packet's real-time experience and provide valuable information for network monitoring, measurement, and diagnosis.

The telemetry data include the detailed forwarding path, the timestamp/latency at each network node, and other information. The emerging programmable data plane devices allow more sophisticated data to be retrieved [I-D.song-opsawg-dnp4iq]. Such path-associated flow data can only be derived from the live user packets. These data complement the data acquired through other passive and active OAM mechanisms such as IPFIX and ICMP.

In-band Network Telemetry (INT) was designed to cater to exactly this need. In-situ OAM (iOAM) represents the related standardization efforts. Essentially, INT augments user packets with instructions to tell each network node on the forwarding paths which data to collect. The requested data are inserted into and travel along with the user packets. The path-end node strips off the data trace and exports it to a data collector for processing.

While the concept is simple and straightforward, INT faces several technical challenges:

The above issues are inherent to the INT-based solutions. Nevertheless, the path-associated data acquired by INT are valuable for network operators. Therefore, alternative approaches which can collect the same data but avoid or mitigate the above issues are desired. This document provides a new proposal named Postcard-Based Telemetry (PBT) with two different implementation variations, each having its own trade-offs and addressing some or all of the above issues. The basic idea of PBT is simple: at each node, instead of inserting the collected data into the user packets, the data are directly exported through dedicated OAM packets. Such a "postcard" approach is in contrast to the "passport stamps" approach adopted by INT [DOI_10.1145_2342441.2342453]. The OAM packets or postcards can be transported in band or out of band, independent of the original user packets.

2. PBT-M: Postcard-based Telemetry with Packet Marking

This section describes the first variation of PBT. PBT-M aims to address all the challenges of INT listed above and introduce some new benefits. We first list all the design requirements of PBT-M.

2.1. New Requirements

2.2. Solution Description

In light of the above discussion, the sketch of the proposed solution, PBT-M, is as follows. The user packet, if its path-associated data need to be collected, is marked at the path head node. At each PBT-aware node, a postcard (i.e., the dedicated OAM packet triggered by a marked user packet) is generated and sent to a collector. The postcard contains the data requested by the management plane. The requested data are configured by the management plane through data templates (as in IPFIX) or other means. Once the collector receives all the postcards for a single user packet, it can infer the packet's forwarding path and analyze the data set. The path end node is configured to unmark the packets, restoring them to their original format if necessary.
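The per-node behavior described above can be illustrated with a minimal Python sketch. It is not part of the proposal: the class names, the `marked` flag standing in for the marking bit, the list used as a collector, and the representation of a data template as a list of field names are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Packet:
    flow_id: str
    ttl: int
    marked: bool = False  # stands in for the single marking bit (assumption)


@dataclass
class Postcard:
    node_id: str
    flow_id: str
    data: dict


@dataclass
class Node:
    node_id: str
    # Data set templates pushed by the management plane, keyed by flow ID.
    # Representing a template as a list of field names is hypothetical.
    templates: dict = field(default_factory=dict)

    def collect(self, name, pkt):
        # Hypothetical data sources; a real device reads these from hardware.
        return {"node_id": self.node_id, "ttl": pkt.ttl}.get(name)

    def process(self, pkt, collector):
        # A marked packet triggers one postcard; the user packet is unchanged
        # apart from normal forwarding (here, the TTL decrement).
        if pkt.marked:
            fields = self.templates.get(pkt.flow_id, ["node_id", "ttl"])
            data = {name: self.collect(name, pkt) for name in fields}
            collector.append(Postcard(self.node_id, pkt.flow_id, data))
        pkt.ttl -= 1
        return pkt


def head_mark(pkt, probability, rng=random):
    # The head node marks packets of a monitored flow with some probability.
    pkt.marked = rng.random() < probability
    return pkt


def end_unmark(pkt):
    # The end node restores the packet to its original format.
    pkt.marked = False
    return pkt


collector = []
path = [Node("head"), Node("A"), Node("B"), Node("end")]
pkt = head_mark(Packet("flow-1", ttl=64), probability=1.0)
for node in path:
    pkt = node.process(pkt, collector)
pkt = end_unmark(pkt)
```

In this sketch each node on the path emits exactly one postcard for the marked packet, and a node without a template for the flow falls back to exporting only the node ID and TTL.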

The overall architecture of PBT-M is depicted in Figure 1.


                          +------------+        +-----------+
                          | Network    |        | Telemetry |
                          | Management |<-------| Data      |
                          |            |        | Collector |
                          +-----:------+        +-----------+
                                :                     ^
                                :configurations       |postcards (OAM pkts)
                                :                     |
                 ...............:.....................|........
                 :             :               :      |       :
                 :   +---------:---+-----------:---+--+-------:---+
                 :   |         :   |           :   |          :   |
                 V   |         V   |           V   |          V   |
              +------+-+     +-----+--+     +------+-+     +------+-+
    usr pkts  | Head   |     | Path   |     | Path   |     | End    |
         ====>| Node   |====>| Node   |====>| Node   |====>| Node   |====>
              |        |     | A      |     | B      |     |        |
              +--------+     +--------+     +--------+     +--------+
              gen postcards  gen postcards  gen postcards  gen postcards
              mark usr pkts                                unmark usr pkts

          

Figure 1: Architecture of PBT-M

Next we discuss the details of the PBT-M solution and the potential standardization gaps.

2.3. New Challenges

Although PBT-M solves the issues of INT, it does introduce a few new challenges.

2.4. Considerations on PBT-M Design

To address the above challenges, we propose several design details of PBT-M.

2.4.1. Packet Marking

Instead of stuffing new header fields into user packets, it is preferred to reuse some existing header fields. To trigger the path-associated data collection, usually a single bit is sufficient. When no such bit is available, other packet marking techniques are needed. We discuss three possible application scenarios.

2.4.2. Flow Path Discovery

By default, all PBT-aware nodes are configured to react to the marked packets by exporting some basic data such as node ID and TTL before a data set template for that flow is configured. This way, the management plane can learn the flow path.

If the management plane wants to collect the path-associated data for some flow, it configures the head node(s) with a probability or time interval for marking the packets of that flow. When the first marked packet is forwarded in the network, the PBT-aware nodes will export the basic data to the collector. Hence, the flow path is identified. If other types of data need to be collected, the management plane can further configure the data set template on the target nodes. The PBT-aware nodes then collect and export data accordingly if the packet is marked and a data set template is present.

If the flow path changes for any reason, the new path nodes can be learnt immediately by the collector, so the management plane controller can be informed to configure the new path nodes. The outdated configuration can be automatically timed out or explicitly revoked by the management plane controller.
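The discovery procedure above can be sketched on the collector side as follows. This is an illustration only: it assumes that each postcard carries the node ID and the TTL observed at that node, that every node decrements the TTL, and that postcards may arrive out of order. The dictionary layout and the function names are hypothetical.

```python
def infer_path(postcards):
    """Order node IDs by the TTL each node observed, highest TTL first."""
    ordered = sorted(postcards, key=lambda pc: pc["ttl"], reverse=True)
    return [pc["node_id"] for pc in ordered]


def new_nodes(old_path, new_path):
    """Nodes that appear only on the new path and still need a template."""
    known = set(old_path)
    return [n for n in new_path if n not in known]


# Basic postcards for the first marked packet, received out of order.
first = [
    {"node_id": "B", "ttl": 62},
    {"node_id": "head", "ttl": 64},
    {"node_id": "A", "ttl": 63},
    {"node_id": "end", "ttl": 61},
]
old_path = infer_path(first)

# After a path change, node C replaces node A.
rerouted = [
    {"node_id": "head", "ttl": 64},
    {"node_id": "C", "ttl": 63},
    {"node_id": "B", "ttl": 62},
    {"node_id": "end", "ttl": 61},
]
new_path = infer_path(rerouted)
to_configure = new_nodes(old_path, new_path)
```

The list returned by `new_nodes` is what the collector would report to the management plane controller so that the data set template can be installed on the nodes that joined the path.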

2.4.3. Packet Identity for Export Data Correlation

The collector needs to correlate all the OAM packets for a single user packet. Once this is done, the TTL (or the timestamp, if the network time is synchronized) can be used to infer the flow forwarding path. The key issue here is to identify, for each OAM packet, the user packet that triggered it.

The first possible approach is to include the flow ID plus the user packet ID in the OAM packets. The user packet ID can be some unique information pertaining to a user packet (e.g., the sequence number of a TCP packet).

If the packet marking interval is long enough, then the flow ID itself is enough to identify the user packet. That is, we can assume all the exported OAM packets for the same flow during a short period of time belong to the same user packet.

If the network is synchronized, then the flow ID plus the timestamp at each node can also be used to infer the packet identity. However, some errors may occur under some circumstances. For example, if two consecutive user packets from the same flow are both marked and one exported OAM packet from a node is lost, then it is difficult for the collector to decide which user packet the remaining OAM packet belongs to. In many cases, such rare errors may be tolerable.
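Two of the correlation approaches above can be sketched as follows: grouping by flow ID plus user packet ID, and grouping by flow ID alone when marked packets are spaced far apart. The field names (`packet_id`, `rx_time`), the use of the collector's receive time, and the `window` parameter are assumptions made for illustration, not part of the proposal.

```python
from collections import defaultdict


def correlate_by_packet_id(postcards):
    """Group postcards by (flow ID, user packet ID)."""
    groups = defaultdict(list)
    for pc in postcards:
        groups[(pc["flow_id"], pc["packet_id"])].append(pc)
    return dict(groups)


def correlate_by_window(postcards, window):
    """Group by flow ID alone.

    Postcards of a flow received within `window` seconds of the first
    postcard in a group are assumed to belong to the same user packet.
    This only holds when the marking interval is much longer than `window`.
    """
    groups = []       # entries are (flow_id, start_time, members)
    open_group = {}   # flow_id -> index of its most recent group
    for pc in sorted(postcards, key=lambda p: p["rx_time"]):
        idx = open_group.get(pc["flow_id"])
        if idx is None or pc["rx_time"] - groups[idx][1] > window:
            groups.append((pc["flow_id"], pc["rx_time"], []))
            idx = len(groups) - 1
            open_group[pc["flow_id"]] = idx
        groups[idx][2].append(pc)
    return [members for _, _, members in groups]


postcards = [
    {"flow_id": "f1", "packet_id": 100, "node_id": "head", "rx_time": 0.000},
    {"flow_id": "f1", "packet_id": 100, "node_id": "A", "rx_time": 0.001},
    {"flow_id": "f1", "packet_id": 200, "node_id": "head", "rx_time": 1.000},
    {"flow_id": "f1", "packet_id": 200, "node_id": "A", "rx_time": 1.002},
]
by_id = correlate_by_packet_id(postcards)
by_window = correlate_by_window(postcards, window=0.1)
```

The window-based variant shows why a lost postcard is hard to detect without a packet ID: a group is simply whatever arrived within the window, so a missing member leaves no trace in the grouping itself.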

3. PBT-I: Postcard-based Telemetry with Instruction Header

Since PBT-M has some challenges as listed in Section 2.3, this section describes another variation of PBT, which essentially compromises some of the design requirements listed in Section 2.1, yet retains most of the benefits of PBT.

PBT-I can be seen as a trade-off between INT/iOAM and PBT-M. PBT-I needs to add a fixed-length instruction header to user packets for OAM data collection. However, the collected data will be exported through dedicated OAM packets. On the one hand, PBT-I violates Req. 1 in Section 2.1. It also makes it harder to meet Req. 2. On the other hand, the overhead of the instruction header is under control and user packets will not inflate with path length or telemetry data amount. We also introduce an optimization to mitigate the impact on Req. 2. In return, PBT-I addresses all the challenges of PBT-M:

3.1. Solution Description

The sketch of the proposed solution, PBT-I, is as follows. If the path-associated data need to be collected for a user packet, an instruction header named Telemetry Instruction Header (TIH) is inserted into the packet at the path head node. At each PBT-aware node, a postcard is generated and sent to a collector. Once the collector receives all the postcards for a single user packet, it can combine and analyze the data set. The path end node is configured to remove the TIH.

The overall architecture of PBT-I is depicted in Figure 2.


                                  +-----------+
                                  | Telemetry |
                                  | Data      |
                                  | Collector |
                                  +-----------+
                                        ^
                                        |postcards (OAM pkts)
                                        |
                                        |
                                        |
                  +--------------+------+-------+--------------+
                  |              |              |              |
                  |              |              |              |
              +---+----+     +---+----+     +---+----+     +---+----+
    usr pkts  | Head   |     | Path   |     | Path   |     | End    |
         ====>| Node   |====>| Node   |====>| Node   |====>| Node   |====>
              |        |     | A      |     | B      |     |        |
              +--------+     +--------+     +--------+     +--------+
              insert TIH                                   remove TIH
              gen postcards  gen postcards  gen postcards  gen postcards
                                                             

          

Figure 2: Architecture of PBT-I

3.2. PBT-I Telemetry Instruction Header

The proposed format of TIH is shown in Figure 3.


    0             0 0             1 1             2 2             3 
    0             7 8             5 6             3 4             1 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   | Next Header   |  TIH Length   |   Reserved    |   Hop Count   |    
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |                         Flow ID                               |  
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |                         Flow ID                               |  
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |                     Sequence Number                           |  
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |                   Data Element Bitmap                       |E|  
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |             Data Element Bitmap Extension (optional)          |  
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   

          

Figure 3: TIH Format
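The layout in Figure 3 can be illustrated with a minimal encoder and decoder. The field widths are read directly from the figure. Several points are assumptions, since the text does not specify them: the TIH Length is expressed in octets, the E bit signals that the optional bitmap extension word is present, and fields are in network byte order. The function names are hypothetical.

```python
import struct

# Fixed part per Figure 3: four 8-bit fields, a 64-bit Flow ID, a 32-bit
# Sequence Number, and one 32-bit word holding a 31-bit bitmap plus E.
_FIXED = struct.Struct("!BBBBQII")   # 20 octets, network byte order
_EXT = struct.Struct("!I")           # optional 32-bit bitmap extension


def pack_tih(next_header, hop_count, flow_id, seq, bitmap, ext=None):
    if not 0 <= bitmap < (1 << 31):
        raise ValueError("bitmap must fit in 31 bits")
    # Assumption: E = 1 means the extension word follows.
    e_bit = 0 if ext is None else 1
    # Assumption: TIH Length counts octets, including the extension.
    length = _FIXED.size + (_EXT.size if e_bit else 0)
    head = _FIXED.pack(next_header, length, 0, hop_count,
                       flow_id, seq, (bitmap << 1) | e_bit)
    return head if ext is None else head + _EXT.pack(ext)


def unpack_tih(buf):
    nh, length, _reserved, hops, flow_id, seq, word = _FIXED.unpack_from(buf, 0)
    tih = {
        "next_header": nh,
        "length": length,
        "hop_count": hops,
        "flow_id": flow_id,
        "seq": seq,
        "bitmap": word >> 1,   # upper 31 bits of the word
        "ext": None,
    }
    if word & 1:               # E bit is the last bit of the word
        (tih["ext"],) = _EXT.unpack_from(buf, _FIXED.size)
    return tih


raw = pack_tih(next_header=6, hop_count=0,
               flow_id=0x1122334455667788, seq=7, bitmap=0b101)
parsed = unpack_tih(raw)
raw_ext = pack_tih(next_header=6, hop_count=0,
                   flow_id=1, seq=8, bitmap=1, ext=0xABCD)
parsed_ext = unpack_tih(raw_ext)
```

Under these assumptions the header is 20 octets without the extension and 24 octets with it, independent of the path length.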

The specification of the data element bitmap is as follows:

3.3. Considerations on PBT-I Design

4. Security Considerations

Several security issues need to be considered.

5. IANA Considerations

TBD.

6. Contributors

TBD.

7. Acknowledgments

TBD.

8. Informative References

[DOI_10.1145_2342441.2342453] Handigol, N., Heller, B., Jeyakumar, V., Mazières, D. and N. McKeown, "Where is the debugger for my software-defined network?", Proceedings of the first workshop on Hot topics in software defined networks - HotSDN '12, DOI 10.1145/2342441.2342453, 2012.
[I-D.brockners-inband-oam-requirements] Brockners, F., Bhandari, S., Dara, S., Pignataro, C., Gredler, H., Leddy, J., Youell, S., Mozes, D., Mizrahi, T., Lapukhov, P. and R. Chang, "Requirements for In-situ OAM", Internet-Draft draft-brockners-inband-oam-requirements-03, March 2017.
[I-D.brockners-inband-oam-transport] Brockners, F., Bhandari, S., Govindan, V., Pignataro, C., Gredler, H., Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov, P. and R. Chang, "Encapsulations for In-situ OAM Data", Internet-Draft draft-brockners-inband-oam-transport-05, July 2017.
[I-D.brockners-ippm-ioam-geneve] Brockners, F., Bhandari, S., Govindan, V., Pignataro, C., Gredler, H., Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov, P. and R. Chang, "Geneve encapsulation for In-situ OAM Data", Internet-Draft draft-brockners-ippm-ioam-geneve-01, June 2018.
[I-D.bryant-mpls-synonymous-flow-labels] Bryant, S., Swallow, G., Sivabalan, S., Mirsky, G., Chen, M. and Z. Li, "RFC6374 Synonymous Flow Labels", Internet-Draft draft-bryant-mpls-synonymous-flow-labels-01, July 2015.
[I-D.clemm-netconf-push-smart-filters-ps] Clemm, A., Voit, E., Liu, X., Bryskin, I., Zhou, T., Zheng, G. and H. Birkholz, "Smart filters for Push Updates - Problem Statement", Internet-Draft draft-clemm-netconf-push-smart-filters-ps-00, October 2017.
[I-D.ietf-ippm-alt-mark] Fioccola, G., Capello, A., Cociglio, M., Castaldelli, L., Chen, M., Zheng, L., Mirsky, G. and T. Mizrahi, "Alternate Marking method for passive and hybrid performance monitoring", Internet-Draft draft-ietf-ippm-alt-mark-14, December 2017.
[I-D.ietf-ippm-ioam-data] Brockners, F., Bhandari, S., Pignataro, C., Gredler, H., Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov, P., Chang, R. and D. Bernier, "Data Fields for In-situ OAM", Internet-Draft draft-ietf-ippm-ioam-data-00, September 2017.
[I-D.ietf-netconf-udp-pub-channel] Zheng, G., Zhou, T. and A. Clemm, "UDP based Publication Channel for Streaming Telemetry", Internet-Draft draft-ietf-netconf-udp-pub-channel-01, November 2017.
[I-D.ietf-netconf-yang-push] Clemm, A., Voit, E., Prieto, A., Tripathy, A., Nilsen-Nygaard, E., Bierman, A. and B. Lengyel, "YANG Datastore Subscription", Internet-Draft draft-ietf-netconf-yang-push-12, December 2017.
[I-D.ietf-sfc-ioam-nsh] Brockners, F., Bhandari, S., Govindan, V., Pignataro, C., Gredler, H., Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov, P. and R. Chang, "NSH Encapsulation for In-situ OAM Data", Internet-Draft draft-ietf-sfc-ioam-nsh-00, May 2018.
[I-D.ietf-sfc-nsh] Quinn, P., Elzur, U. and C. Pignataro, "Network Service Header (NSH)", Internet-Draft draft-ietf-sfc-nsh-28, November 2017.
[I-D.sambo-netmod-yang-fsm] Sambo, N., Castoldi, P., Fioccola, G., Cugini, F., Song, H. and T. Zhou, "YANG model for finite state machine", Internet-Draft draft-sambo-netmod-yang-fsm-00, October 2017.
[I-D.song-ippm-ioam-data-extension] Song, H. and T. Zhou, "In-situ OAM Data Type Extension", Internet-Draft draft-song-ippm-ioam-data-extension-00, October 2017.
[I-D.song-ippm-ioam-tunnel-mode] Song, H., Li, Z., Zhou, T. and Z. Wang, "In-situ OAM Processing in Tunnels", Internet-Draft draft-song-ippm-ioam-tunnel-mode-00, June 2018.
[I-D.song-mpls-extension-header] Song, H., Li, Z., Zhou, T. and L. Andersson, "MPLS Extension Header", Internet-Draft draft-song-mpls-extension-header-01, August 2018.
[I-D.song-opsawg-dnp4iq] Song, H. and J. Gong, "Requirements for Interactive Query with Dynamic Network Probes", Internet-Draft draft-song-opsawg-dnp4iq-01, June 2017.
[I-D.talwar-rtgwg-grpc-use-cases] Specification, g., Kolhe, J., Shaikh, A. and J. George, "Use cases for gRPC in network management", Internet-Draft draft-talwar-rtgwg-grpc-use-cases-01, January 2017.
[I-D.weis-ippm-ioam-gre] Weis, B., Brockners, F., Hill, C., Bhandari, S., Govindan, V., Pignataro, C., Gredler, H., Leddy, J., Youell, S., Mizrahi, T., Kfir, A., Gafni, B., Lapukhov, P. and M. Spiegel, "GRE Encapsulation for In-situ OAM Data", Internet-Draft draft-weis-ippm-ioam-gre-00, March 2018.
[RFC2925] White, K., "Definitions of Managed Objects for Remote Ping, Traceroute, and Lookup Operations", RFC 2925, DOI 10.17487/RFC2925, September 2000.
[RFC6241] Enns, R., Bjorklund, M., Schoenwaelder, J. and A. Bierman, "Network Configuration Protocol (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011.
[RFC7011] Claise, B., Trammell, B. and P. Aitken, "Specification of the IP Flow Information Export (IPFIX) Protocol for the Exchange of Flow Information", STD 77, RFC 7011, DOI 10.17487/RFC7011, September 2013.

Authors' Addresses

Haoyu Song (editor) Huawei 2330 Central Expressway Santa Clara, 95050, USA EMail: haoyu.song@huawei.com
Tianran Zhou Huawei 156 Beiqing Road Beijing, 100095, P.R. China EMail: zhoutianran@huawei.com
Zhenbin Li Huawei 156 Beiqing Road Beijing, 100095, P.R. China EMail: lizhenbin@huawei.com