ALTO WG                                                         Q. Xiang
Internet-Draft                                           Yale University
Intended status: Informational                                     F. Le
Expires: September 12, 2019                                          IBM
                                                                 Y. Yang
                                                         Yale University
                                                          March 11, 2019


  ALTO for Multi-Domain Applications: A Review of Use Cases and Design
                              Requirements
              draft-xiang-alto-multidomain-usecases-00.txt

Abstract

   With the development of novel network technologies, such as
   software-defined networking and network function virtualization,
   many novel multi-domain applications, such as flexible interdomain
   routing, distributed, federated machine learning and multi-domain
   collaborative dataset transfer, have been deployed.  These
   applications can benefit substantially from the ALTO protocol
   [RFC7285], through which the information of multiple networks can be
   provided to applications.  This document first introduces several
   multi-domain applications and how they can benefit from ALTO.  It
   then describes a generic framework for multi-domain applications to
   use ALTO to improve their performance, followed by a discussion of
   new requirements and challenges for ALTO to better support these
   applications.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 12, 2019.

Xiang, et al.          Expires September 12, 2019               [Page 1]

Internet-Draft            ALTO for Multi-Domain               March 2019

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Requirements Language
   3.  Review of Multi-Domain Applications
     3.1.  Flexible Interdomain Routing
       3.1.1.  How flexible interdomain routing can benefit from ALTO
       3.1.2.  Example
     3.2.  Resource Orchestration for Collaborative Data Sciences
       3.2.1.  How multi-domain resource orchestration can benefit
               from ALTO
       3.2.2.  Example
     3.3.  Federated Machine Learning
       3.3.1.  How federated machine learning can benefit from ALTO
       3.3.2.  Example
   4.  A Generic Framework
     4.1.  Workflow
   5.  Requirements of ALTO in Multi-Domain Applications
     5.1.  Design Requirements
     5.2.  Existing Efforts in the ALTO Working Group
   6.  Summary
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Authors' Addresses

1.  Introduction

   The ALTO protocol [RFC7285] provides network information to
   applications so that applications can make network-informed
   decisions to improve their performance.  In addition to traditional
   applications such as peer-to-peer systems, many recent multi-domain
   applications, which orchestrate resources across multiple networks,
   can also benefit substantially from ALTO.

   The goal of this document is to explore how ALTO can help improve
   the performance of novel multi-domain applications, what ALTO
   extension services are needed, and what the corresponding
   requirements and challenges are for designing such extensions.  To
   this end, this document first gives a case-by-case review of
   emerging multi-domain applications and how they can benefit from
   ALTO.  It then describes a generic framework for multi-domain
   applications to use ALTO to improve their performance, followed by a
   discussion of the need for new ALTO services and the corresponding
   requirements and challenges for these extensions to better support
   these applications.

2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Review of Multi-Domain Applications

3.1.  Flexible Interdomain Routing

   Flexible interdomain routing can be a highly valuable service for
   network providers.  Specifically, an autonomous system (AS)
   providing such a service (the provider) allows other ASes (clients)
   to specify routing actions at the provider based on flexible
   matching conditions (e.g., match on TCP/IP 5-tuple).
   In this way, a client AS using the flexible interdomain routing
   service can offload access and traffic control to provider ASes,
   leading to a simpler client network configuration while giving the
   provider ASes additional business opportunities.

3.1.1.  How flexible interdomain routing can benefit from ALTO

   ALTO provides provider ASes with a standardized approach to expose
   their routing capabilities to client ASes.  Traditional interdomain
   routing protocols such as BGP are not good options because they
   only expose the currently used routes, limiting client ASes'
   choices when specifying flexible routes.  In contrast, ALTO and its
   extensions provide interfaces for provider ASes to expose not only
   currently used routes, but also available yet unused routes, to
   client ASes so that they have the flexibility to specify different
   routes for different data traffic.

3.1.2.  Example

   Consider the example in Figure 1.  AS A is compromised and is being
   used to send DDoS traffic to AS E.  Without flexible interdomain
   routing, AS E can set up a firewall locally, but normal traffic
   from B to E will still be congested along C-D-E because of the
   malicious traffic from A to E.  If AS C provides a flexible
   interdomain routing service, AS E can specify such a firewall at
   AS C to block DDoS traffic from A, and at the same time avoid
   congestion of normal traffic from B to E.

      +-----+ DDoS traffic
      | AS A|_
      |     | \   +--+---+         +---+--+        +------+
      +-----+  \  |      |         |      |        |      |
                \_| AS C +---------+ AS D |--------| AS E |
      +-----+   / |      |         |      |        |      |
      | AS B|__/  +------+         +------+        +------+
      |     |
      +-----+ Normal traffic

        Figure 1: Flexible interdomain routing for DDoS mitigation.

3.2.  Resource Orchestration for Collaborative Data Sciences

   As data volumes increase exponentially over time, data analytics is
   transitioning from a single-domain network to a multi-domain,
   geo-distributed network, where different member networks contribute
   various resources, e.g., computation, storage and networking
   resources, to collaboratively collect, share and analyze extremely
   large amounts of data.  Such a paradigm calls for a unified
   resource orchestration framework to manage a large set of
   distributively-owned, heterogeneous resources, with the objective
   of efficient resource utilization while respecting the autonomy and
   privacy of different domains.

3.2.1.  How multi-domain resource orchestration can benefit from ALTO

   One key design challenge for multi-domain resource orchestration is
   its resource information model.  Existing design options such as
   resource graph and ClassAds are inadequate because they cannot
   simultaneously (1) allow member networks to provide accurate
   information on different types of resources, (2) avoid the exposure
   of private information of member networks such as topology, and (3)
   allow data analytics jobs to accurately describe their requirements
   for different types of resources.  In contrast, the section 7.1 of
   Figure 2 discusses the advantages of choosing ALTO as the resource
   information model for multi-domain resource orchestration, and how
   ALTO can simultaneously satisfy the aforementioned design
   requirements.

3.2.2.  Example

   Consider an example of three member networks in Figure 2, where s1
   and s2 are storage endpoints and d1 and d2 are computation
   endpoints.  Assume a data analytics job is composed of two parallel
   tasks T1 and T2.  T1 needs dataset X as input and T2 needs dataset
   Y as input.

                              .------------.
                              | Network B  |
       .-------------.    ingB|            |
       |  Network A  o--------|o-----d1    |
       |            /|        '------------'
       |    s1\    / |
       |       o--o  |        .------------.
       |    s2/    \ |        | Network C  |
       |            \|    ingC|            |
       |             o--------|o-----d2    |
       '-------------'        '------------'

              Figure 2: Multi-domain resource orchestration.

   Using the ALTO endpoint property service, an ALTO client in the
   resource orchestrator can discover that d1 satisfies the computing
   requirements of T1 and d2 satisfies the computing requirements of
   T2.  Hence there are only two candidate endpoint pairs: (s1, d1)
   and (s2, d2).  Afterwards, using the ALTO path vector extension,
   the ALTO client can retrieve the bandwidth sharing information of
   tasks T1 and T2, whose bandwidths are denoted as x1 and x2,
   respectively, as follows.

      A: x1 + x2 <= 10 Mbps
      B: x1 <= 3 Mbps
      C: x2 <= 3 Mbps

   With such information, the resource orchestrator can make the
   optimal resource orchestration decision: reserve 3 Mbps of
   bandwidth for task T1 and 3 Mbps of bandwidth for task T2.

3.3.  Federated Machine Learning

   Instead of moving large-scale datasets from multiple devices or
   networks to a centralized location for training, federated learning
   is a distributed machine learning approach which enables training
   on distributed datasets residing on different autonomous systems
   (devices or networks).  In this way, only updates to the training
   model need to be communicated between networks, leading to a
   substantial reduction in networking resource consumption (e.g.,
   saving bandwidth).

3.3.1.  How federated machine learning can benefit from ALTO

   Federated machine learning requires efficient scheduling algorithms
   to decide how networking resources should be allocated to transmit
   training model updates between different ASes.  Similar to moving
   large-scale datasets between multiple ASes, moving training model
   updates between ASes can also benefit from the availability of
   networking information, such as the AS-path and bandwidth sharing.
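
   As a non-normative illustration of how a scheduler might consume
   bandwidth sharing information, the sketch below applies max-min
   fair progressive filling to the three constraints from the example
   in Section 3.2.2.  This document does not say which allocation
   objective an application uses, so max-min fairness is an assumption
   made here for illustration.  The function name max_min_fair and the
   constraint encoding are hypothetical and are not part of ALTO or of
   any ALTO extension.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical encoding: each constraint is the set of flows that share
# a resource, plus that resource's capacity in Mbps.
Constraint = Tuple[FrozenSet[str], float]


def max_min_fair(flows: List[str],
                 constraints: List[Constraint]) -> Dict[str, float]:
    """Progressive filling: raise all unfrozen flows equally until some
    shared resource saturates, then freeze the flows that use it."""
    rate = {f: 0.0 for f in flows}
    active = set(flows)
    while active:
        # Largest equal increment permitted by every constraint that
        # still carries at least one active flow.
        step = None
        for members, cap in constraints:
            n = len(members & active)
            if n == 0:
                continue
            slack = cap - sum(rate[f] for f in members)
            inc = slack / n
            step = inc if step is None else min(step, inc)
        if step is None:
            raise ValueError("flow without any capacity constraint")
        for f in active:
            rate[f] += step
        # Freeze every flow that crosses a saturated resource.
        for members, cap in constraints:
            if cap - sum(rate[f] for f in members) <= 1e-9:
                active -= members
    return rate


# Constraints A, B and C from the example in Section 3.2.2.
example = [
    (frozenset({"x1", "x2"}), 10.0),  # A: x1 + x2 <= 10 Mbps
    (frozenset({"x1"}), 3.0),         # B: x1 <= 3 Mbps
    (frozenset({"x2"}), 3.0),         # C: x2 <= 3 Mbps
]
print(max_min_fair(["x1", "x2"], example))  # {'x1': 3.0, 'x2': 3.0}
```

   For these constraints the sketch yields 3 Mbps for each task, which
   matches the reservation described in Section 3.2.2.  A different
   objective, such as weighted throughput, could give a different
   allocation under other constraints.
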
   ALTO provides a standardized approach for federated machine
   learning schedulers to retrieve such information from networks so
   that adaptive scheduling decisions can be made.

3.3.2.  Example

   Consider the example in Figure 3, where machine learning workers
   are located in AS A and AS D, while AS B and AS C are transit
   networks for data traffic transmitted between A and D.  When AS A
   has a large, critical training model update to send to D, it first
   queries the ALTO servers at B and C for the endpoint cost (e.g.,
   bandwidth) to transmit data from A to D.  Suppose the ALTO server
   at AS B returns an endpoint cost of 10 Mbps, while the ALTO server
   at AS C returns an endpoint cost of 100 Mbps.  AS A can then use
   this information to make the optimal model update scheduling
   decision: send the training model update to AS D via AS C instead
   of AS B.

                      +-----+
               -------| AS B|--
      +-----+ /       |     |  \  +--+---+
      | AS A|/        +-----+   \ |      |
      |     |\                   =| AS D +
      +-----+ \       +-----+   / |      |
               -------| AS C|--/  +------+
                      |     |
                      +-----+

                  Figure 3: Federated machine learning.

4.  A Generic Framework

   After reviewing several important, novel multi-domain applications
   that can benefit substantially from ALTO, this document describes a
   generic framework for such applications to use ALTO to retrieve
   information from networks to improve their performance.  The
   high-level architecture of this framework is given in Figure 4.

.---------------------------------------------------------------------.
|Application Layer                                                    |
|         .-------------.               .-------------.               |
|         |Application 1|      ...      |Application N|               |
|         |ALTO Client 1|               |ALTO Client N|               |
|         '-------------'               '-------------'               |
'-------------|---------\---------------/----|------------------------'
.- - - - - - -| - - - - -\ - - - - - - /- - -|- - - - - - -- - - - - -.
|Service      |           \           /      |                        |
|Layer        |            \         /       |                        |
|    .-----------------.              .-----------------.             |
|    |    Network 1    |              |    Network N    |             |
|    |                 |              |                 |             |
|    |ALTO Server 1    |     ...      |ALTO Server N    |             |
|    |Execution Agent 1|              |Execution Agent N|             |
|    '-----------------'              '-----------------'             |
'- - - - - | - - - - - - - - - - - - - - - - - - - - -|- - - - - - - -'
.----------|------------------------------------------|---------------.
|Signaling |                                          |               |
|Layer     |                                          |               |
|    .----------------.                      .----------------.       |
|    |   Network 1    |         ...          |   Network N    |       |
|    '----------------'                      '----------------'       |
'- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -'

   Figure 4: Generic framework of using ALTO in multi-domain
   applications.

   The top layer of this framework is the application layer, in which
   each application deploys one or more ALTO clients to query for
   information provided by networks.  The middle layer is the service
   layer.  In this layer, each network deploys one or more ALTO
   servers to respond to the queries sent by the ALTO clients of
   applications, and deploys one or more execution agents to respond
   to the applications' resource consumption actions.  The bottom
   layer is the signaling layer, in which each network deploys
   interdomain protocols or systems, such as the routing protocol BGP
   and the resource reservation system OSCARS.

4.1.  Workflow

   The basic workflow of this framework is as follows.

   o  An application identifies the networks whose resources (e.g.,
      networking, computation and storage) it may want to consume, and
      invokes its ALTO clients to query the ALTO servers deployed in
      those networks for detailed resource information using the base
      ALTO protocol and its extension services (e.g., path vector,
      cost calendar and so on);

   o  Upon receiving a query from an ALTO client, an ALTO server
      checks its local information, contacts the underlying signaling
      layer protocol or system of its network if the local information
      is outdated, and returns the latest resource information to the
      querying ALTO client;

   o  The application uses the resource information collected from
      ALTO servers to make resource allocation decisions (e.g., route
      selection, resource reservation, etc.), and sends these
      decisions to the execution agents in the corresponding networks
      (e.g., the simple-reservation-interface of OSCARS).

5.  Requirements of ALTO in Multi-Domain Applications

   Using ALTO to improve the performance of recent novel multi-domain
   applications poses several new design requirements.  This section
   discusses these requirements and briefly reviews existing efforts
   in the ALTO working group that aim to satisfy them.

5.1.  Design Requirements

   o  Exposing information of alternative resources.  The current ALTO
      protocol and its extensions only provide information of
      currently used resources (e.g., the currently used interdomain
      route).  However, exposing information of alternative resources
      (e.g., available but unused interdomain routes) may give the
      users of new multi-domain applications (e.g., flexible
      interdomain routing) more flexibility in choosing different
      resources, and give networks that provide such applications
      additional business opportunities.

   o  Providing a unified, accurate representation of multiple types
      of resources.  The current ALTO protocol and its extensions
      mainly focus on providing network information to applications,
      with the exception of the endpoint property service.
      However, as new multi-domain applications often consume multiple
      types of resources across multiple networks, encoding such
      information accurately in a unified approach is crucial for
      deploying ALTO to improve such applications' performance.

   o  Providing interfaces for more flexible queries.  The current
      ALTO protocol and its extensions allow applications to query
      resource information by specifying IP addresses of endpoints and
      simple filters.  However, with the emergence of new networking
      architectures (e.g., software-defined networking and network
      function virtualization) and the fine-grained resource
      requirements of applications (e.g., link-disjoint paths and
      endpoint precedence), applications need a more flexible
      interface to specify queries of resource information.

5.2.  Existing Efforts in the ALTO Working Group

   Several documents have been submitted to the ALTO working group
   with the aim of satisfying one or more of the design requirements
   discussed above.  For example, [DRAFT-PV], [DRAFT-RSA],
   [DRAFT-UNICORN-INFO] and several other documents propose and apply
   the ALTO path vector extension to provide accurate networking
   resource information to support multi-domain resource
   orchestration.  [DRAFT-NFCHAIN] proposes to use ALTO to support
   resource orchestration for multi-domain service function chaining,
   and proposes a new ALTO extension to retrieve the AS path of
   network functions across different networks.  [DRAFT-CONTEXT]
   proposes to extend the cost information specified in RFC7285 by
   providing several possible cost values for the same cost metric,
   where each value depends on qualitative criteria as opposed to
   quantitative criteria such as time.  [DRAFT-UR] proposes to use
   mathematical programming constraints as a generic representation of
   multiple resources.
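
   To make the idea of constraints as a generic resource
   representation more concrete, the following non-normative sketch
   encodes resources of different types as linear inequalities and
   checks a candidate allocation against them.  It is not the encoding
   defined in [DRAFT-UR].  The class LinearConstraint, the function
   violated, the resource labels and the "B:cpu" entry are all
   hypothetical and chosen only for illustration.  The three bandwidth
   constraints are the ones from the example in Section 3.2.2.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LinearConstraint:
    """sum(coeffs[v] * x[v]) <= bound for one resource in one network."""
    resource: str             # hypothetical label, e.g. "A:bandwidth"
    coeffs: Dict[str, float]  # variable name -> coefficient
    bound: float


def violated(constraints: List[LinearConstraint],
             allocation: Dict[str, float]) -> List[str]:
    """Return the labels of all constraints the allocation breaks."""
    broken = []
    for c in constraints:
        used = sum(k * allocation.get(v, 0.0) for v, k in c.coeffs.items())
        if used > c.bound + 1e-9:
            broken.append(c.resource)
    return broken


CONSTRAINTS = [
    # Bandwidth constraints from Section 3.2.2 (Mbps).
    LinearConstraint("A:bandwidth", {"x1": 1.0, "x2": 1.0}, 10.0),
    LinearConstraint("B:bandwidth", {"x1": 1.0}, 3.0),
    LinearConstraint("C:bandwidth", {"x2": 1.0}, 3.0),
    # Assumed compute constraint, not taken from this document.
    LinearConstraint("B:cpu", {"c1": 1.0}, 8.0),
]

print(violated(CONSTRAINTS, {"x1": 3.0, "x2": 3.0, "c1": 4.0}))  # []
print(violated(CONSTRAINTS, {"x1": 5.0, "x2": 3.0, "c1": 4.0}))
```

   Because every resource type is expressed in the same form, an
   application could in principle hand the whole set to a single
   solver or feasibility check, which is the property the second
   design requirement in Section 5.1 asks for.
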
   [DRAFT-FCS] proposes a flexible flow query extension service to
   allow applications to specify query entities based on flexible
   matching conditions (e.g., TCP/IP 5-tuple) instead of IP addresses
   only.

6.  Summary

   This document reviews several emerging multi-domain applications
   and how they can benefit from ALTO.  It then describes a generic
   framework for multi-domain applications to use ALTO to improve
   their performance.  In addition, several design requirements are
   discussed.  Though different drafts in the working group have been
   trying to address one or more of these design requirements, a
   systematic investigation of these issues is still missing.  The
   authors of this document plan to perform such an investigation and
   make a unified design proposal in the next version of this
   document.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

7.2.  Informative References

   [DRAFT-CONTEXT]
              Randriamasy, S., "ALTO Contextual Cost Values", 2017.

   [DRAFT-FCS]
              Zhang, J., Gao, K., Wang, J., Xiang, Q., and Y. Yang,
              "ALTO Extension: Flow-based Cost Query", 2017.

   [DRAFT-NFCHAIN]
              Perez, D. and C. Rothenberg, "ALTO-based Broker-assisted
              Multi-domain Orchestration", 2018.

   [DRAFT-PV]
              Bernstein, G., Lee, Y., Roome, W., Scharf, M., and Y.
              Yang, "ALTO Extension: Abstract Path Vector as a Cost
              Mode", 2015.

   [DRAFT-RSA]
              Gao, K., Wang, X., Xiang, Q., Gu, C., Yang, Y., and G.
              Chen, "A Recommendation for Compressing ALTO Path
              Vectors", 2017.

   [DRAFT-UNICORN-INFO]
              Xiang, Q., Newman, H., Bernstein, G., Du, H., Gao, K.,
              Mughal, A., Balcas, J., Zhang, J., and Y. Yang,
              "Implementation and Deployment of A Resource
              Orchestration System for Multi-Domain Data Analytics",
              2017.

   [DRAFT-UP]
              Roome, W., Chen, S., Randriamasy, S., Yang, Y., and J.
              Zhang, "Unified Properties for the ALTO Protocol", 2015.

   [DRAFT-UR]
              Xiang, Q., Le, F., and Y. Yang, "ALTO Extension: Unified
              Resource Representation", 2018.

   [RFC7285]  Alimi, R., Ed., Penno, R., Ed., Yang, Y., Ed., Kiesel,
              S., Previdi, S., Roome, W., Shalunov, S., and R. Woundy,
              "Application-Layer Traffic Optimization (ALTO) Protocol",
              RFC 7285, DOI 10.17487/RFC7285, September 2014.

   [RFC8189]  Randriamasy, S., Roome, W., and N. Schwan, "Multi-Cost
              Application-Layer Traffic Optimization (ALTO)", RFC 8189,
              DOI 10.17487/RFC8189, October 2017.

Authors' Addresses

   Qiao Xiang
   Yale University
   51 Prospect Street
   New Haven, CT
   USA

   Email: qiao.xiang@cs.yale.edu


   Franck Le
   IBM
   Thomas J. Watson Research Center
   Yorktown Heights, NY
   USA

   Email: fle@us.ibm.com


   Y. Richard Yang
   Yale University
   51 Prospect Street
   New Haven, CT
   USA

   Email: yry@cs.yale.edu