NFVRG R. Szabo
Internet-Draft A. Csaszar
Intended status: Informational Ericsson
Expires: April 26, 2015 K. Pentikousis
M. Kind
Deutsche Telekom AG
D. Daino
Telecom Italia
October 23, 2014

Unifying Carrier and Cloud Networks: Problem Statement and Challenges


Abstract

The introduction of network and service functionality virtualization in carrier-grade networks promises improved operations in terms of flexibility, efficiency, and manageability. In current practice, virtualization is controlled through orchestrator entities that expose programmable interfaces according to the underlying resource types. Typically this means the adoption of, on the one hand, established data center compute/storage and, on the other, network control APIs which were originally developed in isolation. Arguably, the possibility for innovation highly depends on the capabilities and openness of the aforementioned interfaces. This document introduces in simple terms the problems arising when one follows this approach and motivates the need for a high level of programmability beyond policy and service descriptions. This document also summarizes the challenges related to orchestration programming in this unified cloud and carrier network production environment.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on April 26, 2015.

Copyright Notice

Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

1. Introduction

To a large degree there is agreement in the network research, practitioner, and standardization communities that rigid network control limits the flexibility and manageability of speedy service creation, as discussed in [NSC] and the references therein. For instance, it is not unusual today for the average service creation cycle to exceed 90 hours, whereas, given the recent advances in virtualization and cloudification, one would be interested in service creation times on the order of minutes [EU-5GPPP-Contract] if not seconds.

Flexible service definition and creation start by formalizing the service into the concept of network function forwarding graphs, such as the ETSI VNF Forwarding Graph [ETSI-NFV-Arch] or the ongoing work in IETF [I-D.ietf-sfc-problem-statement]. These graphs represent the way in which service end-points (e.g., customer access) are interconnected with a set of selected network functionalities such as firewalls, load balancers, and so on, to deliver a network service. Service graph representations form the input for the management and orchestration to instantiate and configure the requested service. For example, ETSI defined a Management and Orchestration (MANO) framework in [ETSI-NFV-MANO]. We note that throughout such a management and orchestration framework different abstractions may appear for separation of concerns, roles or functionality, or for information hiding.
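Such a forwarding graph can be captured with a very small data model. The sketch below is an illustration only; its names and structure are invented for this document and do not correspond to the ETSI or IETF data models. It encodes NF nodes and the directed logical links of a service graph, together with a helper to walk them.

```python
# Hypothetical, minimal encoding of a VNF forwarding graph: NF type
# nodes plus directed logical links between service end-points and NFs.
# Illustration only -- not an ETSI MANO or IETF SFC schema.

service_graph = {
    "endpoints": ["sap1", "sap8"],   # service access points
    "nfs": {"NF1": "firewall", "NF2": "dpi", "NF3": "load-balancer"},
    "links": [
        ("sap1", "NF1"),   # customer access into the first NF
        ("NF1", "NF2"),    # branch towards NF2
        ("NF1", "NF3"),    # direct path to NF3
        ("NF2", "NF3"),    # rejoin the main path
        ("NF3", "sap8"),   # egress towards the far end-point
    ],
}

def successors(graph, node):
    """Return the nodes reachable from 'node' over one logical link."""
    return [dst for (src, dst) in graph["links"] if src == node]
```

A management and orchestration system would consume such a representation as its input; for example, successors(service_graph, "NF1") yields both branches leaving NF1.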

Compute virtualization is central to the concept of Network Function Virtualization (NFV). However, carrier-grade services demand that all components of the data path, such as Network Functions (NFs), virtual NFs (VNFs) and virtual links, meet key performance requirements. In this context, the inclusion of Data Center (DC) platforms, such as OpenStack [OpenStack], into the SDN infrastructure is far from trivial.

In this document we examine the problems arising as one combines these two formerly isolated environments in an effort to create a unified production environment and discuss the associated emerging challenges. Our goal is the definition of a production environment that allows multi-vendor and multi-domain operation based on open and interoperable implementations of the key entities described in the remainder of this document.

2. Terms and Definitions

We use the terms "compute" and "compute and storage" interchangeably throughout the document. Moreover, we use the following definitions, as established in [ETSI-NFV-Arch]:

Network Function Virtualization - The principle of separating network functions from the hardware they run on by using virtual hardware abstraction.
NFV Infrastructure - Any combination of virtualized compute, storage and network resources.
Virtualized Network Function - a software-based network function.
Management and Orchestration - In the ETSI NFV framework [ETSI-NFV-MANO], this is the global entity responsible for management and orchestration of the NFV lifecycle.

Further, we make use of the following terms:

Network Function (NF) - a network function, either software-based (VNF) or appliance-based.
Switch (SW) - a (routing/switching) network element with a programmable control plane interface.
Data Center (DC) - a data center network element which, in addition to a programmable control plane interface, offers a DC control interface.
Logical Switch Instance (LSI) - a software switch instance.

3. Motivation and Challenges

Figure 1 illustrates a simple service graph comprising three network functions (NFs). For the sake of simplicity, we will assume only two types of infrastructure resources, namely SWs and DCs - as per the terminology introduced above - and ignore appliance-based NFs for the time being. The goal is to implement the given service based on the available infrastructure resources.

               fr2  +---+  fr3
            +------>|NF2|------+
            |     4 +---+ 5    |
      +---+ |                  V +---+
  1   |NF1| |       fr1        | |NF3|   8
  o---|   |-+----------------->+-|   |---o
    2 +---+ 3                  6 +---+ 7

Figure 1: Service graph

The service graph definition contains NF types (NF1, NF2, NF3) along with the forwarding behavior (fr1, fr2, fr3) defined between their ports. The forwarding behavior contains classifications for matching traffic flows and the corresponding outbound forwarding actions.
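The match/action nature of such forwarding rules can be pictured with a toy classifier. The field names and rule encoding below are invented for illustration and do not correspond to any standardized flow format:

```python
# Hypothetical match/action rules in the spirit of fr1..fr3: each rule
# classifies a traffic flow and names the outbound forwarding action.

rules = [
    {"match": {"proto": "tcp", "dst_port": 80}, "action": "out:NF2"},  # like fr2
    {"match": {"proto": "tcp"},                 "action": "out:NF3"},  # like fr1
]

def forward(flow, rules):
    """Return the action of the first rule whose match fields all agree."""
    for rule in rules:
        if all(flow.get(k) == v for k, v in rule["match"].items()):
            return rule["action"]
    return "drop"   # default action when no classification matches
```

Rule order matters here, as in most flow tables: the more specific web-traffic rule must precede the catch-all TCP rule.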

Assume now that we would like to use the infrastructure (topology, network and software resources) depicted in Figure 2 and Figure 3 to implement the aforementioned service graph. That is, we have three SWs and two Points of Presence (PoPs) with DC software resources at our disposal.

                 +---+
              +--|SW3|--+
              |  +---+  |
     +---+    |         |      +---+
  1  |PoP|  +---+     +---+    |PoP|  8
  o--|DC1|--|SW2|-----|SW4|----|DC2|--o
     +---+  +---+     +---+    +---+

  <- SP1 -><------ SP2 ------><- SP3 ->

Figure 2: Infrastructure resources

   +----------+  PoP DC (== NFVI PoP)
   |  +----+  |
   |  | CN |  |
   |  +----+  |
   |   |  |   |
   |  +----+  |
 o-+--| SW |--+-o
   |  +----+  |
   +----------+

Figure 3: A virtualized Point of Presence (PoP) with software resources (Compute Node - CN)

In the simplest case, all resources could be part of the same service provider (SP) domain. Even then, each entity in Figure 2 could be procured from a different vendor, so interoperability is key for multi-vendor NFVI deployment.

Alternatively, different technologies, such as data center operation versus network operation, could result in a separation of technology domains under a single ownership (multi-technology).

We are also interested in a multi-operation environment, where roles and responsibilities are distributed according to some organizational structure within the operator. Finally, we are interested in a multi-provider environment, where different infrastructure resources are available from different service providers (SPs); Figure 2 indicates such a multi-provider split in the lower part of the figure as an example. We expect that this type of deployment will become more common in the future, as it suits the elasticity and flexibility requirements discussed in [NSC].

Figure 2 also shows the service access points corresponding to the overarching domain view, i.e., {1,8}.

In order to deploy the service graph of Figure 1 on the infrastructure resources of Figure 2, we will need an appropriate mapping which can be implemented in practice. In Figure 4 we illustrate a resource orchestrator (RO) as a functional entity whose task is to map the service graph to the infrastructure resources under some service constraints and taking into account the NF resource descriptions.

               fr2  +---+  fr3
            +------>|NF2|------+
            |     4 +---+ 5    |
      +---+ |                  V +---+
  1   |NF1| |       fr1        | |NF3|   8
  o---|   |-+----------------->+-|   |---o
    2 +---+ 3                  6 +---+ 7

 +--------+          \/        SP0
 |   NF   |   +---------------------+
 |Resource|==>|Resource Orchestrator|==> MAPPING
 | Descr. |   |      (RO)           |
 +--------+   +---------------------+

                     +---+
                  +--|SW3|--+
                  |  +---+  |
      +---+       |         |      +---+
   1  |PoP|     +---+     +---+    |PoP|  8
   o--|DC1|-----|SW2|-----|SW4|----|DC2|--o
      +---+     +---+     +---+    +---+

 <------------------- SP0 -------------------->

Figure 4: Resource Orchestrator: information base, inputs and output

NF resource descriptions are assumed to contain information necessary to map NF types to a choice of instantiable VNF flavor or a selection of an already deployed NF appliance and networking demands for different operational policies. For example, if energy efficiency is to be considered during the decision process then information related to energy consumption of different NF flavors under different conditions (e.g., network load) should be included in the resource description.
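As a deliberately simplified illustration of the RO's task (the real problem is a hard, constraint-based optimization over network and compute jointly), the sketch below performs first-fit placement of NFs onto PoPs, consulting per-NF resource descriptions reduced to a single CPU dimension. All names and numbers are invented for illustration:

```python
# First-fit placement sketch: map NF types onto PoPs using resource
# descriptions (here only a CPU demand per NF flavor). A production RO
# would also honor networking demands and operational policies.

nf_descriptions = {"NF1": {"cpu": 2}, "NF2": {"cpu": 4}, "NF3": {"cpu": 2}}
pops = {"DC1": {"cpu_free": 4}, "DC2": {"cpu_free": 8}}

def map_service(nfs, pops):
    """Return a {NF: PoP} mapping, or raise if some NF cannot be placed."""
    placement = {}
    free = {pop: res["cpu_free"] for pop, res in pops.items()}
    for nf, demand in nfs.items():
        for pop in free:
            if free[pop] >= demand["cpu"]:
                placement[nf] = pop
                free[pop] -= demand["cpu"]
                break
        else:
            raise RuntimeError(f"no capacity for {nf}")
    return placement
```

With these numbers, NF1 and NF3 land on DC1 while NF2, too large for DC1's remaining capacity, is placed on DC2; energy or load constraints from the resource descriptions would change such a decision.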

Note that we also introduce a new service provider (SP0) which effectively operates on top of the virtualized infrastructure offered by SP1, SP2 and SP3.

In order for the RO to execute the resource mapping (which in general is a hard problem) it needs to operate on the combined control plane illustrated in Figure 5. In this figure we mark clearly that the interfaces to the compute (DC) control plane and the SDN (SW) control plane are distinct and implemented through different interfaces/APIs. For example, Ic1 could be the Apache CloudStack API, while Ic2 could be a control plane protocol such as ForCES or OpenFlow [I-D.irtf-sdnrg-layer-terminology]. In this case, the orchestrator at SP0 (top part of the figure) needs to maintain a tight coordination across this range of interfaces.

                 +---------+
                 |   SP0   |
                 +---------+
           /          |          \
          /           V Ic2       \
         |       +---------+       |
     Ic1 V       |SDN Ctrl |       V  Ic3
+---------+      |   SP2   |      +---------+
|Comp Ctrl|      +---------+      |Comp Ctrl|
|  SP1    |        /  |  \        |   SP3   |
+---------+    +---   V   ----+   +---------+
     |         |    +----+    |         |
     |         |    |SW3 |    |         |
     V         |    +----+    |         V
    +----+     V   /      \   V     +----+
 1  |PoP |    +----+      +----+    |PoP |  8
 o--|DC1 |----|SW2 |------|SW4 |----|DC2 |--o
    +----+    +----+      +----+    +----+

<------------------- SP0 -------------------->

Figure 5: The RO Control Plane view

Note that in Figure 5 we denote the control plane interfaces with (line) arrows. Data plane connections use simple lines.
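The coordination burden across Ic1/Ic2/Ic3 can be sketched in code: the orchestrator must drive one client per resource type, each with its own API style, and sequence their operations itself. The two client classes below are hypothetical stand-ins, not actual CloudStack or OpenFlow bindings:

```python
# Illustration of the split control plane of Figure 5: Ic1/Ic3 talk to
# compute controllers, Ic2 to an SDN controller. Both client classes
# are invented stubs for illustration.

class ComputeClient:                  # stand-in for, e.g., a CloudStack API
    def __init__(self):
        self.vms = []
    def boot_vm(self, image):
        self.vms.append(image)
        return f"vm-{len(self.vms)}"

class SdnClient:                      # stand-in for, e.g., an OpenFlow channel
    def __init__(self):
        self.flows = []
    def install_flow(self, match, out_port):
        self.flows.append((match, out_port))

def deploy_vnf(compute, sdn, image, match, out_port):
    """The RO must sequence both APIs itself: boot the VNF, then steer
    traffic to it. Failure handling (e.g., rolling back the VM when the
    flow install fails) is also the RO's burden in this split model."""
    vm = compute.boot_vm(image)
    sdn.install_flow(match, out_port)
    return vm
```

The point of the sketch is the absence of any joint abstraction: consistency between the two controllers exists only in the orchestrator's own bookkeeping.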

In the real world, however, orchestration operations do not stop, for example, at the DC1 level as depicted in Figure 5. If we, so to speak, "zoom into" DC1, we see a similar pattern and the need to coordinate SW and DC resources within DC1, as illustrated in Figure 6. As depicted, this edge PoP includes compute nodes (CNs) and SWs, which in most cases will also form an internal topology.

In Figure 6, IcA is an interface similar to Ic2 in Figure 5, while IcB could be, for example, OpenStack Nova or similar. The Northbound Interface (NBI) to the Compute Controller is Ic1 or Ic3 as shown in Figure 5.

    +---------+
    |Comp Ctrl|
    +---------+
   IcA V          | IcB: to CNs
+---------+       V
|SDN Ctrl |    |          |  ext port
+---------+  +---+      +---+
  to|SW      |SW |      |SW |
    +->     ,+--++.._  _+-+-+
    V    ,-"   _|,,`.""-..+
       _,,,--"" |    `.   |""-.._
  +---+      +--++     `+-+-+    ""+---+
  |SW |      |SW |      |SW |      |SW |
  +---+    ,'+---+    ,'+---+    ,'+---+
  |   | ,-"  |   | ,-"  |   | ,-"  |   |
+--+ +--+  +--+ +--+  +--+ +--+  +--+ +--+
|CN| |CN|  |CN| |CN|  |CN| |CN|  |CN| |CN|
+--+ +--+  +--+ +--+  +--+ +--+  +--+ +--+

Figure 6: PoP DC Network with Compute Nodes (CN)

Even further, each single Compute Node (CN) may also have internal switching resources (see Figure 7). In a carrier environment, in order to meet data path requirements, allocation of compute-node-internal distributed resources (blades, CPU cores, etc.) may become equally important.

+-+  +-+ +-+  +-+
|V|  |V| |V|  |V|
|N|  |N| |N|  |N|
|F|  |F| |F|  |F|
+-+  +-+ +-+  +-+
|   /   /       |
+---+ +---+ +---+
|LSI| |LSI| |LSI|
+---+ +---+ +---+
  |  /        |
+---+       +---+
|NIC|       |NIC|
+---+       +---+
  |           |

Figure 7: Compute Node with internal switching resource
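To make the compute-node-internal allocation point concrete, the toy model below treats NUMA-local CPU cores and NIC locality as first-class allocatable resources. The structure and numbers are invented for illustration:

```python
# Toy model of compute-node-internal allocation: a VNF with strict data
# path requirements may need CPU cores on the same NUMA node as a NIC,
# or packets pay a cross-socket penalty.

cn = {
    "numa0": {"cores_free": 2, "nics": ["nic0"]},
    "numa1": {"cores_free": 6, "nics": []},       # cores, but no local NIC
}

def place_dataplane_vnf(cn, cores_needed):
    """Allocate cores only on a NUMA node that also has a local NIC."""
    for node, res in cn.items():
        if res["cores_free"] >= cores_needed and res["nics"]:
            res["cores_free"] -= cores_needed
            return node
    return None   # no placement meets the data path constraint
```

Note that the second placement attempt fails even though the CN as a whole has spare cores: capacity alone is not sufficient once internal topology matters.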

4. Problem Statement

The motivational example of Section 3 illustrates that compute virtualization implicitly involves network virtualization. Conversely, if one starts with an SDN network and adds compute resources to network elements, those compute resources must be assigned to some virtualized network resources before they can be offered to clients. In short, compute virtualization and network virtualization are inherently coupled. Furthermore, virtualization leads to recursion, with clients (re)defining and reselling resources and services [I-D.huang-sfc-use-case-recursive-service].

We argue that, given the multi-level virtualization of compute, storage and network domains, automating the corresponding resource provisioning requires a recursive programmatic interface. The existing, separate compute and network programming interfaces cannot express such recursion and cannot satisfy the key interoperability requirements of multi-vendor, multi-technology and multi-provider environments. Therefore we foresee the need for a recursive programmatic interface for joint compute, storage and network provisioning.
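The recursion argued for above can be sketched as a single programmatic interface that every domain exposes northbound and consumes southbound from its sub-domains. The following is a hypothetical illustration, not a proposed protocol or API:

```python
# Recursive orchestration sketch: every domain answers the same
# "instantiate" call. A leaf domain allocates from its own (joint
# compute+network) capacity; a composite domain satisfies the call by
# delegating to the first sub-domain that can, returning the delegation
# path. All names are invented.

class Domain:
    def __init__(self, name, capacity=0, subdomains=()):
        self.name, self.capacity = name, capacity
        self.subdomains = list(subdomains)

    def instantiate(self, demand):
        """Same call at every level: leaf allocates, composite recurses."""
        if not self.subdomains:                 # leaf: allocate locally
            if self.capacity >= demand:
                self.capacity -= demand
                return [self.name]
            return None
        for sub in self.subdomains:             # composite: delegate
            path = sub.instantiate(demand)
            if path is not None:
                return [self.name] + path
        return None
```

For example, an SP0 built on an SP1 leaf and an SP2 that itself wraps a DC would serve a request either locally at SP1 or via the SP0 -> SP2 -> DC path, with the same interface at every level.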

We conclude this section with two key questions which we hope will initiate the discussion in the NFVRG community for further development of the concept described in this document.

Firstly, as motivated in Section 3, orchestrating networking resources appears to have a recursive nature at different levels of the hierarchy. Would a programmatic interface based on a joint compute and network abstraction better support this recursive, constraint-based resource allocation?

Secondly, can such a joint compute, storage and network programmatic interface allow an automated resource orchestration similar to a recursive SDN architecture [ONF-SDN-ARCH]?

5. IANA Considerations

This memo includes no request to IANA.

6. Security Considerations


7. Acknowledgement

The authors would like to thank the UNIFY team for inspiring discussions and in particular Fritz-Joachim Westphal for his comments and suggestions on how to refine this draft.

This work is supported by FP7 UNIFY, a research project partially funded by the European Community under the Seventh Framework Program (grant agreement no. 619609). The views expressed here are those of the authors only. The European Commission is not liable for any use that may be made of the information in this document.

8. Informative References

[ETSI-NFV-Arch] ETSI, "Network Functions Virtualisation (NFV); Architectural Framework v1.1.1", Oct 2013.
[ETSI-NFV-MANO] ETSI, "Network Function Virtualization (NFV) Management and Orchestration V0.6.1 (draft)", Jul. 2014.
[EU-5GPPP-Contract] 5G-PPP Association, "Contractual Arrangement: Setting up a Public- Private Partnership in the Area of Advance 5G Network Infrastructure for the Future Internet between the European Union and the 5G Infrastructure Association", Dec 2013.
[I-D.huang-sfc-use-case-recursive-service] Huang, C., Zhu, J. and P. He, "SFC Use Cases on Recursive Service Function Chaining", Internet-Draft draft-huang-sfc-use-case-recursive-service-00, July 2014.
[I-D.ietf-sfc-problem-statement] Quinn, P. and T. Nadeau, "Service Function Chaining Problem Statement", Internet-Draft draft-ietf-sfc-problem-statement-00, January 2014.
[I-D.irtf-sdnrg-layer-terminology] Haleplidis, E., Pentikousis, K., Denazis, S., Salim, J., Meyer, D. and O. Koufopavlou, "SDN Layers and Architecture Terminology", Internet-Draft draft-irtf-sdnrg-layer-terminology-02, October 2014.
[NSC] John, W., Pentikousis, K., et al., "Research directions in network service chaining", Proc. SDN for Future Networks and Services (SDN4FNS), Trento, Italy IEEE, November 2013.
[ONF-SDN-ARCH] ONF, "SDN architecture", Jun. 2014.
[OpenStack] The OpenStack project, "Openstack cloud software", 2014.

Authors' Addresses

Robert Szabo Ericsson Research, Hungary Irinyi Jozsef u. 4-20 Budapest, 1117 Hungary EMail: URI:
Andras Csaszar Ericsson Research, Hungary Irinyi Jozsef u. 4-20 Budapest, 1117 Hungary EMail: URI:
Kostas Pentikousis EICT GmbH EUREF-Campus Haus 13 Torgauer Strasse 12-15 10829 Berlin, Germany EMail:
Mario Kind Deutsche Telekom AG Winterfeldtstr. 21 10781 Berlin, Germany EMail:
Diego Daino Telecom Italia Via Guglielmo Reiss Romoli 274 10148 Turin, Italy EMail: diego.daino@telecomitalia.it