Internet Research Task Force (IRTF) S. Lee
Internet-Draft ETRI
Intended status: Informational S. Pack
Expires: January 7, 2016 KU
M-K. Shin
ETRI
E. Paik
KT
R. Browne
Intel
July 6, 2015

Resource Management in Service Chaining
draft-irtf-nfvrg-resource-management-service-chain-01

Abstract

This document specifies problem definition and use cases of NFV resource management in service chaining for path optimization, traffic optimization, failover, load balancing, etc. It further describes design considerations and relevant framework for the resource management capability that dynamically creates and updates network forwarding paths (NFPs) considering resource constraints of NFV infrastructure.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on January 7, 2016.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


Table of Contents

1. Introduction

Network Functions Virtualisation (NFV) [ETSI-NFV-WHITE] offers a new way to design, deploy and manage network services. The network service can be composed of one or more network functions and NFV relocates the network functions from dedicated hardware appliances to generic servers, so they can run in software. Using these virtualized network functions (VNFs), one or more VNF forwarding graphs (VNF-FGs; a.k.a. service chains) can be associated to the network service, each of which describes a network connectivity topology, by referencing VNFs and Virtual Links that connect them. One or more network forwarding paths (NFPs) can be built on top of such a topology, each defining an ordered sequence of VNFs and Virtual Links to be traversed by traffic flows matching certain criteria.

The network service is instantiated by allocating NFVI resources for VNFs and VLs which constitute the VNF-FGs. Thus, the capacity and performance of the network service depends on the state and attributes of the resources used for the VNF instances. While this brings a similar problem to the VM placement optimization in a cloud computing environment, it differs as one or more VNF instances are interconnected for a single network service. For example, if one of the VNF instances in the VNF-FG gets failed or overloaded, the whole network service also gets affected. Thus, the VNF instances need to be carefully placed during NS instantiation considering their connectivity within NFPs. They also need to be monitored and dynamically migrated or scaled at run-time to adapt to changes in the resources.

The resource placement problem in VNF-FGs matters not only to the performance and capacity of network services but also to the optimized use of NFVI resources. For example, if processing and bandwidth burden converges on the VNF instances placed in a specific NFVI-PoP, it may result in scalability problem of the NFV infrastructure. Thus care is encouraged to be taken in distributing load across local and external VNF instances at run-time.

This document addresses resource management problem in service chaining to optimize the NS performance and NFVI resource usage. It provides the relevant use cases of the resource management such as traffic optimization, failover, load balancing and further describes design considerations and relevant framework for the resource management capability that dynamically creates and updates NFP instances considering NFVI resource states for VNF instances and VL instances.

Note that this document mainly focuses on the resource management capability based on the ETSI NFV framework [ETSI-NFV-ARCH] but also studies contribution points to the work for control plane of SFC architecture [I-D.ietf-sfc-architecture] [I-D.ietf-sfc-architecture] [I-D.ww-sfc-control-plane].

2. Terminology

This document uses the following terms and most of them were reproduced from [ETSI-NFV-TERM].

3. Resource management in service chain

3.1. Resource scheduling among network services

In the NFV framework, network services are realized with NS instantiation procedures at which virtualized NFVI resources are assigned to the VNFs and VLs which constitute VNF-FGs of the network service. The NFVI resources are placed and located along the VNF-FG by NFV Orchestrator (NFVO) dynamically according to:

In order to satisfy the deployment templates and resource policies, VNF-FGs of the network services need to be built by considering the state of NFVI resources for VNF instances (e.g., availability, throughput, load, disk usage) and VL instances (e.g., bandwidth, delay, delay variation, packet loss).

However, since the NFVI resources are shared by different network services and their deployment constraints are very different from each other, it is required to carefully schedule the NFVI resources for multiple network services to optimize their KPIs.

3.2. Performance coupling within a service chain

In NFV, a network service is composed of one or more virtualized network functions which are connected via virtual links along NFPs specified for a traffic flow for the network service. Thus, the performance of a network service is determined by the performance and capability of a coupling of VNF instances and VL instances. For example, if one of the VNF instances or VL instances of an NFP gets failed or overloaded, the whole network service also gets affected. Thus, the VNF instances need to be carefully placed during NS instantiation considering their connectivity within NFPs.

This performance coupling can be handled by considering deployment rules for affinity/anti-affinity, geography, or topological locations of VNFs; and QoS of virtual links.

Another important factor for virtual links is the inter-connectivity between different NFVI-PoPs, which is an enabler of resource sharing among different NFVI-PoPs. When the VNF instances of a network service are allocated at different NFVI-PoPs, the NFVI-PoP interconnect may be a bottleneck point which needs to be monitored to support KPIs of the service chain.

3.3. Multiple policies and conflicts

The NFVI resources for a network service should be allocated and managed according to a NS policy given in the network service descriptor (NSD), which describes how to govern NFVI resources for VNF instances and VL instances to support KPIs of the network service. The examples of NS policy are affinity/anti-affinity, scaling, fault and performance, geography, regulatory rules, NS topology, etc. Since network services may have different NS policies for their own deployment and performance, this may cause resource management difficult within the shared NFVI resources.

For network-wide (or NS-wide) resource management, NFVI policy (or network policy) can be also provided. It may describe the resource management policy for optimized use of infrastructure resources rather than the performance of a single network service. The examples of NFVI policy are NFVI resource access control, reservation and/or allocation policies, placement optimization based on affinity and/or anti-affinity rules, geography and/or regulatory rules, resource usage, etc.

Multiple administrative domains or subsystems may have different NFVI policies so that it may bring some conflicts when enforcing them in a global infrastructure. There could be a similar problem among NS policies and NFVI policies.

Note that the similar topics are being studied in [I-D.norival-nfvrg-nfv-policy-arch]

3.4. Dynamic adaptation of service chains

The performance and capability of NFVI resources may vary in time due to different uses and management policies of the resources. If some changes in the resources make the service quality unacceptable, the VNF instances can be scaled according to the given auto-scaling policies. But it's only for local quality of the VNF.

In order to provide optimized KPIs to network services, the NFP instances need to dynamically adapt to the changes of the resource state at run-time. The performance of the whole NFP instance should be measured by monitoring the resource state of VNF instances and VL instances. Based on the monitoring results, some VNF instances may be determined and relocated at different virtualized resources with better performance and capabilities.

4. Use cases

In this section, several (but not exhausted) use cases for resource management in service chaining are provided: fail-over, load balancing, path optimization, traffic optimization, and energy efficiency.

4.1. Fail-over

For service continuity, failure of a VNF instance needs to be detected and the failed one needs to be replaced with the other one which is available to use as per redundancy policy. Figure 1 presents an example of the fail-over use case. A network service is defined as a chain of VNF-A and VNF-B; and the service chain is instantiated with VNF-A1 and VNF-B1 which are instances of VNF-A and VNF-B, respectively. In the meantime, failure of VNF-B1 is detected so that VNF-B2 replaces the failed one for fail-over of the NFP.

                                                             
                +--------+                        +--------+   
                | VNF-B2 |                       #| VNF-B2 |###
   +--------+   +--------+           +--------+ # +--------+   
###| VNF-A1 |       _|_           ###| VNF-A1 |#      _|_      
   +--------+      (___)             +--------+      (___)     
  ___/    #       /     \      \    ___/            /     \    
 (___)+---#------+       +   ===}  (___)+----------+       +   
          #       \ ___ /      /                    \ ___ /    
          #        (___)                             (___)     
          #          |                                 |
          #     +--------+                        +--------+   
          ######| VNF-B1 |###        (failure)--> | VNF-B1 |   
                +--------+                        +--------+   

### NFP

Figure 1: A fail-over use case

The above is in the case where there is a 1+1 or 1:N redundancy scheme. In event that VNF instance overloads before NFVO has time to scale out, or when resources do not permit a scale-out then we can route the service chain deterministically to a remote VNF instance. This adaptation may be revertive or non-revertive dependent on service provider policy and resource availability.

4.2. Load balancing

A single VNF instance may be a bottleneck point of a service chain due to its overload. It may affect the performance of the whole service chain consequently so that an NFP instance needs to be built to avoid bottleneck points or maintained to distribute workloads of overloaded VNF instances.

With NFVI-PoP Interconnect, service chains can be balanced between NFVI-PoPs in a way that best utilises NFV infrastructure and ensures service integrity. The wide area conditions can be monitored in real-time to provide KPIs, such as BW, delay, delay variation and packet loss per QoS class to the service chaining application which may enable use of external VNF instances when there is an overload or failure condition in the local NFVI-PoP. In this way the service chaining application can make a service chain reroute decision (in the event of failure and/or overload) that is network and platform-aware. The service chaining application understands the state of external VNFs and WAN conditions per QoS class between the local NFVI-PoP and remote NFVI-PoP in real-time.

4.3. Path optimization

Traffic for a network service traverses all of the VNF instances and the connecting VL instances given by a NFP instance to reach a target end point. Thus, quality of the network service depends on the resource constraints (e.g., processing power, bandwidth, topological locations, latency) of VNF instances, VL instances including NFVI-PoP interconnects. In order to optimize the path of the network service, the resource constraints of VNF instances and VLs need to be considered at constructing NFPs. Since the resource state may vary in time during the service, NFP instances also need to adapt to the changes of resource constraints of the VNF instances and VL instances by monitoring and replacing them at run-time.

4.4. Traffic optimization

A network operator may provide multiple network services with different VNF-FGs and different flows of traffic traverse between source and destination end-points along the VNF-FGs. For efficiency of resource usage, the NFP instances need to be built by default to localize the traffic flows and to avoid processing and network bottlenecks. It is only in the case of local failure or overload (whereby the NFVO is unable or has not completed a scale-out of on-site resources) that NFP instances would be constructed between NFVI-PoPs. In this case, multiple VNF instances of different NFVI-PoPs need to be considered together at constructing a new NFP instance or adapting one.

4.5. Energy efficiency

Energy efficiency in the network is getting important to reduce impact on the environment so that energy consumption of VNF instances using NFVI resources (e.g., compute, storage, I/O) needs to be considered at NFP instantiation or adaptation. For example, a NFP can be instantiated as to make traffic flows aggregated into a limited number of VNF instances as much as its performance is preserved in a certain level. Policy may vary between centralized or distributed NFV applications, and could include policies for even energy distribution between sites, time-of-day etc.

5. Framework

To support the aforementioned use cases, it is required to support resource management capability which provides service chain (or NFP) construction and adaptation by considering resource state or constraints of VNF instances and VL instances which connect them. The resource management operations for service chain construction and adaptation can be divided into several sub-actions:

As listed above, VNF instances are relocated according to monitoring or evaluation results of performance metrics of the VNF instances and VL instances. Studies about evaluation methodologies and performance metrics for VNF instances and NFVI resources can be found at [ETSI-NFV-PER001] [I-D.liu-bmwg-virtual-network-benchmark] [I-D.morton-bmwg-virtual-net]. The performance metrics of VNF instances and VL instances specific to service chain construction and adaptation can be defined as follows:

The resource management functionality for dynamic service chain adaptation takes role of NFV orchestration with support of VNF manager (VNFM) and Virtualised Infrastructure Manager (VIM) in the NFV framework [ETSI-NFV-ARCH]. Detailed functional building block and interfaces are still under study.

6. Applicability to SFC

6.1. Related works in IETF SFC WG

IETF SFC WG provides a new service deployment model that delivers the traffic along the predefined logical paths of service functions (SFs), called service function chains (SFCs) with no regard of network topologies or transport mechanisms. Basic concept of the service function chaining is similar to VNF-FG where a network service is composed of SFs and deployed by making traffic flows traversed instances of the SFs in a pre-defined order.

There are several works in progress in IETF SFC WG for resource management of service chaining. [I-D.ietf-sfc-architecture] defines SFC control plane that selects specific SFs for a requested SFC, either statically or dynamically but details are currently outside the scope of the document. There are other works [I-D.ww-sfc-control-plane] [I-D.lee-sfc-dynamic-instantiation] [I-D.krishnan-sfc-oam-req-framework] [I-D.aldrin-sfc-oam-framework] which define the control plane functionality for service function chain construction and adaptation but details are still under study. While [I-D.dunbar-sfc-fun-instances-restoration] and [I-D.meng-sfc-chain-redundancy] provide detailed mechanisms of service chain adaptation, they focus only on resilience or fail-over of service function chains.

6.2. Integration in SFC control-plane architecture

In SFC WG, [I-D.ww-sfc-control-plane] describes requirements for conveying information between Service Function Chaining (SFC) control elements (including management components) and SFC functional elements. It also identifies a set of control interfaces to interact with SFC-aware elements to establish, maintain or recover service function chains.


                +----------------------------------------------+ 
                |                                              |
                |       SFC  Control & Management Planes       |
        +-------|                                              |
        |       |                                              |
        C1      +------^-----------^-------------^-------------+
 +---------------------|C3---------|-------------|-------------+
 |      |            +----+        |             |             |
 |      |            | SF |        |C2           |C2           |
 |      |            +----+        |             |             |
 | +----V--- --+       |           |             |             |
 | |   SFC     |     +----+      +-|--+        +----+          |
 | |Classifier |---->|SFF |----->|SFF |------->|SFF |          |
 | |   Node    |<----|    |<-----|    |<-------|    |          |
 | +-----------+     +----+      +----+        +----+          |
 |                     |           |              |            |
 |                     |C2      -------           |            |
 |                     |       |       |     +-----------+ C4  |
 |                     V     +----+ +----+   | SFC Proxy |-->  |
 |                           | SF | |SF  |   +-----------+     |
 |                           +----+ +----+                     |
 |                             |C3    |C3                      |
 |  SFC Data Plane Components  V      V                        |
 +-------------------------------------------------------------+

Figure 2: SFC control plane overview

The service chain adaptation addressed in this document may be integrated into the SFC Control & Management Planes and may use the C2 and C4 interfaces for monitoring or collecting the resource constraints of VNF instances, NFVI-PoP interconnects and VL instances.

To prevent constant integration between the application and probing functions we would propose a 3-tier architecture per NFVI-PoP.

  • Top level application control at the SFC Control & Management Planes
  • An abstraction layer between the application layer and the probing layer. This would decouple NFVI and link monitoring methods from the application layer
  • A probing layer that monitors VNF, physical and virtual link resources

Note that SFC does not assume that Service Functions are virtualized. Thus, the parameters of resource constraints may differ, and it needs further study for integration.

7. Security Considerations

TBD.

8. IANA Considerations

TBD.

9. References

9.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

9.2. Informative References

[ETSI-NFV-ARCH] ETSI, "ETSI NFV Architectural Framework v1.1.1", October 2013.
[ETSI-NFV-MANO] ETSI, "Network Function Virtualization (NFV) Management and Orchestration V0.6.3", October 2014.
[ETSI-NFV-PER001] ETSI, "Network Function Virtualization: Performance and Portability Best Practices v1.1.1", June 2014.
[ETSI-NFV-TERM] ETSI, "NFV Terminology for Main Concepts in NFV", October 2013.
[ETSI-NFV-WHITE] ETSI, "NFV Whitepaper 2", October 2013.
[I-D.aldrin-sfc-oam-framework] Aldrin, S., Krishnan, R., Akiya, N., Pignataro, C. and A. Ghanwani, "Service Function Chaining Operation, Administration and Maintenance Framework", Internet-Draft draft-aldrin-sfc-oam-framework-01, October 2014.
[I-D.dunbar-sfc-fun-instances-restoration] Dunbar, L. and A. Malis, "Framework for Service Function Instances Restoration", Internet-Draft draft-dunbar-sfc-fun-instances-restoration-00, April 2014.
[I-D.ietf-sfc-architecture] Halpern, J. and C. Pignataro, "Service Function Chaining (SFC) Architecture", Internet-Draft draft-ietf-sfc-architecture-09, June 2015.
[I-D.krishnan-sfc-oam-req-framework] Krishnan, R., Ghanwani, A., Gutierrez, P., Lopez, D., Halpern, J., Kini, S. and A. Reid, "SFC OAM Requirements and Framework", Internet-Draft draft-krishnan-sfc-oam-req-framework-00, July 2014.
[I-D.lee-sfc-dynamic-instantiation] Lee, S., Pack, S., Shin, M. and E. Paik, "SFC dynamic instantiation", Internet-Draft draft-lee-sfc-dynamic-instantiation-01, October 2014.
[I-D.liu-bmwg-virtual-network-benchmark] Liu, V., Liu, D., Mandeville, B., Hickman, B. and G. Zhang, "Benchmarking Methodology for Virtualization Network Performance", Internet-Draft draft-liu-bmwg-virtual-network-benchmark-00, July 2014.
[I-D.meng-sfc-chain-redundancy] Wu, C., Meng, W. and C. Wang, "Redundancy Mechanism for Service Function Chains", Internet-Draft draft-meng-sfc-chain-redundancy-01, December 2014.
[I-D.morton-bmwg-virtual-net] Morton, A., "Considerations for Benchmarking Virtual Network Functions and Their Infrastructure", Internet-Draft draft-morton-bmwg-virtual-net-03, February 2015.
[I-D.norival-nfvrg-nfv-policy-arch] Figueira, N., Krishnan, R., Lopez, D. and S. Wright, "Policy Architecture and Framework for NFV Infrastructures", Internet-Draft draft-norival-nfvrg-nfv-policy-arch-04, June 2015.
[I-D.ww-sfc-control-plane] Li, H., Wu, Q., Boucadair, M., Jacquenet, C., Haeffner, W., Lee, S., Parker, R., Dunbar, L., Malis, A., Halpern, J., Reddy, T. and P. Patil, "Service Function Chaining (SFC) Control Plane Components & Requirements", Internet-Draft draft-ww-sfc-control-plane-06, June 2015.

Authors' Addresses

Seungik Lee ETRI 218 Gajeong-ro Yuseung-Gu Daejeon, 305-700 Korea Phone: +82 42 860 1483 EMail: seungiklee@etri.re.kr
Sangheon Pack Korea University 145 Anam-ro, Seongbuk-gu Seoul, 136-701 Korea Phone: +82 2 3290 4825 EMail: shpack@korea.ac.kr
Myung-Ki Shin ETRI 218 Gajeong-ro Yuseung-Gu Daejeon, 305-700 Korea Phone: +82 42 860 4847 EMail: mkshin@etri.re.kr
EunKyoung Paik KT 17 Woomyeon-dong, Seocho-gu Seoul, 137-792 Korea Phone: +82 2 526 5233 EMail: eun.paik@kt.com
Rory Browne Intel Dromore House, East Park Shannon, Co. Clare Ireland Phone: +353 61 477400 EMail: rory.browne@intel.com