NFSv4 D. Noveck, Ed.
Internet-Draft HPE
Intended status: Informational P. Shivam
Expires: February 18, 2017 C. Lever
B. Baker
August 17, 2016

NFSv4 migration: Implementation Experience and Specification Issues


The migration feature of NFSv4 provides for moving responsibility for a single filesystem from one server to another, without disruption to clients. Recent implementation experience has shown problems in the existing specification for this feature. This document discusses options to cure issues which have arisen. It also explains the choices made in updating the NFSv4.0 specification and those to be made with regard to the NFSv4.1 specification, in order to properly address migration.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on February 18, 2017.

Copyright Notice

Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

1. Introduction

This document is in the informational category, and while the facts it reports may have normative implications, any such normative significance reflects the readers' preferences. For example, we may report that the reboot of a client with migrated state results in state not being promptly cleared and that this will prevent granting of conflicting lock requests at least for the lease time, which is a fact. While it is to be expected that client and server implementers will judge this to be a situation that is best avoided, the judgment as to how pressing this issue is rests with the reader and, eventually, with the nfsv4 working group.

We do explore possible ways in which such issues can be avoided, with minimal negative effects, given that the working group has decided to address these issues, but the choice of exactly how to address these is best given effect in one or more standards-track documents and/or errata.

This document focuses on NFSv4.0, since that is where the majority of implementation experience has been. Nevertheless, there is discussion of the implications of the NFSv4.0 experience for migration in NFSv4.1, as well as discussion of other issues with regard to the treatment of migration in NFSv4.1.

2. Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

In the context of this informational document, these normative keywords will always occur in the context of a quotation, most often direct but sometimes indirect. The context will make it clear whether the quotation is from:

3. NFSv4.0 Implementation Experience

3.1. Implementation Issues

Note that the examples below reflect current experience which arises from clients implementing the recommendation to use different nfs_client_id4 id strings for different server addresses, i.e. using what is later referred to herein as the "non-uniform client-string approach."

This is simply because that is the experience implementers have had. The reader should not assume that in all cases, this practice is the source of the difficulty. It may be so in some cases but clearly it is not in all cases.

3.1.1. Failure to Free Migrated State on Client Reboot

The following sort of situation has proved troublesome:

Note here that while it seems clear to us in this example that C-XYZ and C-ABC are from the same client, the server has no way to determine the structure of the "opaque" id string. In the protocol, it really is treated as opaque. Only the client knows which nfs_client_id4 values designate the same client on a different server.
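The effect of opaque id strings on reboot handling can be illustrated with a minimal sketch. It assumes a server that keys client records by id string and that holds migrated state under "C-XYZ" alongside locally established state under "C-ABC"; the class names, the lock labels, and the single-step SETCLIENTID (with no confirmation step) are hypothetical simplifications, not part of any specification.

```python
from dataclasses import dataclass, field


@dataclass
class ClientRecord:
    id_string: bytes                 # opaque nfs_client_id4 id string
    verifier: int                    # boot verifier supplied by the client
    locks: set = field(default_factory=set)


class Server:
    def __init__(self):
        self.records = {}            # id string -> ClientRecord

    def setclientid(self, id_string, verifier):
        rec = self.records.get(id_string)
        if rec is not None and rec.verifier != verifier:
            # Same id string with a new verifier: treat as a reboot
            # and discard the state held under that id string only.
            rec = None
        if rec is None:
            rec = ClientRecord(id_string, verifier)
            self.records[id_string] = rec
        return rec


server = Server()
# State established directly with this server under "C-ABC".
server.setclientid(b"C-ABC", verifier=1).locks.add("file1")
# State that arrived by migration, still recorded under "C-XYZ".
server.records[b"C-XYZ"] = ClientRecord(b"C-XYZ", 1, {"file2"})

# The client reboots and identifies itself to this server as "C-ABC".
server.setclientid(b"C-ABC", verifier=2)
stale = sorted(l for r in server.records.values() for l in r.locks)
```

In this sketch the lock on "file2" survives the reboot, because nothing tells the server that the two id strings designate the same client.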

3.1.2. Server Reboots Resulting in a Confused Lease Situation

Further problems arise from scenarios like the following.

Note that if the client used "C" (rather than "C-ABC") as the nfs_client_id4 id string, the exact same situation would arise.

One of the first cases in which this sort of situation has resulted in difficulties is in connection with doing a SETCLIENTID for callback update.

The SETCLIENTID for callback update only includes the nfs_client_id4, assuming there can only be one such with a given nfs_client_id4 value. If there were multiple, confirmed client records with identical nfs_client_id4 id string values, there would be no way to map the callback update request to the correct client record. Apart from the migration handling specified in [RFC7530], such a situation cannot arise.

One possible accommodation for this particular issue that has been used is to add a RENEW operation along with SETCLIENTID (on a callback update) to disambiguate the client.

When the client updates the callback info to the destination, the client would, by convention, send a compound like this:

{ RENEW clientid4, SETCLIENTID nfs_client_id4,verf,cb }

The presence of the clientid4 in the compound would allow the server to differentiate among the various leases that it knows of, all with the same nfs_client_id4 value.
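As a rough sketch of how a server might use the preceding RENEW to select a lease, consider the following. The Lease type, the callback_update function, and the modeling of callback information as a string are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    clientid: int        # clientid4 associated with the lease
    id_string: bytes     # nfs_client_id4 id string
    callback: str        # callback information, modeled as a string


def callback_update(leases, id_string, new_cb, renew_clientid=None):
    """Return the lease that was updated, or None if the request is ambiguous."""
    matches = [l for l in leases if l.id_string == id_string]
    if renew_clientid is not None:
        # The clientid4 carried by RENEW narrows the candidates.
        matches = [l for l in matches if l.clientid == renew_clientid]
    if len(matches) != 1:
        return None
    matches[0].callback = new_cb
    return matches[0]


leases = [Lease(0x11, b"C", "cb-old-1"), Lease(0x22, b"C", "cb-old-2")]
ambiguous = callback_update(leases, b"C", "cb-new")
resolved = callback_update(leases, b"C", "cb-new", renew_clientid=0x22)
```

Without the clientid4, the request matches two leases and cannot be mapped to either; with it, exactly one lease is selected.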

While this would be a reasonable patch for an isolated protocol weakness, interoperable clients and servers would require that the protocol truly be updated to allow such a situation, specifically that of multiple clientid4's with the same nfs_client_id4 value. The protocol is currently designed and implemented assuming this cannot happen. We need to either prevent the situation from happening, or fully adapt to the possibilities which can arise. See Section 4 for a discussion of such issues.

3.1.3. Client Complexity Issues

Consider the following situation:

Now, instead of a clientid4 identifying a client-server pair, we have many more entities for the client to deal with. In addition, it isn't clear how new state is to be incorporated in this structure.

The limitations of the migrated state (inability to be freed on reboot) would argue against adding more such state but trying to avoid that would run into its own difficulties. For example, a single lockowner string presented under two different clientids would appear as two different entities.
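The lockowner difficulty can be shown with a small sketch, assuming a server that scopes state owners by the pair (clientid4, owner string). The register helper and the table layout are hypothetical.

```python
owners = {}


def register(clientid, owner):
    # Owners are looked up by (clientid4, owner string), so the same
    # owner string under another clientid4 creates a separate entry.
    return owners.setdefault((clientid, owner), {"locks": []})


a = register(0x11, b"lockowner-1")
b = register(0x22, b"lockowner-1")   # same string, different clientid4
c = register(0x11, b"lockowner-1")   # same string, same clientid4
```

Here a and c refer to one owner while b is a distinct one, even though the client regards all three as the same lockowner.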

Thus we have to choose between:

In any case, we have gone (in adding migration as it was described) from a situation in which

To one in which

This sort of additional client complexity is troublesome and needs to be eliminated.

3.2. Sources of Protocol Difficulties

3.2.1. Issues with nfs_client_id4 Generation and Use

In [RFC7530], the section entitled "Client ID" says:

There are two possible interpretations of the phrase "uniquely defines" in the above:

The first interpretation would make these client-strings like phone numbers (a single person can have several) while the second would make them like social security numbers.

Debate about the possible meanings of "uniquely defines" in this context is quite possible but not very helpful. The following points should be noted though:

Given the need for the server to be aware of client identity with regard to migrated state, either client-string construction rules will have to change or there will be a need to get around current issues, or perhaps a combination of these two will be required. Later sections will examine the options and propose a solution.

One consideration that may indicate that this cannot remain exactly as it has been derives from the fact that the current explanation for this behavior is not correct. In [RFC7530], the section entitled "Client ID" says:

In point of fact, a "SETCLIENTID with the same id string" sent to multiple network addresses will be treated as all from the same client but will not "cause the server to begin the process of removing the client's previous leased state" unless the server believes it is a different instance of the same client, i.e. if the id string is the same and there is a different boot verifier. If the client does not reboot, the verifier should not change. If it does reboot, the verifier will change, and it is appropriate that the server "begin the process of removing the client's previous leased state."

The situation of multiple SETCLIENTID requests received by a server on multiple network addresses is exactly the same, from the protocol design point of view, as when multiple (i.e. duplicate) SETCLIENTID requests are received by the server on a single network address. The same protocol mechanisms that prevent erroneous state deletion in the latter case prevent it in the former case. There is no reason for special handling of the multiple-network-appearance case, in this regard.
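The point can be made concrete with a minimal sketch in which the decision to discard state depends only on the id string and the boot verifier. The function below is hypothetical and folds the SETCLIENTID/SETCLIENTID_CONFIRM exchange into a single step; the addresses are documentation addresses chosen for the example.

```python
def setclientid(records, id_string, verifier, arrived_on):
    """Return True if previously leased state would be discarded.

    `arrived_on` (the server address the request reached) is accepted
    only to show that it plays no part in the decision.
    """
    previous = records.get(id_string)
    records[id_string] = verifier
    return previous is not None and previous != verifier


records = {}
first = setclientid(records, b"C", 7, "192.0.2.1")
other_addr = setclientid(records, b"C", 7, "192.0.2.2")  # same boot instance
duplicate = setclientid(records, b"C", 7, "192.0.2.2")   # duplicate request
rebooted = setclientid(records, b"C", 8, "192.0.2.1")    # new boot verifier
```

Only the last call, which carries a changed verifier, results in state being discarded.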

3.2.2. Issues with Lease Proliferation

It is often felt that this is a consequence of the client-string construction issues, and it is certainly the case that the two are closely connected in that non-uniform client-strings make it impossible for the server to appropriately combine leases from the same client.

However, even where the server could combine leases from the same client, it needs to be clear how and when it will do so, so that the client will be prepared. These issues will have to be addressed at various places in the protocol specification.

This could be enough only if we are prepared to do away with the "should" recommending non-uniform client-strings and replace it with a "should not" or even a "SHOULD NOT". Current client implementation patterns make this an unpalatable choice for use as a general solution, but it is reasonable to "RECOMMEND" this choice for a well-defined subset of clients. One alternative would be to create a way for the server to infer from client behavior which leases are held by the same client and use this information to do appropriate lease mergers. Prototyping and detailed specification work has shown that this could be done but the resulting complexity is such that a better choice is to "RECOMMEND" use of the uniform client-string approach for clients supporting the migration feature.

Because of the discussion of client-string construction in [RFC7530], most existing clients implement the non-uniform client-string approach. As a result, existing servers may not have been tested with clients implementing uniform client-strings. As a consequence, care must be taken to preserve interoperability between UCS-capable clients and servers that don't tolerate uniform client strings for one reason or another.
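The difference between the two approaches can be sketched as follows. The string formats and function names are illustrative assumptions; actual clients construct their id strings in implementation-specific ways.

```python
def non_uniform_id(client_name, server_addr):
    # A different id string for each server address.
    return f"{client_name}-{server_addr}".encode()


def uniform_id(client_name, server_addr):
    # The same id string regardless of the server address.
    return client_name.encode()


addrs = ["192.0.2.1", "198.51.100.7"]
ncs = {non_uniform_id("C", a) for a in addrs}
ucs = {uniform_id("C", a) for a in addrs}
```

With two server addresses, the non-uniform approach yields two distinct id strings and the uniform approach yields one.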

4. Issues Requiring Resolution in NFSv4.0

4.1. Changes to nfs_client_id4 Client-string

The fact that the reason given in client-string-BP3 is not valid makes the existing "should" insupportable. We can't either

What are often presented as reasons that motivate use of the non-uniform approach always turn out to be cases in which, if the uniform approach were used, the server will treat a client which accesses that server via two different IP addresses as part of a single client, as it in fact is. This may be disconcerting to a client unaware that the two IP addresses connect to the same server. This is not a reason to use the non-uniform approach but is better thought of as an illustration of the fact that those using the uniform approach need to be aware of the possibility of server trunking and its potential effect on server behavior.

Since it is possible to reliably infer the existence of trunking of server IP addresses from observed server behavior, use of the uniform approach is more desirable, although compatibility issues need to be dealt with.

An alternative to having the client infer the existence of trunking of server IP addresses is to make this information available to the client directly. See Section 4.3 for details.

It is always possible that a valid new reason will be found, but so far none has been proposed. Given the history, the burden of proof was on those asserting the validity of a proposed new reason.

So we will assume that the "should" needs to go. The question was what to replace it with.

4.2. Changes to Handle Differing nfs_client_id4 String Values

Given the difficulties caused by having different nfs_client_id4 client-string values for the same client, we had two choices:

4.3. Potential Changes to Add a New Operation

It might be possible to return server-identity information to the client, just as is done in NFSv4.1 by the response to the EXCHANGE_ID operation. This could be done by a SETCLIENTID_PLUS optional operation, which acts like SETCLIENTID, except that it returns server identity information. Such information could be used by clients, making it possible for them to be aware of server trunking relationships, rather than having to infer them from server behavior.
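As a purely illustrative sketch of how a client might use such information, the following groups server addresses by the identity each one reports. Since SETCLIENTID_PLUS is only a possibility discussed here, the form of the identity value (an opaque string) and the function name are assumptions.

```python
from collections import defaultdict


def trunking_groups(identity_by_addr):
    """Group server addresses by the server identity each reported."""
    groups = defaultdict(list)
    for addr, identity in identity_by_addr.items():
        groups[identity].append(addr)
    return {identity: sorted(addrs) for identity, addrs in groups.items()}


reported = {
    "192.0.2.1": "server-A",
    "192.0.2.2": "server-A",
    "198.51.100.7": "server-B",
}
groups = trunking_groups(reported)
```

Addresses that land in the same group would be treated by the client as leading to the same server.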

It has been generally thought that protocol extensions such as this are not appropriate in bis documents and other documents updating NFSv4 protocol definition RFC's. However, [NFSv4-vers] discusses means by which protocol extensions, similar to those allowed between minor versions, could be used to correct protocol mistakes.

A decision to adopt this approach would require waiting for [NFSv4-vers] to become a Proposed Standard. In view of the time necessary for that to happen, this approach was not available in an RFC updating [RFC7530], such as [RFC7931]. Still, it is worth keeping in mind, if implementers have difficulties inferring trunking relationships using the techniques discussed there.

4.4. Other Issues Within Migration-state Sections

There are a number of issues where the existing text is unclear and/or wrong and needs to be fixed in some way.

4.5. Issues Within Other Sections

There are a number of cases in which certain sections, not specifically related to migration, require additional clarification. This is generally because text that is clear in a context in which leases and clientids are created in one place and live there forever may need further refinement in the more dynamic environment that arises as part of migration.

Some examples:

5. Resolution of NFSv4.0 Protocol Difficulties

This section lists the changes that were necessary to resolve the difficulties mentioned above. Such changes, along with other clarifications found to be desirable during drafting and review are contained in [RFC7931].

5.1. Changes Regarding nfs_client_id4 Client-string

It was decided to replace client-string-BP3 with the following text:

In addition, given the importance of the issue of client identity and the fact that both client-string approaches are to be considered valid, a greatly expanded treatment of client identity was desirable. It had the following major elements.

5.2. Changes Regarding Merged (vs. Synchronized) Leases

In [RFC7530], the section entitled "Migration and State" says:

There are a number of problems with this and any resolution of our difficulties must address them somehow.

To avoid client complexity, we need to have no more than one lease between a single client and a single server. This requires merger of leases since there is no real help from synchronizing them at a single instant.

For the uniform approach, the destination server would simply merge leases as part of state transfer, since two leases with the same nfs_client_id4 values must be for the same client.
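A minimal sketch of such a merger follows, assuming that each lease's state is a set of labels keyed by the nfs_client_id4 value, modeled here as a (verifier, id string) tuple. The data layout and names are hypothetical.

```python
def merge_leases(existing, transferred):
    """Return the merged state, keyed by nfs_client_id4 value."""
    merged = {key: set(state) for key, state in existing.items()}
    for key, state in transferred.items():
        # Equal nfs_client_id4 values are taken to be the same client,
        # so the transferred state joins the existing lease.
        merged.setdefault(key, set()).update(state)
    return merged


c = (1, b"C")    # (verifier, id string), client C
d = (5, b"D")    # a client not previously known to the destination
destination = {c: {"open fs1:/a"}}
arriving = {c: {"open fs2:/b"}, d: {"lock fs2:/c"}}
merged = merge_leases(destination, arriving)
```

After the merger, client C holds a single lease on the destination covering state for both file systems.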

We have made the following decisions as far as proposed normative statements regarding state merger. They reflect the facts that we want to allow full migration support in the simplest way possible and that we can't say MUST since we have older clients and servers to deal with.

If servers obey the SHOULD and clients choose to adopt the uniform id approach, having more than a single lease for a given client-server pair will be a transient situation, cleaned up as part of adapting to use of migrated state.

Since clients and servers will be a mixture of old and new and because nothing is a MUST we have to ensure that no combination will show worse behavior than is exhibited by current (i.e. old) clients and servers.

5.3. Other Changes to Migration-state Sections

5.3.1. Changes Regarding Client ID Migration

In [RFC7530], the section entitled "Migration and State" says:

This poses some difficulties, mostly because the part about "client ID" is not clear:

We have decided that it is best to address this issue as follows:

5.3.2. Changes Regarding Callback Re-establishment

In [RFC7530], the section entitled "Migration and State" says:

The above will need to be fixed to reflect the possibility of merging of leases,


In [RFC7530], the section entitled "Notification of Migrated Lease" says:

There is a lack of clarity that is prompted by ambiguity about what exactly probing is and what the interlock between client and server must be. This has led to some worry about the scalability of the probing process, and although the time required does scale linearly with the number of filesystems that the client may have state for with respect to a given server, the actual process can be done efficiently.

To address these issues, the text above had to be rewritten to be more clear and to give suggestions about how to do the required scanning efficiently.
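The linear scaling of probing can be seen in a short sketch. It assumes that a probe is some inexpensive request directed at a file system and that a migrated file system answers with NFS4ERR_MOVED; the probe callable, the fake server behavior, and the paths are hypothetical.

```python
def find_migrated(filesystems, probe):
    """Probe each file system once and return those reported as moved."""
    return [fs for fs in filesystems if probe(fs) == "NFS4ERR_MOVED"]


moved = {"/export/b"}
calls = []


def fake_probe(fs):
    # Stand-in for a real request to the server.
    calls.append(fs)
    return "NFS4ERR_MOVED" if fs in moved else "NFS4_OK"


found = find_migrated(["/export/a", "/export/b", "/export/c"], fake_probe)
```

One probe is issued per file system for which the client may hold state, so the cost grows with that count and not with the amount of state held.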

5.4. Changes to Other Sections

5.4.1. Callback Update

Some changes are necessary to reduce confusion about the process of callback information update and in particular to make it clear that no state is freed as a result:

5.4.2. clientid4 Handling

To address both of the clientid4-related issues mentioned in Section 4.5, it was necessary to replace the last three paragraphs of the section entitled "Client ID" with the following:

5.4.3. Handling of NFS4ERR_CLID_INUSE

It appears to be the intention that only a single principal be used for client establishment between any client-server pair. However:

As a result, servers exist which reject a SETCLIENTID simply because there already exists a clientid for the same client, established using a different IP address. Although this is generally understood to be erroneous, such servers still exist and the spec should make the correct behavior clear.
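The intended behavior can be sketched as a check that considers only the principal, never the address on which the request arrived. The function and the table of established principals are hypothetical, and the confirmation step is omitted.

```python
def check_setclientid(established, id_string, principal, server_addr):
    """Return the status a server would give for a SETCLIENTID."""
    owner = established.get(id_string)
    if owner is not None and owner != principal:
        # The id string is in use by a different principal.
        return "NFS4ERR_CLID_INUSE"
    # `server_addr` is deliberately ignored: a different address
    # alone is not a reason to reject the request.
    established[id_string] = principal
    return "NFS4_OK"


established = {}
r1 = check_setclientid(established, b"C", "host/c@EXAMPLE", "192.0.2.1")
r2 = check_setclientid(established, b"C", "host/c@EXAMPLE", "192.0.2.2")
r3 = check_setclientid(established, b"C", "host/other@EXAMPLE", "192.0.2.1")
```

The second request, which differs from the first only in the server address used, is accepted; only the request from a different principal is refused.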

Although the error name cannot be changed, the following changes should be made to avoid confusion:

6. Issues for NFSv4.1

Because NFSv4.1 embraces the uniform client-string approach, as advised by section 2.4 of [RFC5661], addressing migration issues is simpler.

Nevertheless, there are some issues that will have to be addressed. Some examples:

Discussion of how to resolve these issues will appear in the sections below.

6.1. Addressing state merger in NFSv4.1

The existing treatment of state transfer in [RFC5661] has similar problems to that in [RFC7530] in that it assumes that the state for multiple filesystems on different servers will not be merged so that it appears under a single common clientid. We've already seen the reasons that this is a problem, with regard to NFSv4.0.

Although we don't have the problems stemming from the non-uniform client-string approach, there are a number of complexities in the existing treatment of state management in the section entitled "Lock State and File System Transitions" in [RFC5661] that make this non-trivial to address:

6.2. Addressing pNFS relationship with migration

This is made difficult because, within the pNFS framework, migration might mean any of several things:

Migration needs to support both the first and last of these models.

6.3. Addressing server owner changes in NFSv4.1

Section 2.10.5 of [RFC5661] states the following.

While this paragraph is literally true in that such reconfiguration events can happen and clients have to deal with them, it is confusing in that it can be read as suggesting that clients have to deal with them without disruption, which in general is impossible.

A clearer alternative would be:

7. Security Considerations

With regard to NFSv4.0, the Security Considerations section of [RFC7530] encourages clients to protect the integrity of the SECINFO operation and of any GETATTR operation for the fs_locations attribute. A needed change is to include the operations SETCLIENTID/SETCLIENTID_CONFIRM among those for which integrity protection is recommended. A migration recovery event can use any or all of these operations.

With regard to NFSv4.1, the Security Considerations section of [RFC5661] takes proper care of migration-related issues. No change is needed.

8. IANA Considerations

This document does not require actions by IANA.

9. Acknowledgements

The editor and authors of this document gratefully acknowledge the contributions of Trond Myklebust of NetApp and Robert Thurlow of Oracle. We also thank Tom Haynes of NetApp and Spencer Shepler of Microsoft for their guidance and suggestions.

Special thanks go to members of the Oracle Solaris NFS team, especially Rick Mesta and James Wahlig, for their work implementing an NFSv4.0 migration prototype and identifying many of the issues documented here.

10. References

10.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.
[RFC5661] Shepler, S., Eisler, M. and D. Noveck, "Network File System (NFS) Version 4 Minor Version 1 Protocol", RFC 5661, DOI 10.17487/RFC5661, January 2010.
[RFC7530] Haynes, T. and D. Noveck, "Network File System (NFS) Version 4 Protocol", RFC 7530, DOI 10.17487/RFC7530, March 2015.
[RFC7931] Noveck, D., Shivam, P., Lever, C. and B. Baker, "NFSv4.0 Migration: Specification Update", RFC 7931, DOI 10.17487/RFC7931, July 2016.

10.2. Informative References

[NFSv4-vers] Noveck, D., "NFSv4 Version Management", Work in Progress, July 2016.

Authors' Addresses

David Noveck (editor) Hewlett Packard Enterprise 165 Dascomb Road Andover, MA 01810 US Phone: +1 978 474 2011 EMail:
Piyush Shivam Oracle Corporation 5300 Riata Park Ct. Austin, TX 78727 US Phone: +1 512 401 1019 EMail:
Charles Lever Oracle Corporation 1015 Granger Avenue Ann Arbor, MI 48104 US Phone: +1 248 614 5091 EMail:
Bill Baker Oracle Corporation 5300 Riata Park Ct. Austin, TX 78727 US Phone: +1 512 401 1081 EMail: