INTERNET-DRAFT                                        C. Charles Fan
Expires: September 2005                                   Rainfinity
                                                         Dave Noveck
                                                   Network Appliance
                                                         Mario Wurzl
                                                                 EMC
                                                          March 2005


              NFSv4 Global Namespace Problem Statement
               draft-fan-nfsv4-global-namespace-00.txt

Status of this Memo

   By submitting this Internet-Draft, I certify that any applicable
   patent or other IPR claims of which I am aware have been disclosed,
   or will be disclosed, and any of which I become aware will be
   disclosed, in accordance with RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The Internet Society (2004). All Rights Reserved.

Abstract

   NFS is one of the primary data access protocols for NAS, and
   naturally NFS users have been demanding a global namespace for NFS.
   This document explains the rationale for a global namespace, why it
   is an important feature for a network file system protocol, and the
   problems that a global namespace for files would solve.

Table of Contents

   1.  Introduction
   2.  The Applications
   3.  The Requirements
   4.  NFSv4 Global Namespace
   5.  Suggested Approach
   Acknowledgements
   Normative References
   Informative References
   Author's Address
   Full Copyright Statement

1. Introduction

   During recent years, the range and complexity of Network Attached
   Storage (NAS) deployments have increased greatly. Such trends as
   grid computing, virtualization, and information lifecycle management
   have helped to drive these wider, more complex deployments. As a
   result, the need for tools that make these more complex uses of NAS
   technology manageable has brought forth a new set of requirements
   for what NAS users need to have in their environments.

   A global namespace for files has emerged as one common requirement
   from NAS users. NFS is one of the primary data access protocols for
   NAS, and naturally NFS users have been demanding a global namespace
   for NFS [Thurlow].

   This document explains the rationale for a global namespace, why it
   is an important feature for a network file system protocol, and the
   problems that a global namespace for files would solve.

   Fundamentally, how a network file-system object is named for access
   should be independent of where it is placed for storage. A global
   namespace is a logical organization of file-system objects for user
   access. Users expect it to be uniform, location independent, and
   transparent to location changes; it is typically hierarchical. The
   creators and owners of file-system objects are the ones who decide
   how these objects should be named and where they should be found for
   access within the hierarchy of the global namespace. Administrators,
   on the other hand, are the ones who decide where the files should be
   physically stored.
   In a large enterprise storage environment, there are a large number
   of NAS storage nodes, often distributed among multiple geographical
   sites. The files may be distributed across the enterprise to satisfy
   a number of different goals, including optimizing cost, optimizing
   performance, and providing greater convenience of access and greater
   availability of the NAS data.

   A global namespace provides a unified view for data access,
   independent of where data is physically placed. This namespace
   consists of mappings between names in the namespace and their
   corresponding physical locations, and acts as a road map to guide
   users to the locations of the data they seek. This namespace is
   uniform and consistent for all users who access the data, and its
   presentation to the clients remains the same even when mappings
   change.

2. The Applications

   This section lists a few applications that would benefit from a
   global namespace.

2.1 NAS Storage Virtualization

   As NAS storage infrastructure scales up, each data site may house a
   large number of NAS storage units in a data center, and it is common
   to have multiple data centers in an enterprise. It is desirable that
   the data stored by these NAS units be managed as a single coherent
   space, rather than as a large number of disjoint data islands.

   A global namespace is an important enabler for NAS storage
   virtualization. It provides a single-system view for data access
   into the enterprise-wide NAS storage. A global namespace should be
   reconfigurable online: when the physical location of some data
   changes, the global namespace mapping is updated and users continue
   accessing the data without being interrupted. With a global
   namespace in place, a virtualization solution can dynamically
   balance capacity among the physical NAS units behind the scenes,
   without affecting user access.
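
   The mapping idea described above can be sketched in a few lines of
   Python. This is only an illustration under stated assumptions: the
   names LocationTable, Location, set_mapping and resolve, the server
   names used with them, and the longest-prefix lookup rule are all
   hypothetical and are not defined by NFSv4 or by this document.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Location:
    """Hypothetical physical location: a server plus a path on it."""
    server: str
    path: str


@dataclass
class LocationTable:
    """Illustrative table linking namespace prefixes to locations."""
    entries: dict = field(default_factory=dict)

    def set_mapping(self, prefix: str, location: Location) -> None:
        # Adding or replacing an entry models an online reconfiguration:
        # the logical name stays the same, only its target changes.
        self.entries[prefix.rstrip("/") or "/"] = location

    def resolve(self, name: str) -> Location:
        # Assumption: the longest mapped prefix of the name wins, and the
        # unmatched remainder is appended to the physical path.
        if not name.startswith("/"):
            raise ValueError("namespace names must be absolute")
        candidate = name.rstrip("/") or "/"
        while True:
            if candidate in self.entries:
                loc = self.entries[candidate]
                rest = name[len(candidate):].strip("/")
                path = loc.path.rstrip("/") + ("/" + rest if rest else "")
                return Location(loc.server, path or "/")
            if candidate == "/":
                raise KeyError(name)
            # Drop the last path component and try the parent prefix.
            candidate = candidate.rsplit("/", 1)[0] or "/"
```

   In this sketch, a client that resolves /eng/builds/nightly/log.txt
   keeps using the same logical name before and after an administrator
   calls set_mapping("/eng/builds", ...) with a new Location; only the
   returned server and path differ.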

2.2 Replication for Load Balancing & Grid Computing

   A storage grid is an integral part of any grid computing
   architecture. A well-designed global namespace is a necessary
   component in creating a storage grid using NAS. Multiple data
   replicas can be created, and the global namespace can transparently
   guide clients to the appropriate replica based on load, distance,
   and access requirement (read or write). This would effectively
   multiply the throughput of a storage grid, transparently to the
   users and applications.

2.3 Transparent Migration

   A similar benefit applies to migration. A global namespace can keep
   the data presentation to the users and applications constant, while
   updating the mapping to the new location after the migration is
   complete. This avoids the disruptive post-migration unmount and
   re-mount, while allowing the clients to find the right location for
   the requested data. This benefits not only one-time migrations, when
   users consolidate or upgrade storage equipment, but also makes
   possible dynamic capacity balancing or hierarchical storage
   management applications that frequently move the data.

2.4 Transparent Fail-over

   With a global namespace, a high availability solution can handle the
   case where the backup node does not fully assume the identity of the
   primary node. The global namespace gracefully points the clients to
   the new location of the data, and makes the fail-over a more
   transparent experience for the end-users and applications.

   This capability can be applied to online storage management as well.
   Even without a failure, global namespace facilities can be used to
   point the clients to another server while the original server is
   being serviced. After the service is completed, the clients can then
   be redirected back.
   The existence of a global namespace greatly helps reduce downtime
   during normal storage management.

2.5 Information Lifecycle Management

   As data go through different phases in their lifecycle, their
   storage requirements change. An ILM solution optimizes the storage
   for the data according to the phase of their lifecycle. A global
   namespace provides a way to guide clients to the appropriate
   physical locations of the data at the time, without the use of stubs
   on the file storage itself.

3. The Requirements

   There are at least three different kinds of namespaces that have
   been referred to as a global namespace for file storage:

   1.  Intra-cluster namespace. This is the unified namespace for a set
       of NAS servers in a tightly-coupled or aggregated cluster.
       People refer to it as a "global" namespace, as opposed to the
       "local" namespace of each node in the cluster. Many proprietary
       intra-cluster namespace schemes exist today as part of
       single-vendor solutions.

   2.  Enterprise namespace. This is the form of "global namespace"
       most requested by enterprise storage administrators. An
       enterprise namespace provides a uniform view into network file
       storage for an entire enterprise.

   3.  World-wide namespace. This makes possible "world-wide file
       storage", with a global URL for each file. This could be
       achieved by an extension of the enterprise namespace scheme.

   This draft focuses on the enterprise namespace. Enterprise file
   storage environments will continue to grow and will continue to be
   heterogeneous. Standardization supports interoperability between
   different vendors, and having a standards-based namespace solution
   for NFSv4 will help the wide adoption of the protocol.

   What are the requirements for an enterprise-wide namespace?
   Here is a list of fundamental requirements:

   -  Location Independence: The namespace tree should be structured to
      reflect organizational or logical associations, independent of
      the physical location of the data. This implies that there needs
      to be some sort of location table that links the logical
      namespace to the physical locations.

   -  Uniformity of View: There should be a single location table of
      the namespace that all agree is authoritative. This implies the
      existence of a root server and/or central repository for an
      enterprise domain, but does not imply that each client must mount
      into this unified namespace in the same way.

   -  Transparency: When the physical location of the data changes for
      administrative reasons, for example by migration or replication,
      the namespace presented to the client applications should remain
      constant. The namespace map entry can be updated transparently to
      the clients. The client applications continue running and the
      namespace remains constant, while the data is now served from a
      different physical location.

   -  Security: The deployment of a namespace solution must not
      compromise the security of data access.

   In addition to the above requirements, we must ask the following
   questions as well:

   -  Granularity of namespace mapping: Can the namespace mapping be
      defined at file system granularity, directory granularity, file
      granularity, or sub-file granularity?

   -  Hierarchical Mapping: Is it possible for namespace entry /a/b to
      link to file server A, while /a/b/c links to file server B?

   -  Variable Support: Can the namespace mapping differ depending on
      variables such as client OS, client geographical location, or
      time of day? This is highly desired in many customer
      environments.

   -  Manageability: Can the namespace be accessed and modified in real
      time by administrators? by applications?
      by user groups? How fast does a namespace mapping change
      propagate to all clients?

   -  Cycle Prevention: Will the namespace tree be guaranteed to be
      acyclic?

   -  Multi-protocol Interoperability: Will NFSv2 and v3 clients be
      able to use this same namespace? Will this namespace be
      synchronized with the CIFS namespace?

   A viable global namespace solution will need to be location
   independent, unified, transparent and secure. We should also
   consider additional user requirements to make sure we have a
   solution that addresses the needs of enterprise storage
   administrators.

4. NFSv4 Global Namespace

   For NFSv2/v3 environments, the most popular namespace solution is
   the automounter daemon, with automounter maps centrally managed on
   an NIS server or LDAP server. The popularity of this solution shows
   that it addresses some of the namespace requirements outlined above.
   In particular, it supports the "location independence" requirement
   at export granularity, the "uniformity of view" requirement and the
   "security" requirement. In addition, it supports hierarchical
   mapping and wildcard variables. Because there is no server-to-server
   redirect, there are no cycle issues here either.

   So why do some NFS enterprise users still ask for a "global
   namespace"? What is lacking in an automounter-based solution?

   First, the update of the automounter map is not completely
   transparent. Clients that have applications running and keeping the
   old mount active will not release the old mount. For some versions
   of some operating systems, the old mount still is not released even
   after it becomes inactive. Given the variety of client operating
   systems and versions, this is a difficult problem to solve
   completely.

   Second, the granularity of this solution is at the export level.
   For some of the applications mentioned above that require a global
   namespace, such as load balancing and ILM, finer granularity
   (directory, file, and sub-file) is desired.

   In addition, some administrators have had experience with the global
   namespace solutions of other network file access protocols, such as
   CIFS and AFS. CIFS includes a specification of Dfs links that
   supports the deployment of a Dfsroot namespace server. AFS can
   dynamically map its volumes to different physical locations through
   the Volume Location Database (VLDB). These administrators want
   comparable functionality to be available in NFSv4.

   RFC 3530 [RFC3530] specifies basic functionality useful for
   implementing an NFSv4 global namespace, either by using solely the
   facilities within RFC 3530, or by augmenting them through features
   to be added in a minor version.

5. Suggested Approach

   First, we could choose a central repository, such as LDAP, for the
   namespace mappings. We can work to define a standard schema for the
   NFS namespace mappings. This work is not part of the NFSv4 protocol
   itself, but it is reasonable for us to address it for the specific
   case of the NFS namespace. We should define clearly how servers and
   clients can access this central repository, and how it should
   support not only NFSv4, but v3 and v2 clients as well.

   Second, we need to clarify the client-server interactions based on
   RFC 3530. This work is already under way, with both implementation
   and suggestions for errata for RFC 3530. Both the migration case and
   the pure referral case need to be fully considered [Noveck].
   Security issues should also be considered, so that the proposed
   scheme does not compromise the existing level of security.
   The hope is that this challenge will be overcome, and that we will
   be able to have the first client, server and namespace server
   reference implementation of the basic use of NFS4ERR_MOVED and
   fs_locations.

   Third, we should define a mechanism by which clients in the
   enterprise know where to find the root of the NFS enterprise
   namespace. One simple solution is to leverage the DNS domain, and
   set up a convention that the DNS name nfsroot always corresponds to
   the root namespace server. The root namespace server can refer
   clients to other namespace servers. Schemes should be designed to
   enforce that the relationship between namespace servers is
   hierarchical and not cyclical. This scheme can be extended to
   support a world-wide NFS namespace as well.

   Next, with NFSv4.x clients accessing the namespace through the
   namespace server via the NFS protocol, it becomes possible to
   enhance the protocol in the form of minor versions to support better
   transparency, finer granularity and better manageability. Possible
   enhancements in 4.x that may be worth discussing include file-level
   referrals, lifetime on file handles, and additional client-server
   exchange of variable values.

Acknowledgements

   The authors would like to thank Andy Adamson, Ted Anderson and
   Robert Thurlow for their helpful comments on the draft.

Normative References

   [RFC3530]  S. Shepler, et al., "NFS Version 4 Protocol", Standards
              Track RFC 3530.

Informative References

   [Noveck]   D. Noveck, C. Burnett, "Implementation Guide for
              Referrals in NFSv4", IETF Internet Draft,
              draft-noveck-nfsv4-referrals-00.txt

   [Thurlow]  R. Thurlow, "A Namespace For NFS Version 4", IETF
              Internet Draft, draft-thurlow-nfsv4-namespace-00.txt

Author's Address

   C. Charles Fan
   Rainfinity
   2740 Zanker Road
   San Jose, CA 95134
   USA

   Phone: +1 408 382 4755
   EMail: fan@rainfinity.com

Full Copyright Statement

   Copyright (C) The Internet Society (2004). This document is subject
   to the rights, licenses and restrictions contained in BCP 78 and
   except as set forth therein, the authors retain all their rights.

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.