Network Working Group                                        A. Aggarwal
Internet-Draft                                    Sun Microsystems, Inc.
Expires: November 12, 2006                                  May 11, 2006


                    Extensions to NFSv4 for Checksums
                    draft-aggarwal-nfsv4-cksum-01.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on November 12, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   This document provides motivation for enhancing the NFSv4 protocol to
   enable checksumming of data and describes extensions to NFSv4 in
   order to enable such a capability.  Discussion and suggestions for
   improvements are requested.

Table of Contents

   1.  Security Considerations
   2.  Introduction
   3.  Requirements
   4.  Checksum Extensions
     4.1.  Checksum Algorithm Negotiation
     4.2.  Checksumming the READs and the WRITEs
     4.3.  New operation 40: CKINFO - Get server preferences on
           checksum algorithms
     4.4.  New operation 41: CKSUM - checksum values
     4.5.  Checksum Algorithm Considerations
   5.  IANA Considerations
   6.  Checksum Futures
   7.  Acknowledgements
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Author's Address
   Intellectual Property and Copyright Statements

1.  Security Considerations

   None.

2.  Introduction

   The standard NFS protocol (without Kerberos) has historically relied
   on the underlying transport for data integrity and makes no attempt
   of its own to ensure it.  Kerberised NFS provides data integrity via
   the use of krb5i.

   Version 4 of the Network File System (NFSv4) protocol [RFC3530] is
   typically deployed over TCP and Ethernet, and it relies on these
   protocols to provide data integrity.

   In order to ensure data integrity, TCP uses the one's complement sum
   of 16-bit words as its standard checksum [RFC1071].  It is a weak
   checksum: it does not detect multiple single-bit errors that cancel
   each other out, nor the re-ordering of 16-bit words, and thus cannot
   be relied upon for error detection.  Ethernet, on the other hand,
   uses a relatively strong checksum in the form of CRC32.  However,
   this checksum is not end-to-end: it is recomputed at every hop, so it
   does not cover corruption introduced inside an intermediate device,
   which renders the protection rather weak.
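   The weakness of the one's complement checksum can be illustrated
   with a small Python sketch.  It is only an illustration, not part of
   the proposal: the function name internet_checksum and the sample
   payloads are assumptions made for this example, and zlib.crc32 is
   used as a stand-in for a CRC32 computation.  One payload swaps two
   16-bit words; the other sets a bit in one word and clears the same
   bit in the next word.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """Return the 16-bit Internet checksum of data (see RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Hypothetical payloads chosen for illustration.
original = bytes.fromhex("0001f203f4f5f6f7")
reordered = bytes.fromhex("f2030001f4f5f6f7")    # first two words swapped
compensated = bytes.fromhex("0003f201f4f5f6f7")  # bit 1 set in word 0,
                                                 # cleared in word 1

for name, payload in [("reordered", reordered), ("compensated", compensated)]:
    same_sum = internet_checksum(payload) == internet_checksum(original)
    same_crc = zlib.crc32(payload) == zlib.crc32(original)
    print(f"{name}: checksum unchanged={same_sum}, CRC32 unchanged={same_crc}")
```

   Both corrupted payloads keep the same one's complement checksum as
   the original, while their CRC32 values differ from the original's.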

   Given that the underlying transports don't provide enough protection,
   the NFSv4 protocol (when deployed without Kerberos) needs to add its
   own data integrity mechanisms in order to ensure sane data.  As other
   protocols have done, implementing checksums at the NFSv4 layer is an
   obvious choice.  Checksums for NFSv4 might be implemented for the
   entire NFS payload or for a part of the payload.  This document
   proposes extensions to NFSv4 that provide a way to implement
   checksums for the READ/WRITE data portion of the NFSv4 payload.

3.  Requirements

   One of the key requirements for NFSv4 checksums is to protect the
   NFSv4 payload against network corruption occurring due to
   deficiencies in the underlying transports.

   As part of providing data integrity, it would be ideal if such a
   scheme allowed the protection domain to extend from as close as
   possible to where the data is generated, i.e. the application on the
   client, to as close as possible to where it is stored, i.e. the
   storage on the server.  Since end-to-end integrity is undeniably a
   hard problem to solve, it would be worthwhile to have the initial
   specification allow for extensions such that end-to-end integrity can
   be accomplished as part of future work.

   Finally, any data integrity scheme should provide the flexibility to
   choose from a set of checksum algorithms.

4.  Checksum Extensions

4.1.  Checksum Algorithm Negotiation

   In order for the client and server to checksum the data, they need to
   arrive at a common checksum algorithm that both sides will use to
   compute the checksum values.  This can be accomplished by having the
   client query the server for its preferences at the time of mounting
   the filesystem.
   The server's preferences will serve as a hint for the client as to
   what algorithm might be appropriate.

   If it has been determined a priori that the server supports
   checksums, the client adds an extra operation, CKINFO, to the mount
   compound.  The server responds with a list of the checksum algorithms
   it supports, in order of preference.  Given the server's preferences
   and its own preferences, the client determines which algorithm to
   use.  If the client determines that it cannot support any of the
   algorithms supported by the server, it must settle on using the
   "none" algorithm.  That is, it must turn off checksumming.

   The checksum algorithm will need to be renegotiated in case of
   failover events, migration events and while crossing server
   boundaries.

4.2.  Checksumming the READs and the WRITEs

   After the algorithm preferences have been determined, all READ and
   WRITE operations should be checksummed using the CKSUM operation.  In
   the case of a read, CKSUM should follow READ, and in the case of a
   write, CKSUM should precede WRITE.

4.3.  New operation 40: CKINFO - Get server preferences on checksum
      algorithms

   SYNOPSIS

     cksum_algs

   ARGUMENT

     void;

   RESULT

     struct cksum_alg4 {
             uint32_t        alg_num;   /* maps to an algorithm name */
             uint32_t        alg_bits;
     };

     struct cksum_algs4 {
             uint32_t        alg_len;
             cksum_alg4      alg_val<>;
     };

     struct CKINFO4resok {
             cksum_algs4     cksum_algs;
     };

     union CKINFO4res switch (nfsstat4 status) {
      case NFS4_OK:
              CKINFO4resok   resok4;
      default:
              void;
     };

   DESCRIPTION

   The CKINFO operation is added to the mount compound by the client to
   gather the list of checksum algorithm preferences that the server may
   have.  The server responds by returning a list of the algorithms it
   supports, in order of preference.  The client should compute the
   intersection of the algorithms supported by the server and by itself.
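   The client-side selection described above can be sketched in Python
   as follows.  This is a minimal sketch under stated assumptions: the
   algorithm numbers are hypothetical (the draft leaves the real values
   to a future IANA registry), the function name negotiate is
   illustrative, and the policy of taking the first server-preferred
   algorithm that the client also supports is one possible choice; the
   draft does not mandate a particular selection policy.

```python
from typing import Iterable, Sequence

# Hypothetical algorithm numbers; real values would come from the
# IANA registry that the draft calls for.
ALG_NONE = 0
ALG_CRC32 = 1
ALG_FLETCHER4 = 2
ALG_SHA256 = 3


def negotiate(server_prefs: Sequence[int],
              client_supported: Iterable[int]) -> int:
    """Pick an algorithm from the server's CKINFO reply.

    server_prefs is the server's list in order of preference.
    The result is the first entry the client also supports, or
    ALG_NONE (checksumming off) when the intersection is empty.
    """
    # Every client can always fall back to "none".
    supported = set(client_supported) | {ALG_NONE}
    for alg in server_prefs:
        if alg in supported:
            return alg
    return ALG_NONE
```

   For example, negotiate([ALG_SHA256, ALG_FLETCHER4, ALG_CRC32],
   [ALG_CRC32, ALG_FLETCHER4]) selects ALG_FLETCHER4, and a client with
   no algorithm in common with the server ends up with ALG_NONE.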

   IMPLEMENTATION

   The client gathers checksum preferences at mount time.  The
   preferences are essentially treated as hints as to which algorithms
   may be best suited for future READ or WRITE operations on a given
   file system.  If a server implementation supports finer-grained
   checksumming, such as per-file checksums, the preferences gathered at
   mount time may not apply to a given file.  In such cases, the CKSUM
   operation (see Section 4.4) provides the flexibility to enable
   per-file checksums.

   ERRORS

     TBD

4.4.  New operation 41: CKSUM - checksum values

   SYNOPSIS

     cksum_alg, cksum_val, cksum_alg_pref

   ARGUMENT

     /* The discriminant is the algorithm number (cksum_alg4.alg_num). */
     union nfs4_cksum switch (uint32_t alg_num) {
      case FLETCHER4:
              uint64_t       cksum[4];
      case SHA256:
              uint64_t       cksum[4];   /* 256-bit digest */
      ..
      default:
              uint32_t       error_status;
     };

     struct CKSUM4args {
             cksum_alg4      cksum_alg;
             nfs4_cksum      cksum_val;
             cksum_alg4      cksum_alg_pref<>;
     };

   RESULT

     struct CKSUM4resok {
             nfs4_cksum      cksum_val;
             cksum_stat4     sub_status;
             cksum_alg4      cksum_alg;
             cksum_alg4      cksum_alg_pref<>;
     };

     union CKSUM4res switch (nfsstat4 status) {
      case NFS4_OK:
              CKSUM4resok    resok4;
      default:
              void;
     };

   DESCRIPTION

   The CKSUM operation carries the checksum value computed using the
   algorithm specified in cksum_alg.  It also specifies the client's
   preferences in cksum_alg_pref.  This operation is not intended to be
   used as a standalone operation, but rather in conjunction with the
   READ or the WRITE operation.

   When used with the READ operation, CKSUM should follow the READ.  The
   client must set the checksum value to 0 to indicate that a checksum
   wasn't computed because CKSUM is being compounded with READ.  The
   client should also set the checksum algorithm it prefers in cksum_alg
   and its list of preferences in cksum_alg_pref.
   If the server supports the requested checksum algorithm, it computes
   a checksum over the READ data using that algorithm.  It also
   indicates its preferences (if different from the client's) by setting
   cksum_alg_pref.  If the server doesn't support the requested
   algorithm (say, because it implements per-file checksums), it should
   compute the checksum using one of the algorithms in cksum_alg_pref
   and indicate the algorithm used by setting cksum_alg, as well as
   updating cksum_alg_pref in the results.  If the server is not able to
   support any of the algorithms in cksum_alg_pref, it should indicate
   that by setting the sub-status to NFS4ERR_WRONGALG.  The server may
   also consider specifying its preferences via cksum_alg_pref.

   The client should verify the checksum by computing a checksum over
   the READ data using the algorithm specified in cksum_alg.

   When used with the WRITE operation, CKSUM should precede the WRITE.
   The client must compute the checksum over the WRITE data and set
   cksum_alg to the algorithm that was used to compute the checksum.  It
   may also indicate its algorithm preferences via cksum_alg_pref.

   The server must verify the checksum value by computing a checksum
   over the WRITE data using the algorithm specified in cksum_alg.  If
   the checksum value matches, it should indicate success by setting the
   status to NFS4_OK.  If the checksum value matches but the server
   considers an algorithm other than the one specified in cksum_alg more
   suitable, it may indicate this by setting cksum_alg_pref.  If the
   checksum verification failed due to an incorrect checksum, the server
   will set the status to NFS4ERR_WRONGCKSUM.  If the checksum
   verification failed due to the inability to support a particular
   checksum algorithm, the server should indicate that by returning
   NFS4ERR_WRONGALG and optionally setting cksum_alg_pref.
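   The server-side handling of a CKSUM that precedes a WRITE can be
   sketched in Python as follows.  This is an illustrative sketch, not a
   definitive implementation: the function name verify_write, the
   algorithm numbers, the use of strings for status values, and the
   stand-in checksum functions (zlib.crc32 and hashlib.sha256) are all
   assumptions made for this example.  The sketch also returns
   NFS4ERR_CKSUM when computing the checksum fails internally.

```python
import hashlib
import zlib
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical algorithm numbers; the draft defers real values to an
# IANA registry.
ALG_CRC32 = 1
ALG_SHA256 = 3

# Stand-in checksum implementations keyed by algorithm number.
COMPUTE: Dict[int, Callable[[bytes], bytes]] = {
    ALG_CRC32: lambda data: zlib.crc32(data).to_bytes(4, "big"),
    ALG_SHA256: lambda data: hashlib.sha256(data).digest(),
}


def verify_write(alg: int, cksum: bytes, data: bytes,
                 server_prefs: Sequence[int]) -> Tuple[str, List[int]]:
    """Return (status, cksum_alg_pref) for a CKSUM preceding a WRITE."""
    if alg not in server_prefs or alg not in COMPUTE:
        # Unsupported algorithm: report it and advertise preferences.
        return "NFS4ERR_WRONGALG", list(server_prefs)
    try:
        expected = COMPUTE[alg](data)
    except Exception:
        # Internal failure while computing the checksum.
        return "NFS4ERR_CKSUM", []
    if expected != cksum:
        return "NFS4ERR_WRONGCKSUM", []
    # Checksum matches; hint at a better algorithm if the server has one.
    prefs = list(server_prefs) if server_prefs[0] != alg else []
    return "NFS4_OK", prefs
```

   A client that receives NFS4ERR_WRONGALG together with a preference
   list could recompute the checksum with one of the listed algorithms
   and resend the CKSUM and WRITE.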
   If the checksum verification failed due to an internal error, the
   server should set the status to NFS4ERR_CKSUM.

   IMPLEMENTATION

   On a READ or a WRITE, if the client wants to read or write the data
   regardless of the checksum capabilities of the server, it may
   indicate that by specifying the "none" algorithm in its preferences.
   Alternatively, the client can send the READ or WRITE without an
   accompanying CKSUM if it doesn't want the server to use checksums at
   all.  The server should honour this by reading or writing the data
   without checksumming.

   In the case of a READ, if the checksum verification on the client
   side fails, it is most beneficial for the client to retry the READ at
   least once in order to rule out transient errors.  Likewise for a
   WRITE, if the checksum verification on the server fails, it is most
   beneficial for the client to retry the WRITE at least once in order
   to rule out transient errors.

   ERRORS

     NFS4ERR_WRONGCKSUM
     NFS4ERR_CKSUM
     NFS4ERR_WRONGALG

     TBD

4.5.  Checksum Algorithm Considerations

   If the client and the server support checksums, they should support
   at least one of the recommended algorithms.  Taking a cue from newer
   protocols such as SCTP [RFC3309] and iSCSI [RFC3385], CRC32 is likely
   a good candidate for a recommended algorithm.  The "none" algorithm
   must also be included as a recommended algorithm.  It is expected
   that the client and server may also support algorithms beyond the
   recommended set.

   Algorithm names must not contain an at-sign ("@"), a comma (","),
   whitespace or control characters.  The algorithm names are
   case-sensitive and should not be longer than 256 characters.  [TBD:
   Is there value in providing algorithm names longer than 256
   characters?]

   User-defined algorithms must be defined using names in the format
   name@userdomainname, e.g., "ourchecksum@sun.com".
   All the rules from above, except the prohibition on the at-sign
   ("@"), apply to user-defined algorithms.

5.  IANA Considerations

   The CKINFO and CKSUM operations carry algorithm numbers rather than
   algorithm strings over the wire.  This makes implementations easier,
   as it eliminates the need for tedious string comparisons on the
   client.  The constraint this brings about is that there needs to be
   agreement on which algorithm numbers correspond to which algorithm
   names.  An IANA registry will need to be created to manage the
   algorithm namespace.

6.  Checksum Futures

   This document proposes integrity protection for only the READ and
   WRITE data between the client and the server.  The rest of the
   operations in the NFSv4 compound will not be protected.  A natural
   extension to this document would be to enhance the checksum
   operations to enable protection for the entire NFSv4 compound.  This
   may or may not be akin to the krb5i protection.

7.  Acknowledgements

   Spencer Shepler and David Robinson

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3530]  Shepler, S., Callaghan, B., Robinson, D., Thurlow, R.,
              Beame, C., Eisler, M., and D. Noveck, "Network File System
              (NFS) version 4 Protocol", RFC 3530, STD 1, April 2003.

   [Shepler]  Shepler, S., "NFS version 4 Minor Version 1",
              draft-ietf-nfsv4-minorversion1-00.txt, October 2005.

8.2.  Informative References

   [RFC1071]  Braden, R., Borman, D., and C. Partridge, "Computing the
              Internet Checksum", RFC 1071, September 1988.

   [RFC1146]  Zweig, J. and C. Partridge, "TCP Alternate Checksum
              Options", RFC 1146, March 1990.

   [RFC3309]  Stone, J., Stewart, R., and D. Otis, "Stream Control
              Transmission Protocol (SCTP) Checksum Change", RFC 3309,
              September 2002.

   [RFC3385]  Sheinwald, D., Satran, J., Thaler, P., and V. Cavanna,
              "Internet Protocol Small Computer System Interface (iSCSI)
              Cyclic Redundancy Check (CRC)/Checksum Considerations",
              RFC 3385, September 2002.

   [Stone]    Stone, J., Greenwald, M., Partridge, C., and J. Hughes,
              "Performance of Checksums and CRC's over Real Data",
              IEEE/ACM Transactions on Networking, Vol. 6, No. 5,
              October 1998.

Author's Address

   Alok Aggarwal
   Sun Microsystems, Inc.
   500 Eldorado Blvd.
   MS: UBRM05-171
   Broomfield, CO  80021
   USA

   Email: alok.aggarwal@sun.com

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2006).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.