DNS Extensions Working Group G. Barwood Internet-Draft Intended status: Experimental 28 September 2009 Expires: March 2010 DNS Transport draft-barwood-dnsext-dns-transport-11 Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This Internet-Draft will expire on March 29, 2010. Copyright Notice Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document (http://trustee.ietf.org/license-info). Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Abstract Describes an experimental transport protocol for DNS. IP fragmentation is avoided, blind spoofing, amplification attacks and other denial of service attacks are prevented. Latency for a typical DNS query is a single round trip, after a setup exchange that establishes a long term shared secret. No per-client server state is required between transactions. The protocol may have other applications. Barwood Expires March 2010 [Page 1] Internet-Draft DNS Transport September 2009 Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Requirements . . . . . . . . . . . . . . . . . . . . . . . . . 4 2.1 Fragmentation. . . . . . . . . . . . . . . . . . . . . . . . 4 2.2 Spoofing . . . . . . . . . . . . . . . . . . . . . . . . . . 4 2.3 Server state . . . . . . . . . . . . . . . . . . . . . . . . 4 2.4 Amplification attacks . . . . . . . . . . . . . . . . . . . 4 2.5 Packet retransmission . . . . . . . . . . . . . . . . . . . 4 2.6 Performance . . . . . . . . . . . . . . . . . . . . . . . . . 4 3. Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . 5 3.2 Setup request . . . . . . . . . . . . . . . . . . . . . . . 5 3.3 Setup response . . . . . . . . . . . . . . . . . . . . . . . 5 3.4 Initial request . . . . . . . . . . . . . . . . . . . . . . 5 3.5 Server response : single page . . . . . . . . . . . . . . . 6 3.6 Server response : multi page . . . . . . . . . . . . . . . . 6 3.7 Follow-up request . . . . . . . . . . . . . . . . . . . . . 7 3.8 Encryption and Authentication . . . . . . . . . . . . . . . 7 3.9 Congestion control . . . . . . . . . . . . . . . . . . . . . 8 4. Security Considerations . . . . . . . . . . . . . . . . . . . 9 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9 6. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 9 7. Normative References . . . . . . . . . . . . . . . . . . . . . 9 Appendix A. Implementation of Cookies . . . . . . . . . . . . . . 10 Appendix B. Anycast considerations . . . . . . . . . . . . . . . 10 Authors Address . . . . . . . . . . . . . . . . . . . . . . . . . 10 Barwood Expires March 2010 [Page 2] Internet-Draft DNS Transport September 2009 1. Introduction DNSSEC implies that DNS responses may be large, possibly larger than the de facto ~1500 byte internet MTU. Large responses are a challenge for DNS transport. EDNS [RFC2671] was introduced in 1999 to allow larger responses to be sent over UDP, previously DNS/UDP was limited to a 512 bytes. EDNS is problematic for several reasons: (1) It allows amplification attacks against 3rd parties. DNS/UDP has always been susceptible to these attacks, but EDNS has increased the amplification factor by an order of magnitude. (2) The IP protocol specifies a means by which large IP packets are split into fragments and then re-assembled. However fragmented UDP responses are undesirable for several reasons: o Fragments may be spoofed. The DNS ID and port number are only present in the first fragment, and the IP ID may be easy for an attacker to predict. o In practice fragmentation is not reliable, and large UDP packets may fail to be delivered. o If a single fragment is lost, the entire response must be re-sent. o Re-assembling fragments requires buffer resources, which opens up denial of service attacks. Instead, it is possible to use TCP, but this is undesirable, as TCP imposes increased latency and significant server state that may be vulnerable to denial of service attack. In addition, support for TCP is not universal. Nearly all current DNS traffic is carried by UDP with a maximum size of 512 bytes, and relying on TCP is a risk for the deployment of DNSSEC. Therefore a new protocol is proposed, with menomic QRP, to stand for "Quick Response Protocol". Barwood Expires March 2010 [Page 3] Internet-Draft DNS Transport September 2009 2. Requirements 2.1 Fragmentation As described in the introduction, fragmentation is undesirable. However, fragmentation is unavoidable if the path MTU is too small. Therefore, we require only that fragmentation does not occur provided the actual path MTU is at least the MTU sent by the client. 2.2 Spoofing Blind spoofing attacks must be prevented. 2.3 Server state No per-client server state should be needed between transactions. 2.4 Amplification attacks Amplification attacks against third parties must be prevented. 2.5 Packet re-transmission Only lost IP packets must be re-transmitted. This reduces problems due to network congestion. 2.6 Performance Each transaction ( for moderate response sizes ) must be performed in 1 RTT, after setup, provided that no IP packets are lost. Barwood Expires March 2010 [Page 4] Internet-Draft DNS Transport September 2009 3. Protocol 3.1 Overview Communication is over UDP in two stages. First a long-lived SERVERTOKEN is acquired by the client. Subsequent queries are protected by the SERVERTOKEN. Throughout, DNS Payload refers to a DNS Message [RFC1035], not including the 16-bit ID field. DNS server support for the protocol is signaled by a general purpose method [TPORT]. Each UDP packet starts with a 16-bit opcode, followed by a 64-bit QUERYID that identifies the transaction. These fields are not shown. All numbers are unsigned integers, with the first bit being the most significant. Reserved areas must be omitted by the sender and ignored by the receiver. 3.2 Setup request The client acquires a SERVERTOKEN for a given Server by sending a packet with opcode 1. The variable length portion is reserved. 3.3 Setup response The server response has opcode 1, format : +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | SERVERTOKEN | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ \ RESERVED \ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ where SERVERTOKEN is a 32 bit value computed as a secure hash of the client IP Address and a long-term server secret. The client associates SERVERTOKEN, and the client IP address ( for multi-homed clients ) with the Server. 3.4 Initial request To make a DNS request, a packet is sent with opcode 2, format: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | SERVERTOKEN | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | MTU | COUNT | RESERVED | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ \ DATA \ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ where : Barwood Expires March 2010 [Page 5] Internet-Draft DNS Transport September 2009 SERVERTOKEN is a copy of SERVERTOKEN from the setup response. MTU is a 16-bit number that limits the size of the IP packets used to send the response. Must be at least 600. COUNT limits the number of pages the server will send. DATA is the DNS payload. 3.5 Server response : single page The server checks SERVERTOKEN, and divides the response DNS payload into equal size pages, so that the size of each IP packet is not greater than MTU. Servers should use a smaller MTU if the path MTU is known to be less than the MTU supplied by the client. If there is only one page, the response has opcode 2, format : +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ \ DATA \ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ where DATA is the DNS payload. The client uses DATA as the normal DNS response. 3.6 Server response : multi page If there is more than one page, the server sends multiple packets, each with opcode 3 and format : +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | TOTAL | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | COOKIE | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | COUNT | PAGE | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | PAGESIZE | \ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ \ \ DATA \ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ where : TOTAL is the size of the complete DNS payload. COOKIE is a 64-bit value used to request further pages. COUNT is the number of pages sent. PAGE is the 0-based number of this page. PAGESIZE is the size into which the response has been divided. Barwood Expires March 2010 [Page 6] Internet-Draft DNS Transport September 2009 DATA is part of the DNS payload. The client allocates an assembly buffer of TOTAL bytes (if not already allocated), and copies DATA into it at offset PAGE x PAGESIZE. Servers may send a smaller number of pages than requested, for policy reasons, or if there is local congestion. The pages sent have numbers 0 .. COUNT-1. 3.7 Follow-up request If the client does not receive a page, due to packet loss or not all pages being sent, it sends a packet with opcode 3, format : +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | SERVERTOKEN | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | COOKIE | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | COUNT | PAGE | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | PAGESIZE | \ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ \ \ DATA \ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ where : SERVERTOKEN is a copy of the SERVERTOKEN from the setup response. COOKIE is a copy of COOKIE from the server response. COUNT is the number of pages requested. PAGE is the number of the first page to be sent. PAGESIZE is a copy of PAGESIZE from the server response. DATA is a copy of DATA from the initial request. The server response is the same as in section 3.6. If the server encounters an error condition, such as an invalid SERVERTOKEN or COOKIE, it sends a setup response (section 3.3), and the client retries with a new initial request (section 3.4). Once a client has received all pages, it processes the complete assembled response as normal. 3.8 Encryption and Authentication. Opcode 4 is used to optionally encrypt and authenticate packets. Barwood Expires March 2010 [Page 7] Internet-Draft DNS Transport September 2009 For client requests, the 16-bit opcode is followed by o A 12 byte client nonce. o A 50 byte client public key. o A cryptographic box containing a 16 byte MAC and the encrypted packet. For server responses, the opcode is followed by o A 12 byte client nonce ( copied from the request ). o A 12 byte server nonce. o A cryptographic box containing a 16 byte MAC and the encrypted packet. In both cases the 8 byte QUERYID is omitted from the unencrypted packet, which starts with the underlying opcode. The 12 byte client nonce is used instead to identify the transaction. The encryption and authentication algorithms are described in [NACL]. Public keys are stored in QRPK resource records, which are included in the Authority section or the response whenever an A or AAAA record is sent. The wire format is the 50 byte public key. The presentation format is the base32 encoded string, using "0123456789BCDFGHJKLMNPQRSTUVWXYZ" as the alphabet for the encoding, which is case insensitive. Where the parent zone does not have QRPK support, or the domain owner is unable to upload a QRPK record to the parent zone, public keys may be encoded in a label of a name server, using the prefix "QK-" followed by the presentation format encoding of the public key. 3.9 Congestion control Clients should take into account estimated network performance when requesting pages. Factors are : o The round trip time (RTT). o The time to transmit 1 packet due to limited bandwidth (PT). The total number of pages requested but not received or lost (INFLIGHT) should not be more than max( RTT / PT, 4 ) at any time. For example, if RTT = 150 milli-seconds, the bandwidth is 300 kilobytes per second and the packet size is 1500 bytes, then we have PT = 5 milli-seconds and INFLIGHT = 30. The setup response should be used to initialize RTT to a value not exceeding 30 milli-seconds. Clients must use a conservative value for PT, not less than 5 milli-seconds, until a proper estimate is available. In practice for normal DNS usage, INFLIGHT can simply be set to 4, and this should be sufficient to send the complete response in a single round trip, assuming the MTU is 1500 bytes. Barwood Expires March 2010 [Page 8] Internet-Draft DNS Transport September 2009 4. Security Considerations Fragmented responses are vulnerable to blind spoofing, so should be avoided if possible. Clients should check for fragmented responses if possible, and should not use records from fragmented responses unless they can be verified with DNSSEC. To prevent an attacker generating a large number of IP packets from a single request, a check should be made that the MTU is at least 600. Clients should impose limits on the maximum size response (TOTAL) they will accept, to prevent attacks by malicious servers. Secret values, that is the long term server secret and QUERYID should be generated so that an attacker cannot easily guess them, by using cryptographic random number generators seeded from data that cannot be guessed by an attacker, such as thermal noise or other random physical fluctuations. The hash function used to compute SERVERTOKEN must be cryptographically secure. Amplification attacks from previous users of the client IP address on the current user are not prevented by the protocol. Therefore servers should change their long term secret occasionally to invalidate old SERVERTOKENs. 5. IANA Considerations The UDP port to which requests are sent. The [TPORT] DNS transport protocol identifier. The resource record type code for the QRPK record. 6. Acknowledgments Mark Andrews, Alex Bligh, Robert Elz, Alfred Hoenes, Douglas Otis, Nicholas Weaver and Wouter Wijngaards were each instrumental in creating and refining this specification. 7. Normative References [RFC1035] Mockapetris, P., "Domain names - implementation and specification", STD 13, RFC 1035, November 1987. [RFC2671] Vixie, P., "Extension Mechanisms for DNS (EDNS0)", RFC 2671, August 1999. [TPORT] Barwood, G., "DNS Transport Signal", IETF dnsext draft, September 2009. [NACL] Bernstein, D., "Cryptography in NaCl", March 2009. Barwood Expires March 2010 [Page 9] Internet-Draft DNS Transport September 2009 Appendix A. Implementation of Cookies The suggested implementation of cookies is by version numbers. Each RRset has a version number assigned from a 64-bit local clock that is incremented when the database is updated. The version of a response is the largest version number of the associated RRsets. The cookie is the version number. If the database is updated while a transfer is progress, a COOKIE error occurs, and the client restarts the transfer. Alternatively, if old queries may be replayed, COOKIE errors may be avoided( however such errors should be rare ). Appendix B. Anycast considerations Anycast DNS servers need to operate consistently. There are (at least ) two possibilities: (a) Each server within the Anycast system issues distinct SERVERTOKENS. If the Anycast routing changes, a SERVERTOKEN error occurs, and the client restarts the query. (b) Each server within the Anycast system has the same long term secret, and thus issues the same SERVERTOKEN to a given client. A global clock is used for issuing updates. If the Anycast routing changes and an update is in progress, a COOKIE error may occur, and the client has to restart the query. Such errors can be avoided by not serving updates until all the Anycast servers have received a copy. Author's Address George Barwood 33 Sandpiper Close Gloucester GL2 4LZ United Kingdom Phone: +44 452 722670 EMail: george.barwood@blueyonder.co.uk Barwood Expires March 2010 [Page 10]