DNS Extensions Working Group                                  G. Barwood
Internet-Draft                                                           
Intended status: Informational                        September 28, 2008
Expires: March 2009


                       Resolver side mitigations
              draft-barwood-dnsext-fr-resolver-mitigations-02

Status of This Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire in March 2009.

Abstract

   This document describes mitigations against spoofing attacks on DNS,
   as follows:

   (1) Prepending a random nonce to the question where a referral is
       probable.

   (2) Repeating the query, including techniques for handling
       non-deterministic responses.

   (3) Estimating the entropy available, taking into account
       (a) Observed packets with incorrect IDs.
       (b) Records where the owner name does not match the question.
       (c) The previous content of the cache.
   





Barwood                Expires March 2009                       [Page 1]

Internet-Draft               Resolver mitigations         September 2008


Table of Contents

   1.  Introduction

   2.  Criteria

   3.  Mitigations
     3.1.  Prepend a random nonce label to the question
     3.2.  Query repetition
     3.3.  Include observed Bad IDs in entropy calculation
     3.4.  Use of calculated entropy

   4.  Analysis
     4.1.  Random nonce
     4.2.  Query repetition
     4.3.  Impact on Root and TLD servers
     4.4.  Impact on other levels
     4.5.  Impact of the Kaminsky check
     4.6.  Lame servers and the random nonce

   5.  Security Considerations

   6.  IANA Considerations

   7.  Acknowledgments

   8.  Informative References



























1.  Introduction

   This document describes mitigations that a resolver can currently
   deploy to resist spoofing attacks on DNS, without server software
   being updated.

   The context in which these solutions were explored is CERT
   Vulnerability Note VU#800113, "Multiple DNS implementations 
   vulnerable to cache poisoning".   

   The Kaminsky attack proceeds by asking a recursive DNS server 
   a series of questions, each with a different random prefix, 
   and then sending spoof packets to the server, containing 
   additional records with genuine owner names but invalid data. 
   For example:

   Query: 
   Question <nonce>.com A

   Spoof response:
   Question <nonce>.com A
   Authority: example.com NS ns.evil.com

   The effect is to inject an invalid record into the cache.

   Since the ID field in the DNS packet header is only 16 bits, a 
   DNS server that does not deploy any mitigations can be 
   compromised in a matter of seconds.

   [ An implementation of the techniques described can be accessed at
     http://www.george-barwood.pwp.blueyonder.co.uk/DnsServer/ ]

2.  Criteria

   These are resolver-side solutions: only the resolver needs to be
   redeployed, or its software updated.  This allows updated resolvers
   to be deployed immediately.

   The solutions have to follow the DNS protocol.

   The solutions have to be practical, non-disruptive, and not
   anti-social.





   






3.  Mitigations

   The resolver-side mitigations are described below.

   The techniques are especially, but not solely, applicable where port
   randomization is not possible, due to NAT devices or other reasons.

   Port randomization and 0x20 are not described here, but both are
   nevertheless recommended as methods of obtaining additional entropy.

3.1. Prepend a random nonce label to the question.

   This should be used where a referral is probable.

   It allows an amount of entropy to be encoded that is limited only by
   the 255-octet limit on the question name (RFC 1035), provided the
   authority server returns a copy of the question in the response.
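
   As an illustration only, the following Python sketch shows one way a
   resolver might build such a question.  The function names, the use
   of a single hexadecimal label, and the default of 64 nonce bits are
   assumptions made for this example and are not taken from any
   implementation.  The limits of 63 octets per label and 255 octets
   per name are those of RFC 1035.

```python
import secrets
from typing import Tuple

MAX_NAME_OCTETS = 255   # RFC 1035 limit on an encoded domain name
MAX_LABEL_OCTETS = 63   # RFC 1035 limit on a single label


def wire_length(name: str) -> int:
    """Octets the name occupies on the wire (length bytes plus root)."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return sum(len(label) + 1 for label in labels) + 1


def prepend_nonce(qname: str, nonce_bits: int = 64) -> Tuple[str, str]:
    """Return (nonce_label, new_qname) with a random hex label prepended."""
    hex_chars = (nonce_bits + 3) // 4          # 4 bits per hex character
    if hex_chars > MAX_LABEL_OCTETS:
        raise ValueError("nonce does not fit in a single label")
    # secrets draws from the operating system's strong random source.
    nonce = secrets.token_hex((hex_chars + 1) // 2)[:hex_chars]
    base = qname.rstrip(".")
    new_name = nonce + "." + (base + "." if base else "")
    if wire_length(new_name) > MAX_NAME_OCTETS:
        raise ValueError("question name too long once the nonce is added")
    return nonce, new_name
```

   The resolver would remember the nonce label and require the
   question copied into the response to carry exactly the same label
   before crediting the nonce bits to the response.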

   If the response is not a referral*, the response should be discarded, 
   and the query repeated without the nonce.

   * That is, if any of the following is observed:
     (a) The response is Authoritative ( AA bit is set in the header ).
     (b) There is an error ( RCODE is not zero ).
     (c) The answer section is not empty.
     (d) The authority section is empty. 
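
   The four conditions above can be sketched as a single predicate.
   The Response class below is a hypothetical stand-in for whatever
   parsed message structure a resolver uses; only the four tests come
   from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Response:
    """Hypothetical parsed DNS response; field names are illustrative."""
    aa: bool = False                                     # AA header bit
    rcode: int = 0                                       # header RCODE
    answer: List[str] = field(default_factory=list)      # answer section
    authority: List[str] = field(default_factory=list)   # authority section


def is_referral(resp: Response) -> bool:
    """True only if none of the conditions (a) to (d) is observed."""
    if resp.aa:                # (a) response is authoritative
        return False
    if resp.rcode != 0:        # (b) there is an error
        return False
    if resp.answer:            # (c) answer section is not empty
        return False
    if not resp.authority:     # (d) authority section is empty
        return False
    return True
```

   When is_referral() returns False for a response to a query that
   carried the nonce, the response is discarded and the query is
   repeated without the nonce.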

   A simple heuristic for deciding where a referral is probable is:

   (1) If the Bailiwick is Root, and the last label in the question is 
   a known TLD, a referral is probable.

   (2) If the Bailiwick is a TLD, a referral is probable.

   (3) Otherwise a referral is not probable.

   If the heuristic fails, this may be recorded so subsequent retries 
   are avoided.

   A static list of TLDs (or other domains) may be used to initialise 
   the heuristic. If this list is not up to date, extra queries may be
   generated, but no loss of functionality will occur.
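
   A minimal sketch of this heuristic follows.  The TLD list shown is
   a deliberately tiny placeholder, and the "failed" set (question
   names for which the heuristic was previously wrong) is one assumed
   way of recording failures; the text does not prescribe either.

```python
from typing import AbstractSet

# Placeholder static list; a real resolver would load a fuller one.
KNOWN_TLDS = frozenset({"com", "net", "org", "uk"})


def referral_probable(bailiwick: str, qname: str,
                      known_tlds: AbstractSet[str] = KNOWN_TLDS,
                      failed: AbstractSet[str] = frozenset()) -> bool:
    """Apply rules (1) to (3); bailiwick "." or "" means the root."""
    zone = bailiwick.strip(".").lower()
    name = qname.strip(".").lower()
    if name in failed:                 # heuristic failed here before
        return False
    labels = [label for label in name.split(".") if label]
    if zone == "":                     # (1) bailiwick is the root
        return bool(labels) and labels[-1] in known_tlds
    if "." not in zone:                # (2) bailiwick is a TLD
        return True
    return False                       # (3) otherwise not probable
```

   If the static list is out of date, the function simply returns
   False for an unknown TLD at the root, and the resolver falls back
   to other mitigations such as query repetition.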












3.2.  Query repetition

   By repeating the query, additional entropy may be obtained. A
   practical problem occurs when responses are non-deterministic, that
   is, when different responses are obtained for the same question.

   In this case, the resolver needs to analyse the responses to produce
   a converged result, or to report server failure (or a security
   warning, if this is possible) if convergence has not been achieved
   within some iteration limit.

   RFC 2181 introduced the concept of "RRset Integrity", and this needs 
   to be taken into account.

   Resolvers may decide to forgo RRset Integrity for some types, for
   non-deterministic servers, if the alternative is unacceptable
   security or failure to resolve a name.

   In particular, for most of the types defined in RFC 1034/1035, RRset
   Integrity may not be essential.

   The suggested method is to accumulate entropy for various attributes
   of the response: the RCODE, the AA bit, the number of alternate
   values for each RRset, and the RDATA and TTL of each distinct record.
   Provided these converge, plausible RRsets may be synthesised, and
   name resolution can proceed. Care must be taken to eliminate
   duplicate records within a single response.

   For example, suppose the question is MX records for example.com.

   First response:
   example.com MX mail1.example.com
   example.com MX mail2.example.com

   Second response:
   example.com MX mail2.example.com  ( mail2.example.com confirmed)
   example.com MX mail3.example.com

   Third response:
   example.com MX mail3.example.com ( mail3.example.com confirmed )
   example.com MX mail4.example.com 

   Plausible result:
   example.com MX mail2.example.com
   example.com MX mail3.example.com

   The semantic model here is that two MX records are to be offered,
   but the selection does not matter.
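
   The MX example can be reproduced with the following sketch.  It is
   a simplification: instead of accumulating entropy per attribute, it
   counts how many responses contained each record and treats a record
   as confirmed once it has been seen twice.  The function name and
   the threshold of two confirmations are assumptions for the example.

```python
from collections import Counter
from typing import List, Optional, Sequence


def converge(responses: Sequence[Sequence[str]],
             confirmations: int = 2) -> Optional[List[str]]:
    """Return a plausible RRset, or None if not yet converged."""
    if not responses:
        return None
    seen = Counter()      # record -> number of responses containing it
    sizes = Counter()     # RRset size -> number of responses of that size
    for rrset in responses:
        unique = set(rrset)        # drop duplicates within one response
        sizes[len(unique)] += 1
        seen.update(unique)
    size, votes = sizes.most_common(1)[0]
    confirmed = sorted(r for r, n in seen.items() if n >= confirmations)
    # Both the RRset size and enough individual records must be confirmed.
    if votes < confirmations or len(confirmed) < size:
        return None
    return confirmed[:size]
```

   Given the three responses shown above, the function returns None
   after the first two responses and the two confirmed records,
   mail2.example.com and mail3.example.com, after the third.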
   
 
   
   

   
   Where convergence is slow, another possibility is to resolve glue.
   For example:

   First response:
   example.com NS ns1.example.com
   example.com NS ns2.example.com
   ..
   example.com NS ns9.example.com
   ns1.example.com A 0.0.0.1
   ns2.example.com A 0.0.0.2   
   ..
   ns9.example.com A 0.0.0.9

   Second response:
   example.com NS ns1.example.com
   example.com NS ns2.example.com
   ..
   example.com NS ns9.example.com
   ns1.example.com A 0.0.0.2
   ns2.example.com A 0.0.0.3   
   ..
   ns8.example.com A 0.0.0.9
   ns9.example.com A 0.0.0.1

   Converged result:
   example.com NSA 0.0.0.1
   example.com NSA 0.0.0.2
   ..
   example.com NSA 0.0.0.9

   where NSA is an internal pseudo-type with the obvious meaning.

   Some inessential information is lost, but resolution can
   still proceed.
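
   One possible reading of the NSA pseudo-type is sketched below: each
   response is flattened to the set of name server addresses it
   offers, discarding which name maps to which address, and the
   converged result is the set of addresses common to all responses.
   The function names and the use of a plain intersection are
   assumptions for illustration.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

# A response is modelled as (NS names, glue map from name to addresses).
NsResponse = Tuple[List[str], Dict[str, List[str]]]


def nsa_records(ns_names: List[str], glue: Dict[str, List[str]]) -> Set[str]:
    """Flatten NS records and their glue into a set of addresses."""
    return {addr for name in ns_names for addr in glue.get(name, ())}


def converge_nsa(responses: Sequence[NsResponse]) -> Optional[List[str]]:
    """Addresses offered by every response, or None if not converged."""
    sets = [nsa_records(ns, glue) for ns, glue in responses]
    if len(sets) < 2:
        return None                 # a single response confirms nothing
    common = set.intersection(*sets)
    return sorted(common) if common else None
```

   For the two responses shown above, the name-to-address pairing
   differs in every record, yet both flatten to the same nine
   addresses, so the sketch converges after the second response.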

   This may all sound quite daunting, but early practical experiments
   show that commonly encountered non-deterministic servers select
   values from very small pools (in short time intervals), and show
   simple behavior. A more comprehensive survey of such servers would
   be useful; unfortunately, the author does not have access to
   the resources needed to carry out such a survey properly.












3.3.  Include observed Bad IDs in entropy calculation

   When a response is received, an entropy calculation may be performed
   to estimate how many bits have been checked.

   It will typically include 16 bits for the ID, the 0x20 bits, and the
   bits from the prepended nonce, with a discount for unusual or
   non-standard features (such as an IP address mismatch, or the
   question not being copied).

   The number of incorrect IDs observed while waiting for a response 
   should be included in the calculation, for example the logarithm 
   (base 2) of the number of Bad IDs could be subtracted.

   The result of the calculation should be used to decide whether to
   repeat the query. This allows a smooth response to attacks, while 
   not detracting from performance in the normal situation where Bad 
   IDs are not observed.

   While this measure does not reduce the number of packets required 
   for a successful attack, it does increase the time required, since 
   an attacker gains nothing from sending spoof packets at a very
   high rate.
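
   The calculation described in this section might be sketched as
   follows.  The function and parameter names are illustrative, the
   size of any discount for non-standard features is left to the
   caller, and counting one bit per ASCII letter for 0x20 is an
   assumption about how that technique is credited.

```python
import math


def count_0x20_bits(qname: str) -> int:
    """One bit per ASCII letter whose case was randomised (0x20)."""
    return sum(1 for ch in qname if ch.isascii() and ch.isalpha())


def estimate_entropy(qname: str, nonce_bits: int = 0, bad_ids: int = 0,
                     use_0x20: bool = True, discount: float = 0.0) -> float:
    """Estimate how many bits a received response has been checked for."""
    bits = 16.0                         # the ID field
    if use_0x20:
        bits += count_0x20_bits(qname)  # case bits in the original name
    bits += nonce_bits                  # bits from the prepended nonce
    bits -= discount                    # unusual / non-standard features
    if bad_ids > 0:
        bits -= math.log2(bad_ids)      # Bad IDs seen while waiting
    return max(bits, 0.0)
```

   Under these assumptions a question for example.com with no nonce
   is credited with 16 + 10 = 26 bits when no Bad IDs are seen, and
   with 16 bits if 1024 Bad IDs were observed while waiting.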

3.4.  Use of calculated entropy

   The entropy calculated in 3.3 should be used to decide whether 
   a value is to be accepted as valid, which in turn affects whether
   the query needs to be repeated as described in 3.2.

   Other factors in this decision should be:

   (1) Whether the value is already in the cache.
   (2) If so, the TTL status of the cache entry. 
   (3) Whether the name of the record being updated matches
       (ends with) the question name. This is intended
       to be a further mitigation (in addition to 3.3) against
       Kaminsky attacks.

   For example, the test for whether a value is valid could be

      E + [C] > 50 + K

   where
      E is the value computed in 3.3,
      C is zero if the value is not already in the cache, otherwise
        30 - [D/1000], where D is the number of seconds since the
        cache entry expired,
      K is 10 if the RR name does not match the question, otherwise 0,
   and [] denotes that zero is substituted if the enclosed term is
   negative.
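
   The example test translates directly into code.  The sketch below
   uses the constants from the formula; the function and parameter
   names are chosen for the example.

```python
def clamp(x: float) -> float:
    """The [] operator: substitute zero if the term is negative."""
    return x if x > 0 else 0


def value_is_valid(entropy: float, in_cache: bool,
                   seconds_since_expiry: float = 0,
                   name_matches: bool = True) -> bool:
    """Evaluate E + [C] > 50 + K for one candidate value."""
    if in_cache:
        # D is negative while the entry has not yet expired, so [D/1000]
        # is zero and C takes its maximum of 30.
        c = 30 - clamp(seconds_since_expiry / 1000)
    else:
        c = 0
    k = 0 if name_matches else 10
    return entropy + clamp(c) > 50 + k
```

   Taking E = 26 as an illustrative figure (16 ID bits plus 10 bits of
   0x20), a value not in the cache fails the test after one response
   and passes once a second response doubles E to 52, whereas a value
   already in the cache passes after a single response.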






4. Analysis

  This section is intended to be less formal, to give some insight
  into the rationale for the recommendations given in section 3,
  and to discuss possible adverse effects.

  The intention is that these mitigations have minimal effects, other 
  than to make DNS spoof attacks impractical.  

4.1.  Random nonce
  It is conceivable that the random prepended nonce causes problems
  with memory management for some servers.

  For example if a server normalised all incoming strings, and
  never reclaimed the memory, failure would rapidly occur.
  
  Such servers, if they exist, are severely broken and subject to
  denial of service attacks.

  It is expected that high performance authoritative servers 
  reclaim all memory allocated to process a query on completion 
  of the transaction.

  Nevertheless it would be wise to research this issue before large 
  scale deployment.

4.2.  Query repetition
  Query repetition should have no impact other than on server load.
  Servers do not normally retain any state information about clients
  after the query/response transaction completes.

4.3.  Impact on Root and TLD servers

  The random nonce (3.1) is valuable because it means that no 
  extra queries to Root and top level servers are needed in normal 
  operation (except in very rare cases). This is important because 
  these servers constitute the shared public base of the DNS, so the
  stability of these servers is very important.

  The exceptions are the initial root "priming" query and queries 
  for non-existent domains. For the root domain, by assuming 
  that every child domain has an SOA record, Name Errors need not 
  be retried.   While this assumption is currently correct (and is 
  also observed to be true for the net and com domains), implementors
  need to weigh any performance advantage carefully against the risk
  that the assumption may not remain valid in future.

  Clients in general should implement user interfaces that make it
  unlikely that users will enter invalid domain names, and that   
  errors are properly notified, so they can be corrected. However 
  this is outside the scope of this document.




  In practice, most root server queries emanate from
  mis-configured software, so in any case the proportional effect on
  root servers will be small. It is important that negative results be
  properly cached.

4.4.  Impact on other levels

  For the example test given in 3.4, two queries are usually
  required the first time a record is fetched. However when the 
  TTL expires, the refresh operation only requires a single query.

  It is expected that such refresh operations dominate proper
  DNS traffic, so the impact should be minimal.

  Operators of authoritative servers have several options if 
  the query repetition may cause overload.

  (a) Increase unreasonably low TTLs.
  (b) Use names with more alpha characters (to take advantage of 0x20).
  (c) Implement support for the proposed AL record or equivalent.

  The latter implies that agreeing on a specification for the proposed
  AL record type (or EDNS Ping equivalent) would be useful.

4.5.  Impact of the Kaminsky check

  In practice, this check (for the example test given in 3.4) rarely
  causes additional queries to be generated. It mainly affects NS and
  glue records, which are normally already established in the cache.

4.6.  Lame servers and the random nonce

  In order to resolve domain names where servers are incorrectly
  configured, it may be necessary to use a query without the nonce.

  A current example is resolving the IP addresses for the name servers 
  for www.iahc.org, which are ns2.ar.com and ns3.ar.com.
  
  The com nameservers generate a referral for the question
  <nonce>.ns2.ar.com that leads only to lame name servers, but return
  the IP address of a non-lame server when the nonce is omitted.

  Thus when lame servers are detected, special logic to allow name
  resolution to still occur is needed.

  Of course a resolver may choose to merely report failure in this
  case, however this may not be practical.


 





5.  Security Considerations

   All of the mitigations aim to provide more security. Query repetition
   has an obvious adverse effect on performance and bandwidth.

   Each query repetition provides an extra attack opportunity, so the 
   total entropy requirement may be adjusted to reflect this.

   The random nonce may expose internal state to an attacker who 
   controls a name server. It is essential that a cryptographically
   strong source of random numbers be used to generate IDs, 0x20 bits 
   and prepended nonces. This must be seeded from data that cannot be
   guessed by an attacker, such as thermal noise or other random 
   physical fluctuations.
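
   As a hedged illustration of this requirement, the sketch below
   draws the ID and the 0x20 case bits from Python's secrets module,
   which uses the operating system's cryptographically strong random
   source.  Whether that source is ultimately seeded from physical
   noise depends on the platform and is outside the sketch; the
   function names are illustrative.

```python
import secrets


def random_query_id() -> int:
    """A 16-bit ID drawn from the OS cryptographic random source."""
    return secrets.randbits(16)


def apply_0x20(qname: str) -> str:
    """Randomise the case of each ASCII letter in the question name."""
    out = []
    for ch in qname:
        if ch.isascii() and ch.isalpha():
            # One unpredictable bit per letter selects upper or lower case.
            ch = ch.upper() if secrets.randbits(1) else ch.lower()
        out.append(ch)
    return "".join(out)
```

   A general-purpose pseudo-random generator, such as the one in
   Python's random module, would not be a suitable substitute, since
   its output can be predicted once enough values have been observed.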

   A sufficiently determined attacker may cause a denial of service by
   sending a very large number of Bad IDs, reducing the effective
   entropy to zero. In practice, denial of service would probably occur
   due to the extreme number of incoming packets.

6.  IANA Considerations

   No direct considerations.
   Indirectly, a TYPE code would be required for the AL record
   mentioned in 4.4.


7.  Acknowledgments

   Thanks to Nicholas Weaver (ICSI Berkeley) and Wouter Wijngaards (NLnet
   Labs). The idea of prepending a nonce may be due to Paul Vixie (ISC).

8.  Informative References

   [RFC2181]  Elz, R. and R. Bush, "Clarifications to the DNS
              Specification", RFC 2181, July 1997.

Author's Address

   George Barwood
   33 Sandpiper Close
   Gloucester 
   GL2 4LZ
   United Kingdom

   Phone: +44 452 722670
   EMail: george.barwood@blueyonder.co.uk
   Skype: george.barwood







Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.














