Individual Submission L-J. Liman Internet-Draft Autonomica Updates: 1123 (if approved) J. Abley Intended status: Standards Track ICANN Expires: October 24, 2010 April 22, 2010 Top Level Domain Name Specification draft-liman-tld-names-02 Abstract The syntax for allowed Top-Level Domain (TLD) labels in the Domain Name System (DNS) is not clearly applicable to the encoding of Internationalised Domain Names (IDNs) as TLDs. This document provides a concise specification of TLD label syntax based on existing syntax documentation, extended minimally to accommodate IDNs. Status of this Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on October 24, 2010. Copyright Notice Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must Liman & Abley Expires October 24, 2010 [Page 1] Internet-Draft Top Level Domain Name Specification April 2010 include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Background . . . . . . . . . . . . . . . . . . . . . . . . . . 4 3. TLD Label Syntax Specification . . . . . . . . . . . . . . . . 6 4. Policy Considerations . . . . . . . . . . . . . . . . . . . . 7 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 6. Security Considerations . . . . . . . . . . . . . . . . . . . 9 7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 10 7.1. Normative References . . . . . . . . . . . . . . . . . . . 10 7.2. Informative References . . . . . . . . . . . . . . . . . . 10 Appendix A. Change History . . . . . . . . . . . . . . . . . . . 11 A.1. draft-liman-tld-names-02 . . . . . . . . . . . . . . . . . 11 A.2. draft-liman-tld-names-01 . . . . . . . . . . . . . . . . . 11 A.3. draft-liman-tld-names-00 . . . . . . . . . . . . . . . . . 11 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 12 Liman & Abley Expires October 24, 2010 [Page 2] Internet-Draft Top Level Domain Name Specification April 2010 1. Introduction The syntax of Top-Level Domain (TLD) labels in the Domain Name System (DNS) was specified somewhat imprecisely in [RFC1123] which required that TLD names be "alphabetic". This is commonly interpreted as excluding the hyphen character and numeric digits from TLD labels. This restriction does not accommodate the US-ASCII encoding of Internationalised Domain Names (IDNs), as specified in [I-D.ietf-idnabis-defs]. A more detailed discussion of the existing specifications can be found in Section 2. This document updates the definition of allowable top-level domain names to support IDNs but places some restrictions on the choice of IDN labels, consistent with the existing specification for US-ASCII TLD labels. See Section 3 for the updated specification. This document focuses narrowly on the issue of allowable labels in TLDs and does not (and is not intended to) make any other changes or clarifications to existing domain name syntax rules. It is carefully noted that the specification in this document is not the only factor in choosing TLD labels, and that many considerations external to the IETF are included in that wider policy. See Section 4 for more discussion of policy considerations. Liman & Abley Expires October 24, 2010 [Page 3] Internet-Draft Top Level Domain Name Specification April 2010 2. Background [RFC0952] defines a host name as follows: 'A "name" ... is a text string up to 24 characters drawn from the alphabet (A-Z), digits (0-9), minus sign (-), and period (.). Note that periods are only allowed when they serve to delimit components of "domain style names". (See RFC-921, "Domain Name System Implementation Schedule", for background). No blank or space characters are permitted as part of a name. No distinction is made between upper and lower case. The first character must be an alpha character. The last character must not be a minus sign or period.' [Unnumbered section titled "ASSUMPTIONS", first paragraph] [RFC1123] reaffirms this definition, making two additional changes to the syntax: 1. 'The syntax of a legal Internet host name was specified in RFC- 952 [DNS:4]. One aspect of host name syntax is hereby changed: the restriction on the first character is relaxed to allow either a letter or a digit. Host software MUST support this more liberal syntax.' [Section 2.1] 2. 'However, a valid host name can never have the dotted-decimal form #.#.#.#, since at least the highest-level component label will be alphabetic.' [Section 2.1] Neither [RFC0952] nor [RFC1123] explicitly states the reasons for these restrictions. It might be supposed that human factors considerations were involved; [RFC1123] appears to suggest that one of the reasons was to prevent confusion between dotted-decimal IPv4 addresses and host domain names. In any case, it is reasonable to believe that the restrictions are often assumed in deployed software, and that changing the rules should be undertaken with caution. The Internationalised Domain Names in Applications (IDNA) 2008 specification provides a protocol for encoding unicode strings in DNS labels. The unicode string used by applications is known as a U-Label; its corresponding encoding in the DNS is known as an A-Label. The terms A-Label and U-Label are used in this document as defined in [I-D.ietf-idnabis-defs]. Valid A-Labels always contain non-alphabetic characters. In order to accommodate the wish to express TLD names in scripts other than Latin (or rather, the US-ASCII subset of Latin), it is necessary to allow non-alphabetic characters in TLD DNS labels. To make the change as small as possible, we restrict the U-label form of Liman & Abley Expires October 24, 2010 [Page 4] Internet-Draft Top Level Domain Name Specification April 2010 the label in ways functionally compatible with the restrictions (from [RFC0952] and [RFC1123]) on US-ASCII labels for TLDs. This restriction will not enable new A-label TLDs to function with existing software that checks DNS top-level labels for conformance with the alphabetic restriction. It merely makes the same traditional rule work in a similar way for IDNA top-level labels. Liman & Abley Expires October 24, 2010 [Page 5] Internet-Draft Top Level Domain Name Specification April 2010 3. TLD Label Syntax Specification This document relaxes the existing specification to allow TLD labels to be well-formed A-Labels, but places restrictions on the corresponding U-Labels. The ABNF expression that matches a valid TLD label is as follows: tldlabel = traditional-tld-label / idn-label traditional-tld-label = 1*63(ALPHA) idn-label = Restricted-A-Label ALPHA = %x41-5A / %x61-7A ; A-Z / a-z Restricted-A-label is an A-Label as defined in [I-D.ietf-idnabis-defs] converted from (and convertible to) a U-Label that is consistent with the definition in [I-D.ietf-idnabis-defs] and that is further restricted to contain only Unicode characters with the derived property value PVALID and of General Category "L" or "Mn". Note that "L" contains several sub-categories, and that many characters of the "L" category do not have the derived property value PVALID. Most emphatically, this specification does not allow use of any character in a Restricted-A-Label that has derived property value of CONTEXT. More specifically, a Restricted-A-Label consists of one or more Unicode characters such that all of the following statements are true: 1. the label is a valid A-Label according to [I-D.ietf-idnabis-defs]; 2. the derived property value of all codepoints, as defined by [I-D.ietf-idnabis-defs], is PVALID; 3. the general category of all codepoints, is one of { Ll, Lo, Lm, Mn }. This new specification reflects current practice in registration of TLD names by the IANA, extended to accommodate IDNs. Liman & Abley Expires October 24, 2010 [Page 6] Internet-Draft Top Level Domain Name Specification April 2010 4. Policy Considerations This document provides a technical specification for TLD label syntax; it does not encapsulate a complete policy under which TLD names may be chosen. At the time of writing, the policy under which TLD names are chosen is developed and maintained by ICANN in consultation with a wide base of stakeholders. As the Internet continues to grow to serve new user communities, applications and services, it is to be expected that the corresponding policy will be changed accordingly. Liman & Abley Expires October 24, 2010 [Page 7] Internet-Draft Top Level Domain Name Specification April 2010 5. IANA Considerations This document makes no requests of the IANA. This section needs IANA review. :-) Liman & Abley Expires October 24, 2010 [Page 8] Internet-Draft Top Level Domain Name Specification April 2010 6. Security Considerations This document is believed to have limited security implications. The creation of new TLDs has the potential to conflict with software which (for example) does not accommodate TLD labels which did not exist at the time the software was written. Such software problems might in turn lead to security vulnerabilities; e.g., in the event that a DNS name specified by a user is truncated or otherwise misinterpreted, causing an application to interact with a different remote host from that which the user intended. It should be noted that this is not a new phenomenon, and has been observed following the creation of new TLDs prior to the publication of this document. The issue of characters that can be confused with each other is a risk, but it is discussed at length in the Security Considerations section of [I-D.ietf-idnabis-defs]. Liman & Abley Expires October 24, 2010 [Page 9] Internet-Draft Top Level Domain Name Specification April 2010 7. References 7.1. Normative References [RFC1123] Braden, R., "Requirements for Internet Hosts - Application and Support", STD 3, RFC 1123, October 1989. [I-D.ietf-idnabis-defs] Klensin, J., "Internationalized Domain Names for Applications (IDNA): Definitions and Document Framework", draft-ietf-idnabis-defs-13 (work in progress), January 2010. 7.2. Informative References [RFC0952] Harrenstien, K., Stahl, M., and E. Feinler, "DoD Internet host table specification", RFC 952, October 1985. Liman & Abley Expires October 24, 2010 [Page 10] Internet-Draft Top Level Domain Name Specification April 2010 Appendix A. Change History This section (and sub-sections) should be removed before publication/ A.1. draft-liman-tld-names-02 Wordsmithing and rearrangement of text following discussions with Joe Abley, Tina Dam, Thomas Narten and Andrew Sullivan. Incorporated revised ABNF and associated specification from Patrik Faltstrom. A.2. draft-liman-tld-names-01 Substantial comments and improvements supplied by Thomas Narten and John Klensin. Decided to go for a minimal change approach. Also noted that U-labels have to be letters due to jumping digit problem. Rewritten major parts. A.3. draft-liman-tld-names-00 First cut. Prompted by Olafur Gudmundsson and Tina Dam. Liman & Abley Expires October 24, 2010 [Page 11] Internet-Draft Top Level Domain Name Specification April 2010 Authors' Addresses Lars-Johan Liman Autonomica AB Franzengatan 5 SE-112 51 Stockholm Sweden Email: liman@autonomica.se URI: http://www.autonomica.se/ Joe Abley ICANN 4676 Admiralty Way Suite 330 Marina del Rey 90292 USA Email: joe.abley@icann.org URI: http://www.icann.org/ Liman & Abley Expires October 24, 2010 [Page 12]