Internet Draft                                               Sun Guonian
draft-guonian-idn-ace-eval-cn-00.txt                               CNNIC
Jul 10, 2001                                        Expires Jan 10, 2002


     Evaluation of various ACEs with existing Chinese Domain Names


Status of this memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   The ACE (ASCII Compatible Encoding) Design Team in the IDN Working
   Group is selecting an appropriate ACE proposal.  To support the work
   of the ACE Design Team, this document presents the results of
   applying various ACEs to Chinese Domain Names.

1. Test data and tools

   The test data consist of 100,000 names sampled from CNNIC's current
   registry database.  The ACEs applied are RACE-03 [RACE], BRACE-00
   [BRACE], LACE-01 [LACE], UTF-6-00 [UTF6], DUDE-02 [DUDE],
   AMC-ACE-M-00 [AMCACEM], AltDUDE-01 [AltDUDE], AMC-ACE-O-00
   [AMCACEO], AMC-ACE-R-01 [AMCACER], AMC-ACE-V-00 [AMCACEV],
   AMC-ACE-W-00 [AMCACEW], MACE-01 [MACE] and LDUDE-00 [LDUDE].

2. Result of each ACE

   The tables in this section use the following columns:

   HANZI_len  the number of characters in the Chinese string before
              UCS2-to-ACE coding
   max_len    the maximum length that an ACE-coded string reaches
   min_len    the minimum length that an ACE-coded string reaches
   aver_len   the average length of all ACE-coded strings converted
              from Chinese strings with the same HANZI_len
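   The column definitions above can be illustrated with a short Python
   sketch.  It is not the tool used to produce the tables in this
   document.  Several things in it are assumptions: `encode` stands for
   any ACE encoder (none is supplied here), `toy_encode` and its "zz--"
   prefix are made-up stand-ins, names are assumed to contain only BMP
   characters so that len() counts Hanzi, and max_cstring (Section 3)
   is read as the last HANZI_len before max_len first reaches 64.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, NamedTuple


class LengthStats(NamedTuple):
    max_len: int
    min_len: int
    aver_len: float


def length_stats(names: Iterable[str],
                 encode: Callable[[str], str]) -> Dict[int, LengthStats]:
    """Group names by HANZI_len and summarise their encoded lengths."""
    buckets = defaultdict(list)
    for name in names:
        # len(name) is HANZI_len; len(encode(name)) is the ACE-coded length.
        buckets[len(name)].append(len(encode(name)))
    return {
        n: LengthStats(max(v), min(v), round(sum(v) / len(v), 4))
        for n, v in sorted(buckets.items())
    }


def max_cstring(stats: Dict[int, LengthStats], limit: int = 64) -> int:
    """Largest HANZI_len seen before max_len first reaches `limit`.

    Assumed reading of Section 3; returns 0 if no length qualifies.
    """
    best = 0
    for n in sorted(stats):
        if stats[n].max_len >= limit:
            break
        best = n
    return best


def toy_encode(name: str) -> str:
    """Hypothetical stand-in encoder, NOT one of the evaluated ACEs.

    Emits a made-up "zz--" prefix plus the hex digits of the UTF-16BE
    bytes, so every n-character BMP name becomes 4 + 4n characters.
    """
    return "zz--" + name.encode("utf-16-be").hex()
```

   With the toy encoder, length_stats(["中", "中国"], toy_encode)[2]
   has max_len and min_len both equal to 12.  A real evaluation would
   pass an actual ACE encoder in place of toy_encode.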

2.1 AMC-ACE-M

     HANZI_len   max_len   min_len  aver_len
   -----------  --------  --------  --------
             1         9         9    9.0000
             2        13        10   12.7385
             3        17        11   15.8155
             4        20        12   18.8141
             5        23        13   21.7565
             6        26        14   24.6305
             7        29        16   27.5662
             8        33        16   30.4084
             9        35        30   33.3293
            10        39        27   36.1929
            11        41        36   39.1082
            12        45        34   42.0063
            13        48        42   44.8869
            14        50        43   47.8597
            15        53        48   50.7955
            16        57        51   53.8091
            17        59        53   56.5031
            18        62        57   59.7467
            19        65        61   62.5106
            20        68        63   65.6471

2.2 AMC-ACE-O

     HANZI_len   max_len   min_len  aver_len
   -----------  --------  --------  --------
             1         9         9    9.0000
             2        13        10   12.7401
             3        17        11   16.2954
             4        21        12   19.8272
             5        25        13   23.3622
             6        29        14   26.6862
             7        32        16   30.3160
             8        36        16   33.3543
             9        40        30   36.9435
            10        44        27   40.1309
            11        48        36   43.5358
            12        51        34   46.9934
            13        55        41   50.1921
            14        59        31   53.6718
            15        62        48   56.4908
            16        66        51   60.5728
            17        68        53   63.0881
            18        73        57   66.3467
            19        76        61   70.4894
            20        79        67   73.4706

2.3 AMC-ACE-R

     HANZI_len   max_len   min_len  aver_len
   -----------  --------  --------  --------
             1         9         9    9.0000
             2        13        10   12.7401
             3        17        11   16.4921
             4        21        12   20.3006
             5        25        13   24.0866
             6        29        14   27.7128
             7        33        16   31.5339
             8        37        16   34.8798
             9        41        31   38.7328
            10        45        27   42.2981
            11        49        40   45.9261
            12        53        34   49.7214
            13        57        45   53.1325
            14        61        43   56.8372
            15        65        52   60.1350
            16        69        56   64.5534
            17        73        53   67.0566
            18        75        63   71.5600
            19        80        67   74.8936
            20        84        72   79.2647

2.4 AMC-ACE-V

     HANZI_len   max_len   min_len  aver_len
   -----------  --------  --------  --------
             1         8         8    8.0000
             2        12        10   10.9848
             3        16        11   13.9613
             4        20        12   16.9441
             5        23        13   19.9398
             6        25        14   22.9069
             7        28        16   25.9121
             8        32        16   28.8512
             9        35        29   31.8418
            10        37        27   34.8368
            11        41        35   37.7769
            12        44        35   40.8079
            13        47        40   43.7668
            14        49        32   46.7032
            15        52        45   49.6626
            16        56        50   52.8026
            17        58        53   55.6604
            18        65        57   58.9733
            19        64        60   61.8085
            20        67        63   64.8235

2.5 AMC-ACE-W

     HANZI_len   max_len   min_len  aver_len
   -----------  --------  --------  --------
             1         8         8    8.0000
             2        12        10   10.9848
             3        16        11   13.9790
             4        19        12   16.9746
             5        24        13   19.9955
             6        27        14   22.9755
             7        29        16   26.0036
             8        34        16   28.9908
             9        37        29   32.0108
            10        41        27   35.0201
            11        46        36   38.0537
            12        49        28   41.0596
            13        52        42   44.1294
            14        56        44   47.1385
            15        56        47   50.1472
            16        62        51   53.3204
            17        66        54   56.3082
            18        72        58   60.1333
            19        69        61   62.7234
            20        71        64   65.2059

2.6 AltDUDE

     HANZI_len   max_len   min_len  aver_len
   -----------  --------  --------  --------
             1         8         8    8.0000
             2        12         9   11.7401
             3        16        10   15.4679
             4        20        11   19.2468
             5        24        12   23.0600
             6        28        13   26.7517
             7        32        15   30.6092
             8        36        15   34.1284
             9        40        30   37.9871
            10        44        29   41.5955
            11        48        38   45.2093
            12        52        26   49.0893
            13        56        45   52.7266
            14        60        49   56.3480
            15        64        53   59.7935
            16        68        57   63.8964
            17        72        61   67.2956
            18        76        65   71.6400
            19        79        71   75.3404
            20        83        72   78.3529

2.7 BRACE

     HANZI_len   max_len   min_len  aver_len
   -----------  --------  --------  --------
             1         8         8    8.0000
             2        11         9   10.9402
             3        14        11   13.9295
             4        18        12   17.8337
             5        21        14   20.9788
             6        24        15   23.9537
             7        27        16   26.9647
             8        30        18   29.9676
             9        34        31   33.8942
            10        37        29   36.9766
            11        40        36   39.9731
            12        43        36   42.9665
            13        46        43   45.9814
            14        50        43   49.9406
            15        53        51   52.9877
            16        56        54   55.9806
            17        59        52   58.8868
            18        62        59   61.9200
            19        66        65   65.9787
            20        69        69   69.0000

2.8 DUDE

     HANZI_len   max_len   min_len  aver_len
   -----------  --------  --------  --------
             1         8         8    8.0000
             2        12         9   11.7401
             3        16        10   15.4679
             4        20        11   19.2468
             5        24        12   23.0600
             6        28        13   26.7517
             7        32        15   30.6092
             8        36        15   34.1284
             9        40        30   37.9871
            10        44        29   41.5955
            11        48        38   45.2093
            12        52        26   49.0893
            13        56        45   52.7266
            14        60        49   56.3480
            15        64        53   59.7935
            16        68        57   63.8964
            17        72        61   67.2956
            18        76        65   71.6400
            19        79        71   75.3404
            20        83        72   78.3529

2.9 LACE

     HANZI_len   max_len   min_len  aver_len
   -----------  --------  --------  --------
             1         9         9    9.0000
             2        12        11   11.9642
             3        16        12   15.9868
             4        19        14   18.9913
             5        22        16   21.9985
             6        25        17   24.9972
             7        28        19   27.9995
             8        32        20   31.9990
             9        35        35   35.0000
            10        38        36   37.9994
            11        41        41   41.0000
            12        44        33   43.9934
            13        48        48   48.0000
            14        51        51   51.0000
            15        54        54   54.0000
            16        57        57   57.0000
            17        60        60   60.0000
            18        64        64   64.0000
            19        67        67   67.0000
            20        70        70   70.0000

2.10 LDUDE

     HANZI_len   max_len   min_len  aver_len
   -----------  --------  --------  --------
             1         8         8    8.0000
             2        12         9   11.3573
             3        16        10   14.4717
             4        20        11   17.5182
             5        24        12   20.5703
             6        28        13   23.5265
             7        32        15   26.6787
             8        36        15   29.6551
             9        40        21   32.7320
            10        44        23   35.9557
            11        48        28   38.9694
            12        51        26   42.1635
            13        55        34   45.1844
            14        60        37   48.4613
            15        63        40   51.3865
            16        66        43   54.9903
            17        70        45   58.2767
            18        75        50   61.8133
            19        76        51   62.7234
            20        78        57   67.9706

2.11 MACE

     HANZI_len   max_len   min_len  aver_len
   -----------  --------  --------  --------
             1         8         7    7.9850
             2        12         9   10.9905
             3        16        10   13.9855
             4        19        11   16.9912
             5        24        12   19.9935
             6        26        14   22.9938
             7        29        15   26.0009
             8        34        15   29.0167
             9        37        29   32.0114
            10        41        30   35.0183
            11        46        37   38.0455
            12        49        29   41.0432
            13        52        42   44.0883
            14        56        45   47.1493
            15        56        48   50.1718
            16        62        53   53.3042
            17        66        55   56.3837
            18        70        59   60.1467
            19        69        62   62.7234
            20        71        64   65.2941

2.12 RACE

     HANZI_len   max_len   min_len  aver_len
   -----------  --------  --------  --------
             1         9         8    8.0014
             2        12         9   11.8931
             3        16        11   15.9835
             4        19        12   18.9981
             5        22        14   21.9994
             6        25        16   24.9987
             7        28        17   27.9993
             8        32        19   31.9989
             9        35        35   35.0000
            10        38        38   38.0000
            11        41        41   41.0000
            12        44        44   44.0000
            13        48        48   48.0000
            14        51        51   51.0000
            15        54        54   54.0000
            16        57        57   57.0000
            17        60        60   60.0000
            18        64        64   64.0000
            19        67        67   67.0000
            20        70        70   70.0000

2.13 UTF-6

     HANZI_len   max_len   min_len  aver_len
   -----------  --------  --------  --------
             1         8         8    8.0000
             2        12         9   11.9582
             3        16        10   15.9889
             4        20        11   19.9985
             5        24        17   23.9995
             6        28        13   27.9984
             7        32        21   31.9993
             8        36        23   35.9989
             9        40        40   40.0000
            10        44        44   44.0000
            11        48        48   48.0000
            12        52        52   52.0000
            13        56        56   56.0000
            14        60        60   60.0000
            15        64        64   64.0000
            16        68        68   68.0000
            17        72        72   72.0000
            18        76        76   76.0000
            19        80        80   80.0000
            20        84        84   84.0000

3. Summary

   ACE_name   max_cstring
   ---------  -----------
   AMC-ACE-M           18
   AMC-ACE-O           15
   AMC-ACE-R           14
   AMC-ACE-V           17
   AMC-ACE-W           16
   AltDUDE             14
   BRACE               18
   DUDE                14
   LACE                17
   LDUDE               15
   MACE                16
   RACE                17
   UTF-6               14

   max_cstring is the maximum length, in characters, of a Chinese
   Domain Name for which max_len is still less than 64.  The threshold
   matters because a DNS label is limited to 63 octets [RFC1035].  For
   example, BRACE reaches a max_len of 62 at HANZI_len 18 and 66 at
   HANZI_len 19, so its max_cstring is 18.

   For Chinese Domain Names, max_len is the most significant measure.

4. References

   [RFC1035]  "DOMAIN NAMES - IMPLEMENTATION AND SPECIFICATION",
              RFC 1035, Nov 1987, P.
              Mockapetris

   [RACE]     "RACE: Row-based ASCII Compatible Encoding for IDN",
              draft-ietf-idn-race-03.txt, Nov 2000, P Hoffman

   [BRACE]    "BRACE: Bi-mode Row-based ASCII-Compatible Encoding for
              IDN version 0.1.2", draft-ietf-idn-brace-00.txt,
              Sep 2000, A Costello

   [LACE]     "LACE: Length-based ASCII Compatible Encoding for IDN",
              draft-ietf-idn-lace-01.txt, Jan 2001, M Davis, P Hoffman

   [UTF6]     "UTF-6 - Yet Another ASCII-Compatible Encoding for IDN",
              draft-ietf-idn-utf6-00, Nov 2000, M Welter, B Spolarich

   [DUDE]     "Differential Unicode Domain Encoding (DUDE)",
              draft-ietf-idn-dude-02.txt, Jun 2001, M Welter,
              B Spolarich, A Costello

   [AMCACEM]  "AMC-ACE-M version 0.1.0",
              draft-ietf-idn-amc-ace-m-00.txt, Feb 2001, A Costello

   [AltDUDE]  "AltDUDE version 0.0.2",
              draft-ietf-idn-altdude-00.txt, Mar 2001, A Costello

   [AMCACEO]  "AMC-ACE-O version 0.0.3",
              draft-ietf-idn-amc-ace-o-00.txt, Mar 2001, A Costello

   [AMCACER]  "AMC-ACE-R version 0.2.1",
              draft-ietf-idn-amc-ace-r-01.txt, May 2001, A Costello

   [AMCACEV]  "AMC-ACE-V version 0.1.0",
              draft-ietf-idn-amc-ace-v-00.txt, May 2001, A Costello

   [AMCACEW]  "AMC-ACE-W version 0.1.0",
              draft-ietf-idn-amc-ace-w-00.txt, May 2001, A Costello

   [MACE]     "MACE: Modal ASCII Compatible Encoding for IDN",
              draft-ietf-idn-mace-00.txt, Jun 2001, M Ishisone,
              Y Yoneya

   [LDUDE]    "Improving ACE using code point reordering v0.9",
              draft-ietf-idn-lsb-ace-00.txt, Jun 2001, Soobok Lee

   [MDNKIT]   "Multilingual Domain Name tool Kit",
              http://www.nic.ad.jp/jp/research/idn/mdnkit/download/

5. Acknowledgements

   The CNNIC Chinese Registry Service Department provided the
   registered Chinese Domain Names.

   XiaoDong LEE,  lee@cnnic.net.cn
   Wang Yanfeng,  wyf@cnnic.net.cn
   Deng Xiang,    deng@cnnic.net.cn

6. Author's Address

   Sun Guonian
   China Internet Network Information Center
   No.4, South 4th street, Zhongguancun,
   Haidian District, Beijing, China 100080
   sun@cnnic.net.cn