Category: Mysql/postgreSQL

2012-11-26 17:26:21

Basic Concepts
"Character" and "byte" are different! You must understand this before continuing. A "byte" is an 8-bit thing; it is the unit of space in computers (today). A "character" is composed of one or more bytes, and represents what we think of when reading.

A byte can represent only 256 different values. There are over 11,000 Korean characters and over 40,000 Chinese characters -- no way to squeeze such a character into a single byte. 
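A quick way to see the difference is to count the same text both ways. The sketch below is in Python, which the original post does not use; the sample strings and helper names are arbitrary choices made for illustration.

```python
# Count characters vs. bytes for the same text (sample strings are illustrative).

def char_and_byte_counts(text, encoding="utf-8"):
    """Return (number of characters, number of bytes in the given encoding)."""
    return len(text), len(text.encode(encoding))


def fits_single_byte_charset(text, encoding="latin-1"):
    """True if every character can be stored in one byte of a single-byte charset."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for s in ["abc", "\u00e9", "中文", "한국어"]:
    chars, nbytes = char_and_byte_counts(s)
    print(f"{s!r}: {chars} character(s), {nbytes} byte(s) in UTF-8, "
          f"fits latin-1: {fits_single_byte_charset(s)}")
```

Only the ASCII and Latin samples fit a single-byte charset; the Chinese and Korean samples need several bytes per character.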

Charset vs. collation. These are different things! 'Charset' ('character set'; 'encoding') refers to the bits used to represent 'characters'. 'Collation' refers to how those bits are compared for inequality (WHERE) and sorting (ORDER BY). GROUP BY and FOREIGN KEY constraints can also involve collation. Collation can even decide whether two different bit strings compare 'equal'.
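To make the split concrete, here is a rough Python analogy: the same strings, stored with the same encoding, compared under two different rules. The two key functions are simplified stand-ins for a binary and a case-insensitive collation; they are assumptions for illustration and not MySQL's actual comparison algorithms.

```python
# Same encoding (UTF-8), two different comparison rules.

def binary_key(s):
    # Compare the raw encoded bytes, roughly what a *_bin collation does.
    return s.encode("utf-8")


def case_insensitive_key(s):
    # Fold case before comparing; a crude stand-in for a *_ci collation.
    return s.casefold()


words = ["banana", "Apple", "apple", "Cherry"]

print(sorted(words, key=binary_key))            # uppercase letters sort first
print(sorted(words, key=case_insensitive_key))  # case is ignored

# The rule also decides equality: different bytes, "equal" under the second rule.
print(binary_key("Apple") == binary_key("apple"))
print(case_insensitive_key("Apple") == case_insensitive_key("apple"))
```

The stored bytes never change between the two runs; only the ordering and the notion of equality do, which is the part a collation controls.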
FOR EXAMPLE
CHARACTER SET FOR UTF8

  mysql> select * from information_schema.character_sets where CHARACTER_SET_NAME='utf8' \G
  *************************** 1. row ***************************
    CHARACTER_SET_NAME: utf8
  DEFAULT_COLLATE_NAME: utf8_general_ci
           DESCRIPTION: UTF-8 Unicode
                MAXLEN: 3
  1 row in set (0.00 sec)
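MAXLEN: 3 means this utf8 charset stores at most three bytes per character. The Python sketch below checks which characters fit under that limit; the helper names and sample characters are hypothetical, and only the limit of 3 comes from the output above.

```python
# Which characters fit in a charset whose MAXLEN is 3 bytes per character?
UTF8_MAXLEN = 3  # taken from the MAXLEN column shown above


def utf8_length(ch):
    """Number of bytes one character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def fits_maxlen(text, maxlen=UTF8_MAXLEN):
    """True if every character in text encodes to at most maxlen bytes."""
    return all(utf8_length(ch) <= maxlen for ch in text)


for ch in ["A", "\u00e9", "中", "\U0001F600"]:
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s), fits: {fits_maxlen(ch)}")
```

Characters that need four bytes in UTF-8, such as the emoji in the last sample, exceed this limit; MySQL provides the separate utf8mb4 charset for those.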
FOR COLLATIONS OF UTF8   

  mysql> select * from information_schema.collations where character_set_name='utf8' \G
  *************************** 1. row ***************************
      COLLATION_NAME: utf8_general_ci
  CHARACTER_SET_NAME: utf8
                  ID: 33
          IS_DEFAULT: Yes
         IS_COMPILED: Yes
             SORTLEN: 1
  *************************** 2. row ***************************
      COLLATION_NAME: utf8_bin
  CHARACTER_SET_NAME: utf8
                  ID: 83
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 1
  *************************** 3. row ***************************
      COLLATION_NAME: utf8_unicode_ci
  CHARACTER_SET_NAME: utf8
                  ID: 192
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 4. row ***************************
      COLLATION_NAME: utf8_icelandic_ci
  CHARACTER_SET_NAME: utf8
                  ID: 193
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 5. row ***************************
      COLLATION_NAME: utf8_latvian_ci
  CHARACTER_SET_NAME: utf8
                  ID: 194
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 6. row ***************************
      COLLATION_NAME: utf8_romanian_ci
  CHARACTER_SET_NAME: utf8
                  ID: 195
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 7. row ***************************
      COLLATION_NAME: utf8_slovenian_ci
  CHARACTER_SET_NAME: utf8
                  ID: 196
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 8. row ***************************
      COLLATION_NAME: utf8_polish_ci
  CHARACTER_SET_NAME: utf8
                  ID: 197
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 9. row ***************************
      COLLATION_NAME: utf8_estonian_ci
  CHARACTER_SET_NAME: utf8
                  ID: 198
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 10. row ***************************
      COLLATION_NAME: utf8_spanish_ci
  CHARACTER_SET_NAME: utf8
                  ID: 199
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 11. row ***************************
      COLLATION_NAME: utf8_swedish_ci
  CHARACTER_SET_NAME: utf8
                  ID: 200
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 12. row ***************************
      COLLATION_NAME: utf8_turkish_ci
  CHARACTER_SET_NAME: utf8
                  ID: 201
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 13. row ***************************
      COLLATION_NAME: utf8_czech_ci
  CHARACTER_SET_NAME: utf8
                  ID: 202
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 14. row ***************************
      COLLATION_NAME: utf8_danish_ci
  CHARACTER_SET_NAME: utf8
                  ID: 203
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 15. row ***************************
      COLLATION_NAME: utf8_lithuanian_ci
  CHARACTER_SET_NAME: utf8
                  ID: 204
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 16. row ***************************
      COLLATION_NAME: utf8_slovak_ci
  CHARACTER_SET_NAME: utf8
                  ID: 205
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 17. row ***************************
      COLLATION_NAME: utf8_spanish2_ci
  CHARACTER_SET_NAME: utf8
                  ID: 206
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 18. row ***************************
      COLLATION_NAME: utf8_roman_ci
  CHARACTER_SET_NAME: utf8
                  ID: 207
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 19. row ***************************
      COLLATION_NAME: utf8_persian_ci
  CHARACTER_SET_NAME: utf8
                  ID: 208
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 20. row ***************************
      COLLATION_NAME: utf8_esperanto_ci
  CHARACTER_SET_NAME: utf8
                  ID: 209
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 21. row ***************************
      COLLATION_NAME: utf8_hungarian_ci
  CHARACTER_SET_NAME: utf8
                  ID: 210
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 22. row ***************************
      COLLATION_NAME: utf8_sinhala_ci
  CHARACTER_SET_NAME: utf8
                  ID: 211
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 8
  *************************** 23. row ***************************
      COLLATION_NAME: utf8_general_mysql500_ci
  CHARACTER_SET_NAME: utf8
                  ID: 223
          IS_DEFAULT:
         IS_COMPILED: Yes
             SORTLEN: 1
  23 rows in set (0.00 sec)
Internal struct:

  typedef struct character_set
  {
    unsigned int number;   /* character set number */
    unsigned int state;    /* character set state */
    const char *csname;    /* collation name */
    const char *name;      /* character set name */
    const char *comment;   /* comment */
    const char *dir;       /* character set directory */
    unsigned int mbminlen; /* min. length for multibyte strings */
    unsigned int mbmaxlen; /* max. length for multibyte strings */
  } MY_CHARSET_INFO;
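The mbminlen and mbmaxlen fields bound how many bytes a string of a given character count can occupy. The Python sketch below mirrors a subset of the struct to show that arithmetic. The class `CharsetInfo` and its method `byte_bounds` are hypothetical, the field meanings follow the comments in the struct as shown, and the values are taken from the utf8 / utf8_general_ci output earlier (mbminlen = 1 is an assumption based on ASCII characters taking one byte).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharsetInfo:
    """Hypothetical, simplified mirror of a few MY_CHARSET_INFO fields."""
    number: int    # character set number (ID 33 in the listing above)
    csname: str    # collation name, per the struct comment
    name: str      # character set name, per the struct comment
    mbminlen: int  # min. bytes per character
    mbmaxlen: int  # max. bytes per character

    def byte_bounds(self, n_chars):
        """Return (fewest, most) bytes that n_chars characters can occupy."""
        return n_chars * self.mbminlen, n_chars * self.mbmaxlen


utf8_general_ci = CharsetInfo(
    number=33,
    csname="utf8_general_ci",
    name="utf8",
    mbminlen=1,  # assumed: ASCII characters take one byte
    mbmaxlen=3,  # matches MAXLEN: 3 reported for utf8
)

low, high = utf8_general_ci.byte_bounds(10)
print(f"10 characters occupy between {low} and {high} bytes")
```

Under these assumptions, a 10-character value needs anywhere from 10 to 30 bytes, which is why character counts and byte counts must be kept apart when sizing columns and buffers.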


 
