2009-03-26 08:54:38

The previous post mentioned IBM's CCSID information; today's post gives a brief introduction to CCSIDs.
A CCSID is a Coded Character Set Identifier. The Unicode standard defines a coded character set as "a character set in which each character is assigned a numeric code value." A CCSID is therefore just a number that identifies one particular assignment of numeric code values to characters. The IBM Character Data Representation Architecture (CDRA), defined in SC09-1390, specifies the CCSIDs that IBM uses to represent character data. The architecture defines Single Byte Character Set (SBCS) CCSIDs, Multiple Byte Character Set (MBCS) CCSIDs, and Mixed CCSIDs, which combine SBCS and MBCS data. MBCS CCSIDs are usually used for languages such as Chinese, Japanese, and Korean, which have more characters than a single byte can represent.
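The difference between single-byte and mixed data can be seen by counting encoded bytes. This is only a rough sketch: Python codec names are used as stand-ins for CCSIDs (an assumption, since Python does not expose IBM CCSID numbers), with `cp037` standing for an SBCS EBCDIC code page and `gb2312` for a mixed single/double-byte Chinese encoding. The helper `encoded_length` is a hypothetical name.

```python
# Python codec names stand in for CCSIDs here (an assumption, not an IBM API):
#   "cp037"  ~ a single-byte EBCDIC code page
#   "gb2312" ~ a mixed single/double-byte Simplified Chinese encoding


def encoded_length(text: str, codec: str) -> int:
    """Return how many bytes `text` occupies when encoded with `codec`."""
    return len(text.encode(codec))


sbcs_len = encoded_length("A", "cp037")      # one Latin letter, single-byte set
dbcs_len = encoded_length("中", "gb2312")     # one Chinese character
mixed_len = encoded_length("A中", "gb2312")   # single-byte and double-byte data together

print(sbcs_len, dbcs_len, mixed_len)
```

In the mixed string the Latin letter still takes one byte and the Chinese character two, which is why a single byte per character is not enough for such languages.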

All SBCS CCSIDs define a similar basic set of characters, although they may assign different numeric code values to them. For instance, all SBCS EBCDIC CCSIDs encode the digit "1" as x'F1' (the digit "0" is x'F0'), while all SBCS ASCII CCSIDs encode the digit "1" as x'31' (the digit "0" is x'30').
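The digit encodings can be checked quickly with Python's built-in codecs. As a sketch, `cp037` is taken as the counterpart of EBCDIC CCSID 37 and `cp1252` as the counterpart of the ASCII-based CCSID 1252; treating these codec names as equivalent to the CCSIDs is an assumption.

```python
# Compare the code value of the same digit under an EBCDIC and an ASCII-based code page.
# "cp037" and "cp1252" are Python codec names assumed to match CCSID 37 and CCSID 1252.
digit = "1"

ebcdic_byte = digit.encode("cp037")
ascii_byte = digit.encode("cp1252")

print(ebcdic_byte.hex(), ascii_byte.hex())  # f1 31

# The digit "0" sits one position lower in both code pages.
zero_ebcdic = "0".encode("cp037")
zero_ascii = "0".encode("cp1252")
print(zero_ebcdic.hex(), zero_ascii.hex())  # f0 30
```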
Conversion can take many forms. One form is converting data from one CCSID to another, for example from ASCII CCSID 1252 to EBCDIC CCSID 37. Another form is converting string data to another data type, for example converting a character string to a numeric value. In both cases the CCSID of the data must be known for the operation to be performed correctly.
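Both kinds of conversion can be sketched in a few lines. The functions `convert_ccsid` and `to_number` are hypothetical helpers written for this illustration, and the codec names `cp1252` and `cp037` are again assumed to correspond to CCSID 1252 and CCSID 37. The last step shows why the CCSID matters: reading EBCDIC digits as if they were CCSID 1252 data does not yield a number.

```python
def convert_ccsid(data: bytes, source_codec: str, target_codec: str) -> bytes:
    """Re-encode bytes from one code page to another via Unicode."""
    return data.decode(source_codec).encode(target_codec)


def to_number(data: bytes, codec: str) -> int:
    """Interpret encoded character data as a decimal integer."""
    return int(data.decode(codec))


# Form 1: CCSID-to-CCSID conversion (1252 -> 37).
ascii_digits = "123".encode("cp1252")
ebcdic_digits = convert_ccsid(ascii_digits, "cp1252", "cp037")
print(ebcdic_digits.hex())  # f1f2f3

# Form 2: string-to-number conversion, using the correct code page.
value = to_number(ebcdic_digits, "cp037")
print(value)  # 123

# Using the wrong code page turns the digits into accented letters, so parsing fails.
try:
    to_number(ebcdic_digits, "cp1252")
    wrong_ccsid_failed = False
except ValueError:
    wrong_ccsid_failed = True
print(wrong_ccsid_failed)  # True
```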
 
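Conversion does have a precondition, however. For the first kind of conversion, the target CCSID must contain the characters being converted from the source CCSID. For example, when converting the Chinese character "中" from CCSID 1381 (S-CHGBPC-DATA), the Simplified Chinese PC encoding, to some other CCSID, the character set of the target CCSID must at least contain the same character "中".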
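A minimal sketch of this precondition follows. Python has no codec named after CCSID 1381, so `gb2312` is used as a stand-in for the Simplified Chinese source encoding; that substitution is an assumption and the byte values need not match CCSID 1381 exactly. `can_convert` is a hypothetical helper.

```python
def can_convert(text: str, target_codec: str) -> bool:
    """Return True if every character of `text` exists in the target code page."""
    try:
        text.encode(target_codec)
        return True
    except UnicodeEncodeError:
        return False


# "gb2312" stands in for the Simplified Chinese source encoding (assumption).
source_bytes = "中".encode("gb2312")
text = source_bytes.decode("gb2312")

fits_gbk = can_convert(text, "gbk")       # another Chinese code page that contains "中"
fits_ebcdic = can_convert(text, "cp037")  # a Latin-only EBCDIC code page

print(fits_gbk, fits_ebcdic)  # True False
```

When the target set lacks the character, a real conversion has to fail or fall back to a substitution character, so the original data cannot be recovered afterwards.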