NUXI: big-little-endian - jack_zheng - ChinaUnix Blog

jack's blog

NUXI: big-little-endian

2009-02-04 11:17:18

Issues with byte order are sometimes called the NUXI problem: UNIX stored on a big-endian machine can show up as NUXI on a little-endian one.

Suppose we want to store 4 bytes (U, N, I and X) as two shorts: UN and IX. Each letter is an entire byte, like our WXYZ example above. To store the two shorts we would write:

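short shorts[2];             // two 2-byte shorts, covering byte locations 0-3

void store_unix(void) {
    short *s = &shorts[0];   // point to location 0
    *s = 'U' * 256 + 'N';    // store first short: U * 256 + N
    s = &shorts[1];          // point to the next short, at location 2
    *s = 'I' * 256 + 'X';    // store second short: I * 256 + X
}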

This code is not specific to a machine. If we store "UN" on a machine and ask to read it back, it had better be "UN"! I don't care about endian issues: if we store a value on one machine, we need to get the same value back.

However, if we look at memory one byte at a time (using our char * trick), the order could vary. On a big-endian machine we would see:

Byte:     U N I X
Location: 0 1 2 3

Which makes sense. U is the biggest (most significant) byte in "UN" and is stored first. The same goes for IX: I is the biggest and is stored first.

On a little-endian machine we would see:

Byte:     N U X I
Location: 0 1 2 3

And this makes sense too. "N" is the littlest (least significant) byte in "UN" and is stored first. Again, even though the bytes are stored "backwards" in memory, the little-endian machine knows it is little-endian and interprets them correctly when reading the values back. Also, note that we can specify hex numbers such as x = 0x1234 on any machine. Even a little-endian machine knows what you mean when you write 0x1234 and won't force you to swap the bytes yourself: you specify the hex number to write, and the machine figures out the details and swaps the bytes in memory under the covers. Tricky.
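The two memory layouts can be checked on a real machine with a small sketch like the one below. The function names (pack, unix_bytes_in_memory, host_is_little_endian) are made up for illustration, and the code assumes a 2-byte short, which is typical but not guaranteed by the C standard. It stores UN and IX as shorts, then walks the same memory one byte at a time through an unsigned char pointer.

```c
#include <stddef.h>

/* Pack two characters into one short: hi * 256 + lo, as in the text. */
unsigned short pack(unsigned char hi, unsigned char lo) {
    return (unsigned short)(hi * 256 + lo);
}

/* Store the shorts UN and IX, then copy their bytes out in memory order.
   Assumption: sizeof(unsigned short) == 2. out needs room for 5 chars. */
void unix_bytes_in_memory(char out[5]) {
    unsigned short s[2];
    const unsigned char *p = (const unsigned char *)s; /* byte-wise view */
    size_t i;

    s[0] = pack('U', 'N');
    s[1] = pack('I', 'X');
    for (i = 0; i < 4; i++) {
        out[i] = (char)p[i];   /* location i, exactly as laid out in memory */
    }
    out[4] = '\0';
}

/* Returns 1 if the low byte of 0x1234 is stored first (little-endian host). */
int host_is_little_endian(void) {
    unsigned short x = 0x1234;  /* written the same way on every machine */
    const unsigned char *p = (const unsigned char *)&x;
    return p[0] == 0x34;
}
```

Calling unix_bytes_in_memory from a main function and printing the buffer should give "UNIX" on a big-endian host and "NUXI" on a little-endian one, while the values s[0] and s[1] read back as shorts are identical on both.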

This scenario is called the "NUXI" problem because the byte sequence UNIX is interpreted as NUXI on the other type of machine. Again, this is only a problem if you exchange data; each machine is internally consistent.
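The exchange itself can be simulated on a single machine. The sketch below is an illustration, not part of the original post: read_be16, read_le16 and spell_shorts are hypothetical helpers that interpret the same four raw bytes the way a big-endian or a little-endian reader would. They use shifts instead of pointer casts, so the result does not depend on the host's own byte order.

```c
#include <stdint.h>

/* Interpret two bytes as a big-endian reader would: first byte is biggest. */
uint16_t read_be16(const unsigned char *p) {
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Interpret two bytes as a little-endian reader would: first byte is littlest. */
uint16_t read_le16(const unsigned char *p) {
    return (uint16_t)((p[1] << 8) | p[0]);
}

/* Read 4 raw bytes as two shorts, then spell each short out with its
   most significant character first. out needs room for 5 chars. */
void spell_shorts(const unsigned char bytes[4], int little_endian_reader,
                  char out[5]) {
    int i;
    for (i = 0; i < 2; i++) {
        uint16_t v = little_endian_reader ? read_le16(bytes + 2 * i)
                                          : read_be16(bytes + 2 * i);
        out[2 * i]     = (char)(v >> 8);    /* biggest byte of the short */
        out[2 * i + 1] = (char)(v & 0xFF);  /* littlest byte of the short */
    }
    out[4] = '\0';
}
```

Given the bytes U N I X as a big-endian machine wrote them, spell_shorts with little_endian_reader set to 0 recovers "UNIX", and with it set to 1 yields "NUXI": the bytes never changed, only the reader's assumption about their order.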
