Character Encoding in Java - gslsok - ChinaUnix Blog

Character Encoding in Java

Category: Java

2008-04-19 18:49:18

Write the following program, then run it and analyze the output:

import java.io.*;

public class TestCodeIO {
   public static void main(String[] args) throws Exception {
         // Decode the bytes typed on the console as iso8859-1
         InputStreamReader isr = new InputStreamReader(System.in,"iso8859-1");
         BufferedReader br = new BufferedReader(isr);
         String strLine = br.readLine();
         br.close();
         isr.close();
         System.out.println(strLine);
   }
}
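After running the program and typing the two characters "中国", the output is ???ú
Modify the program in each of the following two ways so that the Chinese input is printed correctly:
1. Modify this statement in the program:
            InputStreamReader isr = new InputStreamReader(System.in,"iso8859-1");
2. Leave the statement above unchanged and modify this statement instead:
            System.out.println(strLine);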

The first fix is simple: just change the statement as shown below. I won't discuss it in detail here.
         InputStreamReader isr = new InputStreamReader(System.in,"gb2312");
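
The effect of the first fix can be checked without a console. The following is a minimal sketch, not the original program: the class name `DecodeDemo` and the helper `readLine` are made up for illustration, the console input is simulated with a byte array holding "中国" in GB2312 (bytes D6 D0 B9 FA), and it assumes the JVM ships the GB2312 charset, as standard JDK builds normally do.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

class DecodeDemo {
    // Hypothetical helper: decodes the first line of raw bytes with the named
    // charset, the same way the program above decodes System.in.
    static String readLine(byte[] raw, String charsetName) {
        try (BufferedReader br = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(raw), Charset.forName(charsetName)))) {
            return br.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed console input: "中国" encoded as GB2312, followed by a newline.
        byte[] raw = {(byte) 0xD6, (byte) 0xD0, (byte) 0xB9, (byte) 0xFA, '\n'};

        String fixed = readLine(raw, "gb2312");       // fix 1: decode as GB2312
        String original = readLine(raw, "iso8859-1"); // original program: decode as ISO-8859-1

        // \u4e2d\u56fd is "中国"; escapes keep the source file encoding-neutral.
        System.out.println(fixed.equals("\u4e2d\u56fd"));               // true
        // ISO-8859-1 maps each byte to one character: D6 D0 B9 FA -> Ö Ð ¹ ú
        System.out.println(original.equals("\u00d6\u00d0\u00b9\u00fa")); // true
    }
}
```

Decoding with the charset the bytes were actually written in yields the intended two characters, while ISO-8859-1 yields four unrelated Latin-1 characters.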

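What I want to discuss in detail here is how to make the second fix.

At first I changed it like this:
      System.out.println(new String (strLine.getBytes(),"iso8859-1"));
After entering "中国", the output is no longer the garbled text shown above, but it is still garbled, so this fix is clearly wrong!

Here I want to thank 软件民工 for telling me the correct fix, which made everything suddenly clear:
      System.out.println(new String (strLine.getBytes("iso8859-1")));

What exactly is the difference between these two fixes? For ease of reading, here are the correct and incorrect fixes side by side: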
import java.io.*;

public class TestCodeIO {
      public static void main(String[] args) throws Exception {
            InputStreamReader isr = new InputStreamReader(System.in,"iso8859-1");
                  // Creates an InputStreamReader that decodes the input bytes as iso8859-1
            BufferedReader br = new BufferedReader(isr);
            String strLine = br.readLine();
            br.close();
            isr.close();
            System.out.println(strLine);
            System.out.println(new String(strLine.getBytes(),"iso8859-1")); // wrong fix
                  // Encodes strLine into bytes using the platform's default charset
                  // (gb2312 here), then constructs a new String by decoding those
                  // bytes as iso8859-1. strLine was produced by decoding the input as
                  // iso8859-1, so only encoding it with iso8859-1 gives back the
                  // original bytes; gb2312 yields different bytes, so this line is wrong.
            System.out.println(new String(strLine.getBytes("iso8859-1"))); // correct fix
                  // Encodes strLine into bytes using the named charset (iso8859-1),
                  // which restores the original input bytes, then constructs a new
                  // String by decoding them with the platform's default charset (gb2312).
                  // This line is right.
      }
}

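The English comments above already make this fairly clear, but let me explain anyway:

First, the wrong fix:  System.out.println(new String (strLine.getBytes(),"iso8859-1"));
This statement converts the string in strLine into a byte sequence using the system default
encoding (gb2312 here), then constructs a new String object by decoding those bytes with the
specified encoding (iso8859-1), and prints it to the screen.
Where is the mistake?
Look at this piece of code:
InputStreamReader isr = new InputStreamReader(System.in,"iso8859-1");
BufferedReader br = new BufferedReader (isr);
String strLine = br.readLine();
The content of strLine was obtained by decoding the input bytes with the specified encoding
(iso8859-1), so only iso8859-1 can turn it back into the original bytes. The conversion to bytes
(strLine.getBytes()), however, uses the system default gb2312 encoding, which produces different
bytes, so of course the output is garbled! On top of that, the new String object is then built from
those gb2312-encoded bytes using iso8859-1, which is why this garbled output differs from that of
System.out.println(strLine).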
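
The difference between the two fixes can be traced at the byte level. The sketch below is an illustration, not the original program: the class `RoundTripTrace` and its methods `hex`, `wrongFix`, and `rightFix` are hypothetical names, and GB2312 is passed explicitly in place of the platform default charset, on the assumption that it matches the default on the machine described here (many current JVMs default to UTF-8 instead).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RoundTripTrace {
    // Assumption: GB2312 stands in for the platform default charset of the article.
    static final Charset GB = Charset.forName("GB2312");

    // Formats bytes as space-separated upper-case hex, e.g. "D6 D0".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    // Wrong fix: encode with the "default" charset, then decode as ISO-8859-1.
    static String wrongFix(String strLine) {
        return new String(strLine.getBytes(GB), StandardCharsets.ISO_8859_1);
    }

    // Correct fix: encode as ISO-8859-1 (restores the raw input bytes),
    // then decode with the "default" charset.
    static String rightFix(String strLine) {
        return new String(strLine.getBytes(StandardCharsets.ISO_8859_1), GB);
    }

    public static void main(String[] args) {
        String typed = "\u4e2d\u56fd";                  // "中国"
        byte[] inputBytes = typed.getBytes(GB);         // what the console delivers
        // What br.readLine() returns when the reader decodes as ISO-8859-1:
        String strLine = new String(inputBytes, StandardCharsets.ISO_8859_1);

        System.out.println(hex(inputBytes));                                     // D6 D0 B9 FA
        System.out.println(hex(strLine.getBytes(StandardCharsets.ISO_8859_1))); // D6 D0 B9 FA
        // Encoding with GB2312 instead yields different bytes; characters that
        // GB2312 cannot represent are replaced, so the input bytes are lost.
        System.out.println(hex(strLine.getBytes(GB)));

        System.out.println(rightFix(strLine).equals(typed)); // true
        System.out.println(wrongFix(strLine).equals(typed)); // false
    }
}
```

ISO-8859-1 maps every byte value to exactly one character and back, which is why encoding strLine with it restores the input bytes exactly. Encoding with any other charset gives no such guarantee.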
