Category: PERL
2014-01-20 11:33:47
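1.10.1 Problem
You have a Unicode string, but you want Perl to treat it as a sequence of octets (for example, to calculate its length in bytes or for I/O).
1.10.2 Solution
The "use bytes" pragma makes every Perl operation in its lexical scope treat strings as sequences of octets. Use it when your code calls Perl's character-aware functions directly: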
$ff = "\x{FB00}"; # ff ligature
$chars = length($ff); # length is one character
{
    use bytes; # force byte semantics
    $octets = length($ff); # length is three octets (U+FB00 is EF AC 80 in UTF-8)
}
$chars = length($ff); # back to character semantics
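The lexical scoping described above can also be confined to a small helper. The following is a minimal sketch, not part of the original recipe: the helper name octet_length and the sample string $text are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original recipe): returns the number
# of octets in Perl's internal representation of a string.
sub octet_length {
    my ($string) = @_;
    use bytes;                  # byte semantics from here to the end of this sub
    return length($string);
}

my $ff   = "\x{FB00}";            # one character: U+FB00, the ff ligature
my $text = "caf\x{E9} \x{263A}";  # six characters, two of them non-ASCII

# Outside the helper, length() still counts characters.
printf "ff:   %d char(s), %d octet(s)\n", length($ff),   octet_length($ff);
printf "text: %d char(s), %d octet(s)\n", length($text), octet_length($text);
# prints:
# ff:   1 char(s), 3 octet(s)
# text: 6 char(s), 9 octet(s)
```

Note that "use bytes" exposes Perl's internal storage rather than a chosen encoding. A string whose code points are all below 256 may be stored one byte per character, in which case the count will not match the length of its UTF-8 encoding.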
Alternatively, the Encode module lets you convert a Unicode string to a string of octets, and back again. Use it when the character-aware code isn't in your lexical scope:
use Encode qw(encode_utf8);
sub somefunc; # defined elsewhere
$ff = "\x{FB00}"; # ff ligature
$ff_oct = encode_utf8($ff); # convert to octets
$chars = somefunc($ff); # work with character string
$octets = somefunc($ff_oct); # work with octet string
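Because somefunc is only declared above, the snippet cannot be run as it stands. A self-contained sketch of the same idea follows. The routine count_units is a hypothetical stand-in for somefunc, assumed here to do nothing more than call length. The round trip through decode_utf8 illustrates the "and back again" direction.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8);

# Hypothetical stand-in for somefunc: a character-aware routine that lives
# outside the caller's lexical scope, so "use bytes" there would not affect it.
sub count_units {
    my ($string) = @_;
    return length($string);
}

my $ff     = "\x{FB00}";            # ff ligature, one character
my $ff_oct = encode_utf8($ff);      # three octets: EF AC 80

my $chars  = count_units($ff);      # counts characters
my $octets = count_units($ff_oct);  # counts octets, because each octet is now its own character

my $round_trip = decode_utf8($ff_oct);  # back to a one-character string

printf "chars=%d octets=%d round_trip_ok=%s\n",
    $chars, $octets, ($round_trip eq $ff ? "yes" : "no");
# prints: chars=1 octets=3 round_trip_ok=yes
```

Unlike "use bytes", this approach yields a new string in an explicitly chosen encoding (UTF-8), so the result does not depend on how Perl happens to store the original string internally.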