1.1.1 Problem
You need to convert an uppercase string to lowercase, or vice versa.
1.1.2 Solution
Use the lc and uc functions, or the \L and \U string escapes.
$big = uc($little); # "bo peep" -> "BO PEEP"
$little = lc($big); # "JOHN" -> "john"
$big = "\U$little"; # "bo peep" -> "BO PEEP"
$little = "\L$big"; # "JOHN" -> "john"
To change only the first character, use the lcfirst and ucfirst functions, or the \l and \u string escapes.
$big = "\u$little"; # "bo" -> "Bo"
$little = "\l$big"; # "BoPeep" -> "boPeep"
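1.1.3 Discussion
The functions and the string escapes look different, but they do the same thing. You can change the case of either the first character or the whole string. You can even combine the two forms to capitalize the first character and lowercase the rest (this is essentially the titlecase effect discussed later).
$beast = "dromedary"; # capitalize various parts of $beast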
$capit = ucfirst($beast); # Dromedary
$capit = "\u\L$beast"; # (same)
$capall = uc($beast); # DROMEDARY
$capall = "\U$beast"; # (same)
$caprest = lcfirst(uc($beast)); # dROMEDARY
$caprest = "\l\U$beast"; # (same)
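The "(same)" comments hold here because $beast is already all lowercase. As a minimal sketch (the mixed-case input is an assumed example, not from the text), the general function equivalent of "\u\L..." is ucfirst(lc(...)), not ucfirst alone:

```perl
use strict;
use warnings;

my $beast = "dROMEDARY";               # assumed mixed-case input, for illustration only

my $first_only = ucfirst($beast);      # changes only the first character
my $nested     = ucfirst(lc($beast));  # lowercase everything, then capitalize the first character
my $escaped    = "\u\L$beast";         # escape form of the nested call

print "$first_only\n";   # DROMEDARY
print "$nested\n";       # Dromedary
print "$escaped\n";      # Dromedary
```

The same reasoning applies to "\l\U...", which corresponds to lcfirst(uc(...)).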
These case-changing escapes are commonly used to make the case of a string consistent:
# capitalize each word's first character, downcase the rest
$text = "thIS is a loNG liNE";
$text =~ s/(\w+)/\u\L$1/g;
print $text;
# This Is A Long Line
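One thing to watch with the \w+ pattern is that an apostrophe ends a word, so a contraction gets a second capital. The following is only a sketch of one possible workaround; the helper name titlecase_words and the sample sentence are assumptions made for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: capitalize each word, treating an apostrophe
# inside a word as part of that word.
sub titlecase_words {
    my ($text) = @_;
    $text =~ s/(\w[\w']*)/\u\L$1/g;
    return $text;
}

my $plain = "doN'T stop noW";

# The pattern from the text, applied to a copy of the input.
(my $naive = $plain) =~ s/(\w+)/\u\L$1/g;

print "$naive\n";                     # Don'T Stop Now
print titlecase_words($plain), "\n";  # Don't Stop Now
```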
You can also use them for case-insensitive string comparison:
if (uc($a) eq uc($b)) { # or "\U$a" eq "\U$b"
    print "a and b are the same\n";
}
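On Perl 5.16 or later, the built-in fc function (Unicode case folding) is another option for this kind of comparison. The sketch below is not part of the original recipe; the helper name same_ignoring_case and the German sample word are assumptions chosen to show a case where comparing lc results is not enough:

```perl
use v5.16;      # enables strict plus the fc and unicode_strings features
use warnings;

# Hypothetical helper: compare two strings ignoring case, via case folding.
sub same_ignoring_case {
    my ($x, $y) = @_;
    return fc($x) eq fc($y);
}

# "Strasse" spelled with a sharp s (U+00DF), written as an escape.
my $german = "Stra\x{DF}e";

print "fc: same\n"   if same_ignoring_case($german, "STRASSE");
print "lc: differ\n" if lc($german) ne lc("STRASSE");
```

Both messages are printed: case folding maps the sharp s to "ss", while lc leaves it unchanged.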
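The following program, randcap, randomly capitalizes about 20 percent of the letters in its input and lowercases the rest. This lets you converse with 14-year-old WaREz d00Dz.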
#!/usr/bin/perl -p
# randcap: filter to randomly capitalize 20% of the letters
# call to srand( ) is unnecessary as of v5.4
BEGIN { srand(time( ) ^ ($$ + ($$<<15))) }
sub randcase { rand(100) < 20 ? "\u$_[0]" : "\l$_[0]" }
s/(\w)/randcase($1)/ge;
% randcap < genesis | head -9
boOk 01 genesis
001:001 in the BEginning goD created the heaven and tHe earTh.
001:002 and the earth wAS without ForM, aND void; AnD darkneSS was
upon The Face of the dEEp. and the spIrit of GOd movEd upOn
tHe face of the Waters.
001:003 and god Said, let there be ligHt: and therE wAs LigHt.
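randcap relies on the -p switch to loop over its input. As a rough sketch of the same rule applied to an ordinary string, the wrapper randcap_line below is a hypothetical name and the sample sentence is an assumption; only randcase comes from the program above:

```perl
use strict;
use warnings;

# Same per-character rule as randcap: uppercase with 20% probability,
# lowercase otherwise.
sub randcase { rand(100) < 20 ? "\u$_[0]" : "\l$_[0]" }

# Hypothetical wrapper so the substitution works on any string, not just $_.
sub randcap_line {
    my ($line) = @_;
    $line =~ s/(\w)/randcase($1)/ge;
    return $line;
}

my $verse = "In the beginning God created the heaven and the earth.";
print randcap_line($verse), "\n";   # output varies from run to run
```

Whatever the random choices are, only the case of the letters changes, so lowercasing the result always gives the lowercased input.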
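In some languages, the writing system distinguishes between uppercase and titlecase. For these, the ucfirst() function (and its string escape, \u) converts to titlecase. For example, Hungarian has the character sequence "dz" (translator's note: dz is presumably treated as a single letter in that language). In uppercase it is written "DZ", in titlecase "Dz", and in lowercase "dz". Unicode defines three different characters for these three cases: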
Code point  Written  Meaning
01F1        DZ       LATIN CAPITAL LETTER DZ
01F2        Dz       LATIN CAPITAL LETTER D WITH SMALL LETTER Z
01F3        dz       LATIN SMALL LETTER DZ
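The three mappings in the table can be checked directly. This is a small sketch; it prints code points rather than the characters themselves so that it does not depend on the terminal's encoding:

```perl
use strict;
use warnings;

my $small_dz = "\x{01F3}";    # LATIN SMALL LETTER DZ

printf "uc:      %04X\n", ord(uc($small_dz));       # 01F1
printf "ucfirst: %04X\n", ord(ucfirst($small_dz));  # 01F2
printf "lc:      %04X\n", ord(lc("\x{01F1}"));      # 01F3
```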
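It is tempting but ill-advised to convert case with tr[a-z][A-Z] or something similar. This is a mistake because it skips every character that carries a diaeresis, cedilla, or accent mark, and such characters are used in dozens of languages, including English. Correctly handling case on data with diacritical marks is far trickier than it looks, and there is no simple answer. If everything is in Unicode, though, the situation is not that bad, because Perl's case-mapping functions work well on Unicode data. See the discussion of universal character encoding in the introduction to this chapter.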
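To make the difference concrete, here is a minimal sketch comparing the two approaches on a word with accented letters. The sample word and the helper code_points are assumptions for illustration; the unicode_strings feature (Perl 5.12 and later) is enabled so that uc applies Unicode rules to characters below U+0100:

```perl
use strict;
use warnings;
use feature 'unicode_strings';

# Hypothetical helper: show a string as a list of code points.
sub code_points {
    return join " ", map { sprintf("U+%04X", ord($_)) } split(//, $_[0]);
}

my $word = "\x{E9}t\x{E9}";           # "ete" with acute accents on both e's

(my $by_tr = $word) =~ tr/a-z/A-Z/;   # only touches the ASCII letter t
my $by_uc = uc($word);                # also maps U+00E9 to U+00C9

print code_points($by_tr), "\n";   # U+00E9 U+0054 U+00E9
print code_points($by_uc), "\n";   # U+00C9 U+0054 U+00C9
```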
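1.1.4 See Also
The uc, lc, ucfirst, and lcfirst functions in perlfunc(1) and Chapter 29 of Programming Perl (the Camel book);
the \L, \U, \l, and \u string escapes in the "Quote and Quote-like Operators" section of perlop(1) and Chapter 5 of Programming Perl.
1.1.5 Test program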
#-----------------------------
use locale; # needed in 5.004 or above
$big = uc($little); # "bo peep" -> "BO PEEP"
$little = lc($big); # "JOHN" -> "john"
$big = "\U$little"; # "bo peep" -> "BO PEEP"
$little = "\L$big"; # "JOHN" -> "john"
#-----------------------------
$big = "\u$little"; # "bo" -> "Bo"
$little = "\l$big"; # "BoPeep" -> "boPeep"
#-----------------------------
use locale; # needed in 5.004 or above
$beast = "dromedary"; # capitalize various parts of $beast
$capit = ucfirst($beast); # Dromedary
$capit = "\u\L$beast"; # (same)
$capall = uc($beast); # DROMEDARY
$capall = "\U$beast"; # (same)
$caprest = lcfirst(uc($beast)); # dROMEDARY
$caprest = "\l\U$beast"; # (same)
#-----------------------------
# capitalize each word's first character, downcase the rest
$text = "thIS is a loNG liNE";
$text =~ s/(\w+)/\u\L$1/g;
print $text;
# This Is A Long Line
#-----------------------------
if (uc($a) eq uc($b)) {
    print "a and b are the same\n";
}
#-----------------------------
# download the following standalone program
#!/usr/bin/perl -p
# randcap: filter to randomly capitalize 20% of the letters
# call to srand() is unnecessary in 5.004
BEGIN { srand(time() ^ ($$ + ($$ << 15))) }
sub randcase { rand(100) < 20 ? "\u$_[0]" : "\l$_[0]" }
s/(\w)/randcase($1)/ge;
#% randcap < genesis | head -9
#boOk 01 genesis
#001:001 in the BEginning goD created the heaven and tHe earTh.
#001:002 and the earth wAS without ForM, aND void; AnD darkneSS was
#        upon The Face of the dEEp. and the spIrit of GOd movEd upOn
#        tHe face of the Waters.
#001:003 and god Said, let there be ligHt: and therE wAs LigHt.
#-----------------------------
sub randcase { rand(100) < 20 ? ("\040" ^ $_[0]) : $_[0]; }
#-----------------------------
$string &= "\177" x length($string);
#-----------------------------
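The last two snippets in the listing use Perl's bitwise operators on strings and come without explanation. The sketch below shows what each one does on small inputs; the sample values are assumptions chosen for illustration:

```perl
use strict;
use warnings;

# XOR with "\040" (0x20) toggles the case bit of an ASCII letter.
my $flipped_a = "\040" ^ "a";    # 0x61 ^ 0x20 = 0x41
my $flipped_q = "\040" ^ "Q";    # 0x51 ^ 0x20 = 0x71

print "$flipped_a $flipped_q\n";   # A q

# ANDing every byte with "\177" (0x7F) clears its high bit.
my $string = "caf\xE9";                 # final byte is 0xE9
$string &= "\177" x length($string);    # 0xE9 & 0x7F = 0x69, the letter i
print "$string\n";                      # cafi
```

The XOR trick only makes sense for ASCII letters. The \w pattern in randcap also matches digits and the underscore, and XOR turns those into unrelated characters, so the bitwise randcase is not a drop-in replacement for the "\u" and "\l" version.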