Chinaunix首页 | 论坛 | 博客
  • 博客访问: 470473
  • 博文数量: 142
  • 博客积分: 4126
  • 博客等级: 上校
  • 技术积分: 1545
  • 用 户 组: 普通用户
  • 注册时间: 2008-02-22 10:03
文章分类

全部博文(142)

文章存档

2011年(8)

2010年(7)

2009年(64)

2008年(63)

我的朋友

分类:

2009-04-24 10:37:41

1.1.1 提出问题
需要把大写字符串转成小写的,或者反之。
1.1.2 解决方案
使用lc和uc函数,或者使用\L,\U进行转义。
 

$big = uc($little); # "bo peep" -> "BO PEEP"

$little = lc($big); # "JOHN" -> "john"

$big = "\U$little"; # "bo peep" -> "BO PEEP"

$little = "\L$big"; # "JOHN"    -> "john"

要是只改变一个字符的情况,可以使用lcfirst和ucfirst函数,或者\l和\u进行转义。

 

$big = "\u$little"; # "bo" -> "Bo"

$little = "\l$big"; # "BoPeep" -> "boPeep"

1.1.3 讨论
上面提到的函数跟转义虽然看起来不一样,但是做的是同样的事情。你可以对首字符或者整个字符串转换大小写。甚至可以同时使用2种方法实现首字符大写,其他字符小写的效果。(实际上就是后面会说到的标题效果--titlecase).

 

$beast = "dromedary";
# 大写 $beast中的一些部分

$capit = ucfirst($beast); # Dromedary

$capit = "\u\L$beast"; # (same)

$capall = uc($beast); # DROMEDARY

$capall = "\U$beast"; # (same)

$caprest = lcfirst(uc($beast)); # dROMEDARY

$caprest = "\l\U$beast"; # (same)

这些转换大小写的转义一般用来调整字符串的大小写:

 

# 使每个单词首字母大写,其余字母小写

$text = "thIS is a loNG liNE";
$text =~ s/(\w+)/\u\L$1/g;
print $text;
# $>This Is A Long Line

你还可以用来进行无视大小写的字符串比较:

 

if (uc($a) eq uc($b)) { # or "\U$a" eq "\U$b"

    print "a and b are the same\n";
}

下面这个程序randcap,会有20%的概率随机把输入的字符变成小写。这个会让你看到14年以前的单词:WaREz d00Dz。

 

#!/usr/bin/perl -p

  # randcap: filter to randomly capitalize 20% of the letters

  # call to srand( ) is unnecessary as of v5.4

  BEGIN { srand(time( ) ^ ($$ + ($$<<15))) }
  sub randcase { rand(100) < 20 ? "\u$_[0]" : "\l$_[0]" }
  s/(\w)/randcase($1)/ge;

% randcap < genesis | head -9
boOk 01 genesis
001:001 in the BEginning goD created the heaven and tHe earTh.
     
001:002 and the earth wAS without ForM, aND void; AnD darkneSS was
        upon The Face of the dEEp. and the spIrit of GOd movEd upOn
        tHe face of the Waters.
001:003 and god Said, let there be ligHt: and therE wAs LigHt.

有一些语言,它们的手写系统是区分全大写(uppercase)跟首字母大写的(titlecase),在这些语言里函数ucfirst()(还有进行转义的\u)可以把字符转成首字母大写(titlecase)。比如,匈牙利语里,这样的字符序列"dz"(译注:dz在该语言里应该算是一个字符),全大写就是“DZ”, 首字母大写是“Dz”, 全小写则是"dz",对应于这三种情况Unicode有三种不同的字符:

Code point  Written   Meaning
01F1        DZ        LATIN CAPITAL LETTER DZ
01F2        Dz        LATIN CAPITAL LETTER D WITH SMALL LETTER Z
01F3        dz        LATIN SMALL LETTER DZ

使用tr[a-z][A-Z]或者其他类似的方法来进行大小写转换是愚蠢的做法,是错误的,因为非常多的语言包括英语都使用了比如diaereses,变音符号(cedillas),重音符号(accent)等标志来区分字符,而上面说到的这个做法则忽略了这些这些带标志的字符。所以要对这些带标志的数据进行大小写处理比你看到的要复杂得多,就算所有东西都是Unicode的也没有一个简单的答案,当前还不是那么糟,因为Perl的大小写转换函数对Unicode的数据可以工作得很好。可以看看这一章介绍里面的通用字符编码。

1.2.4 参考
uc, lc, ucfirst, 和 lcfirst functions 在 perlfunc(1) 跟大骆驼书 第29章可以看到;
\L, \U, \l, 和 \u 转义在perlop(1)里面的 "Quote and Quote-like Operators" 一节跟 大骆驼书第5章有。

1.2.4 测试程序

 

#-----------------------------

use locale; # needed in 5.004 or above


$big = uc($little); # "bo peep" -> "BO PEEP"

$little = lc($big); # "JOHN" -> "john"

$big = "\U$little"; # "bo peep" -> "BO PEEP"

$little = "\L$big"; # "JOHN" -> "john"

#-----------------------------

$big = "\u$little"; # "bo" -> "Bo"

$little = "\l$big"; # "BoPeep" -> "boPeep"

#-----------------------------

use locale; # needed in 5.004 or above


$beast = "dromedary";
# capitalize various parts of $beast

$capit = ucfirst($beast); # Dromedary

$capit = "\u\L$beast"; # (same)

$capall = uc($beast); # DROMEDARY

$capall = "\U$beast"; # (same)

$caprest = lcfirst(uc($beast)); # dROMEDARY

$caprest = "\l\U$beast"; # (same)

#-----------------------------

# capitalize each word's first character, downcase the rest

$text = "thIS is a loNG liNE";
$text =~ s/(\w+)/\u\L$1/g;
print $text;
This Is A Long Line
#-----------------------------

if (uc($a) eq uc($b)) {
    print "a and b are the same\n";
}
#-----------------------------

# download the following standalone program

#!/usr/bin/perl -p

# randcap: filter to randomly capitalize 20% of the letters

# call to srand() is unnecessary in 5.004

BEGIN { srand(time() ^ ($$ + ($$ << 15))) }
sub randcase { rand(100) < 20 ? "\u$_[0]" : "\l$_[0]" }
s/(\w)/randcase($1)/ge;


#% randcap < genesis | head -9

#boOk 01 genesis

#

#

#001:001 in the BEginning goD created the heaven and tHe earTh.

#

#

#

#001:002 and the earth wAS without ForM, aND void; AnD darkneSS was

#

#     upon The Face of the dEEp. and the spIrit of GOd movEd upOn

#

#     tHe face of the Waters.

#

#

#001:003 and god Said, let there be ligHt: and therE wAs LigHt.

#-----------------------------

sub randcase {
    rand(100) < 20 ? ("\040" ^ $_[0]) : $_[0];
}
#-----------------------------

$string &= "\177" x length($string);
#-----------------------------

阅读(860) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~