PERL 学习 - 第三天-penny_kan-ChinaUnix博客

不应不答不语noname.blog.chinaunix.net

首页　| 　博文目录　| 　关于我

penny_kan

博客访问： 514961
博文数量： 143
博客积分： 4072
博客等级：上校
技术积分： 1442
用户组：普通用户
注册时间： 2007-02-20 19:27

文章分类

全部博文（143）

RFC（1）
中医中药（2）

中医基础（1）

人参（1）

五加科（0）
财经（5）
行者（2）
语言学习（14）

Bash（1）

Perl学习（12）
网络服务（20）

mysql（1）

squid（0）

NFS（1）

weblogic（1）

DNS（4）

FTP（2）

Mail Server（8）

Apache（2）

Web Server（1）
操作系统（17）

linux（9）

Windows（5）

软件安装（1）

BSD（1）
CDN技术（2）
工具使用（14）
工作相关（24）
安全审计（1）
心情（31）
评论（9）
General（0）
未分配的博文（1）

文章存档

2014年（2）

2011年（4）

2010年（1）

2009年（9）

2008年（34）

2007年（93）

我的朋友

大鬼不动

最近访客

推荐博文

PERL 学习 - 第三天

分类：

2007-02-21 15:11:55

一段代码例，集成很多PERL的基本使用方法，摘自perl in 20 pages，一般的应用可以从中extract部分代码即可实现。重要是吸收代码中的知识点

功能： 从一个或多个html文件中抽出含有 ...行的内容，并且排序。

执行方法：

    % perl -w proto-getH1.pl [-o outputfile] input-file(s)
     or
    % proto-getH1.pl [-o outputfile] input-file(s)

结果：默认是 stdout,如果-o outputfile则输出至文件。

Script: 表格中

#! /usr/bin/perl -w

# Example perl file - extract H1,H2 or H3 headers from HTML files
# Run via:
# perl this-perl-script.pl [-o outputfile] input-file(s)
# E.g.
# perl proto-getH1.pl -o headers *.html
# perl proto-getH1.pl -o output.txt homepage.htm
#
# Russell Quong 2/19/98

require 5.003; # need this version of Perl or newer
use English; # use English names, not cryptic ones
use FileHandle; # use FileHandles instead of open(),close()
use Carp; # get standard error / warning messages
use strict; # force disciplined use of variables

## define some variables.
my($author) = "Russell W. Quong";
my($version) = "Version 1.0";
my($reldate) = "Jan 1998";

my($lineno) = 0; # variable, current line number
my($OUT) = \*STDOUT; # default output file stream, stdout
my(@headerArr) = (); # array of HTML headers

# print out a non-crucial for-your-information messages.
# By making fyi() a function, we enable/disable debugging messages easily.
sub fyi ($) {

my($str) = @_;
print "$str\n";
}

sub main () {
fyi("perl script = $PROGRAM_NAME, $version, $author, $reldate.");
handle_flags();
# handle remaining command line args, namely the input files
if (@ARGV == 0) { # @ARGV used in scalar context = number of
handle_file('-');
} else {
my($i);
foreach $i (@ARGV) {
handle_file($i);
}
}
postProcess(); # additional processing after reading input
}

# handle all the arguments, in the @ARGV array.
# we assume flags begin with a '-' (dash or minus sign).
#
sub handle_flags () {
my($a, $oname) = (undef, undef);
foreach $a (@ARGV) {
if ($a =~ /^-o/) {
shift @ARGV; # discard ARGV[0] = the -o flag
$oname = $ARGV[0]; # get arg after -o
shift @ARGV; # discard ARGV[0] = output file name
$OUT = new FileHandle "> $oname";
if (! defined($OUT) ) {
croak "Unable to open output file: $oname. Bye-bye.";
exit(1);
}
} else {
last; # break out of this loop
}
}
}

# handle_file (FILENAME);
# open a file handle or input stream for the file named FILENAME.
# if FILENAME == '-' use stdin instead.
sub handle_file ($) {
my($infile) = @_;
fyi(" handle_file($infile)");
if ($infile eq "-") {
read_file(\*STDIN, "[stdin]"); # \*STDIN=input stream for STDIN.
} else {
my($IN) = new FileHandle "$infile";
if (! defined($IN)) {
fyi("Can't open spec file $infile: $!\n");
return;
}
read_file($IN, "$infile"); # $IN = file handle for $infile
$IN->close(); # done, close the file.
}
}

# read_file (INPUT_STREAM, filename);
#
sub read_file ($$) {
my($IN, $filename) = @_;
my($line, $from) = ("", "");
$lineno = 0; # reset line number for this file
while ( defined($line = <$IN>) ) {
$lineno++;
chomp($line); # strip off trailing '\n' (newline)
do_line($line, $lineno, $filename);
}
}

# do_line(line of text data, line number, filename);

# process a line of text.
sub do_line ($$$) {
my($line, $lineno, $filename) = @_;
my($heading, $htype) = undef;
# search for a .... line, save the .... in $header.
# where Hx = H1, H2 or H3.
if ( $line =~ m:()(.*):i ) {
$htype = $1; # either H1, H2, or H3
$heading = $2; # text matched in the parethesis in the regex
fyi("FYI: $filename, $lineno: Found ($heading)");
print $OUT "$filename, $lineno: $heading\n";

# we'll also save the all the headers in an array, headerArr
push(@headerArr, "$heading ($filename, $lineno)");
}
}

# print out headers sorted alphabetically
#
sub postProcess() {
my(@sorted) = sort { $a cmp $b } @headerArr; # example using sort
print $OUT "\n--- SORTED HEADERS ---\n";
my($h);
foreach $h (@sorted) {
print $OUT "$h\n";
}
my $now = localtime();
print $OUT "\nGenerated $now.\n"

}
# start executing at main()
#
main();
0; # return 0 (no error from this script)

阅读(1316) | 评论(0) | 转发(0) |

上一篇：通过pam.d限制ssh登录用户

下一篇：行者--西湖七月

给主人留下些什么吧！~~

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6