2006-04-23 14:44:29

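Back when I was writing my lyrics-display program, I was frustrated that no LRC lyrics site offered direct lyric search, and some lyric libraries were too small. A while ago I happened to notice that Baidu now supports online search for LRC lyrics, which was a pleasant surprise.
In my spare time I wrote a simple search-and-download script in Perl. The script already supports both searching and downloading.
System requirements:
You need Perl itself, plus the LWP module; the script also loads the URI and HTML::TreeBuilder modules. Installation differs from system to system, so please sort that out yourself.
Usage: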
xiaosuo@gentux perl $ ./baidulrc -h
 Usage: baidulrc [options] MusicName.
        -l              get the lyrics list.
        -p num          set the page to n, default 1.
        -n num          set the number to n.
        -d              download the special lyrics.
        -x              output the content in XML format.
        -h              show this help page.
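Both -p and -n are 1-based on the command line, while the script works with zero-based values internally. The following is a minimal sketch of that mapping, mirroring the script's option handling and assuming 10 results per page (the 'rn' => '10' parameter). The helper names page_offset and result_index are hypothetical and do not appear in the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: convert a 1-based page number (-p) into the
# zero-based result offset sent as the 'pn' query parameter.
# Assumes 10 results per page, matching 'rn' => '10' in the script.
sub page_offset {
    my ($page) = @_;
    return 0 unless defined $page && $page > 1;
    return ($page - 1) * 10;
}

# Hypothetical helper: convert a 1-based result number (-n) into a
# zero-based array index, or return undef if it is outside 1..10.
sub result_index {
    my ($n) = @_;
    return undef unless defined $n && $n =~ /^\d+$/ && $n >= 1 && $n <= 10;
    return $n - 1;
}

print page_offset(3), "\n";    # prints 20
print result_index(1), "\n";   # prints 0
```

So "-p 3 -n 1" would select the first entry of the third result page, that is, overall result 21.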
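Other notes:
  1. Baidu's lyrics search matches lyric content as well as song titles, so it may return many results that are not what you want. Sometimes that is an advantage, for example when you remember only one line of a song but not its title.
  2. The script can output its results as XML, and lyrics are written to standard output by default, so it is easy to combine with other programs. When I have time I will integrate it into my XLyrics.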
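Because the lyrics go to standard output, another program can read and process them directly. The following is a minimal sketch of such a consumer, assuming time tags of the form [mm:ss] or [mm:ss.xx]. The helper parse_lrc is hypothetical and not part of baidulrc, and the sample text is made up for illustration. Unlike the script's own line filter, it skips metadata tags such as [ti:...].

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn LRC text into a list of [seconds, lyric] pairs.
# Assumes time tags of the form [mm:ss] or [mm:ss.xx]; metadata tags such
# as [ti:...] or [ar:...] do not match and are skipped.
sub parse_lrc {
    my ($text) = @_;
    my @entries;
    foreach my $line (split /\r?\n/, $text) {
        # A line may carry several time tags followed by the lyric text.
        next unless $line =~ /^((?:\[\d+:\d+(?:\.\d+)?\])+)(.*)$/;
        my ($tags, $lyric) = ($1, $2);
        while ($tags =~ /\[(\d+):(\d+(?:\.\d+)?)\]/g) {
            push @entries, [ $1 * 60 + $2, $lyric ];
        }
    }
    # Sort by timestamp so repeated lines fall into playback order.
    my @sorted = sort { $a->[0] <=> $b->[0] } @entries;
    return @sorted;
}

# Sample input, made up for illustration.
my $sample = "[ti:Example]\n[00:12.50]first line\n[00:05.00][01:00.00]chorus\n";
printf "%6.2f  %s\n", @$_ for parse_lrc($sample);
```

In real use the text would presumably come from the script itself, for example by capturing the output of "./baidulrc -d -n 1 MusicName" with backticks.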
Source code:
#!/usr/bin/perl
#
# Copyright (C) xiaosuo
# License GPL2 or above
#
use strict;
use warnings;
use URI;
use HTML::TreeBuilder;
require LWP::UserAgent;
use Getopt::Std;

# Return the song titles found in the parsed result page.
sub get_musics
{
        my $tree = $_[0] || die "No tree";
        my @musics;

        foreach my $div (
                $tree->look_down('_tag', 'div',
                        sub{
                                return 0 unless $_[0]->attr('class');
                                return 0 unless($_[0]->attr('class') =~ /BlueBG/);
                                return 1;
                        })){
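                # "歌曲:" is the "Song:" label on the result page.
                my ($title) = ($div->as_text =~ /歌曲:(.*)/);
                # Push an empty string on a miss so the lists stay aligned.
                push @musics, defined $title ? $title : '';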
        }

        return @musics;
}

# Return the artist names found in the parsed result page.
sub get_actors
{
        my $tree = $_[0] || die "No tree";
        my @actors;

        foreach my $div (
                $tree->look_down('_tag', 'div',
                        sub{
                                return 0 unless $_[0]->attr('style');
                                return 0 unless $_[0]->attr('style') =~ /padding-top:10px;padding-left:15px/;
                                return 1;
                        })){
                # "歌手:" is the "Artist:" label and "专辑:" is the "Album:" label;
                # only the artist name is kept.
                if($div->as_text =~ /歌手:(.*) 专辑:(.*)/){
                        push @actors, $1;
                }elsif($div->as_text =~ /歌手:(.*)/){
                        push @actors, $1;
                }else{
                        push @actors, $div->as_text;
                }
        }

        return @actors;
}

# Return the LRC download link for each result, or '' if it has none.
sub get_links
{
        my $tree = $_[0] || die "No tree";
        my @links;

        foreach my $div (
                $tree->look_down('_tag', 'div',
                        sub{
                                return 0 unless $_[0]->attr('class');
                                return 0 unless $_[0]->attr('class') =~ /unnamed3/;
                                return 1;
                        })){
                if(my ($a) = ($div->look_down('_tag', 'a',
                                sub{
                                        return 0 unless $_[0]->as_text;
                                        # "LRC歌词" is the "LRC lyrics" link text.
                                        return 0 unless $_[0]->as_text =~ /LRC歌词/;
                                        return 1;
                                }))){
                        push @links, $a->attr('href');
                }else{
                        push @links, '';
                }
        }

        return @links;
}

# Return the highest page number shown in the pager, defaulting to 1.
sub get_page_num
{
        my $tree = $_[0] || die "No tree";
        my $num = 1;

        if(my ($div) = (
                $tree->look_down('_tag', 'div',
                        sub{
                                return 0 unless $_[0]->attr('class');
                                return 0 unless $_[0]->attr('class') =~ /pg/;
                                return 1;
                        }))){
                foreach my $a ($div->look_down('_tag', 'a',
                                sub{
                                        return 0 unless $_[0]->as_text;
                                        return 1;
                                })){
                        if($a->as_text =~ /\[(.*)\]/){
                                $num = int($1);
                        }
                }
        }

        return $num;
}

# Print the usage message.
sub usage
{
        print " Usage: baidulrc [options] MusicName.\n";
        print "         -l              get the lyrics list.\n";
        print "         -p num          set the page to n, default 1.\n";
        print "         -n num          set the number to n.\n";
        print "         -d              download the special lyrics.\n";
        print "         -x              output the content in XML format.\n";
        print "         -h              show this help page.\n";
}

# parse the options.
my %opts;
if(!getopts("lp:n:dxh", \%opts)){
        usage();
        exit(1);
}
if($opts{"h"}){
        usage();
        exit(0);
}
if($#ARGV != 0){
        usage();
        exit(1);
}
my $music_name = $ARGV[0];
my $page_number = 0;
my $music_number = -1;
if($opts{"p"} && $opts{"p"} > 1){
        $page_number = ($opts{"p"} - 1) * 10;
}
if($opts{"n"}){
        if($opts{"n"} < 1 || $opts{"n"} > 10){
                print "The lyrics number must be between 1 and 10.\n";
                exit(1);
        }
        $music_number = $opts{"n"} - 1;
}

# get the lyrics list
my $uri = URI->new('');
$uri->query_form(
#       'f' => 'ms',
        'rn' => '10',
        'tn' => 'baidump3lyric',
        'ct' => '150994944',
        'word' => $music_name,
        'lm' => '-1',
        'z' => '0',
        'cl' => '3',
        'sn' => '',
        'cm' => '1',
        'sc' => '1',
        'bu' => '',
        'pn' => "$page_number"
);
my $ua = LWP::UserAgent->new;
# Method calls are not interpolated inside strings, so build the message by concatenation.
my $response = $ua->get($uri->as_string) || die("get " . $uri->as_string . " failed.\n");
my $tree = HTML::TreeBuilder->new;
if($response->is_success){
        $tree->parse_content($response->content) || die("parse failed.\n");
}else{
        die $response->status_line;
}

my @links = get_links $tree;
if($opts{"d"}){ # download the lyrics
        if($music_number < 0){
                print "You must give the music number.\n";
                usage();
                exit(1);
        }
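        # The entry is undefined when the page has fewer results than requested.
        if(!defined $links[$music_number] || $links[$music_number] eq ""){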
                print "There is no LRC lyrics for this music, try the others.\n";
                exit(1);
        }
        my $fua = LWP::UserAgent->new;
        my $fres = $fua->get($links[$music_number]) || die("get $links[$music_number] failed.\n");
        if(!$fres->is_success){
                print "get $links[$music_number] failed.\n";
                exit(1);
        }
        if($opts{"x"}){
                print "\n";
                print "\n";
        }
        foreach my $line (split("\n", $fres->content)){
                if($line =~ /^\[.*:.*\].*$/){
                        print $line . "\n";
                }
        }
        if($opts{"x"}){
                print "\n";
        }
        exit 0;
}

my @musics = get_musics $tree;
my @actors = get_actors $tree;
my $num = get_page_num $tree;
# output the list
if($opts{"x"}){
        print "\n";
        print "\n";
        for(my $i = 0; $i <= $#musics; $i ++){
                print "\n";
                print "$musics[$i]\n";
                print "$actors[$i]\n";
                print "$links[$i]\n";
                print "\n";
        }
        print "$num\n";
        print "\n";
}else{
        for(my $i = 0; $i <= $#musics; $i ++){
                print(($i + 1) . "\t $musics[$i] - $actors[$i]\n");
        }
        print "Total page number: $num.\n";
}

# free the memory and exit
$tree->delete;
exit(0);
