一直想找一个能够完成FTP Synchronizer差不多功能的的工具,花费了很多时间找到了一个perl脚本,原始版本,详细注释 ,升级版本,
首先感谢那三位作者的贡献,但是我发现该脚本仍然有几个问题:
1。ftp固有的不能设置修改时间的问题。
2。不能报告同步冲突
3。同名且大小相等并不能保证两个文件一致。
所以我对代码进行了修改,分享给大家,也希望大家帮我检查一下里面可能存在的问题(我已经进行过测试,但难保没问题。),让这个程序能够完全代替ftp synchronizer.
用法:
-h 帮助
-v verbose模式
-d 输出ftp调试信息
-k 输出同步动作,并不执行。强烈建议同步前做一次,以免误伤。
-P 设置ftp passive mode。防火墙后的机器需要。
-i 不同步的文件
-s ftp服务器地址
-u ftp用户名
-p ftp密码
-r ftp同步目录
-l 本地同步目录
-o 时差偏移量,单位s,本地时区-ftp时区。第一次同步时需要,之后不用。
如下:
#!/usr/bin/perl
# This script is (c) 2002 Luis E. Muñoz, All Rights Reserved
# (c) 2005 Peter Orvos, All Rights Reserved
# (c) 2006 Edwin Zuidema, All Rights Reserved
# (c) 2008 Plato, All Rights Reserved
# This code can be used under the same terms as Perl itself. It comes
# with absolutely NO WARRANTY. Use at your own risk.
#
# TO BE DONE
# - mtime: if from L->R, R has current mtime. Then next round R will go L(newer)
# And then L-R and so on. How to solve? Remote mtime? update local time?
#
use strict;
use warnings;
use Net::FTP;
use File::Find;
use File::Listing; # Try EZ
use Pod::Usage;
use Getopt::Std;
use POSIX 'strftime';
#Plato Wu,2008/09/06
use Storable;
use Digest::MD5;
use vars qw($opt_s $opt_k $opt_u $opt_l $opt_p $opt_r $opt_h $opt_v
$opt_d $opt_P $opt_i $opt_o);
getopts('i:o:l:s:u:p:r:hkvdP');
if ($opt_h)
{
pod2usage({-exitval => 2,
-verbose => 2});
}
# Defaults are set here
$opt_s ||= 'localhost';
$opt_u ||= 'anonymous';
$opt_p ||= 'someuser@';
$opt_r ||= '/';
$opt_l ||= '.';
$opt_o ||= 0;
$opt_i = qr/$opt_i/ if $opt_i;
$|++; # Autoflush STDIN
my %rem = ();
my %loc = ();
my $last_file = ".last";
print "Using time offset of $opt_o seconds\n" if $opt_v and $opt_o;
# Phase 0: Scan local path and see what we
# have
print "\n### Phase 0: Scanning local ###\n";
print "dir: $opt_l\n";
chdir $opt_l or die "Cannot change dir to $opt_l: $!\n";
# First get date/time of last sync
my $last = ((stat($last_file))[9] || 0);
my $mdtm_form = strftime("%c",localtime($last));
print "Last time synced: $mdtm_form\n";
find(
{
no_chdir => 1,
follow => 0, # No symlinks, please
wanted => sub
{
return if $File::Find::name eq '.';
$File::Find::name =~ s!^\./!!;
if (($opt_i and $File::Find::name =~ m/$opt_i/) ||
($File::Find::name =~ m/$last_file/))
{
print "local: IGNORING $File::Find::name\n" if $opt_d;
return;
}
stat($File::Find::name);
my $type = -f _ ? 'f' : -d _ ? 'd' : -l $File::Find::name ? 'l' : '?';
my @dirs = split /\//, $File::Find::name;
open(F, $File::Find::name) or die "open error";
binmode(F);
my $r = $loc{$File::Find::name} =
{
md5 => Digest::MD5->new->addfile(*F)->hexdigest,
mdtm => (stat(_))[9],
size => (stat(_))[7],
type => $type,
};
close F;
my $mdtm_form = strftime("%c",localtime($r->{mdtm}));
print "local: adding $File::Find::name (",
"$r->{mdtm}, $mdtm_form, $r->{size}, $r->{type})\n" if $opt_d;
},
}, '.' );
# Phase 1: Build a representation of what's
# in the remote site
print "\n### Phase 1: Scanning FTP ###\n";
my $ftp = new Net::FTP ($opt_s,
Timeout => 999,
Debug => $opt_d,
Passive => $opt_P,
);
die "Failed to connect to server '$opt_s': $!\n" unless $ftp;
die "Failed to login as $opt_u\n" unless $ftp->login($opt_u, $opt_p);
die "Cannot change directory to $opt_r\n" unless $ftp->cwd($opt_r);
warn "Failed to set binary mode\n" unless $ftp->binary();
my $needhome = 0;
print "connected\n" if $opt_v;
sub scan_ftp
{
my $ftp = shift;
my $path = shift;
my $rrem = shift;
print "scan_ftp $ftp, path $path, rrem $rrem\n";
# my $rdir = length($path) ? $ftp->dir($path) : $ftp->dir();
# parse_dir of File:Listing better parses mtime for directories
# my $rdir = length($path) ? parse_dir($ftp->dir($path)) :
parse_dir($ftp->dir());
my $rdir;
my @r2dir;
# $path =~ s/\s/\\ /g;
# $path = "\"$path\"";
# print "scan_ftp $ftp, path $path, rrem $rrem\n";
if (length($path)) {
# Already in a path
$ftp->cwd("$opt_r/$path");
# Plato Wu,2008/09/08
# it enter sub directory, then set a flag to use in the future.
$needhome = 1;
} else {
print "first call\n";
$ftp->cwd("$opt_r");
}
$rdir = parse_dir($ftp->dir());
return unless $rdir and @$rdir;
# print "Going through the files in this dir ($path)\n";
for my $f (@$rdir)
{
# print "a file found in this dir ($path)\n";
next if $f =~ m/^d.+\s\.\.?$/;
# my @line = split(/\s+/, $f, 9);
# my $n = (@line == 4) ? $line[3] : $line[8]; # Compatibility with windows FTP
# next unless defined $n;
# print "parsing entry (in dir $path)\n";
my ($n, $type, $size, $mtime, $mode) = @$f;
my $name = '';
$name = $path . '/' if $path;
$name .= $n;
if ($opt_i and $name =~ m/$opt_i/)
{
print "remote: IGNORING $name\n" if $opt_d;
next;
}
# print "name '$name'\n" if $opt_v;
next if exists $rrem->{$name};
my $mdtm = ($mtime || 0) + $opt_o;
$size = $size || 0;
# my $mdtm = ($ftp->mdtm($name) || 0) + $opt_o;
# my $size = $ftp->size($name) || 0;
# my $type = (@line == 4) ? ($line[2] =~/\/i ? 'd' : 'f') : substr($f, 0, 1); # Compatibility with windows FTP
$type =~ s/-/f/;
my $mdtm_form = strftime("%c",localtime($mdtm));
if ($type eq 'd') {
print "remote: recursing in dir $name: calling scan_ftp($ftp,
$name, $rrem)\n" if $opt_v;
scan_ftp($ftp, $name, $rrem);
}
# } else {
print "remote: adding file $name (offset mtime $mdtm_form)\n" if
$opt_v;
$rrem->{$name} =
{
mdtm => $mdtm,
size => $size,
type => $type,
}
# }
}
}
# Plato Wu,2008/09/06
if ($ftp->get($last_file, $last_file."remote")){
# it seems no using
# To do use parse_dir instead of mdtm for some ftp does not support it.
# utime $ftp->mdtm($last_file), $ftp->mdtm($last_file), $last_file."remote";
my $hash_ref = retrieve $last_file."remote";
%rem = %$hash_ref;
# unlink $last_file."remote";
# exit;
}else{
scan_ftp($ftp, '', \%rem);
}
$ftp->cwd($opt_r) if $needhome;
#
# Phase 2: Handle missing files
#
print "\n### Phase 2: Missing files ###\n";
# Algorithm
# If file is older than last sync delete it
# If file is newer than last sync sync it
# For local files:
for my $ml (sort { length($a) <=> length($b) } keys %loc)
{
if ($loc{$ml}->{type} eq 'l')
{
warn "Symbolic link $ml not supported\n";
next;
}
# Skip if file/dir exists also remotely (will be handled in phase 3)
next if exists $rem{$ml};
# File/dir exists locally but not remotely
print "$ml file/dir missing from the FTP repository\n" if $opt_v;
# Check if newer than last sync
print "mdtm $loc{$ml}->{mdtm} last $last\n" if $opt_v;
if ($loc{$ml}->{mdtm} > $last) {
# Newer, so copy to remote
if ($loc{$ml}->{type} eq 'd')
{
print "$ml dir missing remotely, making remotely\n" if $opt_v;
$opt_k ? print "Kidding: MKDIR $ml\n" : $ftp->mkdir($ml)
or die "Failed to MKDIR $ml\n";
}
else # Regular file
{
print "$ml file missing remotely, PUTting\n" if $opt_v;
$opt_k ? print "Kidding: PUT $ml $ml\n" : $ftp->put($ml, $ml)
or print "*** Failed to PUT $ml ***\n";
}
} else {
# Local file older than last sync, so deleted from remote. Also delete locally
if ($loc{$ml}->{type} eq 'd') {
print "$ml dir removed remotely, removing locally\n" if $opt_v;
$opt_k ? print "Kidding: rmdir $ml\n" : rmdir($ml)
or print "*** Failed to rmdir dir $ml ***\n";
} else {
print "$ml file removed remotely, removing locally\n" if $opt_v;
$opt_k ? print "Kidding: rm $ml\n" : unlink($ml)
or print "*** Failed to rm $ml ***\n";
}
# Plato Wu,2008/09/07
# maintain %loc
delete $loc{$ml};
}
}
# For remote files:
for my $mr (sort { length($a) <=> length($b) } keys %rem)
{
if ($rem{$mr}->{type} eq 'l')
{
warn "Symbolic link $mr not supported\n";
next;
}
# Skip if file/dir exists also locally (will be handled in phase 3)
next if exists $loc{$mr};
print "$mr file/dir missing locally\n" if $opt_v;
# Check if newer than last sync
print "mdtm $rem{$mr}->{mdtm} last $last\n" if $opt_v;
if ($rem{$mr}->{mdtm} > $last) {
# Plato Wu,2008/09/07
# maintain %loc
$loc{$mr} = $rem{$mr};
# Newer, so copy to local
if ($rem{$mr}->{type} eq 'd') {
print "$mr dir missing in the local repository, making locally\n"
if $opt_v;
$opt_k ? print "Kidding: mkdir $mr\n" : mkdir($mr)
or print "*** Failed to MKDIR $mr ***\n";
} else {
print "$mr file missing in the local repository, GETting\n" if
$opt_v;
$opt_k ? print "Kidding: GET $mr $mr\n" : $ftp->get($mr, $mr)
or print "*** Failed to GET $mr ***\n";
}
# Added EZ: Set the file time to the mdtm
my $mdtm_form = strftime("%c",localtime($rem{$mr}->{mdtm}));
print "Setting mtime $mdtm_form to local $mr\n" if $opt_v;
$opt_k ? print "Kidding: Set Utime\n" : utime $rem{$mr}->{mdtm},
$rem{$mr}->{mdtm}, $mr;
} else {
# Remote file older than last sync, so deleted locally
# Also delete remotely
if ($rem{$mr}->{type} eq 'd') {
print "$mr dir deleted locally, removing remotely\n" if $opt_v;
$opt_k ? print "Kidding: ftp->rmdir $mr\n" : $ftp->rmdir($mr)
or print "*** Failed to remote rmdir $mr ***\n";
} else {
print "$mr file deleted locally, removing remotely\n" if $opt_v;
$opt_k ? print "Kidding: ftp->delete $mr\n" : $ftp->delete($mr)
or print "*** Failed to remote delete $mr ***\n";
}
}
}
#
# Phase 3: For files that exist on both sides
#
print "\n### Phase 3: Files on both sides ###\n";
# For remote files: Download if newer
for my $dl (sort { length($a) <=> length($b) } keys %rem)
{
# only handle files that exist on both sides
next if not exists $loc{$dl};
warn "Symbolic link $dl not supported\n"
if $rem{$dl}->{type} eq 'l';
# forget dirs?
if ($rem{$dl}->{type} eq 'f')
{
# Plato Wu,2008/09/07
# remarks for handle exactly problem in the other place
# Skip if exactly the same size
# next if $rem{$dl}->{size} eq $loc{$dl}->{size};
# Skip if remote older (local newer)
next if $rem{$dl}->{mdtm} <= $loc{$dl}->{mdtm};
# # If remote smaller, remove remote and PUT
# if ($rem{$dl}->{size} < $loc{$dl}->{size})
# {
# print "$dl file smaller in the remote repository ";
# print "(local: $loc{$dl}->{size} remote: $rem{$dl}->{size})\n";
# print "DELETEing\n";
# $opt_k ? print "Kidding: ftp->delete $dl\n" : $ftp->delete($dl)
# or print "*** Failed to remote delete $dl ***\n";
# print "PUTting\n";
# $opt_k ? print "Kidding: PUT $dl $dl\n" : $ftp->put($dl, $dl)
# or print "*** Failed to PUT $dl ***\n";
# } else {
# GET if file local older
my $mdtm_form_loc = strftime("%c",localtime($loc{$dl}->{mdtm}));
my $mdtm_form_rem = strftime("%c",localtime($rem{$dl}->{mdtm}));
# Plato Wu,2008/09/07
# next if exactly the same size and md5 checksum
if (($rem{$dl}->{size} eq $loc{$dl}->{size}) && ($rem{$dl}->{md5} eq
$loc{$dl}->{md5})){
if($rem{$dl}->{mdtm} > $loc{$dl}->{mdtm}){
print "Setting mtime $mdtm_form_rem to local $dl\n" if $opt_v;
$opt_k ? print "Kidding: Set Utime\n" : utime
$rem{$dl}->{mdtm}, $rem{$dl}->{mdtm}, $dl;
}
next;
}
# Plato, 08/09/06
# if remote > local >= last sync, it mean there is a conflict after last sync
# use = for cautious
if ($loc{$dl}->{mdtm} >= $last) {
print "there is a newer file $dl in local and cause a conflict,
please handle it\n";
print $mdtm_form_loc, ",", $mdtm_form_rem, ",", $mdtm_form, "\n";
next;
}
if ($opt_v)
{
print "$dl file older in the local repository ";
print "(local: $loc{$dl}->{mdtm} $mdtm_form_loc remote:
$rem{$dl}->{mdtm}) $mdtm_form_rem\n";
print "GETting\n"
}
$opt_k ? print "Kidding: GET $dl $dl\n" : $ftp->get($dl, $dl)
or print "*** Failed to GET $dl ***\n";
# Added EZ: Set the file time to the mdtm
print "Setting mtime $mdtm_form_rem to local $dl\n" if $opt_v;
$opt_k ? print "Kidding: Set Utime\n" : utime $rem{$dl}->{mdtm},
$rem{$dl}->{mdtm}, $dl;
# Plato Wu,2008/09/07
# maintain %loc for put it in the future.
$loc{$dl} = $rem{$dl};
}
# }
}
# For local files: Upload if newer
for my $ul (sort { length($a) <=> length($b) } keys %loc)
{
# only handle files that exist on both sides
next if not exists $rem{$ul};
warn "Symbolic link $ul not supported\n"
if $loc{$ul}->{type} eq 'l';
if ($loc{$ul}->{type} eq 'f')
{
# Skip if local older (remote newer)
# fix with 100s for rounding errors
# Plato Wu,2008/09/08
# now it does not need fix rounding error for it use actual modification time
# not ftp put time.
# next if ($rem{$ul}->{mdtm} + 100) >= $loc{$ul}->{mdtm};
next if $rem{$ul}->{mdtm} >= $loc{$ul}->{mdtm};
# Plato Wu,2008/09/07
# next if exactly the same size and md5 checksum
next if ($rem{$ul}->{size} eq $loc{$ul}->{size}) && ($rem{$ul}->{md5}
eq $loc{$ul}->{md5}) ;
# PUT if file remote older
my $mdtm_form_loc = strftime("%c",localtime($loc{$ul}->{mdtm}));
my $mdtm_form_rem = strftime("%c",localtime($rem{$ul}->{mdtm}));
# Plato, 08/09/06
# if local > remote >= last sync, it mean there is a conflict after last sync
# use = for cautious
if ($rem{$ul}->{mdtm} >= $last) {
print "there is a newer file $ul in remote and cause a conflict,
please handle it\n";
print $mdtm_form_loc, ",", $mdtm_form_rem, ",", $mdtm_form, "\n";
next;
}
if ($opt_v)
{
print "$ul file older in the FTP repository ";
print "(local: $loc{$ul}->{mdtm} $mdtm_form_loc remote:
$rem{$ul}->{mdtm}) $mdtm_form_rem\n";
print "PUTting\n"
}
$opt_k ? print "Kidding: PUT $ul $ul\n" : $ftp->put($ul, $ul)
or print "*** Failed to PUT $ul ***\n";
}
}
# Update last sync time
my $now = time;
# Plato, 08/09/05, if file does not exist, utime can not create it, so add a open & close sentence
# $opt_k ? print "Kidding: TOUCH $last_file\n" : utime $now, $now, $last_file or (open F, ">$last_file") && (close F);
# Plato Wu,2008/09/06
# save local file information
if($opt_k){
print "Kidding: Store sync file\n"
}else{
open F, ">$last_file"; #or print "open error";
store \%loc, $last_file; # or print "write error";
close F;
}
# open F, ">$last_file"; #or print "open error";
# $opt_k ? print "Kidding: Store sync file\n" : store \%loc, $last_file;
# # or print "write error";
# close F;
# Plato Wu,2008/09/07
$opt_k ? print "Kidding: PUT $last_file\n" : $ftp->put($last_file, $last_file);
print "### Done ###\n";
__END__
=pod
=head1 NAME
ftpsync - Sync a hierarchy of local files with a remote FTP repository
=head1 SYNOPSIS
ftpsync [-h] [-v] [-d] [-k] [-P] [-s server] [-u username] [-p password] [-r
remote] [-l local] [-i ignore] [-o offset]
=head1 ARGUMENTS
The recognized flags are described below:
=over 2
=item B<-h>
Produce this documentation.
=item B<-v>
Produce verbose messages while running.
=item B<-d>
Put the C object in debug mode and also emit some debugging
information about what's being done.
=item B<-k>
Just kidding. Only announce what would be done but make no change in
neither local nor remote files.
=item B<-P>
Set passive mode.
=item B<-i ignore>
Specifies a regexp. Files matching this regexp will be left alone.
=item B<-s server>
Specify the FTP server to use. Defaults to C.
=item B<-u username>
Specify the username. Defaults to 'anonymous'.
=item B<-p password>
Password used for connection. Defaults to an anonymous pseudo-email
address.
=item B<-r remote>
Specifies the remote directory to match against the local directory.
=item B<-l local>
Specifies the local directory to match against the remote directory.
=item B<-o offset>
Allows the specification of a time offset between the FTP server and
the local host. This makes it easier to correct time skew or
differences in time zones.
=back
=head1 DESCRIPTION
This is an example script that should be usable as is for simple
website maintenance. It synchronizes a hierarchy of local files /
directories with a subtree of an FTP server.
The synchronyzation is quite simplistic. It was written to explain how
to C