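Strip the leading number from each line of g1, add a prefix, drop the pages ending in .xml, and then download everything to the local machine: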
>cat g1 | less
2 /wii-memory-card-c-985_1523.html
6 /wii-controllers-guns-c-985_1525.html
2 /wii-cases-skins-c-985_1467.html
5 /wii-bags-c-985_1524.html
2 /wii-accessories-c-985_1458.html
2 /wifi-80211bg-bluetooth-wireless-pci-bit-card-p-15614.html
3 /white-womens-terry-spa-body-wrap-towel-shower-bath-robe-p-26644.html
2 /white-usb-docking-charger-for-ipod-shuffle-2nd-generation-p-24575.html
2 /white-usb-charging-cable-for-ipod-nano-touch-iphone-p-33068.html
2 /white-stereo-35mm-earphones-headphones-special-for-apple-iphone-p-17759.html
>cat g5 | less
wget
wget
wget
wget
wget
wget
wget
wget
wget
wget
The processing steps are as follows:
cat * > g1
cut -c 9- g1 > g2
sort g2 > g3
uniq g3 > g4
vim g4 -- :1,$ s#^/#wget
cat g4 | grep -v '\.xml$' > g5
chmod +x g5
./g5
%3E%20%20%20%20%20sourcingmap%20%20%20%2
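URLs like the one above turned up during the download. They cannot be fetched, and they interrupt the run so the remaining downloads do not continue. Filtering them out as follows solves the problem: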
cat g5 |grep -v '%' > g6
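The steps above can also be sketched as one non-interactive pipeline. This is only a rough equivalent, not the exact commands used in this post: BASE is a placeholder because the real site prefix is not shown here, build_script is a hypothetical helper name, sed stands in for both the cut -c 9- step and the interactive vim substitution, and sort -u replaces the separate sort and uniq steps.

```shell
#!/bin/sh
# Placeholder prefix (assumption): the real site URL is not shown in the post.
BASE="http://example.com"

# build_script FILE
# Reads lines of the form "<count> <path>" and prints one wget command
# per unique path, skipping .xml pages and percent-encoded URLs.
build_script() {
    sed 's/^ *[0-9][0-9]* *//' "$1" |   # drop the leading count
        sort -u |                       # sort and remove duplicates
        grep -v '\.xml$' |              # skip pages ending in .xml
        grep -v '%' |                   # skip URLs containing '%'
        sed "s#^/#wget $BASE/#"         # prepend the download command
}
```

With that in place, something like build_script g1 > g6 followed by sh g6 should reproduce the flow above without opening an editor, provided BASE is set to the real site address.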