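!!!! The "adult content is the motivation for learning" series !!!!
To download an adult manga I spent quite a while tinkering, heh heh... I wasn't very familiar with the re module before, but now I am...
The most troublesome part was actually [0-9]{n,m}. In sed the braces have to be escaped, so it must be written as [0-9]\{n,m\}. I wrote it that way in re as well, and for a long time it wouldn't match and I couldn't figure out what was wrong!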
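The brace confusion described above can be reproduced in a few lines. This is a minimal sketch in Python 3; the sample strings are made up for illustration. In Python's re module, unescaped braces form a repetition count, while escaped braces match literal `{` and `}` characters, which is the opposite of sed's default (basic) regex syntax.

```python
import re

text = "port 8080"

# Unescaped braces are a bounded quantifier in Python's re:
# one to five digits in a row.
quantified = re.findall(r"[0-9]{1,5}", text)

# Escaped braces are literal characters, so this pattern looks for
# a single digit followed by the literal text "{1,5}".
literal = re.findall(r"[0-9]\{1,5\}", text)
literal_hit = re.findall(r"[0-9]\{1,5\}", "9{1,5}")

print(quantified)   # ['8080']
print(literal)      # []
print(literal_hit)  # ['9{1,5}']
```

So a pattern copied from a sed command needs its `\{` and `\}` turned back into plain braces before it is handed to re.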
#!/usr/bin/python
# -*- coding: UTF-8 -*-
import re, urllib, urllib2
import os, sys, time
#import binascii

# URL of the first page to crawl (fill in before running)
url_link = ''

def downJPG(link, num):
    # Save the image at `link` as /root/1/<num>.jpg
    jpg_file = '/root/1/%d.jpg' % num
    print "print download jpg url %s to file %s" % (link, jpg_file)
    data = urllib.urlretrieve(link, jpg_file)
    print "download ok!"
    # print len(data)
    # f = file(jpg_file,'wb')
    # f.write(data)
    # f.close()

def findLink(link, num):
    # Stop after 18 pages
    if num > 18: sys.exit(0)
    response = urllib2.urlopen(link)
    text = response.read()
    response.close()
    # Image URL: http://<ip>:<port>/h/.../keystamp=<digits>-.../<name>.jpg
    down_link = re.compile(r'http:\/\/[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}:[0-9]{1,5}\/h\/.*\/keystamp\=[0-9]{1,}-.[0-9a-zA-Z]{,20}/[0-9a-zA-Z]{1,6}\.jpg').findall(text)
    if len(down_link) > 0:
        downJPG(down_link[0], num)
    # The rest of this pattern is missing here; the [18:] slice drops the
    # matched markup that precedes the next page's URL.
    next_link = re.compile(r'<\/iframe>').findall(text)[0][18:]
    findLink(next_link, num + 1)

if __name__ == '__main__':
    findLink(url_link, 1)
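For comparison, here is a rough Python 3 sketch of the same crawl written as a loop instead of recursion plus sys.exit. It reuses the image pattern from the script above (minus the unneeded backslashes). Everything else is an assumption made for illustration: the names `crawl`, `fetch`, `save` and `out_dir`, and in particular `NEXT_RE`, which is only a guess at what a next-page pattern could look like, not the one the script actually used.

```python
import os
import re
import urllib.request

# Image-URL pattern from the script above, without the redundant escapes.
IMG_RE = re.compile(
    r'http://[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}:[0-9]{1,5}'
    r'/h/.*/keystamp=[0-9]{1,}-.[0-9a-zA-Z]{,20}/[0-9a-zA-Z]{1,6}\.jpg'
)

# Hypothetical next-page pattern (an assumption, not the original one).
NEXT_RE = re.compile(r'</iframe><a href="([^"]+)"')


def fetch(url):
    """Return the page at `url` as text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode('utf-8', errors='replace')


def crawl(start_url, max_pages=18, out_dir='.',
          fetch=fetch, save=urllib.request.urlretrieve):
    """Visit up to `max_pages` pages, saving the first image found on each.

    Returns a list of (image_url, local_path) pairs for the saved images.
    """
    saved = []
    url = start_url
    for num in range(1, max_pages + 1):
        text = fetch(url)
        images = IMG_RE.findall(text)
        if images:
            target = os.path.join(out_dir, '%d.jpg' % num)
            save(images[0], target)
            saved.append((images[0], target))
        # Stop cleanly when there is no next-page link,
        # instead of failing with an IndexError.
        match = NEXT_RE.search(text)
        if match is None:
            break
        url = match.group(1)
    return saved
```

Passing `fetch` and `save` as parameters keeps the loop testable without touching the network: a dictionary lookup can stand in for `fetch`, and a function that only records its arguments can stand in for `save`.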
Good kids, please don't download!! Just look at the code, orz