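I wanted to start investing in stocks and needed some investment data to work with, and I also wanted to practice Python, so I set off down the programmer's road.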

    # encoding:utf-8
    import sys
    import re
    from urllib2 import urlopen, URLError


    def get_packet(url):
        # Download the page source; return '~' if the request fails.
        try:
            packet = urlopen(url)
        except URLError:  # HTTPError is a subclass of URLError
            return '~'
        content = packet.read()
        # The page is served as GB2312, so decode it to unicode.
        return content.decode('gb2312')


    def get_data(packet):
        # NOTE: the HTML tags that surrounded (.*) in both patterns were
        # lost when this post was published; restore them before running.
        xiangmu = re.findall(r'(.*)', packet)   # item names
        shuju = re.findall(r'(.*) | ', packet)  # data values

        # re.findall always returns a list (possibly empty), never None.
        for item in xiangmu:
            print item
        for item in shuju:
            print item
        # If the console garbles the text, encode each item explicitly,
        # e.g. print item.encode('gb2312').


    if __name__ == '__main__':
        url = ''  # address of the page to scrape
        packet = get_packet(url)
        if packet == '~':
            sys.exit(0)
        get_data(packet)

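First, get_packet fetches the page source; then regular expressions pull out the strings I want.
The (.*) group in each pattern is the text that gets extracted, and it is then printed. re.findall searches for every match and returns the results as a list; when the pattern contains one capture group, each list element is the text matched by that group.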
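
To make the re.findall step concrete, here is a minimal sketch written for Python 3. The page fragment, the `name` and `value` class names, and the `extract` helper are all hypothetical, since the post does not show the real page's markup; only the decode step and the use of re.findall with a capture group come from the text above.

```python
import re

# Hypothetical page fragment, encoded as GB2312 to mimic the raw bytes
# a server might return. The real page's markup is not shown in the post.
SAMPLE_BYTES = (
    '<tr><td class="name">市盈率</td><td class="value">12.5</td></tr>\n'
    '<tr><td class="name">每股收益</td><td class="value">0.83</td></tr>\n'
).encode('gb2312')


def extract(html, css_class):
    """Return the text of every <td> cell carrying the given class."""
    # Non-greedy (.*?) stops at the first </td>, so two cells on the
    # same line are not merged into a single match.
    pattern = r'<td class="%s">(.*?)</td>' % re.escape(css_class)
    return re.findall(pattern, html)


html = SAMPLE_BYTES.decode('gb2312')  # same decoding step as get_packet
names = extract(html, 'name')         # list of captured item names
values = extract(html, 'value')       # list of captured data values

for name, value in zip(names, values):
    print(name, value)
```

The sketch uses a non-greedy `(.*?)` rather than `(.*)` because a greedy group would run to the last `</td>` on the line whenever a row holds several cells. For anything beyond a quick script, an HTML parser is likely to be more robust than regular expressions.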