Original article:
http://blog.csdn.net/yueguanghaidao/article/details/41043375
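Today, while running a crawler, gevent raised AssertionError: Impossible to call blocking function in the event loop callback.
That was odd. Could patch_socket be the culprit? Everything had worked before I started using patch_socket. A simplified version of the code: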
import urllib

import gevent
from gevent.monkey import patch_socket
from gevent.hub import get_hub


def f():
    # Fetch the page and print its first 10 characters.
    r = urllib.urlopen("").read()
    print r[:10]


def timer(after, repeat, f):
    # Register f directly on the hub's event loop: first call after
    # `after` seconds, then every `repeat` seconds.
    t = get_hub().loop.timer(after, repeat)
    t.start(f)
    return t


def run():
    patch_socket()
    timer(1, 5, f)
    gevent.sleep(100)


run()
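This code calls f every 5 seconds (the first call comes after 1 second), and f simply prints the first 10 characters of the Baidu homepage. Before I reveal the answer, take a moment to think about why this fails.
Here is the exception stack as well; it helps with the analysis: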
  File "C:\Python27\lib\httplib.py", line 772, in connect
    self.timeout, self.source_address)
  File "C:\Python27\lib\site-packages\gevent\socket.py", line 570, in create_connection
    for res in getaddrinfo(host, port, 0 if has_ipv6 else AF_INET, SOCK_STREAM):
  File "C:\Python27\lib\site-packages\gevent\socket.py", line 621, in getaddrinfo
    return get_hub().resolver.getaddrinfo(host, port, family, socktype, proto, flags)
  File "C:\Python27\lib\site-packages\gevent\resolver_thread.py", line 34, in getaddrinfo
    return self.pool.apply_e(self.expected_errors, _socket.getaddrinfo, args, kwargs)
  File "C:\Python27\lib\site-packages\gevent\threadpool.py", line 222, in apply_e
    success, result = self.spawn(wrap_errors, expected_errors, function, args, kwargs).get()
  File "C:\Python27\lib\site-packages\gevent\event.py", line 226, in get
    result = self.hub.switch()
  File "C:\Python27\lib\site-packages\gevent\hub.py", line 330, in switch
    switch_out()
  File "C:\Python27\lib\site-packages\gevent\hub.py", line 334, in switch_out
    raise AssertionError('Impossible to call blocking function in the event loop callback')
AssertionError: Impossible to call blocking function in the event loop callback
0x2652ed0 callback=0x026B0070> args=()> failed with AssertionError
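At first I was completely baffled: how could something this simple go wrong?
The stack shows that the failure comes from the hub's switch_out, which is called from switch: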
def switch(self):
    # If the greenlet being switched out defines a switch_out hook, call it first.
    switch_out = getattr(getcurrent(), 'switch_out', None)
    if switch_out is not None:
        switch_out()
    return greenlet.switch(self)


def switch_out(self):
    # The hub must never be switched out, so its own hook always raises.
    raise AssertionError('Impossible to call blocking function in the event loop callback')
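As mentioned in an earlier post, gevent provides a switch_out hook that is called when the current greenlet is switched out. But why is it the hub's switch_out that gets called?
It should be some other greenlet's switch_out. No wonder it fails: if the hub itself were switched out, who would do the scheduling?
That is exactly the problem. Notice that in the code above f is run by the hub itself; no other greenlet is involved in calling it.
Let's walk through the logic. We register f as a timer callback on the event loop, so the loop calls f inside the hub greenlet. When f hits a blocking socket operation (in the trace, the DNS lookup in getaddrinfo), gevent tries to switch to the hub, and hub.switch() first calls switch_out on whichever greenlet is current at that moment. Unfortunately that greenlet is the hub itself, whose switch_out always raises, hence the AssertionError.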
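To make that walk-through concrete, here is a toy model of the guard. It is only a sketch: the Greenlet and Hub classes and the blocking_call helper below are made up for illustration and are not gevent's real classes; they only imitate the control flow of the switch/switch_out excerpt.

```python
# Toy model only: these classes imitate the control flow described above.
# They are hypothetical stand-ins, not gevent's real Hub or greenlet classes.


class Greenlet(object):
    """An ordinary task. It defines no switch_out hook."""

    def __init__(self, name):
        self.name = name


class Hub(Greenlet):
    """The scheduler. It must never be switched out, so its hook raises."""

    def __init__(self):
        Greenlet.__init__(self, "hub")
        # Event-loop callbacks run while the hub itself is the current task.
        self.current = self

    def switch_out(self):
        raise AssertionError(
            "Impossible to call blocking function in the event loop callback")

    def switch(self):
        # Same shape as the excerpt: ask the *current* task for its hook.
        hook = getattr(self.current, "switch_out", None)
        if hook is not None:
            hook()
        return "switched to hub from " + self.current.name


def blocking_call(hub):
    """Anything that has to wait (DNS lookup, socket read) ends up here."""
    return hub.switch()


hub = Hub()

# Case 1: called from a timer callback, so the current task is the hub.
try:
    blocking_call(hub)
    outcome = "ok"
except AssertionError as exc:
    outcome = str(exc)

# Case 2: the same call made from a separate task works fine.
hub.current = Greenlet("worker")
result = blocking_call(hub)

print(outcome)
print(result)
```

The only difference between the two cases is which task is current when the blocking call happens, which is the whole story of the bug.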
Once the cause is known, the fix is easy: wrap f in a greenlet. The code:
import urllib

import gevent
from gevent.monkey import patch_socket
from gevent.hub import get_hub


def patch_greenlet(f):
    # Run f in a new greenlet, so that a blocking call switches out of
    # that greenlet instead of the hub.
    def inner(*args, **kwargs):
        return gevent.spawn(f, *args, **kwargs)
    return inner


@patch_greenlet
def f():
    # Fetch the page and print its first 10 characters.
    r = urllib.urlopen("").read()
    print r[:10]


def timer(after, repeat, f):
    t = get_hub().loop.timer(after, repeat)
    t.start(f)
    return t


def run():
    patch_socket()
    timer(1, 0, f)
    gevent.sleep(100)


run()
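The point of the decorator is that the timer callback no longer does the work itself; it only schedules it and returns right away. Here is a small sketch of that pattern that runs without gevent. The names fake_spawn, run_in_greenlet, pending and fetch are hypothetical: fake_spawn merely queues the call as a stand-in for gevent.spawn, and functools.wraps is an optional extra that the code above does not use.

```python
import functools
from collections import deque

# Hypothetical stand-in for the scheduler's run queue.
pending = deque()


def fake_spawn(func, *args, **kwargs):
    """Stand-in for gevent.spawn: queue the call instead of running it now."""
    pending.append((func, args, kwargs))


def run_in_greenlet(spawn):
    """Build a decorator that hands the wrapped function to `spawn`."""
    def decorator(func):
        @functools.wraps(func)  # keep the original name for logs and debugging
        def inner(*args, **kwargs):
            return spawn(func, *args, **kwargs)
        return inner
    return decorator


log = []


@run_in_greenlet(fake_spawn)
def fetch(label):
    log.append("fetched " + label)


# What the timer callback does: it only schedules the work and returns.
fetch("page")
ran_inside_callback = list(log)

# Later, outside the callback, the scheduler runs whatever was queued.
while pending:
    func, args, kwargs = pending.popleft()
    func(*args, **kwargs)

print(ran_inside_callback)
print(log)
```

Nothing has run by the time the callback returns; the work happens afterwards, in a context where blocking is allowed.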
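I have to say, you run into a lot of problems when using gevent. Maybe that is one reason coroutines are so addictive: you learn to enjoy the "self-inflicted pain", and the more you enjoy it, the better you get at taming them.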