Running time-consuming tasks asynchronously and without blocking in Tornado - yueming - ChinaUnix blog

Using multiple threads in Tornado to handle DNS requests without blocking

# Comments removed
class ThreadedResolver(ExecutorResolver):
    _threadpool = None
    _threadpool_pid = None
    def initialize(self, io_loop=None, num_threads=10):
        threadpool = ThreadedResolver._create_threadpool(num_threads)
        super(ThreadedResolver, self).initialize(
            io_loop=io_loop, executor=threadpool, close_executor=False)
    @classmethod
    def _create_threadpool(cls, num_threads):
        pid = os.getpid()
        if cls._threadpool_pid != pid:
            # Threads cannot survive after a fork, so if our pid isn't what it
            # was when we created the pool then delete it.
            cls._threadpool = None
        if cls._threadpool is None:
            from concurrent.futures import ThreadPoolExecutor
            cls._threadpool = ThreadPoolExecutor(num_threads)
            cls._threadpool_pid = pid
        return cls._threadpool

ThreadedResolver is a subclass of ExecutorResolver, so let's look at how that class is implemented.

class ExecutorResolver(Resolver):
    def initialize(self, io_loop=None, executor=None, close_executor=True):
        self.io_loop = io_loop or IOLoop.current()
        if executor is not None:
            self.executor = executor
            self.close_executor = close_executor
        else:
            self.executor = dummy_executor
            self.close_executor = False
    def close(self):
        if self.close_executor:
            self.executor.shutdown()
        self.executor = None
    @run_on_executor
    def resolve(self, host, port, family=socket.AF_UNSPEC):
        addrinfo = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
        results = []
        for family, socktype, proto, canonname, address in addrinfo:
            results.append((family, address))
        return results

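The implementation of ExecutorResolver shows that its key parameters are io_loop and executor, and that resolve, the function that does the actual work, is decorated with @run_on_executor. Read together with ThreadedResolver, the executor here is a concurrent.futures.ThreadPoolExecutor.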
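To see the idea behind resolve in isolation, here is a minimal sketch that uses only the standard library. It is not Tornado's code: the helper name resolve_blocking, the pool size and the numeric host 127.0.0.1 are assumptions chosen for illustration.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def resolve_blocking(host, port, family=socket.AF_UNSPEC):
    """Blocking lookup with the same logic as the resolve body shown above."""
    addrinfo = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    return [(fam, address) for fam, _socktype, _proto, _canonname, address in addrinfo]


# Hypothetical pool size for the demo; ThreadedResolver defaults to 10 threads.
executor = ThreadPoolExecutor(max_workers=2)

# submit() returns at once; the lookup itself runs on a worker thread.
future = executor.submit(resolve_blocking, "127.0.0.1", 80)

# A real server would not wait here; the demo blocks only to show the result.
results = future.result(timeout=5)
executor.shutdown()
print(results)
```

A numeric address is used so that the sketch does not depend on a working DNS server; with a real host name the call may take much longer, which is exactly why it is pushed onto a thread.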

Next, let's look at the implementation of @run_on_executor.

run_on_executor is implemented in the file concurrent.py. Its source is as follows:

def run_on_executor(fn):
    @functools.wraps(fn)
    def wrapper(self, *args, **kwargs):
        callback = kwargs.pop("callback", None)
        future = self.executor.submit(fn, self, *args, **kwargs)
        if callback:
            self.io_loop.add_future(future,
                                    lambda future: callback(future.result()))
        return future
    return wrapper

For an introduction to functools.wraps(), see the official Python documentation for the functools module (Higher-order functions and operations on callable objects).

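Put simply, the decorator wraps the function passed to it and runs that function through self.executor.submit(). It then checks whether a callback was supplied; if so, it registers the callback with the IOLoop through add_future, so the callback is called with the result once the future completes.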
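The wrapping step can be tried without Tornado. The sketch below is a simplified stand-in, not Tornado's implementation: run_in_pool and Worker are hypothetical names, and Future.add_done_callback replaces io_loop.add_future, which means the callback may run on a worker thread rather than on the IOLoop thread.

```python
import functools
from concurrent.futures import ThreadPoolExecutor


def run_in_pool(fn):
    """Simplified stand-in for run_on_executor (assumption: no IOLoop)."""
    @functools.wraps(fn)
    def wrapper(self, *args, **kwargs):
        callback = kwargs.pop("callback", None)
        # Hand the real work to the pool stored on the instance or class.
        future = self.executor.submit(fn, self, *args, **kwargs)
        if callback:
            # Tornado would use io_loop.add_future here so that the callback
            # runs on the IOLoop thread; this sketch has no loop to return to.
            future.add_done_callback(lambda f: callback(f.result()))
        return future
    return wrapper


class Worker(object):
    executor = ThreadPoolExecutor(max_workers=2)

    @run_in_pool
    def square(self, x):
        return x * x


seen = []
worker = Worker()
future = worker.square(7, callback=seen.append)
value = future.result(timeout=5)
# Joining the worker threads guarantees that the callback has finished.
Worker.executor.shutdown(wait=True)
print(value, seen)
```

Because of functools.wraps, the decorated method keeps its original name and docstring, which is why the decorator in the listing above applies it too.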

Compare this with concurrent.futures.Executor in the standard library, which provides a submit() method. Reading ThreadedResolver from start to finish, it simply uses concurrent.futures.ThreadPoolExecutor, a subclass of Executor.

So the multithreaded, non-blocking DNS resolution in Tornado comes down to the ThreadPoolExecutor provided by concurrent.futures.
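The two standard-library facts this relies on, that ThreadPoolExecutor is an Executor and that submit() hands back a Future without waiting, can be checked with a short sketch. The function slow_identity and the 0.5 second delay are made up for the demonstration.

```python
import threading
import time
from concurrent.futures import Executor, Future, ThreadPoolExecutor


def slow_identity(value, delay=0.5):
    """Stand-in for a blocking call; also reports which thread ran it."""
    time.sleep(delay)
    return value, threading.current_thread().name


is_executor_subclass = issubclass(ThreadPoolExecutor, Executor)

with ThreadPoolExecutor(max_workers=1) as pool:
    start = time.monotonic()
    future = pool.submit(slow_identity, "done")
    # submit() only queues the call, so this is far shorter than the delay.
    submit_elapsed = time.monotonic() - start
    # result() is the call that actually waits for the worker thread.
    value, thread_name = future.result()

main_name = threading.current_thread().name
print(is_executor_subclass, value, thread_name != main_name)
```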

Using the multithreaded non-blocking approach to run time-consuming tasks

Borrowing Tornado's approach, we can use the same technique in our own programs to handle time-consuming tasks.

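import tornado.web
from concurrent.futures import ThreadPoolExecutor
from tornado.concurrent import run_on_executor

class LongTimeTask(tornado.web.RequestHandler):
    # run_on_executor submits the decorated method to self.executor
    executor = ThreadPoolExecutor(10)

    @run_on_executor
    def get(self, data):
        # long_time_task is a placeholder for your own blocking function
        long_time_task(data)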

That is the basic usage. Below is a complete program that uses sleep() to simulate a time-consuming task.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import time
from concurrent.futures import ThreadPoolExecutor

import tornado.gen
import tornado.httpserver
import tornado.ioloop
import tornado.web
from tornado.concurrent import run_on_executor

class App(tornado.web.Application):
    def __init__(self):
        handlers = [
            (r'/', IndexHandler),
            (r'/sleep/(\d+)', SleepHandler),
        ]
        settings = dict()
        tornado.web.Application.__init__(self, handlers, **settings)

class BaseHandler(tornado.web.RequestHandler):
    # run_on_executor looks the pool up as self.executor; subclasses share it
    executor = ThreadPoolExecutor(10)

class IndexHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, world %s" % time.time())

class SleepHandler(BaseHandler):
    @tornado.gen.coroutine
    def get(self, n):
        # Only the blocking call runs in the pool. write() is not thread-safe,
        # so the response is written back on the IOLoop thread.
        yield self._sleep(float(n))
        self._callback()

    @run_on_executor
    def _sleep(self, seconds):
        time.sleep(seconds)

    def _callback(self):
        self.write("after sleep, now I'm back %s" % time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))

if __name__ == "__main__":
    app = App()
    server = tornado.httpserver.HTTPServer(app, xheaders=True)
    server.listen(8888)
    tornado.ioloop.IOLoop.current().start()
Now a request to 127.0.0.1:8888/sleep/10 no longer blocks a later request to 127.0.0.1:8888/.
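The same effect can be reproduced without starting a server. The sketch below swaps in a different technique: it uses the standard library's asyncio with loop.run_in_executor in place of Tornado's run_on_executor, and the coroutines slow_request and fast_request are hypothetical stand-ins for the /sleep/N and / handlers. It needs Python 3.7 or later.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=10)
finished = []  # records the order in which the two "requests" complete


async def slow_request(seconds):
    loop = asyncio.get_running_loop()
    # time.sleep blocks a worker thread, not the event loop.
    await loop.run_in_executor(executor, time.sleep, seconds)
    finished.append("sleep")


async def fast_request():
    finished.append("index")


async def main():
    slow = asyncio.ensure_future(slow_request(0.3))
    await asyncio.sleep(0.05)  # let the slow request start first
    await fast_request()       # served while the sleep is still running
    await slow


asyncio.run(main())
executor.shutdown()
print(finished)  # prints ['index', 'sleep']
```

The fast request finishes first even though it was issued second, which is the behaviour described above for the two URLs.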

That is the complete approach to running time-consuming tasks on multiple threads in Tornado.

Conclusion

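epoll has many advantages: once it reports that an event is ready, the upper-layer task function does the work. If that task is itself time-consuming, the approach described here is worth considering.
There are other options as well, such as using celery to schedule tasks that take too long, for example when data has to be written frequently to many different files. In one case at my company, data has to be written to more than four thousand files and several hundred million records are produced every day; the combination of tornado + redis + celery is used there to carry out the file-writing tasks efficiently.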

The end.

Source: http://www.cnblogs.com/DjangoBlog/p/5267006.html
