C++ Template Metaprogramming Techniques - 剑心通明 - ChinaUnix Blog

C++ Template Metaprogramming Techniques

Category: C/C++

2008-05-30 21:10:41

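This article describes the origin, concepts, and mechanics of template metaprogramming, and introduces its use in the Blitz++ and Loki libraries.

Introduction

In 1994, during a C++ standards committee meeting in San Diego, Erwin Unruh presented a piece of code that produced prime numbers. What made the code special was that the primes were produced at compile time rather than at run time: scattered among the error messages emitted by the compiler were all the primes from 2 up to a chosen limit: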

// Prime number computation by Erwin Unruh
template <int i> struct D { D(void*); operator int(); };

template <int p, int i> struct is_prime {
    enum { prim = (p%i) && is_prime<(i > 2 ? p : 0), i -1> :: prim };
};

template < int i > struct Prime_print {
    Prime_print<i-1> a;
    enum { prim = is_prime<i, i-1>::prim };
    void f() { D<i> d = prim; }
};

struct is_prime<0,0> { enum {prim=1}; };
struct is_prime<0,1> { enum {prim=1}; };
struct Prime_print<2> { enum {prim = 1}; void f() { D<2> d = prim; } };
#ifndef LAST
#define LAST 10
#endif
main () {
    Prime_print<LAST> a;
}

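The class template D has a single constructor taking a void*, and only 0 can legally be converted to void*. Initializing a D<i> from a nonzero prim therefore fails, and the diagnostic names the type D<i>, which is how each prime ends up in the compiler output. In 1994, Erwin Unruh compiled the code with the Metaware compiler and got the following error messages (along with others that are omitted here for brevity):
| Type `enum{}' can't be converted to txpe `D<2>' ("primes.cpp",L2/C25).
| Type `enum{}' can't be converted to txpe `D<3>' ("primes.cpp",L2/C25).
| Type `enum{}' can't be converted to txpe `D<5>' ("primes.cpp",L2/C25).
| Type `enum{}' can't be converted to txpe `D<7>' ("primes.cpp",L2/C25).
Today the code above is no longer legal C++. Below is a revised version by Erwin Unruh himself, which can be compiled with today's standard-conforming C++ compilers: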

// Prime number computation by Erwin Unruh

template <int i> struct D { D(void*); operator int(); };

template <int p, int i> struct is_prime {
    enum { prim = (p==2) || (p%i) && is_prime<(i>2?p:0), i-1> :: prim };
};

template <int i> struct Prime_print {
    Prime_print<i-1> a;
    enum { prim = is_prime<i, i-1>::prim };
    void f() { D<i> d = prim ? 1 : 0; a.f();}
};

template<> struct is_prime<0,0> { enum {prim=1}; };
template<> struct is_prime<0,1> { enum {prim=1}; };

template<> struct Prime_print<1> {
    enum {prim=0};
    void f() { D<1> d = prim ? 1 : 0; };
};

#ifndef LAST
#define LAST 18
#endif

main() {
    Prime_print<LAST> a;
    a.f();
}
Compiling this program with GNU C++ (MinGW Special) 3.2 yields the following error messages (along with others that are omitted here for brevity):
Unruh.cpp:12: initializing argument 1 of `D<i>::D(void*) [with int i = 17]'
Unruh.cpp:12: initializing argument 1 of `D<i>::D(void*) [with int i = 13]'
Unruh.cpp:12: initializing argument 1 of `D<i>::D(void*) [with int i = 11]'
Unruh.cpp:12: initializing argument 1 of `D<i>::D(void*) [with int i = 7]'
Unruh.cpp:12: initializing argument 1 of `D<i>::D(void*) [with int i = 5]'
Unruh.cpp:12: initializing argument 1 of `D<i>::D(void*) [with int i = 3]'
Unruh.cpp:12: initializing argument 1 of `D<i>::D(void*) [with int i = 2]'
This example shows that the template instantiation mechanism can be used to perform computation at compile time. This technique of computing at compile time through template instantiation is called template metaprogramming.

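A Runnable Template Metaprogramming Example

Template metaprogramming more precisely means "writing programs that can write programs," and a template metaprogram is "a program that can write programs." In other words, we supply rules for generating code, and the compiler interprets those rules at compile time and generates new code that implements the behavior we want.
Erwin Unruh's classic code never actually ran; it only emitted intermediate results in the form of compiler error messages. Let us look at a template metaprogram that can run, one that raises a given integer to a given power: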

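// xy.h

// Primary template
template<int Base, int Exponent>
class XY
{
public:
    enum { result_ = Base * XY<Base, Exponent-1>::result_ };
};

// Partial specialization that terminates the recursion
template<int Base>
class XY<Base, 0>
{
public:
    enum { result_ = 1 };
};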
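The heart of template metaprogramming is recursive template instantiation. The first template implements the recursion rule for the general case. When it is instantiated with a pair of integers, XY must compute its result_, which it does by multiplying Base by the result_ of the same template instantiated with the exponent reduced by one. The second template is a partial specialization that terminates the recursion.

Let us see what happens when this template is used to compute 5^4 (by instantiating XY<5, 4>):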
// xytest.cpp

#include <iostream>
#include "xy.h"

int main()
{
    std::cout << "X^Y<5, 4>::result_ = " << XY<5, 4>::result_;
}
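First the compiler instantiates XY<5, 4>, whose result_ is 5 * XY<5, 3>::result_. That in turn requires instantiating the same template for <5, 3>, which instantiates XY<5, 2>, and so on. When XY<5, 0> is reached, the partial specialization is selected, result_ evaluates to 1, and the recursion ends.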

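Depth and Termination of Recursive Template Instantiation

As you can imagine, instantiating the class template XY with a very large Y would consume a great deal of compiler resources and could quickly exhaust them (before the result even overflows). In practice, template metaprogramming should therefore be used with restraint.
Although the minimum instantiation depth recommended by the C++ standard is only 17 levels, most compilers can handle at least several dozen; some allow several hundred, and a few reach several thousand, until resources run out. (The figure of 17 comes from C++98; C++11 raised the recommended minimum to 1024.)
What happens if we remove the partial specialization of XY?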
// xy2.h

// Primary template
template<int Base, int Exponent>
class XY
{
public:
    enum { result_ = Base * XY<Base, Exponent-1>::result_ };
};
The test program is unchanged:
// xytest2.cpp

#include <iostream>
#include "xy2.h"

int main()
{
    std::cout << "X^Y<5, 4>::result_ = " << XY<5, 4>::result_;
}
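Run the following compile command:
C:\>g++ -c xytest2.cpp
You will see the recursive instantiation continue until it hits the compiler's limit.
The default instantiation depth limit in GNU C++ (MinGW Special) 3.2 is 500 levels. You can also adjust the depth by hand:
C:\>g++ -ftemplate-depth-3400 -c xytest2.cpp
In fact, g++ 3.2 allows a somewhat larger limit than this (in my tests, no more than 3450 levels).
Therefore, whenever we use template metaprogramming, we must always supply a specialization of the primary template (partial, full, or both) to serve as the termination criterion for the recursive instantiation.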

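Unrolling Loops with Template Metaprogramming

One of the earliest practical uses of template metaprogramming was loop unrolling in numerical computation. For example, the usual way to sum an array is: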
// sumarray.h

template<typename T>
inline T sum_array(int Dim, T* a)
{
    T result = T();
    for (int i = 0; i < Dim; ++i)
    {
        result += a[i];
    }
    return result;
}
This certainly works, but we can also use template metaprogramming to unroll the loop:
// sumarray2.h

// Primary template
template<int Dim, typename T>
class Sumarray
{
public:
    static T result(T* a)
    {
        return a[0] + Sumarray<Dim-1, T>::result(a+1);
    }
};

// Partial specialization that serves as the termination criterion
template<typename T>
class Sumarray<1, T>
{
public:
    static T result(T* a)
    {
        return a[0];
    }
};

It is used as follows:

// sumarraytest2.cpp

#include <iostream>
#include "sumarray2.h"

int main()
{
    int a[6] = {1, 2, 3, 4, 5, 6};
    std::cout << " Sumarray<6>(a) = " << Sumarray<6, int>::result(a);
}
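When we evaluate Sumarray<6, int>::result(a), the instantiation proceeds as follows:
Sumarray<6, int>::result(a)
= a[0] + Sumarray<5, int>::result(a+1)
= a[0] + a[1] + Sumarray<4, int>::result(a+2)
= a[0] + a[1] + a[2] + Sumarray<3, int>::result(a+3)
= a[0] + a[1] + a[2] + a[3] + Sumarray<2, int>::result(a+4)
= a[0] + a[1] + a[2] + a[3] + a[4] + Sumarray<1, int>::result(a+5)
= a[0] + a[1] + a[2] + a[3] + a[4] + a[5]
As you can see, the loop is expanded into a[0] + a[1] + a[2] + a[3] + a[4] + a[5]. Such straight-line arithmetic is almost always more efficient than the loop, since it needs no counter and no branch per element.
An array with 6 million elements might demonstrate the advantage of loop unrolling more convincingly. Such an array is easy to generate; if you are interested, run the test and compare for yourself.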

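Template Metaprogramming in Numerical Libraries

Blitz++ owes its "lightning" speed (which is what blitz literally means) in no small part to template metaprograms. Blitz++ makes thorough use of metaprogramming; you can see how in the source of these files: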

dot.h

matassign.h

matmat.h

matvec.h

metaprog.h

product.h

sum.h

vecassign.h

Let us look at the template metaprogram in the dot.h file of the Blitz++ library:

template<int N, int I>
class _bz_meta_vectorDot {
public:
    enum { loopFlag = (I < N-1) ? 1 : 0 };

    template<class T_expr1, class T_expr2>
    static inline BZ_PROMOTE(_bz_typename T_expr1::T_numtype, _bz_typename T_expr2::T_numtype)
    f(const T_expr1& a, const T_expr2& b)
    {
        return a[I] * b[I] + _bz_meta_vectorDot<loopFlag * N, loopFlag * (I+1)>::f(a,b);
    }

    template<class T_expr1, class T_expr2>
    static inline BZ_PROMOTE(_bz_typename T_expr1::T_numtype, _bz_typename T_expr2::T_numtype)
    f_value_ref(T_expr1 a, const T_expr2& b)
    {
        return a[I] * b[I] + _bz_meta_vectorDot<loopFlag * N, loopFlag * (I+1)>::f(a,b);
    }

    template<class T_expr1, class T_expr2>
    static inline BZ_PROMOTE(_bz_typename T_expr1::T_numtype, _bz_typename T_expr2::T_numtype)
    f_ref_value(const T_expr1& a, T_expr2 b)
    {
        return a[I] * b[I] + _bz_meta_vectorDot<loopFlag * N, loopFlag * (I+1)>::f(a,b);
    }

    template<class T_expr1, class P_numtype2>
    static inline BZ_PROMOTE(_bz_typename T_expr1::T_numtype, P_numtype2)
    dotWithArgs(const T_expr1& a, P_numtype2 i1, P_numtype2 i2=0,
                P_numtype2 i3=0, P_numtype2 i4=0, P_numtype2 i5=0, P_numtype2 i6=0,
                P_numtype2 i7=0, P_numtype2 i8=0, P_numtype2 i9=0, P_numtype2 i10=0)
    {
        return a[I] * i1 + _bz_meta_vectorDot<loopFlag * N, loopFlag * (I+1)>::dotWithArgs
                           (a, i2, i3, i4, i5, i6, i7, i8, i9);
    }
};

template<>
class _bz_meta_vectorDot<0,0> {
public:
    template<class T_expr1, class T_expr2>
    static inline _bz_meta_nullOperand f(const T_expr1&, const T_expr2&)
    { return _bz_meta_nullOperand(); }

    template<class T_expr1, class P_numtype2>
    static inline _bz_meta_nullOperand
    dotWithArgs(const T_expr1& a, P_numtype2 i1, P_numtype2 i2=0,
                P_numtype2 i3=0, P_numtype2 i4=0, P_numtype2 i5=0, P_numtype2 i6=0,
                P_numtype2 i7=0, P_numtype2 i8=0, P_numtype2 i9=0, P_numtype2 i10=0)
    {
        return _bz_meta_nullOperand();
    }
};
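This code is much simpler than it first appears. The _bz_meta_vectorDot class template uses a temporary, loopFlag, to hold the result of evaluating the loop condition at each step, and uses a full specialization as the condition that terminates the recursion. Note that, as in almost all metaprograms, this temporary does its work at compile time and is optimized out of the running code.
Todd Veldhuizen is the principal author of the Blitz++ numerical array library. This library (along with libraries such as MTL and POOMA) demonstrates that template metaprograms can deliver more efficient numerical computation. Todd claims that Blitz++ performs on par with the corresponding Fortran libraries.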

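The Loki Library: A Model of Template Metaprogramming Put to Good Use

Is the value of template metaprogramming limited to high-performance numerical computation? Not at all. The Loki library is known in the C++ community for its pioneering work on generic design patterns. It makes clever use of template metaprogramming to implement the Typelist component, which is indispensable infrastructure for generic patterns such as Abstract Factory and Visitor.
Just as the standard library component std::list provides operations on a collection of values, a Typelist can be used to manipulate a collection of types. Its definition is very simple (taken from Typelist.h in the Loki library):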
template <class T, class U>
struct Typelist
{
    typedef T Head;
    typedef U Tail;
};
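Clearly, Typelist has no state and defines no operations; its only purpose is to carry type information. It is not meant to be instantiated as an object, so any processing of a Typelist necessarily happens at compile time rather than at run time.
A Typelist can be extended without limit, because a template parameter can be any type (including another instantiation of the same template). For example:
Typelist<char, Typelist<int, Typelist<float, NullType> > >
is a Typelist containing the three types char, int, and float.
By Loki convention, every Typelist must end with NullType. NullType plays a role similar to the '\0' that ends a traditional C string. It is declared in the NullType.h file of the Loki library: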
class NullType;
NullType is only declared, never defined, because the Loki library never needs to create a NullType object.
Let us look at the IndexOf template metaprogram, which finds the position of a given type in a Typelist (taken from Typelist.h in the Loki library):

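template <class TList, class T>
struct IndexOf;

template <class T>
struct IndexOf<NullType, T>
{
    enum { value = -1 };
};

template <class T, class Tail>
struct IndexOf<Typelist<T, Tail>, T>
{
    enum { value = 0 };
};

template <class Head, class Tail, class T>
struct IndexOf<Typelist<Head, Tail>, T>
{
private:
    enum { temp = IndexOf<Tail, T>::value };
public:
    enum { value = (temp == -1 ? -1 : 1 + temp) };
};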

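IndexOf provides a primary template and three partial specializations. The algorithm is simple: if TList (which is a Typelist) is NullType, value is -1. If the head of TList is T, value is 0. Otherwise, IndexOf is applied to the tail of TList and T, and the result is stored in a temporary, temp. If temp is -1, value is -1; otherwise value is 1 + temp.
To deepen your understanding of the template metaprogramming behind Typelist, I extracted the following code from the Loki library and put it in a file named typelistlite.h: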
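// typelistlite.h

// Declaration of NullType
class NullType;

// Definition of Typelist
template <class T, class U>
struct Typelist
{
    typedef T Head;
    typedef U Tail;
};

// Definition of IndexOf

// IndexOf primary template
template <class TList, class T> struct IndexOf;

// Partial specialization for NullType
template <class T>
struct IndexOf<NullType, T>
{
    enum { value = -1 };
};

// Partial specialization for "the head of TList is the T we are looking for"
template <class T, class Tail>
struct IndexOf<Typelist<T, Tail>, T>
{
    enum { value = 0 };
};

// Partial specialization that processes the tail of TList
template <class Head, class Tail, class T>
struct IndexOf<Typelist<Head, Tail>, T>
{
private:
    enum { temp = IndexOf<Tail, T>::value };
public:
    enum { value = (temp == -1 ? -1 : 1 + temp) };
};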

The test program is as follows:

// typelistlite_test.cpp

#include <iostream>
#include "typelistlite.h"

// User-defined type Royal
class Royal {};

// Define a Typelist containing char, int, Royal, and float
typedef Typelist<char, Typelist<int, Typelist<Royal, Typelist<float, NullType> > > > CIRF;

int main()
{
    std::cout << "IndexOf<CIRF, int>::value = " << IndexOf<CIRF, int>::value << "\n";
    std::cout << "IndexOf<CIRF, Royal>::value = " << IndexOf<CIRF, Royal>::value << "\n";
    std::cout << "IndexOf<CIRF, double>::value = " << IndexOf<CIRF, double>::value << "\n";
}

The program prints:

IndexOf<CIRF, int>::value = 1
IndexOf<CIRF, Royal>::value = 2
IndexOf<CIRF, double>::value = -1

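Conclusion

Template metaprogramming is not all upside. Template metaprograms are slow to compile, programs that contain them produce larger code than ordinary programs, and they are usually much harder to debug than conventional code. In addition, some programmers may find it somewhat abstract to express algorithms as class templates.
The cost in compile time buys excellent run-time performance. Generally speaking, a useful program is run far more often (or is in service far longer) than it is compiled. Giving users a better experience, or gaining performance for demanding numerical computation, is worth that cost to the programmer.
It is hard to imagine template metaprogramming becoming an everyday tool for every ordinary programmer. Rather, as in Blitz++ and Loki, template metaprograms should almost always be encapsulated inside a library, where they are transparent to the library's users. Template metaprograms can (and should) serve as the core of ordinary template code, providing better performance for critical algorithms or special effects for special purposes.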
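Template metaprogramming made its formal debut in Todd Veldhuizen's paper Using C++ Template Metaprograms. The paper first appeared in the May 1995 issue of C++ Report and was later included in C++ Gems, edited by Stanley Lippman. A link to the paper is given in the references; it covers a good deal that this article does not.
C++ Templates: The Complete Guide, by David Vandevoorde and Nicolai M. Josuttis, devotes an entire chapter to template metaprogramming. It is also a reference for this article and is recommended as supplementary reading.
Chapter 3, Typelists, of Andrei Alexandrescu's brilliant Modern C++ Design: Generic Programming and Design Patterns Applied describes Typelist in much greater detail.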
