likely() and unlikely()
What are they? In the Linux kernel source, likely() and unlikely() frequently appear inside conditional statements, for example:
bvl = bvec_alloc(gfp_mask, nr_iovecs, &idx);
if (unlikely(!bvl)) {
    mempool_free(bio, bio_pool);
    bio = NULL;
    goto out;
}
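In fact, these two macros tell the compiler which outcome of a condition is the more probable one, so that it can optimize the conditional branch accordingly.
Their definitions can be found in the header include/linux/compiler.h: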
#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
The GCC documentation describes __builtin_expect() as follows:
-- Built-in Function: long __builtin_expect (long EXP, long C)
You may use `__builtin_expect' to provide the compiler with branch
prediction information. In general, you should prefer to use
actual profile feedback for this (`-fprofile-arcs'), as
programmers are notoriously bad at predicting how their programs
actually perform. However, there are applications in which this
data is hard to collect.
The return value is the value of EXP, which should be an integral
expression. The value of C must be a compile-time constant. The
semantics of the built-in are that it is expected that EXP == C.
For example:
    if (__builtin_expect (x, 0))
      foo ();
would indicate that we do not expect to call `foo', since we
expect `x' to be zero. Since you are limited to integral
expressions for EXP, you should use constructions such as
    if (__builtin_expect (ptr != NULL, 1))
      error ();
when testing pointer or floating-point values.
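How does the optimization work? The compiler arranges the generated assembly so that the most probable branch becomes the fall-through path and executes no jmp instruction at all, which lets the processor pipeline run at full efficiency (a taken jump risks flushing the pipeline, most of all when it is mispredicted). To illustrate, we compile the following user-space C program with gcc -O2: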
#include <stdio.h>
#include <stdlib.h>

#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

int main(int argc, char *argv[])
{
    int a;

    /* Get the value from somewhere GCC can't optimize */
    a = atoi(argv[1]);
    if (unlikely(a == 2))
        a++;
    else
        a--;
    printf("%d\n", a);
    return 0;
}
Compile it with gcc -O2 -o test test.c, then disassemble the resulting binary with objdump -S to see the optimized assembly, shown below with comments added:
080483b0 <main>:
// Prologue
80483b0: 55 push %ebp
80483b1: 89 e5 mov %esp,%ebp
80483b3: 50 push %eax
80483b4: 50 push %eax
80483b5: 83 e4 f0 and $0xfffffff0,%esp
// Call atoi()
80483b8: 8b 45 08 mov 0x8(%ebp),%eax
80483bb: 83 ec 1c sub $0x1c,%esp
80483be: 8b 48 04 mov 0x4(%eax),%ecx
80483c1: 51 push %ecx
80483c2: e8 1d ff ff ff call 80482e4 <atoi@plt>
80483c7: 83 c4 10 add $0x10,%esp
// Test the value
80483ca: 83 f8 02 cmp $0x2,%eax
// --------------------------------------------------------
// If 'a' equals 2 (the least likely case), jump; otherwise keep
// executing sequentially with no jump, so the pipeline is not flushed.
// --------------------------------------------------------
80483cd: 74 12 je 80483e1
// Here eax is decremented directly, i.e. a--
80483cf: 48 dec %eax
// Call printf
80483d0: 52 push %edx
80483d1: 52 push %edx
80483d2: 50 push %eax
80483d3: 68 c8 84 04 08 push $0x80484c8
80483d8: e8 f7 fe ff ff call 80482d4
// Return 0 and go out.
80483dd: 31 c0 xor %eax,%eax
80483df: c9 leave
80483e0: c3 ret
Likewise, we change the program to use likely() instead of unlikely(), then recompile and disassemble it, shown below with comments added:
080483b0 <main>:
// Prologue
80483b0: 55 push %ebp
80483b1: 89 e5 mov %esp,%ebp
80483b3: 50 push %eax
80483b4: 50 push %eax
80483b5: 83 e4 f0 and $0xfffffff0,%esp
// Call atoi()
80483b8: 8b 45 08 mov 0x8(%ebp),%eax
80483bb: 83 ec 1c sub $0x1c,%esp
80483be: 8b 48 04 mov 0x4(%eax),%ecx
80483c1: 51 push %ecx
80483c2: e8 1d ff ff ff call 80482e4 <atoi@plt>
80483c7: 83 c4 10 add $0x10,%esp
// --------------------------------------------------
// If 'a' equals 2, execution continues sequentially with no jump, so the processor pipeline is not flushed;
// a jump is taken only when a != 2, which is the least likely case.
// ---------------------------------------------------
80483ca: 83 f8 02 cmp $0x2,%eax
80483cd: 75 13 jne 80483e2 <main+0x32>
// Here gcc has optimized the a++ increment away and loads the value 3 directly into al
80483cf: b0 03 mov $0x3,%al
// Call printf()
80483d1: 52 push %edx
80483d2: 52 push %edx
80483d3: 50 push %eax
80483d4: 68 c8 84 04 08 push $0x80484c8
80483d9: e8 f6 fe ff ff call 80482d4 <printf@plt>
// Return 0 and go out.
80483de: 31 c0 xor %eax,%eax
80483e0: c9 leave
80483e1: c3 ret
When should likely() and unlikely() be used? Use likely() when a branch is overwhelmingly the most probable outcome, and unlikely() when a branch is overwhelmingly improbable.