分类: LINUX
2011-12-08 15:03:48
转自:
如果能知道CPU cache行的大小,那么就可以有针对性地设置内存的对齐值,这样可以提高程序的效率。Nginx有分配内存池的接口,Nginx会将内存池边界对齐到 CPU cache行大小。对于这一点真是让人惊诧,我看过memcached的源代码,memcache也用到了内存池,但却没有考虑得这么周到。
Ngx_posix_init.c中ngx_os_init函数调用了ngx_cpuinfo()这句,这个函数便是在获取CPU的信息,根据CPU的型号对ngx_cacheline_size进行设置。函数的代码如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 | /* auto detect the L2 cache line size of modern and widespread CPUs */ void ngx_cpuinfo(void) { u_char *vendor; uint32_t vbuf[5], cpu[4], model; vbuf[0] = 0; vbuf[1] = 0; vbuf[2] = 0; vbuf[3] = 0; vbuf[4] = 0; ngx_cpuid(0, vbuf); vendor = (u_char *) &vbuf[1]; if (vbuf[0] == 0) { return; } ngx_cpuid(1, cpu); //根据CPU的类型选择cache的行的大小 if (ngx_strcmp(vendor, "GenuineIntel") == 0) { switch ((cpu[0] & 0xf00) >> 8) { /* Pentium */ case 5: ngx_cacheline_size = 32; break; /* Pentium Pro, II, III */ case 6: ngx_cacheline_size = 32; model = ((cpu[0] & 0xf0000) >> 8) | (cpu[0] & 0xf0); if (model >= 0xd0) { /* Intel Core, Core 2, Atom */ ngx_cacheline_size = 64; } break; /* * Pentium 4, although its cache line size is 64 bytes, * it prefetches up to two cache lines during memory read */ case 15: ngx_cacheline_size = 128; break; } } else if (ngx_strcmp(vendor, "AuthenticAMD") == 0) { ngx_cacheline_size = 64; } } |
代码两次调用了ngx_cpuid函数,该函数实现代码如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 | #if ( __i386__ ) static ngx_inline void ngx_cpuid(uint32_t i, uint32_t *buf) { /* * we could not use %ebx as output parameter if gcc builds PIC, * and we could not save %ebx on stack, because %esp is used, * when the -fomit-frame-pointer optimization is specified. */ __asm__ ( " mov %%ebx, %%esi; " " cpuid; " " mov %%eax, (%1); " " mov %%ebx, 4(%1); " " mov %%edx, 8(%1); " " mov %%ecx, 12(%1); " " mov %%esi, %%ebx; " : : "a" (i), "D" (buf) : "ecx", "edx", "esi", "memory" ); } #else /* __amd64__ */ static ngx_inline void ngx_cpuid(uint32_t i, uint32_t *buf) { uint32_t eax, ebx, ecx, edx; __asm__ ( "cpuid" : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx) : "a" (i) ); buf[0] = eax; buf[1] = ebx; buf[2] = edx; buf[3] = ecx; } #endif |
如果是非AMD64架构的CPU那么就应该进入#if语句块,其中又涉及到C语言调用汇编的代码了。x86 CPU提供cpuid指令用于获取CPU的信息,在网上找到一张图片,展示了cpuid的使用。
当eax为0时,可以得到CPU的Vendor ID;当eax为1时,可以得到CPU更详细的信息。