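Memory Barrier Mechanisms and Analysis of the Related Kernel Source Code
Analyst: Yu Xu (余旭)
Version analyzed: Linux Kernel 2.6.14
Source:
Analysis started: 2005-11-17-20:45:56
Analysis finished: 2005-11-21-20:07:32
No.: 2-1
Category: Process Management - Preparation 1 - Memory Barriers
Email: yuxu9710108@163.com
Copyright notice: All rights reserved. Use of this text for any other purpose requires the author's consent. When reproducing it, please credit the author by name.
Copyright (C) 2005 YuXu
*************************************************************
Memory barriers come up again and again in the Linux kernel, so this part studies them specifically. First, I looked up the material already available online and collected it. Second, I went through reference books on the Linux kernel and extracted their main points. Finally, I combined the two and added my own analysis and opinions. The sections below treat each question as a separate topic.
*****************************************************************************
------------------------------------------------------ Special topic: memory barriers --------------------------------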
A follow-up post pointed out: The memory keyword forces the compiler to assume that all memory locations in RAM have been changed by the assembly language instruction; therefore, the compiler cannot optimize the code by using the values of memory locations stored in CPU registers before the asm instruction.
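Putting together the analysis in the posts above, most of the difficulties in the source code of the four macros set_current_state(), __set_current_state(), set_task_state(), __set_task_state() and the three barriers rmb(), wmb(), mb() are now resolved. This part only collects the best points from those posts in order to clear up the difficulties in the code; the systematic, ordered walkthrough of the source code is given later.
--------------------------------------------------------------------------------------------------------------
Breaking through the difficulties around mfence, mb(), wmb() and OOPS
--------------------------------------------------------------------------------------------------------------
1.--->puppy love (zhou_ict@hotmail.com) asked in the "CPU and Compilers" board:
In the Linux kernel, mb() on x86-64 is implemented as ("mfence":::"memory"). I checked the CPU manual, and mfence is used to synchronize instruction execution. The memory clobber that follows seems to be what gcc uses to interfere with instruction scheduling. But I still do not really understand it. Could someone explain? Or is there any documentation you would recommend?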
ANSWER: 1. classpath posted: mfence is a memory barrier supported by hardware, and it only makes sense for shared memory systems.
For example, suppose you have the following code:
    codes1
    mfence
    codes2
mfence and other memory barrier techniques disallow code motion (loads/stores) from codes2 to codes1 done by _hardware_. Some machines, such as the P4, can move loads in codes2 ahead of stores in codes1, which is out-of-order execution.
Another memory barrier is something like ("":::"memory"), which disallows the code motion done by _compiler_. But IMO memory access order is not always guaranteed in this case.
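The difference between the two kinds of barrier described above can be sketched in user space. This is only an illustration under stated assumptions: the macro names compiler_barrier() and full_barrier(), the variables shared_data and shared_flag, and the functions publish(), consume() and demo() are hypothetical and do not come from the kernel. On non-x86 targets the sketch falls back to the GCC builtin __sync_synchronize().

```c
#include <assert.h>

/* Compiler-only barrier: emits no instruction. The "memory" clobber makes
 * GCC assume memory may have changed, so it does not keep values cached in
 * registers across this point and does not move memory accesses over it. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Hardware plus compiler barrier. On x86 this uses mfence; on other
 * targets this sketch assumes the GCC builtin full barrier instead. */
#if defined(__x86_64__) || defined(__i386__)
#define full_barrier() __asm__ __volatile__("mfence" ::: "memory")
#else
#define full_barrier() __sync_synchronize()
#endif

static int shared_data; /* hypothetical payload        */
static int shared_flag; /* hypothetical "ready" marker */

/* codes1 = store the payload, codes2 = set the flag. */
void publish(int value)
{
    shared_data = value;  /* codes1 */
    full_barrier();       /* neither compiler nor CPU may move codes2 above */
    shared_flag = 1;      /* codes2 */
}

/* Returns the payload, or -1 when nothing has been published yet. */
int consume(void)
{
    if (!shared_flag)
        return -1;
    full_barrier();       /* read the flag before reading the payload */
    return shared_data;
}

/* Single-threaded walk through the intended call order. */
void demo(void)
{
    compiler_barrier();   /* only constrains the compiler, costs nothing */
    publish(42);
    assert(consume() == 42);
}
```

Replacing full_barrier() with compiler_barrier() would still stop GCC from reordering the two stores, but would leave the CPU free to reorder where the hardware allows it, which is the point classpath makes.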
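However, there are situations in which we need to know the address of a symbol (or the symbol that corresponds to an address). This is done through the symbol table, much as gdb can give a function name for an address (or an address for a function name). The symbol table is a list of all symbols and their corresponding addresses. Here is an example of a symbol table:
c03441a0 B dmi_broken
c03441a4 B is_sony_vaio_laptop
c03441c0 b dmi_ident
c0344200 b pci_bios_present
c0344204 b pirq_table
c0344208 b pirq_router
c034420c b pirq_router_dev
c0344220 b ascii_buffer
c0344224 b ascii_buf_bytes
You can see that the variable named dmi_broken is located at kernel address c03441a0.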
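4. What happens if I do not have a good System.map?
Suppose you have several kernels on the same machine. Each kernel then needs its own System.map file! If the kernel you booted has no matching System.map, you will periodically see this message:
System.map does not match actual kernel
This is not a fatal error, but it appears annoyingly every time you run ps ax. Some software, such as dosemu, may not work properly. Finally, when a kernel oops occurs, the output of klogd or ksymoops may be unreliable.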
5. My own analysis:
The kernel comment I looked up reads as follows:
-----------------------------------------------asm-i386\system.h--------------------------------------
Kernel comment:
/*
 * Force strict CPU ordering.
 * And yes, this is required on UP too when we're talking
 * to devices.
 *
 * For now, "wmb()" doesn't actually do anything, as all
 * Intel CPU's follow what Intel calls a *Processor Order*,
 * in which all writes are seen in the program order even
 * outside the CPU.
 *
 * I expect future Intel CPU's to have a weaker ordering,
 * but I'd also expect them to finally get their act together
 * and add some real memory barriers if so.
 *
 * Some non intel clones support out of order store. wmb() ceases to be a
 * nop for these.
 */
My analysis:
1. Intel CPUs follow a strict "Processor Order", which already guarantees that writes are seen in program order, so wmb() is defined here as a no-op as far as the CPU is concerned.
2. The kernel developers expect future Intel CPUs to adopt weaker ordering and, if that happens, expect them to add real memory barrier instructions.
3. On non-Intel clone CPUs that support out-of-order stores, wmb() is no longer a no-op.
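The compile-time choice described in the comment above can be sketched as follows. This is a user-space illustration, not the kernel's definition: SKETCH_OOSTORE is a hypothetical stand-in for CONFIG_X86_OOSTORE, and sketch_wmb(), struct ring and ring_put() are names invented for the example.

```c
#include <assert.h>

/* Hypothetical switch standing in for CONFIG_X86_OOSTORE.
 * Leave it undefined to model a CPU with strongly ordered stores. */
/* #define SKETCH_OOSTORE 1 */

#if defined(SKETCH_OOSTORE) && (defined(__x86_64__) || defined(__i386__))
/* Stores may become visible out of order: issue a real store fence. */
#define sketch_wmb() __asm__ __volatile__("sfence" ::: "memory")
#else
/* Stores are seen in program order: only restrain the compiler. */
#define sketch_wmb() __asm__ __volatile__("" ::: "memory")
#endif

/* Hypothetical one-slot mailbox: the slot must be written before tail. */
struct ring {
    int slot;
    int tail;
};

void ring_put(struct ring *r, int v)
{
    assert(r);
    r->slot = v;            /* write the payload first */
    sketch_wmb();           /* keep the two stores in this order */
    r->tail = r->tail + 1;  /* then advertise it */
}
```

With SKETCH_OOSTORE undefined, sketch_wmb() generates no instruction at all, yet the compiler still may not swap the two stores in ring_put().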
-----------------------------------------------------alternative()-----------------------------------------
/*
 * Alternative instructions for different CPU types or capabilities.
 *
 * This allows to use optimized instructions even on generic binary kernels.
 *
 * length of oldinstr must be longer or equal the length of newinstr
 * It can be padded with nops as needed.
 *
 * For non barrier like inlines please define new variants
 * without volatile and memory clobber.
 */
#define alternative(oldinstr, newinstr, feature) \
	asm volatile ("661:\n\t" oldinstr "\n662:\n" \
		".section .altinstructions,\"a\"\n" \
		" .align 4\n" \
		" .long 661b\n" /* label */ \
		" .long 663f\n" /* new instruction */ \
		" .byte %c0\n" /* feature bit */ \
		" .byte 662b-661b\n" /* sourcelen */ \
		" .byte 664f-663f\n" /* replacementlen */ \
		".previous\n" \
		".section .altinstr_replacement,\"ax\"\n" \
		"663:\n\t" newinstr "\n664:\n" /* replacement */ \
		".previous" :: "i" (feature) : "memory")
My analysis:
1. The alternative() macro lets a generic kernel binary use instructions optimized for the CPU it actually runs on. oldinstr is the original instruction, newinstr is the replacement instruction, and feature is the CPU feature bit that enables the replacement.
2. The length of oldinstr must be >= the length of newinstr. Any shortfall is padded with nops.
----------------------------------------------------------------------
/*
 * Force strict CPU ordering.
 * And yes, this is required on UP too when we're talking
 * to devices.
 *
 * For now, "wmb()" doesn't actually do anything, as all
 * Intel CPU's follow what Intel calls a *Processor Order*,
 * in which all writes are seen in the program order even
 * outside the CPU.
 *
 * I expect future Intel CPU's to have a weaker ordering,
 * but I'd also expect them to finally get their act together
 * and add some real memory barriers if so.
 *
 * Some non intel clones support out of order store. wmb() ceases
 * to be a nop for these.
 */
/*
 * Actually only lfence would be needed for mb() because all stores done by the kernel should be already ordered. But keep a full barrier for now.
 */
My analysis:
These kernel comments were already explained earlier. The main point is that Intel CPUs follow Processor Order, which guarantees that writes are seen in program order, so wmb() is defined as a no-op. On non-Intel CPUs that support out-of-order stores, wmb() can no longer be a no-op.
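The patching idea behind alternative() can be illustrated with a small user-space simulation. It is a sketch under assumptions, not the kernel's own patching code: struct alt_entry, apply_alt_entries(), NOP_BYTE and the feature bitmask argument are hypothetical names, and the sketch pads with the one-byte nop 0x90 only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One patch record, loosely modelled on the fields that the macro emits
 * into .altinstructions. All names here are hypothetical. */
struct alt_entry {
    unsigned char *instr;             /* where oldinstr lives (label 661) */
    const unsigned char *replacement; /* where newinstr lives (label 663) */
    unsigned char feature;            /* CPU feature bit that enables it  */
    unsigned char instrlen;           /* 662b - 661b                      */
    unsigned char replacementlen;     /* 664f - 663f                      */
};

#define NOP_BYTE 0x90 /* x86 one-byte nop */

/* Patch every site whose feature bit is set in cpu_features.
 * Returns the number of sites that were patched. */
int apply_alt_entries(struct alt_entry *tbl, size_t n,
                      unsigned long cpu_features)
{
    size_t i;
    int patched = 0;

    for (i = 0; i < n; i++) {
        struct alt_entry *a = &tbl[i];

        /* The replacement must fit into the space of the old code. */
        assert(a->replacementlen <= a->instrlen);

        if (!(cpu_features & (1UL << a->feature)))
            continue; /* CPU lacks the feature: keep oldinstr */

        /* Copy newinstr over oldinstr, then pad the rest with nops. */
        memcpy(a->instr, a->replacement, a->replacementlen);
        memset(a->instr + a->replacementlen, NOP_BYTE,
               (size_t)(a->instrlen - a->replacementlen));
        patched++;
    }
    return patched;
}
```

As a usage example, a 5-byte buffer holding the encoding of lock; addl $0,0(%esp) (F0 83 04 24 00) paired with the 3-byte sfence (0F AE F8) ends up as 0F AE F8 90 90 once the matching feature bit is set, and stays unchanged otherwise. This is why the comment requires oldinstr to be at least as long as newinstr.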
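#ifdef CONFIG_X86_OOSTORE
/* Actually there are no OOO store capable CPUs for now that do SSE,
   but make it already an possibility. */
Author's note (explaining the terms used in the kernel comment):
-->OOO: Out of Order, that is, out-of-order execution.
-->SSE: the CPU instruction set extension that Intel introduced after MMX (several years ago by now), first used in the PIII series.
The meaning of this short kernel comment is that no CPU so far both performs out-of-order stores and supports SSE. Where CONFIG_X86_OOSTORE is left undefined, wmb() is the empty macro that follows; its "memory" clobber, together with __volatile__, stops the compiler from reordering memory accesses across it.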
The Pentium 4, Intel Xeon, P6 family and Pentium processors also guarantee that the following memory operations are always carried out atomically:
1) Reading or writing a quadword aligned on a 64-bit boundary
2) 16-bit accesses to uncached memory locations that fit within a 32-bit data bus
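The alignment condition in item 1 can be checked and enforced from C. The sketch below is illustrative only: is_qword_aligned(), struct stamped and check_layout() are hypothetical names. Note the assumption that alignment is a necessary condition rather than a sufficient one, since a compiler targeting 32-bit x86 may still split a 64-bit access into two 32-bit moves.

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero when p lies on a 64-bit (8-byte) boundary. */
int is_qword_aligned(const void *p)
{
    return ((uintptr_t)p & 7u) == 0;
}

/* The GCC attribute forces 8-byte alignment of the member. Without it,
 * the 32-bit x86 ABI only aligns a 64-bit integer inside a struct to
 * 4 bytes. */
struct stamped {
    char tag;
    uint64_t value __attribute__((aligned(8)));
};

/* Confirms that the quadword member meets the alignment condition. */
void check_layout(void)
{
    struct stamped s;

    assert(is_qword_aligned(&s.value));
}
```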
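11. Strengthening or Weakening the Memory Ordering Model
The IA-32 architecture provides several mechanisms for strengthening or weakening the memory ordering model in order to handle special programming situations. These mechanisms include:
1) I/O instructions, locking instructions, the LOCK prefix, and serializing instructions, which force stronger ordering.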
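As an illustration of the LOCK prefix mentioned in item 1, the sketch below performs a locked increment with GCC inline assembly. locked_inc() is a hypothetical helper, and the fallback to the GCC builtin __sync_fetch_and_add() on non-x86 targets is an assumption of this sketch.

```c
#include <assert.h>

/* Atomically add 1 to *counter.
 * On x86 the LOCK prefix makes the read-modify-write atomic, and other
 * loads and stores are not reordered across the locked instruction. */
void locked_inc(int *counter)
{
    assert(counter);
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("lock; incl %0"
                         : "+m" (*counter)  /* read and written in memory */
                         :
                         : "memory", "cc"); /* also a compiler barrier */
#else
    __sync_fetch_and_add(counter, 1);       /* assumed portable fallback */
#endif
}
```

The "memory" clobber is what additionally keeps the compiler from moving memory accesses across the instruction; the LOCK prefix itself only constrains the hardware.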
The code in this document contains quite a few keywords with underscores, so they are examined here:
--------------------------------------------------------Explanation of the double underscores--------------------------------------
--->Excerpted from the gcc manual
Alternate Keywords
‘-ansi’ and the various ‘-std’ options disable certain keywords. This causes trouble when you want to use GNU C extensions, or a general-purpose header file that should be usable by all programs, including ISO C programs. The keywords asm, typeof and inline are not available in programs compiled with ‘-ansi’ or ‘-std’ (although inline can be used in a program compiled with ‘-std=c99’). The ISO C99 keyword restrict is only available when ‘-std=gnu99’ (which will eventually be the default) or ‘-std=c99’ (or the equivalent ‘-std=iso9899:1999’) is used.
The way to solve these problems is to put ‘__’ at the beginning and end of each problematical keyword. For example, use __asm__ instead of asm, and __inline__ instead of inline.
Other C compilers won’t accept these alternative keywords; if you want to compile with another compiler, you can define the alternate keywords as macros to replace them with the customary keywords. It looks like this:
#ifndef __GNUC__
#define __asm__ asm
#endif
‘-pedantic’ (the pedantic option is explained below) and other options cause warnings for many GNU C extensions. You can prevent such warnings within one expression by writing __extension__ before the expression. __extension__ has no effect aside from this.
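My analysis:
1. Our programs use a lot of GNU style, that is, GNU C extensions or other general-purpose header files. But when a program is compiled with '-ansi' or one of the '-std' options, some keywords, such as asm, typeof and inline, can no longer be used, because these options disable them. The double-underscore forms __asm__, __typeof__ and __inline__ are alternate keywords (not macros) that GCC still accepts under these options. They replace the keywords that would otherwise cause compile problems, so the program compiles normally.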
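A small sketch of the point above: the code below uses only the double-underscore spellings, so it is expected to compile with GCC even under '-ansi'. SWAP(), clamp_low(), sort2() and clamp_sum() are hypothetical names made up for the example.

```c
#include <assert.h>

/* __typeof__ remains available when plain typeof is disabled by -ansi. */
#define SWAP(a, b) \
    do { __typeof__(a) tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* __inline__ instead of inline, which -ansi does not accept. */
static __inline__ int clamp_low(int v, int lo)
{
    return v < lo ? lo : v;
}

/* Orders *x and *y so that *x <= *y. */
void sort2(int *x, int *y)
{
    assert(x && y);
    if (*y < *x)
        SWAP(*x, *y);
    /* __asm__ and __volatile__ instead of asm and volatile:
     * a compiler barrier written with the alternate keywords. */
    __asm__ __volatile__("" ::: "memory");
}

/* Returns a + b, but never less than lo. */
int clamp_sum(int a, int b, int lo)
{
    return clamp_low(a + b, lo);
}
```

Writing the same code with asm, typeof and inline would be rejected under '-ansi', which is why kernel headers prefer the double-underscore forms.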
-----------------------------------------------Explanation of the pedantic option----------------------------------
--->Excerpted from the gcc manual, downloaded from www.gnu.org
Issue all the warnings demanded by strict ISO C and ISO C++; reject all programs that use forbidden extensions, and some other programs that do not follow ISO C and ISO C++. For ISO C, follows the version of the ISO C standard specified by any ‘-std’ option used.
Valid ISO C and ISO C++ programs should compile properly with or without this option (though a rare few will require ‘-ansi’ or a ‘-std’ option specifying the required version of ISO C). However, without this option, certain GNU extensions and traditional C and C++ features are supported as well. With this option, they are rejected.
‘-pedantic’ does not cause warning messages for use of the alternate keywords whose names begin and end with ‘__’. Pedantic warnings are also disabled in the expression that follows __extension__. However, only system header files should use these escape routes; application programs should avoid them. See Section 5.38 [Alternate Keywords], page 271.
Some users try to use ‘-pedantic’ to check programs for strict ISO C conformance. They soon find that it does not do quite what they want: it finds some non-ISO practices, but not all, only those for which ISO C requires a diagnostic, and some others for which diagnostics have been added.
A feature to report any failure to conform to ISO C might be useful in some instances, but would require considerable additional work and would be quite different from ‘-pedantic’. We don’t have plans to support such a feature in the near future.
-----------------------------------------------------------------------------------------------------------------
-------------------- Name: Yu Xu (余旭) Analysis of the Linux Kernel 2.6.14 Source Code