The Implementation of spin_lock in Linux
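spin_lock is one of the Linux kernel's synchronization mechanisms. Kernel code claims ownership of a resource by acquiring a spin_lock and keeps it until the lock is released. If kernel code tries to acquire a spin_lock that is already held, it busy-waits until it obtains the lock.
The kernel implements spin_lock differently for single-core (UP) and multi-core (SMP) systems. On UP, as long as the spin_lock is not used in interrupt context, code holding the lock can lose the CPU only through kernel preemption. So on UP it is enough to disable preemption when the lock is taken and to re-enable it when the lock is released. On SMP, two pieces of code can run on different cores at the same time, and only then is a real lock needed to declare ownership of the resource.
The file include/linux/spinlock.h lists the different header files involved in UP and SMP builds, which makes the difference between the two implementations very clear.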
/*
 * include/linux/spinlock.h - generic spinlock/rwlock declarations
 *
 * here's the role of the various spinlock/rwlock related include files:
 *
 * on SMP builds:
 *
 *  asm/spinlock_types.h: contains the arch_spinlock_t/arch_rwlock_t and the
 *                        initializers
 *
 *  linux/spinlock_types.h:
 *                        defines the generic type and initializers
 *
 *  asm/spinlock.h:       contains the arch_spin_*()/etc. lowlevel
 *                        implementations, mostly inline assembly code
 *
 *   (also included on UP-debug builds:)
 *
 *  linux/spinlock_api_smp.h:
 *                        contains the prototypes for the _spin_*() APIs.
 *
 *  linux/spinlock.h:     builds the final spin_*() APIs.
 *
 * on UP builds:
 *
 *  linux/spinlock_type_up.h:
 *                        contains the generic, simplified UP spinlock type.
 *                        (which is an empty structure on non-debug builds)
 *
 *  linux/spinlock_types.h:
 *                        defines the generic type and initializers
 *
 *  linux/spinlock_up.h:
 *                        contains the arch_spin_*()/etc. version of UP
 *                        builds. (which are NOPs on non-debug, non-preempt
 *                        builds)
 *
 *   (included on UP-non-debug builds:)
 *
 *  linux/spinlock_api_up.h:
 *                        builds the _spin_*() APIs.
 *
 *  linux/spinlock.h:     builds the final spin_*() APIs.
 */
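The code below shows that UP and SMP are distinguished by the CONFIG_SMP option (CONFIG_DEBUG_SPINLOCK also selects the SMP header), so that different header files are compiled in.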
/*
 * Pull the _spin_*()/_read_*()/_write_*() functions/declarations:
 */
#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
# include <linux/spinlock_api_smp.h>
#else
# include <linux/spinlock_api_up.h>
#endif

static inline void spin_lock(spinlock_t *lock)
{
	raw_spin_lock(&lock->rlock);
}

#define raw_spin_lock(lock)	_raw_spin_lock(lock)
The implementation of spin_lock on UP
The implementation is in include/linux/spinlock_api_up.h:
/*
 * In the UP-nondebug case there's no real locking going on, so the
 * only thing we have to do is to keep the preempt counts and irq
 * flags straight, to suppress compiler warnings of unused lock
 * variables, and to add the proper checker annotations:
 */
#define __LOCK(lock) \
	do { \
		preempt_disable(); \
		__acquire(lock); \
		(void)(lock); \
	} while (0)

#define _raw_spin_lock(lock)	__LOCK(lock)
The code shows that on UP, spin_lock effectively reduces to three statements:
preempt_disable();
__acquire(lock);
(void)(lock);
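preempt_disable() increments the current task's preempt_count, which disables kernel preemption, so no reschedule takes place when the kernel returns from interrupt context.
__acquire(lock) is an annotation that only matters when the code is checked with the sparse tool. In a normal build the macro expands to nothing.
Passing C=1 or C=2 to make runs sparse during the build: C=1 checks only the files being recompiled, C=2 checks all source files.
(void)(lock) is there only to keep the compiler from warning that lock is unused.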
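The three statements can be mimicked in user space to make the bookkeeping visible. The following is a minimal sketch, not kernel code: model_preempt_count, model_spinlock_t, MODEL_LOCK and MODEL_UNLOCK are hypothetical names invented for illustration, and the unlock side is assumed to mirror the lock side.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's per-task preempt count. */
static int model_preempt_count;

static void model_preempt_disable(void) { model_preempt_count++; }
static void model_preempt_enable(void)  { model_preempt_count--; }

/* Only a sparse run defines __CHECKER__; otherwise the annotations vanish. */
#ifdef __CHECKER__
# define model_acquire(x) __context__(x, 1)
# define model_release(x) __context__(x, -1)
#else
# define model_acquire(x) (void)0
# define model_release(x) (void)0
#endif

/* The lock carries no state in this model; the dummy member keeps ISO C happy. */
typedef struct { char unused; } model_spinlock_t;

#define MODEL_LOCK(lock) \
	do { model_preempt_disable(); model_acquire(lock); (void)(lock); } while (0)

#define MODEL_UNLOCK(lock) \
	do { model_preempt_enable(); model_release(lock); (void)(lock); } while (0)

/* Returns the preempt count observed while the "lock" is held. */
static int model_count_inside_critical_section(void)
{
	model_spinlock_t lock = { 0 };
	int seen;

	MODEL_LOCK(&lock);
	seen = model_preempt_count;	/* preemption counts as disabled here */
	MODEL_UNLOCK(&lock);
	return seen;
}
```

Calling model_count_inside_critical_section() shows that the only observable effect of taking the lock in this model is the counter going up and coming back down; the lock object itself is never written.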
The implementation of spin_lock on SMP
The implementation is in include/linux/spinlock_api_smp.h:
static inline void __raw_spin_lock(raw_spinlock_t *lock)
{
	preempt_disable();
	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
	LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
}
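Likewise, the SMP implementation breaks down into three statements.
preempt_disable() needs no further explanation.
spin_acquire() is an annotation for the kernel's lock validator (lockdep) rather than for sparse. It compiles to nothing when lock debugging is not configured.
LOCK_CONTENDED() is a macro. Leaving CONFIG_LOCK_STAT aside (that option collects statistics on lock operations), it is defined as: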
#define LOCK_CONTENDED(_lock, try, lock) \
	lock(_lock)
so the third statement is equivalent to:
do_raw_spin_lock(lock)
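The expansion can be tried out in user space. The sketch below is a simplified model: model_lock_t, model_trylock, model_lock and contended_events are hypothetical, and the statistics variant is an assumption about the general shape of such a macro (try first, count a contention event on failure, then take the lock). It does not reproduce the kernel's real lock-stat hooks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock: "someone else" holds it for busy_polls more polls. */
typedef struct { int busy_polls; } model_lock_t;

static int contended_events;	/* what a statistics build might record */

static bool model_trylock(model_lock_t *l)
{
	if (l->busy_polls > 0) {
		l->busy_polls--;	/* the other holder gets closer to releasing */
		return false;
	}
	return true;
}

static void model_lock(model_lock_t *l)
{
	while (!model_trylock(l))
		;			/* busy-wait until the lock is free */
}

/* Plain variant: the try argument is ignored, only lock() is called. */
#define LOCK_CONTENDED_PLAIN(_lock, try, lock) \
	lock(_lock)

/* Assumed statistics variant: try first, count contention, then lock. */
#define LOCK_CONTENDED_STAT(_lock, try, lock) \
	do { \
		if (!try(_lock)) { \
			contended_events++; \
			lock(_lock); \
		} \
	} while (0)

/* Number of contention events recorded for one acquisition. */
static int model_contention_count(int busy_polls)
{
	model_lock_t l = { busy_polls };

	contended_events = 0;
	LOCK_CONTENDED_STAT(&l, model_trylock, model_lock);
	return contended_events;
}

/* Polls left after a plain acquisition; 0 once the lock is obtained. */
static int model_plain_polls_left(int busy_polls)
{
	model_lock_t l = { busy_polls };

	LOCK_CONTENDED_PLAIN(&l, model_trylock, model_lock);
	return l.busy_polls;
}
```

With the plain variant the trylock function is passed in but never called, which is why the third statement collapses to a direct call of the lock function.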
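do_raw_spin_lock() can be traced in spinlock.h. In non-debug builds it wraps arch_spin_lock(), in the same way that its trylock counterpart wraps arch_spin_trylock():
static inline int do_raw_spin_trylock(raw_spinlock_t *lock)
{
	return arch_spin_trylock(&(lock)->raw_lock);
}
The arch prefix tells us that this function is architecture-specific. The following sections analyze its implementation on the ARM and x86 architectures.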
The implementation of spin_lock on ARM
static inline void arch_spin_lock(arch_spinlock_t *lock)
{
	unsigned long tmp;

	__asm__ __volatile__(
	/* Load the value at &lock->lock, i.e. lock->lock, into tmp and
	 * mark &lock->lock for exclusive access. */
"1:	ldrex	%0, [%1]\n"
	/* Test whether tmp is 0. */
"	teq	%0, #0\n"
	/* Non-zero means the lock is held: execute WFE and wait in a
	 * low-power state until the SEV issued when the lock is released
	 * wakes the CPU up again. */
	WFE("ne")
	/* The lock was free: try to store 1 into lock->lock and clear the
	 * exclusive mark; tmp receives the status of the store. */
"	strexeq	%0, %2, [%1]\n"
	/* If the store succeeded, tmp is 0 and the lock is acquired. */
"	teqeq	%0, #0\n"
	/* Otherwise tmp is non-zero: jump back to label 1 and try again. */
"	bne	1b"
	: "=&r" (tmp)
	: "r" (&lock->lock), "r" (1)
	: "cc");

	smp_mb();
}
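The code is a piece of inline assembly. tmp is the output operand, kept in a register and written as %0 in the code. &lock->lock is the first input operand, kept in a register and written as %1. The constant 1 is the second input operand, kept in a register and written as %2.
The key instructions are ldrex/strex and WFE. Because lock->lock lives in memory, taking the lock means reading memory, checking the value and writing 1 back. If this sequence were not atomic, another core could access lock->lock in the middle of it and cause errors. ldrex/strex are instructions ARM added in ARMv6 for exclusive access to a memory location. WFE lets the CPU wait in a low-power state while it is spinning, which saves power.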
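The same load, check and conditional-store loop can be written portably with C11 atomics. This is a user-space sketch and only an analogy for the ARM code above: model_arch_spinlock_t and the model_* functions are hypothetical names, the compare-exchange stands in for the ldrex/strex pair, and nothing here corresponds to WFE/SEV, so waiters simply spin.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical lock word: 0 = free, 1 = held. */
typedef struct { atomic_uint lock; } model_arch_spinlock_t;

static void model_arch_spin_lock(model_arch_spinlock_t *l)
{
	for (;;) {
		unsigned int expected = 0;

		/* Writes 1 only if the word is still 0. The weak form may
		 * fail spuriously, much like a strex that lost its
		 * exclusive reservation, so the loop simply retries. */
		if (atomic_compare_exchange_weak_explicit(&l->lock, &expected, 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return;
	}
}

/* Single attempt: returns 1 if the lock was taken, 0 if it was held. */
static int model_arch_spin_trylock(model_arch_spinlock_t *l)
{
	unsigned int expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->lock, &expected, 1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

static void model_arch_spin_unlock(model_arch_spinlock_t *l)
{
	/* The release store publishes the critical section's writes. */
	atomic_store_explicit(&l->lock, 0, memory_order_release);
}
```

The acquire and release orderings play the role that smp_mb() plays after the assembly loop, although a full barrier is stronger than what this sketch requests.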
The implementation of spin_lock on x86
On x86 the implementation is in arch/x86/include/asm/spinlock.h:
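static __always_inline void __ticket_spin_lock(arch_spinlock_t *lock)
{
	short inc = 0x0100;

	asm volatile (
		/* On SMP kernels LOCK_PREFIX expands to "\n\tlock". */
		LOCK_PREFIX "xaddw %w0, %1\n"
		"1:\t"
		"cmpb %h0, %b0\n\t"
		"je 2f\n\t"
		"rep ; nop\n\t"
		"movb %1, %b0\n\t"
		/* don't need lfence here, because loads are in-order */
		"jmp 1b\n"
		"2:"
		: "+Q" (inc), "+m" (lock->slock)
		:
		: "memory", "cc");
}
lock is an instruction prefix. For the duration of the instruction that follows it, the LOCK signal is asserted and the memory the instruction accesses is accessed exclusively. The hardware achieves this either by locking the bus or through cache-coherency operations; see section 8.1 of the Intel System Programming Guide.
This is the newest implementation, known as the ticket lock. Every piece of code that wants the lock gets a ticket, and tickets are handed out in increasing order. The lock itself keeps two one-byte fields: owner, the ticket number currently using the lock, and next, the ticket number of the next user. When the lock is free, owner = next. When it is held and nobody is waiting, next = owner + 1. Acquiring the lock increments next and releasing it increments owner. In the code, xaddw adds 0x0100 to slock, which increments next in the high byte and leaves the old value in inc; the loop then compares the two bytes of inc, reloading owner from slock, until owner equals the ticket that was taken.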
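The ticket scheme can also be sketched in portable C11. This is a user-space illustration under stated assumptions, not the kernel's code: model_ticket_lock_t and its functions are hypothetical, and next and owner are kept in two separate unsigned words instead of the two bytes packed into slock.

```c
#include <assert.h>
#include <stdatomic.h>

typedef struct {
	atomic_uint next;	/* next ticket to hand out */
	atomic_uint owner;	/* ticket currently allowed to hold the lock */
} model_ticket_lock_t;

static void model_ticket_init(model_ticket_lock_t *l)
{
	atomic_init(&l->next, 0);
	atomic_init(&l->owner, 0);
}

/* Takes the lock and returns the ticket that was drawn. */
static unsigned int model_ticket_lock(model_ticket_lock_t *l)
{
	/* Atomically draw a ticket; this plays the role of "lock xaddw". */
	unsigned int ticket =
		atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);

	/* Spin until our ticket is served. The kernel executes
	 * "rep ; nop" (pause) inside the corresponding loop. */
	while (atomic_load_explicit(&l->owner, memory_order_acquire) != ticket)
		;
	return ticket;
}

static void model_ticket_unlock(model_ticket_lock_t *l)
{
	/* Serve the next ticket; release ordering publishes our writes. */
	atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
}
```

Because tickets are served strictly in the order they were drawn, waiters obtain the lock first come, first served, which is the property the ticket design is after.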