2014-08-18 20:22:09


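  1. I. Overview
  2. In a Linux system, suppose two pieces of code (for example two functions F1 and F2 running in different threads) both need to acquire two locks, L1 and L2. If F1 holds L1 and then tries to acquire L2 while F2 holds L2 and is trying to acquire L1, the two are deadlocked. This is the simplest example of a deadlock, the so-called AB-BA deadlock.

  3. The consequences of a deadlock need no elaboration, and textbooks describe how to avoid one: the simplest and most intuitive approach is to always take locks in a fixed order, which breaks the circular-wait condition. For a whole system with thousands of locks, however, fully defining an order among them is very hard, so a more practical approach is to discover potential deadlock risks as early as possible instead of waiting until a real deadlock hits users.
  4. Many tools exist for finding potential deadlock risks, and the debugging/detection module described in this article, lockdep, is one of them. lockdep was merged into the kernel in 2006 and has proven in practice to be very effective at catching deadlocks early.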
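
To make the "take locks in a fixed order" advice concrete, here is a minimal user-space sketch. It assumes POSIX mutexes and uses the lock address as the global order; the helpers first_in_order(), lock_pair() and unlock_pair() are hypothetical names made up for this illustration and have nothing to do with lockdep itself.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical helper: pick which of two mutexes comes first in the
 * global order. The order used here is simply "lowest address first". */
static pthread_mutex_t *first_in_order(pthread_mutex_t *a, pthread_mutex_t *b)
{
    return ((uintptr_t)a <= (uintptr_t)b) ? a : b;
}

/* Lock both mutexes in the global order, whatever order the caller
 * passed them in, so two callers can never hold them in opposite orders. */
static void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_t *first = first_in_order(a, b);
    pthread_mutex_t *second = (first == a) ? b : a;

    pthread_mutex_lock(first);
    if (second != first)
        pthread_mutex_lock(second);
}

/* Release both mutexes; unlock order does not matter for deadlocks. */
static void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    if (a != b)
        pthread_mutex_unlock(b);
}
```

With this, F1 calling lock_pair(&L1, &L2) and F2 calling lock_pair(&L2, &L1) end up taking the two mutexes in the same order. It works for two locks, but it also shows why the approach does not scale to thousands of locks spread over unrelated subsystems.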

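  5. The official documentation describes the design of the lockdep debugging module; what follows is my own understanding of it.
  6. 1. The basic unit lockdep operates on is not an individual lock instance but a lock class. For example, the spinlock field i_lock in struct inode represents one class of lock, and the lock in each individual inode is just one instance of that class. lockdep treats all of these instances as a whole, that is, it coarsens the granularity of the check. Checking thousands of instances one by one would be far too costly, and it is not necessary anyway. In some specific situations lockdep may still record part of the information about a particular instance to help with later troubleshooting.
  7. 2. lockdep tracks the state of each lock class as well as the dependencies between lock classes, and applies a set of validation rules to ensure that the class states and the inter-class dependencies are always correct. In addition, once a lock class has been registered on first use it stays around from then on, and all of its instances are associated with it.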
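
The class/instance split can be sketched in a few lines of plain C. This is a toy model under my own assumptions, not kernel code: struct toy_class_key, struct toy_lock, toy_lock_init() and toy_inode_init() are invented names. The idea it illustrates is the one used by the static struct lock_class_key in the sema_init() listing later in this article: each initialization site owns one static key, and every lock initialized there shares it.

```c
/* Toy stand-ins, not the kernel's types. */
struct toy_class_key {
    char unused;                      /* only the address matters */
};

struct toy_lock {
    const struct toy_class_key *key;  /* identifies the lock class */
    int locked;
};

/* Every expansion of this macro owns one static key, so all locks
 * initialized at that source location end up in the same class. */
#define toy_lock_init(lock)                        \
    do {                                           \
        static struct toy_class_key site_key;      \
        (lock)->key = &site_key;                   \
        (lock)->locked = 0;                        \
    } while (0)

struct toy_inode {
    struct toy_lock i_lock;
};

/* One init site for every inode: one class, many instances. */
static void toy_inode_init(struct toy_inode *inode)
{
    toy_lock_init(&inode->i_lock);
}

/* Two lock instances belong to the same class if they share a key. */
static int same_class(const struct toy_lock *a, const struct toy_lock *b)
{
    return a->key == b->key;
}
```

Two inodes set up through toy_inode_init() report same_class() as true, while a lock initialized from any other call site gets a different key and therefore a different class.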

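  8. lockdep is a debugging module of the Linux kernel that checks the kernel's mutual-exclusion mechanisms, spinlocks in particular, for potential deadlocks. Because a spinlock waits by polling and does not give up the processor, it deadlocks more easily than other mutual-exclusion mechanisms, so lockdep was introduced to check for possible deadlocks in the following situations.
  9. 1. The same task recursively acquires the same lock.
  10. 2. A lock has been acquired with interrupts (or bottom halves) enabled and has also been acquired inside an interrupt (or bottom-half) handler. An interrupt can then arrive while the lock is held and try to take the same lock again on the same processor, which closes a cycle in the dependency graph. This is a classic deadlock.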

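  11. II. lockdep validation rules
  12. (1) Single-lock state rules
  13. 1. A softirq-unsafe lock class is automatically hardirq-unsafe as well.
  14. 2. No lock class can be both hardirq-safe and hardirq-unsafe, nor both softirq-safe and softirq-unsafe. Each of these two pairs of states is mutually exclusive.
  15. These two rules are what lockdep uses to decide whether a single lock can deadlock.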
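
The two single-lock rules above can be sketched as a bitmask check. This is only an illustration under my own assumptions: the TOY_* bit names and toy_mark_usage() are invented here, and the real validator tracks more states than these four.

```c
/* Toy usage bits; the names are illustrative, not the kernel's enum. */
enum {
    TOY_USED_IN_HARDIRQ = 1 << 0,  /* hardirq-safe: taken in hardirq context */
    TOY_ENABLED_HARDIRQ = 1 << 1,  /* hardirq-unsafe: taken with hardirqs on */
    TOY_USED_IN_SOFTIRQ = 1 << 2,  /* softirq-safe: taken in softirq context */
    TOY_ENABLED_SOFTIRQ = 1 << 3   /* softirq-unsafe: taken with softirqs on */
};

/* Record one more usage of a lock class. Returns 1 if the class state
 * is still consistent, 0 if a safe/unsafe pair now collides. */
static int toy_mark_usage(unsigned int *mask, unsigned int bit)
{
    *mask |= bit;

    /* Rule 1: softirq-unsafe implies hardirq-unsafe. */
    if (*mask & TOY_ENABLED_SOFTIRQ)
        *mask |= TOY_ENABLED_HARDIRQ;

    /* Rule 2: each safe/unsafe pair is mutually exclusive. */
    if ((*mask & TOY_USED_IN_HARDIRQ) && (*mask & TOY_ENABLED_HARDIRQ))
        return 0;
    if ((*mask & TOY_USED_IN_SOFTIRQ) && (*mask & TOY_ENABLED_SOFTIRQ))
        return 0;
    return 1;
}
```

A class that is first marked TOY_USED_IN_HARDIRQ and later TOY_ENABLED_HARDIRQ fails the check at the second call, which corresponds to the moment a report would be printed.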
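  16. (2) Multi-lock dependency rules
  17. 1. The same lock class must not be acquired twice, because that can lead to a recursive deadlock.
  18. 2. Two lock classes must not be acquired in different orders, that is: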
  19. <L1> -> <L2>
  20. <L2> -> <L1>
  21. is not allowed, because it very easily leads to the AB-BA deadlock described at the start of this article. The following is not allowed either:
  22. <L1> -> <L3> -> <L4> -> <L2>
  23. <L2> -> <L3> -> <L4> -> <L1>
  24. That is, lockdep still detects the inversion when other, correctly ordered locks are inserted in between.
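  25. 3. The following dependencies must not occur between any two lock classes:
  26. <hardirq-safe> -> <hardirq-unsafe>
  27. <softirq-safe> -> <softirq-unsafe>
  28. That is, a hardirq-unsafe lock class (one that is acquired somewhere with hardirqs enabled, for example with plain spin_lock()) must not be acquired while holding a hardirq-safe lock class (one that is acquired somewhere in hardirq context). Otherwise a hardirq can interrupt the holder of the hardirq-unsafe lock, and its handler can then wait for the hardirq-safe lock, whose holder on another CPU is in turn waiting for the hardirq-unsafe lock. The single-lock case is easier to picture: a task holds lock A in process context with hardirqs enabled, a hardirq fires on that CPU, and the handler tries to acquire lock A, which deadlocks.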
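  29. Whenever the state of a lock class changes, these rules are checked to decide whether a potential deadlock exists. The check is fairly simple: it tests whether hardirq-safe collides with hardirq-unsafe, or softirq-safe with softirq-unsafe.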
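
Rules 1 and 2 of the multi-lock dependency rules amount to keeping the graph of "held -> acquired" edges between lock classes free of cycles. Here is a minimal sketch, assuming a tiny fixed number of classes and an adjacency matrix; toy_dep, toy_reachable() and toy_add_dependency() are hypothetical names, and the real validator stores its graph in the locks_after/locks_before lists and caches results.

```c
#include <string.h>

#define TOY_MAX_CLASSES 8

/* toy_dep[a][b] != 0 records that class b was acquired while a was held. */
static unsigned char toy_dep[TOY_MAX_CLASSES][TOY_MAX_CLASSES];

/* Depth-first search: can class 'to' be reached from class 'from'? */
static int toy_reachable(int from, int to, unsigned char *seen)
{
    int next;

    if (from == to)
        return 1;
    seen[from] = 1;
    for (next = 0; next < TOY_MAX_CLASSES; next++)
        if (toy_dep[from][next] && !seen[next] &&
            toy_reachable(next, to, seen))
            return 1;
    return 0;
}

/* Try to record "held -> acquired". Returns 0 and records nothing if
 * the new edge would close a cycle (held == acquired, the recursive
 * case, counts as a cycle too); returns 1 otherwise. */
static int toy_add_dependency(int held, int acquired)
{
    unsigned char seen[TOY_MAX_CLASSES];

    memset(seen, 0, sizeof(seen));
    if (toy_reachable(acquired, held, seen))
        return 0;
    toy_dep[held][acquired] = 1;
    return 1;
}
```

Recording L1 -> L3, L3 -> L4 and L4 -> L2 succeeds, and a later attempt to record L2 -> L3 is rejected because L3 already leads back to L2. No deadlock has to actually happen for the inversion to be noticed.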

  30. III. Related structures
  31. 1. struct held_lock
  32. Each task's task_struct defines a member struct held_lock held_locks[MAX_LOCK_DEPTH], which records the locks the task currently holds.
  33. struct held_lock {
  34.     /*
  35.      * One-way hash of the dependency chain up to this point. We
  36.      * hash the hashes step by step as the dependency chain grows.
  37.      *
  38.      * We use it for dependency-caching and we skip detection
  39.      * passes and dependency-updates if there is a cache-hit, so
  40.      * it is absolutely critical for 100% coverage of the validator
  41.      * to have a unique key value for every unique dependency path
  42.      * that can occur in the system, to make a unique hash value
  43.      * as likely as possible - hence the 64-bit width.
  44.      *
  45.      * The task struct holds the current hash value (initialized
  46.      * with zero), here we store the previous hash value:
  47.      */
  48.     u64 prev_chain_key;
  49.     unsigned long acquire_ip;
  50.     struct lockdep_map *instance;
  51.     struct lockdep_map *nest_lock;
  52. #ifdef CONFIG_LOCK_STAT
  53.     u64 waittime_stamp;
  54.     u64 holdtime_stamp;
  55. #endif
  56.     unsigned int class_idx:MAX_LOCKDEP_KEYS_BITS;
  57.     /*
  58.      * The lock-stack is unified in that the lock chains of interrupt
  59.      * contexts nest ontop of process context chains, but we 'separate'
  60.      * the hashes by starting with 0 if we cross into an interrupt
  61.      * context, and we also do not add cross-context lock
  62.      * dependencies - the lock usage graph walking covers that area
  63.      * anyway, and we'd just unnecessarily increase the number of
  64.      * dependencies otherwise. [Note: hardirq and softirq contexts
  65.      * are separated from each other too.]
  66.      *
  67.      * The following field is used to detect when we cross into an
  68.      * interrupt context:
  69.      */
  70.     unsigned int irq_context:2; /* bit 0 - soft, bit 1 - hard */
  71.     unsigned int trylock:1; /* 16 bits */
  72.     
  73.     unsigned int read:2; /* see lock_acquire() comment */
  74.     unsigned int check:2; /* see lock_acquire() comment */
  75.     unsigned int hardirqs_off:1;
  76.     unsigned int references:11; /* 32 bits */
  77. };
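
The role of prev_chain_key is easier to see in a small model of the per-task lock stack. The sketch below is my own illustration: struct toy_held_lock, struct toy_task, toy_push() and toy_pop() are invented, the depth limit of 48 is an assumption, and toy_mix() is an FNV-1a style step that I picked for simplicity rather than the mixing function the kernel actually uses.

```c
#include <stdint.h>

#define TOY_MAX_DEPTH 48       /* assumed depth limit for this sketch */

struct toy_held_lock {
    uint64_t prev_chain_key;   /* chain key before this lock was pushed */
    unsigned int class_idx;
};

struct toy_task {
    struct toy_held_lock held[TOY_MAX_DEPTH];
    int depth;
    uint64_t curr_chain_key;   /* starts at zero */
};

/* Assumed mixing step (FNV-1a style), not the kernel's function. */
static uint64_t toy_mix(uint64_t key, unsigned int class_idx)
{
    return (key ^ class_idx) * 1099511628211ULL;
}

/* Acquire: save the old key in the new entry, then extend the chain. */
static int toy_push(struct toy_task *t, unsigned int class_idx)
{
    if (t->depth >= TOY_MAX_DEPTH)
        return 0;
    t->held[t->depth].prev_chain_key = t->curr_chain_key;
    t->held[t->depth].class_idx = class_idx;
    t->depth++;
    t->curr_chain_key = toy_mix(t->curr_chain_key, class_idx);
    return 1;
}

/* Release the most recently acquired lock and restore the saved key. */
static int toy_pop(struct toy_task *t)
{
    if (t->depth == 0)
        return 0;
    t->depth--;
    t->curr_chain_key = t->held[t->depth].prev_chain_key;
    return 1;
}
```

Because every entry remembers the key that was current before it was pushed, releasing a lock only has to copy that value back. The same sequence of classes always yields the same key, which is what makes the key usable as a cache index for chains that have already been validated.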

  78. 2. lockdep_map
  79. Lock structures such as mutex, raw_spinlock and semaphore embed this structure so that lockdep can check them.
  80. struct lockdep_map {
  81.     struct lock_class_key *key;
  82.     struct lock_class *class_cache[NR_LOCKDEP_CACHING_CLASSES];
  83.     const char *name;
  84. #ifdef CONFIG_LOCK_STAT
  85.     int cpu;    // CPU number at the time the structure was initialized
  86.     unsigned long ip;
  87. #endif
  88. };

  89. 3.lock_class
  90. struct lock_class {
  91.     struct list_head hash_entry;
  92.     struct list_head lock_entry;
  93.     
  94.     struct lockdep_subclass_key *key;
  95.     unsigned int subclass;
  96.     unsigned int dep_gen_id;

  97.     unsigned long usage_mask;
  98.     struct stack_trace usage_traces[XXX_LOCK_USAGE_STATES];

  99.     struct list_head locks_after, locks_before;
  100.     unsigned int version;
  101.     unsigned long ops;
  102.         
  103.     const char *name;
  104.     int name_version;

  105. #ifdef CONFIG_LOCK_STAT
  106.     unsigned long contention_point[LOCKSTAT_POINTS];
  107.     unsigned long contending_point[LOCKSTAT_POINTS];
  108. #endif
  109. };

  110. 4.lock_class_key
  111. struct lock_class_key {
  112.     struct lockdep_subclass_key subkeys[MAX_LOCKDEP_SUBCLASSES];
  113. };

  114. 5.lockdep_subclass_key
  115. struct lockdep_subclass_key {
  116.     char __one_byte;
  117. } __attribute__ ((__packed__));
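
The one-byte subkeys look odd until you notice that only their addresses are used. As far as I understand it, the class identity is derived from the address of subkeys[subclass] inside the key, so one static key yields a distinct identity per subclass. A minimal sketch of that idea follows; the toy_* names and the array size of 8 are assumptions of this example.

```c
#include <stddef.h>

#define TOY_MAX_SUBCLASSES 8   /* assumed size for this sketch */

struct toy_subclass_key {
    char one_byte;             /* never read; only its address is used */
};

struct toy_class_key {
    struct toy_subclass_key subkeys[TOY_MAX_SUBCLASSES];
};

/* Address that identifies (key, subclass); NULL if out of range. */
static const struct toy_subclass_key *
toy_subclass_id(const struct toy_class_key *key, unsigned int subclass)
{
    if (subclass >= TOY_MAX_SUBCLASSES)
        return NULL;
    return key->subkeys + subclass;
}
```

Subclass 0 maps to the address of the key itself, and every further subclass is one byte further in, so identities derived from different static keys cannot collide.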


  118. IV. lockdep initialization
  119. This sets up two hash tables, classhash_table and chainhash_table, and then sets the global variable lockdep_initialized to mark initialization as complete.
  120. static struct list_head classhash_table[CLASSHASH_SIZE];
  121. static struct list_head chainhash_table[CHAINHASH_SIZE];
  122. void lockdep_init(void)
  123. {
  124.     int i;

  125.     if (lockdep_initialized)
  126.         return;
  127.     
  128.     for (i = 0; i < CLASSHASH_SIZE; i++)
  129.         INIT_LIST_HEAD(classhash_table + i);
  130.     
  131.     for (i = 0; i < CHAINHASH_SIZE; i++)
  132.         INIT_LIST_HEAD(chainhash_table + i);
  133.     
  134.     lockdep_initialized = 1;
  135. }
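
To show what a table like classhash_table is for, here is a small user-space sketch of "look up a class by key, register it on first use". It is a simplification under my own assumptions: singly linked chains stand in for struct list_head, toy_hash() is an arbitrary pointer hash, and all names are invented. Zero-initialized static storage plays the role of the INIT_LIST_HEAD loops above.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_HASH_SIZE   16
#define TOY_MAX_CLASSES 32

struct toy_class {
    const void *key;
    struct toy_class *next;      /* chain within one bucket */
};

/* Static storage is zeroed, so every bucket starts out empty. */
static struct toy_class *toy_buckets[TOY_HASH_SIZE];
static struct toy_class toy_classes[TOY_MAX_CLASSES];
static int toy_nr_classes;

/* Arbitrary pointer hash chosen for this sketch. */
static unsigned int toy_hash(const void *key)
{
    return (unsigned int)(((uintptr_t)key >> 4) % TOY_HASH_SIZE);
}

/* Return the class for 'key', registering it on first use.
 * Returns NULL when the fixed-size class array is full. */
static struct toy_class *toy_lookup_or_register(const void *key)
{
    struct toy_class **head = &toy_buckets[toy_hash(key)];
    struct toy_class *c;

    for (c = *head; c; c = c->next)
        if (c->key == key)
            return c;

    if (toy_nr_classes >= TOY_MAX_CLASSES)
        return NULL;
    c = &toy_classes[toy_nr_classes++];
    c->key = key;
    c->next = *head;
    *head = c;
    return c;
}
```

Calling toy_lookup_or_register() twice with the same key returns the same class, which matches the point made earlier that a class persists once it has been registered.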

  136. V. Interfaces provided
  137. 1. lockdep_init_map
  138. Initializes the lockdep_map structure embedded in a lock.
  139. static inline void sema_init(struct semaphore *sem, int val)
  140. {
  141.     static struct lock_class_key __key;
  142.     *sem = (struct semaphore) __SEMAPHORE_INITIALIZER(*sem, val);
  143.     lockdep_init_map(&sem->lock.dep_map, "semaphore->lock", &__key, 0);
  144. }

  145. void lockdep_init_map(struct lockdep_map *lock, const char *name,struct lock_class_key *key, int subclass)
  146. {
  147.     int i;
  148.     
  149.     // an empty function on ARM
  150.     kmemcheck_mark_initialized(lock, sizeof(*lock));
  151.     
  152.     // clear the class_cache member of the lockdep_map structure
  153.     for (i = 0; i < NR_LOCKDEP_CACHING_CLASSES; i++)
  154.         lock->class_cache[i] = NULL;

  155. #ifdef CONFIG_LOCK_STAT
  156.     lock->cpu = raw_smp_processor_id();
  157. #endif    
  158.     // name must not be NULL
  159.     if (DEBUG_LOCKS_WARN_ON(!name)) {
  160.         lock->name = "NULL";
  161.         return;
  162.     }
  163.     // set the name
  164.     lock->name = name;

  165.     // key must not be NULL
  166.     if (DEBUG_LOCKS_WARN_ON(!key))
  167.         return;
  168.     
  169.     // sanity-check the key address: it must lie in the kernel .data section, the per-cpu area, or module space
  170.     if (!static_obj(key)) {
  171.         printk("BUG: key %p not in .data!\n", key);
  172.         DEBUG_LOCKS_WARN_ON(1);
  173.         return;
  174.     }
  175.     // set the key
  176.     lock->key = key;
  177.  
  178.     if (unlikely(!debug_locks))
  179.         return;
  180.     
  181.     // if subclass is non-zero, register the lock class for this lockdep_map right away
  182.     if (subclass)
  183.         register_lock_class(lock, subclass, 1);
  184. }

  185. 2. lock_acquire
  186. void lock_acquire(struct lockdep_map *lock, unsigned int subclass,int trylock, int read, int check,struct lockdep_map *nest_lock, unsigned long ip)
  187. {
  188.     unsigned long flags;
  189.         
  190.     if (unlikely(current->lockdep_recursion))
  191.         return;
  192.         
  193.     raw_local_irq_save(flags);
  194.     check_flags(flags);
  195.         
  196.     current->lockdep_recursion = 1;
  197.     // tracepoint hook (a no-op unless tracing is enabled)
  198.     trace_lock_acquire(lock, subclass, trylock, read, check, nest_lock, ip);
  199.     __lock_acquire(lock, subclass, trylock, read, check,irqs_disabled_flags(flags), nest_lock, ip, 0);
  200.     current->lockdep_recursion = 0;
  201.     raw_local_irq_restore(flags);
  202. }

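  203. 3. debug_check_no_locks_freed
  204. Detects whether a lock is being initialized again while it is still held, or whether a block of memory is being freed while it still contains a held lock.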
  205. void debug_check_no_locks_freed(const void *mem_from, unsigned long mem_len)
  206. {
  207.     struct task_struct *curr = current;
  208.     struct held_lock *hlock;
  209.     unsigned long flags;
  210.     int i;
  211.         
  212.     if (unlikely(!debug_locks))
  213.         return;
  214.     
  215.     local_irq_save(flags);
  216.     // walk the held_lock entries of the current task
  217.     for (i = 0; i < curr->lockdep_depth; i++) {
  218.         hlock = curr->held_locks + i;
  219.     
  220.     // skip this entry if the lock instance does not overlap [mem_from, mem_from + mem_len)
  221.         if (not_in_range(mem_from, mem_len, hlock->instance,sizeof(*hlock->instance)))
  222.             continue;
  223.     
  224.         print_freed_lock_bug(curr, mem_from, mem_from + mem_len, hlock);
  225.         break;
  226.     }
  227.     local_irq_restore(flags);
  228. }

  229. static inline int not_in_range(const void* mem_from, unsigned long mem_len,
  230.                                 const void* lock_from, unsigned long lock_len)
  231. {
  232.     return lock_from + lock_len <= mem_from || mem_from + mem_len <= lock_from;
  233. }
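
The overlap test in not_in_range() is easy to get wrong at the boundaries, so here is a user-space version that can be tried out directly. It is a sketch with the same logic, renamed toy_not_in_range() and written with const char * arithmetic and size_t lengths because arithmetic on void * is a compiler extension outside the kernel.

```c
#include <stddef.h>

/* Returns 1 if [lock_from, lock_from + lock_len) and
 * [mem_from, mem_from + mem_len) do not overlap, 0 if they do.
 * Both ranges are assumed to lie inside the same object. */
static int toy_not_in_range(const void *mem_from, size_t mem_len,
                            const void *lock_from, size_t lock_len)
{
    const char *mem = mem_from;
    const char *lock = lock_from;

    return lock + lock_len <= mem || mem + mem_len <= lock;
}
```

Both ranges are half-open, so a lock that starts exactly where the freed region ends counts as outside, while a lock that shares even one byte with the region counts as inside.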

  234. static void print_freed_lock_bug(struct task_struct *curr, const void *mem_from,
  235.                                 const void *mem_to, struct held_lock *hlock)
  236. {
  237.     // turn off all lock debugging; bail out if it was already off
  238.     if (!debug_locks_off())
  239.         return;
  240.     // print nothing in silent mode
  241.     if (debug_locks_silent)
  242.         return;
  243.     
  244.     printk("\n");
  245.     printk("=========================\n");
  246.     printk("[ BUG: held lock freed! ]\n");
  247.     print_kernel_ident();
  248.     printk("-------------------------\n");
  249.     printk("%s/%d is freeing memory %p-%p, with a lock still held there!\n",
  250.         curr->comm, task_pid_nr(curr), mem_from, mem_to-1);
  251.     print_lock(hlock); // print information about the lock
  252.     lockdep_print_held_locks(curr);
  253.  
  254.     printk("\nstack backtrace:\n");
  255.     dump_stack(); // print the stack backtrace
  256. }

  257. //Generic 'turn off all lock debugging' function:
  258. int debug_locks_off(void)
  259. {
  260.     if (__debug_locks_off()) {
  261.         if (!debug_locks_silent) {
  262.             console_verbose();
  263.             return 1;
  264.         }
  265.     }
  266.     return 0;
  267. }

  268. // debug_locks == 1 means lock debugging is on; 0 means all lock debugging is off
  269. static inline int __debug_locks_off(void)
  270. {
  271.     return xchg(&debug_locks, 0);
  272. }
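
The xchg() in __debug_locks_off() is what guarantees that only one caller ever gets to print a report. The same effect can be sketched in user space with C11 atomics; toy_debug_locks and toy_debug_locks_off() are made-up names, and the sketch leaves out the debug_locks_silent handling done by debug_locks_off().

```c
#include <stdatomic.h>

static atomic_int toy_debug_locks = 1;   /* 1: checking on, 0: off */

/* Atomically turn checking off. Returns 1 only for the caller that
 * actually flipped the flag from 1 to 0, and 0 for everyone after. */
static int toy_debug_locks_off(void)
{
    return atomic_exchange(&toy_debug_locks, 0);
}
```

If several CPUs hit a problem at the same moment, exactly one of them sees the return value 1 and goes on to report; the others return early, so reports do not interleave or repeat.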

  273. static void print_kernel_ident(void)
  274. {
  275.     printk("%s %.*s %s\n", init_utsname()->release,
  276.         (int)strcspn(init_utsname()->version, " "),
  277.         init_utsname()->version,
  278.         print_tainted());
  279. }

  280. static void print_lock(struct held_lock *hlock)
  281. {
  282.     print_lock_name(hlock_class(hlock));
  283.     printk(", at: ");
  284.     print_ip_sym(hlock->acquire_ip);
  285. }

  286. static inline struct lock_class *hlock_class(struct held_lock *hlock)
  287. {
  288.     if (!hlock->class_idx) {
  289.         DEBUG_LOCKS_WARN_ON(1);
  290.         return NULL;
  291.     }
  292.     return lock_classes + hlock->class_idx - 1;
  293. }

  294. static void print_lock_name(struct lock_class *class)
  295. {
  296.     char usage[LOCK_USAGE_CHARS];
  297. 
  298.     get_usage_chars(class, usage);
  299. 
  300.     printk(" (");
  301.     __print_lock_name(class);
  302.     printk("){%s}", usage);
  303. }

  304. static void __print_lock_name(struct lock_class *class)
  305. {
  306.     char str[KSYM_NAME_LEN];
  307.     const char *name;
  308. 
  309.     name = class->name;
  310.     if (!name) {
  311.         name = __get_key_name(class->key, str);
  312.         printk("%s", name);
  313.     } else {
  314.         printk("%s", name);
  315.         if (class->name_version > 1)
  316.             printk("#%d", class->name_version);
  317.         if (class->subclass)
  318.             printk("/%d", class->subclass);
  319.     }
  320. }

  321. static inline void print_ip_sym(unsigned long ip)
  322. {
  323.     printk("[<%p>] %pS\n", (void *) ip, (void *) ip);
  324. }

  325. static void lockdep_print_held_locks(struct task_struct *curr)
  326. {
  327.     int i, depth = curr->lockdep_depth;
  328.     
  329.     if (!depth) {
  330.         printk("no locks held by %s/%d.\n", curr->comm, task_pid_nr(curr));
  331.         return;
  332.     }
  333.     printk("%d lock%s held by %s/%d:\n",
  334.         depth, depth > 1 ? "s" : "", curr->comm, task_pid_nr(curr));
  335.     
  336.     for (i = 0; i < depth; i++) {
  337.         printk(" #%d: ", i);
  338.         print_lock(curr->held_locks + i);
  339.     }
  340. }

  341. 2.


  342. Reference: http://www.lenky.info/archives/2013/04/2253
