2015-11-23 02:26:40

The previous posts analyzed slab cache creation and object allocation in the SLUB allocator; this post continues with the algorithm's object freeing.

The SLUB allocator's interface for freeing an object is kmem_cache_free():

【file:/mm/slub.c】
void kmem_cache_free(struct kmem_cache *s, void *x)
{
    s = cache_from_obj(s, x);
    if (!s)
        return;
    slab_free(s, virt_to_head_page(x), x, _RET_IP_);
    trace_kmem_cache_free(_RET_IP_, x);
}

In this function, cache_from_obj() obtains the kmem_cache that the object being freed belongs to, slab_free() does the actual work of returning the object, and trace_kmem_cache_free() records a trace event for the free.
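
For context, here is a minimal sketch of how this API is used from the caller's side. The cache name "demo_cache", the struct demo type and the demo_init() function are illustrative assumptions, not something taken from the kernel or from this post:

#include <linux/init.h>
#include <linux/module.h>
#include <linux/slab.h>

/* A toy object type managed by its own slab cache. */
struct demo {
    int id;
    char name[32];
};

static struct kmem_cache *demo_cachep;

static int __init demo_init(void)
{
    struct demo *obj;

    demo_cachep = kmem_cache_create("demo_cache", sizeof(struct demo),
                                    0, SLAB_HWCACHE_ALIGN, NULL);
    if (!demo_cachep)
        return -ENOMEM;

    obj = kmem_cache_alloc(demo_cachep, GFP_KERNEL);
    if (obj)
        kmem_cache_free(demo_cachep, obj);  /* the path analyzed below */

    kmem_cache_destroy(demo_cachep);
    return 0;
}
module_init(demo_init);
MODULE_LICENSE("GPL");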

Let's look at the implementation of cache_from_obj():

【file:/mm/slab.h】
static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
{
    struct kmem_cache *cachep;
    struct page *page;

    /*
     * When kmemcg is not being used, both assignments should return the
     * same value. but we don't want to pay the assignment price in that
     * case. If it is not compiled in, the compiler should be smart enough
     * to not do even the assignment. In that case, slab_equal_or_root
     * will also be a constant.
     */
    if (!memcg_kmem_enabled() && !unlikely(s->flags & SLAB_DEBUG_FREE))
        return s;

    page = virt_to_head_page(x);
    cachep = page->slab_cache;
    if (slab_equal_or_root(cachep, s))
        return cachep;

    pr_err("%s: Wrong slab cache. %s but object is from %s\n",
        __FUNCTION__, cachep->name, s->name);
    WARN_ON_ONCE(1);
    return s;
}
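
The slab_equal_or_root() helper used above is not quoted in the post. A simplified sketch of it, reconstructed from the mm/slab.h of this kernel generation (the variant with memcg kmem support compiled in; treat it as an approximation rather than a verbatim quote): it accepts the caller's cache either when it is exactly the page's cache, or when the page's cache is a memcg child whose root is the caller's cache.

static inline bool slab_equal_or_root(struct kmem_cache *s,
                                      struct kmem_cache *p)
{
    /* Same cache, or p is the root cache that s was cloned from for a memcg. */
    return (p == s) ||
           (s->memcg_params && (p == s->memcg_params->root_cache));
}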

Although the kmem_cache was already passed in as a parameter of kmem_cache_free(), it is looked up again here, mainly because the page structure derived from the object address via virt_to_head_page() is far more trustworthy than the value supplied by the caller once the kernel has the slabs linked together. So the function first tests if (!memcg_kmem_enabled() && !unlikely(s->flags & SLAB_DEBUG_FREE)), i.e. whether memcg kmem accounting is disabled and SLAB_DEBUG_FREE is not set on the kmem_cache; if both hold, the caller's kmem_cache is returned directly. Otherwise virt_to_head_page() converts the object address into its page management structure, and slab_equal_or_root() checks whether the kmem_cache passed by the caller matches the cache the freed object actually belongs to; if they match, the kmem_cache derived from the object is returned, otherwise a warning is logged and the caller's kmem_cache is returned as a last resort.

Now for a detailed look at slab_free(), the function that actually implements object freeing in SLUB:

【file:/mm/slub.c】
/*
 * Fastpath with forced inlining to produce a kfree and kmem_cache_free that
 * can perform fastpath freeing without additional function calls.
 *
 * The fastpath is only possible if we are freeing to the current cpu slab
 * of this processor. This typically the case if we have just allocated
 * the item before.
 *
 * If fastpath is not possible then fall back to __slab_free where we deal
 * with all sorts of special processing.
 */
static __always_inline void slab_free(struct kmem_cache *s,
            struct page *page, void *x, unsigned long addr)
{
    void **object = (void *)x;
    struct kmem_cache_cpu *c;
    unsigned long tid;

    slab_free_hook(s, x);

redo:
    /*
     * Determine the currently cpus per cpu slab.
     * The cpu may change afterward. However that does not matter since
     * data is retrieved via this pointer. If we are on the same cpu
     * during the cmpxchg then the free will succedd.
     */
    preempt_disable();
    c = __this_cpu_ptr(s->cpu_slab);

    tid = c->tid;
    preempt_enable();

    if (likely(page == c->page)) {
        set_freepointer(s, object, c->freelist);

        if (unlikely(!this_cpu_cmpxchg_double(
                s->cpu_slab->freelist, s->cpu_slab->tid,
                c->freelist, tid,
                object, next_tid(tid)))) {

            note_cmpxchg_failure("slab_free", s, tid);
            goto redo;
        }
        stat(s, FREE_FASTPATH);
    } else
        __slab_free(s, page, x, addr);

}

The function begins with the release hook slab_free_hook(), whose main job is to unregister the object from kmemleak (and run the other debug hooks). Then comes the redo label, the point to jump back to when the task is migrated to another CPU because of preemption during the free. Under redo, preempt_disable() first disables preemption, __this_cpu_ptr() fetches the local CPU's kmem_cache_cpu management structure along with its transaction ID (tid), and preempt_enable() re-enables preemption. If if (likely(page == c->page)) shows that the object being freed belongs to the local CPU's current slab, set_freepointer() writes the free-object pointer that trails the object and, just as on the allocation side, this_cpu_cmpxchg_double() atomically pushes the object back onto the freelist. If the object does not belong to the local CPU's current slab, the fast path is not possible, and the object is released through the slow path, __slab_free().
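
The trick behind the fast path is that the freed object itself becomes the new head of the per-CPU freelist: set_freepointer() stores the old freelist head inside the freed object at offset s->offset. For illustration, a sketch of the helper pair, close to the mm/slub.c of this era:

/* Read the "next free object" pointer embedded inside an object. */
static inline void *get_freepointer(struct kmem_cache *s, void *object)
{
    return *(void **)(object + s->offset);
}

/* Store fp inside the object so the object can head the freelist. */
static inline void set_freepointer(struct kmem_cache *s, void *object, void *fp)
{
    *(void **)(object + s->offset) = fp;
}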

Next, the implementation of __slab_free():

【file:/mm/slub.c】
/*
 * Slow patch handling. This may still be called frequently since objects
 * have a longer lifetime than the cpu slabs in most processing loads.
 *
 * So we still attempt to reduce cache line usage. Just take the slab
 * lock and free the item. If there is no additional partial page
 * handling required then we can return immediately.
 */
static void __slab_free(struct kmem_cache *s, struct page *page,
            void *x, unsigned long addr)
{
    void *prior;
    void **object = (void *)x;
    int was_frozen;
    struct page new;
    unsigned long counters;
    struct kmem_cache_node *n = NULL;
    unsigned long uninitialized_var(flags);

    stat(s, FREE_SLOWPATH);

    if (kmem_cache_debug(s) &&
        !(n = free_debug_processing(s, page, x, addr, &flags)))
        return;

    do {
        if (unlikely(n)) {
            spin_unlock_irqrestore(&n->list_lock, flags);
            n = NULL;
        }
        prior = page->freelist;
        counters = page->counters;
        set_freepointer(s, object, prior);
        new.counters = counters;
        was_frozen = new.frozen;
        new.inuse--;
        if ((!new.inuse || !prior) && !was_frozen) {

            if (kmem_cache_has_cpu_partial(s) && !prior) {

                /*
                 * Slab was on no list before and will be
                 * partially empty
                 * We can defer the list move and instead
                 * freeze it.
                 */
                new.frozen = 1;

            } else { /* Needs to be taken off a list */

                n = get_node(s, page_to_nid(page));
                /*
                 * Speculatively acquire the list_lock.
                 * If the cmpxchg does not succeed then we may
                 * drop the list_lock without any processing.
                 *
                 * Otherwise the list_lock will synchronize with
                 * other processors updating the list of slabs.
                 */
                spin_lock_irqsave(&n->list_lock, flags);

            }
        }

    } while (!cmpxchg_double_slab(s, page,
        prior, counters,
        object, new.counters,
        "__slab_free"));

    if (likely(!n)) {

        /*
         * If we just froze the page then put it onto the
         * per cpu partial list.
         */
        if (new.frozen && !was_frozen) {
            put_cpu_partial(s, page, 1);
            stat(s, CPU_PARTIAL_FREE);
        }
        /*
         * The list lock was not taken therefore no list
         * activity can be necessary.
         */
        if (was_frozen)
            stat(s, FREE_FROZEN);
        return;
    }

    if (unlikely(!new.inuse && n->nr_partial > s->min_partial))
        goto slab_empty;

    /*
     * Objects left in the slab. If it was not on the partial list before
     * then add it.
     */
    if (!kmem_cache_has_cpu_partial(s) && unlikely(!prior)) {
        if (kmem_cache_debug(s))
            remove_full(s, n, page);
        add_partial(n, page, DEACTIVATE_TO_TAIL);
        stat(s, FREE_ADD_PARTIAL);
    }
    spin_unlock_irqrestore(&n->list_lock, flags);
    return;

slab_empty:
    if (prior) {
        /*
         * Slab on the partial list.
         */
        remove_partial(n, page);
        stat(s, FREE_REMOVE_PARTIAL);
    } else {
        /* Slab must be on the full list */
        remove_full(s, n, page);
    }

    spin_unlock_irqrestore(&n->list_lock, flags);
    stat(s, FREE_SLAB);
    discard_slab(s, page);
}

The first check, if (kmem_cache_debug(s) && !(n = free_debug_processing(s, page, x, addr, &flags))), uses kmem_cache_debug() to see whether debugging is enabled for this cache; if it is, free_debug_processing() runs the debug checks and, when they pass, returns the validated kmem_cache_node management structure. The code then enters the do-while loop. If a kmem_cache_node came back from free_debug_processing(), n is not NULL, so the lock taken inside free_debug_processing() is released and n is reset to NULL; after that the slab's freelist and counters are snapshotted, the trailing free-object pointer of the object is set, and the slab's count of objects in use is decremented.

Next, if ((!new.inuse || !prior) && !was_frozen): if after this free the slab has no objects in use, or its freelist was empty beforehand (prior == NULL, i.e. the slab was full), and the slab is not frozen (not held as a per-CPU slab), then list handling is needed: a now-empty slab may eventually be handed back to the buddy allocator, and a previously full slab has to become reachable again. Within that, if (kmem_cache_has_cpu_partial(s) && !prior): if the cache maintains per-CPU partial lists and the slab was full before this free, the slab is simply marked frozen so that it can later be placed on a per-CPU partial list; otherwise the slab has to be moved between node lists, so get_node() fetches the node's kmem_cache_node and spin_lock_irqsave() takes its list_lock. Finally cmpxchg_double_slab() publishes the new freelist and counters; if that fails, the loop goes back and retries.
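
The atomic hand-off is done by cmpxchg_double_slab(), which compares and swaps page->freelist and page->counters as a pair. Where the hardware double-word cmpxchg is unavailable, it does the same thing under the slab lock with interrupts off. The sketch below is a simplified illustration of that fallback only; the name cmpxchg_double_slab_slow is mine, and the real mm/slub.c routine also handles the __CMPXCHG_DOUBLE fast case and failure statistics:

/* Simplified: compare-and-swap the (freelist, counters) pair under the slab lock. */
static bool cmpxchg_double_slab_slow(struct kmem_cache *s, struct page *page,
        void *freelist_old, unsigned long counters_old,
        void *freelist_new, unsigned long counters_new)
{
    unsigned long flags;
    bool ret = false;

    local_irq_save(flags);
    slab_lock(page);
    if (page->freelist == freelist_old && page->counters == counters_old) {
        page->freelist = freelist_new;
        page->counters = counters_new;
        ret = true;
    }
    slab_unlock(page);
    local_irq_restore(flags);

    return ret;
}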

Next comes if (likely(!n)): n being NULL means the node's list_lock was never taken. In that case, if (new.frozen && !was_frozen), i.e. the slab has just been frozen by this free, put_cpu_partial() hangs it on the per-CPU partial list and stat() updates the statistics; if the slab was already frozen, if (was_frozen), only the FREE_FROZEN statistic needs updating, and the function returns.

if (unlikely(!new.inuse && n->nr_partial > s->min_partial)): if the slab now has no objects in use and the node already holds more partial slabs than the min_partial threshold, the slab should be released, so the code jumps to the slab_empty label to free it.

In addition, if (!kmem_cache_has_cpu_partial(s) && unlikely(!prior)): when the cache has no per-CPU partial lists and the slab was full before this free, the slab has just become partially used again (some objects are still occupied), so it is taken off the full list with remove_full() (debug caches only) and added to the node's partial list with add_partial(). A sketch of these list helpers follows below.
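
The node-list manipulation itself is plain list surgery on kmem_cache_node. Roughly, in the spirit of the mm/slub.c of this era and ignoring the lockdep assertions (an approximation, not a verbatim quote):

/* Put a slab page on the node's partial list, at its head or tail. */
static inline void add_partial(struct kmem_cache_node *n,
                               struct page *page, int tail)
{
    n->nr_partial++;
    if (tail == DEACTIVATE_TO_TAIL)
        list_add_tail(&page->lru, &n->partial);
    else
        list_add(&page->lru, &n->partial);
}

/* Take a slab page off the node's partial list. */
static inline void remove_partial(struct kmem_cache_node *n,
                                  struct page *page)
{
    list_del(&page->lru);
    n->nr_partial--;
}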

Finally, spin_unlock_irqrestore() releases the list lock and restores the interrupt state.

As for the slab release flow under the slab_empty label: depending on whether the freelist was empty, the page is removed from the corresponding full or partial list, then spin_unlock_irqrestore() drops the lock and discard_slab() finally releases the slab.

Looking back over this function, two helpers deserve a closer look: the processing in free_debug_processing() and the implementation of discard_slab().

【file:/mm/slub.c】
static noinline struct kmem_cache_node *free_debug_processing(
    struct kmem_cache *s, struct page *page, void *object,
    unsigned long addr, unsigned long *flags)
{
    struct kmem_cache_node *n = get_node(s, page_to_nid(page));

    spin_lock_irqsave(&n->list_lock, *flags);
    slab_lock(page);

    if (!check_slab(s, page))
        goto fail;

    if (!check_valid_pointer(s, page, object)) {
        slab_err(s, page, "Invalid object pointer 0x%p", object);
        goto fail;
    }

    if (on_freelist(s, page, object)) {
        object_err(s, page, object, "Object already free");
        goto fail;
    }

    if (!check_object(s, page, object, SLUB_RED_ACTIVE))
        goto out;

    if (unlikely(s != page->slab_cache)) {
        if (!PageSlab(page)) {
            slab_err(s, page, "Attempt to free object(0x%p) "
                "outside of slab", object);
        } else if (!page->slab_cache) {
            printk(KERN_ERR
                "SLUB <none>: no slab for object 0x%p.\n",
                        object);
            dump_stack();
        } else
            object_err(s, page, object,
                    "page slab pointer corrupt.");
        goto fail;
    }

    if (s->flags & SLAB_STORE_USER)
        set_track(s, object, TRACK_FREE, addr);
    trace(s, page, object, 0);
    init_object(s, object, SLUB_RED_INACTIVE);
out:
    slab_unlock(page);
    /*
     * Keep node_lock to preserve integrity
     * until the object is actually freed
     */
    return n;

fail:
    slab_unlock(page);
    spin_unlock_irqrestore(&n->list_lock, *flags);
    slab_fix(s, "Object at 0x%p not freed", object);
    return NULL;
}

The debug checks performed here are: check_slab() verifies that the slab information in the kmem_cache and in the page agree, a mismatch pointing to corruption or inconsistent data; check_valid_pointer() validates the object address, making sure it is exactly the start address of some object and not a location in the middle of one; on_freelist() checks whether the object has already been freed, guarding against double frees; check_object() checks the integrity of the object's memory according to the SLAB_RED_ZONE and SLAB_POISON flags; and if (unlikely(s != page->slab_cache)) makes sure the kmem_cache passed by the caller matches the kmem_cache the page belongs to, logging an error otherwise. In addition, if (s->flags & SLAB_STORE_USER): when SLAB_STORE_USER is set, the free is recorded in the object's track information. Finally trace() records the object's trace and init_object() re-initializes the object. The out and fail labels at the end handle the aftermath of the checks passing or failing.
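
As an illustration of the address check, check_valid_pointer() essentially verifies that the pointer lies inside the slab page and falls exactly on an object boundary. A sketch close to the mm/slub.c of this era (shown for reference, not quoted from the post):

static inline int check_valid_pointer(struct kmem_cache *s,
                                      struct page *page, const void *object)
{
    void *base;

    if (!object)
        return 1;

    base = page_address(page);
    /* Must lie inside the slab and be a whole multiple of s->size from its base. */
    if (object < base || object >= base + page->objects * s->size ||
        (object - base) % s->size)
        return 0;

    return 1;
}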

As for the implementation of discard_slab():

【file:/mm/slub.c】
static void discard_slab(struct kmem_cache *s, struct page *page)
{
    dec_slabs_node(s, page_to_nid(page), page->objects);
    free_slab(s, page);
}

When discard_slab() releases a slab, it first updates the node statistics with dec_slabs_node() and then hands the page to free_slab().
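
dec_slabs_node() just drops the node's slab and total-object counters atomically; roughly (a sketch of the CONFIG_SLUB_DEBUG variant of this kernel generation):

static inline void dec_slabs_node(struct kmem_cache *s, int node, int objects)
{
    struct kmem_cache_node *n = get_node(s, node);

    /* One slab fewer on this node, and page->objects fewer objects in total. */
    atomic_long_dec(&n->nr_slabs);
    atomic_long_sub(objects, &n->total_objects);
}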

【file:/mm/slub.c】
static void free_slab(struct kmem_cache *s, struct page *page)
{
    if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) {
        struct rcu_head *head;

        if (need_reserve_slab_rcu) {
            int order = compound_order(page);
            int offset = (PAGE_SIZE << order) - s->reserved;

            VM_BUG_ON(s->reserved != sizeof(*head));
            head = page_address(page) + offset;
        } else {
            /*
             * RCU free overloads the RCU head over the LRU
             */
            head = (void *)&page->lru;
        }

        call_rcu(head, rcu_free_slab);
    } else
        __free_slab(s, page);
}

If SLAB_DESTROY_BY_RCU is set, the memory pages are freed via RCU; otherwise they are freed in the ordinary way through __free_slab().
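
In the RCU case, the callback handed to call_rcu() recovers the page from the rcu_head and still ends up in __free_slab(). Roughly, a sketch of rcu_free_slab() as it looks in this kernel generation (an approximation for illustration):

static void rcu_free_slab(struct rcu_head *h)
{
    struct page *page;

    if (need_reserve_slab_rcu)
        /* The rcu_head lives in the reserved tail of the slab itself. */
        page = virt_to_head_page(h);
    else
        /* The rcu_head overlays page->lru. */
        page = container_of((struct list_head *)h, struct page, lru);

    __free_slab(page->slab_cache, page);
}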

The implementation of __free_slab():

【file:/mm/slub.c】
static void __free_slab(struct kmem_cache *s, struct page *page)
{
    int order = compound_order(page);
    int pages = 1 << order;

    if (kmem_cache_debug(s)) {
        void *p;

        slab_pad_check(s, page);
        for_each_object(p, s, page_address(page),
                        page->objects)
            check_object(s, page, p, SLUB_RED_INACTIVE);
    }

    kmemcheck_free_shadow(page, compound_order(page));

    mod_zone_page_state(page_zone(page),
        (s->flags & SLAB_RECLAIM_ACCOUNT) ?
        NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE,
        -pages);

    __ClearPageSlabPfmemalloc(page);
    __ClearPageSlab(page);

    memcg_release_pages(s, order);
    page_mapcount_reset(page);
    if (current->reclaim_state)
        current->reclaim_state->reclaimed_slab += pages;
    __free_memcg_kmem_pages(page, order);
}

It first obtains the compound page order via compound_order() and from it the number of pages to free. kmem_cache_debug() then checks whether debugging is enabled for this slab; if so, the slab is checked one more time, mainly to detect memory corruption and record the relevant information. Next, kmemcheck_free_shadow() releases the kmemcheck shadow memory; mod_zone_page_state() updates the zone's page-state counters, while __ClearPageSlabPfmemalloc() and __ClearPageSlab() clear the page's slab flags; finally memcg_release_pages() handles the memcg side of the release, page_mapcount_reset() resets the page's map count, and __free_memcg_kmem_pages() frees the pages.
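
The debug loop above walks every object in the slab with for_each_object(), which simply steps through the slab in strides of s->size. Roughly, a sketch of the macro as defined in the mm/slub.c of this era:

/* Visit every object in a slab: start at the slab's base address and
 * advance by the object size until page->objects objects have been covered. */
#define for_each_object(__p, __s, __addr, __objects)            \
    for (__p = (__addr);                                        \
         __p < (__addr) + (__objects) * (__s)->size;            \
         __p += (__s)->size)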

As for __free_memcg_kmem_pages(), it uncharges the pages from memcg and then returns them to the buddy allocator through __free_pages():

【file:/mm/page_alloc.c】
/*
 * __free_memcg_kmem_pages and free_memcg_kmem_pages will free
 * pages allocated with __GFP_KMEMCG.
 *
 * Those pages are accounted to a particular memcg, embedded in the
 * corresponding page_cgroup. To avoid adding a hit in the allocator to search
 * for that information only to find out that it is NULL for users who have no
 * interest in that whatsoever, we provide these functions.
 *
 * The caller knows better which flags it relies on.
 */
void __free_memcg_kmem_pages(struct page *page, unsigned int order)
{
    memcg_kmem_uncharge_pages(page, order);
    __free_pages(page, order);
}

This concludes the analysis of object freeing in SLUB.
