Linux内存管理之slab分配器分析(一)-hanwei

hanwei_1049

首页　| 　博文目录　| 　关于我

hanwei_1049

博客访问： 1527955
博文数量： 228
博客积分： 1698
博客等级：上尉
技术积分： 3241
用户组：普通用户
注册时间： 2008-12-24 21:49

个人简介

Linux

文章分类

全部博文（228）

进程间通信（4）
LVS（0）
OpenStack（1）
HTTPS（5）
LUA（1）
版本控制（9）
个人计划（2）
Nginx（27）
MySQL（0）
Trouble Sho（2）
HaProxy（1）
进程调度（0）
ATS（31）
CDN（15）
Redis（0）
TCP/IP协议栈（7）
文件系统/存储（3）
内存管理（11）
系统/脚本（4）
编程相关（8）
攻防研究（13）
体系结构（9）
数据结构（0）
内核相关（14）
安全相关（1）
网络相关（43）
未分配的博文（17）

文章存档

2017年（1）

2016年（43）

2015年（102）

2014年（44）

2013年（5）

2012年（30）

2011年（3）

我的朋友

相关博文

Linux内存管理之slab分配器分析(一)

分类： LINUX

2015-01-14 20:27:53

Linux采用了slab来管理小块内存的分配与释放，Slab的提出基于以下两个考虑：

1. 内核函数经常倾向于反复请求相同的数据类型

2. 不同的结构使用不同的分配方法可以提高效率

3. 伙伴系统的频繁申请/释放影响效率，将释放内存放入缓冲区，直至超过阀值再归还给伙伴系统

4. 可以缓解内存碎片的产生，不过产生了内存浪费

Slab分为专用高速缓存（用于存放专用数据结构，skb/mm等）和普通高速缓存（普通的kmalloc之类）。所有高

速缓存区通过链表组织在一起，首节点是cache_chain。普通高速缓存分为(2^0)、(2^1)、(2^2)......，区域的个数

以及大小与系统内存配置以及PAGE_SIZE/L1_CACHE_BYTES/KMALLOC_MAX_SIZE等相关，具体在

linux/kmalloc_sizes.h中定义，每个大小对应两个高速缓存，一个是DMA高速缓存，一个是常规高速缓存。他们

存放在struct cache_sizes malloc_sizes[]中。

Slab把每一个请求的内存称之为对象，并将对象存放在slab中，具体的slab按照空、未满和全满放在不同的链表

中，全部连接至高速缓存中。如下

Slab分配器的相关数据结构：

点击(此处)折叠或打开

static DEFINE_MUTEX(cache_chain_mutex); // 保护相关的链表操作
static struct list_head cache_chain; // 串联各kemem_cache

点击(此处)折叠或打开

/*
* struct array_cache
*
* Purpose:
* - LIFO ordering, to hand out cache-warm objects from _alloc
* - reduce the number of linked list operations
* - reduce spinlock operations
*
* The limit is stored in the per-cpu structure to reduce the data cache
* footprint.
*
*/
struct array_cache {
unsigned int avail; // 当前空闲对象的位置，取值时objp = ac->entry[--ac->avail]
unsigned int limit; // 空闲对象上限
unsigned int batchcount; // 每次需要申请对象数量或释放的对象数量，和kmem_cache相同
unsigned int touched; // 如果从该组中分配了对象，就设置为1
spinlock_t lock;
void *entry[]; /* // 指向可用的slab对象指针数组
* Must have this definition in here for the proper
* alignment of array_cache. Also simplifies accessing
* the entries.
*/
};

点击(此处)折叠或打开

/*
* The slab lists for all objects.
*/
struct kmem_list3 {
struct list_head slabs_partial; /* 部分已分配的链表 partial list first, better asm code */
struct list_head slabs_full; /× 已经全部分配的链表 ×/
struct list_head slabs_free; /× 尚未分配的链表 ×/
unsigned long free_objects; /× partial和free上可用的对象个数 ×/
unsigned int free_limit; /× 所有slab上允许未使用的对象最大数目 ×/
unsigned int colour_next; /* 下一个slab的颜色 Per-node cache coloring */
spinlock_t list_lock;
struct array_cache *shared; /* 节点内共享 shared per node */
struct array_cache **alien; /* 在其他节点上 on other nodes */
unsigned long next_reap; /* 尝试两次缓存回收必须经过的时间间隔 updated without locking */
int free_touched; /* 缓存是否活动的，0回收，1申请对象 updated without locking */
};

点击(此处)折叠或打开

/*
* struct slab
*
* Manages the objs in a slab. Placed either at the beginning of mem allocated
* for a slab, or allocated from an general cache.
* Slabs are chained into three list: fully used, partial, fully free slabs.
*/
struct slab {
struct list_head list; /× Slab所在的链表 ×/
unsigned long colouroff; /× Slab中第一个对象的偏移 ×/
void *s_mem; /* 第一个对象的地址 including colour offset */
unsigned int inuse; /* 被使用的对象的数目 num of objs active in slab */
kmem_bufctl_t free; /× 下一个空闲对象的下标 ×/
unsigned short nodeid; /× NUMA ID，节点标识号 ×/
};

点击(此处)折叠或打开

/*
* struct kmem_cache
*
* manages a cache.
*/
struct kmem_cache {
/* 1) per-cpu data, touched during every alloc/free */
struct array_cache *array[NR_CPUS]; /× Per_cpu结构，用于节点的申请与释放 ×/
/* 2) Cache tunables. Protected by cache_chain_mutex */
unsigned int batchcount; /× Array中没有空闲对象时，每次为数组分配的对象数目 ×/
unsigned int limit; /× Array中允许的最大对象数目，超出后会将batchcount个对象返回给slab ×/
unsigned int shared; /× 是否共享CPU高速缓存 ×/
unsigned int buffer_size; /* Slab中对象的大小，对象长度+填充字节 */
u32 reciprocal_buffer_size; /× slab中对象大小的倒数，加快计算 ×/
/* 3) touched by every alloc & free from the backend */
unsigned int flags; /* 高速缓存永久化的标志，当前只有CFLGS_OFF_SLAB，用于标识slab管理数据是在slab内还是外 */
unsigned int num; /* 封装在一个单独slab中的对象的数目 # of objs per slab */
/* 4) cache_grow/shrink */
/* order of pgs per slab (2^n) */
unsigned int gfporder; /× 每个slab包含的页框数，取2为底的对数 ×/
/* force GFP flags, e.g. GFP_DMA */
gfp_t gfpflags; /* 分配页框时传递给伙伴系统的标志 */
size_t colour; /* 缓存的着色块个数 cache colouring range */
unsigned int colour_off; /* Slab基本的对齐偏移，基本偏移量×着色块个数得到的绝对偏移量 colour offset */
struct kmem_cache *slabp_cache; /* Slab描述符在外部时使用，其指向的高速缓存用于存储描述符；在内部时为NULL */
unsigned int slab_size; /× Slab管理区大小，包含slab对象和kmem_buffctl_t数组 ×/
unsigned int dflags; /* 描述高速缓存的动态属性，目前没有使用 dynamic flags */
/* constructor func */
void (*ctor)(void *obj); /× 创造高速缓存时的构造函数指针 ×/
/* 5) cache creation/removal */
const char *name; /× Cache的名字 ×/
struct list_head next; /× 链接高速缓存的双向指针至cache_chain上 ×/
/* 6) statistics */
#ifdef CONFIG_DEBUG_SLAB
unsigned long num_active;
unsigned long num_allocations;
unsigned long high_mark;
unsigned long grown;
unsigned long reaped;
unsigned long errors;
unsigned long max_freeable;
unsigned long node_allocs;
unsigned long node_frees;
unsigned long node_overflow;
atomic_t allochit; /× Cache的命中次数，在分配中更新 ×/
atomic_t allocmiss; /× Cache的未命中次数，在分配中更新 ×/
atomic_t freehit;
atomic_t freemiss;
/*
* If debugging is enabled, then the allocator can add additional
* fields and/or padding to every object. buffer_size contains the total
* object size including these internal fields, the following two
* variables contain the offset to the user object and its size.
*/
int obj_offset; /× 对象间的偏移 ×/
int obj_size; /× 对象本身的大小 ×/
#endif /* CONFIG_DEBUG_SLAB */
/*
* We put nodelists[] at the end of kmem_cache, because we want to size
* this array to nr_node_ids slots instead of MAX_NUMNODES
* (see kmem_cache_init())
* We still use [MAX_NUMNODES] and not [1] or [0] because cache_cache
* is statically defined, so we reserve the max number of nodes.
*/
struct kmem_list3 *nodelists[MAX_NUMNODES]; /× 存放所有NUMA节点对应的相关数据，每个节点拥有自己的数据 ×/
/*
* Do not add fields after nodelists[]
*/
};

借鉴网友的图片，来加深下理解：

图基本说明了各结构之间的关联关系，具体申请和释放的过程，后面还会继续分析，结合图更容易理解。

另，下面一段摘抄自http://blog.csdn.net/hs794502825/article/details/7981524

每个slab首部都有一个小区域是不用的，称为“着色区”。着色区的大小使Slab中的每个对象的起始地址都按高速缓存中的缓存行（cache line）大小进行对齐（80386的一级高速缓存行大小为16字节，Pentium为32字节）。因为Slab是由1个页面或多个页面（最多为32）组成，因此每个Slab都是从一个页面边界开始的，它自然按高速缓存的缓冲行对齐。但是，Slab中的对象大小不确定，设置着色区的目的就是将Slab中第一个对象的起始地址往后推到与缓冲行对齐的位置。因为一个缓冲区中有多个Slab，因此，应该把每个缓冲区中的各个Slab着色区的大小尽量安排成不同的大小，这样可以使得在不同的Slab中，处于同一相对位置的对象，让它们在高速缓存中的起始地址相互错开，这样就可以改善高速缓存的存取效率。

每个Slab上最后一个对象以后也有个小小的废料区是不用的，这是对着色区大小的补偿，其大小取决于着色区的大小，以及Slab与其每个对象的相对大小。但该区域与着色区的总和对于同一种对象的各个Slab是个常数。

每个对象的大小基本上是所需数据结构的大小。只有当数据结构的大小不与高速缓存中的缓冲行对齐时，才增加若干字节使其对齐。所以，一个Slab上的所有对象的起始地址都必然是按高速缓存中的缓冲行对齐的。

与传统的内存管理模式相比， slab 缓存分配器提供了很多优点。首先，内核通常依赖于对小对象的分配，它们会在系统生命周期内进行无数次分配。slab 缓存分配器通过对类似大小的对象进行缓存而提供这种功能，从而避免了常见的碎片问题。slab 分配器还支持通用对象的初始化，从而避免了为同一目而对一个对象重复进行初始化。最后，slab 分配器还可以支持硬件缓存对齐和着色，这允许不同缓存中的对象占用相同的缓存行，从而提高缓存的利用率并获得更好的性能。

阅读(2143) | 评论(0) | 转发(2) |

上一篇：Kernel Symbols and CONFIG_MODVERSIONS

下一篇：结合内核源码来看如何调整影响TIME_WAIT状态套接字数量的参数

给主人留下些什么吧！~~

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6