Understanding VFS in Depth through mount and ramfs - xlpang - ChinaUnix Blog

Category: LINUX

2011-10-08 20:41:07

VFS

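This article studies the implementation details of the VFS by working through mount, umount, rootfs, ramfs and related topics.

1 Implementation of sys_mount()

Five parameters:

the device name dev_name, the mount point directory name dir_name, the filesystem type type, the mount flags flags, and additional data data. For example: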

sys_mount((char __user *)"/dev/sda1", (char __user *)"/mnt/c/", (char __user *)"ext2", MS_RDONLY | MS_NOATIME | MS_NODIRATIME, NULL);

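Execution flow:

1) Copy the content that the user-space type argument points to into kernel space (type_page) with copy_mount_options();

2) Copy the content that the user-space dir_name points to into kernel space with getname();

3) Copy the content that the user-space dev_name points to into kernel space with copy_mount_options();

4) Copy the content that the user-space data points to into kernel space with copy_mount_options();

5) lock_kernel();

6) do_mount();

7) unlock_kernel();

8) Free the kernel memory allocated in steps 1) to 4).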

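1.1 copy_mount_options(): copying a page of user-space data

Prototype: int copy_mount_options(const void __user * data, unsigned long *where); returns 0 on success.

Purpose: allocate one page, copy the content that data points to in user space into it with exact_copy_from_user(), then zero the part of the page that was not filled.

Consider how exact_copy_from_user() differs from copy_from_user():

1.1.1 exact_copy_from_user()

This function copies user-space content into the newly allocated page. It normally fills the whole page; if it hits EFAULT first, it copies less than a page and returns the number of bytes left uncopied in the page. If the destination kernel address is invalid, the kernel Oopses (there is no .fixup entry for that access). Unlike copy_from_user(), it does not zero the uncopied part.

1.1.2 copy_from_user()

On some architectures this function is implemented as a macro, and some versions do not return the number of uncopied bytes. When the copy faults, execution enters the .fixup code, which zeroes the uncopied region. On an access failure the function returns -EFAULT instead of Oopsing (because a .fixup entry exists).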
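The difference described above can be tried out in user space. The following is only a sketch under stated assumptions: the names sketch_exact_copy() and sketch_copy_options() and the readable parameter are made up for illustration, and a "fault" is simulated by declaring that only the first readable bytes of the source can be read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for this sketch */

/*
 * Copy up to n bytes from src to dst, pretending that only the first
 * `readable` bytes of src are accessible (a stand-in for running into an
 * unmapped user page). Returns the number of bytes NOT copied and leaves
 * the uncopied tail of dst untouched, as described for
 * exact_copy_from_user().
 */
size_t sketch_exact_copy(void *dst, const void *src, size_t n, size_t readable)
{
    size_t ok = n < readable ? n : readable;

    memcpy(dst, src, ok);
    return n - ok;
}

/*
 * The copy_mount_options() idea: try to fill one page-sized buffer, then
 * zero whatever part of the page was not filled. Returns the number of
 * bytes actually copied.
 */
size_t sketch_copy_options(unsigned char *page, const void *src, size_t readable)
{
    size_t left = sketch_exact_copy(page, src, SKETCH_PAGE_SIZE, readable);
    size_t copied = SKETCH_PAGE_SIZE - left;

    if (left)
        memset(page + copied, 0, left);
    return copied;
}
```

Calling sketch_copy_options(page, "ext2", 5) on a page filled with garbage leaves the five bytes of the string at the start and zeroes everywhere else, which is why the later mount code can treat the page as a NUL-terminated string.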

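1.2 getname(): copying a user-space string

__getname() allocates a PATH_MAX (4096-byte) object from the slab, and strncpy_from_user() is finally called to copy the name from user space (here, the mount directory name) into it.

1.2.1 strncpy_from_user()

strncpy_from_user() copies bytes up to the terminating '\0', or until the length given by the third argument len has been copied. It returns the number of bytes actually copied (not counting the trailing '\0'). On error it returns -EFAULT (whatever was read before the fault has already been copied) and does not Oops (because a .fixup entry exists).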

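1.3 lock_kernel()/unlock_kernel(): the big kernel lock

Only the CONFIG_PREEMPT_BKL case is analysed here, where the big kernel lock is a 0/1 semaphore, i.e. a MUTEX.

A lock_depth field was added to task_struct (initial value -1, meaning the lock is not held). It allows the big kernel lock to be taken recursively without self-deadlock, and it also makes it possible for a task to sleep voluntarily while holding the lock.

Example:

lock_kernel();
...;
schedule();
...;
unlock_kernel();

To keep other threads from blocking on the lock, a task that holds the big kernel lock must release it before schedule() switches away. Whether the lock is held is determined from lock_depth: the field is incremented on every acquisition and decremented on every release, and the real P/V operations are executed only on a thread's first acquisition and last release.

At the start of schedule(), release_kernel_lock(prev) is called for the prev thread; if it holds the lock, __release_kernel_lock(void) performs the V operation. At the end of schedule(), after the task switch, reacquire_kernel_lock(current) is executed for the current thread (the next of before the switch); if that thread held the lock when it was last switched out, __reacquire_kernel_lock(void) takes the lock again. This is more than a plain acquisition: the P operation may sleep, which could lead to another switch through schedule() and a second, erroneous release of the lock. To prevent this, lock_depth is saved and set to -1 before the P operation and restored once the P operation succeeds.

In addition, preemption is allowed after the P operation. To keep global data from being corrupted after a preemption, a reschedule caused by preemption must not release the big kernel lock (for voluntary scheduling, it is up to the designer to guarantee that global data stays consistent). preempt_schedule() and preempt_schedule_irq() therefore save lock_depth and set it to -1 before calling schedule(), and restore it afterwards, as follows: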

#ifdef CONFIG_PREEMPT_BKL
saved_lock_depth = task->lock_depth; // save the original lock_depth
task->lock_depth = -1; // set to -1 so that schedule() does not release the big kernel lock
#endif
local_irq_enable();
schedule();
local_irq_disable();
#ifdef CONFIG_PREEMPT_BKL
task->lock_depth = saved_lock_depth; // restore lock_depth
#endif
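The lock_depth bookkeeping is easier to follow in a small user-space model. This is a sketch only: struct sketch_task, sketch_bkl_held and all the sketch_ functions are hypothetical stand-ins, there is a single simulated task, and where the real lock would block the model just asserts.

```c
#include <assert.h>

/* Hypothetical stand-in for task_struct; none of this is kernel code. */
struct sketch_task {
    int lock_depth;            /* -1 means "does not hold the big lock" */
};

int sketch_bkl_held;           /* 1 while some task owns the single big lock */

/* Take the lock only on the outermost call; nested calls just count. */
void sketch_lock_kernel(struct sketch_task *t)
{
    if (++t->lock_depth == 0) {
        assert(!sketch_bkl_held);   /* a real lock would block here */
        sketch_bkl_held = 1;
    }
}

/* Release the lock only when the outermost level is left. */
void sketch_unlock_kernel(struct sketch_task *t)
{
    assert(t->lock_depth >= 0);
    if (--t->lock_depth < 0)
        sketch_bkl_held = 0;
}

/* On the way out of the CPU: drop the lock but keep lock_depth. */
void sketch_release_on_schedule(struct sketch_task *prev)
{
    if (prev->lock_depth >= 0)
        sketch_bkl_held = 0;
}

/* On the way back in: take the lock again, hiding lock_depth meanwhile. */
void sketch_reacquire_after_schedule(struct sketch_task *cur)
{
    if (cur->lock_depth >= 0) {
        int saved = cur->lock_depth;

        cur->lock_depth = -1;       /* a nested schedule() must not release */
        assert(!sketch_bkl_held);   /* the P operation would sleep here */
        sketch_bkl_held = 1;
        cur->lock_depth = saved;
    }
}
```

The point of the model is that lock_depth survives a schedule() round trip unchanged, so the nesting level seen by the task is the same before and after it sleeps, while the lock itself is free for others in between.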

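1.4 do_mount()

This is the body of sys_mount(). Its prototype is:

long do_mount(char *dev_name, char *dir_name, char *type_page,
unsigned long flags, void *data_page);

1.4.1 Mount parameters

The mount flags passed in (flags) are converted into the kernel-internal mnt_flags (a member of struct vfsmount).

The incoming flags are used to initialise mnt_flags and to select the code path. The converted mnt_flags are passed to the mount functions of section 1.4.3, which carry out the actual mount work in the kernel.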

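1.4.2 path_lookup(): obtaining the nameidata of the mount point

The body of the function is do_path_lookup(AT_FDCWD, name, flags, nd). What does the AT_FDCWD argument mean?

Consider the system call openat(int dirfd, const char *pathname, int flags). The dirfd argument names the directory from which pathname is resolved, whereas the traditional behaviour is to start from the thread's current directory. When dirfd is AT_FDCWD, resolution starts from the thread's current directory (current->fs->pwd, current->fs->pwdmnt). In every case, if pathname begins with '/', resolution starts from the thread's root directory (current->fs->root, current->fs->rootmnt) and dirfd is ignored.

In this scenario the flags argument of do_path_lookup() is LOOKUP_FOLLOW, meaning that symbolic links are followed, and name is the path of the mount point directory.

Implementation of do_path_lookup(dfd, name, flags, nd) (dfd is AT_FDCWD here):

1) Initialise nd: nd->last_type is set to LAST_ROOT (the default, meaning the root); nd->flags is set to flags (LOOKUP_FOLLOW here); nd->depth is cleared for later use by do_follow_link() when resolving symbolic links.

2) Parse the name argument.

a) If it begins with "/", it is an absolute path, and nd->mnt and nd->dentry are set to the thread's root directory. If current->fs->altroot is not NULL and flags does not contain LOOKUP_NOALT, the thread's alternate root is used instead.

Attached: the definition of the thread's current->fs structure (listing not preserved).

b) Otherwise, if flags contains LOOKUP_ONE, the dentry and mnt members of nd were already initialised by the caller, so only their reference counts are incremented.

Attached: the meaning of the flags values (listing not preserved).

c) Otherwise, if dfd is AT_FDCWD (this scenario), the path is relative to the current directory, and nd->mnt and nd->dentry are set to the thread's current directory.

d) Otherwise, the last case: the file that dfd refers to (it must be a directory) is the starting directory, and nd->mnt and nd->dentry are set from it. The file is obtained with fget_light() and released with fput_light().

e) fget_light() is a lightweight fget(). It improves performance by checking the reference count of current->files (struct files_struct): a value of 1 means only this thread uses the table, so the struct file for the descriptor can be fetched without incrementing its reference count. In that case fput_light() simply returns (it is effectively an empty function).

Attached: the definition of struct files_struct (listing not preserved).

(When the number of files files->fdt->max_fds is smaller than BITS_PER_LONG, files->fdt->fd points to files->fd_array; otherwise files->fdt->fd points to a newly allocated array of 1024 struct file entries, and files->fdt->max_fds is updated accordingly.)

3) Clear current->total_link_count, which do_follow_link() uses while resolving symbolic links along the path.

4) Finally call link_path_walk(name, nd) with the initialised nd as input; it looks up the directory that name refers to and stores the result back into nd (output).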
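The choice of starting directory in step 2) boils down to a small decision function. Here is a minimal sketch of it; the enum and the function name sketch_pick_start() are hypothetical, the LOOKUP_ONE and altroot branches are left out, and only the value of AT_FDCWD (-100 on Linux) is taken from the real headers.

```c
#include <assert.h>

#define SKETCH_AT_FDCWD (-100)   /* same value as Linux's AT_FDCWD */

enum sketch_start {
    SKETCH_START_ROOT,    /* current->fs->root / rootmnt           */
    SKETCH_START_CWD,     /* current->fs->pwd / pwdmnt             */
    SKETCH_START_DIRFD,   /* the directory that dfd refers to      */
    SKETCH_START_ERROR    /* dfd cannot name an open file          */
};

/* Decide where path resolution begins for (dfd, name). */
enum sketch_start sketch_pick_start(int dfd, const char *name)
{
    if (name[0] == '/')
        return SKETCH_START_ROOT;     /* absolute path: dfd is ignored */
    if (dfd == SKETCH_AT_FDCWD)
        return SKETCH_START_CWD;      /* relative to the current directory */
    if (dfd < 0)
        return SKETCH_START_ERROR;    /* the descriptor lookup would fail */
    return SKETCH_START_DIRFD;        /* relative to an open directory */
}
```

For the mount example, sketch_pick_start(SKETCH_AT_FDCWD, "/mnt/c/") selects the root, because the leading '/' wins over whatever dfd says.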

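1.4.2.1 link_path_walk(): resolving the path

link_path_walk(name, nd): name is the path of the mount point directory. Implementation:

1) nd already has the dentry and mnt of the starting position, initialised in do_path_lookup(). Leading '/' characters of name are stripped (several consecutive '/' are allowed), and then the components of name separated by '/' are resolved in a loop. A component can be a directory, ".", "..", a symbolic link, or a file (only as the last component, and not followed by "/"):

a) ".": nothing to do.

b) "..": call follow_dotdot(nd). This function is fairly involved, mainly because it has to handle several mounts covering one another.

c) Directory: call do_lookup() to find its dentry and mnt and store them in a local struct path (variable next). If the directory is a symbolic link, call do_follow_link(&next, nd) to resolve it; if it is a regular directory, call path_to_nameidata(&next, nd) to move the result from next into nd. If the directory does not exist, do_lookup() fails or the inode is NULL.

d) File: if it is the last component and is not followed by "/", it is handled like c). If the LOOKUP_DIRECTORY lookup flag is set, an error is returned; all other cases are errors as well.

2) follow_dotdot(nd) takes the nameidata of the current directory and ends up with the nameidata (dentry and mnt) of the parent directory. It first enters a loop:

a) If the current directory is the root directory of the current thread, i.e. nd->mnt == current->fs->rootmnt and nd->dentry == current->fs->root, leave the loop.

b) If the current directory is not the root of the current filesystem, i.e. nd->mnt->mnt_root != nd->dentry, set nd->dentry to nd->dentry->d_parent and leave the loop.

c) If the current directory is the root of the current filesystem, i.e. nd->mnt->mnt_root == nd->dentry, and the current filesystem is the topmost one, i.e. nd->mnt->mnt_parent == nd->mnt, there is nowhere further up to go, so leave the loop. (This does not duplicate a): the thread may have changed its root with something like chroot, in which case current->fs->root is not necessarily in the topmost filesystem.)

d) If the current directory is the root of the current filesystem, i.e. nd->mnt->mnt_root == nd->dentry, and the current filesystem is not the topmost one, i.e. nd->mnt->mnt_parent != nd->mnt, set nd->dentry to the mount point directory nd->mnt->mnt_mountpoint in the filesystem below, set nd->mnt to nd->mnt->mnt_parent, and continue the loop.

After leaving the loop, follow_mount(&nd->mnt, &nd->dentry) is always executed. This is required: it checks whether the newly obtained nd->dentry has further filesystems mounted on top of it, because a thread may keep working on files in a filesystem that has since been covered by a mount. The overall movement is: current filesystem -> lower filesystem (through the mount point mnt->mnt_mountpoint) -> top filesystem (follow_mount uses mnt->mnt_root to reach the root directory of the filesystem that is actually visible).

3) do_follow_link() follows links recursively. For directory components (intermediate ones and a final directory component) it is always called; for a file in the last component it is called only if LOOKUP_FOLLOW is set.

The function reads the target path stored in the pages of the link's inode->i_mapping, initialises the nameidata and calls link_path_walk() recursively to resolve the link. current->link_count, nd->depth and nd->saved_names[] bound the nesting of a single link to MAX_NESTED_LINKS (defined as 8) levels of recursion; current->total_link_count bounds the number of links followed during the whole path resolution to 40. The resolution yields the CPU voluntarily by calling cond_resched().

link_path_walk() itself handles only the lookup flags LOOKUP_DIRECTORY, LOOKUP_FOLLOW and LOOKUP_PARENT.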
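The four cases of follow_dotdot() and the closing follow_mount() can be replayed on a toy mount tree. The sketch below is a simplification under stated assumptions: all dd_ names are hypothetical, and a single mounted pointer per dentry stands in for the kernel's hash lookup of the mount that covers a (mnt, dentry) pair.

```c
#include <assert.h>
#include <stddef.h>

struct dd_mnt;

struct dd_dentry {
    struct dd_dentry *d_parent;   /* the root of a filesystem points to itself */
    struct dd_mnt    *mounted;    /* mount stacked on this dentry, or NULL */
};

struct dd_mnt {
    struct dd_mnt    *mnt_parent;      /* the topmost mount points to itself */
    struct dd_dentry *mnt_mountpoint;  /* a dentry inside mnt_parent */
    struct dd_dentry *mnt_root;
};

struct dd_pos {
    struct dd_mnt    *mnt;
    struct dd_dentry *dentry;
};

/* Climb from a covered dentry to the root of whatever is mounted on it. */
void dd_follow_mount(struct dd_pos *p)
{
    while (p->dentry->mounted) {
        p->mnt = p->dentry->mounted;
        p->dentry = p->mnt->mnt_root;
    }
}

/* Resolve ".." from *p; root is the task's root (fs->rootmnt, fs->root). */
void dd_follow_dotdot(struct dd_pos *p, const struct dd_pos *root)
{
    for (;;) {
        if (p->mnt == root->mnt && p->dentry == root->dentry)
            break;                              /* case a) */
        if (p->dentry != p->mnt->mnt_root) {
            p->dentry = p->dentry->d_parent;    /* case b) */
            break;
        }
        if (p->mnt->mnt_parent == p->mnt)
            break;                              /* case c) */
        p->dentry = p->mnt->mnt_mountpoint;     /* case d): step down a mount */
        p->mnt = p->mnt->mnt_parent;
    }
    dd_follow_mount(p);
}
```

With a second filesystem mounted on /mnt, ".." from the root of that filesystem first drops to /mnt in the lower filesystem (case d) and then moves to / (case b). Going ".." from a dentry hidden underneath the covered /mnt lands on the root of the mounted filesystem, because dd_follow_mount() climbs back up.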

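1.4.3 Mount branches selected by the flags argument

Each branch takes as its first argument nd, the nameidata of the mount directory obtained by path_lookup().

Take do_new_mount() as the example. It mounts a filesystem of the type named by type_page on the mount point that path_lookup() resolved and stored in nd. mnt_flags is stored in the mnt_flags field of the mnt structure that is about to be allocated, and dev_name is stored in its mnt_devname field.

1) do_kern_mount(type, flags, name, data) builds the mount block of the filesystem and returns the newly allocated mnt. The superblock, the root inode and the root dentry are initialised through file_system_type->get_sb(); mnt_mountpoint is set to mnt->mnt_root and mnt_parent to mnt itself. The result is a new, not yet attached mount block.

2) do_add_mount(mnt, nd, mnt_flags, NULL) attaches the new mount block to the mount point. It initialises mnt_flags, mnt_mountpoint, mnt_parent, the mnt_child parent/child link, the mnt_list namespace link and the mnt_hash hash link of mnt, and increments d_mounted of the mount point dentry. The work is done by mnt_set_mountpoint() and commit_tree(). The whole operation must be protected by down_write(&namespace_sem) and spin_lock(&vfsmount_lock).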
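The two states of a mount block, free-standing after step 1) and attached after step 2), can be sketched with three pointers and a counter. All am_ names below are hypothetical, and the list, hash and locking work of commit_tree() is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

struct am_dentry {
    int d_mounted;                     /* number of mounts on this dentry */
};

struct am_mnt {
    struct am_mnt    *mnt_parent;
    struct am_dentry *mnt_mountpoint;
    struct am_dentry *mnt_root;
};

/* State after step 1): a free-standing mount block that refers to itself. */
void am_new_mount(struct am_mnt *mnt, struct am_dentry *root)
{
    mnt->mnt_root = root;
    mnt->mnt_mountpoint = root;
    mnt->mnt_parent = mnt;
}

/* State after step 2): the block hangs off (parent, mountpoint). */
void am_attach(struct am_mnt *mnt, struct am_mnt *parent,
               struct am_dentry *mountpoint)
{
    mnt->mnt_parent = parent;
    mnt->mnt_mountpoint = mountpoint;
    mountpoint->d_mounted++;           /* lookups now cross into mnt here */
}

/* A mount is attached once its parent is no longer itself. */
int am_is_attached(const struct am_mnt *mnt)
{
    return mnt->mnt_parent != mnt;
}
```

The self-referencing parent is also what follow_dotdot() tests in its case c) to recognise the topmost filesystem.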

The end result resembles the part highlighted in red in the original figure (figure not preserved).

1.4.4 The mnt, dentry and inode structures involved

1.4.4.1 Data structures in detail

1. struct vfsmount { // has no lock member; protected by the global vfsmount_lock spinlock
struct list_head mnt_hash; // links into the global mount_hashtable; the hash is computed from
// mnt->mnt_parent and mnt->mnt_mountpoint, see lookup_mnt()
struct vfsmount *mnt_parent; // mnt of the parent filesystem directly below; taken with mntget()
struct dentry *mnt_mountpoint; // mount point directory of this filesystem, i.e. a directory in
// the filesystem of mnt_parent; taken with dget()
struct dentry *mnt_root; // root directory of this filesystem, dget(sb->s_root); mind the reference count
struct super_block *mnt_sb; // superblock of this filesystem; set to the newly allocated
// superblock in file_system_type->get_sb() during do_mount
struct list_head mnt_mounts; // mnt structures of filesystems mounted on top of this one
// are linked here through their mnt_child
struct list_head mnt_child; // see above; links into mnt_parent->mnt_mounts
int mnt_flags; // the flags of sys_mount() converted to mnt_flags; see 1.4.1
char *mnt_devname; // name of the mounted device, e.g. /dev/sda1
struct list_head mnt_list; // links into the namespace->list of the current thread
struct list_head mnt_expire; /* link in fs-specific expiry list */
struct list_head mnt_share; /* circular list of shared mounts */
struct list_head mnt_slave_list;/* list of slave mounts */
struct list_head mnt_slave; /* slave list entry */
struct vfsmount *mnt_master; /* slave is on master->mnt_slave_list */
struct mnt_namespace *mnt_ns; // the namespace that mnt_list is linked into
/*
 * We put mnt_count & mnt_expiry_mark at the end of struct vfsmount
 * to let these frequently modified fields in a separate cache line
 * (so that reads of mnt_flags wont ping-pong on SMP machines)
 */
atomic_t mnt_count; // reference count of this mnt; set to 1 by alloc_vfsmnt();
// afterwards changed with mntget()/mntput(); the real release
// calls fs->kill_sb() and frees the superblock
int mnt_expiry_mark; /* true if marked for expiry */
int mnt_pinned;
#ifdef CONFIG_FUMOUNT
struct rw_semaphore mnt_close_sem;
#endif
};

2. struct dentry { // a dentry has no on-disk counterpart
atomic_t d_count; // reference count; set to 1 in d_alloc(parent, struct qstr *name)
unsigned int d_flags; // flags; set to DCACHE_UNHASHED in d_alloc()
spinlock_t d_lock; // spinlock protecting this structure
struct inode *d_inode; // the associated inode; usually attached by
// d_add(dentry, inode) in dir->i_op->lookup()
/*
 * The next three fields are touched by __d_lookup. Place them here
 * so they all fit in a cache line.
 */
struct hlist_node d_hash; // links into the global dentry_hashtable; see __d_lookup();
// done through d_hash(), hashing d_parent and this qstr name
struct dentry *d_parent; // parent directory entry; set in d_alloc()
struct qstr d_name; // name and hash value; set in d_alloc()
struct list_head d_lru; // links into the global dentry_unused list of unused dentries;
// dentry_stat.nr_unused is the number of dentries on that list
/*
 * d_child and d_rcu can share memory
 */
union {
struct list_head d_child; // linked into d_parent->d_subdirs in d_alloc()
struct rcu_head d_rcu; // used when removing from dentry_hashtable so that a deleted object
// is not accessed; entries already linked into the hash are freed via call_rcu()
} d_u;
struct list_head d_subdirs; // see d_child
struct list_head d_alias; // several dentries can map to one inode (hard links); they are
// chained through this field on the same inode->i_dentry list
unsigned long d_time; /* used by d_revalidate */
struct dentry_operations *d_op; // usually set in dir->i_op->lookup()
struct super_block *d_sb; // set to d_parent->d_sb in d_alloc()
void *d_fsdata; /* fs-specific data */
#ifdef CONFIG_PROFILING
struct dcookie_struct *d_cookie; /* cookie, if any */
#endif
int d_mounted; // incremented each time a mount is placed on this dentry
unsigned char d_iname[DNAME_INLINE_LEN_MIN]; // inline name; d_name.name points to it
};

3. struct inode {

struct hlist_node i_hash; // hash computed from i_sb and i_ino; the hash table is inode_hashtable

struct list_head i_list;

struct list_head i_sb_list;

struct list_head i_dentry;

unsigned long i_ino;

atomic_t i_count; // set to 1 in alloc_inode()

unsigned int i_nlink;

uid_t i_uid;

gid_t i_gid;

dev_t i_rdev;

unsigned long i_version;

loff_t i_size;

#ifdef __NEED_I_SIZE_ORDERED

seqcount_t i_size_seqcount;

#endif

struct timespec i_atime;

struct timespec i_mtime;

struct timespec i_ctime;

unsigned int i_blkbits;

blkcnt_t i_blocks;

unsigned short i_bytes;

umode_t i_mode;

spinlock_t i_lock; /* i_blocks, i_bytes, maybe i_size */

struct mutex i_mutex;

struct rw_semaphore i_alloc_sem;

const struct inode_operations *i_op;

const struct file_operations *i_fop; /* former ->i_op->default_file_ops */

struct super_block *i_sb;

struct file_lock *i_flock;

struct address_space *i_mapping;

struct address_space i_data;

#ifdef CONFIG_QUOTA

struct dquot *i_dquot[MAXQUOTAS];

#endif

struct list_head i_devices;

union {

struct pipe_inode_info *i_pipe;

struct block_device *i_bdev;

struct cdev *i_cdev;

};

int i_cindex;

__u32 i_generation;

#ifdef CONFIG_DNOTIFY

unsigned long i_dnotify_mask; /* Directory notify events */

struct dnotify_struct *i_dnotify; /* for directory notifications */

#endif

#ifdef CONFIG_INOTIFY

struct list_head inotify_watches; /* watches on this inode */

struct mutex inotify_mutex; /* protects the watches list */

#endif

unsigned long i_state;

unsigned long dirtied_when; /* jiffies of first dirtying */

unsigned int i_flags;

atomic_t i_writecount;

#ifdef CONFIG_SECURITY

void *i_security;

#endif

void *i_private; /* fs or device private pointer */

};

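1.4.4.2 __d_lookup()

Located in fs/dcache.c. Prototype:

struct dentry * __d_lookup(struct dentry * parent, struct qstr * name);

parent is the parent directory to search in, and name is the qstr of the target: the target dentry's name and its hash. The hash is computed from the name with init_name_hash(), partial_name_hash() and end_name_hash(), although some filesystems compute it with dentry->d_op->d_hash(). Two hash values are involved: the first is d_name.hash; the second is computed by d_hash() from d_name.hash and d_parent and selects the chain in dentry_hashtable.

1) d_hash(parent, hash) yields the hlist_head in dentry_hashtable; rcu_read_lock() disables preemption.

2) hlist_for_each_entry_rcu walks every dentry on that chain and picks those whose d_name.hash and d_parent match the two arguments.

3) spin_lock(&dentry->d_lock);

4) With the lock held, compare d_parent again. A mismatch means someone may have called d_move() before the lock was taken; skip this entry and continue with the next dentry on the chain (the target will most likely not be found).

5) Then compare the names (a matching d_name.hash does not guarantee a matching name; the hash only speeds up the search by avoiding a string comparison for every dentry).

6) If all the conditions hold, the target dentry has been found. If the dentry has just been allocated, with a reference count of 1 and state DCACHE_UNHASHED (see d_alloc), it is returned as is; otherwise its d_count is incremented.

7) Once found, spin_unlock(&dentry->d_lock), rcu_read_unlock() (re-enabling preemption), leave the loop and return the dentry.

The function disables preemption and searches dentry_hashtable for the target, taking only the dentry->d_lock spinlock. d_lock avoids races with d_move(), and the function updates the dentry's reference count. RCU protects the global dentry_hashtable chains. The function can be regarded as lock-free; in the kernel, functions whose names begin with two underscores are generally the lock-free variants.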
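The two-level hashing and the "cheap checks first, string comparison last" order can be shown with a tiny single-threaded table. This is a sketch with several assumptions: the dh_ names are hypothetical, the name hash follows the shape of the 2.6-era partial_name_hash() helper as the author of this sketch remembers it, the bucket mix is purely illustrative, and all locking and RCU are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DH_BUCKETS 64u

struct dh_dentry {
    struct dh_dentry *d_parent;
    const char       *name;
    unsigned int      len;
    unsigned int      hash;   /* first hash: computed from the name only */
    struct dh_dentry *next;   /* chain inside one bucket */
};

struct dh_dentry *dh_table[DH_BUCKETS];

/* Name hash in the style of init/partial/end_name_hash(). */
unsigned int dh_name_hash(const char *name, unsigned int len)
{
    unsigned long h = 0;
    unsigned int i;

    for (i = 0; i < len; i++) {
        unsigned long c = (unsigned char)name[i];

        h = (h + (c << 4) + (c >> 4)) * 11;
    }
    return (unsigned int)h;
}

/* Second hash: mix the parent pointer in to pick a bucket (illustrative). */
unsigned int dh_bucket(const struct dh_dentry *parent, unsigned int hash)
{
    return (unsigned int)((((uintptr_t)parent) >> 4) ^ hash) % DH_BUCKETS;
}

/* Insert at the head of the chain, as d_rehash() does. */
void dh_rehash(struct dh_dentry *d)
{
    unsigned int b;

    d->hash = dh_name_hash(d->name, d->len);
    b = dh_bucket(d->d_parent, d->hash);
    d->next = dh_table[b];
    dh_table[b] = d;
}

struct dh_dentry *dh_lookup(struct dh_dentry *parent, const char *name,
                            unsigned int len)
{
    unsigned int hash = dh_name_hash(name, len);
    struct dh_dentry *d;

    for (d = dh_table[dh_bucket(parent, hash)]; d; d = d->next) {
        if (d->hash != hash || d->d_parent != parent)
            continue;               /* cheap integer and pointer checks first */
        if (d->len == len && memcmp(d->name, name, len) == 0)
            return d;               /* full name comparison last */
    }
    return NULL;
}
```

Because the parent pointer takes part in the bucket choice and in the match, two files with the same name in different directories never shadow each other.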

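1.4.4.3 d_lookup()

Located in fs/dcache.c. Prototype:

struct dentry * d_lookup(struct dentry * parent, struct qstr * name);

This function is a wrapper around __d_lookup(). It adds the rename_lock seqlock on the outside, so that the lookup is retried when __d_lookup() fails (for example because a concurrent d_move() caused the miss).

1.4.4.4 do_lookup()

Located in fs/namei.c. Prototype:

static int do_lookup(struct nameidata *nd, struct qstr *name, struct path *path);

Used by link_path_walk(): nd holds the parent directory, name the target, and path receives the result.

1) __d_lookup(nd->dentry, name) looks up the target dentry. If it is found, dentry->d_op->d_revalidate() is called (skipped if the method does not exist) to refresh some of the dentry's information, such as timestamps.

2) If __d_lookup() fails, the target dentry usually does not exist and has to be allocated with d_alloc(); real_lookup(nd->dentry, name, nd) is called to allocate it as needed.

3) real_lookup() first takes mutex_lock(&parent->d_inode->i_mutex) and then calls d_lookup() to search the hash table again, in case someone else allocated the dentry while we waited for the mutex. If it is found, it is returned.

4) If that lookup fails too, a new dentry new_dentry is allocated with d_alloc(parent, name), followed by parent->d_inode->i_op->lookup(parent->d_inode, new_dentry, nd). If i_op->lookup() returns NULL, new_dentry is used; otherwise a dentry was found, so dput(new_dentry) is called and the dentry returned by i_op->lookup() is used (the caller checks whether its inode is NULL). After a d_revalidate() call like the one in 1), mutex_unlock(&parent->d_inode->i_mutex) is done and real_lookup() returns.

5) Finally the dentry and mnt that were found are assigned to path, and __follow_mount(path) is applied, similar to step 2) of section 1.4.2.1.

The difference between do_lookup() and d_lookup(): d_lookup() only searches dentry_hashtable, whereas do_lookup() additionally allocates a new dentry and calls i_op->lookup() to set up the inode of the dentry. Attached: the ramfs implementation of i_op->lookup, which only the inode->i_op of a directory has (listing not preserved).

A dentry whose d_inode is NULL is called negative (there is no file on disk, or it is in use during link_path_walk()). Such a dentry can still stay in the dentry cache and serve later lookups.

dentry cache:

a) A dentry allocated by d_alloc() can be in one of three states, in-use, unused or negative:

in-use: dentry->d_alias is linked into the inode->i_dentry list;

unused: dentry->d_lru is linked into the global dentry_unused list, used for reclaiming dentries;

negative: dentry->d_inode is NULL; the inode was released through d_op->d_iput() or iput().

b) In dentry_hashtable, dentries are linked through dentry->d_hash (struct hlist_node); dentries in all three states must be in this hash table.

In general, a failed d_lookup() means the target is not in the dentry cache. The dentry cache is protected by the dcache_lock spinlock, which also protects the d_u.d_child and d_subdirs fields.

1.4.4.5 lookup_hash()

Located in fs/namei.c. Prototype:

static struct dentry *lookup_hash(struct nameidata *nd);

Similar to do_lookup() but without the __follow_mount() step. The caller must hold the i_mutex of the parent directory (dentry->d_inode->i_mutex) before calling it.

A dentry obtained from any of the four functions above must be released with dput() after use. lookup_hash() and do_lookup() search the hash first and, on a miss, allocate a new dentry and call .lookup().

1.4.4.6 d_add()

Located in include/linux/dcache.h, an inline function. Prototype:

static inline void d_add(struct dentry *entry, struct inode *inode);

Adds entry to the dentry hash table and sets up its association with inode.

1) d_instantiate(dentry, inode) links dentry->d_alias into inode->i_dentry and points dentry->d_inode at the inode; the dentry becomes in-use. The whole step runs under dcache_lock.

2) d_rehash(dentry) links the dentry through dentry->d_hash into the global dentry_hashtable, at the head of the corresponding hlist_head chain with hlist_add_head_rcu(), and clears DCACHE_UNHASHED in dentry->d_flags (see d_alloc). The whole step runs under dcache_lock.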

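1.4.4.7 dput()

Located in fs/dcache.c. Prototype:

void dput(struct dentry *dentry);

The counterpart of dget(): it decrements dentry->d_count, but it is far more complex than dget(), which simply increments d_count.

1) If d_count is 1, give the scheduler a chance to run (might_sleep(), which can end up in cond_resched() depending on the preemption configuration).

2) Decrement d_count with atomic_dec_and_lock(&dentry->d_count, &dcache_lock). If the result is non-zero, return; otherwise the dcache_lock spinlock is now held.

3) Take spin_lock(&dentry->d_lock). If someone called dget() in the meantime, so that d_count is non-zero, unlock and return.

4) At this point d_count has dropped to 0. If dentry->d_op->d_delete() is implemented, the filesystem decides whether the dentry is deleted.

a) d_op->d_delete() returns 1: the dentry is to be deleted.

If the dentry is in dentry_hashtable, remove it and set DCACHE_UNHASHED in dentry->d_flags; this is done by __d_drop(dentry).

kill_it: if the dentry is on the unused list, unlink dentry->d_lru.

Unlink dentry->d_u.d_child. From here on nobody can reach this dentry any more.

dentry_iput(dentry) releases its inode; this function also drops dentry->d_lock and dcache_lock.

Save dentry->d_parent in the local variable parent for the recursive release.

d_free(dentry) frees the dentry. If it is in the hash, call_rcu(&dentry->d_u.d_rcu, d_callback) is used; otherwise __d_free(dentry) frees an external d_name.name and the dentry memory directly. d_op->d_release() is called as well.

If dentry equals parent, the root has been reached: stop.

Assign parent to dentry and jump back to step 1); this is how dput() walks up the parent chain.

b) d_op->d_delete() returns 0 or does not exist: the dentry is not deleted but placed on dentry_unused to serve as dentry cache.

If the dentry is not in the hash table (DCACHE_UNHASHED is set), go to kill_it in a).

If the dentry is in the hash table, set DCACHE_REFERENCED in dentry->d_flags (the flag lets the dentry survive one reclaim pass; see prune_dcache()), link d_lru into the global dentry_unused list, drop dentry->d_lock and dcache_lock, and return. Although d_count is 0, the dentry is not freed and the parent dentry is not processed recursively. That happens later under memory pressure, when shrink_dcache_memory() calls prune_one_dentry(dentry), which in turn calls dput(dentry->d_parent) and performs a reclaim similar to a).

In other words, a dentry moves to the dentry_unused list only when its d_count is 0 and it is still in the hash table.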
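The decision tree of dput() can be condensed into a short loop. The following sketch makes several simplifying assumptions: the dp_ names and the flag fields are hypothetical, want_delete stands in for a d_delete() method that returns 1, freeing is recorded in a flag instead of releasing memory, and there is no locking.

```c
#include <assert.h>
#include <stddef.h>

struct dp_dentry {
    int               d_count;
    int               hashed;       /* 0 plays the role of DCACHE_UNHASHED */
    int               on_lru;       /* 1 when parked on the "unused" list */
    int               freed;        /* set instead of actually freeing */
    int               want_delete;  /* stands in for d_delete() returning 1 */
    struct dp_dentry *d_parent;     /* a root points to itself */
};

int dp_nr_unused;                   /* counterpart of dentry_stat.nr_unused */

void dp_dput(struct dp_dentry *d)
{
    while (d) {
        struct dp_dentry *parent;

        if (--d->d_count > 0)
            return;                 /* still referenced elsewhere */

        if (d->want_delete)
            d->hashed = 0;          /* the __d_drop() step */

        if (d->hashed) {            /* case b): keep it as dentry cache */
            if (!d->on_lru) {
                d->on_lru = 1;
                dp_nr_unused++;
            }
            return;
        }

        /* kill_it: unhashed and unreferenced, so really release it */
        if (d->on_lru) {
            d->on_lru = 0;
            dp_nr_unused--;
        }
        d->freed = 1;
        parent = d->d_parent;
        d = (parent == d) ? NULL : parent;  /* drop the child's ref on parent */
    }
}
```

Putting the last reference of an unhashed file frees it and then drops one reference on its directory; if the directory is hashed and now unreferenced, it is parked on the unused list instead of being freed, and the walk stops there.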

1.4.4.8 mntput()

Located in include/linux/mount.h. Prototype:

static inline void mntput(struct vfsmount *mnt);

Implementation:

if (mnt) {

mnt->mnt_expiry_mark = 0;

mntput_no_expire(mnt);

}

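As shown above, the release is done by mntput_no_expire(mnt):

1) atomic_dec_and_lock(&mnt->mnt_count, &vfsmount_lock) decrements mnt_count.

2) If mnt_count reaches 0, the lock is taken and 1 is returned, and the real release begins.

3) After mnt_pinned has been handled (its purpose is not yet clear to the author), __mntput(mnt) is called.

__mntput(mnt):

1) dput(mnt->mnt_root) releases the root directory of this filesystem.

2) free_vfsmnt(mnt) frees mnt->mnt_devname and the mnt itself.

3) deactivate_super(mnt->mnt_sb) releases the superblock and all dentry and inode structures that belong to it.

deactivate_super(sb):

1) If atomic_dec_and_lock(&sb->s_active, &sb_lock) succeeds, the superblock and its related structures must be released.

2) sb->s_count -= S_BIAS-1: the count was initialised to S_BIAS, so it becomes 1 here.

3) down_write(&s->s_umount);

4) Call fs->kill_sb(s) to write back and free the dentries and inodes; fs->kill_sb(s) does up_write(&s->s_umount).

5) put_filesystem(sb->s_type);

6) put_super(sb) takes spin_lock(&sb_lock) and calls __put_super(sb);

7) __put_super(sb): decrement sb->s_count; when it reaches 0, kfree(sb);

8) spin_unlock(&sb_lock), done.

Now the fs->kill_sb() of rootfs, which is kill_litter_super(sb):

1) d_genocide(sb->s_root) "exterminates" all dentries hanging off the s_root tree. In practice it only drops the d_count references of every dentry in the tree except the root, in preparation for the release that follows.

2) kill_anon_super(sb) -> generic_shutdown_super(sb) operates on the related dentry and inode structures.

generic_shutdown_super(sb):

1) shrink_dcache_for_umount(sb): drops the remaining references on the root directory. After decrementing the count of sb->s_root it sets sb->s_root to NULL; shrink_dcache_for_umount_subtree() then walks the whole tree and, leaf by leaf, calls __d_drop() to unhash, breaks the parent/child links and calls d_free() to free every dentry (including the root), calling iput() on each associated inode.

2) fsync_super(sb): calls sync_inodes_sb(sb, 0), sync_inodes_sb(sb, 1) and sync_blockdev(sb->s_bdev) to write back dirty pages.

3) Clear MS_ACTIVE in s_flags.

4) invalidate_inodes(sb): discards all inodes of this filesystem.

5) Unlink sb->s_list and sb->s_instances.

6) up_write(&sb->s_umount) releases the semaphore taken before fs->kill_sb() was called.

Steps 2) and 4) are complicated, because inodes themselves are complicated; they will be studied separately.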

2 Implementation of sys_umount()

Two parameters:

the mount point directory name name and the umount flags flags. For example:

sys_umount( (char __user *)"/mnt/c/", MNT_FORCE);

Execution flow:

1) __user_walk(name, LOOKUP_FOLLOW, &nd), which ends up in do_path_lookup(AT_FDCWD, name_page, LOOKUP_FOLLOW, nd);

2) Check that the argument is a mount point: nd.dentry != nd.mnt->mnt_root means it is not, and -EINVAL is returned;

3) check_mnt(nd.mnt) checks that mnt->mnt_ns equals current->nsproxy->mnt_ns;

4) do_umount(nd.mnt, flags) performs the actual unmount;

5) dput(nd.dentry) and mntput_no_expire(nd.mnt), done.

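2.1 do_umount()

Prototype: static int do_umount(struct vfsmount *mnt, int flags);

1) If MNT_FORCE is set, call s_op->umount_begin().

2) If mnt is the current fs->rootmnt and flags does not contain MNT_DETACH, do_remount_sb() sets sb->s_flags to read-only and calls s_op->remount_fs() if it exists.

3) After down_write(&namespace_sem) and spin_lock(&vfsmount_lock), and once the condition check passes, umount_tree(mnt, 1, &umount_list) performs the actual unmount. umount_tree() is able to handle child filesystems mounted on this one, but when the condition check has passed there are none. The check is: flags & MNT_DETACH || !propagate_mount_busy(mnt, 2). MNT_DETACH means the mount is only detached from the tree, which is typically used for initrd handling. The other condition requires the reference count not to exceed 2: the mnt gets a count of 1 when it is allocated and do_path_lookup() here adds another 1, so the condition means that there are no child mounts and no open files.

4) Unlock: spin_unlock(&vfsmount_lock) and up_write(&namespace_sem).

5) release_mounts(&umount_list) releases the mount point dentry and the mnt of this filesystem.

Steps 1) and 2) each run under lock_kernel()/unlock_kernel(). The actual release is done through mntput(): every open of a file does dget(dentry) and mntget(mnt), and sys_close() does the matching puts, so the release really happens in mntput() when the last file of the filesystem is closed.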

3 pxlramfs in practice

A simple pxlramfs modelled on ramfs (git can be used to track the implementation feature by feature):

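/* NOTE: the header names and the end of the BLOCKSIZE expression were lost
 * when the page was extracted; the lines below are an assumption based on
 * what the code uses, not the author's original text. */
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/pagemap.h>
#include <linux/backing-dev.h>
#define PXLRAMFS_BLOCKBIT PAGE_CACHE_SHIFT
#define PXLRAMFS_BLOCKSIZE (1 << PXLRAMFS_BLOCKBIT)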
#define PXLRAMFS_MAXFILESIZE 20
#define PXLRAMFS_MAGIC 0x12345678
#define PXLRAMFS_TIMEGRAN 1000000000 /* 1 second */
struct inode *pxlramfs_get_inode(struct super_block *sb, int mode, dev_t dev);
static const struct super_operations pxlramfs_sops = {
};
const struct address_space_operations pxlramfs_aops = {
.readpage = simple_readpage,
.prepare_write = simple_prepare_write,
.commit_write = simple_commit_write,
.set_page_dirty = __set_page_dirty_no_writeback,
};
static struct backing_dev_info pxlramfs_backing_dev_info = {
.ra_pages = 0,
//.capabilities = BDI_CAP_MAP_DIRECT | BDI_CAP_READ_MAP | BDI_CAP_WRITE_MAP,
};
/* having this method means delete this dentry when calling dput(), instead of adding to dentry_unused dentry cache */
int pxlramfs_d_delete(struct dentry *dentry)
{
return 1;
}
static struct dentry *pxlramfs_lookup(struct inode *dir, struct dentry *newdentry, struct nameidata *nd)
{
static struct dentry_operations pxlramfs_d_ops = {
.d_delete = pxlramfs_d_delete,
};
newdentry->d_op = &pxlramfs_d_ops;
/* ramfs don't need .lookup, call this means the file doesn't exist; so using the NULL inode */
/* must use d_add() to add it to dentry_hashtable if has .d_delete() method */
d_add(newdentry, NULL);
/* NULL means use newdentry allocated by d_alloc() before call this method. plz refer to real_lookup() */
return NULL;
}
static int pxlramfs_mknod(struct inode *parent_inode, struct dentry *child_newdentry, int mode, dev_t dev)
{
struct inode * inode = pxlramfs_get_inode(parent_inode->i_sb, mode, dev);
int error = -ENOSPC;
if (inode) {
if (parent_inode->i_mode & S_ISGID) {
inode->i_gid = parent_inode->i_gid;
if (S_ISDIR(mode))
inode->i_mode |= S_ISGID;
}
/* child_newdentry has already been added to dentry_hashtable in parent_inode->i_op->lookup() in lookup_hash() */
d_instantiate(child_newdentry, inode);
/* extra count */
dget(child_newdentry);
error = 0;
parent_inode->i_mtime = parent_inode->i_ctime = CURRENT_TIME;
}
return error;
}
int pxlramfs_mkdir(struct inode *parent_inode, struct dentry *child_newdentry, int child_newinode_mode)
{
int retval = pxlramfs_mknod(parent_inode, child_newdentry, child_newinode_mode | S_IFDIR, 0);
if (!retval)
inc_nlink(parent_inode);
return retval;
}
int pxlramfs_rmdir(struct inode *parent_inode, struct dentry *child_dentry)
{
struct dentry *tmp;
struct inode *child_inode = child_dentry->d_inode;
/* NOTE: dcache_lock protects d_subdirs and d_u.d_child as well */
spin_lock(&dcache_lock);
list_for_each_entry(tmp, &child_dentry->d_subdirs, d_u.d_child) {
if (tmp->d_inode && !d_unhashed(tmp)) {
spin_unlock(&dcache_lock);
return -ENOTEMPTY;
}
}
spin_unlock(&dcache_lock);
/* directory's i_nlink equals 2 when initialized */
child_inode->i_nlink--;
child_inode->i_nlink--;
/* see inc_nlink() in pxlramfs_mkdir() */
parent_inode->i_nlink--;
parent_inode->i_mtime = CURRENT_TIME;
/* before calling this, pxlramfs_mknod() has extra dget(child_newdentry) when allocating, so we should call extra dput(child_dentry) here */
/* dput() will also put child_dentry->d_parent in recursive and list_del(child_dentry->d_u.d_child, so we need do nothing similar here. */
dput(child_dentry);
return 0;
}
int pxlramfs_create(struct inode *parent_inode, struct dentry *child_newdentry, int child_newinode_mode, struct nameidata *nd)
{
return pxlramfs_mknod(parent_inode, child_newdentry, child_newinode_mode | S_IFREG, 0);
}
/* .unlink() implements such as shell rm operations */
int pxlramfs_unlink(struct inode *parent_inode, struct dentry *child_dentry)
{
child_dentry->d_inode->i_nlink--;
parent_inode->i_mtime = CURRENT_TIME;
/* see pxlramfs_rmdir() about this dput() */
dput(child_dentry);
return 0;
}
int pxlramfs_link(struct dentry *old_dentry, struct inode *new_parent_inode, struct dentry *new_child_dentry)
{
new_parent_inode->i_mtime = CURRENT_TIME;
atomic_inc(&old_dentry->d_inode->i_count);
old_dentry->d_inode->i_nlink++;
d_instantiate(new_child_dentry, old_dentry->d_inode);
/* extra count as in pxlramfs_mknod() */
dget(new_child_dentry);
return 0;
}
static const struct inode_operations pxlramfs_dir_inode_operations = {
.lookup = pxlramfs_lookup, /* directory inode must have .lookup() method used for path_lookup, i.e. cd, ls, mount, etc */
.mkdir = pxlramfs_mkdir,
.rmdir = pxlramfs_rmdir,
.mknod = pxlramfs_mknod,
.create = pxlramfs_create,
.unlink = pxlramfs_unlink,
.link = pxlramfs_link,
};
static const struct inode_operations pxlramfs_file_inode_operations = {
};
int pxlramfs_diropen(struct inode *inode, struct file *file)
{
/* we must allocate a cursor to match how shell commands read directories,
* e.g. ls calls readdir repeatedly until nothing more is returned,
* so we must add the cursor dentry code, otherwise it will loop forever
*/
static struct qstr cursor_qstr = {.name = ".", .len = 1};
file->private_data = d_alloc(file->f_path.dentry, &cursor_qstr);
return file->private_data? 0: -ENOMEM;
}
int pxlramfs_dirrelease(struct inode *inode, struct file *file)
{
dput(file->private_data);
return 0;
}
static inline unsigned char dt_type(struct inode *inode)
{
return (inode->i_mode >> 12) & 15;
}
int pxlramfs_readdir(struct file *file, void *buf, filldir_t fill_func)
{
struct dentry *dentry = file->f_path.dentry;
struct inode *inode = dentry->d_inode;
struct list_head *q = &(((struct dentry *)(file->private_data))->d_u.d_child);
struct list_head *p;
ino_t ino;
switch (file->f_pos) {
case 0:
ino = inode->i_ino;
if (fill_func(buf, ".", 1, file->f_pos++, ino, DT_DIR) < 0) {
file->f_pos--;
return 0;
}
case 1:
spin_lock(&dentry->d_parent->d_lock);
ino = dentry->d_parent->d_inode->i_ino;
spin_unlock(&dentry->d_parent->d_lock);
if (fill_func(buf, "..", 2, file->f_pos++, ino, DT_DIR) < 0) {
file->f_pos--;
return 0;
}
default:
/* NOTE: dcache_lock protects d_subdirs and d_u.d_child as well */
spin_lock(&dcache_lock);
/* the first time here we just move the cursor to the front, in case new dentries were created between pxlramfs_diropen() and this point */
if (file->f_pos == 2) {
list_move(q, &dentry->d_subdirs);
}
for (p = q->next; p != &dentry->d_subdirs;) {
struct dentry *result;
result = list_entry(p, struct dentry, d_u.d_child);
if (d_unhashed(result) || !result->d_inode) {
/* advance before skipping, otherwise the loop never terminates */
p = p->next;
continue;
}
/* fill_func may block, so unlock dcache_lock here */
spin_unlock(&dcache_lock);
ino = result->d_inode->i_ino;
if (fill_func(buf, result->d_name.name, result->d_name.len, file->f_pos++, ino, dt_type(result->d_inode)) < 0) {
file->f_pos--;
return 0;
}
spin_lock(&dcache_lock);
/* update cursor */
list_move(q, p);
p = q->next;
}
spin_unlock(&dcache_lock);
}
return 0;
}
static const struct file_operations pxlramfs_dir_operations = {
.open = pxlramfs_diropen,
.release = pxlramfs_dirrelease,
.readdir = pxlramfs_readdir,
};
static const struct file_operations pxlramfs_file_operations = {
.read = do_sync_read,
.aio_read = generic_file_aio_read,
.write = do_sync_write,
.aio_write = generic_file_aio_write,
.mmap = generic_file_mmap,
.fsync = simple_sync_file,
.sendfile = generic_file_sendfile,
.llseek = generic_file_llseek,
};
struct inode *pxlramfs_get_inode(struct super_block *sb, int mode, dev_t dev)
{
struct inode * inode = new_inode(sb);
if (inode) {
inode->i_uid = current->fsuid;
inode->i_gid = current->fsgid;
inode->i_mode = mode;
inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
inode->i_mapping->a_ops = &pxlramfs_aops;
inode->i_mapping->backing_dev_info = &pxlramfs_backing_dev_info;
switch(mode & S_IFMT) {
case S_IFDIR:
inode->i_op = &pxlramfs_dir_inode_operations;
inode->i_fop = &pxlramfs_dir_operations;
/* inc extra for directory */
inc_nlink(inode);
break;
case S_IFREG:
inode->i_op = &pxlramfs_file_inode_operations;
inode->i_fop = &pxlramfs_file_operations;
break;
case S_IFLNK:
break;
default:
init_special_inode(inode, mode, dev);
break;
}
}
return inode;
}
static int pxlramfs_fill_super(struct super_block *sb)
{
struct inode *inode;
struct dentry *dentry;
struct qstr name = {.name = "pxlramfs_root", .len = 13};
sb->s_blocksize = PXLRAMFS_BLOCKSIZE;
sb->s_blocksize_bits = PXLRAMFS_BLOCKBIT;
sb->s_maxbytes = PXLRAMFS_MAXFILESIZE;
sb->s_op = &pxlramfs_sops;
sb->s_magic = PXLRAMFS_MAGIC;
sb->s_time_gran = PXLRAMFS_TIMEGRAN;
inode = pxlramfs_get_inode(sb, S_IFDIR | S_IRWXUGO, 0);
if (inode == NULL) {
return -ENOMEM;
}
dentry = d_alloc(NULL, &name);
if (dentry == NULL) {
iput(inode);
return -ENOMEM;
}
/* must set d_parent before calling d_instantiate() or d_add() */
dentry->d_parent = dentry;
dentry->d_sb = sb;
/* link dentry with inode */
/* have no hash, so don't use d_add() */
d_instantiate(dentry, inode);
/* we get our root */
sb->s_root = dentry;
return 0;
}
static int pxlramfs_get_sb(struct file_system_type *fs_type, int flags,
const char *dev_name, void *data, struct vfsmount *mnt)
{
struct super_block *sb;
int err;
sb = sget(fs_type, NULL, set_anon_super, NULL); /* allocate sb and init s_dev, s_type, s_id[], s_list, s_instances */
if (IS_ERR(sb)) {
return PTR_ERR(sb);
}
sb->s_flags = flags;
/* init sb structure, including related inode and s_root */
err = pxlramfs_fill_super(sb);
if (err) {
/* this function should only take the s_umount semaphore on success;
* if success, it will be released in function vfs_kern_mount()
*/
up_write(&sb->s_umount);
deactivate_super(sb);
return err;
}
sb->s_flags |= MS_ACTIVE; /* active now */
mnt->mnt_sb = sb;
mnt->mnt_root = dget(sb->s_root); /* note: using dget() */
/* mnt->mnt_parent, mnt->mnt_mountpoint will be set in vfs_kern_mount() */
return 0;
}
static struct file_system_type pxlramfs_fs_type = {
.name = "pxlramfs",
.get_sb = pxlramfs_get_sb,
.kill_sb = kill_litter_super,
};
static int __init init_pxlramfs(void)
{
return register_filesystem(&pxlramfs_fs_type);
}
static void __exit exit_pxlramfs(void)
{
unregister_filesystem(&pxlramfs_fs_type);
}
module_init(init_pxlramfs)
module_exit(exit_pxlramfs)
MODULE_LICENSE("GPL");

4 Implementation of the ramfs file f_op->read()

This section studies the implementation details of do_sync_read(), which is the .read method in the ramfs file operations structure.

Step 1:

Prototype: ssize_t do_sync_read(struct file *filp, char __user *buf, size_t len, loff_t *ppos);

1) Initialise a struct iovec iov on the stack:

iov.iov_base = buf;
iov.iov_len = len;

2) Initialise a struct kiocb kiocb on the stack:

kiocb.ki_filp = filp;
kiocb.ki_pos = *ppos;
kiocb.ki_left = len;
kiocb.ki_obj.tsk = current;
kiocb.ki_users = 1;
kiocb.ki_flags = 0;

3) Call filp->f_op->aio_read(&kiocb, &iov, 1, kiocb.ki_pos);

4) Wait for aio_read() to complete: wait_on_sync_kiocb(&kiocb) checks whether kiocb.ki_flags has KIF_KICKED set.

Step 2:

In this context f_op->aio_read() is generic_file_aio_read(). Prototype:

ssize_t generic_file_aio_read(struct kiocb *iocb, const struct iovec *iov,

unsigned long nr_segs, loff_t pos);

1) Validate the user-space areas that the iov vector points to, including that they are writable:

access_ok(VERIFY_WRITE, iv->iov_base, iv->iov_len);

2) If filp->f_flags contains O_DIRECT, read through the direct I/O path, generic_file_direct_IO();

3) For reads, a read_descriptor_t and a read_actor_t function perform the actual copy into the iov:

Read descriptor:

read_descriptor_t desc;
desc.written = 0; // after the read completes, holds the number of bytes successfully copied to user space
desc.arg.buf = iov[0].iov_base;
desc.count = iov[0].iov_len; // number of bytes to read
desc.error = 0;

The read_actor_t function is file_read_actor(); it updates all members of the read descriptor desc.

4) Call do_generic_file_read(filp, &iocb->ki_pos, &desc, file_read_actor). This is an inline function that amounts to do_generic_mapping_read(filp->f_mapping, &filp->f_ra, filp, &iocb->ki_pos, &desc, file_read_actor).

Step 3:

Prototype of do_generic_mapping_read:

void do_generic_mapping_read(struct address_space *mapping,

struct file_ra_state *_ra,

struct file *filp,

loff_t *ppos,

read_descriptor_t *desc,

read_actor_t actor);

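1) Call page_cache_readahead() to read pages ahead. ramfs disables readahead, so in practice this only allocates the page cache page. The readahead algorithm is important and complex; file readahead will be studied in a separate document.

2) Call find_get_page(mapping, index) to find the page allocated above in the page cache;

3) lock_page(page);

4) Call mapping->a_ops->readpage(filp, page) to read the content. In ramfs this zeroes the page data and then sets PG_uptodate. The method must call unlock_page(page) once the page has been read. The ramfs a_ops method set:

const struct address_space_operations ramfs_aops = {
.readpage = simple_readpage,
.prepare_write = simple_prepare_write,
.commit_write = simple_commit_write,
.set_page_dirty = __set_page_dirty_no_writeback,
};

5) Call file_read_actor to copy the data in the page to the user buffer that iov points to, and update desc.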
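The descriptor-plus-actor pattern of steps 2 and 3 can be modelled without any kernel types. The sketch below is an illustration under stated assumptions: the rd_ names are hypothetical, the "page cache" is just a flat in-memory buffer cut into 16-byte pages, and readahead, page locking and error handling are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RD_PAGE_SIZE 16u   /* tiny "page" so the example is easy to follow */

struct rd_desc {           /* loosely follows read_descriptor_t */
    size_t written;        /* bytes delivered so far */
    size_t count;          /* bytes still wanted */
    char  *buf;            /* destination cursor */
    int    error;
};

/* Actor: copy up to `size` bytes from one page into the destination. */
size_t rd_actor(struct rd_desc *desc, const char *page, size_t offset,
                size_t size)
{
    if (size > desc->count)
        size = desc->count;
    memcpy(desc->buf, page + offset, size);
    desc->written += size;
    desc->count   -= size;
    desc->buf     += size;
    return size;
}

/* Walk the "page cache" page by page, starting at *ppos. */
void rd_mapping_read(const char *pages, size_t file_size, size_t *ppos,
                     struct rd_desc *desc)
{
    while (desc->count && *ppos < file_size) {
        size_t index  = *ppos / RD_PAGE_SIZE;
        size_t offset = *ppos % RD_PAGE_SIZE;
        size_t avail  = RD_PAGE_SIZE - offset;

        if (avail > file_size - *ppos)
            avail = file_size - *ppos;     /* do not read past the file size */
        *ppos += rd_actor(desc, pages + index * RD_PAGE_SIZE, offset, avail);
    }
}
```

Asking for 10 bytes at offset 12 of a 20-byte file takes 4 bytes from the first page and 4 from the second, then stops at the end of the file with desc.written at 8 and desc.count at 2, which is the short-read behaviour a caller of read() observes.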
