An Analysis of How Linux Creates a Process - automaton - ChinaUnix Blog

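Overview

An earlier post, "How Linux System Calls Work (Part 1)", analyzed how a system call is carried out. Using the system calls that create processes as an example, this article walks through how Linux creates a new process.

How a process is created

During a system call the CPU traps from user mode into kernel mode, the kernel runs the corresponding system call service routine, and when that finishes, control returns from the interrupt handler to user mode. In Linux, the fork, vfork and clone system calls can each create a new process. In kernel mode, the service routines of all three call do_fork() to create the process. The main call chain is:
do_fork() -> copy_process() -> dup_task_struct().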
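
Seen from user space, two of these entry points can be exercised with a few lines of C. The sketch below is only an illustration; the function names run_child_with_fork(), run_child_with_vfork() and reap() are made up for this example. Each function creates a child that exits immediately with a given code, and the parent collects that code with waitpid().

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wait for `pid` and return its exit status, or -1 on any failure. */
static int reap(pid_t pid)
{
    int status;

    if (pid < 0 || waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Create a child with fork(); the child exits at once with `code`. */
int run_child_with_fork(int code)
{
    pid_t pid = fork();

    if (pid == 0)
        _exit(code);        /* child */
    return reap(pid);       /* parent, or fork() failure (pid < 0) */
}

/*
 * Create a child with vfork(). The parent is suspended until the child
 * calls _exit() or exec, so the child does nothing except _exit().
 */
int run_child_with_vfork(int code)
{
    pid_t pid = vfork();

    if (pid == 0)
        _exit(code);        /* child */
    return reap(pid);       /* parent, or vfork() failure (pid < 0) */
}
```

Calling run_child_with_fork(3) should hand back 3 in the parent, and likewise for the vfork variant. Both paths end up in the same kernel code described in the rest of this article.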

All three functions are defined in kernel/fork.c in the kernel source tree. do_fork() is defined as follows:

long do_fork(unsigned long clone_flags,
             unsigned long stack_start,
             unsigned long stack_size,
             int __user *parent_tidptr,
             int __user *child_tidptr)
{
    struct task_struct *p;
    int trace = 0;
    long nr;

    /*
     * Determine whether and which event to report to ptracer. When
     * called from kernel_thread or CLONE_UNTRACED is explicitly
     * requested, no event is reported; otherwise, report if the event
     * for the type of forking is enabled.
     */
    if (!(clone_flags & CLONE_UNTRACED)) {
        if (clone_flags & CLONE_VFORK)
            trace = PTRACE_EVENT_VFORK;
        else if ((clone_flags & CSIGNAL) != SIGCHLD)
            trace = PTRACE_EVENT_CLONE;
        else
            trace = PTRACE_EVENT_FORK;

        if (likely(!ptrace_event_enabled(current, trace)))
            trace = 0;
    }

    p = copy_process(clone_flags, stack_start, stack_size,
                     child_tidptr, NULL, trace);
    /*
     * Do this prior waking up the new thread - the thread pointer
     * might get invalid after that point, if the thread exits quickly.
     */
    if (!IS_ERR(p)) {
        struct completion vfork;
        struct pid *pid;

        trace_sched_process_fork(current, p);

        pid = get_task_pid(p, PIDTYPE_PID);
        nr = pid_vnr(pid);

        if (clone_flags & CLONE_PARENT_SETTID)
            put_user(nr, parent_tidptr);

        if (clone_flags & CLONE_VFORK) {
            p->vfork_done = &vfork;
            init_completion(&vfork);
            get_task_struct(p);
        }

        wake_up_new_task(p);

        /* forking complete and child started to run, tell ptracer */
        if (unlikely(trace))
            ptrace_event_pid(trace, pid);

        if (clone_flags & CLONE_VFORK) {
            if (!wait_for_vfork_done(p, &vfork))
                ptrace_event_pid(PTRACE_EVENT_VFORK_DONE, pid);
        }

        put_pid(pid);
    } else {
        nr = PTR_ERR(p);
    }
    return nr;
}

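The call to copy_process() creates the child's process descriptor, struct task_struct, by copying the relevant information from the parent, including register values and the necessary data structures.
The call to wake_up_new_task() places the new child on a CPU run queue, where the scheduler will pick it according to its priority; from this point the child is in the ready (runnable) state.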
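
The first block of do_fork(), which decides what to report to a ptrace tracer, can be tried out in user space. The following is a minimal sketch, not kernel code: the DEMO_ names are invented for this example, the flag values are copied from the Linux UAPI headers as I understand them, and the tracer_wants_event parameter is a stand-in for ptrace_event_enabled(current, trace).

```c
#include <signal.h>   /* SIGCHLD */

/* Flag values as in the Linux UAPI headers; DEMO_ prefix avoids clashes. */
#define DEMO_CSIGNAL        0x000000ffUL  /* mask holding the exit signal */
#define DEMO_CLONE_VM       0x00000100UL
#define DEMO_CLONE_VFORK    0x00004000UL
#define DEMO_CLONE_UNTRACED 0x00800000UL

/* Same numbering as PTRACE_EVENT_FORK, _VFORK and _CLONE. */
enum demo_event {
    DEMO_EVENT_NONE  = 0,
    DEMO_EVENT_FORK  = 1,
    DEMO_EVENT_VFORK = 2,
    DEMO_EVENT_CLONE = 3
};

/*
 * Mirror of the event selection at the top of do_fork().
 * tracer_wants_event is an assumption of this sketch: it replaces the
 * kernel's ptrace_event_enabled(current, trace) check.
 */
enum demo_event demo_pick_event(unsigned long clone_flags,
                                int tracer_wants_event)
{
    enum demo_event ev = DEMO_EVENT_NONE;

    if (!(clone_flags & DEMO_CLONE_UNTRACED)) {
        if (clone_flags & DEMO_CLONE_VFORK)
            ev = DEMO_EVENT_VFORK;
        else if ((clone_flags & DEMO_CSIGNAL) != SIGCHLD)
            ev = DEMO_EVENT_CLONE;
        else
            ev = DEMO_EVENT_FORK;

        if (!tracer_wants_event)
            ev = DEMO_EVENT_NONE;
    }
    return ev;
}
```

In kernels of this era, as far as I can tell, fork passes just SIGCHLD as clone_flags and vfork passes CLONE_VFORK | CLONE_VM | SIGCHLD, so demo_pick_event(SIGCHLD, 1) gives DEMO_EVENT_FORK and the vfork flags give DEMO_EVENT_VFORK. A clone call whose exit signal is not SIGCHLD, as when creating a thread, falls into the CLONE case.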

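copy_process() is a long function. The main functions it calls are:
dup_task_struct() creates the new process descriptor, task_struct;
copy_semundo() copies the parent's semaphore undo_list to the child;
copy_files() and copy_fs() copy the parent's filesystem-related environment to the child, such as the open-file table and the root and current working directories;
copy_sighand() and copy_signal() copy the parent's signal-handling environment to the child;
copy_mm() copies the parent's memory-management environment to the child, including the page tables, the address space, and the code and data;
copy_thread() sets up the child's execution context, such as the CPU register values the child will run with and the starting address of its kernel stack;
sched_fork() sets the child's scheduling parameters, such as the CPU it will run on, its scheduling class and its priority.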
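
The effect of two of these steps is visible from user space. The sketch below is an illustration under my own naming (fork_private_copy() is a made-up function): after fork(), the child works on its own copy of the parent's memory (the result of copy_mm() without CLONE_VM), while a pipe opened before the fork is usable in both processes because the descriptor table was copied (copy_files()).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * The child changes `value` and sends its copy through a pipe.
 * On success returns 0 and stores the parent's own value in *parent_view
 * and the value reported by the child in *child_view.
 */
int fork_private_copy(int *parent_view, int *child_view)
{
    int fds[2];
    int value = 1;
    int from_child = 0;
    int status;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: the write end was inherited through the copied fd table. */
        close(fds[0]);
        value = 2;   /* changes only the child's private copy */
        n = write(fds[1], &value, sizeof value);
        _exit(n == (ssize_t)sizeof value ? 0 : 1);
    }

    /* Parent: read what the child saw, then reap the child. */
    close(fds[1]);
    n = read(fds[0], &from_child, sizeof from_child);
    close(fds[0]);
    if (waitpid(pid, &status, 0) != pid || n != (ssize_t)sizeof from_child)
        return -1;

    *parent_view = value;       /* still 1: the child's write is not visible */
    *child_view = from_child;   /* 2: the child's private copy */
    return 0;
}
```
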
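The remarkable thing about the fork system call is that it is called once but returns twice, once in the parent and once in the child. The copy_thread() function gives a glimpse of how this works: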

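*childregs = *current_pt_regs(); // copy the parent's saved registers from its kernel stack
childregs->ax = 0; // this is why fork returns 0 in the child
p->thread.sp = (unsigned long) childregs; // kernel stack top used when the child is first scheduled
p->thread.ip = (unsigned long) ret_from_fork; // first instruction executed when the child is first scheduled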

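As the code shows, the ax value saved in the child's copy of the registers on its kernel stack is set to 0, which is why fork returns 0 in the child. The child's kernel stack pointer, thread.sp, is set to the top of its kernel stack (the saved register frame), and its kernel instruction pointer, thread.ip, is set to ret_from_fork, so that is the first code the child runs when it is scheduled. The saved user-mode registers, including the user-mode instruction pointer, are copied from the parent unchanged. Since parent and child share the read-only text segment, the child continues at the same user-space address as the parent once the system call returns.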
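
The register handling above can be modelled with ordinary structs. This is a simplified sketch, not the kernel's data layout: demo_pt_regs, demo_thread, demo_task and demo_ret_from_fork() are hypothetical stand-ins for pt_regs, thread_struct, task_struct and ret_from_fork.

```c
#include <stdint.h>

/* Saved user-mode registers (a tiny subset of the real pt_regs). */
struct demo_pt_regs {
    uintptr_t ax;   /* system call return value */
    uintptr_t ip;   /* user-mode instruction pointer */
    uintptr_t sp;   /* user-mode stack pointer */
};

/* Kernel-mode resume context, used when the task is switched in. */
struct demo_thread {
    uintptr_t sp;
    uintptr_t ip;
};

struct demo_task {
    struct demo_pt_regs regs;   /* frame at the top of the kernel stack */
    struct demo_thread thread;
};

/* Placeholder for the kernel's ret_from_fork entry point. */
void demo_ret_from_fork(void)
{
}

/* Model of the four copy_thread() lines quoted above. */
void demo_copy_thread(struct demo_task *child,
                      const struct demo_pt_regs *parent_regs)
{
    child->regs = *parent_regs;                    /* same user ip and sp */
    child->regs.ax = 0;                            /* child's fork() returns 0 */
    child->thread.sp = (uintptr_t)&child->regs;    /* kernel stack top */
    child->thread.ip = (uintptr_t)demo_ret_from_fork;
}
```

After demo_copy_thread() the child's saved user-mode ip and sp equal the parent's, only ax differs, and the kernel-mode resume address is demo_ret_from_fork. In the parent, ax is later filled with the child's PID on the normal system call return path.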

dup_task_struct() is defined as follows:

static struct task_struct *dup_task_struct(struct task_struct *orig)
{
    struct task_struct *tsk;
    struct thread_info *ti;
    int node = tsk_fork_get_node(orig);
    int err;

    tsk = alloc_task_struct_node(node);
    if (!tsk)
        return NULL;

    ti = alloc_thread_info_node(tsk, node);
    if (!ti)
        goto free_tsk;

    err = arch_dup_task_struct(tsk, orig);
    if (err)
        goto free_ti;

    tsk->stack = ti;
#ifdef CONFIG_SECCOMP
    /*
     * We must handle setting up seccomp filters once we're under
     * the sighand lock in case orig has changed between now and
     * then. Until then, filter must be NULL to avoid messing up
     * the usage counts on the error path calling free_task.
     */
    tsk->seccomp.filter = NULL;
#endif

    setup_thread_stack(tsk, orig);
    clear_user_return_notifier(tsk);
    clear_tsk_need_resched(tsk);
    set_task_stack_end_magic(tsk);

#ifdef CONFIG_CC_STACKPROTECTOR
    tsk->stack_canary = get_random_int();
#endif

    /*
     * One for us, one for whoever does the "release_task()" (usually
     * parent)
     */
    atomic_set(&tsk->usage, 2);
#ifdef CONFIG_BLK_DEV_IO_TRACE
    tsk->btrace_seq = 0;
#endif
    tsk->splice_pipe = NULL;
    tsk->task_frag.page = NULL;

    account_kernel_stack(ti, 1);

    return tsk;

free_ti:
    free_thread_info(ti);
free_tsk:
    free_task_struct(tsk);
    return NULL;
}

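alloc_task_struct_node() allocates memory for the new task_struct and returns it;
alloc_thread_info_node() allocates memory for the thread_info structure and returns it;
arch_dup_task_struct() copies the parent's task_struct into the child's;

the assignment tsk->stack = ti points the stack field of the child's task_struct at its thread_info, so the address of the kernel stack can be found from the process descriptor;

setup_thread_stack() sets the task pointer stored in the kernel stack area to point at the task_struct, so the process descriptor can be found from the kernel stack;
set_task_stack_end_magic() writes a magic number at the end of the stack, which is used to detect kernel stack overflow.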
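
The relationship between task_struct, thread_info and the kernel stack can be sketched in user-space C. Everything prefixed demo_ or DEMO_ below is invented for illustration, and the 8 KiB stack size is an assumption. The layout follows kernels of this era, where thread_info sits at the lowest address of the stack area and the stack grows down towards it; the magic value is the one the kernel defines as STACK_END_MAGIC.

```c
#include <stddef.h>

#define DEMO_THREAD_SIZE     8192u         /* assumed kernel stack size */
#define DEMO_STACK_END_MAGIC 0x57AC6E9DUL  /* kernel's STACK_END_MAGIC value */
#define DEMO_STACK_WORDS     (DEMO_THREAD_SIZE / sizeof(unsigned long))

struct demo_task;

struct demo_thread_info {
    struct demo_task *task;   /* back pointer to the process descriptor */
};

/* thread_info shares the low end of the memory used as the kernel stack. */
union demo_thread_union {
    struct demo_thread_info info;
    unsigned long stack[DEMO_STACK_WORDS];
};

struct demo_task {
    void *stack;              /* points at the thread_info / stack area */
};

/* Index of the first stack word above thread_info: the stack's end. */
static size_t demo_end_index(void)
{
    return (sizeof(struct demo_thread_info) + sizeof(unsigned long) - 1)
           / sizeof(unsigned long);
}

unsigned long *demo_end_of_stack(struct demo_task *tsk)
{
    union demo_thread_union *area = tsk->stack;

    return &area->stack[demo_end_index()];
}

/* Wire up the two pointers and plant the magic, as dup_task_struct() does. */
void demo_setup(struct demo_task *tsk, union demo_thread_union *area)
{
    tsk->stack = area;                               /* tsk->stack = ti */
    area->info.task = tsk;                           /* setup_thread_stack() */
    *demo_end_of_stack(tsk) = DEMO_STACK_END_MAGIC;  /* set_task_stack_end_magic() */
}

/* Simulate the stack growing down from the top by `words` words. */
void demo_use_stack(struct demo_task *tsk, size_t words)
{
    union demo_thread_union *area = tsk->stack;
    size_t usable = DEMO_STACK_WORDS - demo_end_index();
    size_t i;

    /* Clamped so that the simulation never tramples thread_info itself. */
    for (i = 0; i < words && i < usable; i++)
        area->stack[DEMO_STACK_WORDS - 1 - i] = 0xAAAAAAAAUL;
}

/* A damaged magic word means the stack reached its end. */
int demo_stack_overflowed(struct demo_task *tsk)
{
    return *demo_end_of_stack(tsk) != DEMO_STACK_END_MAGIC;
}
```

The magic word does not stop an overflow from happening. It only lets a later check notice that the deepest legal stack word has been overwritten, which is why the description above speaks of detecting overflow.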

References

"How Linux System Calls Work (Part 1)" http://blog.chinaunix.net/uid-30156195-id-4923751.html
Meng Ning, "Linux Kernel Analysis" lecture notes
