Linux boot flow (from the rest_init function in start_kernel to the init process, part 1) - shenxiaocheng - ChinaUnix blog

Linux boot flow (from the rest_init function in start_kernel to the init process, part 1)

Category: LINUX

2008-11-28 11:37:03

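In init/main.c there is a function called start_kernel, the main function that boots the kernel. I expect everyone knows it already. The very last thing it does is call rest_init(), and once that has done its work the kernel is up and running:

asmlinkage void __init start_kernel(void)
{
    ......
    /* Do the rest non-__init'ed, we're now alive */
    rest_init();
}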

Now let's look at rest_init(), which also lives in init/main.c. Its first few lines are:

static void noinline __init_refok rest_init(void) __releases(kernel_lock)
{
    int pid;

    kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);

kernel_thread is architecture-specific (the copy I followed is in arch/ia64/kernel/process.c) and starts a kernel thread. Here kernel_init is a pointer to the function the thread will run, NULL means no argument is passed to it, and CLONE_FS | CLONE_SIGHAND are the flags handed to do_fork when the thread is created: filesystem information is shared between the two tasks, and so are signal handlers and blocked signals. So off I trotted after kernel_init to see what it gets up to. Here is its complete code:

static int __init kernel_init(void * unused)
{
    lock_kernel();
    /*
     * init can run on any cpu.
     */
    set_cpus_allowed_ptr(current, CPU_MASK_ALL_PTR);
    /*
     * Tell the world that we're going to be the grim
     * reaper of innocent orphaned children.
     *
     * We don't want people to have to make incorrect
     * assumptions about where in the task array this
     * can be found.
     */
    init_pid_ns.child_reaper = current;

    cad_pid = task_pid(current);

    smp_prepare_cpus(setup_max_cpus);

    do_pre_smp_initcalls();

    smp_init();
    sched_init_smp();

    cpuset_init_smp();

    do_basic_setup();

    /*
     * check if there is an early userspace init. If yes, let it do all
     * the work
     */
    if (!ramdisk_execute_command)
        ramdisk_execute_command = "/init";

    if (sys_access((const char __user *) ramdisk_execute_command, 0) != 0) {
        ramdisk_execute_command = NULL;
        prepare_namespace();
    }

    /*
     * Ok, we have completed the initial bootup, and
     * we're essentially up and running. Get rid of the
     * initmem segments and start the user-mode stuff..
     */
    init_post();
    return 0;
}

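The first thing kernel_init does is call lock_kernel(). If CONFIG_LOCK_KERNEL was selected at build time this takes the big kernel lock, otherwise it does nothing. Right after that comes set_cpus_allowed_ptr. These calls do affect how the init process gets started, so let's go through them one by one rather than skip anything. Without CONFIG_SMP, set_cpus_allowed_ptr is the inline stub below from include/linux/sched.h (SMP kernels use a real implementation in kernel/sched.c instead):

static inline int set_cpus_allowed_ptr(struct task_struct *p,
                                       const cpumask_t *new_mask)
{
    if (!cpu_isset(0, *new_mask))
        return -EINVAL;
    return 0;
}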

This version does nothing more than use the cpu_isset macro, defined in include/linux/cpumask.h as follows:

#define cpu_isset(cpu, cpumask) test_bit((cpu), (cpumask).bits)

Next, the type of set_cpus_allowed_ptr's second parameter, also defined in include/linux/cpumask.h:

typedef struct { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;

Following the DECLARE_BITMAP macro leads to include/linux/types.h:

#define DECLARE_BITMAP(name,bits) \
    unsigned long name[BITS_TO_LONGS(bits)]

The BITS_TO_LONGS macro is defined in include/linux/bitops.h:

#define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))

DIV_ROUND_UP is defined in include/linux/kernel.h and BITS_PER_BYTE in include/linux/bitops.h:

#define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
#define BITS_PER_BYTE 8

So when NR_CPUS is between 1 and 32 (on a 32-bit machine, where sizeof(long) is 4; up to 64 where long is 8 bytes), the cpumask_t type is

struct {
    unsigned long bits[1];
}
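The arithmetic behind that array size can be checked in userspace. The following is only a sketch: the three macros are copied from the excerpts above, while cpumask_words is a hypothetical helper name introduced here for illustration and does not exist in the kernel.

```c
#include <stddef.h>

/* Userspace copies of the kernel macros quoted in the article. */
#define BITS_PER_BYTE 8
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))

/*
 * Hypothetical helper (not a kernel function): number of unsigned longs
 * that DECLARE_BITMAP(bits, nr_cpus) would reserve for nr_cpus CPUs.
 */
size_t cpumask_words(size_t nr_cpus)
{
    return BITS_TO_LONGS(nr_cpus);
}
```

Calling cpumask_words(1) or cpumask_words(32) gives 1 on both 32-bit and 64-bit builds; one more CPU than fits in a long pushes the result to 2.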

Now for the CPU_MASK_ALL_PTR macro in set_cpus_allowed_ptr(current, CPU_MASK_ALL_PTR);. It is defined in include/linux/cpumask.h:

#define CPU_MASK_ALL_PTR (&CPU_MASK_ALL)

The CPU_MASK_ALL macro is also defined in include/linux/cpumask.h:

#define CPU_MASK_ALL \
(cpumask_t) { { \
    [BITS_TO_LONGS(NR_CPUS)-1] = CPU_MASK_LAST_WORD \
} }

The NR_CPUS macro is defined in include/linux/threads.h:

#ifdef CONFIG_SMP
#define NR_CPUS CONFIG_NR_CPUS
#else
#define NR_CPUS 1
#endif

The CPU_MASK_LAST_WORD macro is defined in include/linux/cpumask.h:

#define CPU_MASK_LAST_WORD BITMAP_LAST_WORD_MASK(NR_CPUS)

The BITMAP_LAST_WORD_MASK macro is defined in include/linux/bitmap.h:

#define BITMAP_LAST_WORD_MASK(nbits) \
( \
    ((nbits) % BITS_PER_LONG) ? \
        (1UL<<((nbits) % BITS_PER_LONG))-1 : ~0UL \
)

When NR_CPUS is 1, CPU_MASK_LAST_WORD is 1 (binary 1).

When NR_CPUS is 2, CPU_MASK_LAST_WORD is 3 (binary 11).

When NR_CPUS is n (and n is not a multiple of BITS_PER_LONG), CPU_MASK_LAST_WORD is 2 to the power n, minus 1, i.e. the low n bits are all set. When n is a multiple of BITS_PER_LONG it is ~0UL.
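A quick way to see what BITMAP_LAST_WORD_MASK yields is to evaluate it in userspace. This is a minimal sketch: the macro body is the one quoted from include/linux/bitmap.h, BITS_PER_LONG is derived here from sizeof(long) as an assumption (the kernel defines it per architecture), and last_word_mask is a hypothetical wrapper name.

```c
#include <limits.h>

/* Assumption: derive BITS_PER_LONG from the host's long size. */
#define BITS_PER_LONG ((unsigned int)(sizeof(long) * CHAR_BIT))

/* Same expression as the kernel macro quoted in the article. */
#define BITMAP_LAST_WORD_MASK(nbits) \
    (((nbits) % BITS_PER_LONG) ? \
        (1UL << ((nbits) % BITS_PER_LONG)) - 1 : ~0UL)

/* Hypothetical wrapper: value of CPU_MASK_LAST_WORD for a given NR_CPUS. */
unsigned long last_word_mask(unsigned int nr_cpus)
{
    return BITMAP_LAST_WORD_MASK(nr_cpus);
}
```

For 1, 2 and 4 CPUs the wrapper returns 1, 3 and 15: one set bit per CPU, counted from bit 0. When nr_cpus fills a whole long, every bit is set.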

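Feeling a bit dizzy? Let's substitute the arguments: set_cpus_allowed_ptr(current, CPU_MASK_ALL_PTR)

-->cpu_isset(0, *CPU_MASK_ALL_PTR)-->test_bit(0, (*CPU_MASK_ALL_PTR).bits)

So when NR_CPUS is n (and fits in one word), CPU_MASK_ALL has the low n bits of unsigned long bits[0] set to 1, one bit per possible CPU. The non-SMP stub only checks that bit 0 (CPU 0) is in the mask and returns 0; it changes nothing. On an SMP kernel, as far as I can tell, the real function is what installs the mask as the task's allowed CPUs, which is presumably what the comment means when it says init can run on any CPU.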
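The whole substitution chain can be replayed in userspace. The sketch below makes several assumptions: NR_CPUS is fixed at 4 purely for illustration, test_bit is a simplified stand-in for the kernel's, and set_cpus_allowed_up and all_cpus_mask are hypothetical names modelling the non-SMP stub and CPU_MASK_ALL.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

#define NR_CPUS 4                                   /* assumed value */
#define BITS_PER_LONG (CHAR_BIT * sizeof(long))     /* host-derived */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_LONG)
#define BITMAP_LAST_WORD_MASK(nbits) \
    (((nbits) % BITS_PER_LONG) ? \
        (1UL << ((nbits) % BITS_PER_LONG)) - 1 : ~0UL)

typedef struct { unsigned long bits[BITS_TO_LONGS(NR_CPUS)]; } cpumask_t;

/* Simplified stand-in for the kernel's test_bit(). */
static int test_bit(unsigned int nr, const unsigned long *addr)
{
    return (int)((addr[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL);
}

#define cpu_isset(cpu, cpumask) test_bit((cpu), (cpumask).bits)

/* Hypothetical model of CPU_MASK_ALL: only the last word is initialised. */
cpumask_t all_cpus_mask(void)
{
    cpumask_t m = { { [BITS_TO_LONGS(NR_CPUS) - 1] =
                          BITMAP_LAST_WORD_MASK(NR_CPUS) } };
    return m;
}

/* Hypothetical model of the non-SMP stub: it only tests bit 0. */
int set_cpus_allowed_up(const cpumask_t *new_mask)
{
    if (!cpu_isset(0, *new_mask))
        return -EINVAL;
    return 0;
}
```

With the full mask the model returns 0 and leaves the mask untouched; with an empty mask it returns -EINVAL, because CPU 0 is the only CPU a uniprocessor build can run on.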

That wraps up set_cpus_allowed_ptr(current, CPU_MASK_ALL_PTR); in kernel_init, so on we go. First comes init_pid_ns.child_reaper = current;. init_pid_ns is defined in kernel/pid.c:

struct pid_namespace init_pid_ns = {
    .kref = {
        .refcount = ATOMIC_INIT(2),
    },
    .pidmap = {
        [ 0 ... PIDMAP_ENTRIES-1] = { ATOMIC_INIT(BITS_PER_PAGE), NULL }
    },
    .last_pid = 0,
    .level = 0,
    .child_reaper = &init_task,
};

It is a variable of type struct pid_namespace, so let's look at that structure first. It is defined in include/linux/pid_namespace.h:

struct pid_namespace {
    struct kref kref;
    struct pidmap pidmap[PIDMAP_ENTRIES];
    int last_pid;
    struct task_struct *child_reaper;
    struct kmem_cache *pid_cachep;
    unsigned int level;
    struct pid_namespace *parent;
#ifdef CONFIG_PROC_FS
    struct vfsmount *proc_mnt;
#endif
};

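In other words, the current process is made the one that adopts orphaned processes. Then the struct pid of the current process is recorded in cad_pid (task_pid returns a struct pid pointer rather than a number; cad_pid is the process that may be signalled on Ctrl-Alt-Del):

    cad_pid = task_pid(current);

Next smp_prepare_cpus(setup_max_cpus); is called. If CONFIG_SMP was not set at build time it does nothing. Moving on, do_pre_smp_initcalls() is called. It is defined in init/main.c:

static void __init do_pre_smp_initcalls(void)
{
    extern int spawn_ksoftirqd(void);

    migration_init();
    spawn_ksoftirqd();
    if (!nosoftlockup)
        spawn_softlockup_task();
}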

migration_init() is declared in include/linux/sched.h:

#ifdef CONFIG_SMP
void migration_init(void);
#else
static inline void migration_init(void)
{
}
#endif

So without CONFIG_SMP it does nothing. Next comes the call to spawn_ksoftirqd(), defined in kernel/softirq.c:

__init int spawn_ksoftirqd(void)
{
    void *cpu = (void *)(long)smp_processor_id();
    int err = cpu_callback(&cpu_nfb, CPU_UP_PREPARE, cpu);

    BUG_ON(err == NOTIFY_BAD);
    cpu_callback(&cpu_nfb, CPU_ONLINE, cpu);
    register_cpu_notifier(&cpu_nfb);
    return 0;
}

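This function first calls smp_processor_id to get the ID of the current CPU and stores it in the variable cpu, then passes cpu, together with &cpu_nfb and CPU_UP_PREPARE, to cpu_callback. Let's look at the first few lines of cpu_callback:

static int __cpuinit cpu_callback(struct notifier_block *nfb,
                                  unsigned long action,
                                  void *hcpu)
{
    int hotcpu = (unsigned long)hcpu;
    struct task_struct *p;

    switch (action) {
    case CPU_UP_PREPARE:
    case CPU_UP_PREPARE_FROZEN:
        p = kthread_create(ksoftirqd, hcpu, "ksoftirqd/%d", hotcpu);
        if (IS_ERR(p)) {
            printk("ksoftirqd for %i failed\n", hotcpu);
            return NOTIFY_BAD;
        }
        kthread_bind(p, hotcpu);
        per_cpu(ksoftirqd, hotcpu) = p;
        break;

As the code shows, when action is CPU_UP_PREPARE a kernel thread is created and assigned to p. The function the thread runs is ksoftirqd and the argument passed to it is hcpu; the "ksoftirqd/%d", hotcpu that follow form the thread's name. This is the thread we see when running ps -ef | grep ksoftirqd in a terminal. If creation fails, an error message is printed; otherwise the new thread p is bound to the CPU with that ID, which is what kthread_bind(p, hotcpu) does, and it is saved in the per-CPU variable ksoftirqd. The next few lines are:

    case CPU_ONLINE:
    case CPU_ONLINE_FROZEN:
        wake_up_process(per_cpu(ksoftirqd, hotcpu));
        break;

That is, when spawn_ksoftirqd calls cpu_callback(&cpu_nfb, CPU_ONLINE, cpu); the action is CPU_ONLINE, and wake_up_process is called to wake the ksoftirqd thread on the current CPU. Finally register_cpu_notifier(&cpu_nfb); is called. In a non-SMP build this is an inline stub that simply returns 0; otherwise, as far as I can tell, it adds cpu_nfb to the CPU notifier chain so that the same callback runs for CPUs that come up later. Back in do_pre_smp_initcalls, let's carry on: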
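The prepare-then-wake pattern in cpu_callback can be modelled with a toy notifier chain in userspace. Everything below is an illustrative assumption: the toy_ and TOY_ names, the constant values and the state array are invented for this sketch and only imitate the shape of the kernel code. No real threads are created.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_CPUS 4

/* Toy stand-ins for the kernel's action codes and return values. */
enum { TOY_CPU_UP_PREPARE, TOY_CPU_ONLINE };
enum { TOY_NOTIFY_OK = 1, TOY_NOTIFY_BAD = 2 };
enum toy_state { TOY_THREAD_NONE, TOY_THREAD_CREATED, TOY_THREAD_RUNNING };

/* Toy replacement for the per-CPU ksoftirqd pointer. */
enum toy_state toy_thread_state[TOY_MAX_CPUS];

struct toy_notifier_block {
    int (*notifier_call)(struct toy_notifier_block *nb,
                         unsigned long action, void *hcpu);
    struct toy_notifier_block *next;
};

static struct toy_notifier_block *toy_chain;

/* Same two-phase shape as cpu_callback: create on prepare, wake on online. */
int toy_cpu_callback(struct toy_notifier_block *nb,
                     unsigned long action, void *hcpu)
{
    uintptr_t cpu = (uintptr_t)hcpu;

    (void)nb;
    if (cpu >= TOY_MAX_CPUS)
        return TOY_NOTIFY_BAD;      /* models a failed kthread_create() */

    switch (action) {
    case TOY_CPU_UP_PREPARE:
        toy_thread_state[cpu] = TOY_THREAD_CREATED;
        break;
    case TOY_CPU_ONLINE:
        if (toy_thread_state[cpu] == TOY_THREAD_CREATED)
            toy_thread_state[cpu] = TOY_THREAD_RUNNING;
        break;
    }
    return TOY_NOTIFY_OK;
}

/* Add a block to the chain so later CPU bring-ups reach its callback. */
void toy_register_notifier(struct toy_notifier_block *nb)
{
    nb->next = toy_chain;
    toy_chain = nb;
}

/* Run both phases over the chain for one CPU; -1 if any callback objects. */
int toy_cpu_up(unsigned long cpu)
{
    static const unsigned long steps[] = { TOY_CPU_UP_PREPARE, TOY_CPU_ONLINE };

    for (size_t i = 0; i < sizeof(steps) / sizeof(steps[0]); i++)
        for (struct toy_notifier_block *nb = toy_chain; nb; nb = nb->next)
            if (nb->notifier_call(nb, steps[i], (void *)(uintptr_t)cpu)
                    == TOY_NOTIFY_BAD)
                return -1;
    return 0;
}
```

The boot CPU is handled by calling toy_cpu_callback twice by hand, just as spawn_ksoftirqd does. After toy_register_notifier, any later toy_cpu_up(n) drives the same callback through both phases for CPU n.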

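    if (!nosoftlockup)
        spawn_softlockup_task();

Without CONFIG_DETECT_SOFTLOCKUP, spawn_softlockup_task() is an empty inline function in include/linux/sched.h.

That finishes do_pre_smp_initcalls. Its main job is to create the ksoftirqd thread, bind it to the current CPU and wake it, and to register the callback so that each additional CPU gets its own ksoftirqd thread when it is brought up. These are the threads we see when running ps -ef | grep ksoftirqd:

root 4 2 0 08:30 ? 00:00:03 [ksoftirqd/0]
root 7 2 0 08:30 ? 00:00:02 [ksoftirqd/1]

The revolution is not yet won, comrades, so keep at it! Let's carry on enjoying this, heh.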

Now we reach smp_init(); in kernel_init.

If CONFIG_SMP was not selected at build time, APIC_init_uniprocessor() is called when CONFIG_X86_LOCAL_APIC is defined, and otherwise nothing is done. The code is in init/main.c:

#ifndef CONFIG_SMP

#ifdef CONFIG_X86_LOCAL_APIC
static void __init smp_init(void)
{
    APIC_init_uniprocessor();
}
#else
#define smp_init() do { } while (0)
#endif

And if CONFIG_SMP was selected at build time, the implementation is as follows:

/* Called by boot processor to activate the rest. */
static void __init smp_init(void)
{
    unsigned int cpu;

    /* FIXME: This should be done in userspace --RR */
    for_each_present_cpu(cpu) {
        if (num_online_cpus() >= setup_max_cpus)
            break;
        if (!cpu_online(cpu))
            cpu_up(cpu);
    }

    /* Any cleanup work */
    printk(KERN_INFO "Brought up %ld CPUs\n", (long)num_online_cpus());
    smp_cpus_done(setup_max_cpus);
}

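Looking at this function, the for_each_present_cpu(cpu) macro is implemented in include/linux/cpumask.h:

#define for_each_present_cpu(cpu) for_each_cpu_mask((cpu), cpu_present_map)

The for_each_cpu_mask(cpu, mask) macro is also implemented in include/linux/cpumask.h:

#if NR_CPUS > 1
#define for_each_cpu_mask(cpu, mask) \
    for ((cpu) = first_cpu(mask); \
         (cpu) < NR_CPUS; \
         (cpu) = next_cpu((cpu), (mask)))
#else /* NR_CPUS == 1 */
#define for_each_cpu_mask(cpu, mask) \
    for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)
#endif /* NR_CPUS */

So the loop body runs for each CPU in cpu_present_map: the loop stops once the number of online CPUs reaches setup_max_cpus, and any CPU that is not yet online is brought up with cpu_up(). The function then prints some CPU information, namely the number of CPUs now online, and calls smp_cpus_done().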
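The iteration can be tried out in userspace with a single-word mask. This is a sketch under assumptions: all toy_ and TOY_ names are hypothetical, the masks are plain unsigned long values rather than cpumask_t, and setting a bit stands in for the real work of cpu_up().

```c
#define TOY_NR_CPUS 8

/* First set bit at or after start, or TOY_NR_CPUS when none remains. */
unsigned int toy_next_cpu_from(unsigned int start, unsigned long mask)
{
    for (unsigned int cpu = start; cpu < TOY_NR_CPUS; cpu++)
        if (mask & (1UL << cpu))
            return cpu;
    return TOY_NR_CPUS;
}

/* Same loop shape as for_each_cpu_mask in the NR_CPUS > 1 case. */
#define toy_for_each_cpu_mask(cpu, mask) \
    for ((cpu) = toy_next_cpu_from(0, (mask)); \
         (cpu) < TOY_NR_CPUS; \
         (cpu) = toy_next_cpu_from((cpu) + 1, (mask)))

/* Number of set bits among the first TOY_NR_CPUS, like num_online_cpus(). */
unsigned int toy_num_cpus(unsigned long mask)
{
    unsigned int n = 0;

    for (unsigned int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        if (mask & (1UL << cpu))
            n++;
    return n;
}

/* Model of the loop in smp_init(); returns the resulting online mask. */
unsigned long toy_smp_init(unsigned long present, unsigned long online,
                           unsigned int max_cpus)
{
    unsigned int cpu;

    toy_for_each_cpu_mask(cpu, present) {
        if (toy_num_cpus(online) >= max_cpus)
            break;
        if (!(online & (1UL << cpu)))
            online |= 1UL << cpu;   /* stands in for cpu_up(cpu) */
    }
    return online;
}
```

With CPUs 0 to 3 present, only CPU 0 online and a cap of 2, the model brings up CPU 1 and then stops, so the cap bounds how many CPUs come online, not which ones are present.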

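After smp_init(), kernel_init goes on to sched_init_smp() and do_basic_setup(), and the last call of all is init_post(), which is where the init process gets started. There is a lot to cover, so that analysis will have to wait until next time......

(If anything here is wrong, I would be grateful if the experts would point it out. I have only recently started working with the kernel.)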
