Category: LINUX
2010-11-04 18:16:11
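Linux boot process analysis (4) --- assembly part (1)
I consulted many articles by experts online, added a little of my own material, and organized it all here. There is still a lot I do not understand, and some of my understanding is surely wrong;
corrections from experts are welcome, and I will keep revising this post.
Once control enters the kernel, arch/arm/kernel/head-armv.S is the first kernel file to execute. It contains the initialization code between the kernel entry point ENTRY(stext) and
start_kernel. Using the platform I work with, the Intel PXA270, as an example, here is its assembly code: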
1 .section ".text.init",#alloc,#execinstr
2 .type stext, #function
/* Kernel entry point */
3 ENTRY(stext)
4 mov r12, r0
/* Program status: disable FIQ and IRQ, select SVC mode */
5 mov r0, #F_BIT | I_BIT | MODE_SVC @ make sure svc mode
6 msr cpsr_c, r0 @ and all irqs disabled
/* Determine the CPU type: check whether the running CPU's ID matches an ID this kernel build supports */
7 bl __lookup_processor_type
/* If r10 is 0 the lookup failed; jump to the error handler. */
/* The error handler __error is implemented in debug-armv.S and is not covered further here */
8 teq r10, #0 @ invalid processor?
9 moveq r0, #'p' @ yes, error 'p'
10 beq __error
/* Determine the machine type: check whether the Architecture Type value in r1 is supported */
11 bl __lookup_architecture_type
/* If r7 is 0 the lookup failed; jump to the error handler. */
12 teq r7, #0 @ invalid architecture?
13 moveq r0, #'a' @ yes, error 'a'
14 beq __error
/* Create the kernel page tables */
15 bl __create_page_tables
16 adr lr, __ret @ return address
17 add pc, r10, #12 @ initialise processor
@ (return control reg)
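Line 5 prepares the CPSR value for SVC mode with both interrupts (I_BIT) and fast interrupts (F_BIT) disabled; line 6 writes it to the CPSR.
Line 7 looks up the processor type, mainly to obtain the processor ID and the page table flags.
Line 11 looks up the architecture (machine) information.
Line 15 builds the page tables.
Line 17 jumps to the processor initialization function, whose address comes from __lookup_processor_type (r10 points to the matching record, and offset 12 into it holds the branch to the setup routine).
Note line 16: once processor initialization finishes, execution continues directly at __ret,
because the last instruction of the initialization function is mov pc, lr.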
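As a small illustration of what lines 5-6 write into the CPSR, the sketch below combines the three constants in C. The numeric values are assumptions taken from the ARM architecture's CPSR layout (mode field in bits 4:0, F in bit 6, I in bit 7); the kernel headers are not quoted in this post, and the function name is made up for the example.

```c
#include <stdint.h>

/* Assumed values, following the ARM CPSR bit layout. */
#define MODE_SVC 0x13u  /* supervisor mode, CPSR[4:0] */
#define F_BIT    0x40u  /* FIQ disable, CPSR[6] */
#define I_BIT    0x80u  /* IRQ disable, CPSR[7] */

/* Value that line 5 loads into r0 and line 6 writes to cpsr_c. */
static uint32_t boot_cpsr_control(void)
{
    return F_BIT | I_BIT | MODE_SVC;
}
```

Under these assumptions the control byte written by msr cpsr_c is 0xD3.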
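Assembly part (2): part (1) briefly covered the main assembly flow of kernel startup; this part covers one of the assembly subroutines it calls, __lookup_processor_type.
Function __lookup_processor_type:
The kernel uses a structure, struct proc_info_list, to record processor-specific information. It is defined in the header
kernel/include/asm-arm/procinfo.h.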
/*
* Note! struct processor is always defined if we're
* using MULTI_CPU, otherwise this entry is unused,
* but still exists.
*
* NOTE! The following structure is defined by assembly
* language, NOT C code. For more information, check:
* arch/arm/mm/proc-*.S and arch/arm/kernel/head-armv.S
*/
struct proc_info_list {
unsigned int cpu_val;
unsigned int cpu_mask;
unsigned long __cpu_mmu_flags; /* used by head-armv.S */
unsigned long __cpu_flush; /* used by head-armv.S */
const char *arch_name;
const char *elf_name;
unsigned int elf_hwcap;
struct proc_info_item *info;
struct processor *proc;
};
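The assembly relies on two numbers derived from this layout: `add pc, r10, #12` jumps to the fourth word of a record, and `add r10, r10, #36` steps to the next record. The sketch below is a hypothetical mirror of the structure with every field fixed at 32 bits (as on the 32-bit ARM target, with pointers modelled as uint32_t) so that both numbers can be checked on any host. The name proc_info_list32 is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit mirror of struct proc_info_list; not the kernel's type. */
struct proc_info_list32 {
    uint32_t cpu_val;
    uint32_t cpu_mask;
    uint32_t cpu_mmu_flags;  /* __cpu_mmu_flags */
    uint32_t cpu_flush;      /* __cpu_flush: holds a branch instruction */
    uint32_t arch_name;      /* pointer on the target */
    uint32_t elf_name;       /* pointer on the target */
    uint32_t elf_hwcap;
    uint32_t info;           /* pointer on the target */
    uint32_t proc;           /* pointer on the target */
};
```

With nine 32-bit words the record is 36 bytes long, and the __cpu_flush slot sits at byte offset 12.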
arch/arm/mm/proc-xscale.S defines all of the XScale-related proc_info_list records. The one for the PXA270 we are using is:
.section ".proc.info", #alloc, #execinstr
.type __bva0_proc_info,#object
__bva0_proc_info:
.long 0x69054110 @ Bulverde A0: 0x69054110, A1 : 0x69054111.
.long 0xfffffff0 @ and this is the CPU id mask.
#if CACHE_WRITE_THROUGH
.long 0x00000c0a
#else
.long 0x00000c0e
#endif
b __xscale_setup
.long cpu_arch_name
.long cpu_elf_name
.long HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP|HWCAP_XSCALE
.long cpu_bva0_info
.long xscale_processor_functions
.size __bva0_proc_info, . - __bva0_proc_info
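Because of the .section directive, the __bva0_proc_info record defined above is placed in the .proc.info section at build time. Where that section ends up in the image is
specified by the linker script vmlinux.lds, as shown here: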
SECTIONS
{
. = 0xC0008000;
.init : { /* Init code and data */
_stext = .;
__init_begin = .;
*(.text.init)
__proc_info_begin = .;
*(.proc.info)
__proc_info_end = .;
Here the symbol __proc_info_begin marks the start address of .proc.info and the symbol __proc_info_end marks its end address.
The code below uses these two symbols to locate the .proc.info section.
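Now for the function's source. For ease of analysis its lines are numbered; lines 17-18 are the references to .proc.info mentioned above.
Line 2 puts the address of line 17 (label 2) into r5; adr is a pseudo-instruction that loads a nearby address relative to pc.
Line 3 loads the data r5 points at into r7, r9 and r10, giving r7=__proc_info_end, r9=__proc_info_begin, and r10=the word stored at line 19, which is the link-time address of label 2.
Lines 4-6 compute the offset between the run-time and link-time addresses and apply it, so that r7 holds the current address of __proc_info_end and r10 the current address of __proc_info_begin.
Line 7 reads the CPU ID with a coprocessor instruction and stores the processor ID in r9.
Line 8 loads the start of the record r10 points to (here __bva0_proc_info) into r5, r6 and r8: r5=0x69054110 (cpu_val), r6=0xfffffff0 (cpu_mask), r8=0x00000c0e (__cpu_mmu_flags).
Lines 9-10 mask the ID just read and compare it with the ID in the record; if they match, line 11 returns with the processor ID still in r9.
If they do not match, line 12 advances r10 by 36 (the size of proc_info_list), and lines 13-14 loop back to compare the next proc_info_list as long as r10 is below the address in r7, that is, __proc_info_end.
If __proc_info_end is reached without a match, lines 15-16 clear r10 and return.
In other words, on success r10 points to the matching proc_info_list record, and on failure r10 is 0.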
/*
* Read processor ID register (CP#15, CR0), and look up in the linker-built
* supported processor list. Note that we can't use the absolute addresses
* for the __proc_info lists since we aren't running with the MMU on
* (and therefore, we are not in the correct address space). We have to
* calculate the offset.
*
* Returns:
* r5, r6, r7 corrupted
* r8 = page table flags
* r9 = processor ID
* r10 = pointer to processor structure
*/
1 __lookup_processor_type:
2 adr r5, 2f
3 ldmia r5, {r7, r9, r10}
4 sub r5, r5, r10 @ convert addresses
5 add r7, r7, r5 @ to our address space
6 add r10, r9, r5
7 mrc p15, 0, r9, c0, c0 @ get processor id
8 1: ldmia r10, {r5, r6, r8} @ value, mask, mmuflags
9 and r6, r6, r9 @ mask wanted bits
10 teq r5, r6
11 moveq pc, lr
12 add r10, r10, #36 @ sizeof(proc_info_list)
13 cmp r10, r7
14 blt 1b
15 mov r10, #0 @ unknown processor
16 mov pc, lr
/*
* Look in include/asm-arm/procinfo.h and arch/arm/kernel/arch.[ch] for
* more information about the __proc_info and __arch_info structures.
*/
17 2: .long __proc_info_end
18 .long __proc_info_begin
19 .long 2b
20 .long __arch_info_begin
21 .long __arch_info_end
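The routine above can be sketched in C as follows. This is only an illustration under stated assumptions: struct proc_entry is a hypothetical record holding just the first two words, so its stride is 8 bytes instead of the real 36, and to_runtime is a made-up helper showing the address fix-up of lines 4-6. Note also that the assembly tests the first record before it checks the bounds, whereas this loop checks the bounds first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: only the two words the comparison needs. */
struct proc_entry {
    uint32_t cpu_val;
    uint32_t cpu_mask;
};

/* Lines 8-16: return the first record whose masked ID matches,
 * or NULL when none does (the assembly returns r10 = 0). */
static const struct proc_entry *
lookup_processor(const struct proc_entry *begin,
                 const struct proc_entry *end, uint32_t id)
{
    for (const struct proc_entry *p = begin; p < end; p++) {
        if ((id & p->cpu_mask) == p->cpu_val)
            return p;
    }
    return NULL;
}

/* Lines 4-6: turn a link-time address into the address valid while the
 * MMU is off, using one label whose run-time (adr) and link-time
 * (.long 2b) addresses are both known. Arithmetic wraps modulo 2^32. */
static uint32_t to_runtime(uint32_t link_addr, uint32_t label_runtime,
                           uint32_t label_link)
{
    return link_addr + (label_runtime - label_link);
}
```

For example, a PXA270 that reports ID 0x69054111 (stepping A1) still matches the record with cpu_val 0x69054110, because the mask 0xfffffff0 discards the low four bits.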
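Assembly part (2) covered the assembly function __lookup_processor_type; this part covers __lookup_architecture_type.
Function __lookup_architecture_type:
Every machine (usually meaning a particular board) has its own characteristics, such as the physical memory address, the physical I/O address, the start address of video memory and so on.
These are described by struct machine_desc, defined in asm-arm/mach/arch.h: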
struct machine_desc {
/*
* Note! The first four elements are used
* by assembler code in head-armv.S
*/
unsigned int nr; /* architecture number */
unsigned int phys_ram; /* start of physical ram */
unsigned int phys_io; /* start of physical io */
unsigned int io_pg_offst; /* byte offset for io page table entry */
const char *name; /* architecture name */
unsigned int param_offset; /* parameter page */
unsigned int video_start; /* start of video RAM */
unsigned int video_end; /* end of video RAM */
unsigned int reserve_lp0 :1; /* never has lp0 */
unsigned int reserve_lp1 :1; /* never has lp1 */
unsigned int reserve_lp2 :1; /* never has lp2 */
unsigned int soft_reboot :1; /* soft reboot */
void (*fixup)(struct machine_desc *,
struct param_struct *, char **,
struct meminfo *);
void (*map_io)(void); /* IO mapping function */
void (*init_irq)(void);
};
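This structure is normally defined (taking the ARM platform as an example) in kernel/arch/arm/mach-xxx/xxx.c, using macros. Taking the Mainstone development board as an example,
it is defined in kernel/arch/arm/mach-pxa/mainstone.c as follows: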
MACHINE_START(MAINSTONE, "Intel DBBVA0 Development Platform")
MAINTAINER("MontaVista Software Inc.")
BOOT_MEM(0xa0000000, 0x40000000, io_p2v(0x40000000))
FIXUP(fixup_mainstone)
MAPIO(mainstone_map_io)
INITIRQ(mainstone_init_irq)
MACHINE_END
These macros are also defined in kernel/include/asm-arm/mach/arch.h. Taking MACHINE_START as an example:
#define MACHINE_START(_type,_name) \
const struct machine_desc __mach_desc_##_type \
__attribute__((__section__(".arch.info"))) = { \
.nr = MACH_TYPE_##_type, \
.name = _name,
After expansion the structure reads:
__mach_desc_MAINSTONE = {
.nr = MACH_TYPE_MAINSTONE,
.name = "Intel DBBVA0 Development Platform",
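To see how a run of such macros adds up to one initializer, here is a self-contained sketch. Everything in it is a simplified stand-in rather than the kernel's definitions: struct machine_desc_demo is trimmed to four fields, BOOT_MEM takes two arguments instead of three, and MACH_TYPE_DEMO and the board name are made up. The section attribute is applied only on ELF targets, since other object formats spell section names differently.

```c
#define MACH_TYPE_DEMO 303            /* hypothetical machine number */

struct machine_desc_demo {            /* trimmed, hypothetical stand-in */
    unsigned int nr;
    unsigned int phys_ram;
    unsigned int phys_io;
    const char  *name;
};

#if defined(__ELF__)
#define ARCH_INFO_SECTION __attribute__((__section__(".arch.info")))
#else
#define ARCH_INFO_SECTION             /* no custom section on non-ELF hosts */
#endif

/* Opens the initializer and fills in the number and name. */
#define MACHINE_START(_type, _name) \
    const struct machine_desc_demo __mach_desc_##_type ARCH_INFO_SECTION = { \
        .nr   = MACH_TYPE_##_type, \
        .name = _name,

/* Adds more designated initializers to the same structure. */
#define BOOT_MEM(_pram, _pio) \
        .phys_ram = _pram, \
        .phys_io  = _pio,

/* Closes the initializer. */
#define MACHINE_END \
    };

MACHINE_START(DEMO, "Demo Board")
    BOOT_MEM(0xa0000000, 0x40000000)
MACHINE_END
```

The three macro lines at the bottom expand to a single definition of __mach_desc_DEMO, which is why MACHINE_START can end with an open brace and a trailing comma.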
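The middle line of the MACHINE_START macro, __attribute__((__section__(".arch.info"))) = {, places the structure in the named section .arch.info, just as with
.proc.info earlier; see the GCC manual for the meaning of __attribute__((__section__(...))). The remaining macros work in a similar way and are not covered one by one
here. Now for the source: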
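Line 2 points r4 at label 2b, which is line 17 in the __lookup_processor_type listing. Line 3 loads the five words stored there into r2, r3, r5, r6 and r7: r2=__proc_info_end and r3=__proc_info_begin (both discarded), r5=the link-time address of label 2, r6=__arch_info_begin, r7=__arch_info_end.
Lines 4-6 convert these to current addresses, leaving r4 pointing at the first machine_desc and r7 at the end of the list.
Line 7 reads the nr field of the structure (here __mach_desc_MAINSTONE) into r5, and line 8 compares r5 with the machine number in r1.
The nr value in r5, MACH_TYPE_MAINSTONE, is defined in kernel/include/asm-arm/mach-types.h:
#define MACH_TYPE_MAINSTONE 303
The value in r1 is passed in by the boot loader; this is covered in <
If the machine numbers match, execution jumps to line 15, which sets r5=phys_ram, r6=phys_io, r7=io_pg_offst, and then returns. If
they differ, the pointer is advanced and execution loops back to line 7 to keep searching, as in lines 10-12. If every machine_desc has been checked without
a match, r7 is cleared and the function returns.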
/*
* Lookup machine architecture in the linker-build list of architectures.
* Note that we can't use the absolute addresses for the __arch_info
* lists since we aren't running with the MMU on (and therefore, we are
* not in the correct address space). We have to calculate the offset.
*
* r1 = machine architecture number
* Returns:
* r2, r3, r4 corrupted
* r5 = physical start address of RAM
* r6 = physical address of IO
* r7 = byte offset into page tables for IO
*/
1 __lookup_architecture_type:
2 adr r4, 2b
3 ldmia r4, {r2, r3, r5, r6, r7} @ throw away r2, r3
4 sub r5, r4, r5 @ convert addresses
5 add r4, r6, r5 @ to our address space
6 add r7, r7, r5
7 1: ldr r5, [r4] @ get machine type
8 teq r5, r1
9 beq 2f
10 add r4, r4, #SIZEOF_MACHINE_DESC
11 cmp r4, r7
12 blt 1b
13 mov r7, #0 @ unknown architecture
14 mov pc, lr
15 2: ldmib r4, {r5, r6, r7} @ found, get results
16 mov pc, lr
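The way results come back from this routine (three registers on success, r7 = 0 on failure) can be sketched in C as below. The types are hypothetical: struct mach_entry keeps only the four words the assembly reads, so its stride is 16 bytes rather than SIZEOF_MACHINE_DESC, and struct mach_result stands in for r5, r6 and r7.

```c
#include <stdint.h>

/* Hypothetical, trimmed record: the first four words of machine_desc. */
struct mach_entry {
    uint32_t nr;
    uint32_t phys_ram;
    uint32_t phys_io;
    uint32_t io_pg_offst;
};

/* Stands in for r5, r6 and r7 on return. */
struct mach_result {
    uint32_t phys_ram;
    uint32_t phys_io;
    uint32_t io_pg_offst;
};

/* Returns 1 and fills *out when machine_nr is found; returns 0 otherwise
 * (the assembly signals failure by returning r7 = 0). */
static int lookup_architecture(const struct mach_entry *begin,
                               const struct mach_entry *end,
                               uint32_t machine_nr,
                               struct mach_result *out)
{
    for (const struct mach_entry *m = begin; m < end; m++) {
        if (m->nr == machine_nr) {
            /* ldmib skips the nr word and loads the next three. */
            out->phys_ram    = m->phys_ram;
            out->phys_io     = m->phys_io;
            out->io_pg_offst = m->io_pg_offst;
            return 1;
        }
    }
    return 0;
}
```

The ldmib on line 15 increments the address before each load, which is how the assembly skips nr and picks up exactly the three following fields.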
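Linux boot process analysis (4) --- assembly part (4)
Function __create_page_tables:
Assume the kernel starts at physical address 0xA0008000 and virtual address 0xC0008000. The code below maps the 4MB at the start of the kernel,
using first-level mapping only, that is, section mapping, where each section covers 1MB. Four entries are therefore needed, so that
the sections at virtual addresses 0xC0000000~0xC0300000 map to the sections at physical addresses 0xA0000000~0xA0300000 (4MB in total, ending at 0xC03FFFFF and 0xA03FFFFF).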
.macro pgtbl, reg, rambase
adr \reg, stext
sub \reg, \reg, #0x4000
.endm
.macro krnladr, rd, pgtable, rambase
bic \rd, \pgtable, #0x000ff000
.endm
/*
* Setup the initial page tables. We only setup the barest
* amount which are required to get the kernel running, which
* generally means mapping in the kernel code.
*
* We only map in 4MB of RAM, which should be sufficient in
* all cases.
*
* r5 = physical address of start of RAM
* r6 = physical IO address
* r7 = byte offset into page tables for IO
* r8 = page table flags
*/
1 __create_page_tables:
/* r5 holds the physical start of RAM, 0xa0000000; the kernel entry stext is at 0xa0008000 */
/* pgtbl subtracts 0x4000 from the address of stext, so r4 = 0xa0004000 */
2 pgtbl r4, r5 @ page table address
/*
* Clear the 16K level 1 swapper page table
*/
/* r0 = 0xa0004000 */
3 mov r0, r4
4 mov r3, #0
/* r2 = 0xa0008000 */
5 add r2, r0, #0x4000
/* Clear 16K: addresses 0xa0004000 to 0xa0008000 hold the page table, 16K in total */
6 1: str r3, [r0], #4
7 str r3, [r0], #4
8 str r3, [r0], #4
9 str r3, [r0], #4
10 teq r0, r2
11 bne 1b
/*
* Create identity mapping for first MB of kernel to
* cater for the MMU enable. This identity mapping
* will be removed by paging_init()
*/
/* r2 = 0xa0004000 & ~0x000ff000 = 0xa0000000 */
12 krnladr r2, r4, r5 @ start of kernel
/* r3 = 0xa0000000 + 0x00000c0e = 0xa0000c0e */
/* r8 = 0x00000c0e was set up in __lookup_processor_type */
13 add r3, r8, r2 @ flags + kernel base
/* store r3 = 0xa0000c0e at address r4 + (r2 >> 18) = 0xa0004000 + 0x2800 = 0xa0006800 */
/* r4 itself is unchanged and stays 0xa0004000 */
14 str r3, [r4, r2, lsr #18] @ identity mapping
/*
* Now setup the pagetables for our kernel direct
* mapped region. We round TEXTADDR down to the
* nearest megabyte boundary.
*/
/* TEXTADDR = 0xC0008000; for more on TEXTADDR see < */
/* start of kernel: r0 = 0xa0004000 + 0x3000 = 0xa0007000 */
15 add r0, r4, #(TEXTADDR & 0xff000000) >> 18 @ start of kernel
/* r2 = 0xa0000c0e */
16 bic r2, r3, #0x00f00000
/* write 0xa0000c0e to address 0xa0007000 */
17 str r2, [r0] @ PAGE_OFFSET + 0MB
/* r0 = 0xa0007000, unchanged because (TEXTADDR & 0x00f00000) >> 18 is 0 */
18 add r0, r0, #(TEXTADDR & 0x00f00000) >> 18
19 str r3, [r0], #4 @ KERNEL + 0MB
20 add r3, r3, #1 << 20
21 str r3, [r0], #4 @ KERNEL + 1MB
22 add r3, r3, #1 << 20
23 str r3, [r0], #4 @ KERNEL + 2MB
24 add r3, r3, #1 << 20
25 str r3, [r0], #4 @ KERNEL + 3MB
/*
* Ensure that the first section of RAM is present.
* we assume that:
* 1. the RAM is aligned to a 32MB boundary
* 2. the kernel is executing in the same 32MB chunk
* as the start of RAM.
*/
26 bic r0, r0, #0x01f00000 >> 18 @ round down
27 and r2, r5, #0xfe000000 @ round down
28 add r3, r8, r2 @ flags + rambase
29 str r3, [r0]
30 bic r8, r8, #0x0c @ turn off cacheable
31 mov pc, lr
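The comments above list the address involved at each step, so readers can follow along. Lines 3-11 clear the page table from 0xA0004000 to 0xA0008000, 16KB in total.
Line 13 picks up __cpu_mmu_flags (in r8). Lines 19-25 fill in the page table entries, four in all. Readers can check this against the address translation section of the XScale manual.
Because section mapping is used, every address within a given 1MB of virtual space goes through the same table entry. According to the manual, section mapping has a single level of table index,
the top 12 bits of the virtual address. Page mapping, by contrast, uses the top 12 bits for the first-level table, the next 8 bits for the page table, and only the last 12 bits as the offset within the page.
Do not confuse this with the 386 scheme of a 10-bit page directory and a 10-bit page table. As an example, for a virtual address 0xC00xxxxx,
the top 12 bits are 0xC00 and the page table base is 0xA0004000, so the entry address is 0xA0004000+(0xC00<<2)=0xA0007000.
The content at that address is 0xA0000C0E: its top 12 bits, 0xA00, are the section base address, and the low 20 bits are flags, taken from __bva0_proc_info above.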
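The index arithmetic in this example can be written out as a short C sketch. The function names are made up for illustration; the formulas follow the description above (top 12 bits of the virtual address select a 4-byte entry, and the entry combines the physical section base with the flags).

```c
#include <stdint.h>

/* Address of the first-level entry for vaddr: base + (vaddr >> 20) * 4. */
static uint32_t section_entry_addr(uint32_t pgd_base, uint32_t vaddr)
{
    return pgd_base + ((vaddr >> 20) << 2);
}

/* Section entry: top 12 bits are the physical section base,
 * the low 20 bits carry the flags. */
static uint32_t section_descriptor(uint32_t paddr, uint32_t mmu_flags)
{
    return (paddr & 0xfff00000u) | mmu_flags;
}
```

With the page table at 0xA0004000, the entry for 0xC0008000 lands at 0xA0007000 and the identity-mapping entry for 0xA0000000 lands at 0xA0006800, matching the addresses in the listing's comments; the assembly reaches the same offsets with a single shift right by 18 because the low 20 bits of the addresses it shifts are already zero.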
Linux boot process analysis (4) --- assembly part (5)
Function __mmap_switched:
/*
* The following fragment of code is executed with the MMU on, and uses
* absolute addresses; this is not position independent.
*
* r0 = processor control register
* r1 = machine ID
* r9 = processor ID
*/
/* align to a 32-byte boundary (.align 5 means 2^5 bytes on ARM) */
1 .align 5
2 __mmap_switched:
/* r3 = address of the second word of __switch_data, which holds __bss_start */
3 adr r3, __switch_data + 4
4 ldmia r3, {r4, r5, r6, r7, r8, sp}@ r2 = compat
@ sp = stack pointer
5 mov fp, #0 @ Clear BSS (and zero fp)
6 1: cmp r4, r5
7 strcc fp, [r4],#4
8 bcc 1b
9 str r9, [r6] @ Save processor ID
10 str r1, [r7] @ Save machine type
11 orr r0, r0, #2 @ ...........A.
12 bic r2, r0, #2 @ Clear 'A' bit
13 stmia r8, {r0, r2} @ Save control register values
14 b SYMBOL_NAME(start_kernel)
After line 4 executes, r4=__bss_start, r5=_end, r6=&processor_id, r7=&__machine_arch_type,
r8=&cr_alignment, sp=init_task_union+8192. Lines 5-8 zero the region from __bss_start to _end; both symbols are defined in vmlinux.lds, as follows:
.bss : {
__bss_start = .; /* BSS */
*(.bss)
*(COMMON)
_end = . ;
}
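Lines 9 and 10 store the processor ID and the machine type into the variables processor_id and __machine_arch_type. These variables are later
used in start_kernel->setup_arch to find the struct proc_info_list for the current processor and the machine_desc for the current system.
Lines 11-13 save the processor control register values starting at cr_alignment: r0 with the A bit set and r2 with it cleared. Line 14 jumps to start_kernel in init/main.c, entering the second stage of kernel startup.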
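As a rough C rendering of lines 5-8 and 11-13, the sketch below clears a word range and derives the two control register values. The helper names and the CR_A macro are made up for illustration; the only fact taken from the listing is that the A bit is the value 2.

```c
#include <stdint.h>

#define CR_A 0x2u   /* alignment-fault bit, as used by orr/bic #2 above */

/* Lines 5-8: store zero words from start up to, but not including, end. */
static void clear_bss(uint32_t *start, uint32_t *end)
{
    while (start < end)
        *start++ = 0;
}

/* Lines 11-13: out[0] is the control value with A set (r0),
 * out[1] is the same value with A cleared (r2). */
static void save_control_values(uint32_t cr, uint32_t out[2])
{
    out[0] = cr | CR_A;
    out[1] = out[0] & ~CR_A;
}
```

The stmia on line 13 writes both values to consecutive words, which is what the two-element array models here.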