Analysis of the arm-linux source: decompressing the kernel image - Alan0521 - ChinaUnix blog

Analysis of the arm-linux source: decompressing the kernel image

Category: LINUX

2011-08-16 14:19:20

linux-2.6.20.6/arch/arm/boot/compressed/head.S

The file opens with a block of macro definitions. We look at just one of them to see how macros are written in GNU ARM assembly.

#elif defined(CONFIG_ARCH_S3C2410)
              .macro loadsp, rb
              mov \rb, #0x50000000
              add \rb, \rb, #0x4000 * CONFIG_S3C2410_LOWLEVEL_UART_PORT
              .endm
#else
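This defines a macro named loadsp that takes one parameter, rb. Inside the macro body a parameter must be prefixed with "\" wherever it is referenced, for example:
mov \rb, #0x50000000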
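
To make the arithmetic in the macro concrete, here is a small Python sketch (a model, not kernel code) of the value that loadsp leaves in rb. The function name s3c2410_uart_base is made up for illustration; the two constants are taken from the macro above.

```python
UART_BASE = 0x50000000   # value loaded by "mov \rb, #0x50000000"
UART_STRIDE = 0x4000     # multiplier used in the "add" line of the macro

def s3c2410_uart_base(port):
    """Return the address loadsp would leave in rb for the given
    CONFIG_S3C2410_LOWLEVEL_UART_PORT value (hypothetical helper)."""
    return UART_BASE + UART_STRIDE * port

for port in range(3):
    print(port, hex(s3c2410_uart_base(port)))
```

The point is only that the port number selects one of several register blocks spaced 0x4000 bytes apart.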

After the macro definitions a section is declared:
              .section ".start", #alloc, #execinstr

The section is named .start. #alloc means the section contains allocated data, and #execinstr means the section contains executable instructions.
/*
* sort out different calling conventions
*/
              .align
start:
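              .type       start,#function /* .type marks the symbol start as a function */
              .rept 8
              mov r0, r0 // this instruction is repeated 8 times; each copy is effectively a nop. The author is not sure why this is done.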
              .endr

              b     1f
              .word      0x016f2818           @ Magic numbers to help the loader
              .word      start               @ absolute load/run zImage address
              .word      _edata                   @ zImage end address
1:            mov r7, r1                    @ save architecture ID
              mov r8, r2                    @ save atags pointer
r1 and r2 hold the architecture ID and the pointer to the tagged list (atags) passed in by the bootloader. Both values are saved first.

#ifndef __ARM_ARCH_2__
              /*
              * Booting from Angel - need to enter SVC mode and disable
              * FIQs/IRQs (numeric definitions from angel arm.h source).
              * We only do this if we were in user mode on entry.
              */
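The code reads the CPSR and checks whether the processor was entered in user mode. When the kernel is entered from u-boot the system is already in SVC32 mode; when it is entered through Angel it is in user mode, and two extra instructions are needed to switch to SVC mode. After that, interrupts are masked once more and the result is written back to the CPSR.

Angel is ARM's debug monitor protocol. Multi-ICE, in common use now, talks the RDI protocol instead. Angel needs a resident program on the board, after which debugging is possible over a serial port.

A short introduction to semihosting follows.

Semihosting is a mechanism for ARM targets that passes input/output requests from application code to the host computer running the debugger. For example, it lets C library functions such as printf() and scanf() use the host's screen and keyboard instead of requiring a screen and keyboard on the target system.

Semihosting is implemented by a set of defined software instructions (such as swi) that generate exceptions under program control. The application invokes the appropriate semihosting call, and the debug agent then handles the exception. The debug agent provides the required communication with the host.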

              mrs r2, cpsr          @ get current mode
              tst    r2, #3                    @ not user?
              bne not_angel

The next two lines issue an Angel SWI that asks the debug agent to put the processor into supervisor mode:

              mov r0, #0x17              @ angel_SWIreason_EnterSVC
              swi 0x123456              @ angel_SWI_ARM
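0x17 is the semihosting operation angel_SWIreason_EnterSVC. It puts the processor into supervisor mode and disables all interrupts by setting both interrupt mask bits in the new CPSR. 0x123456 is the semihosting SWI number for the ARM instruction set.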

not_angel: // not entered through the Angel debugger
              mrs r2, cpsr          @ turn off interrupts to
              orr   r2, r2, #0xc0         @ prevent angel from running
              msr cpsr_c, r2   // sets the I and F bits of the cpsr to 1, disabling IRQ and FIQ
#else
              teqp pc, #0x0c000003          @ turn off interrupts
On these older 26-bit processors, TEQP PC, #(new mode number) is the usual way to change mode.
#endif
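
The mode test and the interrupt masking above are plain bit operations on the CPSR. The following Python sketch models them; the function names are invented for illustration, and the mode encodings used in the usage lines (0x10 for user, 0x13 for supervisor) are the standard ARM values rather than something stated in the listing.

```python
I_BIT = 0x80  # CPSR bit 7: IRQ disable
F_BIT = 0x40  # CPSR bit 6: FIQ disable

def entered_in_user_mode(cpsr):
    """Model of "tst r2, #3": only the two lowest mode bits are examined.
    User mode is 0b10000, so both bits are clear there."""
    return (cpsr & 3) == 0

def mask_interrupts(cpsr):
    """Model of "orr r2, r2, #0xc0": set both the I and the F bit."""
    return cpsr | I_BIT | F_BIT

print(entered_in_user_mode(0x10))   # user mode
print(entered_in_user_mode(0x13))   # supervisor mode
print(hex(mask_interrupts(0x13)))
```

Only when the test reports user mode does the code fall through to the Angel SWI; every other mode branches straight to not_angel.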

The linker places some processor-specific code at this point, namely the code in the arch/arm/boot/compressed/head-xxx.S files. That code performs some operations on the I/D caches and the MMU.

/*
              * Note that some cache flushing and other stuff may
              * be needed here - is there an Angel SWI call for this?
              */

              /*
              * some architecture specific code can be inserted
              * by the linker here, but it should preserve r7, r8, and r9.
              */

              .text
              adr   r0, LC0 // the address of the LC0 symbol at run time
              ldmia       r0, {r1, r2, r3, r4, r5, r6, ip, sp}
              subs r0, r0, r1        @ calculate the delta offset // the offset between the current run address
                                          @ if delta is zero, we are   // and the link address is left in r0
              beq not_relocated         @ running at the address we
                                          @ were linked at.

The lines above determine whether the code is running at the address it was linked at. The LC0 symbol is defined at line 288.
              .type       LC0, #object
LC0:              .word      LC0               @ r1 // the value loaded into r1 is the link-time address of LC0
              .word      __bss_start            @ r2
              .word      _end                     @ r3
              .word      zreladdr          @ r4
              .word      _start                    @ r5
              .word      _got_start              @ r6
              .word      _got_end        @ ip
              .word      user_stack+4096           @ sp
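The run-time address of LC0 is compared with the address the linker assigned to it. If the two are equal, the code is running at its link address.

If it is not running at its link address, the following code must run: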
              /*
              * We're running at a different address. We need to fix
              * up various pointers:
              *   r5 - zImage base address
              *   r6 - GOT start
              *   ip - GOT end
              */
              add r5, r5, r0 // adjust the base address of the kernel image
              add r6, r6, r0
              add ip, ip, r0 // adjust the start and end of the GOT

#ifndef CONFIG_ZBOOT_ROM
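              /* If CONFIG_ZBOOT_ROM is not defined, the code running here is fully
              * position independent, so no absolute addressing may be used. To keep
              * relative addresses correct, the BSS and stack addresses must also be
              * adjusted.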
              * If we're running fully PIC === CONFIG_ZBOOT_ROM = n,
              * we need to fix up pointers into the BSS region.
              *   r2 - BSS start
              *   r3 - BSS end
              *   sp - stack pointer
              */
              add r2, r2, r0
              add r3, r3, r0
              add sp, sp, r0

              /*
              * Relocate all entries in the GOT table.
              */
1:            ldr   r1, [r6, #0]            @ relocate entries in the GOT
              add r1, r1, r0        @ table. This fixes up the
              str   r1, [r6], #4            @ C references.
              cmp r6, ip
              blo   1b
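#else // if CONFIG_ZBOOT_ROM is defined, only GOT entries that point outside the BSS are relocated
// The author is not sure why. A likely reason is that with CONFIG_ZBOOT_ROM the BSS
// sits at a fixed RAM address and does not move with the image.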
              /*
              * Relocate entries in the GOT table. We only relocate
              * the entries that are outside the (relocated) BSS region.
              */
1:            ldr   r1, [r6, #0]            @ relocate entries in the GOT
              cmp r1, r2                    @ entry < bss_start ||
              cmphs     r3, r1                    @ _end < entry
              addlo       r1, r1, r0        @ table. This fixes up the
              str   r1, [r6], #4            @ C references.
              cmp r6, ip
              blo   1b
#endif
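
The two GOT fix-up loops differ only in which entries they touch. As a rough illustration, the Python sketch below models both. The function name relocate_got and its parameters are made up for this example; the comparison logic follows the cmp / cmphs / addlo sequence in the listing.

```python
def relocate_got(got, delta, bss_start=None, bss_end=None):
    """Model of the GOT fix-up loops (hypothetical helper).

    With bss_start/bss_end left as None this mirrors the fully PIC case:
    every entry is shifted by delta. With both given it mirrors the
    CONFIG_ZBOOT_ROM case: entries inside [bss_start, bss_end] stay put.
    """
    out = []
    for entry in got:
        in_bss = bss_start is not None and bss_start <= entry <= bss_end
        if in_bss:
            out.append(entry)                          # addlo not executed
        else:
            out.append((entry + delta) & 0xFFFFFFFF)   # 32-bit add
    return out

print([hex(e) for e in relocate_got([0x1000, 0x2000], 0x100)])
print([hex(e) for e in relocate_got([0x1000, 0x5000, 0x6001], 0x100,
                                    bss_start=0x5000, bss_end=0x6000)])
```

In the second call only the entry that points into the BSS range is left unchanged.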

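If the current run address equals the link address, no relocation is needed and the BSS is cleared directly. The relocated path also falls through to this point.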
not_relocated: mov r0, #0
1:            str   r0, [r2], #4            @ clear bss
              str   r0, [r2], #4
              str   r0, [r2], #4
              str   r0, [r2], #4
              cmp r2, r3
              blo   1b

Next, cache_on is called:
              /*
              * The C runtime environment should now be setup
              * sufficiently. Turn the cache on, set up some
              * pointers, and start decompressing.
              */
              bl     cache_on

cache_on is defined at line 327:
              .align       5
cache_on:       mov r3, #8                    @ cache_on function
              b     call_cache_fn

Why is r3 set to 8? That becomes clear below. Control passes on to call_cache_fn, which is defined at line 512:

call_cache_fn: adr   r12, proc_types // load the address of proc_types into r12
#ifdef CONFIG_CPU_CP15
              mrc p15, 0, r6, c0, c0   @ get processor ID
#else
              ldr   r6, =CONFIG_PROCESSOR_ID
#endif
1:            ldr   r1, [r12, #0]          @ get value
              ldr   r2, [r12, #4]          @ get mask
              eor   r1, r1, r6        @ (real ^ match)
              tst    r1, r2                    @       & mask
              addeq      pc, r12, r3             @ call cache function
              add r12, r12, #4*5
              b     1b

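This code first obtains the current processor ID, then walks the proc_types table (the processor type table), comparing each entry with that ID. When the matching entry is found, it jumps to the corresponding cache handling function.
addeq      pc, r12, r3             @ call cache function
This is where the r3 mentioned above is used. Its value, 8, is a byte offset, and r12 holds the base address of the table entry for the matching processor.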
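
The lookup can be modelled in a few lines of Python. This is only a sketch: PROC_TYPES holds a handful of (match, mask) pairs copied from the proc_types table in head.S, find_entry is a made-up name, and the ARM920T ID value 0x41129200 used in the usage line is an assumption based on the commonly documented ID for that core, not something the listing states.

```python
# (match, mask, label) rows copied from a few proc_types entries.
PROC_TYPES = [
    (0x4401a100, 0xffffffe0, "sa110 / sa1100"),
    (0x6901b110, 0xfffffff0, "sa1110"),
    (0x00020000, 0x000f0000, "ARMv4T"),
    (0x00050000, 0x000f0000, "ARMv5TE"),
    (0x00000000, 0x00000000, "unrecognised type"),  # catch-all row
]
ENTRY_SIZE = 4 * 5  # two .word values plus three 4-byte instructions

def find_entry(cpu_id):
    """Return (index, label) of the first row with ((id ^ match) & mask) == 0."""
    for index, (match, mask, label) in enumerate(PROC_TYPES):
        if ((cpu_id ^ match) & mask) == 0:
            return index, label
    raise LookupError("the table must end with a catch-all row")

def handler_address(entry_base, r3):
    """Model of "addeq pc, r12, r3": 8 = on, 12 = off, 16 = flush."""
    return entry_base + r3

print(find_entry(0x41129200))   # assumed ARM920T ID
```

Because each entry is 20 bytes (ENTRY_SIZE), offsets 8, 12 and 16 select the first, second and third instruction that follow the two data words.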

proc_types is defined as follows, at line 541:

              .type       proc_types,#object
proc_types:
              .word      0x41560600           @ ARM6/610
              .word      0xffffffe0
              b     __arm6_mmu_cache_off      @ works, but slow
              b     __arm6_mmu_cache_off
              mov pc, lr
@           b     __arm6_mmu_cache_on              @ untested
@           b     __arm6_mmu_cache_off
@           b     __armv3_mmu_cache_flush

              .word      0x00000000           @ old ARM ID
              .word      0x0000f000
              mov pc, lr
              mov pc, lr
              mov pc, lr

              .word      0x41007000           @ ARM7/710
              .word      0xfff8fe00
              b     __arm7_mmu_cache_off
              b     __arm7_mmu_cache_off
              mov pc, lr

              .word      0x41807200           @ ARM720T (writethrough)
              .word      0xffffff00
              b     __armv4_mmu_cache_on
              b     __armv4_mmu_cache_off
              mov pc, lr

              .word      0x41007400           @ ARM74x
              .word      0xff00ff00
              b     __armv3_mpu_cache_on
              b     __armv3_mpu_cache_off
              b     __armv3_mpu_cache_flush

              .word      0x41009400           @ ARM94x
              .word      0xff00ff00
              b     __armv4_mpu_cache_on
              b     __armv4_mpu_cache_off
              b     __armv4_mpu_cache_flush

              .word      0x00007000           @ ARM7 IDs
              .word      0x0000f000
              mov pc, lr
              mov pc, lr
              mov pc, lr

              @ Everything from here on will be the new ID system.

              .word      0x4401a100           @ sa110 / sa1100
              .word      0xffffffe0
              b     __armv4_mmu_cache_on
              b     __armv4_mmu_cache_off
              b     __armv4_mmu_cache_flush

              .word      0x6901b110           @ sa1110
              .word      0xfffffff0
              b     __armv4_mmu_cache_on
              b     __armv4_mmu_cache_off
              b     __armv4_mmu_cache_flush

              @ These match on the architecture ID

              .word      0x00020000    @ ARMv4T // this is the entry that matches our arm920t;
              .word      0x000f0000       // offset 8 from its start is exactly the address of
              b     __armv4_mmu_cache_on // the b __armv4_mmu_cache_on instruction
              b     __armv4_mmu_cache_off
              b     __armv4_mmu_cache_flush

              .word      0x00050000           @ ARMv5TE
              .word      0x000f0000
              b     __armv4_mmu_cache_on
              b     __armv4_mmu_cache_off
              b     __armv4_mmu_cache_flush

              .word      0x00060000           @ ARMv5TEJ
              .word      0x000f0000
              b     __armv4_mmu_cache_on
              b     __armv4_mmu_cache_off
              b     __armv4_mmu_cache_flush

              .word      0x0007b000           @ ARMv6
              .word      0x0007f000
              b     __armv4_mmu_cache_on
              b     __armv4_mmu_cache_off
              b     __armv6_mmu_cache_flush

              .word      0                   @ unrecognised type
              .word      0
              mov pc, lr
              mov pc, lr
              mov pc, lr

              .size proc_types, . - proc_types

Once the entry for our processor is found, the corresponding handler is called. For our arm920t this is __armv4_mmu_cache_on; the branch instruction is at line 605:

              .word      0x00020000           @ ARMv4T
              .word      0x000f0000
              b     __armv4_mmu_cache_on
              b     __armv4_mmu_cache_off
              b     __armv4_mmu_cache_flush

__armv4_mmu_cache_on is defined at line 424:

__armv4_mmu_cache_on:
              mov r12, lr
              bl     __setup_mmu
              mov r0, #0
              mcr p15, 0, r0, c7, c10, 4     @ drain write buffer
              mcr p15, 0, r0, c8, c7, 0      @ flush I,D TLBs
              mrc p15, 0, r0, c1, c0, 0      @ read control reg
              orr   r0, r0, #0x5000             @ I-cache enable, RR cache replacement
              orr   r0, r0, #0x0030
              bl     __common_mmu_cache_on
              mov r0, #0
              mcr p15, 0, r0, c8, c7, 0      @ flush I,D TLBs
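              mov pc, r12 // return to the instruction after bl cache_on
This code first calls __setup_mmu, then drains the write buffer and flushes the I and D TLBs. Next it sets the I-cache enable bit and selects round-robin replacement in the control register value, and calls __common_mmu_cache_on. That routine additionally sets the MMU, D-cache and write buffer enable bits, writes the page table base address and the domain access control value into coprocessor registers c2 and c3, and finally writes the control register. __common_mmu_cache_on is defined at line 450: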
__common_mmu_cache_on:
#ifndef DEBUG
              orr   r0, r0, #0x000d             @ Write buffer, mmu
#endif
              mov r1, #-1 // -1 in two's complement is 0xffffffff
              mcr p15, 0, r3, c2, c0, 0      @ load page table pointer
              mcr p15, 0, r1, c3, c0, 0      @ load domain access control // sets every bit of the domain
              b     1f                                 // access control register to 1
              .align       5                   @ cache line aligned
1:            mcr p15, 0, r0, c1, c0, 0      @ load control register
              mrc p15, 0, r0, c1, c0, 0      @ and read it back to
              sub pc, lr, r0, lsr #32    @ properly flush pipeline
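
To see which control register bits the two routines turn on, the masks can be decoded bit by bit. The Python sketch below does that. The bit names follow the comments in the listing (I-cache enable, RR replacement, write buffer, MMU) plus the D-cache bit; describe is a made-up helper, and bits 4 and 5 from the 0x0030 mask are left unnamed because the listing does not comment on them.

```python
# Bit names follow the comments in the listing.
CONTROL_BITS = {
    0: "M (MMU enable)",
    2: "C (data cache enable)",
    3: "W (write buffer enable)",
    12: "I (instruction cache enable)",
    14: "RR (round-robin replacement)",
}

def describe(value):
    """List the bits set in a control-register mask, naming the known ones."""
    return [(bit, CONTROL_BITS.get(bit, "not commented in the listing"))
            for bit in range(32) if value & (1 << bit)]

# 0x5000 and 0x0030 come from __armv4_mmu_cache_on,
# 0x000d from __common_mmu_cache_on (skipped when DEBUG is defined).
enable_mask = 0x5000 | 0x0030 | 0x000d
for bit, name in describe(enable_mask):
    print(bit, name)
```

The same decoding applies in reverse to __armv4_mmu_cache_off, which clears the 0x000d bits again.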

Now look closely at __setup_mmu, which is defined at line 386:

__setup_mmu:       sub r3, r4, #16384        @ Page directory size
              bic   r3, r3, #0xff          @ Align the pointer
              bic   r3, r3, #0x3f00
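Here r4 holds the kernel execution address, and the 16 KB level-one page table is placed in the 16 KB directly below it. After sub r3, r4, #16384 reserves that 16 KB, the two bic instructions align the table start address down to a 16 KB boundary and leave it in r3. In other words, the low 14 bits of the TTB are cleared.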
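
The three instructions amount to "subtract 16 KB, then clear the low 14 bits". A minimal Python model follows; page_table_base is a made-up name, and the example kernel address 0x30008000 is only an assumed value for illustration.

```python
def page_table_base(kernel_exec_addr):
    """Model of the first three instructions of __setup_mmu."""
    base = kernel_exec_addr - 16384   # sub r3, r4, #16384
    base &= ~0xff                     # bic r3, r3, #0xff
    base &= ~0x3f00                   # bic r3, r3, #0x3f00
    return base & 0xFFFFFFFF          # keep the result in 32 bits

print(hex(page_table_base(0x30008000)))   # assumed kernel execution address
```

The two bic masks together are 0x3fff; presumably they are split because a single ARM immediate cannot encode that value.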

/*
* Initialise the page tables, turning on the cacheable and bufferable
* bits for the RAM area only.
*/
// Initialise the page table, turning on the cacheable and bufferable bits only for the RAM area
              mov r0, r3
              mov r9, r0, lsr #18
              mov r9, r9, lsl #18         @ start of RAM
              add r10, r9, #0x10000000    @ a reasonable RAM size

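These lines save the start address of the level-one page table in r0 and derive a RAM start address from it by aligning down to 256 KB. For the 256 MB of RAM that starts at this address, the C and B bits of the corresponding descriptors are set to 1 (see the ARM920T datasheet, section 3.3.3, table 3-2, level one descriptor bits). r9 and r10 hold the start and end addresses of this memory range.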

              mov r1, #0x12 // bits [1:0] of the level-one descriptor are 10, which marks it as a section
                         // descriptor, and bit [4] is 1 (see the ARM920T datasheet, section 3.3.3,
                         // table 3-2, level one descriptor bits). Bits [8:5] are all 0, selecting domain D0.

              orr   r1, r1, #3 << 10 // the access permission bits [11:10] of the descriptor are 11, that is,
                         // all access types permitted in both modes
                         // (see the ARM920T datasheet, section 3.3.3, table 3-2, level one
                         // descriptor bits, and section 3.6, table 3-11, interpreting access
                         // permission (AP) bits)

              add r2, r3, #16384 // r2 holds the end address of the level-one descriptor table

1:            cmp r1, r9                    @ if virt > start of RAM
              orrhs       r1, r1, #0x0c         @ set cacheable, bufferable
              cmp r1, r10                  @ if virt > end of RAM
              bichs       r1, r1, #0x0c         @ clear cacheable, bufferable
              str   r1, [r0], #4            @ 1:1 mapping
              add r1, r1, #1048576
              teq   r0, r2
              bne 1b

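This loop initialises the level-one descriptor table (the page table). For each descriptor it checks whether the address it describes lies inside that 256 MB range. If so, the memory region the descriptor covers is cacheable and bufferable; if not, it is non-cacheable and non-bufferable. The descriptor is then written to the current table entry, the table pointer advances by 4 to the next entry, and the descriptor's base address advances by 1 MB to the next section. The loop repeats until every entry has been initialised.

The top 12 bits of a level-one descriptor are the base address of a section, so 4096 sections can be described. The level-one page table is 16 KB and each entry (descriptor) takes 4 bytes, which gives exactly 4096 descriptors, so 4096 * 1 MB = 4 GB of address space is mapped here.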
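
As a cross-check of the description above, here is a Python sketch that builds the same 4096 descriptors. It is a model of the loop rather than kernel code: build_page_table is a made-up name and the table address 0x30004000 in the usage lines is an assumed example value.

```python
def build_page_table(table_base):
    """Return the 4096 level-one section descriptors the loop would store."""
    ram_start = (table_base >> 18) << 18   # lsr #18 / lsl #18: 256 KB aligned
    ram_end = ram_start + 0x10000000       # "a reasonable RAM size": 256 MB
    desc = 0x12 | (3 << 10)                # section, bit 4, AP = 0b11
    table = []
    for _ in range(16384 // 4):            # 16 KB table, 4 bytes per entry
        if desc >= ram_start:              # cmp r1, r9 / orrhs
            desc |= 0x0c                   # set C and B
        if desc >= ram_end:                # cmp r1, r10 / bichs
            desc &= ~0x0c                  # clear C and B
        table.append(desc)                 # str r1, [r0], #4
        desc += 1 << 20                    # next 1 MB section
    return table

table = build_page_table(0x30004000)       # assumed table address
print(hex(table[0x2ff]), hex(table[0x300]), hex(table[0x400]))
```

Entry i always carries i in its top 12 bits, which is the 1:1 mapping; only the entries that fall in the 256 MB RAM window have the 0x0c (C and B) bits set.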

/*
* If ever we are running from Flash, then we surely want the cache
* to be enabled also for our execution instance... We map 2MB of it
* so there is no map overlap problem for up to 1 MB compressed kernel.
* If the execution is in RAM then we would only be duplicating the above.
*/
              mov r1, #0x1e
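              orr   r1, r1, #3 << 10 // these two lines set bits [11:10] and [4:1] of the descriptor
                            // (see the ARM920T datasheet, section 3.3.3, table 3-2, level one descriptor bits)
              mov r2, pc, lsr #20
              orr   r1, r1, r2, lsl #20 // align the current address down to 1 MB and combine it with r1 to form
                            // a descriptor for the section containing the current instruction

              add r0, r3, r2, lsl #2   // r3 is the start of the level-one table built above. The top 12 bits of
                            // the current address (pc), shifted left by two to form a 14-bit byte offset,
                            // are added to r3 (whose low 14 bits are 0). The result is a 4-byte-aligned
                            // address inside the 16 KB table: the slot of the descriptor for the
                            // current address

              str   r1, [r0], #4
              add r1, r1, #1048576
              str   r1, [r0]          // store that descriptor and the descriptor of the next consecutive
                            // section at the slot computed above (index r2 in the level-one table)

              mov pc, lr       // return; lr holds the address of the instruction after the call,
                            // mov r0, #0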

The mapping set up here is 1:1: physical and virtual addresses are identical.
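
The extra 2 MB mapping for the code that is currently executing reduces to an index calculation plus two stores. A small Python model, with the made-up name flash_mapping and assumed example addresses:

```python
def flash_mapping(pc, table_base):
    """Return the two (slot_address, descriptor) pairs written for the
    section containing pc and the section right after it."""
    section = pc >> 20                              # mov r2, pc, lsr #20
    desc = (0x1e | (3 << 10)) | (section << 20)     # C, B, bit 4, section, AP
    slot = table_base + (section << 2)              # add r0, r3, r2, lsl #2
    return [(slot, desc), (slot + 4, desc + (1 << 20))]

for slot, desc in flash_mapping(0x00012345, 0x30004000):
    print(hex(slot), hex(desc))
```

If the code already runs from the RAM window, these two stores simply rewrite entries that the main loop had set to the same values, as the kernel comment notes.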

When __common_mmu_cache_on finishes it returns to __armv4_mmu_cache_on, which in turn returns to the instruction after bl cache_on, at line 226:

              mov r1, sp                    @ malloc space above stack
              add r2, sp, #0x10000    @ 64k max

/*
* Check to see if we will overwrite ourselves.
*   r4 = final kernel address
*   r5 = start of this image
*   r2 = end of malloc space (and therefore this image)
* We basically want:
*   r4 >= r2 -> OK
*   r4 + image length <= r5 -> OK
*/
              cmp r4, r2
              bhs wont_overwrite
              sub r3, sp, r5        @ > compressed kernel size
              add r0, r4, r3, lsl #2      @ allow for 4x expansion
              cmp r0, r5
              bls   wont_overwrite

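This code first sets aside 64 KB of malloc space directly above the stack; its start and end addresses are held in r1 and r2. It then checks whether the final kernel address (the start address of the decompressed kernel) is at or above the end of the malloc space. If so, it jumps to wont_overwrite, which is covered later. Otherwise it checks whether the final kernel address plus an estimate of the decompressed size falls at or below the start of the current compressed image. The estimate is four times sp - r5, which is an upper bound on the compressed kernel size. If that holds, the code also jumps to wont_overwrite. If neither condition holds, execution continues below.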
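
The two tests can be written out as a single predicate. The Python sketch below mirrors the register usage in the listing; the function name and the example addresses are assumptions chosen only to exercise each branch.

```python
def wont_overwrite(r4, r5, sp):
    """Return True when the kernel can be decompressed straight to r4.

    r4 = final kernel address, r5 = start of this image, sp = stack pointer.
    """
    r2 = sp + 0x10000            # end of the 64 KB malloc space
    if r4 >= r2:                 # cmp r4, r2 / bhs wont_overwrite
        return True
    r3 = sp - r5                 # upper bound on the compressed size
    r0 = r4 + (r3 << 2)          # allow for 4x expansion
    return r0 <= r5              # cmp r0, r5 / bls wont_overwrite

# Image loaded well above the final address: safe.
print(wont_overwrite(0x30008000, 0x30800000, 0x30900000))
# Image loaded at the final address itself: decompressing in place would clobber it.
print(wont_overwrite(0x30008000, 0x30008000, 0x30108000))
```

Note that the second test is conservative: it assumes the kernel may expand to four times the size of everything between r5 and sp.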

              mov r5, r2                    @ decompress after malloc space
              mov r0, r5
              mov r3, r7
              bl     decompress_kernel

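Here the start address of the decompressed kernel is set to the end of the malloc space. The architecture ID (saved in r7 at the start) is then copied into r3, and decompress_kernel is called to decompress the kernel. Its four arguments are passed in r0-r3: the output address, the start and end of the malloc space, and the architecture ID. The function is defined in arch/arm/boot/compressed/misc.c.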

              add r0, r0, #127
              bic   r0, r0, #127           @ align the kernel length
/*
* r0     = decompressed kernel length
* r1-r3 = unused
* r4     = kernel execution address
* r5     = decompressed kernel start
* r6     = processor ID
* r7     = architecture ID
* r8     = atags pointer
* r9-r14 = corrupted
*/
              add r1, r5, r0        @ end of decompressed kernel
              adr   r2, reloc_start
              ldr   r3, LC1
              add r3, r2, r3
1:            ldmia       r2!, {r9 - r14}              @ copy relocation code
              stmia       r1!, {r9 - r14}
              ldmia       r2!, {r9 - r14}
              stmia       r1!, {r9 - r14}
              cmp r2, r3
              blo   1b
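decompress_kernel returns the length of the decompressed kernel in r0, which is first rounded up to a multiple of 128 bytes. The relocation code (from reloc_start up to reloc_start plus the offset stored at LC1) is then copied to the address immediately after the decompressed kernel.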
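
The add / bic pair is the usual round-up-to-a-power-of-two idiom. A minimal Python model, with made-up function names and an assumed example address:

```python
def align_up_128(length):
    """Model of "add r0, r0, #127" followed by "bic r0, r0, #127"."""
    return (length + 127) & ~127

def reloc_destination(r5, length):
    """Address the relocation code is copied to: "add r1, r5, r0"."""
    return r5 + align_up_128(length)

for n in (0, 1, 128, 129):
    print(n, align_up_128(n))
print(hex(reloc_destination(0x30100000, 1000)))
```

Each pass of the copy loop moves two batches of six registers, so the copy proceeds in 48-byte steps.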

              bl     cache_clean_flush
              add pc, r5, r0        @ call relocation code
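After the copy, cache_clean_flush is called and execution jumps to r5 + r0, which is the copied relocation code sitting just after the decompressed kernel. That code is what moves the kernel to its final address and starts it.
cache_clean_flush is defined at line 700: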

cache_clean_flush:
              mov r3, #16
              b     call_cache_fn
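This again goes through call_cache_fn, but note that r3 is now 16. call_cache_fn was explained earlier, so let us see which function is reached when r3 is 16. Going back to the definition of the proc_types object, the entry for our processor starts at line 603: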

              .word      0x00020000           @ ARMv4T
              .word      0x000f0000
              b     __armv4_mmu_cache_on
              b     __armv4_mmu_cache_off
              b     __armv4_mmu_cache_flush
With an offset of 16, execution lands on the b __armv4_mmu_cache_flush instruction and calls __armv4_mmu_cache_flush, which is defined at line 730:

__armv4_mmu_cache_flush:
              mov r2, #64*1024         @ default: 32K dcache size (*2)
              mov r11, #32         @ default: 32 byte line size
              mrc p15, 0, r3, c0, c0, 1      @ read cache type
              teq   r3, r6                    @ cache ID register present?
              beq no_cache_id
              mov r1, r3, lsr #18
              and r1, r1, #7 // extract the size field of Dsize
              mov r2, #1024
              mov r2, r2, lsl r1           @ base dcache size *2 // the D-cache size in bytes, times 2
              tst    r3, #1 << 14          @ test M bit
              addne      r2, r2, r2, lsr #1     @ +1/2 size if M == 1
              mov r3, r3, lsr #12
              and r3, r3, #3 // the two lines above extract len, the cache line length field of Dsize
              mov r11, #8
              mov r11, r11, lsl r3 @ cache line size in bytes // cache line length in bytes
no_cache_id:
              bic   r1, pc, #63            @ align to longest cache line
              add r2, r1, r2
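1:            ldr   r3, [r1], r11           @ s/w flush D cache // a software flush of the D-cache
              teq   r1, r2
              bne 1b
What do these lines do, and why? The loop reads one word from every cache line across a region twice the size of the D-cache, starting at the current pc aligned down to 64 bytes. Each read that misses forces a line to be evicted, and evicting a dirty line writes it back to memory, so by the end of the loop the decompressed data that was still sitting in the D-cache has reached RAM. This is presumably done in software because this generic ARMv4 path cannot rely on a single coprocessor operation that cleans the whole D-cache.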

              mcr p15, 0, r1, c7, c5, 0      @ flush I cache
              mcr p15, 0, r1, c7, c6, 0      @ flush D cache
              mcr p15, 0, r1, c7, c10, 4     @ drain WB
              mov pc, lr

These final instructions flush the I and D caches and drain the write buffer.
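
The field extraction from the cache type register can be checked with a short Python model. The function names are invented for illustration, and the register value 0x0D172172 used in the usage lines is an assumption (it is the value commonly documented for the ARM920T cache type register, not something the listing states).

```python
def dcache_flush_params(cache_type):
    """Return (bytes_to_read, line_size) as computed before no_cache_id."""
    size_field = (cache_type >> 18) & 7    # mov r1, r3, lsr #18 / and #7
    span = 1024 << size_field              # "base dcache size *2"
    if cache_type & (1 << 14):             # M bit set: add half again
        span += span >> 1
    len_field = (cache_type >> 12) & 3     # mov r3, r3, lsr #12 / and #3
    line_size = 8 << len_field             # cache line size in bytes
    return span, line_size

def flush_addresses(pc, span, line_size):
    """Addresses read by the "ldr r3, [r1], r11" loop."""
    start = pc & ~63                       # bic r1, pc, #63
    return range(start, start + span, line_size)

span, line = dcache_flush_params(0x0D172172)   # assumed ARM920T value
print(span, line, len(flush_addresses(0x30008044, span, line)))
```

When the cache type register is absent (it reads back the same as the main ID), the defaults from the top of the routine are used instead: 64 KB to read and a 32-byte line.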

Now for the wont_overwrite routine mentioned earlier. It is defined at line 282:
wont_overwrite:     mov r0, r4
              mov r3, r7
              bl     decompress_kernel
              b     call_kernel

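As before, the arguments to decompress_kernel are set up first and decompress_kernel is called to decompress the kernel image, this time directly to its final address in r4. Control then passes to call_kernel, which is defined at line 491: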

call_kernel:     bl     cache_clean_flush
              bl     cache_off
              mov r0, #0                    @ must be zero
              mov r1, r7                    @ restore architecture number
              mov r2, r8                    @ restore atags pointer
              mov pc, r4                    @ call kernel

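This also begins by calling cache_clean_flush to flush the I and D caches, then calls cache_off. Finally it sets up the arguments and jumps to the decompressed kernel.

cache_off is defined at line 644: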
cache_off:      mov r3, #12                  @ cache_off function
              b     call_cache_fn
Once again call_cache_fn is used. Here r3 is 12, so the offset is 12, and call_cache_fn ends up in the same table entry at line 603:

              .word      0x00020000           @ ARMv4T
              .word      0x000f0000
              b     __armv4_mmu_cache_on
              b     __armv4_mmu_cache_off
              b     __armv4_mmu_cache_flush

Because the offset is 12, the b __armv4_mmu_cache_off instruction is executed and __armv4_mmu_cache_off is called. It is defined at line 665:

__armv4_mmu_cache_off:
              mrc p15, 0, r0, c1, c0
              bic   r0, r0, #0x000d
              mcr p15, 0, r0, c1, c0   @ turn MMU and cache off
              mov r0, #0
              mcr p15, 0, r0, c7, c7   @ invalidate whole cache v4
              mcr p15, 0, r0, c8, c7   @ invalidate whole TLB v4
              mov pc, lr
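This first reads the control register, clears bits 0, 2 and 3 (0x000d) to turn off the MMU, the D-cache and the write buffer, and then invalidates the entire cache and the entire TLB.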

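To summarise what happens before the entry point of the decompressed kernel is reached (that entry point is in arch/arm/kernel/head.S):

First the parameters passed in from u-boot are saved. Next comes a piece of processor-specific code from arch/arm/boot/compressed/head-xxx.S, which is not analysed here and will be covered when porting the kernel. The code then decides whether relocation is needed. In our case it is not, so it goes straight to zeroing the BSS. After that the page table is initialised with a 1:1 mapping. Because the MMU must be on before the D-cache can be enabled, the page table is set up first and the MMU and caches are turned on afterwards. With all this in place, the code checks whether decompressing the kernel would overwrite the compressed image. If it would, the kernel is decompressed just above the malloc space and a small relocation routine moves it into place afterwards; if not, it is decompressed directly to its final address. Finally the caches are flushed, the MMU and D-cache are turned off, the cache and TLB contents are invalidated, and execution jumps to the entry point of the decompressed kernel to run the ARM-specific kernel code.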
