分类:
2009-07-12 20:44:26
A stack is a datastructure. You can push and pop elements from and to it. Unlike a FIFO (first in, first out) the popped elements from a stack is the ones you pushed last. Because of that a stack is also called a LIFO (last in, first out).
In the X86 architecture, and many others, there is one stack used for code execution. It is used to store return pointers, when calling routines, but you can also store temporary data and local variables on it.
Many languages and architectures have a stack at its disposal. When return values are stored on it, the concept of stack frames emerges. A stack is divided into a number of stack frames. Each stack frame contains the local/temporary data for the routine, parameters and a return value to the former routine (the caller).
On the X86 architecture the stack grows downwards. The stack frames have a certain structure regarding the calling convention. The CDECL calling convention is the most widely used. It is most likely used by your compiler. Two registers are used:
Take care when implementing your kernel. If you use segmentation, the DS segment should be configured to have it's base at the same address as SS does. Otherwise you may run into problems when passing pointers to local variables into functions, because normal GPRs can't access the stack the way you might think.
Here is an example stack. The elements are 4 byte words in protected mode:
Memory address: Stack elements: +----------------------------+ 0x105000 | Parameter 1 for routine 1 | \ +----------------------------+ | 0x104FFC | First callers return addr. | > Stack frame 1 +----------------------------+ | 0x104FF8 | First callers EBP | / +----------------------------+ 0x104FF4 +->| Parameter 2 for routine 2 | \ <-- Routine 1's EBP | +----------------------------+ | 0x104FF0 | | Parameter 1 for routine 2 | | | +----------------------------+ | 0x104FEC | | Return address, routine 1 | | | +----------------------------+ | 0x104FE8 +--| EBP value for routine 1 | > Stack frame 2 +----------------------------+ | 0x104FE4 +->| Local data | | <-- Routine 2's EBP | +----------------------------+ | 0x104FE0 | | Local data | | | +----------------------------+ | 0x104FDC | | Local data | / | +----------------------------+ 0x104FD8 | | Parameter 1 for routine 3 | \ | +----------------------------+ | 0x104FD4 | | Return address, routine 2 | | | +----------------------------+ > Stack frame 3 0x104FD0 +--| EBP value for routine 2 | | +----------------------------+ | 0x104FCC +->| Local data | / <-- Routine 3's EBP | +----------------------------+ 0x104FC8 | | Return address, routine 3 | \ | +----------------------------+ | 0x104FC4 +--| EBP value for routine 3 | | +----------------------------+ > Stack frame 4 0x104FC0 | Local data | | <-- Current EBP +----------------------------+ | 0x104FBC | Local data | / +----------------------------+ 0x104FB8 | | <-- Current ESP \/\/\/\/\/\/\/\/\/\/\/\/\/\/
The CDECL calling convention is described here:
It looks like this in assembly (NASM):
SECTION .text caller: ; ... ; Caller responsibilities: PUSH 3 ; push the parameters in reverse order PUSH 2 CALL callee ; perform the call ADD ESP, 8 ; stack cleaning (remove the 2 words) ; ... Use the return value in EAX ... callee: ; Callee responsibilities: PUSH EBP ; store caller's EBP MOV EBP, ESP ; save current stack pointer in EBP ; ... Code, store return value in EAX ... ; Callee responsibilities: MOV ESP, EBP ; remove an unknown number of local data elements POP EBP ; restore caller's EBP RET ; return
The GCC compiler does all this automatically, but if you have to call C/C++ methods from assembly or reverse, you have to know the convention. Now look at one stack frame (the callee's):
+-------------------------+ | Parameter 3 | +-------------------------+ | Parameter 2 | +-------------------------+ | Parameter 1 | +-------------------------+ | Caller's return address | +-------------------------+ | Caller's EBP value | +-------------------------+ | Local variable | <-- Current EBP +-------------------------+ | Local variable | +-------------------------+ | Local variable | +-------------------------+ | Temporary data | +-------------------------+ | Temporary data | +-------------------------+ | | <-- Current ESP +-------------------------+
Using EBP the callee can access both parameters and local variables:
MOV EAX, [[EBP + 12]] ; Load parameter 1 into EAX MOV EAX, [[EBP + 16]] ; Load parameter 2 MOV EAX, [[EBP + 4 * EBX + 12]] ; Load parameter EBX (0-indexed) MOV EAX, [[EBP]] ; Load local variable 1 MOV EAX, [[EBP - 4]] ; Load local variable 2
There are other calling conventions for X86 just mentioning a few: Pascal calling convention, fastcall convention, stdcall. Read more on Wikipedia, see the links below.
When creating a kernel you have to manually setup the stack. It can be done by the bootloader, but it is good to know how anyway.
If you go from real mode to protected mode you have to setup the stack as well. This is because the SS segment might change, so ESP in protected mode does not point at the same location as SP in real mode did. If you switch a lot between real mode and protected mode it can be nice for them to share the stack. You have to find a smart solution for that yourself. It can be done.
In protected mode you setup the stack by just moving a new pointer value into the ESP register:
MOV ESP, 0x105000 ; Set the stack pointer
Remember that it grows downwards. You might allocate space for it in the kernel's .BSS section if it contains one:
SECTION .text setup_stack: MOV ESP, stack_end ; Set the stack pointer SECTION .bss stack_begin: RESB 4096 ; Reserve 4 KiB stack space stack_end:
If your kernel is booted by a multiboot compliant bootloader, like GRUB, you are provided a memory map. You can setup the stack by looking for free memory chunks of the appropriate size. You just have to ensure that you don't overwrite any important data or code, when setting the stack pointer.
The stack is easy to use, but it has one problem. There is no "end", so it is vulnerable to a variation of the buffer overflow attack. The attacker pushes more elements that the stack is able to contain, so they are pushes outside the stack memory, overwriting code, which the attacker can execute.
In X86 protected mode it can be solved by allocating a GDT descriptor solely for the stack, which defines its boundaries.
When debugging a stack trace is often shown and can be helpful. Stack Trace describes how this can be done and provides sample code for X86 CDECL using the stack layout above.
Unwinding the stack is complex. It is done when using exceptions, like in C++. It is performed when an exception is thrown. The purpose of unwinding the stack is to call the destructor of local objects of the stack frames and to remove stack frames until an appropriate landing pad is found. The landing pad is the try..catch block in C++ or Java. The catch block has to match the exception, i.e. a RuntimeException object can't be caught as a String object.
The unwinding algorithm depends on the architecture. Normally this algorithm is provided in the language runtime library. When using GCC and C++ it is defined in the libsupc++ library linked with your application. However it doesn't happen, when creating a kernel. The libsupc++ library is also too bloated to use in kernel space.