I bet you don't want to know.
Category: LINUX
2015-01-20 21:51:11
The term "Virtual Memory" is used to describe a method by which the physical RAM of a computer is not directly addressed, but is instead accessed via an indirect "lookup". On the Intel platform, paging is used to accomplish this task.
Paging, in CPU-specific terms, should not be confused with swap. The two terms are related, but paging refers to virtual-to-physical address translation. The author encourages readers to find the Intel manuals online or order them in print for a deeper understanding of the Intel paging system. (Note: Intel documents use the term linear address for what the kernel code calls a virtual address.)
To accomplish address translation (paging) the CPU needs to be told:
a) where to find the address translation information. This is accomplished by pointing the CPU to a lookup table called a 'page table'.
b) to activate paging mode. This is accomplished by setting a specific flag in a control register.
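To make the lookup concrete, here is a minimal user-space sketch of how a 32-bit linear address breaks down in the non-PAE layout: 10 bits of directory index, 10 bits of table index and 12 bits of page offset. PAGE_SHIFT and PGDIR_SHIFT mirror kernel macro names; the struct and both helper functions are hypothetical names invented for this illustration, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Non-PAE 32-bit x86: 10-bit directory index, 10-bit table index, 12-bit offset. */
#define PAGE_SHIFT   12
#define PGDIR_SHIFT  22
#define ENTRIES      1024u            /* entries per directory or table */

/* Hypothetical helper type holding the three fields of a linear address. */
struct linear_parts {
    uint32_t dir_index;               /* selects a page directory entry */
    uint32_t table_index;             /* selects an entry in that page table */
    uint32_t offset;                  /* byte offset inside the 4 KB page */
};

/* Split a linear address into the indices the CPU uses during translation. */
static struct linear_parts split_linear(uint32_t addr)
{
    struct linear_parts p;
    p.dir_index   = addr >> PGDIR_SHIFT;
    p.table_index = (addr >> PAGE_SHIFT) & (ENTRIES - 1);
    p.offset      = addr & ((1u << PAGE_SHIFT) - 1);
    return p;
}

/* Inverse of split_linear(): rebuild the address from its three fields. */
static uint32_t join_linear(struct linear_parts p)
{
    assert(p.dir_index < ENTRIES);
    assert(p.table_index < ENTRIES);
    assert(p.offset < (1u << PAGE_SHIFT));
    return (p.dir_index << PGDIR_SHIFT) | (p.table_index << PAGE_SHIFT) | p.offset;
}
```

For example, split_linear(0xC0001234) yields directory index 768, table index 1 and offset 0x234.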
Kernel use of virtual memory begins very early in the boot process. head.S contains code to create provisional page tables and get the kernel up and running, but that is beyond the scope of this overview.
Every physical page of memory up to 896MB is mapped directly into the kernel space. Memory greater than 896MB (High Mem) is not permanently mapped, but is instead temporarily mapped using kmap and kmap_atomic (see HighMemory).
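As a rough illustration of the 896MB boundary, the sketch below splits a machine's RAM into the part that gets a permanent kernel mapping and the part that has to be reached through kmap. The helper names and the LOWMEM_LIMIT macro are hypothetical; only the 896MB figure comes from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define MB            (1024ull * 1024ull)
#define LOWMEM_LIMIT  (896ull * MB)   /* boundary quoted in the text */

/* Hypothetical helper: bytes of RAM that receive a permanent kernel mapping. */
static uint64_t lowmem_bytes(uint64_t total_ram)
{
    return total_ram < LOWMEM_LIMIT ? total_ram : LOWMEM_LIMIT;
}

/* Hypothetical helper: bytes that can only be reached via temporary mappings. */
static uint64_t highmem_bytes(uint64_t total_ram)
{
    uint64_t low = lowmem_bytes(total_ram);
    assert(low <= total_ram);         /* lowmem is never larger than total RAM */
    return total_ram - low;
}
```

On a machine with 2048MB of RAM this gives 896MB of low memory and 1152MB of high memory; with 512MB of RAM there is no high memory at all.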
The descriptions of virtual memory will be broken into two distinct sections: kernel paging and user process paging.
Kernel Initialization:
Paging is initialized in arch/i386/mm/init.c. The function 'paging_init()' is called once by setup_arch during kernel initialization. It immediately calls pagetable_init(). pagetable_init() starts by defining the base of the page table directory:
pgd_t *pgd_base = swapper_pg_dir;
swapper_pg_dir is defined in head.S, using .org directives (.org allows structures to be placed at desired memory locations). It points to 0x1000 above the 'root' of kernel memory. Kernel memory is defined to start at PAGE_OFFSET, which on x86 is 0xC0000000, or 3 gigabytes. (This is where the 3 GB/1 GB split is defined.) Every virtual address at or above PAGE_OFFSET belongs to the kernel; any address below PAGE_OFFSET is a user address.
After some capability checking, pagetable_init() calls 'kernel_physical_mapping_init'. This function performs the lion's share of the kernel page table setup.
Definitions:
pgd = Page Directory
pmd = Page Middle Directory
pte = Page Table (Entry)
By looping over each pmd and pte, kernel_physical_mapping_init calls one_md_table_init and one_page_table_init respectively. These functions create new page middle directories and page tables by allocating space using the boot memory allocator. In non-PAE mode (PAE, or Physical Address Extension, allows Intel architectures to address more than 4 GB of physical memory), the pmd is not used and no memory is allocated for it. Here is the important part of one_page_table_init:
pte_t *page_table = (pte_t *) alloc_bootmem_low_pages(PAGE_SIZE);
set_pmd(pmd, __pmd(__pa(page_table) | _PAGE_TABLE));
The first line allocates a page of memory to hold the table using the bootmem allocator; the next inserts the table into the pmd. Once the table is returned, kernel_physical_mapping_init populates the page table using code similar to this:
set_pte(pte, pfn_pte(pfn, PAGE_KERNEL));
This code populates the page tables in a linear fashion: the mapping from physical page number to virtual address is linear, and the two differ only by PAGE_OFFSET. To translate a physical address to a virtual address, one only needs to add PAGE_OFFSET (0xC0000000). This can be seen in the macro __va from page.h:
#define __va(x) ((void *)((unsigned long)(x) + PAGE_OFFSET))
The virtual address of x is returned by adding PAGE_OFFSET.
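The arithmetic can be tried outside the kernel. The following is a user-space model only: model_va and model_pa are hypothetical stand-ins for __va and __pa that work on plain 32-bit integers, and the 896MB limit is the low memory boundary mentioned earlier, since only directly mapped memory can be translated this way.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET   0xC0000000u     /* start of kernel space in the 3/1 split */
#define LOWMEM_LIMIT  0x38000000u     /* 896 MB of directly mapped memory */

/* Hypothetical stand-in for __va(): physical address to kernel virtual address. */
static uint32_t model_va(uint32_t phys)
{
    assert(phys < LOWMEM_LIMIT);      /* only low memory is linearly mapped */
    return phys + PAGE_OFFSET;
}

/* Hypothetical stand-in for __pa(): kernel virtual address to physical address. */
static uint32_t model_pa(uint32_t virt)
{
    assert(virt >= PAGE_OFFSET);      /* user addresses have no linear mapping */
    return virt - PAGE_OFFSET;
}
```

Physical address 0x00100000 (1MB) therefore appears at virtual address 0xC0100000, and converting back returns the original value.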
Once the page tables have been set, pagetable_init() calls permanent_kmaps_init() to set up the page tables used by kmap, which temporarily maps high memory (>896MB) into the kernel as required. Control then returns to paging_init(), which loads the new page table address into CR3 here:
load_cr3(swapper_pg_dir);
After flushing the TLBs to force a reload for our new page tables, kmap_init() is the last piece of the paging setup. It completes the setup of the kmap initialized above.
Kernel paging is now active. The kernel can address all physical memory (aside from HighMem) via linear addressing starting at PAGE_OFFSET (0xC0000000 in the 3/1 split).
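The whole sequence (allocate a page table on demand, fill it linearly, then translate through it) can be modelled in user space. This is a toy sketch under loud assumptions: the directory stores host pointers rather than physical addresses, calloc stands in for the boot memory allocator, the tables are never freed, and every toy_ name is hypothetical. It is not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SHIFT    12
#define PGDIR_SHIFT   22
#define PTRS_PER_PGD  1024
#define PTRS_PER_PTE  1024
#define PAGE_OFFSET   0xC0000000u
#define PRESENT       0x1u            /* entry-valid bit in this toy model */
#define MAX_LOW_PAGES 0x38000u        /* 896 MB / 4 KB */

/* Toy directory: host pointers to tables instead of physical addresses. */
struct toy_pgd {
    uint32_t *tables[PTRS_PER_PGD];
};

/* Map npages physical pages linearly at PAGE_OFFSET. Returns 0 on success. */
static int toy_map_lowmem(struct toy_pgd *pgd, uint32_t npages)
{
    assert(npages <= MAX_LOW_PAGES);  /* keeps virt below the 32-bit wrap */
    for (uint32_t pfn = 0; pfn < npages; pfn++) {
        uint32_t virt = PAGE_OFFSET + (pfn << PAGE_SHIFT);
        uint32_t di = virt >> PGDIR_SHIFT;
        uint32_t ti = (virt >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);

        if (!pgd->tables[di]) {       /* allocate the page table on first use */
            pgd->tables[di] = calloc(PTRS_PER_PTE, sizeof(uint32_t));
            if (!pgd->tables[di])
                return -1;
        }
        pgd->tables[di][ti] = (pfn << PAGE_SHIFT) | PRESENT;
    }
    return 0;
}

/* Software page walk: returns 0 and fills *phys, or -1 if virt is unmapped. */
static int toy_translate(const struct toy_pgd *pgd, uint32_t virt, uint32_t *phys)
{
    const uint32_t *table = pgd->tables[virt >> PGDIR_SHIFT];
    if (!table)
        return -1;
    uint32_t pte = table[(virt >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)];
    if (!(pte & PRESENT))
        return -1;
    *phys = (pte & ~0xFFFu) | (virt & 0xFFFu);
    return 0;
}
```

After mapping 2048 pages (8MB) into a zero-initialized struct toy_pgd, translating 0xC0401234 gives physical address 0x00401234, while a user address such as 0x08048000 is reported as unmapped.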
User Space Virtual Memory:
Every process in Linux is able to address 4 gigabytes of linear address space. In a standard kernel config, the first 3 gigabytes (0x00000000 - 0xBFFFFFFF) are referred to as 'user space' and hold the data, functions and stack of user processes. The top 1 gigabyte (0xC0000000 - 0xFFFFFFFF) is 'kernel space'. User processes typically do not have access to kernel memory space, and will normally not address this region.
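The split reduces to a comparison against PAGE_OFFSET. Here is a small sketch, assuming the standard 3/1 configuration. All three helpers are hypothetical names for illustration; the range check is written so that start + len cannot wrap around the 32-bit address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000u       /* boundary of the 3/1 split */

/* Hypothetical helper: true if addr lies in kernel space. */
static bool is_kernel_address(uint32_t addr)
{
    return addr >= PAGE_OFFSET;
}

/* Hypothetical helper: true if addr lies in user space. */
static bool is_user_address(uint32_t addr)
{
    return addr < PAGE_OFFSET;
}

/* Hypothetical helper: true if [start, start + len) stays inside user space.
 * Comparing len with the remaining room avoids overflow in start + len. */
static bool user_range_ok(uint32_t start, uint32_t len)
{
    return start < PAGE_OFFSET && len <= PAGE_OFFSET - start;
}

/* A few example checks showing where the boundary falls. */
static void check_split_examples(void)
{
    assert(is_user_address(0xBFFFFFFFu));
    assert(is_kernel_address(0xC0000000u));
    assert(!user_range_ok(0xBFFFF000u, 0x1001u));
}
```

A buffer that starts at 0xBFFFF000 and is one page long still fits in user space; one byte more and it would cross into kernel space.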
Process virtual memory is handled using a number of internal structures. The first of interest is mm_struct: ~coywolf/lxr/source/include/linux/sched.h#L293
mm_struct provides the top level management of a process' memory space. By referring to the link above, we see some important items:
struct vm_area_struct * mmap;
A list of vma structs (described later) that comprise the VM space of the process
pgd_t * pgd;
A pointer to the process page tables
unsigned long start_code, end_code, start_data, end_data;
unsigned long start_brk, brk, start_stack;
unsigned long arg_start, arg_end, env_start, env_end;
And some familiar items that indicate the start and end of various user process sections (code, data, stack).
As we can see, the mm_struct maintains the overall picture of a process's memory profile. The page tables described above keep track of physical pages allocated to the process by the kernel. They may be in low or high memory. It is important to note that user page tables will directly map high pages, unlike the restriction imposed by HighMem on the kernel tables.
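To show how the layout fields fit together, the sketch below uses a trimmed-down, hypothetical struct toy_mm that copies only the start/end field names quoted above and derives section sizes from them. The real mm_struct contains many more members, and the helper functions here are illustrative assumptions.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-in for the layout fields of mm_struct. */
struct toy_mm {
    unsigned long start_code, end_code, start_data, end_data;
    unsigned long start_brk, brk, start_stack;
};

/* Size of the code (text) section in bytes. */
static unsigned long code_size(const struct toy_mm *mm)
{
    assert(mm->end_code >= mm->start_code);
    return mm->end_code - mm->start_code;
}

/* Size of the initialized data section in bytes. */
static unsigned long data_size(const struct toy_mm *mm)
{
    assert(mm->end_data >= mm->start_data);
    return mm->end_data - mm->start_data;
}

/* Current heap size: distance from the heap start to the program break. */
static unsigned long heap_size(const struct toy_mm *mm)
{
    assert(mm->brk >= mm->start_brk);
    return mm->brk - mm->start_brk;
}
```

With start_brk at 0x0804f000 and brk at 0x08070000, for instance, heap_size reports 0x21000 bytes.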
The detail of each virtual area in a user process is stored in vm_area_struct. The definition given in the kernel source is:
"... A VM area is any part of the process virtual memory space that has a special rule for the page-fault handlers (ie a shared library, the executable area etc)."
The structure can be seen here:
~coywolf/lxr/source/include/linux/mm.h?v=2.6.15#L57
Each discrete area in a process's virtual memory space has a vm_area_struct to describe, among other things, its start, end, parent mm_struct, permissions, file mapping information, and a number of tree member pointers for fast searching of the VM space.
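A simplified model helps show what looking up an address in these areas involves. The struct below is a hypothetical cut-down version that keeps only a start, an end, permission flags and a next pointer, and the lookup is a plain linear scan over a list sorted by address. The kernel uses the tree pointers mentioned above for faster searches, so treat this as a sketch of the idea rather than the kernel's lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical permission bits for this toy model. */
#define TOY_READ  0x1ul
#define TOY_WRITE 0x2ul
#define TOY_EXEC  0x4ul

/* Hypothetical, trimmed-down stand-in for vm_area_struct. */
struct toy_vma {
    unsigned long vm_start;           /* first address inside the area */
    unsigned long vm_end;             /* first address past the area */
    unsigned long vm_flags;           /* TOY_* permission bits */
    struct toy_vma *vm_next;          /* next area, sorted by address */
};

/* Return the area containing addr, or NULL if addr falls in a hole. */
static struct toy_vma *toy_find_area(struct toy_vma *head, unsigned long addr)
{
    for (struct toy_vma *v = head; v != NULL; v = v->vm_next) {
        assert(v->vm_start < v->vm_end);
        if (addr < v->vm_start)
            return NULL;              /* list is sorted, so addr is in a gap */
        if (addr < v->vm_end)
            return v;
    }
    return NULL;
}

/* True if addr is inside an area that permits writing. */
static int toy_can_write(struct toy_vma *head, unsigned long addr)
{
    struct toy_vma *v = toy_find_area(head, addr);
    return v != NULL && (v->vm_flags & TOY_WRITE) != 0;
}
```

An address that falls between two areas belongs to no area, which is the situation a page-fault handler would have to treat as an invalid access.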
With these data structures, the kernel is able to manage memory for user processes. Allocating, freeing and moving/swapping (PageFaultHandling) of pages can occur with the data stored here.
For more information, the interested reader is directed to the main wiki pages here. Other good sources of information include Mel Gorman's book on the Linux Virtual Memory Manager, Understanding the Linux Kernel (O'Reilly) and Linux Device Drivers 3 (LDD3).
IRC convo on virtual memory: