1 File-backed Pages
In short: struct page (the mapping member) -> struct address_space (two list members, i_mmap and i_mmap_shared) -> every vm_area_struct that maps the file -> mm_struct -> the process page table. See the description below:
The struct page structure for a given page is in the upper left corner. One of the fields of that structure is called mapping; it points to an address_space structure describing the object which backs up that page. That structure includes the inode for the file, various data structures for managing the pages belonging to the file, and two linked lists (i_mmap and i_mmap_shared) containing the vm_area_struct structures for each process which has a mapping into the file. The vm_area_struct (usually called a "VMA") describes how the mapping appears in a particular process's address space; the file /proc/pid/maps lists out the VMAs for the process with ID pid. The VMA provides the information needed to find out what a given page's virtual address is in that process's address space, and that, in turn, can be used to find the correct page table entry.
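The chain described above can be sketched in user space. This is a minimal model, not kernel code: the `*_model` structs, the `next_share` link, and `collect_mapping_vmas` are hypothetical stand-ins that keep only the fields named in the text.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space stand-ins for the kernel structures
 * (hypothetical, not the real definitions). */
struct mm_model {
    int pid;                          /* owner, for illustration only */
};

struct vma_model {
    struct mm_model *vm_mm;           /* address space this VMA belongs to */
    unsigned long vm_start, vm_end;   /* virtual range [vm_start, vm_end) */
    unsigned long vm_pgoff;           /* offset into the file, in pages */
    struct vma_model *next_share;     /* hypothetical link to the next VMA */
};

struct address_space_model {
    struct vma_model *i_mmap;         /* private mappings of the file */
    struct vma_model *i_mmap_shared;  /* shared mappings of the file */
};

struct page_model {
    struct address_space_model *mapping;
    unsigned long index;              /* page index within the file */
};

/* Append every VMA on one list to out[], never exceeding max entries. */
static size_t collect_list(struct vma_model *head, struct vma_model **out,
                           size_t n, size_t max)
{
    for (struct vma_model *v = head; v != NULL && n < max; v = v->next_share)
        out[n++] = v;
    return n;
}

/* page -> mapping -> (i_mmap, i_mmap_shared) -> VMAs; returns the count. */
size_t collect_mapping_vmas(const struct page_model *page,
                            struct vma_model **out, size_t max)
{
    size_t n = 0;

    assert(page != NULL && out != NULL);
    if (page->mapping == NULL)
        return 0;
    n = collect_list(page->mapping->i_mmap, out, n, max);
    n = collect_list(page->mapping->i_mmap_shared, out, n, max);
    return n;
}
```

Each collected VMA then leads to its mm (`vm_mm`) and from there to that process's page tables.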
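Question:
Given (1) another process's VMA and (2) a page, how do we determine whether that VMA maps the same physical page frame as the page?
See the patch below. page->index holds the page's index within the file, and vma->vm_pgoff holds the VMA's offset into the file (in pages), so the following two lines compute the virtual address at which the other process would map the same page frame: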
loffset = (page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT));
address = vma->vm_start + ((loffset - vma->vm_pgoff) << PAGE_SHIFT);
The rest is straightforward: check whether the page frame numbers are equal.
if (page_to_pfn(page) != pte_pfn(*pte))
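The address computation can be tried outside the kernel. The sketch below is a user-space model under stated assumptions: 4 KiB pages, PAGE_CACHE_SHIFT equal to PAGE_SHIFT, and hypothetical `vma_model`/`page_model` structs holding only the fields used here. It performs the same two range checks as the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SHIFT        12UL        /* assumption: 4 KiB pages */
#define PAGE_CACHE_SHIFT  PAGE_SHIFT  /* assumption: page cache unit == page */

/* Hypothetical stand-ins with only the fields the computation needs. */
struct vma_model {
    unsigned long vm_start, vm_end;   /* virtual range [vm_start, vm_end) */
    unsigned long vm_pgoff;           /* offset into the file, in pages */
};

struct page_model {
    unsigned long index;              /* page index within the file */
};

/* Returns true and stores the virtual address if the VMA covers the page. */
bool vma_address_of(const struct page_model *page,
                    const struct vma_model *vma, unsigned long *address)
{
    unsigned long loffset = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
    unsigned long addr;

    assert(vma->vm_start < vma->vm_end);
    if (loffset < vma->vm_pgoff)      /* page lies before the mapped window */
        return false;
    addr = vma->vm_start + ((loffset - vma->vm_pgoff) << PAGE_SHIFT);
    if (addr >= vma->vm_end)          /* page lies after the mapped window */
        return false;
    *address = addr;
    return true;
}
```

For example, a VMA at 0x40000000 that maps four pages starting at file page 10 places file page 12 at 0x40002000. Only after this address is known can the page table be walked and the pfn compared.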
[PATCH 2.5.62] Full updated partial object-based rmap
try_to_unmap_obj_one
+static inline int
+try_to_unmap_obj_one(struct vm_area_struct *vma, struct page *page)
+{
+ struct mm_struct *mm = vma->vm_mm;
+ pgd_t *pgd;
+ pmd_t *pmd;
+ pte_t *pte;
+ pte_t pteval;
+ unsigned long loffset;
+ unsigned long address;
+ int ret = SWAP_SUCCESS;
+
+ loffset = (page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT));
+ if (loffset < vma->vm_pgoff)
+ goto out;
+
+ address = vma->vm_start + ((loffset - vma->vm_pgoff) << PAGE_SHIFT);
+
+ if (address >= vma->vm_end)
+ goto out;
+
+ if (!spin_trylock(&mm->page_table_lock)) {
+ ret = SWAP_AGAIN;
+ goto out;
+ }
+ pgd = pgd_offset(mm, address);
+ if (!pgd_present(*pgd))
+ goto out_unlock;
+
+ pmd = pmd_offset(pgd, address);
+ if (!pmd_present(*pmd))
+ goto out_unlock;
+
+ pte = pte_offset_map(pmd, address);
+ if (!pte_present(*pte))
+ goto out_unmap;
+
+ if (page_to_pfn(page) != pte_pfn(*pte))
+ goto out_unmap;
+
+ if (vma->vm_flags & VM_LOCKED) {
+ ret = SWAP_FAIL;
+ goto out_unmap;
+ }
+
+ flush_cache_page(vma, address);
+ pteval = ptep_get_and_clear(pte);
+ flush_tlb_page(vma, address);
+
+ if (pte_dirty(pteval))
+ set_page_dirty(page);
+
+ if (atomic_read(&page->pte.mapcount) == 0)
+ BUG();
+
+ mm->rss--;
+ atomic_dec(&page->pte.mapcount);
+ page_cache_release(page);
+
+out_unmap:
+ pte_unmap(pte);
+
+out_unlock:
+ spin_unlock(&mm->page_table_lock);
+
+out:
+ return ret;
+}
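Optimization: radix priority search tree; see Documentation/prio_tree.txt and ULK 3rd edition. The goal is to quickly locate the VMAs that actually contain the page. Some VMAs map the same file but do not cover the page, and walking all of them can become a performance (scalability) problem.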
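The question the tree answers is "which VMAs cover file page N?". The sketch below is not a priority search tree. It is a plain linear scan over a hypothetical `vma_model` array, assuming 4 KiB pages, and it only illustrates the interval test that the tree is designed to evaluate without visiting every VMA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SHIFT 12UL               /* assumption: 4 KiB pages */

/* Hypothetical stand-in with only the fields the test needs. */
struct vma_model {
    unsigned long vm_start, vm_end;   /* virtual range [vm_start, vm_end) */
    unsigned long vm_pgoff;           /* first file page covered */
};

/* Last file page covered by the VMA. */
static unsigned long vma_last_pgoff(const struct vma_model *vma)
{
    assert(vma->vm_end > vma->vm_start);
    return vma->vm_pgoff + ((vma->vm_end - vma->vm_start) >> PAGE_SHIFT) - 1;
}

/* True if file page pgoff falls inside the window this VMA maps. */
bool vma_covers_pgoff(const struct vma_model *vma, unsigned long pgoff)
{
    return vma->vm_pgoff <= pgoff && pgoff <= vma_last_pgoff(vma);
}

/* Linear scan: the cost grows with the number of VMAs mapping the file. */
size_t count_covering_vmas(const struct vma_model *vmas, size_t n,
                           unsigned long pgoff)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++)
        if (vma_covers_pgoff(&vmas[i], pgoff))
            hits++;
    return hits;
}
```

With many processes mapping different windows of one large file, most VMAs fail this test, which is why an index keyed on the covered page range pays off.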
2 Anonymous Pages
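Anonymous pages end up shared in two cases: (1) between a parent and its child processes (after fork); (2) Another (quite unusual) case occurs when a process creates a memory region specifying both the MAP_ANONYMOUS and MAP_SHARED flag: the pages of such a region will be shared among the future descendants of the process. See ULK 3rd edition, 17.2.1. Reverse Mapping for Anonymous Pages.
Seen this way, such VMAs always sit at the same virtual address in every process that shares them (interestingly, mremap breaks this assumption), whereas a file-backed mapping can appear at a different virtual address in each process.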
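Approach:
Create an anon_vma object that links together the related anonymous VMAs. The mapping member of an anonymous page points to the anon_vma, so when the page is unmapped, every VMA that might map it can be found.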
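A minimal user-space sketch of this arrangement follows. All `*_model` types and helper names are hypothetical. The low-bit tag on `mapping` is modeled on the kernel's PAGE_MAPPING_ANON idea for telling an anon_vma pointer apart from an address_space pointer; treat the details here as an assumption rather than the kernel's exact code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_MAPPING_ANON_MODEL ((uintptr_t)1)  /* hypothetical tag bit */

struct anon_vma_model;

struct vma_model {
    unsigned long vm_start, vm_end;
    struct anon_vma_model *anon_vma;  /* object shared by related VMAs */
    struct vma_model *anon_next;      /* hypothetical link on its list */
};

struct anon_vma_model {
    struct vma_model *head;           /* all VMAs that may map its pages */
};

struct page_model {
    uintptr_t mapping;                /* tagged pointer; 0 means unset */
};

/* Attach a VMA (e.g. the child's copy after fork) to the anon_vma. */
void anon_vma_link(struct anon_vma_model *av, struct vma_model *vma)
{
    vma->anon_vma = av;
    vma->anon_next = av->head;
    av->head = vma;
}

/* Point page->mapping at the anon_vma, tagging the low bit. */
void page_set_anon(struct page_model *page, struct anon_vma_model *av)
{
    assert(((uintptr_t)av & PAGE_MAPPING_ANON_MODEL) == 0);
    page->mapping = (uintptr_t)av | PAGE_MAPPING_ANON_MODEL;
}

bool page_is_anon(const struct page_model *page)
{
    return (page->mapping & PAGE_MAPPING_ANON_MODEL) != 0;
}

struct anon_vma_model *page_anon_vma(const struct page_model *page)
{
    if (!page_is_anon(page))
        return NULL;
    return (struct anon_vma_model *)(page->mapping & ~PAGE_MAPPING_ANON_MODEL);
}

/* Number of VMAs an unmap of this page would have to examine. */
size_t count_candidate_vmas(const struct page_model *page)
{
    const struct anon_vma_model *av = page_anon_vma(page);
    size_t n = 0;

    if (av == NULL)
        return 0;
    for (const struct vma_model *v = av->head; v != NULL; v = v->anon_next)
        n++;
    return n;
}
```

The VMAs on the list are only candidates: each one still has to be checked to see whether it really maps the page.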
Question:
Given (1) another process's anonymous VMA and (2) an anonymous page, how do we determine whether that VMA maps the same physical page frame as the page?
Answer: the same way as for the file-backed VMA above.
Optimization: anon_vma_chain (introduced in 2.6.34)