There are three kinds of cache misses:
- instruction cache miss
- data cache read miss
- data cache write miss
Caches fetch data in "lines"; a cache line is typically 32 or 64 bytes.
When a data cache write miss happens, the CPU can queue the write in a local (store) buffer, so the cost of a write miss is very low.
When a data cache read miss happens, the CPU has to wait for the data to be fetched into a cache line, but in the meantime it can keep executing instructions that are already in the instruction cache, as long as they do not depend on the missing data. The cost is higher than a write miss, but still not very high; see the sketch below.
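Here is a minimal sketch (my own illustration, not from the original post) of that asymmetry. The read loop is a dependent pointer chase through a random permutation, so every load misses the cache and the CPU must wait for the data before it can issue the next load. The write loop stores to the same random locations, but the stores are independent and can be queued in the store buffer, so they complete far faster per access. Buffer sizes and the exact numbers are assumptions; results depend on your memory system.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1u << 24)                    /* 16M entries: next[] is ~128 MiB, far bigger than any cache */

static double seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    size_t *next = malloc(N * sizeof(*next));
    char *data_raw = malloc(N);
    volatile char *data = data_raw;     /* volatile so the stores are not optimized away */
    double t;

    /* Sattolo's algorithm: a random permutation that forms one big cycle,
     * so the chase below visits memory in an unprefetchable order. */
    for (size_t i = 0; i < N; i++)
        next[i] = i;
    for (size_t i = N - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;
        size_t tmp = next[i]; next[i] = next[j]; next[j] = tmp;
    }

    t = seconds();
    for (size_t i = 0; i < N; i++)
        data[next[i]] = 1;              /* independent write misses: queued, little stalling */
    printf("random writes:   %.1f ns/access\n", (seconds() - t) / N * 1e9);

    t = seconds();
    size_t idx = 0;
    for (size_t i = 0; i < N; i++)
        idx = next[idx];                /* dependent read misses: each load stalls the chase */
    printf("dependent reads: %.1f ns/access (idx=%zu)\n", (seconds() - t) / N * 1e9, idx);

    free(next);
    free(data_raw);
    return 0;
}

On a typical machine the dependent reads cost on the order of the full memory latency per access, while the random writes cost only a few nanoseconds each.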
When an instruction cache miss occurs, the processor cannot do anything until the missing instructions are fetched into the instruction cache, so the cost of an icache miss is quite high. That is also why we sometimes need to optimize branches: branches are conditional or unconditional jumps, and a taken branch can jump to code that is not in the instruction cache, so regardless of branch prediction, performance can drop dramatically.
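A common way to see the cost of branches is the sketch below (my own illustration, not from the post; it shows branch misprediction cost rather than icache misses). The same loop runs over random data and then over sorted data: the branch inside the loop is unpredictable on random input and almost perfectly predicted on sorted input, so the random case is typically several times slower. Compile with something like gcc -O2; note that some compilers turn this branch into a conditional move, which hides the effect.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (16 * 1024 * 1024)

static int cmp_int(const void *a, const void *b)
{
    return *(const int *)a - *(const int *)b;
}

static double run(const int *v)
{
    struct timespec t0, t1;
    long sum = 0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < N; i++)
        if (v[i] > RAND_MAX / 2)        /* ~50% taken, hard to predict on random input */
            sum += v[i];
    clock_gettime(CLOCK_MONOTONIC, &t1);

    printf("sum=%ld\n", sum);           /* keep the loop from being optimized away */
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
    int *v = malloc(N * sizeof(int));

    for (int i = 0; i < N; i++)
        v[i] = rand();

    printf("random: %.3f s\n", run(v));
    qsort(v, N, sizeof(int), cmp_int);
    printf("sorted: %.3f s\n", run(v));

    free(v);
    return 0;
}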
These slides (http://blogimg.chinaunix.net/blog/upfile2/091222202600.pdf) show an example of data cache misses (section: Debug TLB Miss Rate using OProfile): write to some storage, skipping more than one cache line on each access, so that every access causes a dcache miss. If we skip more than one page each time, every access also causes a dTLB miss, and a dTLB miss costs much more than a dcache miss: each time, the kernel has to walk the page table by hand and refill the TLB. For the Linux kernel this is a DSI exception; if the page is not physically in memory (not addressable by a PTE), another DataStorage exception is raised, which causes do_page_fault() to be called, respectively.
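Below is a rough reconstruction of that experiment (the exact code in the slides may differ; the 64-byte line, 4096-byte page, buffer size and pass count here are my assumptions). It writes one byte every STRIDE bytes of a large buffer: with a stride larger than a cache line (e.g. 128) almost every write misses the data cache, and with a stride larger than a page (e.g. 8192) almost every write also misses the dTLB. Measure with OProfile as in the slides, or with perf, e.g. perf stat -e dTLB-store-misses,L1-dcache-store-misses ./a.out 128, if your CPU exposes those generic events.

#include <stdlib.h>

#define BUF_SIZE (256 * 1024 * 1024)    /* much larger than the caches and the TLB reach */

int main(int argc, char **argv)
{
    size_t stride = (argc > 1) ? (size_t)atol(argv[1]) : 128;
    char *raw = malloc(BUF_SIZE);
    volatile char *buf = raw;           /* volatile so the compiler keeps the stores */

    /* Repeat a few passes so the counters have something to count. */
    for (int pass = 0; pass < 16; pass++)
        for (size_t i = 0; i < BUF_SIZE; i += stride)
            buf[i] = (char)pass;        /* one touch per stride: dcache (and possibly dTLB) miss */

    free(raw);
    return 0;
}

Comparing a run with stride 128 against a run with stride 8192 should show the dTLB miss count jump once the stride crosses the page size.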
These links ( & ) have an excellent introduction to CPU caches.