//If there is no space, we free a page //slot to make root for page X
reclaimfor(page X)
begin
if there are free page slots then
put X into a free page slot
else if(|A1in| > Kin)
page out the tail of A1in, call it Y, add identifier of Y to the head of A1out
if(|A1out| > Kout)
remove identifier of Z from the tail of A1out
end if
put X into the reclaimed page slot
else
page out the tail of Am, call it Y
//do not put it on A1out;it hasn't been
//accessed for a while
put X into the reclaimed page slot
end if
end
On accessing a page X:
begin
if X is in Am then
move X to the head of Am
else if(X is in A1out) then
reclaimfor(X)
add X to the head of Am
else if(X is in A1in) // do nothing
else // X is in no queue
reclaimfor(X)
add X to the head of A1in
end if
end
从算法流程可以看出,当一个新文件被访问时候,2Q会首先将它放入A1in中,在随后的多次访问中都认为是correlated reference,所以如果文件在A1in中,2Q不更新任何信息。A1in中内容删除后会放入A1out中,只有A1out中的内容被访问时2Q才认为该文件是热访问文件,并将其放入Am中。 在代理服务器中,经HTTP等协议访问的文件大小,次数都是在设计web cache时需要考虑的事情。GDS算法考虑使用文件的大小,GDF算法考虑使用文件访问频率,而GDFS算法[3]综合考虑了这两个参数。这三种算法都考虑使用了一种优化的aging mechanism,以避免cache pollution。GDS算法能够得到很好的byte hit ratio,但file hit ratio不是很好,而GDF算法能够得到很好的file hit ratio,但byte hit ratio不是很好,GDFS[5]算法尝试在二者之间平衡。GDFS使用优先级队列保存文件,并通过公式(1):
实验结果表明,降低文件大小的权重或者增加文件访问次数的权重都会使cache替换算法的性能达到更优,同时提高file hit ratio和byte hit ratio。 小结 (1)LRU算法和Simplified 2Q算法都会导致cache pollution,而LRU/k,2Q以及GDFS都不会产生cache pollution。 (2)LRU/k与GDFS算法使用了一个优先级队列,时间复杂度相对较高。而2Q与LRU/k都没有考虑web cache中访问文件本身具有的特征。 下一章可以看到,TS中的CLFUS算法综合考虑了这些算法的优缺点。 一些英文原文术语定义:
correlated reference: A common occurrence is for a page receive many references for a short period of time, then no references for a long time
参考文献: [1] E.J. O'Neil, P.E. O'Neil, and G. Weikum. The lru-k page replacement algorithm for database disk buffering. [2] Theodore Johnson, Dennis Shasha. 2Q: A Low Overhead High Performance Buffer Management Replacement Algorithm. [3] Ludmila Cherkasova, Gianfranco Ciardo. Role of Aging, Frequency, and Size in Web Cache Replacement Policies.