Category: LINUX
2010-02-05 16:04:03
An analysis of CFQ in Linux 2.6.28.5
How to add a request:
Explanation:
(1)(2)(5):
Every process in CFQ has a corresponding queue, called a cfqq. As in DEADLINE, each cfqq actually holds two queues: one sorted by the LBA of its requests and one sorted by arrival time. The LBA-sorted queue is implemented as a red-black tree, and the arrival-time queue is a FIFO queue.
(6)(7)(8):
unsigned long elapsed = jiffies - cic->last_end_request;
unsigned long ttime = min(elapsed, 2UL * cfqd->cfq_slice_idle);
cic->ttime_samples = (7*cic->ttime_samples + 256) / 8;
cic->ttime_total = (7*cic->ttime_total + 256*ttime) / 8;
cic->ttime_mean = (cic->ttime_total + 128) / cic->ttime_samples;
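The think-time update above is a decaying average: each new sample gets a weight of 1/8, and the old state keeps 7/8. A minimal userspace sketch of the same arithmetic follows; the struct `ttime_stats` and the function `ttime_update` are hypothetical names introduced for illustration, not kernel symbols.

```c
/* Hypothetical stand-in for the think-time fields of cic. */
struct ttime_stats {
    unsigned long samples; /* decaying sample weight, approaches 256 */
    unsigned long total;   /* decaying sum of 256 * ttime */
    unsigned long mean;    /* rounded total / samples */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
    return a < b ? a : b;
}

/* elapsed and slice_idle are both expressed in jiffies. */
static void ttime_update(struct ttime_stats *s, unsigned long elapsed,
                         unsigned long slice_idle)
{
    /* A single long pause is clamped to twice slice_idle. */
    unsigned long ttime = min_ul(elapsed, 2UL * slice_idle);

    s->samples = (7 * s->samples + 256) / 8;
    s->total = (7 * s->total + 256 * ttime) / 8;
    s->mean = (s->total + 128) / s->samples;
}
```

Starting from a zeroed struct and calling `ttime_update(&s, 100, 8)` clamps the sample to 16 jiffies, so one very long pause cannot dominate the average.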
**********************************************************
sector_t sdist;
u64 total;
if (cic->last_request_pos < rq->sector)
    sdist = rq->sector - cic->last_request_pos;
else
    sdist = cic->last_request_pos - rq->sector;
if (cic->seek_samples <= 60) /* second&third seek */
    sdist = min(sdist, (cic->seek_mean * 4) + 2*1024*1024);
else
    sdist = min(sdist, (cic->seek_mean * 4) + 2*1024*64);
cic->seek_samples = (7*cic->seek_samples + 256) / 8;
cic->seek_total = (7*cic->seek_total + (u64)256*sdist) / 8;
total = cic->seek_total + (cic->seek_samples/2);
do_div(total, cic->seek_samples);
cic->seek_mean = (sector_t)total;
**********************************************************
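The seek-distance update can be tried outside the kernel in the same way. The sketch below mirrors the arithmetic shown above. `seek_stats`, `seek_update`, and `is_seeky` are hypothetical names, and updating `last_pos` inside the function is a simplification of this sketch (the snippet above does not show where the kernel updates `last_request_pos`). The seeky threshold of 8 * 1024 sectors is the one used in the idle_window rules.

```c
#include <stdint.h>

/* Hypothetical stand-in for the seek fields of cic. */
struct seek_stats {
    uint64_t samples;  /* decaying sample weight */
    uint64_t total;    /* decaying sum of 256 * sdist */
    uint64_t mean;     /* mean seek distance in sectors */
    uint64_t last_pos; /* sector of the previous request */
};

static void seek_update(struct seek_stats *s, uint64_t sector)
{
    uint64_t sdist = sector > s->last_pos ? sector - s->last_pos
                                          : s->last_pos - sector;
    uint64_t cap;

    /* Early samples get a looser cap than later ones. */
    if (s->samples <= 60)
        cap = s->mean * 4 + 2 * 1024 * 1024;
    else
        cap = s->mean * 4 + 2 * 1024 * 64;
    if (sdist > cap)
        sdist = cap;

    s->samples = (7 * s->samples + 256) / 8;
    s->total = (7 * s->total + 256 * sdist) / 8;
    s->mean = (s->total + s->samples / 2) / s->samples;

    /* Assumption of this sketch: remember the position here. */
    s->last_pos = sector;
}

static int is_seeky(const struct seek_stats *s)
{
    return s->mean > 8 * 1024;
}
```

The cap on `sdist` keeps one large jump from marking a mostly sequential process as seeky for a long time.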
Idle_window:
Every cfqq has a flag that indicates whether anticipation is worthwhile for this process. For simplicity, I call it idle_window. If the flag is set, the I/O scheduler may do anticipation (whether it actually does still depends on other conditions). If the flag is cleared, the I/O scheduler never does anticipation for this queue.
How is idle_window set?
If the queue is not sync, do not set it;
If the priority class of the process is idle, do not set it;
If the process might have exited, do not set it;
If the user has turned off cfq_slice_idle (set it to 0), do not set it;
If the process is seeky ((cic)->seek_mean > (8 * 1024)), do not set it;
If cic->ttime_mean > cfqd->cfq_slice_idle, do not set it;
Otherwise, set it.
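The rules above can be read as a single predicate. The following is a minimal sketch under that reading, not the kernel routine itself. The struct `idle_inputs` and the function `idle_window_enabled` are hypothetical names, and the sketch returns false wherever the list says "do not set".

```c
#include <stdbool.h>

/* Hypothetical bundle of the inputs named in the list above. */
struct idle_inputs {
    bool is_sync;                 /* queue carries sync requests */
    bool class_idle;              /* priority class is idle */
    bool has_live_tasks;          /* process has not exited */
    unsigned long slice_idle;     /* cfq_slice_idle, 0 means disabled */
    unsigned long long seek_mean; /* mean seek distance in sectors */
    unsigned long ttime_mean;     /* mean think time in jiffies */
};

static bool idle_window_enabled(const struct idle_inputs *in)
{
    if (!in->is_sync)
        return false;
    if (in->class_idle)
        return false;
    if (!in->has_live_tasks)
        return false;
    if (in->slice_idle == 0)
        return false;
    if (in->seek_mean > 8 * 1024) /* seeky process */
        return false;
    if (in->ttime_mean > in->slice_idle)
        return false;
    return true;
}
```

The checks run from cheapest to most specific. Any single failing condition is enough to rule out anticipation for the queue.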
How to dispatch requests:
Suppose a request queue has been selected:
The scheduler decides that it is worth waiting if either of the following holds:
the idle_slice_timer is pending, or
a request from this cfqq is still being served in the device driver and the idle_window of this cfqq is turned on.
When a request has been served by the device driver, the I/O scheduler checks whether the request belongs to the current active_queue. If it does, the scheduler then checks whether this cfqq needs to do anticipation.
The tunable parameters:
back_seek_max:
A routine named cfq_choose_req selects the next request to dispatch. Because the selection is made from the sorted queue, it works the same way as in AS, which is based on DEADLINE. The next request may be the nearest one in front of the current disk head position or the nearest one behind it. The I/O scheduler prefers requests in front of the head, so there is a limit on how far behind the head a request may lie and still be chosen. back_seek_max is that limit.
back_seek_penalty:
The distance between a request behind the head and the current head position is multiplied by this value as a penalty. In other words, a backward request must be sufficiently close to the current head position to be selected as the next request.
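The interaction of back_seek_max and back_seek_penalty can be sketched as a cost function. This is a simplified illustration, not the actual cfq_choose_req logic: the names `seek_cost` and `choose_request` are hypothetical, the limit is passed in sectors, and the sketch ignores any other criteria the kernel routine applies when it compares two requests.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Cost of moving the head from `head` to `sector`.
 * Forward seeks cost their plain distance. Backward seeks within
 * back_max cost distance * penalty. Backward seeks beyond back_max
 * are not eligible, and the function returns false.
 */
static bool seek_cost(uint64_t head, uint64_t sector, uint64_t back_max,
                      unsigned int penalty, uint64_t *cost)
{
    if (sector >= head) {
        *cost = sector - head;
        return true;
    }
    if (head - sector <= back_max) {
        *cost = (head - sector) * penalty;
        return true;
    }
    return false;
}

/* Returns 1 or 2 for the preferred request, 0 if neither is eligible.
 * Ties go to request 1 (a choice made for this sketch only). */
static int choose_request(uint64_t head, uint64_t s1, uint64_t s2,
                          uint64_t back_max, unsigned int penalty)
{
    uint64_t c1 = 0, c2 = 0;
    bool ok1 = seek_cost(head, s1, back_max, penalty, &c1);
    bool ok2 = seek_cost(head, s2, back_max, penalty, &c2);

    if (ok1 && ok2)
        return c1 <= c2 ? 1 : 2;
    if (ok1)
        return 1;
    if (ok2)
        return 2;
    return 0;
}
```

With the head at sector 1000 and a penalty of 2, a request 40 sectors behind costs 80 and beats a request 100 sectors ahead. A request 60 sectors behind costs 120 and loses to it.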
fifo_expire_async:
fifo_expire_sync:
Actually, every process has two queues, one for sync requests and one for async requests. Each of these in turn has two queues, a FIFO and a sorted one, so from this point of view a process has four queues. In general, though, we speak of one queue per process. These two parameters are the expiration times for the async and sync FIFO queues.
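As a rough illustration of how the two expiration values could be applied, the sketch below checks whether the request at the head of a FIFO has waited longer than the limit for its type. `fifo_req` and `fifo_expired` are hypothetical names, and the tick values in the usage note are made up for the example rather than taken from the kernel defaults.

```c
#include <stdbool.h>

/* Hypothetical per-request record; arrival is in jiffies-like ticks. */
struct fifo_req {
    unsigned long arrival;
    bool is_sync;
};

/* True once the request has waited at least its expiration time. */
static bool fifo_expired(const struct fifo_req *rq, unsigned long now,
                         unsigned long expire_sync,
                         unsigned long expire_async)
{
    unsigned long deadline =
        rq->arrival + (rq->is_sync ? expire_sync : expire_async);

    /* Signed difference keeps the comparison valid across wraparound. */
    return (long)(now - deadline) >= 0;
}
```

For example, with `expire_sync = 125` and `expire_async = 250`, a sync request that arrived at tick 1000 counts as expired from tick 1125 onward, while an async request that arrived at the same time expires at tick 1250.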
quantum:
This limits the number of requests being served in the device driver at one time. Since most SATA drives now support NCQ, the I/O scheduler can dispatch more than one request to the device driver. I do not fully understand the relationship between NCQ and the number of requests being served in the device driver.
slice_sync:
slice_async:
The time slice for a sync queue and for an async queue, respectively.
slice_async_rq:
I am not sure about this parameter. After dispatching requests to the device driver, the I/O scheduler checks it to decide whether the cfqq should be expired immediately. It applies to async requests.
slice_idle:
The anticipation time for a sync queue. There is no anticipation for async queues.