Category: LINUX

2015-12-13 02:05:28

Original article: Linux IO Scheduler, by tuyer

The IO scheduler (IO Scheduler) is the mechanism the operating system uses to decide the order in which IO operations are submitted to a block device. It exists for two purposes: to increase IO throughput and to reduce IO response time. Throughput and response time, however, are often at odds; to balance the two as well as possible, the IO scheduler offers several scheduling algorithms suited to different IO workloads. For random read/write workloads such as databases, the most favourable algorithm is DEADLINE. Below we take a quick look at the IO scheduling algorithms provided by the Linux 2.6 kernel, from the simplest to the most complex.
1. NOOP
NOOP stands for No Operation. This algorithm implements the simplest possible FIFO queue: all IO requests are serviced roughly in the order they arrive. "Roughly", because on top of the FIFO, NOOP also merges adjacent IO requests, so requests are not satisfied in strictly first-in-first-out order.
Suppose we have the following sequence of IO requests:
100,500,101,10,56,1000
NOOP will service them in the following order:
100(101),500,10,56,1000
2. CFQ
CFQ stands for Completely Fair Queuing. Its distinguishing feature is that it sorts IO requests by their address on the block device rather than responding to them in arrival order.
Suppose the same sequence of IO requests:
100,500,101,10,56,1000
CFQ will service them in the following order:
100,101,500,1000,10,56
On traditional SAS disks, seeking accounts for the vast majority of IO response time. CFQ's premise is to sort requests by address so that as many IO requests as possible can be satisfied with as little head movement as possible. Under CFQ, the throughput of SAS disks improves dramatically. The downside compared with NOOP is that a request that arrived early is not guaranteed to be served promptly and, in the extreme case, may be starved.
3. DEADLINE
DEADLINE builds on CFQ and eliminates the extreme case in which IO requests are starved. In addition to the sorted queue that CFQ already maintains, DEADLINE adds separate FIFO queues for read IO and write IO. The maximum wait time is 500 ms for the read FIFO queue and 5 s for the write FIFO queue. Requests in the FIFO queues have higher priority than those in the sorted queue, and the read FIFO queue has higher priority than the write FIFO queue. The priorities can be expressed as:
FIFO(Read) > FIFO(Write) > CFQ
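As an illustration, when the deadline scheduler is active these two timeouts are normally exposed as runtime tunables under sysfs (sda is just an example device; the defaults shown in the comments are in milliseconds and may differ between kernels):
cat /sys/block/sda/queue/iosched/read_expire   # read deadline, typically 500 ms
cat /sys/block/sda/queue/iosched/write_expire  # write deadline, typically 5000 ms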
4. ANTICIPATORY
CFQ and DEADLINE concentrate on servicing scattered IO requests; they do nothing special for sequential IO such as sequential reads. To cope with workloads that mix random and sequential IO, Linux also provides the ANTICIPATORY algorithm. ANTICIPATORY builds on DEADLINE and adds a 6 ms waiting window after each read IO. If the OS receives a read request for an adjacent location within those 6 ms, it can be satisfied immediately.
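On kernels that still ship the anticipatory scheduler, this anticipation window is likewise exposed as an iosched tunable; assuming sda is currently using the anticipatory scheduler, you could inspect it with:
cat /sys/block/sda/queue/iosched/antic_expire  # anticipation window, roughly 6 ms by default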
Which IO scheduler to choose depends both on the hardware and on the workload.
On traditional SAS disks, CFQ, DEADLINE and ANTICIPATORY are all reasonable choices; for a dedicated database server, DEADLINE delivers good throughput and good response times. On newer solid-state devices such as SSDs and Fusion-io cards, however, the simplest algorithm, NOOP, may well be the best choice, because the optimisations in the other three are all aimed at reducing seek time, whereas solid-state devices have no seek time to speak of and respond to IO very quickly.
Viewing and changing the IO scheduler is very simple. For example, to operate on sda:
cat /sys/block/sda/queue/scheduler
echo "cfq" > /sys/block/sda/queue/scheduler
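The cat command lists every scheduler compiled into the kernel, with the one currently in use shown in square brackets; typical output looks something like this (which name is bracketed will vary from system to system):
noop anticipatory deadline [cfq]
Running cat again after the echo should show the newly selected scheduler in brackets.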

The 2.6 Linux kernel introduced selectable IO schedulers. IO schedulers control the way the kernel commits reads and writes to disks – the intention of providing different schedulers is to allow better optimisation for different classes of workload.

Without an IO scheduler, the kernel would basically just issue each request to disk in the order it received it. This could result in massive thrashing of the disk subsystem – if one process was reading from one part of the disk and another was writing to a different part, the head would have to seek back and forth across the disk for every operation. The scheduler's main goal is to optimise disk access times.

An IO scheduler can use the following techniques to improve performance:

  • Request merging – the scheduler merges adjacent requests together to reduce disk seeking.
  • Elevator – the scheduler orders requests based on their physical location on the block device, and basically tries to keep seeking in one direction as much as possible.
  • Prioritisation – the scheduler has complete control over how it prioritises requests, and can do so in a number of ways.

All IO schedulers should also take into account resource starvation, to ensure requests eventually do get serviced!

The Schedulers

There are currently 4 available:

  • Noop Scheduler
  • Anticipatory IO Scheduler ("as scheduler")
  • Deadline Scheduler
  • Complete Fair Queueing Scheduler ("cfq scheduler")
Noop Scheduler
This scheduler only implements request merging.
Anticipatory IO Scheduler ("as scheduler")

The anticipatory scheduler is the default scheduler in older 2.6 kernels – if you've not specified one, this is the one that will be loaded. It implements request merging, a one-way elevator, read and write request batching, and attempts some anticipatory reads by holding off a bit after a read batch if it thinks a user is going to ask for more data. It tries to optimise for physical disks by avoiding head movements if possible – one downside to this is that it probably gives highly erratic performance on database or storage systems.

Deadline Scheduler

The deadline scheduler implements request merging, a one-way elevator, and imposes a deadline on all operations to prevent resource starvation. Because writes return almost instantly under Linux, with the actual data being held in cache, the deadline scheduler will also prefer readers – as long as the deadline for a write request hasn't passed. The kernel docs suggest this is the preferred scheduler for database systems, especially if you have TCQ-aware disks, or for any system with high disk performance.

Complete Fair Queueing Scheduler ("cfq scheduler")

The complete fair queueing scheduler implements both request merging and the elevator, and attempts to give all users of a particular device the same number of IO requests over a particular time interval. This should make it more efficient for multiuser systems. It seems that Novell SLES sets cfq as the scheduler by default, as does the latest release. As of the 2.6.18 kernel, this is the default scheduler in kernel.org releases.

Changing Schedulers

The most reliable way to change schedulers is to set the kernel option ‘elevator’ at boot time. You can set it to one of "as", "cfq", "deadline" or "noop" to select the appropriate scheduler.
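For example, on a GRUB-based system the option is appended to the kernel line in the boot loader configuration; the kernel image and root device below are placeholders, not values taken from any real system:
kernel /boot/vmlinuz-2.6.18 ro root=/dev/sda1 elevator=deadline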

It seems that under more recent 2.6 kernels (2.6.11, possibly earlier), you can change the scheduler at runtime by echoing its name into /sys/block/<devicename>/queue/scheduler, where devicename is the base name of the block device, e.g. sda for /dev/sda.
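For example, to check and change the scheduler for /dev/sda at runtime (these are the same sysfs commands shown earlier, repeated here for completeness):
cat /sys/block/sda/queue/scheduler            # the active scheduler is shown in brackets
echo "deadline" > /sys/block/sda/queue/scheduler
cat /sys/block/sda/queue/scheduler            # verify that the change took effect
Note that a change made this way lasts only until the next reboot; use the boot-time ‘elevator’ option to make it permanent.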

Which one should I use?

I’ve not personally done any testing on this, so I can’t speak from experience yet. The anticipatory scheduler is the default one for a reason, however – it is optimised for the common case. If you’ve only got single-disk systems (i.e. no RAID, hardware or software), then this scheduler is probably the right one for you. If it’s a multiuser system, you will probably find cfq or deadline providing better performance, and the numbers seem to back deadline as giving the best performance for database systems.

Tuning the IO schedulers

The schedulers may have parameters that can be tuned at runtime. Read the Linux documentation on the schedulers listed in the section below.
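For example, the per-scheduler tunables live in the iosched directory under the device's queue in sysfs. Assuming sda is currently using the deadline scheduler, you might see something like the following (the exact set of files depends on the scheduler and kernel version):
ls /sys/block/sda/queue/iosched/
# typical deadline tunables: fifo_batch  front_merges  read_expire  write_expire  writes_starved
Each of these files can be read with cat and written with echo, just like the scheduler file itself.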

More information

Read the documents mentioned in the section below, especially the Linux kernel documentation on the anticipatory and deadline schedulers.

