== Structure ==
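The block I/O layer is organized in layers. The read and write system calls pass through these layers, although a call may also stop at the page cache,
either because the data to read is already there or because the written data is buffered and written back later. Each layer has one main role.
In practice, the VFS and FS layers establish the mapping for the read and write functions, while the page cache and buffer cache do a great deal to improve system performance.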
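The read/write mapping set up by the VFS and FS layers can be pictured as a table of function pointers. What follows is only a minimal user-space sketch of that idea, not kernel code: `file_model`, `file_ops_model`, `demo_read`, `demo_write`, `vfs_read_model` and `vfs_write_model` are hypothetical names, loosely modeled on the kernel's `struct file_operations`, and the 64-byte in-memory "file" is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct file_model;

/* Hypothetical, simplified analogue of the kernel's struct file_operations. */
struct file_ops_model {
    long (*read)(struct file_model *f, char *buf, size_t len);
    long (*write)(struct file_model *f, const char *buf, size_t len);
};

/* Hypothetical open file: an ops table plus a tiny in-memory payload. */
struct file_model {
    const struct file_ops_model *f_op;
    char data[64];
    size_t size;
};

/* "FS layer": one concrete implementation of read and write. */
static long demo_read(struct file_model *f, char *buf, size_t len)
{
    size_t n = len < f->size ? len : f->size;
    memcpy(buf, f->data, n);
    return (long)n;
}

static long demo_write(struct file_model *f, const char *buf, size_t len)
{
    size_t n = len < sizeof(f->data) ? len : sizeof(f->data);
    memcpy(f->data, buf, n);
    f->size = n;
    return (long)n;
}

static const struct file_ops_model demo_fops = {
    .read = demo_read,
    .write = demo_write,
};

/* "VFS layer": knows nothing about the FS, only follows the pointers. */
static long vfs_read_model(struct file_model *f, char *buf, size_t len)
{
    assert(f->f_op != NULL && f->f_op->read != NULL);
    return f->f_op->read(f, buf, len);
}

static long vfs_write_model(struct file_model *f, const char *buf, size_t len)
{
    assert(f->f_op != NULL && f->f_op->write != NULL);
    return f->f_op->write(f, buf, len);
}
```

A caller only ever uses `vfs_read_model()` and `vfs_write_model()`; swapping `demo_fops` for another table changes the file system without touching the caller.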
Layer structure
VFS
FS: Btrfs, Ext fs
Page cache, Buffer cache, submit_bio
Structure
First of all, this cache is not the same thing as hardware caches such as L1 and L2; it refers only to caching in RAM! The main structure in this layer is buffer_head. It holds all the information that the kernel needs to manipulate buffers, such as which block device and which specific block the buffer belongs to, which page holds the buffer, and which buffer comes next in the same page.
struct buffer_head {
unsigned long b_state; /* buffer state bitmap (see above) */
struct buffer_head *b_this_page;/* circular list of page's buffers */
struct page *b_page; /* the page this bh is mapped to */
sector_t b_blocknr; /* start block number */
size_t b_size; /* size of mapping; presumably the actual size of the buffer's data */
char *b_data; /* pointer to data within the page; the start address of the data, which pairs neatly with b_size */
struct block_device *b_bdev;
bh_end_io_t *b_end_io; /* I/O completion */
void *b_private; /* reserved for b_end_io */
struct list_head b_assoc_buffers; /* associated with another mapping */
struct address_space *b_assoc_map; /* mapping this buffer is associated with */
atomic_t b_count; /* users using this buffer_head (reference count) */
};
Here b_state records the current state of this buffer.
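To make the state bitmap a little more concrete, here is a minimal user-space sketch. The bit names follow the first entries of the kernel's `enum bh_state_bits`; everything else is an assumption for illustration: `buffer_head_model`, `bh_set`, `bh_clear`, `bh_test` and the `BH_NR_BITS` sentinel are hypothetical, and the kernel itself uses atomic bit operations (through its `set_buffer_*()`, `clear_buffer_*()` and `buffer_*()` helpers) rather than the plain ones shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* First few bits of the kernel's enum bh_state_bits. */
enum bh_state_bits {
    BH_Uptodate,   /* buffer contains valid data */
    BH_Dirty,      /* buffer is newer than the on-disk block */
    BH_Lock,       /* buffer is locked, I/O in progress */
    BH_Req,        /* buffer has been submitted for I/O */
    BH_NR_BITS     /* hypothetical sentinel, not part of the kernel enum */
};

/* Hypothetical stand-in holding only the field this sketch needs. */
struct buffer_head_model {
    unsigned long b_state;
};

/* Set one state bit (non-atomic, unlike the kernel helpers). */
static void bh_set(struct buffer_head_model *bh, enum bh_state_bits bit)
{
    assert(bit < BH_NR_BITS);
    bh->b_state |= 1UL << bit;
}

/* Clear one state bit. */
static void bh_clear(struct buffer_head_model *bh, enum bh_state_bits bit)
{
    assert(bit < BH_NR_BITS);
    bh->b_state &= ~(1UL << bit);
}

/* Report whether one state bit is set. */
static bool bh_test(const struct buffer_head_model *bh, enum bh_state_bits bit)
{
    assert(bit < BH_NR_BITS);
    return (bh->b_state & (1UL << bit)) != 0;
}
```

A buffer that has been read from disk and then modified would have both `BH_Uptodate` and `BH_Dirty` set; writing it back clears `BH_Dirty` again.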
Function
Robert Love writes: "The purpose of a buffer head is to describe this mapping between the on-disk block and the physical in-memory buffer (which is a sequence of bytes on a specific page). Acting as a descriptor of this buffer-to-block mapping is the data structure's only role in the kernel." Put plainly, buffer_head is an actor that spends its whole life playing someone else, and are we not the same?
The generic block layer: struct bio
The primary purpose of the bio structure is to represent an in-flight block I/O operation.
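Want to know where bio came from? Around kernel 2.5, the kernel developers found that using buffer_head (the structure above) as the data object for block I/O operations was wasteful. Why? One buffer_head can describe only one block or buffer. With 4 KB blocks, a 1 MB write needs 256 buffer_heads to describe it, which really is expensive. So the experts designed bio, whose core is the bio_vec array. An array? Yes: with one array inside the structure, it no longer matters how many buffers or blocks there are. It really is that simple. Each bio_vec is in effect a variant of the buffer_head above. The difference is that a bio_vec can cover several blocks at once, provided they all lie in the same page. Look at the definition of bio_vec and you will see why it can describe multiple blocks: it is just an offset plus a length.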
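For reference, here is a minimal user-space sketch of the bio_vec idea. The three fields match the kernel's `struct bio_vec` of that era (page, length, offset); the rest is assumed for illustration: `PAGE_SIZE_MODEL` assumes 4 KB pages, `struct page` is left opaque, and the helpers `bvec_block_count` and `segments_for_bytes` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_MODEL 4096u   /* assumed page size for this sketch */

struct page;                    /* opaque here; the kernel defines it */

/* Same three fields as the kernel's struct bio_vec of that era. */
struct bio_vec {
    struct page *bv_page;       /* page holding the data */
    unsigned int bv_len;        /* number of bytes in this segment */
    unsigned int bv_offset;     /* where the data starts within the page */
};

/*
 * Hypothetical helper: how many blocks of block_size one segment covers.
 * Assumes the segment is block-aligned and stays inside a single page.
 */
static unsigned int bvec_block_count(const struct bio_vec *bv,
                                     unsigned int block_size)
{
    assert(block_size != 0);
    assert(bv->bv_offset + bv->bv_len <= PAGE_SIZE_MODEL);
    assert(bv->bv_len % block_size == 0);
    return bv->bv_len / block_size;
}

/* Hypothetical helper: page-sized segments needed to describe nbytes. */
static size_t segments_for_bytes(size_t nbytes)
{
    return (nbytes + PAGE_SIZE_MODEL - 1) / PAGE_SIZE_MODEL;
}
```

With 1 KB blocks, a single segment spanning a whole 4 KB page stands in for four per-block descriptors, and a 1 MB transfer fits into 256 page-sized segments that can all hang off one bio.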
struct bio {
sector_t bi_sector; /* device address in 512 byte sectors */
struct bio *bi_next; /* request queue link */
struct block_device *bi_bdev;
unsigned long bi_flags; /* status, command, etc */
unsigned long bi_rw; /* bottom bits READ/WRITE, top bits priority */
unsigned short bi_vcnt; /* how many bio_vec's */
unsigned short bi_idx; /* current index into bvl_vec */
unsigned int bi_phys_segments;/* Number of segments in this BIO after physical address coalescing is performed. */
unsigned int bi_size; /* residual I/O count */
/** To keep track of the max segment size, we account for the sizes of the first and last mergeable segments in this bio. */
unsigned int bi_seg_front_size;
unsigned int bi_seg_back_size;
unsigned int bi_max_vecs; /* max bvl_vecs we can hold */
atomic_t bi_cnt; /* pin count */
struct bio_vec *bi_io_vec; /* the actual vec list */
bio_end_io_t *bi_end_io;
void *bi_private;
#ifdef CONFIG_BLK_CGROUP
/* Optional ioc and css associated with this bio. Put on bio release. Read comment on top of bio_associate_current(). */
struct io_context *bi_ioc;
struct cgroup_subsys_state *bi_css;
#endif
#if defined(CONFIG_BLK_DEV_INTEGRITY)
struct bio_integrity_payload *bi_integrity; /* data integrity */
#endif
bio_destructor_t *bi_destructor; /* destructor */
/** We can inline a number of vecs at the end of the bio, to avoid double allocations for a small number of bio_vecs. This member MUST obviously be kept at the very end of the bio. */
struct bio_vec bi_inline_vecs[0];
};
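How bi_idx, bi_vcnt and bi_io_vec work together can be sketched in user space as follows. This is only an illustration under assumptions: `bio_model`, `bio_vec_model` and `bio_remaining_bytes` are hypothetical names, and the loop mimics what the kernel's `bio_for_each_segment()` iteration did in kernels of that era, starting at the current index rather than at zero.

```c
#include <assert.h>
#include <stddef.h>

struct page;                        /* opaque here */

/* Hypothetical copy of the three bio_vec fields. */
struct bio_vec_model {
    struct page *bv_page;
    unsigned int bv_len;
    unsigned int bv_offset;
};

/* Hypothetical cut-down bio: only the fields the walk needs. */
struct bio_model {
    unsigned short bi_vcnt;         /* how many bio_vecs are in use */
    unsigned short bi_idx;          /* current index into bi_io_vec */
    struct bio_vec_model *bi_io_vec;
};

/* Sum the bytes still to be transferred, from bi_idx to the last vec. */
static unsigned int bio_remaining_bytes(const struct bio_model *bio)
{
    unsigned int total = 0;
    unsigned short i;

    assert(bio->bi_idx <= bio->bi_vcnt);
    for (i = bio->bi_idx; i < bio->bi_vcnt; i++)
        total += bio->bi_io_vec[i].bv_len;
    return total;
}
```

As segments complete, advancing bi_idx shrinks the remaining byte count, which is the role the "residual I/O count" comment on bi_size describes.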
I/O scheduler
Optimization! Merging and sorting requests before dispatching them to the block device queue improves performance.
The Linus Elevator
The deadline I/O scheduler
The anticipatory I/O scheduler
The Completely Fair Queuing (CFQ) I/O scheduler
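The merging and sorting idea that these schedulers build on can be sketched in a few lines of user-space C. This is a toy model, not any of the schedulers listed: `req`, `req_queue`, `queue_add` and the capacity of 16 are hypothetical, requests are assumed not to overlap, and two queued requests that become adjacent after a merge are not coalesced.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 16                /* assumed capacity for this sketch */

/* Hypothetical request: a run of contiguous sectors. */
struct req {
    unsigned long sector;           /* first sector */
    unsigned long nr_sectors;       /* length in sectors */
};

/* Hypothetical queue, kept in ascending sector order. */
struct req_queue {
    struct req reqs[QUEUE_CAP];
    size_t count;
};

/* Add a request: merge if adjacent, else sorted insert. -1 if full. */
static int queue_add(struct req_queue *q, unsigned long sector,
                     unsigned long nr)
{
    size_t i;

    for (i = 0; i < q->count; i++) {
        struct req *r = &q->reqs[i];

        if (r->sector + r->nr_sectors == sector) {
            r->nr_sectors += nr;    /* back merge: new one follows r */
            return 0;
        }
        if (sector + nr == r->sector) {
            r->sector = sector;     /* front merge: new one precedes r */
            r->nr_sectors += nr;
            return 0;
        }
    }
    if (q->count == QUEUE_CAP)
        return -1;

    /* Insertion sort step: shift larger sectors up, then place. */
    i = q->count;
    while (i > 0 && q->reqs[i - 1].sector > sector) {
        q->reqs[i] = q->reqs[i - 1];
        i--;
    }
    q->reqs[i].sector = sector;
    q->reqs[i].nr_sectors = nr;
    q->count++;
    return 0;
}
```

Four incoming requests can end up as two larger ones in sector order, so the device sees fewer, more sequential operations.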
Block driver
q->request_fn
This is where the actual block device driver gets called.
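The q->request_fn hook is simply a function pointer stored in the request queue. Here is a minimal user-space sketch of that dispatch; all names (`request_queue_model`, `request_fn_model`, `demo_request_fn`, `run_queue`) and the two counters are hypothetical. In kernels of that era a driver would typically register its real routine through `blk_init_queue()`, which this sketch does not attempt to model.

```c
#include <assert.h>
#include <stddef.h>

struct request_queue_model;

/* Hypothetical analogue of the kernel's request function type. */
typedef void (request_fn_model)(struct request_queue_model *q);

/* Hypothetical cut-down request queue. */
struct request_queue_model {
    request_fn_model *request_fn;   /* installed by the driver */
    unsigned int pending;           /* requests waiting for the driver */
    unsigned int completed;         /* requests the driver has handled */
};

/* "Driver side": drain every pending request from the queue. */
static void demo_request_fn(struct request_queue_model *q)
{
    while (q->pending > 0) {
        q->pending--;
        q->completed++;
    }
}

/* "Block layer side": hand the queue to whichever driver registered. */
static void run_queue(struct request_queue_model *q)
{
    assert(q->request_fn != NULL);
    q->request_fn(q);
}
```

The block layer never needs to know which driver sits behind the pointer; it only calls through `request_fn` once requests are ready.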
Block device
to be continued