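Code-level analysis of zfs is hard to find, and there is almost nothing in Chinese, so here I walk through the low-level implementation of how zfs writes to disk.
In the zio pipeline, one of the stages is named vdev_disk_io_start.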
The function vdev_disk_io_start(zio *zio) does the following:
1. It checks the request type of the zio (zio->io_type).
2. Depending on the type, it passes the zio to __vdev_disk_physio(vd->vd_bdev, zio, zio->io_data,
zio->io_size, zio->io_offset, flags);
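The two steps above can be sketched in user-space C. This is only an illustration of the dispatch idea, not the ZFS code: the `sketch_` types, enum values, and the function `sketch_pick_flags` are hypothetical names invented here, and the real zio types and kernel read/write flags differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real zio types and kernel flags differ. */
enum sketch_io_type { SKETCH_IO_READ, SKETCH_IO_WRITE, SKETCH_IO_OTHER };
enum sketch_rw_flag { SKETCH_FLAG_READ = 0, SKETCH_FLAG_WRITE = 1 };

struct sketch_zio {
	enum sketch_io_type io_type;	/* what kind of request this is */
	size_t io_size;			/* bytes to transfer */
	unsigned long long io_offset;	/* byte offset on the device */
};

/*
 * Step 1: inspect the request type. Step 2: derive the read/write flag
 * that would be handed on to the physio routine. Returns 0 on success
 * and -1 for a type this sketch does not handle.
 */
static int
sketch_pick_flags(const struct sketch_zio *zio, int *flags)
{
	assert(zio != NULL && flags != NULL);

	switch (zio->io_type) {
	case SKETCH_IO_READ:
		*flags = SKETCH_FLAG_READ;
		return 0;
	case SKETCH_IO_WRITE:
		*flags = SKETCH_FLAG_WRITE;
		return 0;
	default:
		return -1;
	}
}
```

A caller would fill in a `struct sketch_zio`, call `sketch_pick_flags`, and only go on to build the block requests when it returns 0.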
__vdev_disk_physio relies on the dio_request struct, which wraps a single request so that it can be used when calling submit_bio.
The dio_request struct is defined as follows:
- /*
- * Virtual device vector for disks.
- */
- typedef struct dio_request {
- struct completion dr_comp; /* Completion for sync IO */
- atomic_t dr_ref; /* References */
- zio_t *dr_zio; /* Parent ZIO */
- int dr_rw; /* Read/Write */
- int dr_error; /* Bio error */
- int dr_bio_count; /* Count of bio's */
- struct bio *dr_bio[0]; /* Attached bio's */
- } dio_request_t;
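Note that dr_bio[] is a variable-length array: it is declared with size 0 as the last member, so the number of attached bios is decided when the dio_request is allocated. __vdev_disk_physio is where those bios are filled in.
The arguments passed down from the zio tell the vdev layer the I/O offset of the request, the pointer to the request buffer, and the request type.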
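To make the variable-length array concrete, here is a minimal user-space sketch of allocating a struct whose last member is such an array. It is an assumption about the general technique, not the body of vdev_disk_dio_alloc, which is not shown in this post. The names `sketch_dio`, `sketch_bio`, `sketch_dio_alloc`, and `sketch_dio_free` are hypothetical, and the sketch uses the C99 `[]` form rather than the `[0]` form seen in the source.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_bio;	/* opaque stand-in for the kernel's struct bio */

struct sketch_dio {
	int bio_count;			/* number of slots in bios[] */
	struct sketch_bio *bios[];	/* C99 flexible array member */
};

/* Allocate the header and bio_count pointer slots in one block. */
static struct sketch_dio *
sketch_dio_alloc(int bio_count)
{
	struct sketch_dio *dr;

	assert(bio_count > 0);
	/* calloc zeroes the block, so every slot starts out empty. */
	dr = calloc(1, sizeof (*dr) +
	    (size_t)bio_count * sizeof (dr->bios[0]));
	if (dr != NULL)
		dr->bio_count = bio_count;
	return dr;
}

/* Release a request obtained from sketch_dio_alloc. */
static void
sketch_dio_free(struct sketch_dio *dr)
{
	free(dr);
}
```

Because the slots live in the same allocation as the header, a request that needs more bios than it has slots cannot grow in place; it has to be freed and allocated again with a larger count.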
- error = __vdev_disk_physio(vd->vd_bdev, zio, zio->io_data,
- zio->io_size, zio->io_offset, flags);
- static int
- __vdev_disk_physio(struct block_device *bdev, zio_t *zio, caddr_t kbuf_ptr,
- size_t kbuf_size, uint64_t kbuf_offset, int flags)
- {
- dio_request_t *dr;
- caddr_t bio_ptr;
- uint64_t bio_offset;
- int bio_size, bio_count = 16;
- int i = 0, error = 0;
- ASSERT3U(kbuf_offset + kbuf_size, <=, bdev->bd_inode->i_size);
- retry:
- dr = vdev_disk_dio_alloc(bio_count);
- if (dr == NULL)
- return ENOMEM;
- ...
- dr->dr_zio = zio;
- dr->dr_rw = flags;
- /* bio_ptr points to the memory that actually holds the data */
- bio_ptr = kbuf_ptr;
- bio_offset = kbuf_offset;
- bio_size = kbuf_size;
- for (i = 0; i <= dr->dr_bio_count; i++) {
- ...
- if (bio_size <= 0)
- break;
- dr->dr_bio[i] = bio_alloc(GFP_NOIO,
- bio_nr_pages(bio_ptr, bio_size));
- if (dr->dr_bio[i] == NULL) {
- vdev_disk_dio_free(dr);
- return ENOMEM;
- }
- /* Matching put called by vdev_disk_physio_completion */
- vdev_disk_dio_get(dr);
- dr->dr_bio[i]->bi_bdev = bdev;
- dr->dr_bio[i]->bi_sector = bio_offset >> 9; /* convert the byte offset into 512-byte sectors */
- dr->dr_bio[i]->bi_rw = dr->dr_rw;
- dr->dr_bio[i]->bi_end_io = vdev_disk_physio_completion;
- dr->dr_bio[i]->bi_private = dr;
- /* Remaining size is returned to become the new size */
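- bio_size = bio_map(dr->dr_bio[i], bio_ptr, bio_size);
- /*
-  * bio_map fills in the bio's vector (bi_io_vec): the data at bio_ptr is
-  * added to the bio page by page. Once bio_size drops to 0 or below, the
-  * check at the top of the loop exits it.
-  */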
-
- /* Advance in buffer and construct another bio if needed */
- bio_ptr += dr->dr_bio[i]->bi_size;
- bio_offset += dr->dr_bio[i]->bi_size;
- }
- /* Extra reference to protect dio_request during submit_bio */
- vdev_disk_dio_get(dr);
- if (zio)
- zio->io_delay = jiffies_64;
- /* Submit all bio's associated with this dio */
- for (i = 0; i < dr->dr_bio_count; i++)
- if (dr->dr_bio[i])
- submit_bio(dr->dr_rw, dr->dr_bio[i]); /* all bios of one request share the same type; each is handed to the kernel here */
- /*
- * On synchronous blocking requests we wait for all bio the completion
- * callbacks to run. We will be woken when the last callback runs
- * for this dio. We are responsible for putting the last dio_request
- * reference will in turn put back the last bio references. The
- * only synchronous consumer is vdev_disk_read_rootlabel() all other
- * IO originating from vdev_disk_io_start() is asynchronous.
- */
- if (vdev_disk_dio_is_sync(dr)) {
- wait_for_completion(&dr->dr_comp);
- error = dr->dr_error;
- ASSERT3S(atomic_read(&dr->dr_ref), ==, 1);
- }
- (void)vdev_disk_dio_put(dr);
- return error;
- }
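The loop in the middle of the function can be hard to follow, so here is a simplified user-space sketch of the same idea: split one buffer into several chunks, convert each byte offset to a sector with `>> 9`, and carry the remaining size forward. It is not the kernel code. `sketch_chunk`, `sketch_split`, and the `max_chunk` limit are hypothetical, and the fixed per-chunk limit stands in for however much bio_map manages to map into a single bio.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	SKETCH_SECTOR_SHIFT	9	/* 512-byte sectors, as in offset >> 9 */

struct sketch_chunk {
	uint64_t sector;	/* starting sector on the device */
	size_t size;		/* bytes carried by this chunk */
};

/*
 * Split [offset, offset + size) into chunks of at most max_chunk bytes.
 * Returns the number of chunks written, or -1 if out[] is too small.
 */
static int
sketch_split(uint64_t offset, size_t size, size_t max_chunk,
    struct sketch_chunk *out, int out_len)
{
	int i = 0;

	assert(max_chunk > 0 && out != NULL);
	while (size > 0) {
		size_t take = size < max_chunk ? size : max_chunk;

		/*
		 * Out of slots. The source has a retry: label for its own
		 * handling; this sketch just reports failure.
		 */
		if (i == out_len)
			return -1;

		out[i].sector = offset >> SKETCH_SECTOR_SHIFT;
		out[i].size = take;
		offset += take;		/* advance the device offset */
		size -= take;		/* remaining bytes become the new size */
		i++;
	}
	return i;
}
```

For example, a 10240-byte request at byte offset 4096 with a 4096-byte limit splits into three chunks starting at sectors 8, 16, and 24, the last one carrying 2048 bytes.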
At this point the work of this function is done, and the remaining stages of the zio pipeline run next.
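The comments in the code mention a get for every bio, a matching put in the completion callback, and one extra reference held during submission. The following user-space sketch shows that reference-counting pattern with C11 atomics. It is an assumption about the general technique, not the implementation of vdev_disk_dio_get or vdev_disk_dio_put, which this post does not show. All `sketch_` names are hypothetical, and the unsynchronized `error` field is only safe because the sketch is single-threaded.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct sketch_req {
	atomic_int ref;		/* outstanding references */
	int error;		/* first error reported by any chunk */
};

/* Allocate a request with no references and no error recorded. */
static struct sketch_req *
sketch_req_alloc(void)
{
	struct sketch_req *r = malloc(sizeof (*r));

	if (r != NULL) {
		atomic_init(&r->ref, 0);
		r->error = 0;
	}
	return r;
}

/* Take one reference. */
static void
sketch_req_get(struct sketch_req *r)
{
	atomic_fetch_add(&r->ref, 1);
}

/* Drop one reference; frees the request and returns 0 if it was the last. */
static int
sketch_req_put(struct sketch_req *r)
{
	int remaining = atomic_fetch_sub(&r->ref, 1) - 1;

	assert(remaining >= 0);
	if (remaining == 0)
		free(r);
	return remaining;
}

/* Stand-in for a per-chunk completion callback. */
static int
sketch_chunk_done(struct sketch_req *r, int error)
{
	if (error != 0 && r->error == 0)
		r->error = error;	/* keep only the first error */
	return sketch_req_put(r);
}
```

The extra reference taken by the submitter is what keeps the request alive if every chunk completes before the submission loop has finished; whoever drops the last reference frees it.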