Category: Servers & Storage

2010-07-13 14:31:30

Benefits of Blocks:

Having a block abstraction for a distributed filesystem brings several benefits.

The first benefit is the most obvious: a file can be larger than any single disk in the network.

Second, making the unit of abstraction a block rather than a file simplifies the storage subsystem.

Furthermore, blocks fit well with replication for providing fault tolerance and availability. 

Benefits of splitting data into blocks (summary):

(1) A file is no longer limited by the size of any single disk in the cluster.

(2) The design of the storage subsystem is simplified: storage management is streamlined and metadata complexity is reduced.

(3) Combined with the replication mechanism, blocks provide high availability and reliability.
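As a small illustration of the block abstraction, here is a rough Java sketch (it assumes a reachable HDFS cluster; the path /data/large.log is hypothetical) that reads a file's length and block size through the FileSystem API and computes how many blocks the file occupies:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockCountExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();                // reads core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);                    // DistributedFileSystem for hdfs:// URIs

        FileStatus status = fs.getFileStatus(new Path("/data/large.log"));  // hypothetical path
        long blockSize = status.getBlockSize();                  // e.g. 64 MB or 128 MB, per configuration
        long numBlocks = (status.getLen() + blockSize - 1) / blockSize;

        // The blocks are scattered across datanodes, so the file can be
        // larger than any single disk in the cluster.
        System.out.println(status.getLen() + " bytes in " + numBlocks
                + " blocks of up to " + blockSize + " bytes each");
    }
}
```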

Namenode and Datanode

The namenode manages the filesystem namespace. It maintains the filesystem tree and the metadata for all the files and directories in the tree. This information is stored persistently on the local disk in the form of two files: the namespace image and the edit log. The namenode also knows the datanodes on which all the blocks for a given file are located; however, it does not store block locations persistently, since this information is reconstructed from datanodes when the system starts.

Datanodes are the work horses of the filesystem. They store and retrieve blocks when they are told to (by clients or the namenode), and they report back to the namenode periodically with lists of blocks that they are storing.

Namenode:

Maintains the filesystem tree and the metadata for all files and directories in the tree.

Datanodes:

1) Store and retrieve data blocks.
2) Periodically report the list of blocks they store to the namenode.
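The namenode's block-location knowledge can also be observed from the client side. The following sketch (again assuming a reachable cluster; /data/sample.txt is a hypothetical path) asks, through the FileSystem API, which datanodes hold each block of a file; the namenode assembles the answer from the block reports the datanodes send in.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocationExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);                    // talks to the namenode under the hood

        FileStatus status = fs.getFileStatus(new Path("/data/sample.txt"));  // hypothetical path

        // The namenode answers this query from the block reports it receives
        // from the datanodes; the mapping is not stored persistently on disk.
        for (BlockLocation block : fs.getFileBlockLocations(status, 0, status.getLen())) {
            System.out.println("offset=" + block.getOffset()
                    + " length=" + block.getLength()
                    + " datanodes=" + String.join(",", block.getHosts()));
        }
    }
}
```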

HDFS Architecture

HDFS – Interfaces (excluding Java)

 

HDFS - Anatomy of a File Read

(1)   The client opens the file it wishes to read by calling open() on the FileSystem object, which for HDFS is an instance of DistributedFileSystem.

(2)   DistributedFileSystem calls the namenode, using RPC, to determine the locations of the first few blocks in the file. The DistributedFileSystem returns an FSDataInputStream (an input stream that supports file seeks) to the client for it to read data from. FSDataInputStream in turn wraps a DFSInputStream, which manages the datanode and namenode I/O.

(3)   The client then calls read() on the stream. DFSInputStream, which has stored the datanode addresses for the first few blocks in the file, connects to the first (closest) datanode for the first block in the file.

(4)   Data is streamed from the datanode back to the client, which calls read() repeatedly on the stream.

(5)   When the end of the block is reached, DFSInputStream will close the connection to the datanode, then find the best datanode for the next block. This happens transparently to the client.

(6)   When the client has finished reading, it calls close() on the FSDataInputStream.

My translation of the steps follows; apologies if it reads a little roughly.

(1) The client opens the file it wishes to read by calling open() on the filesystem object (for Hadoop, a DistributedFileSystem instance).
(2) The DistributedFileSystem object communicates with the namenode over RPC to determine the locations of the first few blocks of the file. It returns an FSDataInputStream (an input stream that supports seeking) to the client for reading the data. FSDataInputStream in turn wraps a DFSInputStream, which manages communication with the namenode and the datanodes.
(3) The client calls read() on the DFSInputStream, which connects to the closest datanode holding the first block of the file (how "closest" is determined is covered in a later slide).
(4) The client calls read() repeatedly, and data is streamed from the datanode back to the client.
(5) When the end of a block is reached, DFSInputStream closes the connection to that datanode and then finds the best datanode for the next block. This is transparent to the client.
(6) When the client has finished reading, it calls close() on the FSDataInputStream.
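To make this read path concrete, here is a minimal Java sketch (the path /data/sample.txt is hypothetical) that opens a file through the FileSystem API and copies it to standard output; the DistributedFileSystem / DFSInputStream machinery described in the steps above runs underneath these calls.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsReadExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();             // picks up fs.defaultFS from the cluster config
        FileSystem fs = FileSystem.get(conf);                 // a DistributedFileSystem instance for hdfs:// URIs

        // open() asks the namenode (via RPC) for the first block locations
        // and returns an FSDataInputStream wrapping a DFSInputStream.
        try (FSDataInputStream in = fs.open(new Path("/data/sample.txt"))) {  // hypothetical path
            IOUtils.copyBytes(in, System.out, 4096, false);   // read() streams data from the datanodes
        }
    }
}
```

Note that the client never reads the data through the namenode; it only receives block locations and then streams the bytes directly from the datanodes.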
 

HDFS - Anatomy of a File Write

(1)   The client creates the file by calling create() on DistributedFileSystem.

(2)   DistributedFileSystem makes an RPC call to the namenode to create a new file in the filesystem’s namespace, with no blocks associated with it. The namenode performs various checks. If all checks pass, DistributedFileSystem returns an FSDataOutputStream for the client to start writing data to. FSDataOutputStream wraps a DFSOutputStream, which handles communication with the datanodes and namenode.

(3)   The client writes data. DFSOutputStream splits data into packets, which it writes to an internal queue, called the data queue. The data queue is consumed by the DataStreamer.

(4)   The DataStreamer streams the packets to the first datanode in the pipeline, which stores the packet and forwards it to the second datanode in the pipeline. Similarly, the second datanode stores the packet and forwards it to the third (and last) datanode in the pipeline. (Here we assume the replication level is 3)

(5)   DFSOutputStream also maintains an internal queue of packets that are waiting to be acknowledged by datanodes, called the ack queue. A packet is removed from the ack queue only when it has been acknowledged by all the datanodes in the pipeline.

(6)   When the client has finished writing data it calls close() on the stream. This action flushes all the remaining packets to the datanode pipeline and waits for acknowledgments.

(7)   The client contacts the namenode to signal that the file is complete. The namenode waits for blocks to be minimally replicated before returning successfully.

 

(1) The client creates the file by calling create() on the filesystem object (for Hadoop, a DistributedFileSystem instance).
(2) The DistributedFileSystem object makes an RPC call to the namenode to create a new file in the filesystem's namespace, with no blocks associated with it yet. The namenode performs various checks, for example whether the file already exists and whether the client has permission to create it. If the checks pass, DistributedFileSystem returns an FSDataOutputStream for the client to write data to. As on the read path, FSDataOutputStream wraps a DFSOutputStream, which manages communication with the namenode and the datanodes.
(3) The client writes data. DFSOutputStream splits the data into packets and writes them to an internal queue called the data queue, which is consumed by the DataStreamer.
(4) The DataStreamer streams the packets to the first datanode in the pipeline, which stores each packet and forwards it to the second datanode. Likewise, the second datanode stores the packet and forwards it to the last (third) datanode. (The default replication factor is 3.)
(5) DFSOutputStream also maintains an internal queue of packets called the ack queue, holding packets that are waiting to be acknowledged by the datanodes. A packet is removed from the ack queue only when every datanode in the pipeline has acknowledged it.
(6) When the client has finished writing, it calls close() on the stream. This flushes all remaining packets to the datanode pipeline and waits for their acknowledgments.
(7) The client then contacts the namenode to signal that the file is complete. The namenode already knows how the file's blocks are stored, so it only waits for the blocks to reach the minimum replication level before reporting success.
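For the write path, a matching minimal sketch (the output path /data/output.txt is hypothetical); create(), write(), and close() correspond to steps (1), (3), and (6)-(7) above.

```java
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();              // cluster configuration (fs.defaultFS etc.)
        FileSystem fs = FileSystem.get(conf);                  // DistributedFileSystem for an hdfs:// URI

        // create() asks the namenode to add the file to the namespace (steps 1-2);
        // the returned FSDataOutputStream wraps a DFSOutputStream.
        try (FSDataOutputStream out = fs.create(new Path("/data/output.txt"))) {  // hypothetical path
            // write() feeds the data queue; the DataStreamer pushes packets
            // through the datanode pipeline (steps 3-5).
            out.write("hello hdfs\n".getBytes(StandardCharsets.UTF_8));
        }   // close() flushes the remaining packets and waits for acks (steps 6-7)
    }
}
```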

ming_nuaa 2010-07-13 14:44:55

The post starts with Hadoop's block mechanism and how it raises system capacity, simplifies the storage subsystem, and improves fault tolerance and availability. It then introduces the concepts of Namenode and Datanode, followed by a diagram of the Hadoop architecture and the interfaces Hadoop supports. Finally, and most importantly, it walks through the data flow of Hadoop reads and writes.