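Google's data centers run their applications on clusters built from inexpensive Linux PCs. Even newcomers to distributed development can quickly put Google's infrastructure to use. There are three core components:
1. GFS (Google File System). A distributed file system that hides lower-level details such as load balancing and redundant replication, and presents a single file system API to the programs above it. Google tuned it for its own workload: access to very large files, reads that far outnumber writes, and PCs that fail often and take nodes offline. GFS splits files into 64 MB chunks, spreads them across the machines in the cluster, and stores them using the local Linux file system. Each chunk is kept in at least 3 replicas. At the center is a Master node, which locates chunks through its file index. For details, see the GFS paper published by Google's engineers.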
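To make the chunk layout concrete, here is a minimal sketch of how a byte offset could be mapped to a 64 MB chunk and its replicas. It is not GFS code: the `ToyMaster` class, the round-robin placement, and the host names are all assumptions made up for illustration. Only the 64 MB chunk size and the three replicas come from the description above.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the chunk size mentioned in the text


def locate(offset):
    """Map a byte offset in a file to (chunk index, offset within that chunk)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, CHUNK_SIZE)


def chunk_count(file_size):
    """Number of chunks needed to hold file_size bytes (ceiling division)."""
    return -(-file_size // CHUNK_SIZE)


class ToyMaster:
    """Hypothetical stand-in for the master's index.

    Maps (path, chunk index) to the list of hosts holding a replica.
    The real master tracks far more state; this only shows the lookup idea.
    """

    def __init__(self, replicas=3):
        self.replicas = replicas
        self.index = {}

    def add_file(self, path, file_size, hosts):
        if len(hosts) < self.replicas:
            raise ValueError("need at least as many hosts as replicas")
        for i in range(chunk_count(file_size)):
            # Round-robin placement is an assumption, chosen only because
            # it guarantees distinct hosts for each replica of a chunk.
            self.index[(path, i)] = [
                hosts[(i + r) % len(hosts)] for r in range(self.replicas)
            ]

    def lookup(self, path, offset):
        """Return (replica hosts, chunk index, offset within chunk)."""
        chunk, within = locate(offset)
        return self.index[(path, chunk)], chunk, within
```

A client would call `lookup` once to learn which machines hold the chunk and then read the data from one of those machines directly, which keeps the single master out of the data path.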
In addition, this is an early-stage implementation of cloud computing, and a bridge to the future.
How to?
Welcome to Hadoop!
Hadoop is a software platform that lets one easily write and run applications that process vast amounts of data.
Here's what makes Hadoop especially useful:
- Scalable: Hadoop can reliably store and process petabytes.
- Economical: It distributes the data and processing across clusters of commonly available computers. These clusters can number into the thousands of nodes.
- Efficient: By distributing the data, Hadoop can process it in parallel on the nodes where the data is located. This makes it extremely rapid.
- Reliable: Hadoop automatically maintains multiple copies of data and automatically redeploys computing tasks based on failures.
Hadoop implements MapReduce, using the Hadoop Distributed File System (HDFS) (see figure below). MapReduce divides applications into many small blocks of work. HDFS creates multiple replicas of data blocks for reliability, placing them on compute nodes around the cluster. MapReduce can then process the data where it is located.
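The MapReduce idea can be sketched in a few lines of plain Python. This is a single-process illustration of the programming model, not the Hadoop API: the names `map_words`, `reduce_counts`, and `run_mapreduce` are invented for the example, and the "splits" stand in for data blocks that Hadoop would process on separate nodes.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1


def reduce_counts(word, counts):
    """Reduce step: combine all values emitted for one key."""
    return word, sum(counts)


def run_mapreduce(splits, mapper, reducer):
    """Run mapper over every record, group values by key, then reduce.

    Each split is a small block of work. Here the splits are processed
    one after another; in a cluster they could run in parallel on the
    nodes that store the corresponding data blocks.
    """
    grouped = defaultdict(list)
    for split in splits:
        for record in split:
            for key, value in mapper(record):
                grouped[key].append(value)  # the "shuffle": group by key
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    splits = [["the quick brown fox"], ["the lazy dog", "the end"]]
    print(run_mapreduce(splits, map_words, reduce_counts))
```

Because the mapper only sees one record at a time and the reducer only sees one key at a time, the framework is free to decide where and in what order each piece runs.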
Hadoop has been demonstrated on clusters with 2000 nodes. The current design target is 10,000 node clusters.
For more information about Hadoop, please see the Hadoop wiki.
Getting Started
The Hadoop project plans to scale Hadoop up to handle thousands of computers. However, to begin with, you can install it on a single machine or a very small cluster.
- Learn about Hadoop by reading the documentation.
- Download Hadoop from the release page.
- Hadoop Quickstart.
- Hadoop Cluster Setup.
- Discuss it on the mailing list.
Getting Involved
Hadoop is an open source volunteer project under the Apache Software Foundation. We encourage you to learn about the project and contribute your expertise. Here are some starter links:
- See our page.
- Give us feedback: What can we do better?
- Join the mailing list: Meet the community.