
Category: Cloud Computing

2011-08-05 12:34:18

Skipping the introduction to Hadoop, this goes straight to the installation steps; by following them you can clone this setup and stand up an instance of your own.
Role list:
namenode & jobtracker    192.168.237.13
datanode & tasktracker   192.168.237.74
datanode & tasktracker   192.168.239.128

#useradd hadoop
#mkdir /data/hadoop
Download hadoop-0.20.2.tar.gz into /data/hadoop, then:
#tar -zxvf hadoop-0.20.2.tar.gz
#chown -R hadoop:hadoop /data/hadoop
Set up passwordless SSH login:
#./ssh_nopasswd.sh client && ./ssh_nopasswd.sh server    (adjust the user and paths in the script as needed)
Attachment: ssh_nopasswd.zip
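The attached script isn't reproduced here; a minimal sketch of what a client/server passwordless-SSH setup typically boils down to (the hadoop user and default key path are assumptions, adjust to your environment):

# As the hadoop user on the machine that will initiate logins (the NameNode here):
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa      # key pair with an empty passphrase
ssh-copy-id hadoop@192.168.237.74             # appends the public key to that node's
ssh-copy-id hadoop@192.168.239.128            #   ~/.ssh/authorized_keys
ssh hadoop@192.168.237.74 hostname            # should now log in without a password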
----------------------------
Edit the following four files on one machine, then copy them to the other machines so you don't have to edit each one repeatedly (a copy sketch appears after the hadoop-env.sh step below).
Configuration files:
core-site.xml: basic NameNode and JobTracker settings.
The main properties:
fs.default.name: URI of the NameNode
mapred.job.tracker: JobTracker IP and port
hadoop.tmp.dir: base for Hadoop's temporary directories
dfs.name.dir: where the NameNode stores the name table
dfs.data.dir: where a DataNode stores its data blocks
dfs.replication: number of block replicas

PS:
My /etc/hosts on every machine contains:
192.168.237.13  hadoop-237-13.pconline.ctc      hadoop-237-13
192.168.237.74  hadoop-237-74.pconline.ctc      hadoop-237-74
192.168.239.128  hadoop-239-128.pconline.ctc      hadoop-239-128

Example:

<property>
  <name>fs.default.name</name>
  <value>hdfs://hadoop-237-13:9000</value>
  <description>The name of the default file system. Either the literal string "local" or a host:port for DFS.</description>
</property>

<property>
  <name>mapred.job.tracker</name>
  <value>192.168.237.13:9001</value>
  <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
</property>

<property>
  <name>hadoop.tmp.dir</name>
  <value>/data/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>dfs.name.dir</name>
  <value>/data/hadoop/filesystem/name</value>
  <description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description>
</property>

<property>
  <name>dfs.data.dir</name>
  <value>/data/hadoop/filesystem/data</value>
  <description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
</property>

<property>
  <name>dfs.replication</name>
  <value>2</value>
  <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified at create time.</description>
</property>

mapred-site.xml
Finer-grained MapReduce settings; each property's description tells you how to set it.

<property>
  <name>mapred.job.tracker</name>
  <value>192.168.237.13:9001</value>
  <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
</property>

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
  <description>The maximum number of map tasks that will be run simultaneously by a task tracker.</description>
</property>

<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
  <description>The maximum number of reduce tasks that will be run simultaneously by a task tracker.</description>
</property>

<property>
  <name>mapred.map.tasks</name>
  <value>2</value>
  <description>The default number of map tasks per job. Ignored when mapred.job.tracker is "local".</description>
</property>

<property>
  <name>mapred.reduce.tasks</name>
  <value>2</value>
  <description>The default number of reduce tasks per job. Typically set to 99% of the cluster's reduce capacity, so that if a node fails the reduces can still be executed in a single wave. Ignored when mapred.job.tracker is "local".</description>
</property>

<property>
  <name>mapred.userlog.retain.hours</name>
  <value>2</value>
  <description>The maximum time, in hours, for which the user-logs are to be retained.</description>
</property>

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx700M -server</value>
</property>

<property>
  <name>mapred.map.max.attempts</name>
  <value>800</value>
  <description>Expert: The maximum number of attempts per map task. In other words, the framework will try to execute a map task this many times before giving up on it.</description>
</property>

<property>
  <name>mapred.reduce.max.attempts</name>
  <value>800</value>
  <description>Expert: The maximum number of attempts per reduce task. In other words, the framework will try to execute a reduce task this many times before giving up on it.</description>
</property>

<property>
  <name>mapred.max.tracker.failures</name>
  <value>800</value>
  <description>The number of task-failures on a tasktracker of a given job after which new tasks of that job aren't assigned to it.</description>
</property>

<property>
  <name>mapred.task.timeout</name>
  <value>60000000</value>
  <description>The number of milliseconds before a task will be terminated if it neither reads an input, writes an output, nor updates its status string.</description>
</property>

masters (SecondaryNameNode): for this test it simply runs on the NameNode host.
The file contains: 192.168.237.13

slaves:
The file contains:
192.168.237.74
192.168.239.128

JAVA_HOME is not set in my machines' environment, so set it in hadoop-env.sh:
export JAVA_HOME=/usr/java/jdk1.6.0_22
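With the files edited once, a minimal sketch for pushing them to the other two nodes (the conf/ layout under /data/hadoop is an assumption; point it at wherever you unpacked the tarball):

# Run from the Hadoop home on the NameNode, as the hadoop user:
for h in hadoop-237-74 hadoop-239-128; do
    scp conf/core-site.xml conf/mapred-site.xml conf/masters conf/slaves \
        conf/hadoop-env.sh hadoop@$h:/data/hadoop/conf/
done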

----------------------------
#cd /data/hadoop && su hadoop
$ bin/hadoop namenode -format
$ bin/start-all.sh
$ bin/hadoop dfsadmin -report
If it prints something like the following, the setup succeeded:
Configured Capacity: 107981234176 (100.57 GB)
Present Capacity: 101694681088 (94.71 GB)
DFS Remaining: 101694607360 (94.71 GB)
DFS Used: 73728 (72 KB)
DFS Used%: 0%
Under replicated blocks: 1
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 2 (2 total, 0 dead)

Name: 192.168.239.128:50010
Decommission Status : Normal
Configured Capacity: 53558603776 (49.88 GB)
DFS Used: 36864 (36 KB)
Non DFS Used: 3143274496 (2.93 GB)
DFS Remaining: 50415292416(46.95 GB)
DFS Used%: 0%
DFS Remaining%: 94.13%
Last contact: Fri Aug 05 12:19:33 CST 2011


Name: 192.168.237.74:50010
Decommission Status : Normal
Configured Capacity: 54422630400 (50.69 GB)
DFS Used: 36864 (36 KB)
Non DFS Used: 3143278592 (2.93 GB)
DFS Remaining: 51279314944(47.76 GB)
DFS Used%: 0%
DFS Remaining%: 94.22%
Last contact: Fri Aug 05 12:19:33 CST 2011



Some errors hit during installation:
1. /data/hadoop had not been chown'ed to the hadoop user, so Hadoop reported permission denied.
2. The tmp/data directories under /data/hadoop were created by hand, which caused:
2011-08-05 09:40:34,559 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: hdfs://hadoop-237-13:9000/data/hadoop/tmp/mapred/system
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /data/hadoop/tmp/mapred/system. Name node is in safe mode.
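For error 2, the NameNode normally leaves safe mode on its own once enough DataNode block reports arrive; if it stays stuck, the dfsadmin safemode subcommands can check and force it:

$ bin/hadoop dfsadmin -safemode get       # prints whether safe mode is ON or OFF
$ bin/hadoop dfsadmin -safemode leave     # force the NameNode out of safe mode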

Whenever you hit an error, look through the logs under $HADOOP_HOME/logs for details.
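For example, to follow the NameNode log (the file name embeds the user and hostname, so this exact name is illustrative):

$ tail -f logs/hadoop-hadoop-namenode-hadoop-237-13.log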


skybin090804  2011-08-19 10:08:24

The article misses one prerequisite: install the JDK before installing Hadoop.