Environment:
ubuntu server x86_64 12.04
hadoop 1.0.2
1) Edit /etc/hosts on the master and on every slave
- hadoop@hadoop-master:~$ cat /etc/hosts
- 192.168.10.100 slave1 hadoop-slave1
- 192.168.10.101 master hadoop-master
- 192.168.10.102 slave2 hadoop-slave2
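Every node needs the same three entries, so it can be worth checking the file before moving on. A minimal sketch, assuming the plain `IP name [alias...]` layout shown above; the helper name `hosts_ip_for` is made up for illustration:

```shell
#!/bin/sh
# hosts_ip_for FILE NAME: print the IP that FILE maps NAME to (hypothetical helper).
# Text after '#' is ignored; the first matching line wins.
hosts_ip_for() {
    awk -v name="$2" '
        { sub(/#.*/, "") }                 # drop comments, fields are re-split
        { for (i = 2; i <= NF; i++)        # fields 2..NF are hostnames and aliases
              if ($i == name) { print $1; exit } }
    ' "$1"
}

# Usage on each node:
#   for h in hadoop-master hadoop-slave1 hadoop-slave2; do
#       [ -n "$(hosts_ip_for /etc/hosts "$h")" ] || echo "missing from /etc/hosts: $h"
#   done
```

Running the loop on the master and on both slaves should print nothing if the three files agree on the names.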
2) Create the same hadoop user on every node and set up SSH key login (the master must be able to log in to the slaves without a password)
- hadoop@hadoop-master:~$ sudo useradd -m -s /bin/bash -G sudo hadoop
- hadoop@hadoop-master:~$ sudo apt-get install ssh
- hadoop@hadoop-master:~$ sudo /etc/init.d/ssh start
- # create an SSH key on hadoop-master
- hadoop@hadoop-master:~$ ssh-keygen -t rsa
- hadoop@hadoop-master:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub localhost
- hadoop@hadoop-master:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop-slave1
- hadoop@hadoop-master:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop-slave2
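NOTE: Log in to hadoop-slave1 and hadoop-slave2 from the master once by hand. The first connection asks a yes/no host-key question, and answering it now keeps the master from failing to reach the slaves in the later steps.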
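To confirm that neither a password prompt nor a host-key question is left, something like the following can be run on the master. It is only a sketch: `BatchMode` and `ConnectTimeout` are standard OpenSSH options, while the function name and the `SSH_BIN` override are assumptions added so the check can be tried without a real cluster.

```shell
#!/bin/sh
# can_login_without_password HOST...: succeed only if every HOST accepts key-based login.
# BatchMode=yes makes ssh fail instead of prompting for a password or a host-key answer.
# SSH_BIN is a hypothetical override (defaults to ssh), useful only for dry testing.
can_login_without_password() {
    status=0
    for host in "$@"; do
        if "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
            status=1
        fi
    done
    return "$status"
}

# Usage on the master:
#   can_login_without_password localhost hadoop-slave1 hadoop-slave2
```

Any host reported as FAILED still needs either its key copied or its first interactive login.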
3) Install the JDK
- hadoop@hadoop-master:~$ sudo apt-get install default-jdk
NOTE: The download is roughly 170 MB and was very slow for me. (The bin package works too, but I prefer apt out of laziness.)
Add to /etc/profile
- export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64
- export HADOOP_HOME=/home/hadoop/hadoop-1.0.2
- export PATH=$PATH:$HADOOP_HOME/bin:$JAVA_HOME/bin
- export HADOOP_HOME_WARN_SUPPRESS=1 # suppresses Hadoop's "$HADOOP_HOME is deprecated" warning
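The JDK directory name differs between packages and architectures, so a quick sanity check after editing /etc/profile may save a confusing failure later. A minimal sketch; `check_java_home` is a made-up helper, and it only tests that `bin/java` exists and is executable:

```shell
#!/bin/sh
# check_java_home DIR: succeed if DIR looks like a JDK root, i.e. DIR/bin/java is executable.
check_java_home() {
    if [ -z "$1" ]; then
        echo "JAVA_HOME is empty" >&2
        return 1
    fi
    if [ ! -x "$1/bin/java" ]; then
        echo "no executable java under $1/bin" >&2
        return 1
    fi
    echo "JAVA_HOME looks usable: $1"
}

# Usage after editing /etc/profile:
#   . /etc/profile && check_java_home "$JAVA_HOME"
```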
4) Install Hadoop
- # download hadoop-1.0.2
- hadoop@hadoop-master:~$ wget -c http://archive.apache.org/dist/hadoop/core/hadoop-1.0.2/hadoop-1.0.2.tar.gz
- # unpack
- hadoop@hadoop-master:~$ tar xvzf hadoop-1.0.2.tar.gz
- # symlink
- hadoop@hadoop-master:~$ ln -s hadoop-1.0.2 hadoop
5) Configure Hadoop
#conf/hadoop-env.sh
- # point Hadoop at the JDK
- export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64
#conf/mapred-site.xml
- <?xml version="1.0"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!-- Put site-specific property overrides in this file. -->
- <configuration>
- <property>
- <name>mapred.job.tracker</name>
- <value>hadoop-master:9001</value>
- </property>
- </configuration>
#conf/hdfs-site.xml
- <?xml version="1.0"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!-- Put site-specific property overrides in this file. -->
- <configuration>
- <property>
- <name>dfs.name.dir</name>
- <value>/home/hadoop/name</value>
- </property>
- <property>
- <name>dfs.data.dir</name>
- <value>/home/hadoop/data</value>
- </property>
- <property>
- <name>dfs.replication</name>
- <value>2</value> <!-- the default is 3 copies -->
- </property>
- </configuration>
#conf/core-site.xml
- <?xml version="1.0"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!-- Put site-specific property overrides in this file. -->
- <configuration>
- <property>
- <name>fs.default.name</name>
- <value>hdfs://hadoop-master:9000</value>
- </property>
- <property>
- <name>hadoop.tmp.dir</name>
- <value>/home/hadoop/tmp</value>
- </property>
- </configuration>
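#conf/masters
#conf/slaves
- hadoop-slave1
- hadoop-slave2
NOTE: Do not create the name and data directories in advance. Hadoop creates them itself: formatting creates the name directory, and each DataNode creates its data directory when it first starts.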
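A typo in one of the site files usually shows up only when a daemon refuses to start, so reading the values back can help. The helper below is a rough sketch, not a real XML parser: it assumes the one-tag-per-line layout used in the files above, and `site_property` is a name invented here.

```shell
#!/bin/sh
# site_property FILE NAME: print the <value> that follows <name>NAME</name> in FILE.
# Assumes <name> and <value> sit on separate lines, as in the listings above.
site_property() {
    awk -v name="$2" '
        index($0, "<name>" name "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")           # strip everything up to the opening tag
            sub(/<\/value>.*/, "")         # strip the closing tag and anything after it
            print
            exit
        }
    ' "$1"
}

# Usage from the Hadoop directory:
#   site_property conf/core-site.xml fs.default.name
#   site_property conf/mapred-site.xml mapred.job.tracker
```

Both values should name hadoop-master; if the helper prints nothing, the property is missing or the file layout differs from the assumption.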
6) Copy the master's hadoop directory to the slaves
- hadoop@hadoop-master:~$ scp -r hadoop hadoop-slave1:
- hadoop@hadoop-master:~$ scp -r hadoop hadoop-slave2:
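With more than two slaves, typing one scp per host gets tedious. The copy can be driven from conf/slaves instead, roughly as sketched here. The function names and the `DRY_RUN` switch are assumptions made for this example, not part of Hadoop.

```shell
#!/bin/sh
# list_slaves FILE: print one host per line from a conf/slaves-style file,
# skipping blank lines and '#' comments.
list_slaves() {
    sed -e 's/#.*//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' "$1"
}

# copy_to_slaves FILE DIR: copy DIR to the home directory of every listed host.
# With DRY_RUN set (hypothetical switch) the scp commands are printed, not run.
copy_to_slaves() {
    for host in $(list_slaves "$1"); do
        if [ -n "$DRY_RUN" ]; then
            echo "scp -r $2 $host:"
        else
            scp -r "$2" "$host:"
        fi
    done
}

# Usage on the master, from the home directory:
#   DRY_RUN=1 copy_to_slaves hadoop/conf/slaves hadoop   # preview
#   copy_to_slaves hadoop/conf/slaves hadoop             # copy for real
```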
7) Format the filesystem
- hadoop@hadoop-master:~$ cd hadoop-1.0.2/
- hadoop@hadoop-master:~/hadoop-1.0.2$ bin/hadoop namenode -format
- # output on success
- /************************************************************
- SHUTDOWN_MSG: Shutting down NameNode at v-jiwan-ubuntu-0/127.0.0.1
- *************************************************************
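The SHUTDOWN_MSG banner appears whether or not formatting worked, so it may be safer to look for the success line itself. A small sketch, assuming the Hadoop 1.x output contains the phrase "successfully formatted" and that the output was saved to a file; `format_succeeded` and `format.log` are names chosen here:

```shell
#!/bin/sh
# format_succeeded LOGFILE: succeed if the saved `namenode -format` output reports success.
# Assumption: the output contains a line with the phrase "successfully formatted".
format_succeeded() {
    grep -q "successfully formatted" "$1"
}

# Usage:
#   bin/hadoop namenode -format 2>&1 | tee format.log
#   if format_succeeded format.log; then echo "NameNode formatted"; else echo "format failed, see format.log"; fi
```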
8) Start all nodes
- hadoop@hadoop-master:~/hadoop-1.0.2$ bin/start-all.sh
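start-all.sh returns even when a daemon dies right after launch, so it helps to look at `jps` on each node afterwards. The sketch below compares the jps listing with the daemons you expect. Which ones to expect is an assumption that depends on your setup: typically NameNode and JobTracker on the master, DataNode and TaskTracker on each slave, and SecondaryNameNode on whatever host conf/masters names. `missing_daemons` is a made-up helper.

```shell
#!/bin/sh
# missing_daemons NAME...: read `jps` output on stdin and print every NAME that is not running.
# jps prints "PID ClassName" per line, so the second field is compared exactly
# (NameNode therefore does not match SecondaryNameNode).
missing_daemons() {
    jps_output=$(cat)
    for name in "$@"; do
        printf '%s\n' "$jps_output" \
            | awk -v n="$name" '$2 == n { hit = 1 } END { exit !hit }' \
            || echo "$name"
    done
}

# Usage on the master:  jps | missing_daemons NameNode JobTracker
# Usage on a slave:     jps | missing_daemons DataNode TaskTracker
```

Empty output means everything listed is up; any name printed points to a log under the Hadoop logs directory worth reading.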
9) File operations
- hadoop@hadoop-master:~$ hadoop dfs -mkdir os
- hadoop@hadoop-master:~/hadoop-1.0.2$ bin/hadoop dfs -put bin/start-all.sh os
- hadoop@hadoop-master:~/hadoop-1.0.2$ bin/hadoop dfs -ls os
- drwxr-xr-x - hadoop supergroup 0 2012-05-08 11:38 /user/hadoop/os/start-all.sh
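10) Start the daemons on a single slave
- hadoop@hadoop-slave1:~/hadoop-1.0.2$ bin/hadoop-daemon.sh start datanode # start only this node's HDFS DataNode
- hadoop@hadoop-slave1:~/hadoop-1.0.2$ bin/hadoop-daemon.sh start tasktracker # start only this node's Map/Reduce TaskTracker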
11) Stop all nodes
- hadoop@hadoop-master:~/hadoop-1.0.2$ bin/stop-all.sh