Category: HADOOP
2014-12-03 17:07:44
Installing Spark on Linux
Environment:
OS: Red Hat Enterprise Linux AS 5
spark-1.0.2
scala-2.10.2
Spark requires a Scala environment, so we install Scala first.
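Scala runs on the JVM, so a JDK must already be installed (this cluster already has one from the Hadoop setup). A quick check before continuing:
[hadoop1@node1 ~]$ java -version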
1. Install Scala
Download the installation package from the official Scala download page.
Choose the version that suits your environment; I used scala-2.10.2.tgz.
Log in as the hadoop1 user.
Copy the installation file to the /usr1 directory:
[hadoop1@node1 sacala]$ cp scala-2.10.2.tgz /usr1/
Extract the archive:
[hadoop1@node1 usr1]$ tar -zxvf scala-2.10.2.tgz
Rename the directory:
[hadoop1@node1 usr1]$ mv scala-2.10.2 scala
Grant ownership of the scala directory to the hadoop1 user (run as root):
[root@node1 usr1]# chown -R hadoop1:hadoop1 ./scala
Configure the environment: add export SCALA_HOME=/usr1/scala to ~/.bash_profile and put $SCALA_HOME/bin on PATH; those are the changed lines in the file below.
[hadoop1@node1 ~]$ vi .bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
export JAVA_HOME=/usr/java/jdk1.8.0_05
export JRE_HOME=/usr/java/jdk1.8.0_05/jre
export HADOOP_HOME=/usr1/hadoop
HIVE_HOME=/usr1/hive
ZOOKEEPER_HOME=/usr1/zookeeper
export SCALA_HOME=/usr1/scala
export SQOOP_HOME=/usr1/sqoop
export HBASE_HOME=/usr1/hbase
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib:$HADOOP_HOME/lib:$HBASE_HOME/lib
export PATH=$HADOOP_HOME/bin:$HIVE_HOME/bin:$ZOOKEEPER_HOME/bin:$HBASE_HOME/bin:$JAVA_HOME/bin:$JRE_HOME/bin:$SQOOP_HOME/bin:$SCALA_HOME/bin:$PATH
PATH=$PATH:$HOME/bin
export PATH
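The new variables take effect on the next login; to pick them up in the current session, reload the profile:
[hadoop1@node1 ~]$ source ~/.bash_profile
Then verify the Scala installation: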
[hadoop1@node1 ~]$ scala -version
Scala code runner version 2.10.2 -- Copyright 2002-2013, LAMP/EPFL
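Optionally, a one-line smoke test of the Scala runner (the expression here is just an arbitrary example):
[hadoop1@node1 ~]$ scala -e 'println(1 + 1)'
This should print 2.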
2. Install Spark
Download the installation package from the official Spark download page.
Choose the build that matches your Hadoop version; I used spark-1.0.2-bin-hadoop1.tgz.
Log in as the hadoop1 user.
Copy the installation file to the /usr1 directory:
[hadoop1@node1 spark]$ cp spark-1.0.2-bin-hadoop1.tgz /usr1/
Extract the archive:
[hadoop1@node1 usr1]$ tar -zxvf spark-1.0.2-bin-hadoop1.tgz
Rename the directory:
[hadoop1@node1 usr1]$ mv spark-1.0.2-bin-hadoop1 spark
Grant ownership of the spark directory to the hadoop1 user (run as root):
[root@node1 usr1]# chown -R hadoop1:hadoop1 ./spark
Add export SPARK_HOME=/usr1/spark to ~/.bash_profile and put $SPARK_HOME/bin on PATH; the updated file:
[hadoop1@node1 ~]$ vi .bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
export JAVA_HOME=/usr/java/jdk1.8.0_05
export JRE_HOME=/usr/java/jdk1.8.0_05/jre
export HADOOP_HOME=/usr1/hadoop
HIVE_HOME=/usr1/hive
ZOOKEEPER_HOME=/usr1/zookeeper
export SPARK_HOME=/usr1/spark
export SCALA_HOME=/usr1/scala
export SQOOP_HOME=/usr1/sqoop
export HBASE_HOME=/usr1/hbase
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib:$HADOOP_HOME/lib:$HBASE_HOME/lib
export PATH=$HADOOP_HOME/bin:$HIVE_HOME/bin:$ZOOKEEPER_HOME/bin:$HBASE_HOME/bin:$JAVA_HOME/bin:$JRE_HOME/bin:$SQOOP_HOME/bin:$SCALA_HOME/bin:$SPARK_HOME/bin:$PATH
PATH=$PATH:$HOME/bin
export PATH
Go to the conf directory:
cd $SPARK_HOME/conf
vi slaves
Add the following data nodes:
192.168.56.102
192.168.56.103
192.168.56.104
List the IP addresses of your data nodes here.
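The start scripts ssh from the master into every host listed in slaves, so passwordless SSH from the master to each data node is assumed (it is usually already set up in a Hadoop cluster). Hostnames also work as long as they resolve on the master, e.g. via /etc/hosts; node2/node3/node4 below are hypothetical names standing in for the three IPs above:
node2
node3
node4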
Go to the conf directory:
cd $SPARK_HOME/conf
Make a copy from the template:
[hadoop1@node1 conf]$ cp spark-env.sh.template spark-env.sh
Edit the spark-env.sh file and add the following:
export JAVA_HOME=/usr/java/jdk1.8.0_05
export HADOOP_HOME=/usr1/hadoop
export SCALA_HOME=/usr1/scala
export SPARK_MASTER_IP=192.168.56.101
192.168.56.101 is the IP of the name node, which acts as the Spark master.
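spark-env.sh also accepts optional resource settings for the standalone workers; the values below are illustrative assumptions, not from the original setup:
export SPARK_WORKER_MEMORY=1g   # memory each worker can hand out to executors
export SPARK_WORKER_CORES=2     # CPU cores each worker can hand out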
Package the spark directory so it can be distributed to the data nodes:
[hadoop1@node1 usr1]$ tar -cvf spark.tar ./spark
Copy it to the other machines:
scp spark.tar hadoop1@192.168.56.102:/home/hadoop1
scp spark.tar hadoop1@192.168.56.103:/home/hadoop1
scp spark.tar hadoop1@192.168.56.104:/home/hadoop1
On each data node, move spark.tar into /usr1, extract it, and change the directory owner (run as root):
[root@node2 usr1]# tar -xvf spark.tar
[root@node2 usr1]# chown -R hadoop1:hadoop1 ./spark
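With more nodes the copy-and-extract step gets repetitive; a minimal sketch of doing it in one loop from the master (assuming passwordless SSH as hadoop1 and that root SSH logins are allowed on the data nodes, matching the root prompts above):
for ip in 192.168.56.102 192.168.56.103 192.168.56.104; do
    scp spark.tar hadoop1@$ip:/home/hadoop1                                   # copy the archive
    ssh root@$ip "tar -xf /home/hadoop1/spark.tar -C /usr1 && chown -R hadoop1:hadoop1 /usr1/spark"   # extract and fix ownership
done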
Run the following on the master (name) node:
[hadoop1@node1 usr1]$ cd $SPARK_HOME/sbin
[hadoop1@node1 sbin]$ ./start-all.sh
[hadoop1@node1 sbin]$ jps
15026 Master
9668 JobTracker
9433 NameNode
9595 SecondaryNameNode
15135 Jps
An extra Master process now appears on the name node.
[hadoop1@node2 ~]$ jps
5152 DataNode
5236 TaskTracker
24184 Jps
24125 Worker
An extra Worker process appears on the data nodes, which means Spark has started successfully.
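If a Master or Worker process is missing, the standalone daemons log to $SPARK_HOME/logs on the corresponding node; a quick way to look at the most recent output:
tail -n 50 $SPARK_HOME/logs/*.out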
Run the bundled SparkPi example to verify that jobs can execute:
cd $SPARK_HOME/bin
[hadoop1@node1 bin]$ ./run-example SparkPi
14/12/03 16:59:00 INFO util.Utils: Fetching to /tmp/fetchFileTemp7599337252435878324.tmp
14/12/03 16:59:01 INFO executor.Executor: Adding file:/tmp/spark-5ce4253d-148a-48fb-a3f4-741778cc4a0b/spark-examples-1.0.2-hadoop1.0.4.jar to class loader
14/12/03 16:59:01 INFO executor.Executor: Serialized size of result for 0 is 675
14/12/03 16:59:01 INFO executor.Executor: Sending result for 0 directly to driver
14/12/03 16:59:01 INFO scheduler.TaskSetManager: Starting task 0.0:1 as TID 1 on executor localhost: localhost (PROCESS_LOCAL)
14/12/03 16:59:01 INFO scheduler.TaskSetManager: Serialized task 0.0:1 as 1411 bytes in 1 ms
14/12/03 16:59:01 INFO scheduler.DAGScheduler: Completed ResultTask(0, 0)
14/12/03 16:59:01 INFO scheduler.TaskSetManager: Finished TID 0 in 701 ms on localhost (progress: 1/2)
14/12/03 16:59:01 INFO executor.Executor: Running task ID 1
14/12/03 16:59:01 INFO executor.Executor: Serialized size of result for 1 is 675
14/12/03 16:59:01 INFO executor.Executor: Sending result for 1 directly to driver
14/12/03 16:59:01 INFO executor.Executor: Finished task ID 1
14/12/03 16:59:01 INFO scheduler.DAGScheduler: Completed ResultTask(0, 1)
14/12/03 16:59:01 INFO scheduler.DAGScheduler: Stage 0 (reduce at SparkPi.scala:35) finished in 0.772 s
14/12/03 16:59:01 INFO scheduler.TaskSetManager: Finished TID 1 in 63 ms on localhost (progress: 2/2)
14/12/03 16:59:01 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
14/12/03 16:59:01 INFO executor.Executor: Finished task ID 0
14/12/03 16:59:01 INFO spark.SparkContext: Job finished: reduce at SparkPi.scala:35, took 0.99398 s
Pi is roughly 3.14364
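The log above shows the example running on a local executor. To submit the same job to the standalone cluster instead, spark-submit can be pointed at the master URL (a sketch: 7077 is the default master port, and the examples jar name is the one shown in the log above):
[hadoop1@node1 spark]$ cd $SPARK_HOME
[hadoop1@node1 spark]$ ./bin/spark-submit --master spark://192.168.56.101:7077 --class org.apache.spark.examples.SparkPi lib/spark-examples-1.0.2-hadoop1.0.4.jar 10
The trailing 10 is the number of slices SparkPi splits the computation into.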
Finally, check the cluster in the Spark master Web UI by entering its address in the browser (IE) address bar; the standalone master serves the UI on port 8080 by default, so here that would be http://192.168.56.101:8080.
-- The End --