Category: LINUX

2009-12-05 14:44:31

http://lvsheat.blog.51cto.com/431185/142814

 

MOSIX cluster on RHEL5


2009-03-27 13:32:05
Tags: cluster, MOSIX

MOSIX cluster (1) – Installation

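Goal: processes on the cluster nodes migrate automatically according to load

Install one RHEL5 machine (192.168.100.5) in VMware

# Download the MOSIX and kernel sources and get ready to compile

# Extract them into the target directory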

[root@rhel5 ~]# tar xjvf MOSIX-2.24.2.2.tbz -C /usr/src/

[root@rhel5 ~]# tar xzvf linux-2.6.26.tar.gz -C /usr/src/

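# Enter the directory that holds the sources

[root@rhel5 ~]# cd /usr/src/

# other/patch-2.6.26 expects the target path linux-2.6.26.1, so create a symlink (MOSIX probably did not ship a separate patch for 2.6.26…, but the kernel is still supported)

[root@rhel5 src]# ln -s linux-2.6.26/ ./linux-2.6.26.1

# Apply the MOSIX patch to the kernel

[root@rhel5 src]# patch -p0 < /usr/src/mosix-2.24.2.2/other/patch-2.6.26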

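# Enter the kernel source directory and start the build

[root@rhel5 src]# cd linux-2.6.26

# Generate the configuration file

[root@rhel5 linux-2.6.26]# make menuconfig

# Generate dependencies (a habit carried over from 2.4; 2.6 kernels no longer need this step)

[root@rhel5 linux-2.6.26]# make dep

# Build the kernel image

[root@rhel5 linux-2.6.26]# make bzImage

# Build the kernel modules

[root@rhel5 linux-2.6.26]# make modules

# Install the kernel modules

[root@rhel5 linux-2.6.26]# make modules_install

# Install the kernel

[root@rhel5 linux-2.6.26]# make install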

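# Enter the mosix directory

[root@rhel5 linux-2.6.26]# cd ../mosix-2.24.2.2

# Install MOSIX: press Enter at every prompt. Installing is all that is needed for now; just remember to enable the mosix service for the runlevel you normally use. Configuration comes later.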

[root@rhel5 mosix-2.24.2.2]# ./mosix.install

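After shutting down, clone rhel5 (192.168.100.5) to create slave (192.168.100.6)

Installation complete

MOSIX-2.24.2.2/linux-2.6.26 cluster (2) – Configuration

Power on rhel5 and slave. At boot, press Enter at the GRUB screen and choose the 2.6.26 kernel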

Once slave is up, change its IP address and hostname (it was cloned from rhel5, after all)
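The post does not show how the clone was renamed. A minimal sketch of one way to do it, assuming a static address kept in the stock RHEL5 files /etc/sysconfig/network and /etc/sysconfig/network-scripts/ifcfg-eth0 (the interface name eth0 and the helper set_key are assumptions for illustration, not taken from the original):

```shell
# set_key FILE KEY VALUE
# Replace the existing KEY=... line in FILE, or append one if it is missing.
set_key() {
  file=$1
  key=$2
  value=$3
  if grep -q "^${key}=" "$file"; then
    # Rewrite through a temporary file so the function does not rely on sed -i
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
  else
    echo "${key}=${value}" >> "$file"
  fi
}

# Hypothetical usage on slave (run as root):
#   set_key /etc/sysconfig/network HOSTNAME slave
#   set_key /etc/sysconfig/network-scripts/ifcfg-eth0 IPADDR 192.168.100.6
```

A VMware clone usually receives a new MAC address, so if ifcfg-eth0 contains an HWADDR line it probably has to be updated or removed too. Restart networking (service network restart) or reboot afterwards.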

[rhel5]

# Configure MOSIX

[root@rhel5 ~]# mosconf

MOSIX CONFIGURATION

===================

If this is your cluster's file-server and you want to configure MOSIX

for a set of nodes with a common root, please type their common root

directory. Otherwise, if you want to configure the node that you are

running on, just press <ENTER> :-

What would you like to configure?

=================================

1. Which nodes are in this cluster (ESSENTIAL)

2. Authentication (ESSENTIAL)

3. Logical node numbering (recommended)

4. Queueing policies (recommended)

5. Freezing policies

6. Miscellaneous policies

7. Become part of a multi-cluster organizational Grid

Configure what :- 1

There are no nodes in your cluster yet:

=======================================

To add a new set of nodes to your cluster, type 'n'.

To turn on advanced options, type '+'.

For help, type 'h'.

To save and exit, type 'q'. (to abandon all changes and exit, type 'Q')

Option :- n <== add a node

Adding new node(s) to the cluster:

First host-name or IP address :- 192.168.100.5 <== node IP

Number of nodes :- 1 <== number of nodes

Nodes in your cluster:

======================

1. 192.168.100.5

To add a new set of nodes to your cluster, type 'n'.

To modify an entry, type its number.

To delete an entry, type 'd' followed by that entry-number (eg. d1).

To turn on advanced options, type '+'.

For help, type 'h'.

To save and exit, type 'q'. (to abandon all changes and exit, type 'Q')

Option :- n <== add a node

Adding new node(s) to the cluster:

First host-name or IP address :- 192.168.100.6 <== node IP

Number of nodes :- 1 <== number of nodes

Nodes in your cluster:

======================

1. 192.168.100.5

2. 192.168.100.6

To add a new set of nodes to your cluster, type 'n'.

To modify an entry, type its number.

To delete an entry, type 'd' followed by that entry-number (eg. d2).

To turn on advanced options, type '+'.

For help, type 'h'.

To save and exit, type 'q'. (to abandon all changes and exit, type 'Q')

Option :- q <== save and exit

Cluster configuration was saved.

OK to also update the logical node numbers [Y/n]? y

Suggesting to assign '192.168.100.5'

as the central queue manager for the cluster

(but be cautious if you mix 32-bit and 64-bit nodes in the same cluster)

OK to update it now [Y/n]?

What would you like to configure next?

======================================

1. Which nodes are in this cluster

2. Authentication (ESSENTIAL)

3. Logical node numbering

4. Queueing policies

5. Freezing policies

6. Miscellaneous policies

7. Become part of a multi-cluster organizational Grid

q. Exit

Configure what :- 2 <== set the keys

MOSIX Authentication:

=====================

To protect your MOSIX cluster from abuse, preventing unauthorized

persons from gaining control over your computers, you need to set

up a secret cluster-protection key. This key can include any

characters, but must be identical throughout your cluster.

Your secret cluster-protection key: xxxx <== enter the key

Your key is 5 characters long.

(in the future, please consider a longer one)

To allow your users to send batch-jobs to other nodes in the cluster,

you must set up a secret batch-client key. This key can include any

characters, but must match the 'batch-server' key on the node(s) that

can receive batch-jobs from this node.

Your secret batch-client key: xxxx <== enter the key

Your key is 5 characters long.

(in the future, please consider a longer one)

For this node to accept batch jobs,

you must set up a secret batch-server key. This key can include any

characters, but must match the 'batch-client' key on the sending nodes.

To make your batch-server key the same as your batch-client key, type '+'.

Your secret batch-server key: xxxx <== enter the key

Your key is 5 characters long.

(in the future, please consider a longer one)

# Save and exit

[root@rhel5 ~]# service mosix restart
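mosconf complains above that the keys are short and asks for longer ones. A minimal sketch for producing a longer random key to paste into those prompts; the helper name gen_key and the default length of 16 random bytes are assumptions for illustration. As mosconf itself says, the cluster-protection key must be identical on every node, so generate it once and reuse it on slave.

```shell
# gen_key [NBYTES]
# Print NBYTES random bytes (default 16) as one lowercase hex string.
gen_key() {
  head -c "${1:-16}" /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# Hypothetical usage:
#   gen_key        # 32 hex characters
#   gen_key 32     # 64 hex characters
```

Hex output is used only because od and tr are available on a stock RHEL5 install; mosconf states that the key may contain any characters.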

[root@slave ~]# mosconf

....

# Same steps as on rhel5

# Restart the service

[root@slave ~]# service mosix restart

# Check the status

[root@slave ~]# service mosix status

This MOSIX node is: 192.168.100.6 (no features)

Nodes in cluster:

=================

192.168.100.5: proximate

192.168.100.6: proximate

Status: Running Normally (32-bits)

Load: 0.01 (equivalent to about 0.0066 CPU processes)

Speed: 6650 units

CPUS: 1

Frozen: 0

Util: 100%

Avail: YES

Procs: Running 0 MOSIX processes

Accept: Yes, will welcome processes from here

Memory: Available 461MB/503MB

Swap: Available 0.9GB/0.9GB

Daemons:

Master Daemon: Up

MOSIX Daemon : Up

Queue Manager: Up

Remote Daemon: Up

Postal Daemon: Up

Guest processes from other clusters in the grid: 0/8
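With more than two nodes, reading the whole status report on each machine gets tedious. A rough sketch that filters the report down to any daemon whose state is not "Up". The helper name daemons_down is made up for illustration, and the filter assumes the "Name: State" layout shown above; what MOSIX prints for a daemon that is not running is not shown in this post, so the function simply treats anything other than "Up" as a problem.

```shell
# daemons_down
# Read `service mosix status` output on stdin and print the name of every
# daemon line ("... Daemon: X" or "... Manager: X") whose state is not "Up".
daemons_down() {
  awk '/(Daemon|Manager) *:/ && $NF != "Up" { sub(/ *:.*/, ""); print }'
}

# Hypothetical usage:
#   service mosix status | daemons_down
```

No output means every listed daemon reported "Up".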

# I like to check that the ports are actually listening

#TCP/IP ports 249-253 and UDP/IP ports 249-250 must be available for MOSIX

[root@slave ~]# netstat -antu | grep -E "24|25"

tcp 0 0 0.0.0.0:2401 0.0.0.0:* LISTEN

tcp 0 0 0.0.0.0:249 0.0.0.0:* LISTEN

tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN

tcp 0 0 0.0.0.0:250 0.0.0.0:* LISTEN

tcp 0 0 0.0.0.0:251 0.0.0.0:* LISTEN

tcp 0 0 0.0.0.0:252 0.0.0.0:* LISTEN

udp 0 0 0.0.0.0:249 0.0.0.0:*

udp 0 0 0.0.0.0:250 0.0.0.0:*
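The grep pattern "24|25" matches any line containing those digits, which is why the unrelated ports 2401 and 25 appear in the listing. A small sketch of a stricter filter that keeps only sockets whose local port falls in the 249-253 range quoted above; the helper name mosix_ports is an assumption for illustration.

```shell
# mosix_ports
# Read `netstat -antu` output on stdin and print "proto port" for every
# socket whose local port is between 249 and 253.
mosix_ports() {
  awk '{
    n = split($4, a, ":")      # local address is field 4; the port follows the last ":"
    p = a[n] + 0
    if (p >= 249 && p <= 253) print $1, p
  }'
}

# Hypothetical usage:
#   netstat -antu | mosix_ports
```

Header lines and other services fall outside the numeric range and are dropped.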

# That's it, setup is done

MOSIX-2.24.2.2/linux-2.6.26 cluster (3) – Application test

# First open a terminal on both rhel5 and slave and run mon to check the load

[root@rhel5 ~]# mon

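# Both nodes should be idle at this point

# To get a visible effect we need a CPU-hungry script, and it has to run as several processes,

# because the smallest unit MOSIX can migrate is a process, not an instruction or a function,

# so a single process gains nothing from the cluster no matter how heavy its load is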

[root@rhel5 ~]# cat > a.sh << 'EOF'
#!/bin/sh

awk 'BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}' &

awk 'BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}' &

awk 'BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}' &

awk 'BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}' &

awk 'BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}' &

awk 'BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}' &

EOF

[root@rhel5 ~]# chmod +x a.sh

# Run a.sh on rhel5, which starts 6 processes

[root@rhel5 ~]# mosrun -e ./a.sh
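The six identical awk lines in a.sh could also be produced by a loop, which makes it easy to vary the number of processes and the amount of work. A minimal sketch; the function name spawn_burners and its two parameters are assumptions for illustration. The function only starts ordinary background processes. As in the post, they become candidates for migration only when the script containing them is started through mosrun.

```shell
# spawn_burners [N] [ITER]
# Start N background awk processes (default 6), each spinning through an
# ITER x ITER empty loop (default 100000, as in a.sh), and print their PIDs.
spawn_burners() {
  n=${1:-6}
  iter=${2:-100000}
  k=0
  while [ "$k" -lt "$n" ]; do
    awk -v n="$iter" 'BEGIN { for (i = 0; i < n; i++) for (j = 0; j < n; j++); }' &
    echo $!
    k=$((k + 1))
  done
}

# Hypothetical usage inside a.sh:
#   spawn_burners 6
```

Passing a small ITER value gives a quick dry run before loading the cluster for real.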

# Now watch mon on both nodes: at first the load on rhel5 is very high, then the load on slave rises as well

# On rhel5 all 6 awk processes are still listed, but only 3 are running; the other 3 are in state T (stopped), which suggests they have been migrated

[root@rhel5 ~]# ps -aux | grep awk

Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.7/FAQ

root 25648 0.6 0.0 0 0 pts/0 T 16:16 0:00 [awk]

root 25650 0.4 0.0 0 0 pts/0 T 16:16 0:00 [awk]

root 25652 32.0 0.7 4168 3812 pts/0 R 16:16 0:37 awk BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}

root 25654 32.0 0.7 4168 3816 pts/0 R 16:16 0:37 awk BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}

root 25656 32.0 0.7 4168 3816 pts/0 R 16:16 0:37 awk BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}

root 25658 1.4 0.0 0 0 pts/0 T 16:16 0:01 [awk]

root 25665 0.0 0.1 3860 624 pts/0 R+ 16:18 0:00 grep awk
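Counting states by eye works for six processes but not for sixty. A small sketch that tallies the awk entries in ps aux output by state; the helper name awk_states is made up for illustration, and reading T as "migrated away" is the post's own guess rather than something this filter can prove.

```shell
# awk_states
# Read `ps aux` output on stdin and count awk processes by the first letter
# of the STAT column (field 8). The grep process itself is excluded because
# only commands named "awk" or "[awk]" (field 11) are counted.
awk_states() {
  awk '$11 == "awk" || $11 == "[awk]" { c[substr($8, 1, 1)]++ }
       END { printf "running=%d stopped=%d\n", c["R"], c["T"] }'
}

# Hypothetical usage:
#   ps aux | awk_states
```

Run against the listing above, it would report three running and three stopped processes.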

# Now look at top on slave: 3 processes named remoted are clearly using the CPU, which is presumably how migrated processes show up

top - 16:19:19 up 3:10, 3 users, load average: 2.78, 1.18, 0.44

Tasks: 99 total, 5 running, 94 sleeping, 0 stopped, 0 zombie

Cpu(s): 99.3%us, 0.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.3%si

Mem: 515376k total, 423576k used, 91800k free, 107980k buff

Swap: 1048568k total, 0k used, 1048568k free, 234028k cach

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

16929 root 20 0 4168 3936 0 R 33.2 0.8 0:48.13 remoted

16925 root 20 0 4168 3932 0 R 32.9 0.8 0:50.57 remoted

16927 root 20 0 4168 3932 0 R 32.9 0.8 0:50.13 remoted

1 root 20 0 2036 664 572 S 0.0 0.1 0:01.36 init

2 root 15 -5 0 0 0 S 0.0 0.0 0:00.00 kthreadd

3 root RT -5 0 0 0 S 0.0 0.0 0:00.00 migratio

4 root 15 -5 0 0 0 S 0.0 0.0 0:02.00 ksoftirq

############## End of test ##############
