剑心通明's Archive

All articles here are reposts. I accept no responsibility for any consequences of relying on them, so please use them with care.
  jxtm.cublog.cn

Author: 剑心通明
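Scripts a HACMP Beginner Will Need (Part 3)
I have spent almost a month installing, configuring, and testing HACMP and have run all kinds of experiments, so I believe I can now handle my company's future HACMP maintenance and troubleshooting. My study of HACMP should wrap up here, so as a last step I am recording the scripts I found useful. I hope to attend an HACMP training course someday, to tie my scattered knowledge together and improve in a complete, systematic way.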

1. A script to condense hacmp.out
    hacmp.out is too cluttered. If you only want to see which events occurred, in order, this does the job.

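# Month abbreviation as printed by date; used to match the dated lines in hacmp.out
month=`date | awk '{print $2}'`
# First column of the last line of cl_lsvg output, taken as the resource group name
RG=`/usr/sbin/cluster/sbin/cl_lsvg | tail -1 | awk '{print $1}'`
egrep "(^$RG|^:|$month)" /tmp/hacmp.out | awk -F'[' '{print $1}' | uniq
Note: my environment has only one resource group. If you have several, the script may need slight changes.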
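For clusters with more than one resource group, the filter above could be generalized roughly as follows. This is only a sketch: the function name `filter_events` and its argument order (file, month, then any number of resource group names) are made up for illustration, and `sed` is swapped in for `awk -F[` to cut each line at the first `[`.

```shell
# filter_events FILE MONTH RG...
# Hypothetical helper: keeps lines that start with one of the given
# resource group names, start with ":", or contain MONTH, then cuts
# each line at the first "[" and collapses adjacent duplicates.
filter_events() {
    file=$1
    month=$2
    shift 2
    pattern="^:|$month"
    for rg in "$@"; do
        pattern="^$rg|$pattern"
    done
    grep -E "($pattern)" "$file" | sed 's/\[.*//' | uniq
}
```

It might be called as `filter_events /tmp/hacmp.out "$month" rg_a rg_b`, where `rg_a` and `rg_b` stand in for your own resource group names.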


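2. A script to show details of every subsystem that HACMP starts
# Take the text between "The " and " Subsystem" on each line of file 1 as a subsystem name
for i in `awk -F"The " '{print $2}' 1 | awk -F" Subsystem" '{print $1}'`; do
    echo ================================
    echo $i
    lssrc -ls $i
done
Note: the file named 1 holds the screen output of smit clstart; copy and paste it in yourself.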
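A slightly stricter variant of the name extraction is sketched below. It assumes, as the original pipeline implies, that the pasted lines have the form `... The NAME Subsystem ...`; lines without that shape are skipped and repeated names are printed once. The function names `subsystem_names` and `show_subsystems` are hypothetical.

```shell
# subsystem_names FILE
# Prints each NAME found in "The NAME Subsystem", once, in order of
# first appearance. Lines that do not match are ignored.
subsystem_names() {
    sed -n 's/.*The \([^ ]*\) Subsystem.*/\1/p' "$1" | awk '!seen[$0]++'
}

# show_subsystems FILE
# Runs lssrc -ls for every extracted name (requires AIX's lssrc).
show_subsystems() {
    for name in $(subsystem_names "$1"); do
        echo ================================
        echo "$name"
        lssrc -ls "$name"
    done
}
```

With the pasted output saved in the file `1`, running `show_subsystems 1` should produce the same kind of listing as the loop above.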

    Below is the output on my machine. Working through it will take your understanding of HACMP up a level.
================================
portmap
0513-005 The Subsystem, portmap, only supports signal communication.
================================
inetd
Subsystem         Group            PID          Status
 inetd            tcpip            73792        active
 
Debug         Inactive
 
Signal        Purpose
 SIGALRM      Establishes socket connections for failed services
 SIGHUP       Rereads configuration database and reconfigures services
 
 SIGCHLD      Restarts service in case the service dies abnormally
 
Service       Command                  Arguments                Status
 godm         /usr/es/sbin/cluster/godmd                          active
 xmquery      /usr/bin/xmtopas         xmtopas -p3              active
 telnet       /usr/sbin/telnetd        telnetd -a               active
 ftp          /usr/sbin/ftpd           ftpd                     active
 
 
================================
clsmuxpdES
SRC request not supported.
================================
topsvcs
Subsystem         Group            PID     Status
 topsvcs          topsvcs          630958  active
Network Name   Indx Defd  Mbrs  St   Adapter ID      Group ID
net_ether_01_0 [ 0] 2     2     S    192.168.168.2   192.168.168.2 
net_ether_01_0 [ 0] en1              0x4383d5f8      0x4383d5fa
HB Interval = 1.000 secs. Sensitivity = 10 missed beats
Missed HBs: Total: 0 Current group: 0
Packets sent    : 858 ICMP 0 Errors: 0 No mbuf: 0
Packets received: 1288 ICMP 0 Dropped: 0
NIM's PID: 557170
net_ether_01_1 [ 1] 2     2     S    192.168.68.2    192.168.68.2  
net_ether_01_1 [ 1] en0              0x4383d5f9      0x4383d5fa
HB Interval = 1.000 secs. Sensitivity = 10 missed beats
Missed HBs: Total: 0 Current group: 0
Packets sent    : 858 ICMP 0 Errors: 0 No mbuf: 0
Packets received: 1290 ICMP 0 Dropped: 0
NIM's PID: 381208
rs232_0        [ 2] 2     2     S    255.255.0.1     255.255.0.1   
rs232_0        [ 2] tty0             0x8383d5fa      0x8383d5fd
HB Interval = 2.000 secs. Sensitivity = 5 missed beats
Missed HBs: Total: 0 Current group: 0
Packets sent    : 599 ICMP 0 Errors: 0 No mbuf: 0
Packets received: 599 ICMP 0 Dropped: 0
NIM's PID: 454786
  2 locally connected Clients with PIDs:
haemd(795082) hagsd(508140)
  Dead Man Switch Enabled:
     reset interval = 1 seconds
     trip  interval = 20 seconds
  Configuration Instance = 198
  Daemon employs no security
  Segments pinned: Text Data.
  Text segment size: 768 KB. Static data segment size: 957 KB.
  Dynamic data segment size: 3713. Number of outstanding malloc: 175
  User time 0 sec. System time 0 sec.
  Number of page faults: 0. Process swapped out 0 times.
  Number of nodes up: 2. Number of nodes down: 0.
================================
grpsvcs
Subsystem         Group            PID          Status
 grpsvcs          grpsvcs          508140       active
2 locally-connected clients.  Their PIDs:
795082(haemd) 647448(clstrmgr)
HA Group Services domain information:
Domain established by node 1
Number of groups known locally: 3
                   Number of   Number of local
Group name         providers   providers/subscribers
ha_em_peers              2           1           0
CLRESMGRD_1130856062       2           1           0
CLSTRMGR_1130856062       2           1           0
================================
emsvcs
Subsystem         Group            PID          Status
 emsvcs           emsvcs           795082       active
 
No trace flags are set

Configuration Data Base version from local copy of CDB:
        941748909,457092608,0

Daemon started on Wednesday 11/23/05 at 10:37:46
Daemon has been running 0 days, 0 hours, 13 minutes and 31 seconds
Daemon connected to group services: Yes
Daemon has joined peer group:       Yes
Daemon communications enabled:      Yes
Daemon security:                    No support
Peer count:                         1

Peer group state:
        941748909,457092608,0
        NOSECSUPPORT

Logical Connection Information for Local Clients
    LCID          FD           PID     Start Time
       0          11         647448    Wednesday 11/23/05 10:38:35

Logical Connection Information for Remote Clients
    LCID          FD           PID     Start Time

Logical Connection Information for Peers
    LCID         Node   

Resource Monitor Information
         Name            Inst     Type      FD     SHMID     PID     Locked
IBM.HACMP.clresmgrd        0        C        -1         -1      -2  00/00  No
IBM.HACMP.clstrmgr         0        C        12         -1      -2  00/00  No
IBM.PSSP.harmpd            0        S        -1         -1      -1  00/00  No
Membership                 0        I        -1         -1      -2  00/00  No
aixos                      0        S        10   28311558      -2  00/01  No

Highest file descriptor in use is 12

Highest file descriptor allowed for client connections is 1500

Peer Daemon Status
   1 I A                                                      

Internal Daemon Counters
    GS init attempts =          1  GS join attempts =          1
    GS resp callback =          6  CCI conn rejects =          0
    RMC conn rejects =          0  HR conn rejects  =          0
    Retry req msg    =          0  Retry rsp msg    =          0
    Intervl usr util =          0  Total usr util   =          2
    Intervl sys util =          1  Total sys util   =          2
    Intervl time     =      12000  Total time       =      72001
    lccb's created   =          1  lccb's freed     =          0
    Reg rcb's creatd =          0  Reg rcb's freed  =          0
    Qry rcb's creatd =          0  Qry rcb's freed  =          0
    vrr created      =          0  vrr freed        =          0
    vqr created      =          0  vqr freed        =          0
    var inst created =        168  var inst freed   =          0
    Events regstrd   =          0  Events unregstrd =          0
    Insts assigned   =          0  Insts unassigned =          0
    Smem vars obsrv  =          0  State vars obsrv =          2
    Preds evaluated  =          0  Events generated =          0
    Smem lck intrvl  =          0  Smem lck total   =          0
    PRM msgs to all  =          0  PRM msgs to peer =          0
    PRM resp msgs    =          0  PRM msgs rcvd    =          0
    PRM_NODATA       =          0  PRM_BADMSG errs  =          0
    Sched q elements =         16  Free q elements  =         16
    xcb alloc'd      =          3  xcb freed        =          3
    xcb freed msgfp  =          0  xcb freed reqp   =          0
    xcb freed reqn   =          0  xcb freed rspc   =          1
    xcb freed rspp   =          0  xcb freed cmdrm  =          2
    xcb freed unkwn  =          0  Sec enable       =          0
    Sec disable      =          0  Sec authent      =          0
    Wake sec thread  =          0  Wake main thread =          0
    Missed sec rsps  =          0  Enq sec request  =          0
    Deq sec request  =          0  Enq sec response =          0
    Deq sec response =          0

Daemon Resource Utilization Last Interval
User:                 0.000 seconds    0.000%
System:               0.010 seconds    0.008%
User+System:          0.010 seconds    0.008%

Daemon Resource Utilization Total
User:                 0.020 seconds    0.003%
System:               0.020 seconds    0.003%
User+System:          0.040 seconds    0.006%

Data segment size:  528K
================================
emaixos
Subsystem         Group            PID          Status
 emaixos          emsvcs           725244       active
 
Trace Level:         None
Domain Type:         HACMP
Domain Name:         testdb_ha
RMAPI Initialized:   TRUE
Data Initialized:    TRUE
Data Init. Attempts: 1
Data Init. Delay:    5
Inst. Interval:      600
Inst. Count:         2
SRC FD:              3
Server FD:           7
Class Count:         7
Variable Count:      41
================================
clstrmgrES
Current state: ST_STABLE
i_local_nodeid 1, i_local_siteid -1, my_handle 2
ml_idx[1]=0     ml_idx[2]=1    
There are 0 events on the Ibcast queue
There are 0 events on the RM Ibcast queue
CLversion: 7
sccsid = "@(#)36        1.139 src/43haes/usr/sbin/cluster/hacmprd/main.C, hacmp.pe, 51haes_r520, r520s006a 7/20/05 14:32:42"
local node vrmf is 5200
cluster fix level is "0"
The following timer(s) are currently active:
Current DNP values
DNP Values for NodeId - 1  NodeName - TESTDB1
    PgSpFree = 127719  PvPctBusy = 0  PctTotalTimeIdle = 99.842852
DNP Values for NodeId - 2  NodeName - TESTDB2
    PgSpFree = 129269  PvPctBusy = 0  PctTotalTimeIdle = 99.789802
================================
gsclvmd
 Subsystem       Group           PID     Status
 gsclvmd         gsclvmd        295002  active
 
 No Active VGs.
 
================================
clinfoES
SRC request not supported.
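
When reading the topsvcs section of the output above, it can help to boil each heartbeat network down to one line. The sketch below does that with awk, relying only on the line shapes visible in the listing (`NAME [ N] ...`, `HB Interval = ...`, `Missed HBs: Total: ...`). The function name `hb_summary` is hypothetical, and other HACMP levels may format these lines differently.

```shell
# hb_summary [FILE...]
# Reads "lssrc -ls topsvcs" output (from files or stdin) and prints
# one line per network: name, HB interval, sensitivity, total missed HBs.
hb_summary() {
    awk '
        $2 == "[" { net = $1 }                                # network header line
        $1 == "HB" && $2 == "Interval" { interval = $4; sens = $8 }
        $1 == "Missed" && $2 == "HBs:" { print net, interval, sens, $4 }
    ' "$@"
}
```

On the cluster node this could be used as `lssrc -ls topsvcs | hb_summary`.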


3. Besides the three failures HACMP monitors automatically (network adapter, network, and node failure), monitoring application failures is also necessary. I wrote a Notify Method script for that as well, and the tests went smoothly.
LOG=/ha52/man_fallover.log

# Stop cluster services on this node
banner stop >>$LOG
date >>$LOG
/usr/es/sbin/cluster/utilities/clstop -grsy >>$LOG 2>&1

# Wait until the cluster manager process has exited
banner wait >>$LOG
while ps -e | grep clstrmgr >/dev/null; do
    date >>$LOG
    echo clstrmgrES is stopping >>$LOG
    sleep 15
done

# Restart cluster services on this node
banner start >>$LOG
date >>$LOG
/usr/es/sbin/cluster/etc/rc.cluster -boot -N -i >>$LOG 2>&1
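
The wait loop in this script runs forever if clstrmgr never exits. One possible safeguard is a bounded wait, sketched below. The helper name `wait_until_fails` and its arguments are hypothetical and not part of HACMP: it re-runs a check command until that command fails, and gives up after a maximum number of tries.

```shell
# wait_until_fails MAX_TRIES INTERVAL COMMAND [ARGS...]
# Re-runs COMMAND every INTERVAL seconds while it keeps succeeding.
# Returns 0 as soon as COMMAND fails, 1 if it still succeeds after
# MAX_TRIES checks.
wait_until_fails() {
    max=$1
    interval=$2
    shift 2
    tries=0
    while "$@"; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max" ]; then
            return 1
        fi
        sleep "$interval"
    done
    return 0
}
```

In the script it might replace the loop like this: `wait_until_fails 40 15 sh -c 'ps -e | grep clstrmgr >/dev/null' || echo "clstrmgr still running" >>/ha52/man_fallover.log`. The limit of 40 tries is an arbitrary example value.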

Posted: 2008-05-11; last modified: 2008-05-11 22:14

