2011-04-13 16:16:47
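hostA (172.29.0.28)
hostB (172.29.0.29)
Fault: hostA and hostB were rebooted at the same time with init 6. After a while hostB accepted connections again but hostA did not, so I telnetted from hostB to hostA's management IP.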
1)root@hostB/#telnet 172.29.8.64
Trying 172.29.8.64...
Connected to 172.29.8.64.
Escape character is '^]'.
Copyright 2003 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Sun(tm) Advanced Lights Out Manager 1.3 (hostA)
Please login: admin
Please Enter password: ********
sc> console -f
Warning: User < > currently has write permission to this console and forcibly removing them will terminate any current write actions and all work will be lost. Would you like to continue? [y/n]y
Enter #. to return to ALOM.
hostA console login: root
Password:
Sun Microsystems Inc. SunOS 5.10 Generic January 2005
Routine checks: disk space is fine, the IP interface is up, and /etc/hosts is correct.
root@hostA/#df -h
Filesystem size used avail capacity Mounted on
/dev/dsk/c2t1d0s0 30G 28G 2.2G 93% /
/devices 0K 0K 0K 0% /devices
ctfs 0K 0K 0K 0% /system/contract
proc 0K 0K 0K 0% /proc
mnttab 0K 0K 0K 0% /etc/mnttab
swap 5.5G 1.6M 5.5G 1% /etc/svc/volatile
objfs 0K 0K 0K 0% /system/object
sharefs 0K 0K 0K 0% /etc/dfs/sharetab
fd 0K 0K 0K 0% /dev/fd
swap 5.5G 0K 5.5G 0% /tmp
swap 5.5G 16K 5.5G 1% /var/run
swap 5.5G 0K 5.5G 0% /dev/vx/dmp
swap 5.5G 0K 5.5G 0% /dev/vx/rdmp
/dev/dsk/c2t1d0s7 550M 1.0M 494M 1% /export/home
root@hostA/#ifconfig -a
lo0: flags=2001000849
inet 127.0.0.1 netmask ff000000
bge0: flags=201000843
inet 10.198.90.28 netmask fffff800 broadcast 10.198.95.255
ether 0:3:ba:8c:ef:ad
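The df output above shows / at 93%, which was judged acceptable here. A threshold check makes that judgment explicit. The following is a minimal sketch, not part of the original session: the function name fs_over_threshold and the default limit of 90 are assumptions, and on Solaris 10 the legacy /usr/bin/awk may lack -v and sub(), so nawk or /usr/xpg4/bin/awk would likely be needed in its place.

```shell
# Print "mountpoint capacity" for every filesystem at or above a limit.
# Reads `df -h` style output on stdin. The default limit (90) is an assumption.
fs_over_threshold() {
  limit="${1:-90}"
  awk -v limit="$limit" '
    $5 ~ /%$/ {                 # only data rows carry a percentage in column 5
      pct = $5
      sub(/%/, "", pct)         # strip the trailing percent sign
      if (pct + 0 >= limit + 0) print $6, $5
    }'
}

# Hypothetical usage on the host:
#   df -h | fs_over_threshold 90
```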
hostA can connect to hostB, but not the other way around.
root@hostA/#rsh hostB
Last login: Wed Apr 13 10:46:48 from 172.29.0.211
Sun Microsystems Inc. SunOS 5.10 Generic January 2005
You have new mail.
root@hostB/#rsh hostA
^C
Check hostA's routing table (this step is not actually needed in my environment: hostA and hostB are on the same subnet, so routing is not the issue).
root@hostA/#netstat -nr
Routing Table: IPv4
Destination Gateway Flags Ref Use Interface
-------------------- -------------------- ----- ----- ---------- ---------
default 10.198.88.1 UG 1 26
10.198.88.0 10.198.90.28 U 1 3 bge0
224.0.0.0 10.198.90.28 U 1 0 bge0
127.0.0.1 127.0.0.1 UH 1 0 lo0
root@hostA/#more /etc/defaultrouter
10.198.88.1
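Whether the routing check can be skipped depends on the two addresses really sharing a subnet. That can be verified from the hex netmask that Solaris ifconfig prints (fffff800 above). This is a rough sketch in POSIX shell; the function names ip_to_int and same_subnet are made up for illustration.

```shell
# Convert a dotted-quad IPv4 address to an integer.
# The body runs in a subshell so the IFS change does not leak out.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed if two addresses fall in the same network.
# $1, $2: IPv4 addresses; $3: netmask in hex without prefix, e.g. fffff800
same_subnet() {
  mask=$(( 0x$3 ))
  [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}

# Hypothetical usage with the addresses shown above:
#   same_subnet 10.198.90.28 10.198.88.1 fffff800 && echo "same subnet"
```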
The message log did not show anything unusual either.
root@hostA/#dmesg
Wed Apr 13 11:01:30 CST 2011
…………………………………….
Apr 13 10:34:46 hostA Had[4152]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (hostB) Agent is calling clean for resource(cssd) because the resource became OFFLINE unexpectedly, on its own.
Apr 13 10:35:54 hostA ip: [ID 390400 kern.notice]
Apr 13 10:36:26 hostA root: [ID 702911 user.error] Oracle CRSD 6339 set to stop
Apr 13 10:35:58 hostA last message repeated 2 times
Apr 13 10:36:26 hostA root: [ID 702911 user.error] Oracle CRSD 6339 shutdown completed
Apr 13 10:36:26 hostA root: [ID 702911 user.error] Oracle CRSD 6339 set to stop
Apr 13 10:36:26 hostA root: [ID 702911 user.error] Oracle CRSD 6339 shutdown completed
Apr 13 10:36:26 hostA root: [ID 702911 user.error] Oracle EVMD set to stop
Apr 13 10:36:26 hostA root: [ID 702911 user.error] Oracle EVMD set to stop
Apr 13 10:36:26 hostA root: [ID 702911 user.error] Oracle CSSD being stopped
Apr 13 10:36:26 hostA root: [ID 702911 user.error] Oracle CSSD being stopped
Apr 13 10:36:28 hostA ip: [ID 390400 kern.notice]
Apr 13 10:36:32 hostA last message repeated 2 times
…………………………………………….
Apr 13 10:37:34 hostA gab: [ID 719437 kern.notice] GAB ERROR V-15-1-20015 unconfigure failed: clients still registered
Apr 13 10:37:34 hostA svc.startd[7]: [ID 652011 daemon.warning] svc:/system/gab:default: Method "/lib/svc/method/gab stop" failed with exit status 1.
Apr 13 10:37:34 hostA svc.startd[7]: [ID 748625 daemon.error] system/gab:default failed: transitioned to maintenance (see 'svcs -xv' for details)
Apr 13 10:37:34 hostA syslogd: going down on signal 15
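One reason the log looks clean may be that it ends at shutdown ("syslogd: going down on signal 15"): the svcs -xv output later in this session lists svc:/system/system-log:default among the services that are not running, so nothing from the failed boot would have been logged. Lines from svc.startd are still worth pulling out of a long log. A small sketch, with a made-up function name:

```shell
# Keep only log lines written by svc.startd or mentioning the maintenance state.
# Reads syslog-style text on stdin.
smf_log_hints() {
  grep -E 'svc\.startd\[[0-9]+\]|transitioned to maintenance'
}

# Hypothetical usage:
#   smf_log_hints < /var/adm/messages
```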
I now suspect that the telnet service is not running and its port is not open. Even telnet to localhost fails.
root@hostA/#netstat -an|grep 22
root@hostA/#netstat -an|grep 23
root@hostA/#telnet 127.0.0.1
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection refused
root@hostA/#netstat -an|grep LISTEN
127.0.0.1.5987 *.* 0 0 49152 0 LISTEN
127.0.0.1.898 *.* 0 0 49152 0 LISTEN
127.0.0.1.32768 *.* 0 0 49152 0 LISTEN
127.0.0.1.5988 *.* 0 0 49152 0 LISTEN
127.0.0.1.32769 *.* 0 0 49152 0 LISTEN
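Note that grep 22 and grep 23 match any line containing those digits, so an empty result is conclusive but a non-empty one would not be. A stricter check compares only the port part of the local address on LISTEN lines. The sketch below assumes the Solaris netstat -an layout shown above (local address in the first column, port after the last dot); the function name port_listening is hypothetical.

```shell
# Succeed if some socket is listening on the given TCP port.
# $1: port number; reads `netstat -an` output on stdin.
port_listening() {
  awk -v port="$1" '
    $NF == "LISTEN" {
      n = split($1, parts, ".")      # local address looks like 127.0.0.1.5987 or *.22
      if (parts[n] == port) found = 1
    }
    END { exit !found }'
}

# Hypothetical usage:
#   netstat -an | port_listening 23 || echo "nothing listening on port 23"
```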
root@hostA/#svcs -a |grep telnet
uninitialized 10:40:13 svc:/network/telnet:default
root@hostA/#svcadm enable telnet
root@hostA/#svcs -xv
svc:/system/filesystem/local:default (local file system mounts)
State: maintenance since Wed Apr 13 10:44:06 2011
Reason: Start method exited with $SMF_EXIT_ERR_FATAL.
See:
See: /var/svc/log/system-filesystem-local:default.log
Impact: 50 dependent services are not running:
svc:/system/vxatd:default
svc:/system/vcs:default
svc:/system/vxodm:default
svc:/system/llt:default
svc:/system/gab:default
svc:/system/vxfen:default
svc:/system/vcsmm:default
svc:/system/lmx:default
svc:/network/inetd:default
svc:/milestone/multi-user:default
svc:/system/vxvm/vxvm-recover:default
svc:/milestone/multi-user-server:default
svc:/system/basicreg:default
svc:/system/zones:default
svc:/application/graphical-login/cde-login:default
svc:/system/xprtld:default
svc:/system/vxdcli:default
svc:/system/vxdbdctrl:default
svc:/application/cde-printinfo:default
svc:/system/sysidtool:net
svc:/network/rpc/bind:default
svc:/network/nfs/nlockmgr:default
svc:/network/nfs/client:default
svc:/system/filesystem/autofs:default
svc:/system/system-log:default
svc:/network/smtp:sendmail
svc:/system/webconsole:console
svc:/application/management/seaport:default
svc:/application/management/snmpdx:default
svc:/application/management/dmi:default
svc:/application/management/sma:default
svc:/system/fpsd:default
svc:/network/rarp:default
svc:/system/dumpadm:default
svc:/system/fmd:default
svc:/network/ssh:default
svc:/network/nfs/status:default
svc:/network/nfs/cbd:default
svc:/network/nfs/mapid:default
svc:/application/stosreg:default
svc:/network/rpc/bootparams:default
svc:/system/sysidtool:system
svc:/system/postrun:default
svc:/system/cron:default
svc:/system/vxfs/vxfsldlic:default
svc:/system/vxpbx:default
svc:/application/font/fc-cache:default
svc:/system/boot-archive-update:default
svc:/network/shares/group:default
svc:/system/sac:default
svc:/network/rpc/gss:default (Generic Security Service)
State: uninitialized since Wed Apr 13 10:40:11 2011
Reason: Restarter svc:/network/inetd:default is not running.
See:
See: man -M /usr/share/man -s 1M gssd
Impact: 27 dependent services are not running:
svc:/network/nfs/client:default
svc:/system/filesystem/autofs:default
svc:/system/system-log:default
svc:/milestone/multi-user:default
svc:/system/vxvm/vxvm-recover:default
svc:/system/vxfen:default
svc:/system/vcs:default
svc:/system/vxodm:default
svc:/milestone/multi-user-server:default
svc:/system/basicreg:default
svc:/system/zones:default
svc:/application/graphical-login/cde-login:default
svc:/system/xprtld:default
svc:/system/vxdcli:default
svc:/system/vxdbdctrl:default
svc:/application/cde-printinfo:default
svc:/network/smtp:sendmail
svc:/system/webconsole:console
svc:/application/management/seaport:default
svc:/application/management/snmpdx:default
svc:/application/management/dmi:default
svc:/application/management/sma:default
svc:/system/fpsd:default
svc:/network/rarp:default
svc:/system/dumpadm:default
svc:/system/fmd:default
svc:/network/ssh:default
svc:/network/rpc/rstat:default (kernel statistics server)
State: uninitialized since Wed Apr 13 10:40:12 2011
Reason: Restarter svc:/network/inetd:default is not running.
See:
See: man -M /usr/share/man -s 1M rpc.rstatd
See: man -M /usr/share/man -s 1M rstatd
Impact: 1 dependent service is not running:
svc:/application/management/sma:default
svc:/application/print/server:default (LP print server)
State: disabled since Wed Apr 13 10:40:11 2011
Reason: Disabled by an administrator.
See:
See: man -M /usr/share/man -s 1M lpsched
Impact: 1 dependent service is not running:
svc:/application/print/ipp-listener:default
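In output this long, the entry in maintenance state is the one to read first: here it is svc:/system/filesystem/local:default, and inetd (and with it telnet) is only waiting on it. A minimal sketch for pulling those entries out of svcs -xv output; the function name is an assumption, and the pattern tolerates the leading whitespace that real svcs output puts before "State:".

```shell
# Print the FMRI of every service that `svcs -xv` reports in maintenance.
# Reads `svcs -xv` output on stdin.
maintenance_services() {
  awk '
    /^svc:\//                      { fmri = $1 }     # remember the latest service line
    /^[ \t]*State: maintenance/    { print fmri }    # state line follows its service header
  '
}

# Hypothetical usage:
#   svcs -xv | maintenance_services
```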
root@hostA/#cd /etc/
root@hostA/etc#more resolv.conf
domain sjt.swift
nameserver 10.200.6.165
#domain cdc.veritas.com
#nameserver 10.198.88.18
"resolv.conf" 4 lines, 90 characters 00001
root@hostA/etc#cp resolv.conf resolv.conf_old
root@hostA/etc#vi resolv.conf
#domain sjt.swift
#nameserver 10.200.6.165
domain cdc.veritas.com
nameserver 10.198.88.18
~
"resolv.conf" 4 lines, 90 characters
root@hostA/etc#svcs -xv svc:/network/telnet
svc:/network/telnet:default (Telnet server)
State: uninitialized since Wed Apr 13 10:40:13 2011
Reason: Restarter svc:/network/inetd:default is not running.
See:
See: man -M /usr/share/man -s 1M in.telnetd
See: man -M /usr/share/man -s 1M telnetd
Impact: This service is not running.
root@hostA/var/svc/log#svcs -xv svc:/network/inetd
svc:/network/inetd:default (inetd)
State: offline since Wed Apr 13 11:24:45 2011
Reason: Service svc:/system/filesystem/local:default
is not running because a method failed.
See:
Path: svc:/network/inetd:default
svc:/system/filesystem/local:default
See: man -M /usr/share/man -s 1M inetd
Impact: 13 dependent services are not running:
svc:/system/vxfen:default
svc:/system/vcs:default
svc:/system/vxodm:default
svc:/milestone/multi-user:default
svc:/system/vxvm/vxvm-recover:default
svc:/milestone/multi-user-server:default
svc:/system/basicreg:default
svc:/system/zones:default
svc:/application/graphical-login/cde-login:default
svc:/system/xprtld:default
svc:/system/vxdcli:default
svc:/system/vxdbdctrl:default
svc:/application/cde-printinfo:default
root@hostA/var/svc/log# more "/etc/vfstab"
14 lines, 569 characters ab
#device device mount FS fsck mount mount
#to mount to fsck point type pass at boot options
#
fd - /dev/fd fd - no -
/proc - /proc proc - no -
/dev/dsk/c2t1d0s1 - - swap - no -
/dev/dsk/c2t1d0s0 /dev/rdsk/c2t1d0s0 / ufs 1 no -
/dev/dsk/c2t1d0s7 /dev/rdsk/c2t1d0s7 /export/home ufs 2 yes -
/devices - /devices devfs - no -
sharefs - /etc/dfs/sharetab sharefs - no -
ctfs - /system/contract ctfs - no -
objfs - /system/object objfs - no -
swap - /tmp tmpfs - yes -
/dev/vx/dsk/si_oradata_dg/oradata_vol /dev/vx/rdsk/si_oradata_dg/oradata_vol /si_oradata vxfs 1 yes -
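The session does not state the reasoning at this point, but the sequence suggests it: filesystem/local is in maintenance, it mounts the vfstab entries marked "yes" under "mount at boot", and the only entry removed in the edit that follows is the vxfs volume /si_oradata. Listing the boot-time mounts narrows down the candidates. A sketch under those assumptions, with a hypothetical function name:

```shell
# Print "mountpoint fstype" for vfstab entries that are mounted at boot.
# Reads vfstab-format text on stdin (7 whitespace-separated fields per entry).
boot_mounts() {
  awk '!/^#/ && NF >= 7 && $6 == "yes" { print $3, $4 }'
}

# Hypothetical usage:
#   boot_mounts < /etc/vfstab
```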
root@hostA/var/svc/log#vi /etc/vfstab
#device device mount FS fsck mount mount
#to mount to fsck point type pass at boot options
#
fd - /dev/fd fd - no -
/proc - /proc proc - no -
/dev/dsk/c2t1d0s1 - - swap - no -
/dev/dsk/c2t1d0s0 /dev/rdsk/c2t1d0s0 / ufs 1 no -
/dev/dsk/c2t1d0s7 /dev/rdsk/c2t1d0s7 /export/home ufs 2 yes -
/devices - /devices devfs - no -
sharefs - /etc/dfs/sharetab sharefs - no -
ctfs - /system/contract ctfs - no -
objfs - /system/object objfs - no -
swap - /tmp tmpfs - yes -
~
~~~
"/etc/vfstab" 13 lines, 467 characters
root@hostA/var/svc/log#init 6
root@hostA/var/svc/log#svc.startd: The system is coming down. Please wait.
svc.startd: 98 system services are now being stopped.
WARNING: vmem_destroy('devfsadm_event_channel'): leaked 1 identifiers
svc.startd: The system is down.
syncing file systems... done
rebooting...
SC Alert: Host System has Reset
Sun Fire V240, No Keyboard
Copyright 1998-2003 Sun Microsystems, Inc. All rights reserved.
OpenBoot 4.11.4, 4096 MB memory installed, Serial #59568045.
Ethernet address 0:3:ba:8c:ef:ad, Host ID: 838cefad.
Rebooting with command: boot
Boot device: /pci@1c,600000/scsi@2/disk@1,0:a File and args:
SunOS Release 5.10 Version Generic_139555-08 64-bit
Copyright 1983-2009 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Hardware watchdog enabled
Hostname: hostA
hostA console login:
hostA console login:
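Summary: what to do when a host cannot be reached with SecureCRT or an ssh client:
1) Check whether it answers ping and whether you can rsh to it.
2) Connect through the console and check df, the output of ifconfig -a, /etc/hosts, and the routing information (not needed for hosts on the same subnet).
3) Check /var/adm/messages.
4) Check whether the telnet or rsh service failed to start.
5) If it did: before Solaris 10 these services were started individually by scripts; on Solaris 10 use svcs -a to list them and svcs -xv to find the cause (this is the most important step).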
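Step 5 can be scripted so that the state of a given service is read directly rather than by eye. The sketch below assumes the three-column svcs -a layout seen earlier (state, start time, FMRI); service_state is an illustrative name, not an existing command.

```shell
# Print the SMF state of one service.
# $1: full FMRI, e.g. svc:/network/telnet:default; reads `svcs -a` output on stdin.
service_state() {
  awk -v fmri="$1" '$3 == fmri { print $1 }'
}

# Hypothetical usage: anything other than "online" calls for `svcs -xv <FMRI>`.
#   state=$(svcs -a | service_state svc:/network/telnet:default)
#   [ "$state" = "online" ] || svcs -xv svc:/network/telnet:default
```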
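Appendix: enabling remote root telnet login on Solaris 10 x86
To protect the system, Solaris 10 and later provide only the ssh service by default and do not allow root to log in directly, which is inconvenient for users who develop on and debug the system. To open the telnet service on Solaris 10 and allow root to log in:
1. Enable the telnet service
# svcadm enable telnet
svcadm is the administration command of the Service Management Facility (SMF), the service framework introduced in Solaris 10. See the man pages for how to use it and svcs.
2. Allow root login
Edit /etc/default/login and comment out the line:
CONSOLE=/dev/console
3. Change root's default shell to bash
Edit /etc/passwd and set root's shell to /usr/bin/bash:
root:x:0:0:Super-User:/:/usr/bin/bash
No reboot is needed. Try again, and you should now be able to log in over telnet.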
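Step 2 can also be done without an interactive editor. The following is only a sketch: the function name comment_console is made up, and it works as a filter so that the result can be reviewed before anything under /etc is replaced.

```shell
# Comment out the CONSOLE= line of a login defaults file.
# Reads the file content on stdin and writes the edited content to stdout.
comment_console() {
  sed 's/^CONSOLE=/#CONSOLE=/'
}

# Hypothetical usage: keep a backup, write to a temporary file, then review and copy back.
#   cp /etc/default/login /etc/default/login_old
#   comment_console < /etc/default/login_old > /tmp/login.new
```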