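[Description]
The customer's machine room was undergoing power maintenance. After halting the OS in every domain of the E6900, the customer did not shut down the whole platform through the normal procedure but physically cut AC power instead. As a result, both SSCs hung at startup. The error log is as follows: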
…
POST Complete.
ERI Device Present
Getting MAC address for SSC0
busyWait() timeout waiting for RRDY, status=0x6027c008
busyWait() timeout waiting for RRDY, status=0x6027c008
busyWait() timeout waiting for RRDY, status=0x6027c008
busyWait() timeout waiting for WRDY, status=0x60278008
Cannot read ID board; using MAC address from TODNVRAM
MAC address is 0:3:ba:38:b6:3e
Hostname: E6900-SC0
Address: 192.168.10.1
Netmask: 255.255.255.0
Attached TCP/IP interface to eri unit 0
Attaching interface lo0...done
Gateway:
Timeout waiting for network driver (flags=0x8062)
Copyright 2001-2004 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Sun Fire System Firmware
RTOS version: 38
ScApp version: 5.17.0
SC POST diag level: min
busyWait() timeout waiting for RRDY, status=0x6027c008
busyWait() timeout waiting for RRDY, status=0x6027c008
busyWait() timeout waiting for RRDY, status=0x6027c008
busyWait() timeout waiting for WRDY, status=0x60278008
The date is Sunday, March 19, 2006, 11:36:35 AM CST.
Mar 19 11:36:37 E6900-SC0 Platform.SC: Boot: ScApp 5.17.0, RTOS 38
Mar 19 11:36:39 E6900-SC0 Platform.SC: SBBC Reset Reason(s): Power On Reset
Mar 19 11:36:39 E6900-SC0 Platform.SC: Initializing the SC SRAM
Mar 19 11:36:42 E6900-SC0 Platform.SC: ERROR: Boot Failure, ID0 not installed
Mar 19 11:36:43 E6900-SC0 Platform.SC: ERROR: Boot Failure, ID0 not installed
ERROR: PSU type not recognized, use A166
Mar 19 11:36:44 E6900-SC0 Platform.SC: SCC is inaccessible - please insert valid SCC
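[Analysis and Resolution]
Judging from the error log, the SSC apparently cannot read some critical configuration data stored on the ID board, which leads the system to mistakenly treat the SSC board as invalid. We power-cycled the entire E6900 platform and reseated SSC0 (pulled it out and reinserted it), but the errors persisted. We therefore concluded that the E6900 ID board had failed, possibly damaged by the customer's improper shutdown.
After the ID board was replaced with a new one (PN: 501-5880), the fault cleared and the system returned to normal.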
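When triaging a captured SC console log, the symptom lines quoted above can be counted mechanically. The following is only a minimal sketch: the patterns are copied from the log in this post, while the symptom names, both function names, and the rule that "ID board unreadable plus ID0 not installed" suggests an ID board fault are assumptions made for illustration, not a Sun diagnostic.

```python
import re

# Patterns copied from the console output in this post. The symptom names
# and their grouping are assumptions of this sketch.
SYMPTOMS = {
    "busywait_timeout": re.compile(r"busyWait\(\) timeout waiting for (RRDY|WRDY)"),
    "id_board_unreadable": re.compile(r"Cannot read ID board"),
    "id0_missing": re.compile(r"Boot Failure, ID0 not installed"),
    "scc_inaccessible": re.compile(r"SCC is inaccessible"),
}


def scan_console_log(text):
    """Return a dict mapping each symptom name to the number of matching lines."""
    counts = {name: 0 for name in SYMPTOMS}
    for line in text.splitlines():
        for name, pattern in SYMPTOMS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def looks_like_id_board_fault(counts):
    """Heuristic (an assumption): flag the log when the SC reports both that it
    cannot read the ID board and that ID0 is not installed."""
    return counts["id_board_unreadable"] > 0 and counts["id0_missing"] > 0
```

A match only says the log resembles this case. As described above, the diagnosis was confirmed by power-cycling the platform and reseating SSC0 first.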
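[ID Board Replacement Procedure]
step1. Physically power off the E6900 platform.
step2. The E6900 ID board sits on the centerplane, so remove IB9 first, then replace the ID board.
step3. Power on the system. The SSC detects that the ID board has been replaced and prompts as follows: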
It appears that the ID Board has been replaced.
Please confirm the ID information:
(Model, System Serial Number, Mac Address Domain A, HostID Domain A, COD Status)
Sun Fire E4900, 0419HH20BF, 00:03:ba:07:c6:94, 8307c694, non-COD
Is the information above correct? (yes/no): no
There is no ID information for this system.
Please enter System Serial Number: 0421AK20FD
Please enter the model number (3800/4800/4810/6800/E4900/E6900): E6900
MAC address from the sticker on chassis: 00:03:ba:38:b6:3a (the MAC address of Domain A)
Host ID from the sticker on chassis: 8338b63a (the hostid of Domain A)
Is COD (Capacity on Demand) system? (yes/no): no
Mar 20 08:29:05 E6900-SC0 Platform.SC: Caching ID information (the SSC automatically caches a copy of the configuration data)
Mar 20 08:29:05 E6900-SC0 Platform.SC: Programming ID Board
Mar 20 08:29:07 E6900-SC0 Platform.SC: Clock Source: 75MHz
Mar 20 08:29:52 E6900-SC0 Platform.SC: Chassis is in dual partition mode.
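Because the customer had taken an ID board straight out of an E4900 and installed it in the E6900, the configuration data cached on the SSC was wrong. The data could not be synchronized from the SSC to the new ID board and had to be entered manually.
When entering this information manually, only the MAC address and hostid of Domain A are required. The MAC addresses and hostids of the other domains and of the SCs are derived by rule (see the "Domain MAC Address Calculation" section for details).
[Notes]
1. Configuration data can be written to the ID board only once, and data already written cannot be erased or rewritten. When replacing the ID board, make sure all critical configuration data is written correctly in a single pass.
2. Although the ID boards in the E4900 and E6900 share the same part number, an ID board from an E4900 cannot be used directly in an E6900: the board can be written only once and cannot be rewritten. A damaged ID board must be replaced with a completely new one (PN: 501-5880).
3. Once the SSC has started normally, all configuration data stored on the ID board is automatically cached on the SSC. After a new ID board is installed (its part number must be 501-5880), the system recognizes it as a replacement and asks whether to synchronize the cached configuration data to the new board. If you are sure the data on the SSC is entirely correct, choose automatic synchronization. Otherwise, you must supply the configuration data manually.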
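Since the ID board accepts only one write, it may be worth checking the values before typing them at the SSC prompt. The following is a minimal sketch under stated assumptions: the function name `check_id_entry` is hypothetical, the model list is copied from the prompt shown above, and the consistency check relies on the rule described in this post that the last 6 hex digits of the hostid equal the last 6 hex digits of the Domain A MAC address.

```python
import re

# Model names as offered by the SSC prompt quoted in this post.
MODELS = {"3800", "4800", "4810", "6800", "E4900", "E6900"}
MAC_RE = re.compile(r"[0-9a-f]{2}(:[0-9a-f]{2}){5}", re.IGNORECASE)
HOSTID_RE = re.compile(r"[0-9a-f]{8}", re.IGNORECASE)


def check_id_entry(model, mac, hostid):
    """Return a list of problems found in the values; an empty list means
    nothing suspicious was detected by these (limited) checks."""
    problems = []
    if model not in MODELS:
        problems.append(f"unknown model: {model!r}")
    if not MAC_RE.fullmatch(mac):
        problems.append(f"malformed MAC address: {mac!r}")
    if not HOSTID_RE.fullmatch(hostid):
        problems.append(f"malformed hostid: {hostid!r}")
    if not problems:
        # Rule from this post: hostid tail == MAC tail (last 6 hex digits).
        mac_tail = mac.replace(":", "").lower()[-6:]
        if hostid.lower()[-6:] != mac_tail:
            problems.append("hostid does not end with the last 6 MAC digits")
    return problems
```

For the values entered in the transcript above, `check_id_entry("E6900", "00:03:ba:38:b6:3a", "8338b63a")` returns an empty list. The check does not replace comparing the values against the chassis sticker.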
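[Background: About the ID Board]
The ID board sits on the centerplane as a daughterboard. It carries an SEEPROM chip that stores the following:
- The host's chassis ID
- The host's serial number (SN) and hostid
- The MAC address of each domain. The SF6800 has 6 MAC addresses; the 3800, 4810, and 4800 have 4.
On the SF3800, the ID board is integrated into the centerplane and cannot be removed separately (it is not a replaceable part). Replacing it means replacing the centerplane as well, and the SEEPROM contents must be updated manually. On the SF4800/E4900 and SF6800/E6900, replacing the ID board does not require a manual SEEPROM update; the contents are updated automatically.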
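[Domain MAC Address Calculation]
Domain MAC addresses are stored in the EEPROM chip on the ID board. The number of addresses stored depends on the hardware. For example, the SF6800 supports up to 4 domains, so its ID board holds MAC addresses for 4 domains.
Because the MAC address of the system controller (SC) is already fixed, each domain's MAC address can be determined from the SC's MAC address and the table below.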
3800, 4800, 4810                    6800/6900
----------------                    ---------
Domain A = Base MAC address         Domain A = Base MAC address
Domain B = Base MAC address + 1     Domain B = Base MAC address + 1
SC0      = Base MAC address + 2     Domain C = Base MAC address + 2
SC1      = Base MAC address + 3     Domain D = Base MAC address + 3
                                    SC0      = Base MAC address + 4
                                    SC1      = Base MAC address + 5
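The offsets in the table can be applied with a few lines of Python. This is a minimal sketch: the function and dictionary names are hypothetical, and treating the MAC address as a single 48-bit integer (so that a carry propagates into the next octet) is an assumption, since the table does not say how overflow of the last octet is handled.

```python
# Offsets from the table above: systems with 2 domains vs. 4 domains.
OFFSETS_TWO_DOMAINS = {"Domain A": 0, "Domain B": 1, "SC0": 2, "SC1": 3}
OFFSETS_FOUR_DOMAINS = {
    "Domain A": 0, "Domain B": 1, "Domain C": 2, "Domain D": 3,
    "SC0": 4, "SC1": 5,
}


def parse_mac(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' (leading zeros optional) to an integer."""
    octets = mac.split(":")
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets: {mac!r}")
    value = 0
    for octet in octets:
        n = int(octet, 16)
        if not 0 <= n <= 0xFF:
            raise ValueError(f"octet out of range: {octet!r}")
        value = (value << 8) | n
    return value


def format_mac(value):
    """Convert a 48-bit integer back to 'aa:bb:cc:dd:ee:ff'."""
    return ":".join(f"{(value >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


def derive_macs(base_mac, four_domains):
    """Apply the table offsets to the base (Domain A) MAC address."""
    offsets = OFFSETS_FOUR_DOMAINS if four_domains else OFFSETS_TWO_DOMAINS
    base = parse_mac(base_mac)
    return {name: format_mac(base + offset) for name, offset in offsets.items()}
```

The case in this post agrees with the table: the Domain A address entered during the ID board replacement is 00:03:ba:38:b6:3a, and `derive_macs("00:03:ba:38:b6:3a", four_domains=True)["SC0"]` gives 00:03:ba:38:b6:3e, the address SSC0 reported in the boot log.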
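[HostID Calculation]
The hostid is derived from the domain's MAC address and has the following format:
80xxxxxx
where the last 6 digits are the last 6 hex digits of the domain's MAC address.
For example, if Domain A's MAC address is 08:00:20:d8:86:53, the hostid of that domain is 80d88653. Note that the first two digits are not always 80: the E6900 in this case, whose MAC addresses begin with 00:03:ba, has the hostid 8338b63a for MAC address 00:03:ba:38:b6:3a. The last 6 digits still follow the rule.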
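The rule can be written as a short Python sketch. The function name `hostid_from_mac` is hypothetical. The two-digit prefix is passed as a parameter and defaults to "80" as in the format given here; this is a deliberate choice, because the chassis sticker values quoted earlier in this post (MAC 00:03:ba:38:b6:3a, hostid 8338b63a) show a prefix of 83, and this post does not state how the prefix is chosen.

```python
def hostid_from_mac(mac, prefix="80"):
    """Build a hostid from a two-hex-digit prefix plus the last three
    octets of the MAC address (the last 6 hex digits)."""
    octets = [int(octet, 16) for octet in mac.split(":")]
    if len(octets) != 6 or any(not 0 <= o <= 0xFF for o in octets):
        raise ValueError(f"malformed MAC address: {mac!r}")
    return prefix + "".join(f"{o:02x}" for o in octets[-3:])
```

With the default prefix, `hostid_from_mac("08:00:20:d8:86:53")` yields 80d88653, matching the example. Calling `hostid_from_mac("00:03:ba:38:b6:3a", prefix="83")` reproduces the sticker value 8338b63a.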