==========================background======================
Background:
1. Printer server: 10.40.1.13, Windows 2003
2. Backup server: 10.40.1.25, Windows 2003
3. The printer server also maps a LUN from NetApp storage.
Users reported that many print jobs on the local printer could not be canceled, so we restarted the printer server.
After the restart, we found that the LUN from NetApp storage could not be accessed.
We checked the LUN: its properties showed both free space and used space as zero.
##summarized in one simple sentence##
We restarted the printer server, and afterwards the LUN could not be accessed.
#####################################
=========================impact===========================
Users could not access their data files on the LUN.
=========================root cause=======================
The partition information on the LUN disk was damaged.
========================process===========================
1. Check
We checked the iSCSI initiator status; the initiator's connection to the NetApp storage was normal.
We checked the LUN; it could not be accessed, and its properties showed both used space and free space as zero.
-----------------------------------------------------------
2. Try to create a clone LUN, unmap the current LUN, then map the clone LUN and access it.
The commands and steps were as follows:
1. Created a snapshot (snapshot name: 20111228).
suzfas3020> snap create archive 20111228
2. Created a clone LUN (clone LUN name: suz_clone).
suzfas3020> lun clone create /vol/archive/printerfiles/suz_clone -o noreserve -b /vol/archive/printerfiles/suz1 20111228
3. Unmapped the current LUN, mapped the clone LUN, and tried to access the clone LUN from the Windows 2003 server.
suzfas3020> lun unmap /vol/archive/printerfiles/suz1 suz
suzfas3020> lun map /vol/archive/printerfiles/suz_clone suz
We tried to access the clone LUN from the Windows 2003 server and got the same error. Because the clone was created from a snapshot of the already damaged LUN, it was damaged in the same way, so this approach could not solve the problem.
--------------------------------------------------------------
3. Because the clone LUN could not be accessed, we destroyed it and mapped the current LUN back.
1. Destroyed the clone LUN.
suzfas3020> lun unmap /vol/archive/printerfiles/suz_clone suz
suzfas3020> lun offline /vol/archive/printerfiles/suz_clone
suzfas3020> lun destroy /vol/archive/printerfiles/suz_clone
2. Mapped the current LUN back to its initial state.
suzfas3020> lun map /vol/archive/printerfiles/suz1 suz
3. Checked the status.
suzfas3020> lun show
/vol/archive/printerfiles/suz1 500.0g (536905152000) (r/w, online, mapped).
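The status line above can also be checked by a script. The following is a minimal sketch, assuming the output has the same shape as the single line shown here (path, human-readable size, size in bytes, then a parenthesized list of flags); the function name `parse_lun_show_line` is hypothetical and not part of any NetApp tool. Note that these flags describe only the state of the LUN on the storage side; they say nothing about the file system inside the LUN.

```python
import re

# Pattern modeled on the one output line in this report; other Data ONTAP
# versions may format the line differently (assumption).
LUN_LINE = re.compile(
    r"^\s*(?P<path>/\S+)\s+(?P<size>\S+)\s+\((?P<bytes>\d+)\)\s+\((?P<flags>[^)]*)\)"
)


def parse_lun_show_line(line):
    """Return path, size in bytes, and state flags, or None if the line does not match."""
    match = LUN_LINE.match(line)
    if match is None:
        return None
    flags = [flag.strip() for flag in match.group("flags").split(",")]
    return {
        "path": match.group("path"),
        "size_bytes": int(match.group("bytes")),
        "flags": flags,
    }


if __name__ == "__main__":
    sample = "/vol/archive/printerfiles/suz1 500.0g (536905152000) (r/w, online, mapped)."
    info = parse_lun_show_line(sample)
    # The LUN counts as healthy from the storage side only if it is online and mapped.
    print(info["path"], "online" in info["flags"] and "mapped" in info["flags"])
```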
---------------------------------------------------------------
4. Connect the LUN to the backup server to determine whether the fault lies with the LUN or with the printer server.
We connected the LUN to another server (10.40.1.25) and tried to access it.
The LUN could not be accessed from that server either.
---------------------------------------------------------------
5. Called NetApp support for help
1. Collected the current system information for the NetApp engineer.
suzfas3020> lun show -v
suzfas3020> df -r
suzfas3020> igroup show -v
From the above information, NetApp support judged that the LUN status was normal.
They explained that, on NetApp storage, a LUN is just a file, in this case 500 GB in size, and that they could not look inside the LUN from the storage side to check the data files.
After the storage presents the LUN to a client, the client (Windows 2003) initializes and formats it, then uses it like a raw disk.
The partition information is stored on the LUN itself.
In our case the LUN was visible, but the data files on it could not be accessed.
The most likely explanation is that the partition information on the LUN had been damaged.
How could the disk partition have been damaged?
There are three possible causes:
1. A network card or network problem corrupted iSCSI commands, which in turn damaged the LUN partition information. The LUN receives iSCSI input/output commands the whole time it is in use.
2. An iSCSI initiator software problem corrupted iSCSI input/output commands, which in turn damaged the LUN partition information.
3. An abnormal reboot, shutdown, or power failure damaged the LUN partition information.
Finally, NetApp support advised us to look for recovery tools that could retrieve the data files at the Windows system level.
They also advised us to raise the issue with Microsoft. We called the Microsoft 800 support line, but because the iSCSI Initiator is free software, no support was available for it.
6. We searched for third-party tools to recover the data at the Windows system level.
We found a recovery tool named R-Studio.
We used R-Studio to scan the LUN and recover the data files to local disk volume D.
Note 1.
Location of disk partition information:
1. Each disk's partition information is stored on that disk itself.
So the LUN's partition information is on the LUN,
and the Windows 2003 server's own partition information is on the server's local disk.
The OS manages partitions through the MBR in the first block of a disk, but each disk carries its own partition information.
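To make the note concrete, here is a minimal sketch of how the partition table in an MBR can be read from the first 512-byte block of a disk. It assumes the LUN was initialized with a classic MBR layout (not GPT), which this report does not state explicitly. The function name `parse_mbr` is hypothetical. In the standard MBR layout, the table holds four 16-byte entries starting at byte offset 446, and the block ends with the signature bytes 0x55 0xAA.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446      # partition table starts here in a classic MBR
ENTRY_SIZE = 16         # four entries of 16 bytes each
BOOT_SIGNATURE = b"\x55\xaa"


def parse_mbr(sector):
    """Return the non-empty partition entries found in an MBR sector."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need at least 512 bytes")
    if sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("missing 0x55AA signature; the MBR looks damaged")
    partitions = []
    for index in range(4):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        entry = sector[start:start + ENTRY_SIZE]
        # Layout: status (1), CHS start (3, skipped), type (1),
        # CHS end (3, skipped), first LBA (4), sector count (4).
        status, ptype, first_lba, sector_count = struct.unpack("<B3xB3xII", entry)
        if ptype == 0:
            continue  # type 0 marks an unused slot
        partitions.append({
            "index": index,
            "bootable": status == 0x80,
            "type": ptype,
            "first_lba": first_lba,
            "sector_count": sector_count,
        })
    return partitions
```

If the first block of a disk is zeroed or overwritten, `parse_mbr` raises an error or returns an empty list. That matches the symptom described in this report: the disk is visible, but the OS finds no usable partition on it.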