Hard disk: Shenwu (神武) failed to connect to the server

Date: 2021-04-14
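Case study: array failure and data loss on an H3C FlexServer R390 server at a customer site
Tags: RAID, firmware upgrade, SW_Bundle, AHS
Posted by Zhou Feng (周锋) on 2017-06-27

An H3C FlexServer R390 server at a customer site had seven hard drives installed. Six of them were configured as RAID 10 and one as a hot spare. The array failed, data was lost, and the system could no longer boot normally.

The following warnings were displayed during the power-on self-test:

    1792-Slot 0 Drive Array - Valid Data Found in Write-Back Cache.
    Data will automatically be written to drive array.
    1779-Slot 0 Drive Array - Replacement drive(s) detected OR previously failed drive(s) now appear to be operational:
        Port 1I: Box 2: Bay 2
        Port 2I: Box 2: Bay 5
    Logical drive(s) disabled due to possible data loss.
    Select "F1" to continue with logical drive(s) disabled
    Select "F2" to accept data loss and to re-enable logical drive(s)
    (RESUME = "F1" OR "F2" KEY) [default = "F1" in 45 seconds] ** TIMED OUT **
    1716-Slot 0 Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan.
    Errors will be fixed automatically when the sector(s) are overwritten.
    Backup and Restore recommended.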
Analysis of the logs revealed the following problems:

1. The IML records a large number of media errors:

    Critical,1192,29197,0x0013,Drive Array,,,05/30/2017 09:10:00,4: Internal Storage Enclosure Device Failure (Bay 5, Box 2, Port 2I, Slot 0)
    Critical,1192,29231,0x0013,Drive Array,,,05/30/2017 09:10:00,5: Internal Storage Enclosure Device Failure (Bay 2, Box 2, Port 1I, Slot 0)
    Repaired,1192,29234,0x0013,Drive Array,,,05/30/2017 09:10:00,4: Internal Storage Enclosure Device Failure (Bay 5, Box 2, Port 2I, Slot 0)
    Repaired,1192,29274,0x0013,Drive Array,,,05/30/2017 09:10:00,5: Internal Storage Enclosure Device Failure (Bay 2, Box 2, Port 1I, Slot 0)
    Caution,1193,933,0x000A,POST Message,,,05/30/2017 11:03:00,6: POST Error: 1792-Slot X Drive Array - Valid Data Found in Cache Module. Data will automatically be written to drive array.
    Caution,1193,934,0x000A,POST Message,,,05/30/2017 11:03:00,7: POST Error: 1779-Slot X Drive Array - Replacement drive(s) detected OR previously failed drive(s) now appear to be operational.
    Caution,1193,935,0x000A,POST Message,,,05/30/2017 11:03:00,8: POST Error: 1716-Slot X Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan. Errors will be fixed automatically when the sector(s) are overwritten.
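IML entries of this kind are easier to reason about once they are grouped by drive location. The following is a minimal Python sketch, not a tool shipped with the server. It assumes the comma-separated layout seen in the excerpt (eight header fields followed by free text that may itself contain commas) and that spaces have been restored in the lines. The field names in the code are guesses made for illustration; the source does not document what the numeric columns mean.

```python
import re
from datetime import datetime

# Sample lines copied from the IML excerpt, with spaces restored.
SAMPLE = [
    "Critical,1192,29197,0x0013,Drive Array,,,05/30/2017 09:10:00,"
    "4: Internal Storage Enclosure Device Failure (Bay 5, Box 2, Port 2I, Slot 0)",
    "Critical,1192,29231,0x0013,Drive Array,,,05/30/2017 09:10:00,"
    "5: Internal Storage Enclosure Device Failure (Bay 2, Box 2, Port 1I, Slot 0)",
    "Repaired,1192,29234,0x0013,Drive Array,,,05/30/2017 09:10:00,"
    "4: Internal Storage Enclosure Device Failure (Bay 5, Box 2, Port 2I, Slot 0)",
    "Repaired,1192,29274,0x0013,Drive Array,,,05/30/2017 09:10:00,"
    "5: Internal Storage Enclosure Device Failure (Bay 2, Box 2, Port 1I, Slot 0)",
]

# Matches the "(Bay 5, Box 2, Port 2I, Slot 0)" suffix of a drive event.
LOCATION = re.compile(r"\(Bay (\d+), Box (\d+), Port (\w+), Slot (\d+)\)")


def parse_iml_line(line):
    """Split one IML line into a dict (assumed layout, see lead-in)."""
    # maxsplit=8 keeps the free-text description, which contains commas, intact.
    severity, _id, seq, _code, event_class, _a, _b, stamp, text = line.split(",", 8)
    entry = {
        "severity": severity,
        "seq": int(seq),              # hypothetical name for the third column
        "class": event_class,
        "time": datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S"),
        "text": text,
    }
    match = LOCATION.search(text)
    if match:
        bay, box, port, _slot = match.groups()
        entry["drive"] = f"{port}:{box}:{bay}"   # same Port:Box:Bay form ADU uses
    return entry


def drives_by_severity(lines, severity):
    """Return the sorted drive locations that have an entry of the given severity."""
    entries = (parse_iml_line(line) for line in lines)
    return sorted({e["drive"] for e in entries
                   if e["severity"] == severity and "drive" in e})


if __name__ == "__main__":
    print(drives_by_severity(SAMPLE, "Critical"))
```

Run against the four drive entries above, the sketch reports the two locations that raised Critical events, which are the same two drives named in the 1779 POST message.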
2. The ADU log shows the current array configuration. A P420i array controller configures the drives in bay 1 through bay 6 as RAID 10, forming Array A, logical drive 1. Bay 1 and bay 4, bay 2 and bay 5, and bay 3 and bay 6 each form a mirrored RAID 1 group, and the three RAID 1 groups are then combined into one RAID 0 array. The bay 7 drive is the hot spare. The bay 2 and bay 5 drives that reported errors above happen to sit in the same RAID 1 group:

    Big Drive Assignment Map    0x3f 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00

    Position  Device                               Status
    0         Physical Drive (500 GB SAS) 1I:2:1   Informational
    1         Physical Drive (500 GB SAS) 1I:2:2   Informational
    2         Physical Drive (500 GB SAS) 1I:2:3   Informational
    3         Physical Drive (500 GB SAS) 1I:2:4   Informational
    4         Physical Drive (500 GB SAS) 2I:2:5   Informational
    5         Physical Drive (500 GB SAS) 2I:2:6   Informational

    Fault Tolerance Mode        10 (0x0002)

    Smart Array P420i in Embedded Slot : SAS Array A : Logical Drive 1 : Mirror/Parity Group Information

    Paired Drive                0x0003 0x0004 0x0005 0x0000 0x0001 0x0002, then 0x0006 through 0x00ff in ascending order

    Position  Device                               Association                          Status
    0         Physical Drive (500 GB SAS) 1I:2:1   Physical Drive (500 GB SAS) 1I:2:4   Informational
    1         Physical Drive (500 GB SAS) 1I:2:2   Physical Drive (500 GB SAS) 2I:2:5   Informational
    2         Physical Drive (500 GB SAS) 1I:2:3   Physical Drive (500 GB SAS) 2I:2:6   Informational
    3         Physical Drive (500 GB SAS) 1I:2:4   Physical Drive (500 GB SAS) 1I:2:1   Informational
    4         Physical Drive (500 GB SAS) 2I:2:5   Physical Drive (500 GB SAS) 1I:2:2   Informational
    5         Physical Drive (500 GB SAS) 2I:2:6   Physical Drive (500 GB SAS) 1I:2:3   Informational
    6         Physical Drive (500 GB SAS) 2I:2:7   Physical Drive (500 GB SAS) 2I:2:7   Informational

3. The array failed as follows. The bay 5 drive was detected as removed, which degraded the logical drive. Shortly afterwards the bay 2 drive was also logged as removed. Because bay 2 and bay 5 belong to the same RAID 1 group, which in turn is part of the RAID 10 set with the other drives, the array and the logical drive failed. The bay 7 hot spare was logged as removed soon after that:

    Critical,1192,29211,Smart Array,Physical drive removed,,0x00,05/30/2017 09:10:03,[05/30 10:45:21] Hot-plug drive removed, Port=2I Box=2 Bay=5 SN=9XF2L38300009411DFVH
    Critical,1192,29212,Smart Array,Physical drive failure,,0x00,05/30/2017 09:10:03,[05/30 10:45:21] Physical drive failure, Port=2I Box=2 Bay=5 reason=0x14
    Caution,1192,29213,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:21] State change, logical drive 0, new state=DEGRADED
    Caution,1192,29214,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:26] State change, logical drive 0, new state=NEEDS_REBUILD
    Caution,1192,29215,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:26] State change, logical drive 0, new state=REBUILDING
    Caution,1192,29216,Smart Array,Physical drive inserted,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Hot-plug drive inserted, Port=2I Box=2 Bay=5 SN=9XF2L38300009411DFVH
    Caution,1192,29217,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] State change, logical drive 0, new state=NEEDS_REBUILD
    Critical,1192,29218,Smart Array,Physical drive removed,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Hot-plug drive removed, Port=1I Box=2 Bay=2 SN=9XF2L2JE000094141M37
    Critical,1192,29219,Smart Array,Physical drive failure,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Physical drive failure, Port=1I Box=2 Bay=2 reason=0x14
    Caution,1192,29220,Smart Array,Logical drive exchanged media,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Media exchanged detected, logical drive 0
    Caution,1192,29221,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] State change, logical drive 0, new state=FAILED
    Caution,1192,29222,Smart Array,Rebuild complete despite uncorrectable media errors,,0x00,05/30/2017 09:10:03,[05/30 10:45:45] Rebuild URE, LDrv=0 LBA=0x0005E3800-0x0005E4FFF
    Caution,1192,29239,Smart Array,Physical drive inserted,,0x00,05/30/2017 09:10:08,[05/30 10:45:57] Hot-plug drive inserted, Port=1I Box=2 Bay=2 SN=9XF2L2JE000094141M37
    Critical,1192,29314,Smart Array,Physical drive removed,,0x00,05/30/2017 09:11:18,[05/30 10:46:36] Hot-plug drive removed, Port=2I Box=2 Bay=7 SN=9XF2L2BM00009413GJFD
    Critical,1192,29315,Smart Array,Physical drive failure,,0x00,05/30/2017 09:11:18,[05/30 10:46:36] Physical drive failure, Port=2I Box=2 Bay=7 reason=0x14
    Caution,1192,29316,Smart Array,Physical drive inserted,,0x00,05/30/2017 09:11:18,[05/30 10:46:57] Hot-plug drive inserted, Port=2I Box=2 Bay=7 SN=9XF2L2BM00009413GJFD

4. The M&P (Monitor and Performance) records of each drive show that two drives (bay 2 and bay 7) have read/write or recovery errors together with bus faults records that point to the drive backplane. One drive (bay 5) has no errors of its own, only bus faults records. The three records below are Monitor and Performance Statistics (Since Factory) reported by the Smart Array P420i in Embedded Slot, for the Internal Drive Cage at Port 1I : Box 2 (drive 1I:2:2) and at Port 2I : Box 2 (drives 2I:2:5 and 2I:2:7).

Fields that differ between the drives:

    Field                   1I:2:2 (bay 2)         2I:2:5 (bay 5)         2I:2:7 (bay 7)
    Serial Number           9XF2L2JE000094141M37   9XF2L38300009411DFVH   9XF2L2BM00009413GJFD
    Sectors Read            0x0000002195fb69f4     0x0000002193dd9f06     0x000000000004056f
    Sectors Written         0x0000000078debd2b     0x0000000078deb745     0x0000000000234999
    Read Errors Hard        0x00000000             0x00000000             0x00000001
    Recovers Failed Read    0x0002                 0x0000                 0x0000

Fields that are identical on all three drives:

    Firmware Revision               HPD8
    Product Revision                HP MM0500FBFVQ
    Reference Time                  0x00156e40
    Read Errors Retry Recovered     0x00000000
    Read Errors ECC Corrected       0x0000000000000000
    Write Errors Hard               0x00000000
    Write Errors Retry Recovered    0x00000000
    Seek Count                      0xffffffffffffffff
    Seek Errors                     0xffffffffffffffff
    Spin Cycles                     0x00000000
    Spin Up Time                    0x0000
    Performance Test 1              0x0000
    Performance Test 2              0xffff
    Performance Test 3              0xffff
    Performance Test 4              0xffff
    Reallocation Sectors            0xffffffff
    Reallocated Sectors             0xffffffff
    DRQ Time Outs                   0xffff
    Other Time Outs                 0x0000
    Drive Rebuild Count             0 (0x0000)
    Spin Retries                    65535 (0xffff)
    Recovers Failed Write           0x0000
    Format Errors                   0x0000
    Self Test Failures              0xffff
    Not Ready Failures              0x00000000
    Remap Abort Failures            0xffffffff
    IRQ Deglitch Count              4294967295 (0xffffffff)
    Bus Faults                      0x00000016
    Hot Plug Count                  1 (0x00000001)
    Track Rewrite Errors            0xffff
    Write Errors After Remap        0x0000
    Background Firmware Revision    0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
    Media Failures                  0x0000
    Hardware Errors                 0x0000
    Aborted Command Failures        0x0000
    Spin Up Failures                0x0000
    Bad Target Count                0 (0x0000)
    Predictive Failure Errors       0x00000000

5. In addition, the array controller firmware, the BIOS, and the iLO 4 firmware are all at old versions:

    iLO (iLO Advanced License)    iLO 4 v2.00p67 built on Jul 30 2014
    System ROM                    02/10/2014

    Slot  Controller  Serial# / Version / Version / Version / Revision / Revision (values run together in the log)
    0     P420i       0014380300131606.001.9001.90.002.002140

Taken together, and provided that nobody pulled the drives by hand, the logs indicate that the array failure was caused mainly by the drive backplane. They also confirm that two drives (bay 2 and bay 7) are faulty. The bay 5 drive, which is in the same RAID 1 group as bay 2, has no hardware errors, and bay 7 is the hot spare. The array data should therefore still be intact once the backplane is replaced and the drive connections are stable again.
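The reasoning about mirror groups can be checked with a small model. The Python sketch below is a simplified illustration, not the controller's actual logic: it takes the three mirror pairs from the ADU Mirror/Parity Group Information and reports the logical drive as FAILED only when both members of one pair are missing at the same time. The function name and the "OK" label are made up for this example, and hot-spare rebuild behavior is deliberately ignored.

```python
# Mirror pairs taken from the ADU "Mirror/Parity Group Information" table.
MIRROR_PAIRS = [
    ("1I:2:1", "1I:2:4"),
    ("1I:2:2", "2I:2:5"),
    ("1I:2:3", "2I:2:6"),
]


def raid10_state(missing_drives):
    """Return a coarse state for the RAID 10 logical drive.

    FAILED   - some mirror pair has lost both members
    DEGRADED - at least one member is missing, but every pair has a survivor
    OK       - no member drive is missing (hypothetical label)
    """
    missing = set(missing_drives)
    if any(a in missing and b in missing for a, b in MIRROR_PAIRS):
        return "FAILED"
    if any(a in missing or b in missing for a, b in MIRROR_PAIRS):
        return "DEGRADED"
    return "OK"


if __name__ == "__main__":
    # Sequence seen in the controller events: bay 5 drops out, then bay 2.
    print(raid10_state({"2I:2:5"}))
    print(raid10_state({"2I:2:5", "1I:2:2"}))
```

In this model, losing two drives from different pairs (for example 2I:2:5 and 1I:2:1) only degrades the array. It was the coincidence of bay 2 and bay 5 sharing a pair that took the logical drive down.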
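Resolution:

1. Replace the drive backplane, then first pull the faulty bay 2 and bay 7 drives (removing these two drives does not affect the integrity of the array data).
2. Reboot the server and re-enable the array. The system should then boot, and the data should be backed up right away.
3. Replace the faulty bay 2 and bay 7 drives, then update the server firmware with the latest SW Bundle.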
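Lessons learned:

1. Finding the exact time of the array failure in the logs, and working out how the individual drives make up the array, helps the analysis a great deal.
2. For array, storage, and hard drive problems, collect the complete AHS and ADU logs.
3. The drive M&P records are very useful for judging whether a drive has a hardware fault and whether the drive backplane is healthy.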
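As an illustration of how the M&P counters separate drive faults from backplane symptoms, the sketch below applies the same rule of thumb used in this case to the values reported for the three drives. It is an assumption-laden example: the choice of which counters count as drive-side errors, and the label strings, are made for illustration only and are not an official interpretation of the ADU report.

```python
# Counters treated here as evidence of a problem in the drive itself (assumed set).
DRIVE_ERROR_FIELDS = (
    "Read Errors Hard",
    "Write Errors Hard",
    "Recovers Failed Read",
    "Recovers Failed Write",
)

# Values copied from the M&P statistics of the three drives in this case.
STATS = {
    "1I:2:2": {"Read Errors Hard": 0x0, "Write Errors Hard": 0x0,
               "Recovers Failed Read": 0x2, "Recovers Failed Write": 0x0,
               "Bus Faults": 0x16},
    "2I:2:5": {"Read Errors Hard": 0x0, "Write Errors Hard": 0x0,
               "Recovers Failed Read": 0x0, "Recovers Failed Write": 0x0,
               "Bus Faults": 0x16},
    "2I:2:7": {"Read Errors Hard": 0x1, "Write Errors Hard": 0x0,
               "Recovers Failed Read": 0x0, "Recovers Failed Write": 0x0,
               "Bus Faults": 0x16},
}


def triage(counters):
    """Classify one drive from its M&P counters (illustrative labels)."""
    drive_errors = any(counters.get(name, 0) for name in DRIVE_ERROR_FIELDS)
    bus_faults = counters.get("Bus Faults", 0) > 0
    if drive_errors and bus_faults:
        return "drive errors + bus faults"
    if drive_errors:
        return "drive errors"
    if bus_faults:
        return "bus faults only"
    return "clean"


if __name__ == "__main__":
    for drive, counters in STATS.items():
        print(drive, triage(counters))
```

With these inputs, bay 2 and bay 7 come out as having both drive errors and bus faults, while bay 5 shows bus faults only, which matches the conclusion drawn from the logs.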
