Hard disk: Shenwu failed to connect to server

Date: 2021-04-14
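Tags: RAID, firmware upgrade, SW_Bundle, AHS
Zhou Feng, published 2017-06-27

Case study: array failure and data loss on an H3C FlexServer R390 server at a customer site

An H3C FlexServer R390 server at a customer site has seven hard drives installed. Six of them are configured as RAID 10 and one is configured as a hot spare.

The array failed, data was lost, and the system could not boot normally.

The following warnings are shown during the power-on self-test:

    1792-Slot 0 Drive Array - Valid Data Found in Write-Back Cache.
    Data will automatically be written to drive array.

    1779-Slot 0 Drive Array - Replacement drive(s) detected OR previously failed drive(s) now appear to be operational:
        Port 1I: Box 2: Bay 2
        Port 2I: Box 2: Bay 5
    Logical drive(s) disabled due to possible data loss.
    Select "F1" to continue with logical drive(s) disabled
    Select "F2" to accept data loss and to re-enable logical drive(s)
    (RESUME = "F1" OR "F2" KEY) [default = "F1" in 45 seconds] **TIMED OUT**

    1716-Slot 0 Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan.
    Errors will be fixed automatically when the sector(s) are overwritten.
    Backup and Restore recommended.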
Log analysis found the following:

1. The IML records a large number of media errors:

    Critical,1192,29197,0x0013,Drive Array,,,05/30/2017 09:10:00,4: Internal Storage Enclosure Device Failure (Bay 5, Box 2, Port 2I, Slot 0)
    Critical,1192,29231,0x0013,Drive Array,,,05/30/2017 09:10:00,5: Internal Storage Enclosure Device Failure (Bay 2, Box 2, Port 1I, Slot 0)
    Repaired,1192,29234,0x0013,Drive Array,,,05/30/2017 09:10:00,4: Internal Storage Enclosure Device Failure (Bay 5, Box 2, Port 2I, Slot 0)
    Repaired,1192,29274,0x0013,Drive Array,,,05/30/2017 09:10:00,5: Internal Storage Enclosure Device Failure (Bay 2, Box 2, Port 1I, Slot 0)
    Caution,1193,933,0x000A,POST Message,,,05/30/2017 11:03:00,6: POST Error: 1792-Slot X Drive Array - Valid Data Found in Cache Module. Data will automatically be written to drive array.
    Caution,1193,934,0x000A,POST Message,,,05/30/2017 11:03:00,7: POST Error: 1779-Slot X Drive Array - Replacement drive(s) detected OR previously failed drive(s) now appear to be operational.
    Caution,1193,935,0x000A,POST Message,,,05/30/2017 11:03:00,8: POST Error: 1716-Slot X Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan. Errors will be fixed automatically when the sector(s) are overwritten.
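
The IML excerpt is comma-separated, so it can be loaded into records and grouped by drive bay before reading it by eye. The following is a minimal sketch, assuming the line layout shown in the excerpt (with spacing restored); the meaning of the second and third columns is not stated in the source, so the sketch ignores them, and the names `parse_iml_line` and `critical_by_bay` are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Sample lines copied from the IML excerpt in this case, spacing restored.
IML_LINES = [
    "Critical,1192,29197,0x0013,Drive Array,,,05/30/2017 09:10:00,4: Internal Storage Enclosure Device Failure (Bay 5, Box 2, Port 2I, Slot 0)",
    "Critical,1192,29231,0x0013,Drive Array,,,05/30/2017 09:10:00,5: Internal Storage Enclosure Device Failure (Bay 2, Box 2, Port 1I, Slot 0)",
    "Repaired,1192,29234,0x0013,Drive Array,,,05/30/2017 09:10:00,4: Internal Storage Enclosure Device Failure (Bay 5, Box 2, Port 2I, Slot 0)",
    "Caution,1193,934,0x000A,POST Message,,,05/30/2017 11:03:00,7: POST Error: 1779-Slot X Drive Array - Replacement drive(s) detected OR previously failed drive(s) now appear to be operational.",
]

# Drive location as printed at the end of an enclosure failure message.
LOCATION = re.compile(r"\(Bay (\d+), Box (\d+), Port (\w+), Slot (\d+)\)")


def parse_iml_line(line):
    """Split one IML line into a dict. Assumes 8 leading fields, then free text."""
    parts = line.split(",", 8)  # the message itself may contain commas
    if len(parts) != 9:
        raise ValueError("unexpected IML line layout: " + line)
    severity, _, _, class_code, event_class, _, _, stamp, message = parts
    record = {
        "severity": severity,
        "class_code": class_code,
        "event_class": event_class,
        "time": datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S"),
        "message": message.strip(),
        "bay": None,
        "port": None,
    }
    match = LOCATION.search(message)
    if match:
        record["bay"] = int(match.group(1))
        record["port"] = match.group(3)
    return record


records = [parse_iml_line(line) for line in IML_LINES]

# Count Critical entries per bay to see which drives the enclosure flagged.
critical_by_bay = Counter(
    r["bay"] for r in records if r["severity"] == "Critical" and r["bay"] is not None
)

for bay, count in sorted(critical_by_bay.items()):
    print("bay", bay, "critical entries:", count)
```

On the sample lines this reports one Critical entry each for bay 2 and bay 5, the same two bays named in the 1779 POST message.
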

2. The ADU log shows the current array configuration. A P420i array controller combines the drives in bay1 to bay6 into RAID 10 as Array A, logical drive 1. bay1 and bay4, bay2 and bay5, and bay3 and bay6 each form a RAID 1 group of mirrored drives, and the three RAID 1 groups are then striped into one RAID 0 array. The bay7 drive is the hot spare. The two drives reported above, bay2 and bay5, happen to be in the same RAID 1 group:

    Big Drive Assignment Map  0x3f 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00

    Position  Device                                 Status
    0         Physical Drive (500 GB SAS) 1I:2:1     Informational
    1         Physical Drive (500 GB SAS) 1I:2:2     Informational
    2         Physical Drive (500 GB SAS) 1I:2:3     Informational
    3         Physical Drive (500 GB SAS) 1I:2:4     Informational
    4         Physical Drive (500 GB SAS) 2I:2:5     Informational
    5         Physical Drive (500 GB SAS) 2I:2:6     Informational

    Fault Tolerance Mode  10 (0x0002)

    Smart Array P420i in Embedded Slot : SAS Array A : Logical Drive 1 : Mirror/Parity Group Information

    Paired Drive  0x0003 0x0004 0x0005 0x0000 0x0001 0x0002 0x0006 0x0007 ... 0x00ff
    (entries 0x0006 through 0x00ff each hold their own index)

    Position  Device                                 Association                            Status
    0         Physical Drive (500 GB SAS) 1I:2:1     Physical Drive (500 GB SAS) 1I:2:4     Informational
    1         Physical Drive (500 GB SAS) 1I:2:2     Physical Drive (500 GB SAS) 2I:2:5     Informational
    2         Physical Drive (500 GB SAS) 1I:2:3     Physical Drive (500 GB SAS) 2I:2:6     Informational
    3         Physical Drive (500 GB SAS) 1I:2:4     Physical Drive (500 GB SAS) 1I:2:1     Informational
    4         Physical Drive (500 GB SAS) 2I:2:5     Physical Drive (500 GB SAS) 1I:2:2     Informational
    5         Physical Drive (500 GB SAS) 2I:2:6     Physical Drive (500 GB SAS) 1I:2:3     Informational
    6         Physical Drive (500 GB SAS) 2I:2:7     Physical Drive (500 GB SAS) 2I:2:7     Informational

3. The array failed as follows. The bay5 drive was detected as removed, which degraded the logical drive. Shortly afterwards the bay2 drive was also logged as removed. Because bay2 and bay5 are in the same RAID 1 group, which is striped with the other drives into RAID 10, the array failed and the logical drive failed. The bay7 hot spare was also logged as removed soon after:

    Critical,1192,29211,Smart Array,Physical drive removed,,0x00,05/30/2017 09:10:03,[05/30 10:45:21] Hot-plug drive removed, Port=2I Box=2 Bay=5 SN=9XF2L38300009411DFVH
    Critical,1192,29212,Smart Array,Physical drive failure,,0x00,05/30/2017 09:10:03,[05/30 10:45:21] Physical drive failure, Port=2I Box=2 Bay=5 reason=0x14
    Caution,1192,29213,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:21] State change, logical drive 0, new state=DEGRADED
    Caution,1192,29214,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:26] State change, logical drive 0, new state=NEEDS_REBUILD
    Caution,1192,29215,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:26] State change, logical drive 0, new state=REBUILDING
    Caution,1192,29216,Smart Array,Physical drive inserted,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Hot-plug drive inserted, Port=2I Box=2 Bay=5 SN=9XF2L38300009411DFVH
    Caution,1192,29217,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] State change, logical drive 0, new state=NEEDS_REBUILD
    Critical,1192,29218,Smart Array,Physical drive removed,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Hot-plug drive removed, Port=1I Box=2 Bay=2 SN=9XF2L2JE000094141M37
    Critical,1192,29219,Smart Array,Physical drive failure,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Physical drive failure, Port=1I Box=2 Bay=2 reason=0x14
    Caution,1192,29220,Smart Array,Logical drive exchanged media,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Media exchanged detected, logical drive 0
    Caution,1192,29221,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] State change, logical drive 0, new state=FAILED
    Caution,1192,29222,Smart Array,Rebuild complete despite uncorrectable media errors,,0x00,05/30/2017 09:10:03,[05/30 10:45:45] Rebuild URE, LDrv=0 LBA=0x0005E3800-0x0005E4FFF
    Caution,1192,29239,Smart Array,Physical drive inserted,,0x00,05/30/2017 09:10:08,[05/30 10:45:57] Hot-plug drive inserted, Port=1I Box=2 Bay=2 SN=9XF2L2JE000094141M37
    Critical,1192,29314,Smart Array,Physical drive removed,,0x00,05/30/2017 09:11:18,[05/30 10:46:36] Hot-plug drive removed, Port=2I Box=2 Bay=7 SN=9XF2L2BM00009413GJFD
    Critical,1192,29315,Smart Array,Physical drive failure,,0x00,05/30/2017 09:11:18,[05/30 10:46:36] Physical drive failure, Port=2I Box=2 Bay=7 reason=0x14
    Caution,1192,29316,Smart Array,Physical drive inserted,,0x00,05/30/2017 09:11:18,[05/30 10:46:57] Hot-plug drive inserted, Port=2I Box=2 Bay=7 SN=9XF2L2BM00009413GJFD

4. The M&P (Monitor and Performance) records of each drive show that two drives (bay2, bay7) have read/write or recovery errors together with bus faults that point to the drive backplane, while one drive (bay5) has no errors of its own and only bus faults:

    Smart Array P420i in Embedded Slot : Internal Drive Cage at Port 1I : Box 2 : Physical Drive (500 GB SAS) 1I:2:2
    Smart Array P420i in Embedded Slot : Internal Drive Cage at Port 2I : Box 2 : Physical Drive (500 GB SAS) 2I:2:5
    Smart Array P420i in Embedded Slot : Internal Drive Cage at Port 2I : Box 2 : Physical Drive (500 GB SAS) 2I:2:7
    Monitor and Performance Statistics (Since Factory)

    Field                   1I:2:2 (bay2)           2I:2:5 (bay5)           2I:2:7 (bay7)
    Serial Number           9XF2L2JE000094141M37    9XF2L38300009411DFVH    9XF2L2BM00009413GJFD
    Sectors Read            0x0000002195fb69f4      0x0000002193dd9f06      0x000000000004056f
    Sectors Written         0x0000000078debd2b      0x0000000078deb745      0x0000000000234999
    Read Errors Hard        0x00000000              0x00000000              0x00000001
    Recovers Failed Read    0x0002                  0x0000                  0x0000
    Bus Faults              0x00000016              0x00000016              0x00000016
    Hot Plug Count          1 (0x00000001)          1 (0x00000001)          1 (0x00000001)

    The remaining fields are identical on all three drives:

    Firmware Revision              HPD8
    Product Revision               HP MM0500FBFVQ
    Reference Time                 0x00156e40
    Read Errors Retry Recovered    0x00000000
    Read Errors ECC Corrected      0x0000000000000000
    Write Errors Hard              0x00000000
    Write Errors Retry Recovered   0x00000000
    Seek Count                     0xffffffffffffffff
    Seek Errors                    0xffffffffffffffff
    Spin Cycles                    0x00000000
    Spin Up Time                   0x0000
    Performance Test 1             0x0000
    Performance Test 2             0xffff
    Performance Test 3             0xffff
    Performance Test 4             0xffff
    Reallocation Sectors           0xffffffff
    Reallocated Sectors            0xffffffff
    DRQ Time Outs                  0xffff
    Other Time Outs                0x0000
    Drive Rebuild Count            0 (0x0000)
    Spin Retries                   65535 (0xffff)
    Recovers Failed Write          0x0000
    Format Errors                  0x0000
    Self Test Failures             0xffff
    Not Ready Failures             0x00000000
    Remap Abort Failures           0xffffffff
    IRQ Deglitch Count             4294967295 (0xffffffff)
    Track Rewrite Errors           0xffff
    Write Errors After Remap       0x0000
    Background Firmware Revision   0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
    Media Failures                 0x0000
    Hardware Errors                0x0000
    Aborted Command Failures       0x0000
    Spin Up Failures               0x0000
    Bad Target Count               0 (0x0000)
    Predictive Failure Errors      0x00000000

5. In addition, the array controller firmware, the BIOS, and the iLO 4 firmware are all outdated:

    iLO (iLO Advanced License)  iLO 4 v2.00 p67 built on Jul 30 2014
    System ROM  02/10/2014

    Slot  Controller  Serial#  Version  Version  Version  Revision  Revision
    0     P420i       0014380300131606.001.9001.90.002.002140

Summarizing the log analysis: if manual removal of the drives is ruled out, the array failure can be attributed mainly to the drive backplane. Two drives (bay2, bay7) are also confirmed to have problems. The bay5 drive, which is in the same RAID 1 group as bay2, has no hardware errors, and bay7 is the hot spare. The array data is therefore not lost once the drive backplane is replaced and the connection is stable again.
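
The reasoning in the analysis (a RAID 10 logical drive fails as soon as one mirror pair has no usable member) can be replayed against the removal and insertion times from the controller event log. This is a simplified sketch, not the controller's actual state machine: it assumes a reinserted drive stays unusable until a rebuild completes, it ignores hot-spare activation, and the labels "OK", `classify`, and `replay` are hypothetical names introduced for illustration.

```python
# Mirror pairs by bay number, as listed in the ADU Mirror/Parity Group table.
MIRROR_PAIRS = [(1, 4), (2, 5), (3, 6)]

# (controller timestamp, bay, action) taken from the Smart Array event excerpt.
EVENTS = [
    ("10:45:21", 5, "removed"),
    ("10:45:43", 5, "inserted"),
    ("10:45:43", 2, "removed"),
    ("10:45:57", 2, "inserted"),
    ("10:46:36", 7, "removed"),   # bay 7 is the hot spare, not part of a pair
    ("10:46:57", 7, "inserted"),
]


def classify(pairs, in_sync):
    """Return a simplified logical drive state for the given in-sync bays."""
    usable = [sum(bay in in_sync for bay in pair) for pair in pairs]
    if any(count == 0 for count in usable):
        return "FAILED"      # some mirror pair has lost both members
    if any(count < 2 for count in usable):
        return "DEGRADED"    # some mirror pair is running on one member
    return "OK"


def replay(pairs, events):
    """Apply events in order and record the state after each one."""
    in_sync = {bay for pair in pairs for bay in pair}
    history = []
    for stamp, bay, action in events:
        if action == "removed":
            in_sync.discard(bay)
        # Assumption: an inserted drive holds stale data until it is rebuilt,
        # so it is deliberately not added back to in_sync here.
        history.append((stamp, bay, action, classify(pairs, in_sync)))
    return history


history = replay(MIRROR_PAIRS, EVENTS)
first_failure = next(entry for entry in history if entry[3] == "FAILED")

for entry in history:
    print(*entry)
```

Under these assumptions the first FAILED state appears at 10:45:43 when bay 2 is removed, which agrees with the `new state=FAILED` entry in the event log. The bay 5 drive had been reinserted by then but not rebuilt, so its mirror partner was the only usable copy.
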
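
Resolution:
1. Replace the drive backplane, then first pull the faulty bay2 and bay7 drives (removing these two drives does not affect the integrity of the array data);
2. Reboot the machine and re-enable the array, after which the system boots; then back up the data;
3. Replace the faulty bay2 and bay7 drives, then update the machine firmware with the latest SW Bundle.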
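
Lessons learned:
1. Finding in the logs the point in time at which the array failed, and exactly how the drives make up the array, helps the analysis a great deal;
2. For array, storage, and drive problems, collect the complete AHS and ADU logs;
3. The M&P records of a drive are very useful for judging whether the drive has a hardware problem and whether the drive backplane is healthy.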
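
The way M&P counters are used in this case (drive-level read, write, or recovery errors point at the drive, while bus faults without such errors point at the backplane or link) can be written as a small check. The sketch below uses the counter values from this case; treating an all-ones value such as 0xffff as "not reported" is an assumption, and `counter`, `assess`, and `DRIVE_ERROR_FIELDS` are hypothetical names.

```python
# Selected M&P counters per drive, copied from the ADU excerpt in this case.
MP_STATS = {
    "bay2": {
        "Read Errors Hard": "0x00000000",
        "Write Errors Hard": "0x00000000",
        "Recovers Failed Read": "0x0002",
        "Recovers Failed Write": "0x0000",
        "Bus Faults": "0x00000016",
    },
    "bay5": {
        "Read Errors Hard": "0x00000000",
        "Write Errors Hard": "0x00000000",
        "Recovers Failed Read": "0x0000",
        "Recovers Failed Write": "0x0000",
        "Bus Faults": "0x00000016",
    },
    "bay7": {
        "Read Errors Hard": "0x00000001",
        "Write Errors Hard": "0x00000000",
        "Recovers Failed Read": "0x0000",
        "Recovers Failed Write": "0x0000",
        "Bus Faults": "0x00000016",
    },
}

# Counters that this case treats as evidence of a problem in the drive itself.
DRIVE_ERROR_FIELDS = (
    "Read Errors Hard",
    "Write Errors Hard",
    "Recovers Failed Read",
    "Recovers Failed Write",
)


def counter(value):
    """Convert a hex counter to int; all-ones is treated as not reported (assumption)."""
    digits = value.lower().removeprefix("0x")
    if digits and set(digits) == {"f"}:
        return None
    return int(digits, 16)


def assess(stats):
    """Return (drive-level error total, bus fault count) for one drive."""
    drive_errors = sum(counter(stats[name]) or 0 for name in DRIVE_ERROR_FIELDS)
    bus_faults = counter(stats["Bus Faults"]) or 0
    return drive_errors, bus_faults


results = {bay: assess(stats) for bay, stats in MP_STATS.items()}
suspect_drives = [bay for bay, (errors, _) in results.items() if errors > 0]
bus_only = [bay for bay, (errors, faults) in results.items() if errors == 0 and faults > 0]

print("drives with their own errors:", suspect_drives)
print("drives with bus faults only:", bus_only)
```

With the values from this case, bay2 and bay7 show drive-level errors and bay5 shows bus faults only, matching the conclusion that bay5 itself is healthy. `str.removeprefix` requires Python 3.9 or later.
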
