Hard disk: Shenwu failed to connect to server

Date: 2021-04-14
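RAID, firmware upgrade, SW_Bundle, AHS
Zhou Feng, published 2017-06-27

Case study: array failure and data loss on an H3C FlexServer R390 server at a customer site

An H3C FlexServer R390 server at a customer site has seven hard disks installed: six are configured as a RAID 10 array, and one is configured as a hot spare.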
The array failed, data was lost, and the server could not boot into the operating system.
The following warnings appear during the power-on self-test (POST):
1792-Slot 0 Drive Array - Valid Data Found in Write-Back Cache.
Data will automatically be written to drive array.
1779-Slot 0 Drive Array - Replacement drive(s) detected OR previously failed drive(s) now appear to be operational:
Port 1I: Box 2: Bay 2
Port 2I: Box 2: Bay 5
Logical drive(s) disabled due to possible data loss.
Select "F1" to continue with logical drive(s) disabled
Select "F2" to accept data loss and to re-enable logical drive(s)
(RESUME = "F1" OR "F2" KEY) [default = "F1" in 45 seconds] **TIMED OUT**
1716-Slot 0 Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan.
Errors will be fixed automatically when the sector(s) are overwritten.
Backup and Restore recommended.
Log analysis found the following:
1.
The IML records a large number of media errors, as follows:
Critical,1192,29197,0x0013,Drive Array,,,05/30/2017 09:10:00,4:Internal Storage Enclosure Device Failure (Bay 5, Box 2, Port 2I, Slot 0)
Critical,1192,29231,0x0013,Drive Array,,,05/30/2017 09:10:00,5:Internal Storage Enclosure Device Failure (Bay 2, Box 2, Port 1I, Slot 0)
Repaired,1192,29234,0x0013,Drive Array,,,05/30/2017 09:10:00,4:Internal Storage Enclosure Device Failure (Bay 5, Box 2, Port 2I, Slot 0)
Repaired,1192,29274,0x0013,Drive Array,,,05/30/2017 09:10:00,5:Internal Storage Enclosure Device Failure (Bay 2, Box 2, Port 1I, Slot 0)
Caution,1193,933,0x000A,POST Message,,,05/30/2017 11:03:00,6:POST Error: 1792-Slot X Drive Array - Valid Data Found in Cache Module. Data will automatically be written to drive array.
Caution,1193,934,0x000A,POST Message,,,05/30/2017 11:03:00,7:POST Error: 1779-Slot X Drive Array - Replacement drive(s) detected OR previously failed drive(s) now appear to be operational.
Caution,1193,935,0x000A,POST Message,,,05/30/2017 11:03:00,8:POST Error: 1716-Slot X Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan. Errors will be fixed automatically when the sector(s) are overwritten.
2.
The ADU log shows the current array configuration: a P420i array controller combines the drives in bay 1 through bay 6 into RAID 10, forming Array A, logical drive 1. Bay 1 and bay 4, bay 2 and bay 5, and bay 3 and bay 6 form three RAID 1 groups of mutual mirrors, and the three RAID 1 groups are then striped into one RAID 0 array.
The drive in bay 7 is the hot spare. The two drives reported above, bay 2 and bay 5, happen to be in the same RAID 1 group, as follows:
Big Drive Assignment Map  0x3f 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
Position  Device  Status
0  Physical Drive (500 GB SAS) 1I:2:1  Informational
1  Physical Drive (500 GB SAS) 1I:2:2  Informational
2  Physical Drive (500 GB SAS) 1I:2:3  Informational
3  Physical Drive (500 GB SAS) 1I:2:4  Informational
4  Physical Drive (500 GB SAS) 2I:2:5  Informational
5  Physical Drive (500 GB SAS) 2I:2:6  Informational
Fault Tolerance Mode  10 (0x0002)
Smart Array P420i in Embedded Slot : SAS Array A : Logical Drive 1 : Mirror/Parity Group Information
Paired Drive  0x0003 0x0004 0x0005 0x0000 0x0001 0x0002, followed by 0x0006 through 0x00ff in ascending order
Position  Device  Association  Status
0  Physical Drive (500 GB SAS) 1I:2:1  Physical Drive (500 GB SAS) 1I:2:4  Informational
1  Physical Drive (500 GB SAS) 1I:2:2  Physical Drive (500 GB SAS) 2I:2:5  Informational
2  Physical Drive (500 GB SAS) 1I:2:3  Physical Drive (500 GB SAS) 2I:2:6  Informational
3  Physical Drive (500 GB SAS) 1I:2:4  Physical Drive (500 GB SAS) 1I:2:1  Informational
4  Physical Drive (500 GB SAS) 2I:2:5  Physical Drive (500 GB SAS) 1I:2:2  Informational
5  Physical Drive (500 GB SAS) 2I:2:6  Physical Drive (500 GB SAS) 1I:2:3  Informational
6  Physical Drive (500 GB SAS) 2I:2:7  Physical Drive (500 GB SAS) 2I:2:7  Informational
3.
The array failed as follows: the bay 5 drive was detected as removed, which degraded the logical drive, and shortly afterwards the bay 2 drive was also logged as removed. Because bay 2 and bay 5 are in the same RAID 1 group, which is striped with the other drives into RAID 10, the array and the logical drive failed. The hot spare in bay 7 was also logged as removed soon afterwards. Details:
Critical,1192,29211,Smart Array,Physical drive removed,,0x00,05/30/2017 09:10:03,[05/30 10:45:21] Hot-plug drive removed, Port=2I Box=2 Bay=5 SN=9XF2L38300009411DFVH
Critical,1192,29212,Smart Array,Physical drive failure,,0x00,05/30/2017 09:10:03,[05/30 10:45:21] Physical drive failure, Port=2I Box=2 Bay=5 reason=0x14
Caution,1192,29213,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:21] State change, logical drive 0, new state=DEGRADED
Caution,1192,29214,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:26] State change, logical drive 0, new state=NEEDS_REBUILD
Caution,1192,29215,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:26] State change, logical drive 0, new state=REBUILDING
Caution,1192,29216,Smart Array,Physical drive inserted,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Hot-plug drive inserted, Port=2I Box=2 Bay=5 SN=9XF2L38300009411DFVH
Caution,1192,29217,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] State change, logical drive 0, new state=NEEDS_REBUILD
Critical,1192,29218,Smart Array,Physical drive removed,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Hot-plug drive removed, Port=1I Box=2 Bay=2 SN=9XF2L2JE000094141M37
Critical,1192,29219,Smart Array,Physical drive failure,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Physical drive failure, Port=1I Box=2 Bay=2 reason=0x14
Caution,1192,29220,Smart Array,Logical drive exchanged media,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Media exchanged detected, logical drive 0
Caution,1192,29221,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] State change, logical drive 0, new state=FAILED
Caution,1192,29222,Smart Array,Rebuild complete despite uncorrectable media errors,,0x00,05/30/2017 09:10:03,[05/30 10:45:45] Rebuild URE, LDrv=0 LBA=0x0005E3800-0x0005E4FFF
Caution,1192,29239,Smart Array,Physical drive inserted,,0x00,05/30/2017 09:10:08,[05/30 10:45:57] Hot-plug drive inserted, Port=1I Box=2 Bay=2 SN=9XF2L2JE000094141M37
Critical,1192,29314,Smart Array,Physical drive removed,,0x00,05/30/2017 09:11:18,[05/30 10:46:36] Hot-plug drive removed, Port=2I Box=2 Bay=7 SN=9XF2L2BM00009413GJFD
Critical,1192,29315,Smart Array,Physical drive failure,,0x00,05/30/2017 09:11:18,[05/30 10:46:36] Physical drive failure, Port=2I Box=2 Bay=7 reason=0x14
Caution,1192,29316,Smart Array,Physical drive inserted,,0x00,05/30/2017 09:11:18,[05/30 10:46:57] Hot-plug drive inserted, Port=2I Box=2 Bay=7 SN=9XF2L2BM00009413GJFD
4.
The Monitor and Performance (M&P) records of each drive show that two drives (bay 2 and bay 7) have read/write or recovery errors together with bus faults that point to the drive backplane, while one drive (bay 5) has no errors of its own, only bus faults:

Smart Array P420i in Embedded Slot : Internal Drive Cage at Port 1I : Box 2 : Physical Drive (500 GB SAS) 1I:2:2 : Monitor and Performance Statistics (Since Factory)
Serial Number  9XF2L2JE000094141M37
Firmware Revision  HPD8
Product Revision  HP MM0500FBFVQ
Reference Time  0x00156e40
Sectors Read  0x0000002195fb69f4
Read Errors Hard  0x00000000
Read Errors Retry Recovered  0x00000000
Read Errors ECC Corrected  0x0000000000000000
Sectors Written  0x0000000078debd2b
Write Errors Hard  0x00000000
Write Errors Retry Recovered  0x00000000
Seek Count  0xffffffffffffffff
Seek Errors  0xffffffffffffffff
Spin Cycles  0x00000000
Spin Up Time  0x0000
Performance Test 1  0x0000
Performance Test 2  0xffff
Performance Test 3  0xffff
Performance Test 4  0xffff
Reallocation Sectors  0xffffffff
Reallocated Sectors  0xffffffff
DRQ Time Outs  0xffff
Other Time Outs  0x0000
Drive Rebuild Count  0 (0x0000)
Spin Retries  65535 (0xffff)
Recovers Failed Read  0x0002
Recovers Failed Write  0x0000
Format Errors  0x0000
Self Test Failures  0xffff
Not Ready Failures  0x00000000
Remap Abort Failures  0xffffffff
IRQ Deglitch Count  4294967295 (0xffffffff)
Bus Faults  0x00000016
Hot Plug Count  1 (0x00000001)
Track Rewrite Errors  0xffff
Write Errors After Remap  0x0000
Background Firmware Revision  0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
Media Failures  0x0000
Hardware Errors  0x0000
Aborted Command Failures  0x0000
Spin Up Failures  0x0000
Bad Target Count  0 (0x0000)
Predictive Failure Errors  0x00000000

Smart Array P420i in Embedded Slot : Internal Drive Cage at Port 2I : Box 2 : Physical Drive (500 GB SAS) 2I:2:5 : Monitor and Performance Statistics (Since Factory)
Serial Number  9XF2L38300009411DFVH
Firmware Revision  HPD8
Product Revision  HP MM0500FBFVQ
Reference Time  0x00156e40
Sectors Read  0x0000002193dd9f06
Read Errors Hard  0x00000000
Read Errors Retry Recovered  0x00000000
Read Errors ECC Corrected  0x0000000000000000
Sectors Written  0x0000000078deb745
Write Errors Hard  0x00000000
Write Errors Retry Recovered  0x00000000
Seek Count  0xffffffffffffffff
Seek Errors  0xffffffffffffffff
Spin Cycles  0x00000000
Spin Up Time  0x0000
Performance Test 1  0x0000
Performance Test 2  0xffff
Performance Test 3  0xffff
Performance Test 4  0xffff
Reallocation Sectors  0xffffffff
Reallocated Sectors  0xffffffff
DRQ Time Outs  0xffff
Other Time Outs  0x0000
Drive Rebuild Count  0 (0x0000)
Spin Retries  65535 (0xffff)
Recovers Failed Read  0x0000
Recovers Failed Write  0x0000
Format Errors  0x0000
Self Test Failures  0xffff
Not Ready Failures  0x00000000
Remap Abort Failures  0xffffffff
IRQ Deglitch Count  4294967295 (0xffffffff)
Bus Faults  0x00000016
Hot Plug Count  1 (0x00000001)
Track Rewrite Errors  0xffff
Write Errors After Remap  0x0000
Background Firmware Revision  0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
Media Failures  0x0000
Hardware Errors  0x0000
Aborted Command Failures  0x0000
Spin Up Failures  0x0000
Bad Target Count  0 (0x0000)
Predictive Failure Errors  0x00000000

Smart Array P420i in Embedded Slot : Internal Drive Cage at Port 2I : Box 2 : Physical Drive (500 GB SAS) 2I:2:7 : Monitor and Performance Statistics (Since Factory)
Serial Number  9XF2L2BM00009413GJFD
Firmware Revision  HPD8
Product Revision  HP MM0500FBFVQ
Reference Time  0x00156e40
Sectors Read  0x000000000004056f
Read Errors Hard  0x00000001
Read Errors Retry Recovered  0x00000000
Read Errors ECC Corrected  0x0000000000000000
Sectors Written  0x0000000000234999
Write Errors Hard  0x00000000
Write Errors Retry Recovered  0x00000000
Seek Count  0xffffffffffffffff
Seek Errors  0xffffffffffffffff
Spin Cycles  0x00000000
Spin Up Time  0x0000
Performance Test 1  0x0000
Performance Test 2  0xffff
Performance Test 3  0xffff
Performance Test 4  0xffff
Reallocation Sectors  0xffffffff
Reallocated Sectors  0xffffffff
DRQ Time Outs  0xffff
Other Time Outs  0x0000
Drive Rebuild Count  0 (0x0000)
Spin Retries  65535 (0xffff)
Recovers Failed Read  0x0000
Recovers Failed Write  0x0000
Format Errors  0x0000
Self Test Failures  0xffff
Not Ready Failures  0x00000000
Remap Abort Failures  0xffffffff
IRQ Deglitch Count  4294967295 (0xffffffff)
Bus Faults  0x00000016
Hot Plug Count  1 (0x00000001)
Track Rewrite Errors  0xffff
Write Errors After Remap  0x0000
Background Firmware Revision  0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
Media Failures  0x0000
Hardware Errors  0x0000
Aborted Command Failures  0x0000
Spin Up Failures  0x0000
Bad Target Count  0 (0x0000)
Predictive Failure Errors  0x00000000
5.
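In addition, the array controller firmware, the BIOS, and the iLO 4 firmware are all at old versions, as follows:
iLO (iLO Advanced License)  iLO 4 v2.00p67 built on Jul 30 2014
System ROM  02/10/2014
Slot  Controller  Serial #  Version  Version  Version  Revision  Revision
0  P420i  0014380300131606.001.9001.90.002.002140

Summing up the log analysis: if manual removal of the drives is ruled out, the array failure can be attributed mainly to the drive backplane. Two drives (bay 2 and bay 7) are also confirmed to be faulty. The bay 5 drive, which is in the same RAID 1 group as bay 2, has no hardware errors, and bay 7 is the hot spare, so once the backplane is replaced and the connections are stable again, the array data should still be intact.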
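The removal sequence used in the analysis (bay 5 first, then bay 2, then the hot spare in bay 7) can be extracted from the Smart Array event entries mechanically. The following is a minimal Python sketch, not an HP or H3C tool: the function name `hot_plug_timeline` and the regular expression are illustrative assumptions, and the sample entries are copied from the event log quoted in this case, with spaces restored. The pattern tolerates missing spaces between fields.

```python
import re

# Matches the controller-side part of a Smart Array hot-plug event, e.g.
# "[05/30 10:45:21] Hot-plug drive removed, Port=2I Box=2 Bay=5 SN=..."
# \s* is used between fields so that space-stripped copies also match.
EVENT_RE = re.compile(
    r"\[(?P<date>\d{2}/\d{2})\s*(?P<time>\d{2}:\d{2}:\d{2})\]\s*"
    r"Hot-plug\s*drive\s*(?P<action>removed|inserted),\s*"
    r"Port=(?P<port>\w+?)\s*Box=(?P<box>\d+)\s*Bay=(?P<bay>\d+)"
)

def hot_plug_timeline(lines):
    """Return (time, action, 'port:box:bay') tuples in log order (hypothetical helper)."""
    events = []
    for line in lines:
        m = EVENT_RE.search(line)
        if m:
            drive = f"{m['port']}:{m['box']}:{m['bay']}"
            events.append((m["time"], m["action"], drive))
    return events

SAMPLE = [
    "Critical,1192,29211,Smart Array,Physical drive removed,,0x00,05/30/2017 09:10:03,[05/30 10:45:21] Hot-plug drive removed, Port=2I Box=2 Bay=5 SN=9XF2L38300009411DFVH",
    "Caution,1192,29216,Smart Array,Physical drive inserted,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Hot-plug drive inserted, Port=2I Box=2 Bay=5 SN=9XF2L38300009411DFVH",
    "Critical,1192,29218,Smart Array,Physical drive removed,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Hot-plug drive removed, Port=1I Box=2 Bay=2 SN=9XF2L2JE000094141M37",
    "Caution,1192,29221,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] State change, logical drive 0, new state=FAILED",
    "Caution,1192,29239,Smart Array,Physical drive inserted,,0x00,05/30/2017 09:10:08,[05/30 10:45:57] Hot-plug drive inserted, Port=1I Box=2 Bay=2 SN=9XF2L2JE000094141M37",
    "Critical,1192,29314,Smart Array,Physical drive removed,,0x00,05/30/2017 09:11:18,[05/30 10:46:36] Hot-plug drive removed, Port=2I Box=2 Bay=7 SN=9XF2L2BM00009413GJFD",
    "Caution,1192,29316,Smart Array,Physical drive inserted,,0x00,05/30/2017 09:11:18,[05/30 10:46:57] Hot-plug drive inserted, Port=2I Box=2 Bay=7 SN=9XF2L2BM00009413GJFD",
]

for time, action, drive in hot_plug_timeline(SAMPLE):
    print(time, action, drive)
```

Entries that are not hot-plug events, such as the state change to FAILED, are skipped, so the output is a compact list of removals and insertions per drive that can be compared against the mirror pairing.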
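Resolution:
1. Replace the drive backplane, then first pull the faulty drives in bay 2 and bay 7 (removing these two drives does not affect the integrity of the array data).
2. Reboot the server, re-enable the array so that the system boots, and back up the data.
3. Replace the faulty drives in bay 2 and bay 7, then update the server firmware with the latest SW Bundle.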
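The claim that bay 2 and bay 7 can be pulled without affecting the array follows from the mirror pairing reported by ADU: a RAID 10 logical drive remains readable as long as every mirror pair keeps at least one member. A minimal Python sketch of that check is shown here. The names `MIRROR_PAIRS` and `array_survives` are illustrative assumptions, and the check covers only pair membership, not whether the surviving member holds current data.

```python
# Mirror pairs taken from the ADU "Mirror/Parity Group Information" table
# (bay 1 with bay 4, bay 2 with bay 5, bay 3 with bay 6). The hot spare in
# bay 7 (2I:2:7) is not a member of any pair.
MIRROR_PAIRS = [
    ("1I:2:1", "1I:2:4"),
    ("1I:2:2", "2I:2:5"),
    ("1I:2:3", "2I:2:6"),
]

def array_survives(missing_drives):
    """True if no mirror pair has lost both members (hypothetical helper)."""
    missing = set(missing_drives)
    return all(not (a in missing and b in missing) for a, b in MIRROR_PAIRS)

# Step 1 of the resolution: pull bay 2 and the hot spare in bay 7.
print(array_survives({"1I:2:2", "2I:2:7"}))

# The failure seen in the logs: bay 2 and bay 5 absent at the same time.
print(array_survives({"1I:2:2", "2I:2:5"}))
```

The first call covers the planned removal, where bay 5 still carries the mirror copy for bay 2. The second call reproduces the logged failure, where both members of one pair were absent at once.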
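Lessons learned:
1. Finding the time of the array failure in the logs, and exactly how the drives make up the array, is very helpful for analyzing the problem.
2. For array, storage, and drive problems, collect complete AHS and ADU logs.
3. The drive M&P records are very useful for determining whether a drive has a hardware problem and whether the drive backplane is working properly.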
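The way the M&P counters were read in this case can be sketched in a few lines of Python: a drive with only bus faults points toward the link or backplane, while a drive that also shows read or recovery errors is itself suspect. This is an illustrative heuristic only. The choice of counters in `DRIVE_ERROR_KEYS` and the function name `classify` are assumptions, not an HP or H3C rule, and the values are copied from the statistics quoted in this case.

```python
# Selected M&P counters per drive, as hex strings copied from the ADU report.
DRIVES = {
    "1I:2:2": {"Read Errors Hard": "0x00000000", "Write Errors Hard": "0x00000000",
               "Recovers Failed Read": "0x0002", "Recovers Failed Write": "0x0000",
               "Bus Faults": "0x00000016"},
    "2I:2:5": {"Read Errors Hard": "0x00000000", "Write Errors Hard": "0x00000000",
               "Recovers Failed Read": "0x0000", "Recovers Failed Write": "0x0000",
               "Bus Faults": "0x00000016"},
    "2I:2:7": {"Read Errors Hard": "0x00000001", "Write Errors Hard": "0x00000000",
               "Recovers Failed Read": "0x0000", "Recovers Failed Write": "0x0000",
               "Bus Faults": "0x00000016"},
}

# Counters treated as drive-side errors in this sketch (an assumption).
DRIVE_ERROR_KEYS = (
    "Read Errors Hard",
    "Write Errors Hard",
    "Recovers Failed Read",
    "Recovers Failed Write",
)

def classify(stats):
    """Label a drive from its counters (hypothetical helper)."""
    drive_errors = sum(int(stats[key], 16) for key in DRIVE_ERROR_KEYS)
    bus_faults = int(stats["Bus Faults"], 16)
    if drive_errors and bus_faults:
        return "drive errors and bus faults"
    if drive_errors:
        return "drive errors only"
    if bus_faults:
        return "bus faults only"
    return "clean"

for drive, stats in DRIVES.items():
    print(drive, classify(stats))
```

With the values from this case, bay 2 and bay 7 fall into the group with drive-side errors, and bay 5 shows bus faults only, which matches the conclusion that the backplane is the main cause and that bay 5 itself is healthy.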
