1. Determine the cause of the fault
root@test # fmadm faulty
--------------- ------------------------------------ -------------- ---------
TIME            EVENT-ID                             MSG-ID         SEVERITY
--------------- ------------------------------------ -------------- ---------
Oct 30 22:11:10 8ad90d8d-3e4f-432d-a233-84f8627dcf10 DISK-8000-0X   Major

Problem Status    : open
Diag Engine       : eft / 1.16
System
    Manufacturer  : Oracle Corporation
    Name          : SPARC T7-1
    Part_Number   : 34429612+1+1
    Serial_Number : AK00373594
    Host_ID       : 86bb603a

----------------------------------------
Suspect 1 of 1 :
   Problem class : fault.io.disk.predictive-failure
   Certainty     : 100%
   Affects       : dev:///:devid=id1,sd@n5000cca02f1d2524//scsi_vhci/disk@g5000cca02f1d2524
   Status        : faulted but still providing degraded service

   FRU
     Status           : faulty
     Location         : "/SYS/DBP/HDD0"
     Manufacturer     : HGST
     Name             : H101860SFSUN600G
     Part_Number      : HGST-H101860SFSUN600G
     Revision         : A990
     Serial_Number    : 001619FJ0WNC--------03GJ0WNC
     Chassis
        Manufacturer  : Oracle Corporation
        Name          : SPARC T7-1
        Part_Number   : 34429612+1+1
        Serial_Number : AK00373594

Description : SMART health-monitoring firmware reported that a disk
              failure is imminent.

Response    : A hot-spare disk may have been activated.

Impact      : It is likely that the continued operation of
              this disk will result in data loss.

Action      : Use 'fmadm faulty' to provide a more detailed view of this event.
              Please refer to the associated reference document at
              http://support.oracle.com/msg/DISK-8000-0X for the latest service
              procedures and policies regarding this diagnosis.
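For a disk swap, the two fields in this report that matter most are the problem class and the FRU location. The following is a minimal sketch of pulling them out of the `fmadm faulty` text, assuming the field labels look like the ones in the output above. `fault_summary` is a hypothetical helper name, not part of FMA.

```shell
# fault_summary: read `fmadm faulty` output on stdin and print
# "<problem class> <FRU location>" for each Location line found.
# Hypothetical helper: it only pattern-matches the labels seen above.
fault_summary() {
    awk '
        /^[ \t]*Problem class[ \t]*:/ {
            sub(/^[^:]*:[ \t]*/, "")   # drop the label and the colon
            sub(/[ \t]+$/, "")         # drop trailing blanks
            class = $0
        }
        /^[ \t]*Location[ \t]*:/ {
            sub(/^[^:]*:[ \t]*/, "")
            sub(/[ \t]+$/, "")
            gsub(/"/, "")              # the location is printed in quotes
            print class, $0
        }
    '
}
```

On the host it would be fed from the real command, for example `fmadm faulty | fault_summary`, so that the slot name (here `/SYS/DBP/HDD0`) is on hand before anyone touches the chassis.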
root@test # more /var/adm/messages
Nov 8 16:23:13 zbwl03 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci/disk@g5000cca02f1d2524 (sd5):
Nov 8 16:23:13 zbwl03 Error for Command: read(10) Error Level: Informational
Nov 8 16:23:13 zbwl03 scsi: [ID 243001 kern.info] Requested Block: 394604487 Error Block: 394604487
Nov 8 16:23:13 zbwl03 scsi: [ID 243001 kern.info] Vendor: HGST Serial Number: 1619FJ0WNC
Nov 8 16:23:13 zbwl03 scsi: [ID 243001 kern.info] Sense Key: Soft_Error
Nov 8 16:23:13 zbwl03 scsi: [ID 243001 kern.info] ASC: 0x5d (<vendor unique code 0x5d>), ASCQ: 0x90, FRU: 0x90
Nov 8 17:06:27 zbwl03 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci/disk@g5000cca02f1d2524 (sd5):
Nov 8 17:06:27 zbwl03 Error for Command: read(10) Error Level: Informational
Nov 8 17:06:27 zbwl03 scsi: [ID 243001 kern.info] Requested Block: 118860647 Error Block: 118860647
Nov 8 17:06:27 zbwl03 scsi: [ID 243001 kern.info] Vendor: HGST Serial Number: 1619FJ0WNC
Nov 8 17:06:27 zbwl03 scsi: [ID 243001 kern.info] Sense Key: Soft_Error
Nov 8 17:06:27 zbwl03 scsi: [ID 243001 kern.info] ASC: 0x5d (<vendor unique code 0x5d>), ASCQ: 0x90, FRU: 0x90
Nov 8 17:11:11 zbwl03 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci/disk@g5000cca02f1d2524 (sd5):
Nov 8 17:11:11 zbwl03 Error for Command: read(10) Error Level: Informational
Nov 8 17:11:11 zbwl03 scsi: [ID 243001 kern.info] Requested Block: 86417744 Error Block: 86417744
Nov 8 17:11:11 zbwl03 scsi: [ID 243001 kern.info] Vendor: HGST Serial Number: 1619FJ0WNC
Nov 8 17:11:11 zbwl03 scsi: [ID 243001 kern.info] Sense Key: Soft_Error
Nov 8 17:11:11 zbwl03 scsi: [ID 243001 kern.info] ASC: 0x5d (<vendor unique code 0x5d>), ASCQ: 0x90, FRU: 0x90
Nov 8 17:36:14 zbwl03 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci/disk@g5000cca02f1d2524 (sd5):
Nov 8 17:36:14 zbwl03 Error for Command: read(10) Error Level: Informational
Nov 8 17:36:14 zbwl03 scsi: [ID 243001 kern.info] Requested Block: 118915795 Error Block: 118915795
Nov 8 17:36:14 zbwl03 scsi: [ID 243001 kern.info] Vendor: HGST Serial Number: 1619FJ0WNC
Nov 8 17:36:14 zbwl03 scsi: [ID 243001 kern.info] Sense Key: Soft_Error
Nov 8 17:36:14 zbwl03 scsi: [ID 243001 kern.info] ASC: 0x5d (<vendor unique code 0x5d>), ASCQ: 0x90, FRU: 0x90
Nov 8 17:54:11 zbwl03 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci/disk@g5000cca02f1d2524 (sd5):
Nov 8 17:54:11 zbwl03 Error for Command: read(10) Error Level: Informational
Nov 8 17:54:11 zbwl03 scsi: [ID 243001 kern.info] Requested Block: 86417744 Error Block: 86417744
Nov 8 17:54:11 zbwl03 scsi: [ID 243001 kern.info] Vendor: HGST Serial Number: 1619FJ0WNC
Nov 8 17:54:11 zbwl03 scsi: [ID 243001 kern.info] Sense Key: Soft_Error
Nov 8 17:54:11 zbwl03 scsi: [ID 243001 kern.info] ASC: 0x5d (<vendor unique code 0x5d>), ASCQ: 0x90, FRU: 0x90
Nov 8 19:11:49 zbwl03 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci/disk@g5000cca02f1d2524 (sd5):
Nov 8 19:11:49 zbwl03 Error for Command: read(10) Error Level: Informational
Nov 8 19:11:49 zbwl03 scsi: [ID 243001 kern.info] Requested Block: 211206686 Error Block: 211206686
Nov 8 19:11:49 zbwl03 scsi: [ID 243001 kern.info] Vendor: HGST Serial Number: 1619FJ0WNC
Nov 8 19:11:49 zbwl03 scsi: [ID 243001 kern.info] Sense Key: Soft_Error
Nov 8 19:11:49 zbwl03 scsi: [ID 243001 kern.info] ASC: 0x5d (<vendor unique code 0x5d>), ASCQ: 0x90, FRU: 0x90
Nov 8 20:16:45 zbwl03 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci/disk@g5000cca02f1d2524 (sd5):
Nov 8 20:16:45 zbwl03 Error for Command: read(10) Error Level: Informational
Nov 8 20:16:45 zbwl03 scsi: [ID 243001 kern.info] Requested Block: 213848167 Error Block: 213848167
Nov 8 20:16:45 zbwl03 scsi: [ID 243001 kern.info] Vendor: HGST Serial Number: 1619FJ0WNC
Nov 8 20:16:45 zbwl03 scsi: [ID 243001 kern.info] Sense Key: Soft_Error
Nov 8 20:16:45 zbwl03 scsi: [ID 243001 kern.info] ASC: 0x5d (<vendor unique code 0x5d>), ASCQ: 0x90, FRU: 0x90
Nov 8 20:21:55 zbwl03 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci/disk@g5000cca02f1d2524 (sd5):
Nov 8 20:21:55 zbwl03 Error for Command: read(10) Error Level: Informational
Nov 8 20:21:55 zbwl03 scsi: [ID 243001 kern.info] Requested Block: 209782185 Error Block: 209782185
Nov 8 20:21:55 zbwl03 scsi: [ID 243001 kern.info] Vendor: HGST Serial Number: 1619FJ0WNC
Nov 8 20:21:55 zbwl03 scsi: [ID 243001 kern.info] Sense Key: Soft_Error
Nov 8 20:21:55 zbwl03 scsi: [ID 243001 kern.info] ASC: 0x5d (<vendor unique code 0x5d>), ASCQ: 0x90, FRU: 0x90
Nov 8 20:27:08 zbwl03 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci/disk@g5000cca02f1d2524 (sd5):
Nov 8 20:27:08 zbwl03 Error for Command: read(10) Error Level: Informational
Nov 8 20:27:08 zbwl03 scsi: [ID 243001 kern.info] Requested Block: 213848182 Error Block: 213848182
Nov 8 20:27:08 zbwl03 scsi: [ID 243001 kern.info] Vendor: HGST Serial Number: 1619FJ0WNC
Nov 8 20:27:08 zbwl03 scsi: [ID 243001 kern.info] Sense Key: Soft_Error
Nov 8 20:27:08 zbwl03 scsi: [ID 243001 kern.info] ASC: 0x5d (<vendor unique code 0x5d>), ASCQ: 0x90, FRU: 0x90
Nov 8 20:37:29 zbwl03 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci/disk@g5000cca02f1d2524 (sd5):
Nov 8 20:37:29 zbwl03 Error for Command: read(10) Error Level: Informational
Nov 8 20:37:29 zbwl03 scsi: [ID 243001 kern.info] Requested Block: 211019254 Error Block: 211019254
Nov 8 20:37:29 zbwl03 scsi: [ID 243001 kern.info] Vendor: HGST Serial Number: 1619FJ0WNC
Nov 8 20:37:29 zbwl03 scsi: [ID 243001 kern.info] Sense Key: Soft_Error
Nov 8 20:37:29 zbwl03 scsi: [ID 243001 kern.info] ASC: 0x5d (<vendor unique code 0x5d>), ASCQ: 0x90, FRU: 0x90
Nov 8 20:42:42 zbwl03 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci/disk@g5000cca02f1d2524 (sd5):
Nov 8 20:42:42 zbwl03 Error for Command: read(10) Error Level: Informational
Nov 8 20:42:42 zbwl03 scsi: [ID 243001 kern.info] Requested Block: 211955388 Error Block: 211955388
Nov 8 20:42:42 zbwl03 scsi: [ID 243001 kern.info] Vendor: HGST Serial Number: 1619FJ0WNC
Nov 8 20:42:42 zbwl03 scsi: [ID 243001 kern.info] Sense Key: Soft_Error
Nov 8 20:42:42 zbwl03 scsi: [ID 243001 kern.info] ASC: 0x5d (<vendor unique code 0x5d>), ASCQ: 0x90, FRU: 0x90
Nov 8 20:58:15 zbwl03 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci/disk@g5000cca02f1d2524 (sd5):
Nov 8 20:58:15 zbwl03 Error for Command: read(10) Error Level: Informational
Nov 8 20:58:15 zbwl03 scsi: [ID 243001 kern.info] Requested Block: 213894761 Error Block: 213894761
Nov 8 20:58:15 zbwl03 scsi: [ID 243001 kern.info] Vendor: HGST Serial Number: 1619FJ0WNC
Nov 8 20:58:15 zbwl03 scsi: [ID 243001 kern.info] Sense Key: Soft_Error
Nov 8 20:58:15 zbwl03 scsi: [ID 243001 kern.info] ASC: 0x5d (<vendor unique code 0x5d>), ASCQ: 0x90, FRU: 0x90
Nov 8 21:08:38 zbwl03 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci/disk@g5000cca02f1d2524 (sd5): Serial Number: 1619FJ0WNC

root@test # zpool status rpool
  pool: rpool
 state: DEGRADED
status: One or more devices are unavailable in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or 'fmadm repaired', or replace the device
        with 'zpool replace'.
        Run 'zpool status -v' to see device specific details.
  scan: resilvered 191G in 18m17s with 0 errors on Wed Aug 31 20:15:31 2016

config:

        NAME                       STATE     READ WRITE CKSUM
        rpool                      DEGRADED     0     0     0
          mirror-0                 DEGRADED     0     0     0
            c0t5000CCA02F1D2524d0  DEGRADED     0     0     0
            c0t5000CCA02F1D7534d0  ONLINE       0     0     0

errors: No known data errors
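Paging through the log by hand works for one disk, but a per-device tally makes the pattern easier to see. A small sketch, assuming the warning lines carry the `/scsi_vhci/disk@...` path as a separate whitespace-delimited field, as they do above. `count_disk_warnings` is a hypothetical helper name.

```shell
# count_disk_warnings: read a syslog file on stdin and print
# "<count> <device path>" for every disk that logged a WARNING line.
count_disk_warnings() {
    awk '
        /WARNING:/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^\/scsi_vhci\/disk@/) count[$i]++
        }
        END { for (dev in count) print count[dev], dev }
    '
}
```

On the host it could be run as `count_disk_warnings < /var/adm/messages`. A count that keeps growing for a single device, as with sd5 here, is consistent with the predictive-failure diagnosis from FMA.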
2. Detailed procedure
root@test # zpool offline rpool c0t5000CCA02F1D2524d0
root@test # zpool status rpool
  pool: rpool
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: resilvered 191G in 18m17s with 0 errors on Wed Aug 31 20:15:31 2016

config:

        NAME                       STATE     READ WRITE CKSUM
        rpool                      DEGRADED     0     0     0
          mirror-0                 DEGRADED     0     0     0
            c0t5000CCA02F1D2524d0  OFFLINE      0     0     0
            c0t5000CCA02F1D7534d0  ONLINE       0     0     0

errors: No known data errors

root@test # cfgadm -al
Ap_Id                     Type                Receptacle  Occupant      Condition
c4                        fc                  connected   unconfigured  unknown
c5                        fc-fabric           connected   configured    unknown
c5::2100000e1ecb1261      unknown             connected   unconfigured  unknown
c5::2100000e1ecc7100      disk                connected   configured    unknown
c5::2100000e1ecc8320      disk                connected   configured    unknown
c6                        scsi-sas            connected   unconfigured  unknown
c7                        scsi-sas            connected   configured    unknown
c7::w5000cca02f1d2524,0   disk-path           connected   configured    unknown
c8                        scsi-sas            connected   configured    unknown
c8::w5000cca02f1d7535,0   disk-path           connected   configured    unknown
c9                        fc                  connected   unconfigured  unknown
c11                       fc-fabric           connected   configured    unknown
c11::2100000e1ecb11f1     unknown             connected   unconfigured  unknown
c11::2100000e1ecc8350     disk                connected   configured    unknown
c11::2100000e1ecc84c0     disk                connected   configured    unknown
usb0/1                    unknown             empty       unconfigured  ok
usb0/2                    unknown             empty       unconfigured  ok
usb0/3                    usb-hub             connected   configured    ok
usb0/3.1                  usb-device          connected   configured    ok
usb0/3.2                  usb-communications  connected   configured    ok
usb0/3.3                  unknown             empty       unconfigured  ok
usb0/3.4                  unknown             empty       unconfigured  ok
usb0/4                    usb-hub             connected   configured    ok
usb0/4.1                  unknown             empty       unconfigured  ok
usb0/4.2                  usb-storage         connected   configured    ok
usb0/5                    unknown             empty       unconfigured  ok
usb0/6                    unknown             empty       unconfigured  ok
usb0/7                    usb-hub             connected   configured    ok
usb0/7.1                  unknown             empty       unconfigured  ok
usb0/7.2                  unknown             empty       unconfigured  ok
usb0/7.3                  unknown             empty       unconfigured  ok
usb0/7.4                  unknown             empty       unconfigured  ok
usb0/8                    usb-hub             connected   configured    ok
usb0/8.1                  usb-device          connected   configured    ok
usb0/8.2                  unknown             empty       unconfigured  ok
Note the attachment point ID (Ap_Id) of the faulty disk, then unconfigure it
c7::w5000cca02f1d2524,0
root@test # cfgadm -c unconfigure c7::w5000cca02f1d2524,0
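Matching the pool device name to its Ap_Id by eye is error-prone with this many attachment points. The lookup can be sketched as below, assuming the `cXtWWNdN` naming and the `disk-path` rows seen in the listing above. Both function names are hypothetical.

```shell
# disk_wwn: turn a cXt<WWN>dN device name into its lowercase WWN.
disk_wwn() {
    printf '%s\n' "$1" |
        sed -e 's/^c[0-9]*t//' -e 's/d[0-9]*$//' |
        tr '[:upper:]' '[:lower:]'
}

# find_ap_id: read `cfgadm -al` output on stdin and print the disk-path
# attachment points whose Ap_Id contains "w<WWN>".
find_ap_id() {
    awk -v wwn="$1" '$2 == "disk-path" && index($1, "w" wwn) { print $1 }'
}
```

On the host this might be used as `cfgadm -al | find_ap_id "$(disk_wwn c0t5000CCA02F1D2524d0)"`. An empty result does not prove the disk is absent: in the listing above the healthy mirror half is c0t5000CCA02F1D7534d0 but its Ap_Id is c8::w5000cca02f1d7535,0, so the two identifiers can differ in the last digit and a manual check is still needed.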
Pull out the faulty disk and insert the new disk
root@test # cfgadm -al
Ap_Id                     Type                Receptacle  Occupant      Condition
c4                        fc                  connected   unconfigured  unknown
c5                        fc-fabric           connected   configured    unknown
c5::2100000e1ecb1261      unknown             connected   unconfigured  unknown
c5::2100000e1ecc7100      disk                connected   configured    unknown
c5::2100000e1ecc8320      disk                connected   configured    unknown
c6                        scsi-sas            connected   unconfigured  unknown
c7                        scsi-sas            connected   configured    unknown
c7::w5000cca02f1091b9,0   disk-path           connected   configured    unknown
c8                        scsi-sas            connected   configured    unknown
c8::w5000cca02f1d7535,0   disk-path           connected   configured    unknown
c9                        fc                  connected   unconfigured  unknown
c11                       fc-fabric           connected   configured    unknown
c11::2100000e1ecb11f1     unknown             connected   unconfigured  unknown
c11::2100000e1ecc8350     disk                connected   configured    unknown
c11::2100000e1ecc84c0     disk                connected   configured    unknown
usb0/1                    unknown             empty       unconfigured  ok
usb0/2                    unknown             empty       unconfigured  ok
usb0/3                    usb-hub             connected   configured    ok
usb0/3.1                  usb-device          connected   configured    ok
usb0/3.2                  usb-communications  connected   configured    ok
usb0/3.3                  unknown             empty       unconfigured  ok
usb0/3.4                  unknown             empty       unconfigured  ok
usb0/4                    usb-hub             connected   configured    ok
usb0/4.1                  unknown             empty       unconfigured  ok
usb0/4.2                  usb-storage         connected   configured    ok
usb0/5                    unknown             empty       unconfigured  ok
usb0/6                    unknown             empty       unconfigured  ok
usb0/7                    usb-hub             connected   configured    ok
usb0/7.1                  unknown             empty       unconfigured  ok
usb0/7.2                  unknown             empty       unconfigured  ok
usb0/7.3                  unknown             empty       unconfigured  ok
usb0/7.4                  unknown             empty       unconfigured  ok
usb0/8                    usb-hub             connected   configured    ok
usb0/8.1                  usb-device          connected   configured    ok
usb0/8.2                  unknown             empty       unconfigured  ok

root@test # cfgadm -c configure c7::w5000cca02f1091b9,0
root@test # format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c0t5000CCA02F1091B8d0 <HGST-H101860SFSUN600G-A990-558.91GB> solaris
          /scsi_vhci/disk@g5000cca02f1091b8
          /dev/chassis/SYS/DBP/HDD0/disk
       1. c0t5000CCA02F1D7534d0 <HGST-H101860SFSUN600G-A990-558.91GB>
          /scsi_vhci/disk@g5000cca02f1d7534
          /dev/chassis/SYS/DBP/HDD1/disk
       2. c2t0d0 <MICRON-eUSB DISK-1112 cyl 246 alt 0 hd 255 sec 63>
          /pci@300/pci@2/usb@0/hub@4/storage@2/disk@0,0
          /dev/chassis/SYS/MB/EUSB-DISK/disk
       3. c0t6000B08414B303033373335353600004d0 <Oracle-Oracle FS1-2-6203-800.83GB>
          /scsi_vhci/ssd@g6000b08414b303033373335353600004
       4. c0t6000B08414B303033373335353600005d0 <Oracle-Oracle FS1-2-6203-800.83GB>
          /scsi_vhci/ssd@g6000b08414b303033373335353600005
       5. c0t6000B08414B303033373335353600003d0 <Oracle-Oracle FS1-2-6203-2.00TB>
          /scsi_vhci/ssd@g6000b08414b303033373335353600003
       6. c0t6000B08414B303033373335353600000d0 <Oracle-Oracle FS1-2-6203-2.00TB>
          /scsi_vhci/ssd@g6000b08414b303033373335353600000
       7. c0t6000B08414B303033373335353600001d0 <Oracle-Oracle FS1-2-6203-2.00TB>
          /scsi_vhci/ssd@g6000b08414b303033373335353600001
       8. c0t6000B08414B303033373335353600002d0 <Oracle-Oracle FS1-2-6203-2.00TB>
          /scsi_vhci/ssd@g6000b08414b303033373335353600002
       9. c0t6000B08414B303033373335353600006d0 <Oracle-Oracle FS1-2-6203 cyl 575 alt 2 hd 128 sec 32>
          /scsi_vhci/ssd@g6000b08414b303033373335353600006
Specify disk (enter its number): ^C

root@test # zpool help replace
usage:
        replace [-f] <pool> <device> [new-device]

root@test # zpool replace rpool c0t5000CCA02F1D2524d0 c0t5000CCA02F1091B8d0
Make sure to wait until resilver is done before rebooting.

root@zbwl03 # zpool status rpool
  pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function in a degraded state.
action: Wait for the resilver to complete.
        Run 'zpool status -v' to see device specific details.
  scan: resilver in progress since Tue Nov 14 10:38:00 2017
    74.5G scanned out of 165G at 2.87G/s, 31s to go
    0 resilvered

config:

        NAME                         STATE     READ WRITE CKSUM
        rpool                        DEGRADED     0     0     0
          mirror-0                   DEGRADED     0     0     0
            replacing-0              DEGRADED     0     0     0
              c0t5000CCA02F1D2524d0  OFFLINE      0     0     0
              c0t5000CCA02F1091B8d0  DEGRADED     0     0     0  (resilvering)
            c0t5000CCA02F1D7534d0    ONLINE       0     0     0

errors: No known data errors

root@test # zpool status rpool
  pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function in a degraded state.
action: Wait for the resilver to complete.
        Run 'zpool status -v' to see device specific details.
  scan: resilver in progress since Tue Nov 14 10:38:00 2017
    165G scanned
    34.4G resilvered at 183M/s, 20.91% done, 12m08s to go

config:

        NAME                         STATE     READ WRITE CKSUM
        rpool                        DEGRADED     0     0     0
          mirror-0                   DEGRADED     0     0     0
            replacing-0              DEGRADED    14     0     0
              c0t5000CCA02F1D2524d0  OFFLINE      0     0     0
              c0t5000CCA02F1091B8d0  DEGRADED     0     0     0  (resilvering)
            c0t5000CCA02F1D7534d0    ONLINE       0     0     0

errors: No known data errors

root@test # zpool status rpool
  pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function in a degraded state.
action: Wait for the resilver to complete.
        Run 'zpool status -v' to see device specific details.
  scan: resilver in progress since Tue Nov 14 10:38:00 2017
    165G scanned
    161G resilvered at 157M/s, 97.98% done, 21s to go

config:

        NAME                         STATE     READ WRITE CKSUM
        rpool                        DEGRADED     0     0     0
          mirror-0                   DEGRADED     0     0     0
            replacing-0              DEGRADED    14     0     0
              c0t5000CCA02F1D2524d0  OFFLINE      0     0     0
              c0t5000CCA02F1091B8d0  DEGRADED     0     0     0  (resilvering)
            c0t5000CCA02F1D7534d0    ONLINE       0     0     0

errors: No known data errors

root@test # zpool status rpool
  pool: rpool
 state: DEGRADED
  scan: resilvered 165G in 19m26s with 0 errors on Tue Nov 14 10:57:26 2017

config:

        NAME                         STATE     READ WRITE CKSUM
        rpool                        DEGRADED     0     0     0
          mirror-0                   DEGRADED     0     0     0
            replacing-0              DEGRADED    14     0     0
              c0t5000CCA02F1D2524d0  OFFLINE      0     0     0
              c0t5000CCA02F1091B8d0  ONLINE       0     0     0
            c0t5000CCA02F1D7534d0    ONLINE       0     0     0

errors: No known data errors

root@test # zpool status rpool
  pool: rpool
 state: ONLINE
  scan: resilvered 165G in 19m26s with 0 errors on Tue Nov 14 10:57:26 2017

config:

        NAME                       STATE     READ WRITE CKSUM
        rpool                      ONLINE       0     0     0
          mirror-0                 ONLINE       0     0     0
            c0t5000CCA02F1091B8d0  ONLINE       0     0     0
            c0t5000CCA02F1D7534d0  ONLINE       0     0     0

errors: No known data errors
root@test #
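Rather than re-running `zpool status` by hand until the resilver finishes, the wait can be scripted. A rough sketch, assuming line-oriented `zpool status` output like the transcript above. The function names and the 30-second default interval are choices made for this sketch, not anything ZFS prescribes.

```shell
# pool_state: read `zpool status` output on stdin, print the "state:" value.
pool_state() {
    awk '$1 == "state:" { print $2; exit }'
}

# resilver_percent: print NN.NN from an "NN.NN% done" progress line, if any.
resilver_percent() {
    sed -n 's/.* \([0-9][0-9.]*\)% done.*/\1/p' | head -n 1
}

# wait_for_pool_online: poll until the pool reports ONLINE.
# Usage: wait_for_pool_online <pool> [interval-seconds]
wait_for_pool_online() {
    pool=$1
    interval=${2:-30}   # arbitrary default for this sketch
    until [ "$(zpool status "$pool" | pool_state)" = "ONLINE" ]; do
        sleep "$interval"
    done
}
```

The loop waits for `ONLINE` rather than for the progress line to disappear because, in the transcript above, the pool still reported `DEGRADED` with the `replacing-0` vdev present right after the resilver completed, and only showed `ONLINE` on the following check.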
Original article by kepupublish. If you repost it, please credit the source: https://blog.ytso.com/182929.html