Let me first explain my hard drive setup:
Six SSDs, bought at most two at a time. All are configured as eSATA (hotswap) because they all sit in a 5.25" bay hotswap cage. The cage itself is a simple passthrough device. It only has an indicator LED for each drive, and for those to work the drive needs to support an activity LED.
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 1 238.5G 0 disk
├─sda1 8:1 1 512M 0 part
├─sda2 8:2 1 3.5G 0 part
│ └─md126 9:126 0 17.4G 0 raid5
└─sda3 8:3 1 234.5G 0 part
sdb 8:16 1 111.8G 0 disk
├─sdb1 8:17 1 512M 0 part
├─sdb2 8:18 1 3.5G 0 part
│ └─md126 9:126 0 17.4G 0 raid5
└─sdb3 8:19 1 107.8G 0 part
sdc 8:32 1 489.1G 0 disk
├─sdc1 8:33 1 512M 0 part
├─sdc2 8:34 1 3.5G 0 part
│ └─md126 9:126 0 17.4G 0 raid5
└─sdc3 8:35 1 485.1G 0 part
sdd 8:48 1 447.1G 0 disk
├─sdd1 8:49 1 512M 0 part
├─sdd2 8:50 1 3.5G 0 part
│ └─md126 9:126 0 17.4G 0 raid5
└─sdd3 8:51 1 443.1G 0 part
sde 8:64 1 447.1G 0 disk
├─sde1 8:65 1 512M 0 part
├─sde2 8:66 1 3.5G 0 part
│ └─md126 9:126 0 17.4G 0 raid5
└─sde3 8:67 1 443.1G 0 part
sdf 8:80 1 447.1G 0 disk
├─sdf1 8:81 1 512M 0 part
├─sdf2 8:82 1 3.5G 0 part
│ └─md126 9:126 0 17.4G 0 raid5
└─sdf3 8:83 1 443.1G 0 part
- The first partition of each device was a /boot partition on mdraid1. I'm not sure whether I have lost that RAID array... All the drives appear as spares now. On the first boot, at least, the partition was available.
- md126 is/was my swap partition on raid5 for the hibernate image. I reformatted it to ext4 to run a test. I dumped data from /dev/urandom to (almost) fill the partition. The data going in and coming back out had the same md5sum (dropping the write cache in between), and I didn't receive any errors while doing that. However, scrubbing the raid device earlier did produce errors on at least four SATA buses.
- The last (third) partition of each device is a btrfs filesystem. Reading and writing to it works, although I've been mounting it ro since I started investigating this problem. I also have backups there, which I have backed up further onto my server.
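The write/read-back check described for md126 can be sketched roughly as below. This is only a minimal Python sketch, not the exact commands that were run: the function names and the target path are hypothetical, and `os.posix_fadvise` with `POSIX_FADV_DONTNEED` is used as a per-file stand-in for dropping caches, which is an assumption about how the cache was cleared.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_and_verify(path, total_bytes, chunk_size=1 << 20):
    """Write random data to `path`, read it back, and compare MD5 sums.

    Returns True when the data read back matches what was written.
    """
    written = hashlib.md5()
    with open(path, "wb") as f:
        remaining = total_bytes
        while remaining > 0:
            block = os.urandom(min(chunk_size, remaining))
            f.write(block)
            written.update(block)
            remaining -= len(block)
        # Push the data to the device before reading it back.
        f.flush()
        os.fsync(f.fileno())
        # Assumption: ask the kernel to drop cached pages for this file so
        # the read-back is more likely to come from the drive itself.
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
    return written.hexdigest() == md5_of_file(path)
```

Called with a file on the test filesystem (for example a hypothetical `/mnt/test/random.bin`) and a size close to the free space, a `False` result would point at data corruption rather than just link resets.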
I encounter lots of ATA errors, but somehow things still work. The only exception is that I lost the (six-drive) raid1 array: its members now show up as spares. Before, while that raid1 array still worked, it usually got stuck when I tried to umount it, but the system didn't freeze. I haven't tried to assemble it yet. The data might still be there, but I also have backups, and it's only /boot anyway.
Here's some more information:
Motherboard: ASRock 970M Pro3
Linux livecd 4.5.2-aufs-r1 #1 SMP Sun Jul 3 17:17:11 UTC 2016 x86_64 AMD FX(tm)-8350 Eight-Core Processor AuthenticAMD GNU/Linux
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx0 port B) (rev 02)
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD/ATI] RD990 I/O Memory Management Unit (IOMMU)
00:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port B)
00:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port D)
00:09.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port H)
00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40)
00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller (rev 42)
00:14.2 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA) (rev 40)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller (rev 40)
00:14.4 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 PCI to PCI Bridge (rev 40)
00:14.5 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
00:15.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB700/SB800/SB900 PCI to PCI bridge (PCIE port 0)
00:15.3 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB900 PCI to PCI bridge (PCIE port 3)
00:16.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:16.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 5
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Fiji [Radeon R9 FURY / NANO Series] (rev ca)
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Fiji HDMI/DP Audio Controller
02:00.0 InfiniBand: Mellanox Technologies MT25204 [InfiniHost III Lx HCA] (rev 20)
03:00.0 USB controller: Etron Technology, Inc. EJ188/EJ198 USB 3.0 Host Controller
06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
[ 1.723018] ata1: SATA max UDMA/133 abar m1024@0xfeb0b000 port 0xfeb0b100 irq 19
[ 1.723021] ata2: SATA max UDMA/133 abar m1024@0xfeb0b000 port 0xfeb0b180 irq 19
[ 1.723023] ata3: SATA max UDMA/133 abar m1024@0xfeb0b000 port 0xfeb0b200 irq 19
[ 1.723025] ata4: SATA max UDMA/133 abar m1024@0xfeb0b000 port 0xfeb0b280 irq 19
[ 1.723027] ata5: SATA max UDMA/133 abar m1024@0xfeb0b000 port 0xfeb0b300 irq 19
[ 1.723029] ata6: SATA max UDMA/133 abar m1024@0xfeb0b000 port 0xfeb0b380 irq 19
[ 2.178802] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 2.179794] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 2.179810] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 2.179825] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 2.179842] ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 2.180125] ata3.00: supports DRM functions and may not be fully accessible
[ 2.180279] ata1.00: ATA-9: SAMSUNG SSD 830 Series, CXM03B1Q, max UDMA/133
[ 2.180281] ata1.00: 500118192 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
[ 2.180484] ata3.00: ATA-10: Crucial_CT525MX300SSD1, M0CR040, max UDMA/133
[ 2.180486] ata3.00: 1025610768 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
[ 2.180559] ata1.00: configured for UDMA/133
[ 2.180792] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 2.181071] ata6.00: ATA-11: KINGSTON SUV400S37480G, 0C3FD6SD, max UDMA/133
[ 2.181073] ata6.00: 937703088 sectors, multi 1: LBA48 NCQ (depth 31/32), AA
[ 2.181197] ata3.00: supports DRM functions and may not be fully accessible
[ 2.181452] ata6.00: configured for UDMA/133
[ 2.182044] ata3.00: configured for UDMA/133
[ 2.186906] ata2.00: ATA-8: KINGSTON SV300S37A120G, 600ABBF0, max UDMA/133
[ 2.186908] ata2.00: 234441648 sectors, multi 1: LBA48 NCQ (depth 31/32), AA
[ 2.186991] ata5.00: ATA-8: KINGSTON SV300S37A480G, 605ABBF2, max UDMA/133
[ 2.186993] ata5.00: 937703088 sectors, multi 1: LBA48 NCQ (depth 31/32), AA
[ 2.187450] ata4.00: ATA-8: KINGSTON SV300S37A480G, 605ABBF2, max UDMA/133
[ 2.187452] ata4.00: 937703088 sectors, multi 1: LBA48 NCQ (depth 31/32), AA
[ 2.192499] ata5.00: configured for UDMA/133
[ 2.192933] ata4.00: configured for UDMA/133
[ 2.193052] ata2.00: configured for UDMA/133
[ 2.220890] ata3.00: Enabling discard_zeroes_data
[ 2.221124] ata3.00: Enabling discard_zeroes_data
[ 2.221476] ata3.00: Enabling discard_zeroes_data
[ 14.640192] ata6.00: exception Emask 0x10 SAct 0x1800000 SErr 0x400000 action 0x6 frozen
[ 14.640194] ata6.00: irq_stat 0x08000000, interface fatal error
[ 14.640196] ata6: SError: { Handshk }
[ 14.640199] ata6.00: failed command: WRITE FPDMA QUEUED
[ 14.640202] ata6.00: cmd 61/80:b8:00:bc:81/00:00:0b:00:00/40 tag 23 ncq 65536 out
[ 14.640204] ata6.00: status: { DRDY }
[ 14.640206] ata6.00: failed command: WRITE FPDMA QUEUED
[ 14.640208] ata6.00: cmd 61/80:c0:80:bc:81/00:00:0b:00:00/40 tag 24 ncq 65536 out
[ 14.640210] ata6.00: status: { DRDY }
[ 14.640212] ata6: hard resetting link
[ 15.096160] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 15.096836] ata6.00: configured for UDMA/133
[ 15.096843] ata6: EH complete
[ 15.108166] ata6.00: exception Emask 0x10 SAct 0x6 SErr 0x400000 action 0x6 frozen
[ 15.108168] ata6.00: irq_stat 0x08000000, interface fatal error
[ 15.108169] ata6: SError: { Handshk }
[ 15.108171] ata6.00: failed command: WRITE FPDMA QUEUED
[ 15.108174] ata6.00: cmd 61/80:08:00:bc:81/00:00:0b:00:00/40 tag 1 ncq 65536 out
[ 15.108176] ata6.00: status: { DRDY }
[ 15.108177] ata6.00: failed command: WRITE FPDMA QUEUED
[ 15.108180] ata6.00: cmd 61/80:10:80:bc:81/00:00:0b:00:00/40 tag 2 ncq 65536 out
[ 15.108181] ata6.00: status: { DRDY }
[ 15.108184] ata6: hard resetting link
[ 15.564138] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 15.564811] ata6.00: configured for UDMA/133
[ 15.564815] ata6: EH complete
I don't want to admit it, but I think it's the SATA controller... Swapping the old motherboard back in would take some time... I wish I had some kind of test bench, but I don't.
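To see whether the errors cluster on one port or are spread across the controller, the log can be tallied per ATA port. The following is a minimal Python sketch that assumes the message format shown in the excerpt above (`ataN.MM: exception Emask ...` and `ataN.MM: failed command: ...`); the function name is hypothetical and other kernel versions may word these lines differently.

```python
import re
from collections import Counter

# Patterns based on the log lines quoted above; treat them as assumptions.
EXCEPTION_RE = re.compile(r"\b(ata\d+)(?:\.\d+)?: exception Emask")
FAILED_CMD_RE = re.compile(r"\b(ata\d+)(?:\.\d+)?: failed command: (.+)$")


def summarize_ata_errors(dmesg_text):
    """Count exceptions per port and failed commands per (port, command)."""
    exceptions = Counter()
    failed = Counter()
    for line in dmesg_text.splitlines():
        match = EXCEPTION_RE.search(line)
        if match:
            exceptions[match.group(1)] += 1
            continue
        match = FAILED_CMD_RE.search(line)
        if match:
            failed[(match.group(1), match.group(2).strip())] += 1
    return exceptions, failed
```

Feeding it the saved output of `dmesg` gives one counter keyed by port (such as `ata6`) and one keyed by port and command name. Errors on many ports at once would fit the controller theory better than a single bad drive or cable would.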
Finally: the whole dmesg from one boot.


