For that I use an external bay (IcyBox dock) where one can easily switch drives. I use the eSATA connector to avoid the USB bridge controller (which would limit drive size).
This worked well for the past years. For some days now, when I power on the bay and drive, I no longer get the kernel entries in dmesg that indicate a new drive was recognized and which device node it was assigned (e.g. /dev/sdh).
So the kernel seems not to recognize them any more.
As this happened with two different external bays and multiple drives, I suspected that the controller was faulty.
Interestingly, if they are powered on during boot they are recognized fine and work until I power them off. After another power-up they are not picked up again.
Of course I don't want to reboot my server just to connect my backup drives.
I have now added an additional controller with eSATA ports (ASM 1601) - no change: it also works during boot but not if the drive is powered up during operation.
Furthermore I booted older kernels (5.6.10 from May and 5.4.6 from Dec 2019), but no luck there either; the same with 5.4.60, which I rebuilt based on an old config from the 5.3 line.
As everything is recognized fine in the BIOS and when available during boot, I'm curious what I can do to narrow it down.
Two controllers, three different eSATA cables, multiple drives, two different bays, and all was ok some weeks ago (not sure what changed or what could have an influence at that level).
Next I took the drive and bay and connected it to a SATA port with a SATA-to-eSATA cable on my workstation (which has no eSATA).
There I could see it detected after issuing
echo 0 0 0 | tee /sys/class/scsi_host/host*/scan
This does not help on the affected machine.
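For reference, a minimal sketch of the same rescan as a small helper. It writes the wildcard triple "- - -" (any channel, target and LUN) instead of "0 0 0"; for single-device SATA ports the two should be equivalent, so this is not expected to behave differently. The function name and the optional sysfs root argument are my own additions, the latter only so the helper can be tried against a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: ask every SCSI host below a sysfs root to rescan.
# The root defaults to /sys; passing another directory is only for dry runs.
rescan_scsi_hosts() {
    sysfs_root="${1:-/sys}"
    for scan_file in "$sysfs_root"/class/scsi_host/host*/scan; do
        # Skip the unexpanded glob when no host exists
        [ -e "$scan_file" ] || continue
        # "- - -" is the wildcard for channel, target and LUN
        if echo "- - -" > "$scan_file"; then
            echo "rescanned ${scan_file%/scan}"
        else
            echo "failed: $scan_file" >&2
        fi
    done
}
```

Run as root it would be called as `rescan_scsi_hosts` and prints one line per host it touched.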
Also, all SATA controllers are in AHCI mode:
lspci -k | grep SATA -A2
00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset 6 port SATA AHCI Controller (rev 06)
Subsystem: Micro-Star International Co., Ltd. [MSI] 5 Series/3400 Series Chipset 6 port SATA AHCI Controller
Kernel driver in use: ahci
02:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 01)
Subsystem: ASMedia Technology Inc. ASM1062 Serial ATA Controller
Kernel driver in use: ahci
03:00.0 SATA controller: JMicron Technology Corp. JMB363 SATA/IDE Controller (rev 03)
Subsystem: Micro-Star International Co., Ltd. [MSI] JMB363 SATA/IDE Controller
Kernel driver in use: ahci
03:00.1 IDE interface: JMicron Technology Corp. JMB363 SATA/IDE Controller (rev 03)
Subsystem: Micro-Star International Co., Ltd. [MSI] JMB363 SATA/IDE Controller
Kernel driver in use: pata_jmicron
In my kernel (5.8.5 gentoo-sources) I have the following active for SATA and AHCI:
CONFIG_SATA_HOST=y
CONFIG_SATA_PMP=y
CONFIG_SATA_AHCI=y
CONFIG_SATA_MOBILE_LPM_POLICY=0
What could be checked? Any power management parameters, any specific hotplug enable parameter I'm not aware of?
Were there changes that might require hotplug-capable devices to be marked somehow?
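One power management knob that could be inspected here is the per-host link power management policy in sysfs. The sketch below only prints the current value for each host; the function name and the optional root argument are hypothetical. As far as I know CONFIG_SATA_MOBILE_LPM_POLICY=0 leaves the firmware setting in place, and aggressive policies are commonly reported to interfere with hotplug detection, but whether that applies to this machine is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print the SATA link power management policy per host.
# Hosts without the attribute (non-AHCI hosts, for example) are skipped.
show_lpm_policy() {
    sysfs_root="${1:-/sys}"
    for policy_file in "$sysfs_root"/class/scsi_host/host*/link_power_management_policy; do
        [ -r "$policy_file" ] || continue
        host_dir="${policy_file%/*}"
        # Prints e.g. "host7: max_performance"
        printf '%s: %s\n' "${host_dir##*/}" "$(cat "$policy_file")"
    done
}
```

If a host reports anything other than max_performance, writing max_performance to that file as root (for instance for host7, the host behind a device shown as "sd 7:0:0:0") would be a cheap experiment before the next disk swap.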
[edit 1]
After booting with one external disk powered on, I did the first part of a backup to that disk (ST3000DM001), unmounted it and suspended it via hdparm. Then I powered off the bay, replaced the disk and powered it on again.
This time the change was recognized. It seems the kernel still expected the old disk at first, as there is a model number mismatch showing in the log.
Afterwards I did my backup to the new disk without problems as well. The kernel log of the swap:
[ 1303.747281] ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 1303.748862] ata8.00: configured for UDMA/133
[ 1337.578270] ata8: SATA link down (SStatus 0 SControl 300)
[ 1350.057299] ata8: softreset failed (device not ready)
[ 1358.268484] ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 1358.268678] ata8.00: model number mismatch 'ST3000DM001-1CH166' != 'HGST HDN726060ALE614'
[ 1358.268680] ata8.00: revalidation failed (errno=-19)
[ 1364.084689] ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 1364.087478] ata8.00: model number mismatch 'ST3000DM001-1CH166' != 'HGST HDN726060ALE614'
[ 1364.087482] ata8.00: revalidation failed (errno=-19)
[ 1364.087484] ata8.00: disabled
[ 1369.716820] ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 1369.732524] ata8.00: ATA-9: HGST HDN726060ALE614, APGNW7JH, max UDMA/133
[ 1369.732528] ata8.00: 11721045168 sectors, multi 0: LBA48 NCQ (depth 32), AA
[ 1369.740322] ata8.00: configured for UDMA/133
[ 1369.742544] sd 7:0:0:0: rejecting I/O to offline device
[ 1369.742580] ata8.00: detaching (SCSI 7:0:0:0)
[ 1369.743047] sd 7:0:0:0: [sdf] Synchronizing SCSI cache
[ 1369.743324] sd 7:0:0:0: [sdf] Stopping disk
[ 1371.180793] ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 1373.583959] ata8.00: configured for UDMA/133
[ 1373.586405] scsi 7:0:0:0: Direct-Access ATA HGST HDN726060AL W7JH PQ: 0 ANSI: 5
[ 1373.586935] sd 7:0:0:0: Attached scsi generic sg6 type 0
[ 1373.587069] sd 7:0:0:0: [sdf] 11721045168 512-byte logical blocks: (6.00 TB/5.46 TiB)
[ 1373.587075] sd 7:0:0:0: [sdf] 4096-byte physical blocks
[ 1373.587114] sd 7:0:0:0: [sdf] Write Protect is off
[ 1373.587119] sd 7:0:0:0: [sdf] Mode Sense: 00 3a 00 00
[ 1373.587183] sd 7:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1373.649062] sd 7:0:0:0: [sdf] Attached SCSI disk
I again unmounted, put the disk to sleep and powered off the bay. I waited 2 min, then swapped the disk and powered on.
Now I see that the disk is removed and ata8 is shown as detached:
[ 2599.672085] ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 2602.395455] ata8.00: configured for UDMA/133
[ 2650.311862] ata8: SATA link down (SStatus 0 SControl 300)
[ 2655.735768] ata8: SATA link down (SStatus 0 SControl 300)
[ 2661.368147] ata8: SATA link down (SStatus 0 SControl 300)
[ 2661.368158] ata8.00: disabled
[ 2661.368205] ata8.00: detaching (SCSI 7:0:0:0)
[ 2661.369011] sd 7:0:0:0: [sdf] Synchronizing SCSI cache
[ 2661.369045] sd 7:0:0:0: [sdf] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 2661.369047] sd 7:0:0:0: [sdf] Stopping disk
[ 2661.369056] sd 7:0:0:0: [sdf] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
After that it does not pick up anything on that connection any more.
If I rescan again with
echo 0 0 0 | tee /sys/class/scsi_host/host*/scan
ata8 is not recognized:
[ 3034.736534] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 3034.859621] ata1.00: configured for UDMA/100
[ 3035.504330] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 3035.517568] ata2.00: configured for UDMA/133
[ 3035.824586] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 3035.826588] ata3.00: supports DRM functions and may not be fully accessible
[ 3035.827059] ata3.00: NCQ Send/Recv Log not supported
[ 3035.827677] ata3.00: supports DRM functions and may not be fully accessible
[ 3035.828083] ata3.00: NCQ Send/Recv Log not supported
[ 3035.828571] ata3.00: configured for UDMA/133
[ 3036.144401] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 3036.145643] ata4.00: configured for UDMA/133
[ 3036.456628] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 3037.379898] ata5.00: configured for UDMA/133
[ 3037.688311] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 3037.691591] ata6.00: configured for UDMA/133
[ 3038.154029] ata10: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 3038.155287] ata10.00: configured for UDMA/133
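To make logs like the ones above easier to compare between attempts, here is a small sketch that reduces kernel log text to the last link state seen per port. The function name is made up, and it only looks at the "SATA link up/down" lines, nothing else.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print the last
# "SATA link up/down" state seen for each ataN port, one port per line.
summarize_ata_links() {
    awk '
        /ata[0-9]+: SATA link (up|down)/ {
            # The timestamp may split into one or two fields, so search
            # for the "ataN:" field instead of using a fixed position.
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^ata[0-9]+:$/) {
                    port = $i
                    sub(/:$/, "", port)
                    state[port] = $(i + 3)   # the word after "SATA link"
                }
            }
        }
        END { for (p in state) print p, state[p] }
    ' | sort
}
```

It would be used as `dmesg | summarize_ata_links`; a port that stays at "down" after a power-up and rescan is the one the kernel has stopped probing.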


