hdd dead?

DaggyStyle
Watchman
Posts: 5969
Joined: Wed Mar 22, 2006 6:57 am

Post by DaggyStyle » Fri Jul 29, 2022 5:03 pm

So, my media HDD is dead...
Here is the boot dmesg:

Code: Select all

[    2.933889] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[    2.933890] ata5.00: irq_stat 0x40000001
[    2.933891] ata5.00: failed command: READ DMA
[    2.933894] ata5.00: cmd c8/00:08:80:70:7b/00:00:00:00:00/e0 tag 8 dma 4096 in
                        res 51/04:08:80:70:7b/00:00:00:00:00/e0 Emask 0x1 (device error)
[    2.933894] ata5.00: status: { DRDY ERR }
[    2.933895] ata5.00: error: { ABRT }
[    2.934027] ata5.00: configured for UDMA/133 (device error ignored)
[    2.934038] sd 4:0:0:0: [sdb] tag#8 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
[    2.934039] sd 4:0:0:0: [sdb] tag#8 Sense Key : Illegal Request [current] 
[    2.934041] sd 4:0:0:0: [sdb] tag#8 Add. Sense: Unaligned write command
[    2.934043] sd 4:0:0:0: [sdb] tag#8 CDB: Read(10) 28 00 00 7b 70 80 00 00 08 00
[    2.934045] blk_update_request: I/O error, dev sdb, sector 8089728 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[    2.934050] ata5: EH complete
[    2.935279] nct6775: Found NCT6791D or compatible chip at 0x2e:0x290
[    2.936810] sr 0:0:0:0: [sr0] scsi3-mmc drive: 48x/12x writer dvd-ram cd/rw xa/form2 cdda tray
[    2.936812] cdrom: Uniform CD-ROM driver Revision: 3.20
[    2.940622] tun: Universal TUN/TAP device driver, 1.6
[    2.947774] VFIO - User Level meta-driver version: 0.3
[    2.950883] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[    2.950887] ata5.00: irq_stat 0x40000001
[    2.950892] ata5.00: failed command: READ DMA
[    2.950901] ata5.00: cmd c8/00:08:80:70:7b/00:00:00:00:00/e0 tag 2 dma 4096 in
                        res 51/04:08:80:70:7b/00:00:00:00:00/e0 Emask 0x1 (device error)
[    2.950904] ata5.00: status: { DRDY ERR }
[    2.950907] ata5.00: error: { ABRT }
[    2.951023] ata5.00: configured for UDMA/133 (device error ignored)
[    2.951029] ata5: EH complete
[    2.958913] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[    2.958917] ata5.00: irq_stat 0x40000001
[    2.958921] ata5.00: failed command: READ DMA
[    2.958931] ata5.00: cmd c8/00:08:80:70:7b/00:00:00:00:00/e0 tag 22 dma 4096 in
                        res 51/04:08:80:70:7b/00:00:00:00:00/e0 Emask 0x1 (device error)
[    2.958934] ata5.00: status: { DRDY ERR }
[    2.958937] ata5.00: error: { ABRT }
[    2.959235] ata5.00: configured for UDMA/133 (device error ignored)
[    2.959260] ata5: EH complete
[    2.963016] vfio_pci: add [8086:a170[ffffffff:ffffffff]] class 0x000000/00000000
[    2.963034] vfio-pci 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    2.966911] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[    2.966915] ata5.00: irq_stat 0x40000001
[    2.966919] ata5.00: failed command: READ DMA
[    2.966929] ata5.00: cmd c8/00:08:80:70:7b/00:00:00:00:00/e0 tag 1 dma 4096 in
                        res 51/04:08:80:70:7b/00:00:00:00:00/e0 Emask 0x1 (device error)
[    2.966932] ata5.00: status: { DRDY ERR }
[    2.966935] ata5.00: error: { ABRT }
[    2.967295] ata5.00: configured for UDMA/133 (device error ignored)
[    2.967320] ata5: EH complete
[    2.968524] sr 0:0:0:0: Attached scsi CD-ROM sr0
[    2.974927] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[    2.974931] ata5.00: irq_stat 0x40000001
[    2.974935] ata5.00: failed command: READ DMA
[    2.974945] ata5.00: cmd c8/00:08:80:70:7b/00:00:00:00:00/e0 tag 3 dma 4096 in
                        res 51/04:08:80:70:7b/00:00:00:00:00/e0 Emask 0x1 (device error)
[    2.974948] ata5.00: status: { DRDY ERR }
[    2.974950] ata5.00: error: { ABRT }
[    2.975241] ata5.00: configured for UDMA/133 (device error ignored)
[    2.975266] ata5: EH complete
[    2.976044] vfio_pci: add [8086:5912[ffffffff:ffffffff]] class 0x000000/00000000
[    2.983910] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[    2.983914] ata5.00: irq_stat 0x40000001
[    2.983918] ata5.00: failed command: READ DMA
[    2.983928] ata5.00: cmd c8/00:08:80:70:7b/00:00:00:00:00/e0 tag 5 dma 4096 in
                        res 51/04:08:80:70:7b/00:00:00:00:00/e0 Emask 0x1 (device error)
[    2.983931] ata5.00: status: { DRDY ERR }
[    2.983934] ata5.00: error: { ABRT }
[    2.984226] ata5.00: configured for UDMA/133 (device error ignored)
[    2.984251] ata5: EH complete
[    2.990917] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[    2.990920] ata5.00: irq_stat 0x40000001
[    2.990925] ata5.00: failed command: READ DMA
[    2.990935] ata5.00: cmd c8/00:08:80:70:7b/00:00:00:00:00/e0 tag 7 dma 4096 in
                        res 51/04:08:80:70:7b/00:00:00:00:00/e0 Emask 0x1 (device error)
[    2.990937] ata5.00: status: { DRDY ERR }
[    2.990940] ata5.00: error: { ABRT }
[    2.991251] ata5.00: configured for UDMA/133 (device error ignored)
[    2.991280] sd 4:0:0:0: [sdb] tag#7 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
[    2.991286] sd 4:0:0:0: [sdb] tag#7 Sense Key : Illegal Request [current] 
[    2.991291] sd 4:0:0:0: [sdb] tag#7 Add. Sense: Unaligned write command
[    2.991295] sd 4:0:0:0: [sdb] tag#7 CDB: Read(10) 28 00 00 7b 70 80 00 00 08 00
[    2.991301] blk_update_request: I/O error, dev sdb, sector 8089728 op 0x0:(READ) flags 0x0 phys_seg 4 prio class 0
[    2.991306] Buffer I/O error on dev sdb, logical block 4044864, async page read
[    2.991309] Buffer I/O error on dev sdb, logical block 4044865, async page read
[    2.991330] ata5: EH complete
smartctl -a /dev/sdb

Code: Select all

smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.76-gentoo-r1] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     ST3000DM001
Serial Number:    ZA500H9A
LU WWN Device Id: 5 000c50 07a7155cd
Firmware Version: CC25
User Capacity:    137,438,952,960 bytes [137 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s
Local Time is:    Fri Jul 29 19:52:18 2022 IDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Read SMART Data failed: scsi error badly formed scsi parameters

=== START OF READ SMART DATA SECTION ===
SMART Status command failed: scsi error badly formed scsi parameters
SMART overall-health self-assessment test result: UNKNOWN!
SMART Status, Attributes and Thresholds cannot be read.

Read SMART Log Directory failed: scsi error badly formed scsi parameters

Read SMART Error Log failed: scsi error badly formed scsi parameters

Read SMART Self-test Log failed: scsi error badly formed scsi parameters

Selective Self-tests/Logging not supported
The HDD is 3 TB. What are the chances it is the cable?
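
As a cross-check on the log above, here is a minimal sketch that decodes the CDB printed by the kernel. It assumes the standard SCSI Read(10) layout (opcode 0x28, a 4-byte big-endian LBA in bytes 2 to 5, a 2-byte block count in bytes 7 and 8); the function name `decode_read10` is made up for illustration.

```python
def decode_read10(cdb_hex: str, sector_size: int = 512) -> dict:
    """Decode a SCSI Read(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a Read(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")    # bytes 2..5: logical block address
    count = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: number of blocks
    return {"lba": lba, "blocks": count, "bytes": count * sector_size}


# CDB copied from the "[sdb] tag#8 CDB: Read(10) ..." line in the dmesg output.
info = decode_read10("28 00 00 7b 70 80 00 00 08 00")
print(info)
```

The decoded LBA matches `sector 8089728` in the `blk_update_request` line, and 8 blocks of 512 bytes match `dma 4096`, so every failure in this excerpt looks like the same 4 KiB read being retried.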
Only two things are infinite, the universe and human stupidity and I'm not sure about the former - Albert Einstein
mike155
Advocate
Posts: 4438
Joined: Fri Sep 17, 2010 11:33 pm
Location: Frankfurt, Germany

Post by mike155 » Fri Jul 29, 2022 5:13 pm

It could be a connection problem. Can you try to attach the drive to a different computer with a different cable? Or at least to a different SATA port with a different cable?
NeddySeagoon
Administrator
Posts: 56087
Joined: Sat Jul 05, 2003 9:37 am
Location: 56N 3W

Post by NeddySeagoon » Fri Jul 29, 2022 5:16 pm

DaggyStyle,

That looks like the HDD is connected over USB and the USB/HDD bridge chip does not support the full SCSI command set.
Can you connect it to the motherboard?
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

Post by DaggyStyle » Fri Jul 29, 2022 6:19 pm

mike155 wrote:It could be a connection problem. Can you try to attach the drive to a different computer with a different cable? Or at least to a different SATA port with a different cable?
that was my intention, I thought I can
NeddySeagoon wrote:DaggyStyle,

That looks like the HDD is connected over USB and the USB/HDD bridge chip does not support the full SCSI command set.
Can you connect it to the motherboard?
It is: it's connected to the motherboard with a SATA cable, on one of the SATA ports.

Post by NeddySeagoon » Fri Jul 29, 2022 6:38 pm

DaggyStyle,

The smart data says

Code: Select all

User Capacity:    137,438,952,960 bytes [137 GB]
DaggyStyle wrote:... the hdd is 3T ...
They can't both be right. By a strange coincidence, 137G is all you can address under some very old BIOSes.
Very old kernels needed the kernel parameter hdd=stroke to turn on 48 bit LBA after the BIOS failed to provide support.

You could also have turned on the Host Protected Area and put most of the space into that. You can't use it there.

Anyway. Until you can see the entire drive, all bets are off.

Pastebin your entire dmesg.
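
The "strange coincidence" can be checked with a little arithmetic. The sketch below assumes the reported figure comes from a sector count capped at the largest 28-bit value, with 512-byte sectors as shown in the smartctl output; `capacity_bytes` is a hypothetical helper, not anything from smartmontools.

```python
SECTOR_SIZE = 512  # bytes per logical sector, as reported by smartctl


def capacity_bytes(lba_bits: int, sector_size: int = SECTOR_SIZE) -> int:
    """Capacity if the sector count is capped at 2**lba_bits - 1."""
    return (2**lba_bits - 1) * sector_size


reported = 137_438_952_960        # "User Capacity" from the smartctl output
limit_28 = capacity_bytes(28)     # ceiling of 28-bit LBA addressing
sectors_3tb = 3 * 10**12 // SECTOR_SIZE  # sectors in a nominal 3 TB drive

print(limit_28 == reported)
print(sectors_3tb.bit_length())   # address bits a 3 TB drive needs
```

The reported capacity is exactly the 28-bit ceiling, while a nominal 3 TB drive needs 33 address bits, which is consistent with the drive's 48-bit capacity information not getting through.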
Regards,

NeddySeagoon


Post by DaggyStyle » Fri Jul 29, 2022 7:51 pm

here: https://dpaste.com/6VTC3JUEZ
I don't know if it is turned on. I didn't do anything; it happened while streaming a file from the device.

Post by NeddySeagoon » Fri Jul 29, 2022 8:02 pm

DaggyStyle,

Code: Select all

[    0.680246] ata5.00: failed to read native max address (err_mask=0x1)
[    0.680257] ata5.00: HPA support seems broken, skipping HPA handling
That failed to read native max address is game over.

I've not seen that before. That means that the kernel can't tell how big the HDD is.

Code: Select all

[    0.727050] ata5.00: status: { DRDY ERR }
suggests the drive never becomes ready.
Try a different SATA data cable.
First on the same motherboard SATA port, then on a different SATA motherboard port.

Nothing matters until that failed to read native max address error goes away.
Regards,

NeddySeagoon


Post by DaggyStyle » Sun Jul 31, 2022 5:41 pm

I've pulled the disk and put it into a SATA-to-USB case. The device is visible, but there are issues reported, see:

Code: Select all

# smartctl -a /dev/sdh
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-5.18.15-gentoo] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-1ER166
Serial Number:    ZA500H9A
LU WWN Device Id: 5 000c50 07a7155cd
Firmware Version: CC25
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database 7.3/5387
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Jul 31 20:39:32 2022 IDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (   80) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 315) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x1085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   119   099   006    Pre-fail  Always       -       205953736
  3 Spin_Up_Time            0x0003   098   093   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   099   099   020    Old_age   Always       -       1089
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   062   060   030    Pre-fail  Always       -       1627599
  9 Power_On_Hours          0x0032   037   037   000    Old_age   Always       -       55446
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       403
183 Runtime_Bad_Block       0x0032   098   098   000    Old_age   Always       -       2
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       2 2 3
189 High_Fly_Writes         0x003a   093   093   000    Old_age   Always       -       7
190 Airflow_Temperature_Cel 0x0022   072   048   045    Old_age   Always       -       28 (Min/Max 24/28)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       755
193 Load_Cycle_Count        0x0032   067   067   000    Old_age   Always       -       67397
194 Temperature_Celsius     0x0022   028   052   000    Old_age   Always       -       28 (0 17 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       18503h+11m+42.001s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       28041850244
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       89685898172

SMART Error Log Version: 1
ATA Error Count: 2
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2 occurred at disk power-on lifetime: 21562 hours (898 days + 10 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 51 00 00 00 00 00  Error: ABRT

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  00 00 00 00 00 00 00 ff      07:13:04.285  NOP [Abort queued commands]
  b0 d4 00 82 4f c2 00 00      07:12:43.664  SMART EXECUTE OFF-LINE IMMEDIATE
  b0 d0 01 00 4f c2 00 00      07:12:43.581  SMART READ DATA
  ec 00 01 00 00 00 00 00      07:12:43.576  IDENTIFY DEVICE
  ec 00 01 00 00 00 00 00      07:12:43.575  IDENTIFY DEVICE

Error 1 occurred at disk power-on lifetime: 21553 hours (898 days + 1 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 51 00 00 00 00 00  Error: ABRT

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  00 00 00 00 00 00 00 ff      00:05:54.302  NOP [Abort queued commands]
  b0 d4 00 81 4f c2 00 00      00:05:33.750  SMART EXECUTE OFF-LINE IMMEDIATE
  b0 d0 01 00 4f c2 00 00      00:05:33.694  SMART READ DATA
  ec 00 01 00 00 00 00 00      00:05:33.689  IDENTIFY DEVICE
  ec 00 01 00 00 00 00 00      00:05:33.688  IDENTIFY DEVICE

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     55446         -
# 2  Extended captive    Interrupted (host reset)      90%     21562         -
# 3  Short captive       Interrupted (host reset)      70%     21553         -
# 4  Short offline       Completed without error       00%     21553         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Any insights?

Post by NeddySeagoon » Sun Jul 31, 2022 6:21 pm

DaggyStyle,

Code: Select all

User Capacity:    3,000,592,982,016 bytes [3.00 TB] 
That's a good start.

Code: Select all

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE 
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0 
  9 Power_On_Hours          0x0032   037   037   000    Old_age   Always       -       55446
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0 
All that says is the drive is old. 55446 hours.

The errors look like problems communicating with the host at power up.

Code: Select all

Error 2 occurred at disk power-on lifetime: 21562 hours 
They are old and have not recurred.

Code: Select all

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     55446        -
That's your recent short test. The other tests are old, around the time of the two errors in the log.

Run the long test. If it passes, it's probably OK for now. I get nervous around 60000 running hours.
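
The same reading of the attribute table can be sketched in a few lines of Python. This is only an illustration under the assumption that each data row has the ten whitespace-separated columns shown above and that the raw value starts with a plain integer; `parse_attributes` is a made-up name.

```python
# Rows copied from the smartctl attribute table quoted above.
ROWS = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   037   037   000    Old_age   Always       -       55446
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""


def parse_attributes(text: str) -> dict:
    """Map attribute name to the first integer of its RAW_VALUE column."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # A data row has at least 10 columns and starts with a numeric ID.
        if len(fields) >= 10 and fields[0].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs


attrs = parse_attributes(ROWS)
bad_sectors = attrs["Reallocated_Sector_Ct"] + attrs["Current_Pending_Sector"]
years = attrs["Power_On_Hours"] / (24 * 365.25)
print(f"reallocated + pending sectors: {bad_sectors}")
print(f"power-on time: {years:.1f} years")
```

Zero reallocated and pending sectors together with roughly six years of power-on time is the "old but not obviously failing" picture described above. Raw values that are not plain integers (for example `Head_Flying_Hours`) would need extra handling.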
Regards,

NeddySeagoon

eccerr0r
Watchman
Posts: 10239
Joined: Thu Jul 01, 2004 6:51 pm
Location: almost Mile High in the USA

Post by eccerr0r » Mon Aug 01, 2022 4:28 pm

I've been getting those illegal write commands, which are indicative of OS problems. Did you switch kernels recently, or perhaps not notice the failed commands until now?

BTW, sigh. Just had a 200GB disk buy the farm. It gave sector errors and SMART soon after reported "24 hours to live"... I copied everything off of it to another disk and sure enough, a few days later, it fails to detect.

SMART was smart this time around. Probably the first time actually. I have another disk that SMART reports "24 hours to live" and the disk is still working, sort of...
Intel Core i7 2700K/Radeon Firepro W2100/24GB DDR3/800GB SSD
What am I supposed watching?

Post by DaggyStyle » Mon Aug 01, 2022 7:19 pm

NeddySeagoon wrote:Run the long test. If it passes, it's probably OK for now. I get nervous around 60000 running hours.
I cannot complete the long test because the USB hub keeps resetting. I'll have to try doing it from a Windows system, unfortunately.
eccerr0r wrote:I've been getting those illegal write commands, which are indicative of OS problems. Did you switch kernels recently, or perhaps not notice the failed commands until now?

BTW, sigh. Just had a 200GB disk buy the farm. It gave sector errors and SMART soon after reported "24 hours to live"... I copied everything off of it to another disk and sure enough, a few days later, it fails to detect.

SMART was smart this time around. Probably the first time actually. I have another disk that SMART reports "24 hours to live" and the disk is still working, sort of...
So, the system hasn't changed kernels in ages; in fact, I'm planning to replace the OS with a different flavor.

Post by NeddySeagoon » Mon Aug 01, 2022 8:11 pm

DaggyStyle,

As long as the HDD does not power down, you don't need a data link during the test.
You start the test, then come back after it's complete and read the answer.

There is another way, but it involves the outside world, not just the drive internals.
dd the content of the drive to /dev/null. Use bs=1M so you don't die of old age waiting.
If that works, the drive, interface and cables are good.
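
For anyone who prefers to script the read test, here is a rough Python equivalent of reading a device to nowhere in 1 MiB blocks. It is a sketch, not a replacement for dd: `read_all` is a made-up helper, and the demonstration runs on a scratch file so it is safe to try. For a real drive you would pass a block device path such as /dev/sdX (a placeholder name) and need read permission on it.

```python
import os
import tempfile


def read_all(path: str, block_size: int = 1024 * 1024) -> int:
    """Read `path` sequentially to the end and return the bytes read.

    An unreadable sector surfaces as OSError (typically EIO); it is
    deliberately not caught here so the failure stays visible.
    """
    total = 0
    with open(path, "rb", buffering=0) as dev:
        while True:
            chunk = dev.read(block_size)
            if not chunk:  # empty read means end of device or file
                break
            total += len(chunk)
    return total


# Demonstration on a scratch file of 3 MiB + 123 bytes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * (3 * 1024 * 1024 + 123))
scratch = tmp.name
n = read_all(scratch)
os.remove(scratch)
print(n)
```

As with dd, a clean run only shows that every sector could be read through this particular interface and cable; check dmesg afterwards for retries or resets.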
Regards,

NeddySeagoon


Post by DaggyStyle » Tue Aug 02, 2022 9:16 pm

NeddySeagoon wrote:DaggyStyle,

As long as the HDD does not power down, you don't need a data link during the test.
You start the test, then come back after it's complete and read the answer.

There is another way, but it involves the outside world, not just the drive internals.
dd the content of the drive to /dev/null. Use bs=1M so you don't die of old age waiting.
If that works, the drive, interface and cables are good.
Tried the latter suggestion: after 17534.9 seconds, the read ended with no errors in dmesg.
This means the HDD is good and my issue is either the port on the original board, the cable, or both.
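
A quick sanity check on that timing, assuming the read covered the whole drive (the capacity is the "User Capacity" from the smartctl output in the USB case; the post itself does not state how many bytes were read):

```python
drive_bytes = 3_000_592_982_016  # "User Capacity" reported by smartctl
elapsed_s = 17534.9              # read time reported in the post

rate_mb_s = drive_bytes / elapsed_s / 1e6  # decimal megabytes per second
hours = elapsed_s / 3600
print(f"{rate_mb_s:.0f} MB/s over {hours:.2f} h")
```

An average in that range is plausible for a full sequential pass over a 7200 rpm drive, which fits a complete read rather than one that stopped early.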

thanks for the help All!

Post by DaggyStyle » Sat Aug 13, 2022 7:36 am

update,

I've put the HDD back into the case and changed both the SATA cable and the SATA port.
Still getting errors, see:

Code: Select all

[    0.000000] BIOS-e820: [mem 0x0000000047f9a000-0x0000000047ffefff] ACPI data
[    0.006169] ACPI: SSDT 0x0000000047F52850 00036D (v01 SataRe SataTabl 00001000 INTL 20120913)
[    0.094662] Memory: 31518464K/32429532K available (12299K kernel code, 2374K rwdata, 2588K rodata, 1080K init, 1152K bss, 910812K reserved, 0K cma-reserved)
[    0.112372] MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
[    0.112372] TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.
[    0.167925] libata version 3.00 loaded.
[    0.368795] ata1: SATA max UDMA/133 abar m2048@0xdf04b000 port 0xdf04b100 irq 128
[    0.368797] ata2: SATA max UDMA/133 abar m2048@0xdf04b000 port 0xdf04b180 irq 128
[    0.368799] ata3: SATA max UDMA/133 abar m2048@0xdf04b000 port 0xdf04b200 irq 128
[    0.368801] ata4: SATA max UDMA/133 abar m2048@0xdf04b000 port 0xdf04b280 irq 128
[    0.368803] ata5: SATA max UDMA/133 abar m2048@0xdf04b000 port 0xdf04b300 irq 128
[    0.368805] ata6: SATA max UDMA/133 abar m2048@0xdf04b000 port 0xdf04b380 irq 128
[    0.678960] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    0.678999] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    0.679032] ata6: SATA link down (SStatus 4 SControl 300)
[    0.679065] ata2: SATA link down (SStatus 4 SControl 300)
[    0.679097] ata5: SATA link down (SStatus 4 SControl 300)
[    0.680227] ata1.00: ATAPI: HL-DT-ST DVDRAM GH24NSD1, LW00, max UDMA/133
[    0.681736] ata1.00: configured for UDMA/133
[    0.686747] ata3.00: ATA-9: TS120GSSD220S, R0510A0, max UDMA/133
[    0.686761] ata3.00: 234441648 sectors, multi 1: LBA48 NCQ (depth 32), AA
[    0.705233] ata3.00: configured for UDMA/133
[    2.594837] ata4: COMRESET failed (errno=-32)
[    2.594849] ata4: reset failed (errno=-32), retrying in 8 secs
[   13.031545] ata4: COMRESET failed (errno=-32)
[   13.031558] ata4: reset failed (errno=-32), retrying in 8 secs
[   23.291916] ata4: COMRESET failed (errno=-32)
[   23.291928] ata4: reset failed (errno=-32), retrying in 33 secs
[   60.642426] ata4: COMRESET failed (errno=-32)
[   60.642439] ata4: reset failed, giving up
[   62.873960] ata4: COMRESET failed (errno=-32)
[   62.873972] ata4: reset failed (errno=-32), retrying in 8 secs
[   73.088958] ata4: SATA link down (SStatus 1 SControl 300)
[   74.200021] EXT4-fs (sda3): mounted filesystem with ordered data mode. Opts: (null)
[   74.201225] Write protecting the kernel read-only data: 18432k
[   74.201565] Freeing unused kernel image (text/rodata gap) memory: 2036K
[   74.201799] Freeing unused kernel image (rodata/data gap) memory: 1508K
[   75.629917] ata4: COMRESET failed (errno=-32)
[   75.629922] ata4: reset failed (errno=-32), retrying in 8 secs
[   79.084262] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[   79.335596] L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.
[   85.834889] ata4: COMRESET failed (errno=-32)
[   85.834892] ata4: reset failed (errno=-32), retrying in 8 secs
[   96.104940] ata4: SATA link down (SStatus 1 SControl 300)
[   98.391910] ata4: COMRESET failed (errno=-32)
[   98.391912] ata4: reset failed (errno=-32), retrying in 8 secs
[  109.127898] ata4: SATA link down (SStatus 1 SControl 300)
[  111.401917] ata4: COMRESET failed (errno=-32)
[  111.401924] ata4: reset failed (errno=-32), retrying in 8 secs
[  121.782947] ata4: COMRESET failed (errno=-32)
[  121.782955] ata4: reset failed (errno=-32), retrying in 8 secs
[  131.833965] ata4: SATA link down (SStatus 1 SControl 300)
[  134.262950] ata4: COMRESET failed (errno=-32)
[  134.262957] ata4: reset failed (errno=-32), retrying in 8 secs
[  144.202950] ata4: COMRESET failed (errno=-32)
[  144.202957] ata4: reset failed (errno=-32), retrying in 8 secs
[  154.550950] ata4: COMRESET failed (errno=-32)
[  154.550957] ata4: reset failed (errno=-32), retrying in 33 secs
[  191.791960] ata4: COMRESET failed (errno=-32)
[  191.791968] ata4: reset failed, giving up
[  194.187956] ata4: COMRESET failed (errno=-32)
[  194.187962] ata4: reset failed (errno=-32), retrying in 8 secs
[  204.160953] ata4: COMRESET failed (errno=-32)
[  204.160960] ata4: reset failed (errno=-32), retrying in 8 secs
[  214.376954] ata4: COMRESET failed (errno=-32)
[  214.376961] ata4: reset failed (errno=-32), retrying in 33 secs
[  253.285955] ata4: COMRESET failed (errno=-32)
[  253.285963] ata4: reset failed, giving up
[  255.648955] ata4: COMRESET failed (errno=-32)
[  255.648962] ata4: reset failed (errno=-32), retrying in 8 secs
[  266.085955] ata4: COMRESET failed (errno=-32)
[  266.085962] ata4: reset failed (errno=-32), retrying in 8 secs
can it be the power cable?

Post by NeddySeagoon » Sat Aug 13, 2022 11:19 am

DaggyStyle,

The power cable is unlikely.

The kernel can see the drive, or it wouldn't be sending the COMRESET command to be told

Code: Select all

[   75.629917] ata4: COMRESET failed (errno=-32)
[   75.629922] ata4: reset failed (errno=-32), retrying in 8 secs 
It could be the electronics board on the drive, or possibly the drive failing to spin up.
In the case of the electronics board, you would need a replacement from an *identical* drive.
The drive failing to spin up can be felt. Hold the drive in your hand then plug the power cable in.
That's safe as SATA is hot pluggable.
You will feel the spinup if it happens.
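
If it helps to see how often the port is failing, here is a minimal sketch that counts the COMRESET failure lines in saved dmesg text. The regular expression and the function name `count_comreset_failures` are my own for illustration, and the sample lines are copied from the log posted above. It only tallies the messages; it does not say why the reset fails.

```python
import re
from collections import Counter

# Matches lines such as "[   98.391910] ata4: COMRESET failed (errno=-32)".
# The pattern is an assumption based on the log lines posted in this thread.
COMRESET_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<port>ata\d+): COMRESET failed \(errno=(?P<errno>-?\d+)\)"
)


def count_comreset_failures(log_text):
    """Return a Counter keyed by (port, errno) for every COMRESET failure line."""
    counts = Counter()
    for line in log_text.splitlines():
        match = COMRESET_RE.match(line.strip())
        if match:
            counts[(match.group("port"), int(match.group("errno")))] += 1
    return counts


# A few lines taken from the dmesg output earlier in the thread.
SAMPLE = """\
[   98.391910] ata4: COMRESET failed (errno=-32)
[   98.391912] ata4: reset failed (errno=-32), retrying in 8 secs
[  109.127898] ata4: SATA link down (SStatus 1 SControl 300)
[  111.401917] ata4: COMRESET failed (errno=-32)
"""

if __name__ == "__main__":
    for (port, err), n in count_comreset_failures(SAMPLE).items():
        print(f"{port}: {n} COMRESET failures with errno={err}")
```

To run it on a real log, save the output of `dmesg` to a file and pass the file contents to `count_comreset_failures` instead of `SAMPLE`.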
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

DaggyStyle
Watchman
Posts: 5969
Joined: Wed Mar 22, 2006 6:57 am

Post by DaggyStyle » Sat Aug 13, 2022 12:28 pm

there are two additional sata devices, an ssd and a cdrom, both seem to work ok
I wonder, how it can be working with the external case and not when it is in the case?

NeddySeagoon
Administrator
Posts: 56087
Joined: Sat Jul 05, 2003 9:37 am
Location: 56N 3W

Post by NeddySeagoon » Sat Aug 13, 2022 12:43 pm

DaggyStyle,
DaggyStyle wrote: I wonder, how it can be working with the external case and not when it is in the case?
Ahhh ...
Once, a long time ago, I had a USB HDD that had problems. I took it out of the case and tried it in the PC.
Well it had conventional SATA connectors, so why not.

However, the USB/SATA bridge was on the drive. The SATA data connector carried the USB signals. Eww.
dmesg was good enough to see that it was USB and refused to go further.

I wonder if your drive is like that?
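
One way to see which bus the kernel reached a disk through, while it is detected (for example in the enclosure), is to resolve its `/sys/block` symlink and look at the path components. This is a rough sketch only: the function names are made up, and the two example paths are illustrative of what such paths typically look like, not taken from this machine.

```python
import os


def classify_bus(sysfs_path):
    """Guess the bus from a resolved /sys/block/<dev> path.

    Heuristic (an assumption, not an official interface): USB-attached disks
    have a path component starting with "usb", libata disks one starting
    with "ata".
    """
    parts = sysfs_path.split("/")
    if any(p.startswith("usb") for p in parts):
        return "usb"
    if any(p.startswith("ata") for p in parts):
        return "ata"
    return "unknown"


def bus_of(dev):
    """Resolve /sys/block/<dev> (e.g. dev="sdb") and classify it."""
    return classify_bus(os.path.realpath(os.path.join("/sys/block", dev)))


# Illustrative paths only; real ones differ per machine.
EXAMPLE_SATA = "/sys/devices/pci0000:00/0000:00:1f.2/ata5/host4/target4:0:0/4:0:0:0/block/sdb"
EXAMPLE_USB = "/sys/devices/pci0000:00/0000:00:14.0/usb2/2-1/2-1:1.0/host6/target6:0:0/6:0:0:0/block/sdc"

if __name__ == "__main__":
    print(classify_bus(EXAMPLE_SATA))
    print(classify_bus(EXAMPLE_USB))
```

On the real system, `bus_of("sdb")` would do the lookup for `sdb`, assuming the disk shows up under that name.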
Regards,

NeddySeagoon

DaggyStyle
Watchman
Posts: 5969
Joined: Wed Mar 22, 2006 6:57 am

Post by DaggyStyle » Sat Aug 13, 2022 1:37 pm

NeddySeagoon wrote:DaggyStyle,
DaggyStyle wrote: I wonder, how it can be working with the external case and not when it is in the case?
Ahhh ...
Once, a long time ago, I had a USB HDD that had problems. I took it out of the case and tried it in the PC.
Well it had conventional SATA connectors, so why not.

However, the USB/SATA bridge was on the drive. The SATA data connector carried the USB signals. Eww.
dmesg was good enough to see that it was USB and refused to go further.

I wonder if your drive is like that?
the case is an stlab SATA to USB3 one and the hdd is a Seagate 3 TB one. not sure which model

NeddySeagoon
Administrator
Posts: 56087
Joined: Sat Jul 05, 2003 9:37 am
Location: 56N 3W

Post by NeddySeagoon » Sat Aug 13, 2022 1:48 pm

DaggyStyle,

You would need to know the chip numbers from the PCB inside the USB case.
That will tell what they are. I would expect a power supply chip and a USB to SATA bridge chip.

The drive reports itself as

Code: Select all

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-1ER166
Serial Number:    ZA500H9A
LU WWN Device Id: 5 000c50 07a7155cd
Firmware Version: CC25
User Capacity:    3,000,592,982,016 bytes [3.00 TB] 
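
That block looks like the information section printed by `smartctl -i` (that is my assumption, the thread does not say where it came from). If you want to compare it against another drive, here is a small sketch that turns the `Key: value` lines into a dictionary. The function name `parse_info_section` is made up for the example.

```python
def parse_info_section(text):
    """Parse 'Key:   value' lines into a dict, skipping '===' banner lines."""
    info = {}
    for line in text.splitlines():
        if line.startswith("===") or ":" not in line:
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        info[key.strip()] = value.strip()
    return info


# The text posted above, copied as-is.
SAMPLE = """\
=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-1ER166
Serial Number:    ZA500H9A
LU WWN Device Id: 5 000c50 07a7155cd
Firmware Version: CC25
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
"""

if __name__ == "__main__":
    info = parse_info_section(SAMPLE)
    # The capacity in bytes is the first token with the thousands separators removed.
    capacity_bytes = int(info["User Capacity"].split()[0].replace(",", ""))
    print(info["Device Model"], info["Firmware Version"], capacity_bytes)
```

The model and firmware fields are the ones that matter when hunting for a donor electronics board from an identical drive.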
Regards,

NeddySeagoon

DaggyStyle
Watchman
Posts: 5969
Joined: Wed Mar 22, 2006 6:57 am

Post by DaggyStyle » Sat Aug 13, 2022 6:41 pm

NeddySeagoon wrote:DaggyStyle,

You would need to know the chip numbers from the PCB inside the USB case.
That will tell what they are. I would expect a power supply chip and a USB to SATA bridge chip.

The drive reports itself as

Code: Select all

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-1ER166
Serial Number:    ZA500H9A
LU WWN Device Id: 5 000c50 07a7155cd
Firmware Version: CC25
User Capacity:    3,000,592,982,016 bytes [3.00 TB] 
not sure I can get that but I'll try, I think the case is of local production

DaggyStyle
Watchman
Posts: 5969
Joined: Wed Mar 22, 2006 6:57 am

Post by DaggyStyle » Sun Aug 14, 2022 4:11 pm

what can be deduced if I replace the hdd with another one from another company?
if it works ok, it is the hdd, if not?

NeddySeagoon
Administrator
Posts: 56087
Joined: Sat Jul 05, 2003 9:37 am
Location: 56N 3W

Post by NeddySeagoon » Sun Aug 14, 2022 4:16 pm

DaggyStyle,

If that works, you can be sure that the drives are different.

You said that it works in its USB enclosure. That makes me think the USB/SATA bridge is included on the drive.
Regards,

NeddySeagoon

DaggyStyle
Watchman
Posts: 5969
Joined: Wed Mar 22, 2006 6:57 am

Post by DaggyStyle » Wed Aug 17, 2022 3:31 pm

not sure I follow, I cannot get anything regarding the case,
what I can do is try another drive:
  • with the same capacity, different vendor.
  • with a different capacity, same vendor.
  • with a different capacity, different vendor.
will any of this help?

NeddySeagoon
Administrator
Posts: 56087
Joined: Sat Jul 05, 2003 9:37 am
Location: 56N 3W

Post by NeddySeagoon » Wed Aug 17, 2022 6:22 pm

DaggyStyle,

Replacing the HDD with a different drive may indicate something about everything except the drive that was replaced.
It says nothing about that drive. It may increase confidence that the drive that was replaced was faulty.

You said that the drive works in its USB enclosure.
Is that still correct?
Regards,

NeddySeagoon

DaggyStyle
Watchman
Posts: 5969
Joined: Wed Mar 22, 2006 6:57 am

Post by DaggyStyle » Thu Aug 18, 2022 3:31 pm

NeddySeagoon wrote:DaggyStyle,

Replacing the HDD with a different drive may indicate something about everything except the drive that was replaced.
It says nothing about that drive. It may increase confidence that the drive that was replaced was faulty.

You said that the drive works in its USB enclosure.
Is that still correct?
yup, that's correct
  • All times are UTC