Zucca Moderator
Joined: 14 Jun 2007 Posts: 3343 Location: Rasi, Finland
|
Posted: Fri Mar 16, 2018 7:55 am Post subject: Time to put this drive to rest? |
|
|
My home server had a huge load spike. I went to investigate:
Code: | SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 1817
3 Spin_Up_Time 0x0027 173 173 021 Pre-fail Always - 2308
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 79
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 075 075 000 Old_age Always - 18885
10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 78
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 34
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 177
194 Temperature_Celsius 0x0022 115 098 000 Old_age Always - 28
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 1
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 2
|
tail of /var/log/disk/btrfs/current: | 2018-03-16T09:35:03+0200 [kernel] [5929925.151666] BTRFS error (device sdc4): bdev /dev/sdc4 errs: wr 0, rd 842, flush 0, corrupt 0, gen 0
2018-03-16T09:35:03+0200 [kernel] [5929925.151674] BTRFS error (device sdc4): bdev /dev/sdc4 errs: wr 0, rd 843, flush 0, corrupt 0, gen 0
2018-03-16T09:35:03+0200 [kernel] [5929925.151677] BTRFS error (device sdc4): bdev /dev/sdc4 errs: wr 0, rd 844, flush 0, corrupt 0, gen 0
2018-03-16T09:35:03+0200 [kernel] [5929925.151681] BTRFS error (device sdc4): bdev /dev/sdc4 errs: wr 0, rd 845, flush 0, corrupt 0, gen 0
2018-03-16T09:35:03+0200 [kernel] [5929925.151684] BTRFS error (device sdc4): bdev /dev/sdc4 errs: wr 0, rd 846, flush 0, corrupt 0, gen 0
2018-03-16T09:35:03+0200 [kernel] [5929925.151688] BTRFS error (device sdc4): bdev /dev/sdc4 errs: wr 0, rd 847, flush 0, corrupt 0, gen 0
2018-03-16T09:35:03+0200 [kernel] [5929925.151691] BTRFS error (device sdc4): bdev /dev/sdc4 errs: wr 0, rd 848, flush 0, corrupt 0, gen 0
2018-03-16T09:35:07+0200 [kernel] [5929929.561469] BTRFS info (device sdc4): read error corrected: ino 353768 off 99848192 (dev /dev/sdc4 sector 1342830504)
2018-03-16T09:35:07+0200 [kernel] [5929929.561522] BTRFS info (device sdc4): read error corrected: ino 353768 off 99856384 (dev /dev/sdc4 sector 1342830520)
2018-03-16T09:35:07+0200 [kernel] [5929929.561528] BTRFS info (device sdc4): read error corrected: ino 353768 off 99852288 (dev /dev/sdc4 sector 1342830512)
2018-03-16T09:35:07+0200 [kernel] [5929929.561574] BTRFS info (device sdc4): read error corrected: ino 353768 off 99860480 (dev /dev/sdc4 sector 1342830528)
2018-03-16T09:35:07+0200 [kernel] [5929929.561642] BTRFS info (device sdc4): read error corrected: ino 353768 off 99864576 (dev /dev/sdc4 sector 1342830536)
2018-03-16T09:35:07+0200 [kernel] [5929929.561698] BTRFS info (device sdc4): read error corrected: ino 353768 off 99868672 (dev /dev/sdc4 sector 1342830544)
2018-03-16T09:35:07+0200 [kernel] [5929929.561770] BTRFS info (device sdc4): read error corrected: ino 353768 off 99872768 (dev /dev/sdc4 sector 1342830552) |
Also part of /var/log/everything: | [kernel] [5929925.150153] ata3.00: exception Emask 0x0 SAct 0x610 SErr 0x0 action 0x0
[kernel] [5929925.150156] ata3.00: irq_stat 0x40000008
[kernel] [5929925.150160] ata3.00: failed command: READ FPDMA QUEUED
[kernel] [5929925.150168] ata3.00: cmd 60/38:20:a8:ff:ed/00:00:58:00:00/40 tag 4 ncq dma 28672 in
[kernel] [5929925.150168] res 41/40:00:aa:ff:ed/00:00:58:00:00/00 Emask 0x409 (media error) <F>
[kernel] [5929925.150170] ata3.00: status: { DRDY ERR }
[kernel] [5929925.150172] ata3.00: error: { UNC }
[kernel] [5929925.151625] ata3.00: configured for UDMA/133
[kernel] [5929925.151647] sd 2:0:0:0: [sdc] tag#4 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
[kernel] [5929925.151651] sd 2:0:0:0: [sdc] tag#4 Sense Key : 0x3 [current]
[kernel] [5929925.151654] sd 2:0:0:0: [sdc] tag#4 ASC=0x11 ASCQ=0x4
[kernel] [5929925.151658] sd 2:0:0:0: [sdc] tag#4 CDB: opcode=0x28 28 00 58 ed ff a8 00 00 38 00
[kernel] [5929925.151661] blk_update_request: I/O error, dev sdc, sector 1491992490
[kernel] [5929925.151666] BTRFS error (device sdc4): bdev /dev/sdc4 errs: wr 0, rd 842, flush 0, corrupt 0, gen 0 | Then follows the usual btrfs errors.
It's a pretty old 1TB (I guess) WD Blue spinning platter. I can drop it out of the btrfs pool and the RAID-1 array too. No problem.
I'm more interested in the SMART data above.
Current_Pending_Sector with a value of 1 and Multi_Zone_Error_Rate with a value of 2 seem to indicate impending total failure of the drive. Right? _________________ ..: Zucca :..
Gentoo IRC channels reside on Libera.Chat.
--
Quote: | I am NaN! I am a man! |
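[Editorial aside: the failing LBA reported by blk_update_request above (sector 1491992490) can be converted to a byte offset so the suspect spot can be probed directly with dd. A minimal sketch, assuming 512-byte logical sectors - check yours with blockdev --getss; the device path is just the one from this thread:]

```python
def sector_to_byte_offset(lba: int, sector_size: int = 512) -> int:
    """Convert an absolute LBA (as printed by the kernel) to a byte offset."""
    return lba * sector_size

def dd_probe_command(device: str, lba: int, sector_size: int = 512) -> str:
    """Build a dd command line that tries to read just the suspect sector."""
    return (f"dd if={device} of=/dev/null bs={sector_size} "
            f"skip={lba} count=1 iflag=direct")

if __name__ == "__main__":
    lba = 1491992490  # from the blk_update_request line above
    print(sector_to_byte_offset(lba))          # byte offset on the whole disk
    print(dd_probe_command("/dev/sdc", lba))   # read exactly one sector
```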
bunder Bodhisattva
Joined: 10 Apr 2004 Posts: 5934
|
Posted: Fri Mar 16, 2018 10:24 am Post subject: |
|
|
one pending sector isn't really a whole lot to worry about.
i'd be more concerned that smart found one problem but btrfs found many consecutive errors.
theoretically you could try wiping the drive and keep using it, but when in doubt throw it out. |
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3343 Location: Rasi, Finland
|
Posted: Fri Mar 16, 2018 11:25 am Post subject: |
|
|
bunder wrote: | when in doubt throw it out. | I already made an order for 2TB Toshiba and 2TB WD RED.
I might as well grow my disk space at the same time... Or keep the other as a spare.
The Raw_Read_Error_Rate value of that drive is just too high for me to accept. _________________ ..: Zucca :..
Gentoo IRC channels reside on Libera.Chat.
--
Quote: | I am NaN! I am a man! |
Last edited by Zucca on Fri Mar 16, 2018 11:48 am; edited 1 time in total |
mike155 Advocate
Joined: 17 Sep 2010 Posts: 4438 Location: Frankfurt, Germany
|
Posted: Fri Mar 16, 2018 11:42 am Post subject: |
|
|
Quote: | Raw_Read_Error_Rate value of that drive is just too high for me to accept. |
You're kidding, aren't you? Look at the value on my Seagate ST32000644NS hard disk:
Code: | ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 083 063 044 Pre-fail Always - 204787750
3 Spin_Up_Time 0x0003 100 100 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 76
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 1
7 Seek_Error_Rate 0x000f 076 060 030 Pre-fail Always - 48133305
9 Power_On_Hours 0x0032 050 050 000 Old_age Always - 44329
|
The drive works perfectly fine. A high value for Raw_Read_Error_Rate means nothing - at least not on Seagate drives.
You could do a 'dd if=/dev/sdX of=/dev/null bs=10M' to test your drive. It will take a couple of hours, but if you don't get any errors, you'll know that the drive is ok. |
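[Editorial aside: the dd check above can also be scripted so you get a list of unreadable regions instead of a single pass/fail. A rough sketch, not a polished tool - run it against /dev/sdX as root; it works against any readable path:]

```python
import os

def read_test(path: str, chunk_size: int = 10 * 1024 * 1024) -> list[int]:
    """Read the whole device/file in chunks; return byte offsets that failed."""
    bad_offsets = []
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while True:
            try:
                data = os.pread(fd, chunk_size, offset)
            except OSError:           # a media error surfaces as EIO here
                bad_offsets.append(offset)
                data = b"\0"          # pretend progress so we skip past it
            if not data:              # pread returns b"" at end of device
                break
            offset += chunk_size
    finally:
        os.close(fd)
    return bad_offsets
```

An empty list means every chunk was readable, which is roughly what an error-free dd pass tells you.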
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3343 Location: Rasi, Finland
|
Posted: Fri Mar 16, 2018 12:01 pm Post subject: |
|
|
Strange.
All the other drives I have (five more) have Raw_Read_Error_Rate between 0 and 2.
With one exception of 6, which is also a WD Blue 1TB. But it seems to be about half the age of the other...
Also one of my drives, a WD Blue 2TB, has a Load_Cycle_Count of 230395, while on the others it's under 500. _________________ ..: Zucca :..
Gentoo IRC channels reside on Libera.Chat.
--
Quote: | I am NaN! I am a man! |
mike155 Advocate
Joined: 17 Sep 2010 Posts: 4438 Location: Frankfurt, Germany
|
Posted: Fri Mar 16, 2018 12:45 pm Post subject: |
|
|
Unfortunately, many of the SMART parameters and values are mostly meaningless, because they are not standardized.
The only SMART parameters that seem to be useful to (pre-) detect a drive failure are: Reallocated_Sector_Ct and Current_Pending_Sector.
A high value for Load_Cycle_Count may indicate trouble. Look at the data sheet of your drive; the maximum number of load cycles should be specified there. High values typically mean that the drive uses APM (Advanced Power Management) to park its heads aggressively. I try to avoid such drives, at least for servers. Use '/sbin/hdparm -B /dev/sdX' to check whether your drive supports APM. If you want, you can disable APM using '/sbin/hdparm -B 255 /dev/sdX'. After APM is disabled, Load_Cycle_Count should stop rising.
EDIT: I just looked at the specification sheet of WD Blue 2TB drives. It specifies '300.000' load cycles. If your current value is 230395, you definitely should do something! |
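[Editorial aside: mike155's warning can be turned into a back-of-the-envelope projection: given the rated load-cycle budget from the data sheet and the current count from SMART, estimate how long the drive has left at its historical parking rate. The 300,000 and 230,395 figures are the ones quoted in this thread; the power-on hours below are a hypothetical example value:]

```python
def cycles_remaining(rated: int, current: int) -> int:
    """Load cycles left before the data-sheet limit."""
    return max(rated - current, 0)

def hours_until_limit(rated: int, current: int, power_on_hours: int) -> float:
    """Naive linear projection: assumes the historical parking rate continues."""
    rate_per_hour = current / power_on_hours
    return cycles_remaining(rated, current) / rate_per_hour

if __name__ == "__main__":
    print(cycles_remaining(300_000, 230_395))              # 69605 cycles left
    # 20_000 power-on hours is an assumed example, not from the thread:
    print(round(hours_until_limit(300_000, 230_395, 20_000)))
```

The projection is crude (parking rate depends on workload and APM settings), but it makes "you definitely should do something" quantifiable.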
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3343 Location: Rasi, Finland
|
Posted: Fri Mar 16, 2018 1:10 pm Post subject: |
|
|
mike155 wrote: | Unfortunately, many of the SMART parameters and values are mostly meaningless, because they are not standardized. | I've always wondered why. Every drive manufacturer supports SMART, but the values are some sort of guessing game. Bah! Luckily I have something to compare against: all my drives are WD.
mike155 wrote: | I just looked at the specification sheet of WD Blue 2TB drives. It specifies '300.000' load cycles. If your current value is 230395, you definitely should do something! |
Zucca wrote: | I already made an order for 2TB Toshiba and 2TB WD RED.
I might as well grow my disk space at the same time... Or keep the other as a spare. | ... It will be interesting to see how the SMART values on the Toshiba evolve...
I'll also recheck my hdparm configuration. Thanks. _________________ ..: Zucca :..
Gentoo IRC channels reside on Libera.Chat.
--
Quote: | I am NaN! I am a man! |
P.Kosunen Guru
Joined: 21 Nov 2005 Posts: 309 Location: Finland
|
Posted: Fri Mar 16, 2018 5:01 pm Post subject: |
|
|
mike155 wrote: | A high value for Load_Cycle_Count may indicate trouble. |
On Greens I have seen millions; I wouldn't worry about a couple hundred thousand.
http://idle3-tools.sourceforge.net/
You could try increasing the parking time a bit. |
frostschutz Advocate
Joined: 22 Feb 2005 Posts: 2977 Location: Germany
|
Posted: Fri Mar 16, 2018 5:11 pm Post subject: |
|
|
bunder wrote: | one pending sector isn't really a whole lot to worry about. |
that is what the hard drive vendors want to make you believe.
a hard drive is supposed to store data - not lose it. with one pending sector, it already lost data. that's not acceptable.
I'd replace the drive. If there is no backup, ddrescue. Once the drive is removed / ddrescue'd, you can do a destructive badblocks and decide whether it's worth giving it another shot or not. Either way, I would no longer trust it with important data.
idle3 is built into hdparm as well (-J) - I have used it on my WD Green drives and they have lived for a long time... (still running) ...but I don't know if that's just luck or in any way related to idle3. There is a lot of panic about this but no reports of massive failures (like the Deathstar et al.) |
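[Editorial aside: the destructive badblocks pass frostschutz mentions can be approximated in a few lines: write a known pattern across the whole target, read it back, and report regions that don't verify. A sketch in the spirit of badblocks -w - it destroys all data on the target, so it is demonstrated here against an ordinary file rather than a device:]

```python
import os

def pattern_test(path: str, size: int, pattern: bytes = b"\xaa",
                 chunk_size: int = 1024 * 1024) -> list[int]:
    """Overwrite `size` bytes of `path` with `pattern`, read back, and
    return byte offsets of chunks that failed to verify. DESTRUCTIVE."""
    bad = []
    block = pattern * chunk_size
    with open(path, "r+b") as f:
        for offset in range(0, size, chunk_size):      # write pass
            n = min(chunk_size, size - offset)
            f.seek(offset)
            f.write(block[:n])
        f.flush()
        os.fsync(f.fileno())                           # force it to the media
        for offset in range(0, size, chunk_size):      # verify pass
            n = min(chunk_size, size - offset)
            f.seek(offset)
            if f.read(n) != block[:n]:
                bad.append(offset)
    return bad
```

Real badblocks -w cycles through several patterns (0xaa, 0x55, 0xff, 0x00); this sketch does a single pass to show the idea.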
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54236 Location: 56N 3W
|
Posted: Fri Mar 16, 2018 6:36 pm Post subject: |
|
|
Zucca,
Code: | ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1 |
The drive has lost data already and knows it.
Run the long self test. That Pending Sector count might get worse.
Raw values are often packed bit fields, so big numbers are not always a cause for concern.
The VALUE, WORST and THRESH columns are normalised.
If VALUE or WORST is <= THRESH, that SMART parameter has failed.
You have a drive that can't read its own writing. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
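[Editorial aside: NeddySeagoon's rule - the attribute has failed if the normalised VALUE or WORST has dropped to or below THRESH - is easy to check mechanically across the whole table. A sketch that parses the attribute lines of smartctl -A output; the sample is hand-typed, and the failing Reallocated_Sector_Ct line is synthetic, for illustration only:]

```python
def failed_attributes(smartctl_table: str) -> list[str]:
    """Return names of attributes whose normalised VALUE or WORST
    is at or below THRESH (ignoring attributes with THRESH 0)."""
    failed = []
    for line in smartctl_table.splitlines():
        fields = line.split()
        # Expect: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name = fields[1]
        value, worst, thresh = (int(fields[i]) for i in (3, 4, 5))
        if thresh > 0 and (value <= thresh or worst <= thresh):
            failed.append(name)
    return failed

sample = """\
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1817
  5 Reallocated_Sector_Ct   0x0033   130   130   140    Pre-fail  Always       -       812
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
"""
print(failed_attributes(sample))   # ['Reallocated_Sector_Ct']
```

Note that by this rule the drive in the first post passes everywhere - the pending sector only shows up in the raw value, which is exactly why the raw columns are worth reading too.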
Ant P. Watchman
Joined: 18 Apr 2009 Posts: 6920
|
Posted: Fri Mar 16, 2018 7:14 pm Post subject: |
|
|
Here's the WD Green in my desktop for comparison -
Code: | SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 173 155 021 Pre-fail Always - 6308
4 Start_Stop_Count 0x0032 097 097 000 Old_age Always - 3307
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 100 253 000 Old_age Always - 0
9 Power_On_Hours 0x0032 042 042 000 Old_age Always - 42934
10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 100 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 097 097 000 Old_age Always - 3287
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 35
193 Load_Cycle_Count 0x0032 199 199 000 Old_age Always - 3307
194 Temperature_Celsius 0x0022 117 105 000 Old_age Always - 33
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0 |
The values that are non-zero on yours (attributes 1, 197 and 200) definitely point to a failing drive. The multi-zone errors could indicate it suffered a head crash, however unlikely those may be nowadays. Either way, it's improbable that the situation will get better from here. |
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3343 Location: Rasi, Finland
|
Posted: Fri Mar 16, 2018 8:03 pm Post subject: |
|
|
Current_Pending_Sector is now at 0. Other critical numbers haven't changed.
I have done nothing yet. I'll wait till Monday/Tuesday for the new disks.
Meanwhile I'll start pulling that one disk out of the system... on the software side of things, I mean. I have redundancy for all the data, so pulling one from the system isn't much of a task. It just takes some time to rebalance. _________________ ..: Zucca :..
Gentoo IRC channels reside on Libera.Chat.
--
Quote: | I am NaN! I am a man! |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54236 Location: 56N 3W
|
Posted: Sat Mar 17, 2018 12:24 pm Post subject: |
|
|
Zucca,
If the reallocated sector count did not change, the drive read the sector and was happy with the result.
If the reallocated sector count has increased, the drive got a good read and moved the data.
The reallocated sector count is supposed to increase as the drive ages and data from difficult to read sectors is moved.
The pending sector count should always be zero. That's a count of the sectors the drive knows it can't read.
A long test may be informative. The drive will read its entire data area without any host IO. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
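[Editorial aside: NeddySeagoon's point about watching whether the reallocated and pending counts move suggests keeping periodic SMART snapshots and diffing the critical raw values. A minimal sketch with hand-typed snapshot dicts - in practice you would fill them by parsing smartctl -A output on a schedule:]

```python
CRITICAL = ("Reallocated_Sector_Ct", "Current_Pending_Sector",
            "Offline_Uncorrectable", "Reallocated_Event_Count")

def smart_delta(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Return the critical attributes whose raw value grew between snapshots."""
    return {name: after[name] - before[name]
            for name in CRITICAL
            if name in before and name in after and after[name] > before[name]}

if __name__ == "__main__":
    # Example values only; a pending sector that clears (1 -> 0) is not
    # flagged, but a new reallocation (0 -> 1) is.
    yesterday = {"Reallocated_Sector_Ct": 0, "Current_Pending_Sector": 1,
                 "Offline_Uncorrectable": 0}
    today     = {"Reallocated_Sector_Ct": 1, "Current_Pending_Sector": 0,
                 "Offline_Uncorrectable": 0}
    print(smart_delta(yesterday, today))   # {'Reallocated_Sector_Ct': 1}
```

A growing Reallocated_Sector_Ct after a pending sector clears is the pattern Neddy describes: the drive gave up on the spot and moved the data.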
Jaglover Watchman
Joined: 29 May 2005 Posts: 8291 Location: Saint Amant, Acadiana
|
Posted: Sat Mar 17, 2018 2:32 pm Post subject: |
|
|
Code: | ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 100 100 046 Pre-fail Always - 119150
2 Throughput_Performance 0x0005 100 100 030 Pre-fail Offline - 12910592
3 Spin_Up_Time 0x0003 100 100 025 Pre-fail Always - 1
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 120
5 Reallocated_Sector_Ct 0x0033 100 100 024 Pre-fail Always - 0 (2000 0)
7 Seek_Error_Rate 0x000f 100 100 047 Pre-fail Always - 903
8 Seek_Time_Performance 0x0005 100 100 019 Pre-fail Offline - 0
9 Power_On_Hours 0x0032 007 007 000 Old_age Always - 46663
10 Spin_Retry_Count 0x0013 100 100 020 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 120
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 65
193 Load_Cycle_Count 0x0032 071 071 000 Old_age Always - 580829
194 Temperature_Celsius 0x0022 100 100 000 Old_age Always - 39 (Min/Max 22/57)
195 Hardware_ECC_Recovered 0x001a 100 100 000 Old_age Always - 27
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0 (0 6924)
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 253 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x000f 100 100 060 Pre-fail Always - 7741
203 Run_Out_Cancel 0x0002 100 100 000 Old_age Always - 429512721134
240 Head_Flying_Hours 0x003e 200 200 000 Old_age Always - 0
|
This drive is on 24x7 and has been running like this for at least two years. I keep waiting for it to fail, but it keeps running. Shall I take a hammer to it? _________________ My Gentoo installation notes.
Please learn how to denote units correctly! |
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3343 Location: Rasi, Finland
|
Posted: Sat Mar 17, 2018 9:39 pm Post subject: |
|
|
frostschutz wrote: | idle3 is built into hdparm as well (-J) - I have used it on my WD Green drives and they have lived for a long time... (still running) ...but I don't know if that's just luck or in any way related to idle3. There is a lot of panic about this but no reports of massive failures (like the Deathstar et al.) | I have WD Greens too (head parking adjusted). They've been working flawlessly; the SMART data shows no signs of aging. I only see two WD Blues going down. The other one does not error out, but it has a head parking count of 230k.
I've now removed the faulty drive from the RAID-1 arrays, and the btrfs pool removal is running at the moment. I wonder if btrfs balances the data among the rest of the drives now, as the removal is taking a long time...
After that I can run the long test on the drive reporting errors. _________________ ..: Zucca :..
Gentoo IRC channels reside on Libera.Chat.
--
Quote: | I am NaN! I am a man! |
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3343 Location: Rasi, Finland
|
Posted: Sun Mar 18, 2018 11:08 pm Post subject: |
|
|
Finally.
I did a full balance of the btrfs pool. It started at 2018-03-17T22:40:04 and ended at 2018-03-19T00:40:57. I knew it would take some time, but I disregarded the warning. Silly me. :P
Next time I'll adjust the balancing filters. Anyway, this means I don't need to reach for my backups at the moment. Everything's fine. Next I'll run the long SMART tests. _________________ ..: Zucca :..
Gentoo IRC channels reside on Libera.Chat.
--
Quote: | I am NaN! I am a man! |