Gentoo Forums
Solved: Which sdX drive choked - ext4fs on LUKS on MDRAID?

 
eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9679
Location: almost Mile High in the USA

Posted: Sun Mar 03, 2024 8:55 am    Post subject: Solved: Which sdX drive choked - ext4fs on LUKS on MDRAID?

I have a cryptsetup root on MDRAID5 on /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2 /dev/sde2, and this is driving me nuts. Well, the setup isn't driving me nuts, but determining which drive has the read/write errors is.

I noticed:
Code:
m4a785 ~ # cp /usr/share/binutils-data/x86_64-pc-linux-gnu/2.41/locale/fr/LC_MESSAGES/gas.mo /dev/null
cp: error reading '/usr/share/binutils-data/x86_64-pc-linux-gnu/2.41/locale/fr/LC_MESSAGES/gas.mo': Input/output error

Okay, so there is a read error. But in dmesg:
Code:
[   27.579918] (udev-worker) (1184) used greatest stack depth: 12160 bytes left
[  627.804361] kworker/dying (265) used greatest stack depth: 11912 bytes left
[20057.809441] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 67780700 starting block 2599101)
[20057.809453] Buffer I/O error on device dm-0, logical block 2599101
[20057.809461] Buffer I/O error on device dm-0, logical block 2599102
[20057.809463] Buffer I/O error on device dm-0, logical block 2599103
[20057.809464] Buffer I/O error on device dm-0, logical block 2599104
[20057.809466] Buffer I/O error on device dm-0, logical block 2599105
[20057.809467] Buffer I/O error on device dm-0, logical block 2599106
[20057.809468] Buffer I/O error on device dm-0, logical block 2599107
[20057.809470] Buffer I/O error on device dm-0, logical block 2599108
[20057.809471] Buffer I/O error on device dm-0, logical block 2599109
[20057.809472] Buffer I/O error on device dm-0, logical block 2599110
[20112.123937] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 67780692 starting block 2598822)
[20112.123949] buffer_io_error: 164 callbacks suppressed

First off, these are write errors, so unrelated to the read error. Second, I don't see any read errors at all, and there is no indication of which /dev/sd[a-e]2 it's choking on for either the read or the write?! Are underlying block device errors suppressed when running ext4 on a LUKS container over mdraid5?
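For reference, a quick way to see the device stacking and md's per-member error counters - a sketch only, where dm-0 and md127 are this system's names and nothing here modifies state:

```shell
# Walk the stack beneath the filesystem and check md's per-member error counters.
# dm-0 (the LUKS mapping) and md127 are examples; substitute your own names.
lsblk -s /dev/dm-0 2>/dev/null       # dm-0 -> md127 -> sd[a-e]2
cat /proc/mdstat 2>/dev/null         # member list and state, e.g. [UUUUU]
for f in /sys/block/md127/md/dev-*/errors; do
    if [ -e "$f" ]; then printf '%s: %s\n' "$f" "$(cat "$f")"; fi
done
```

The loop prints nothing if the array does not exist, so it is safe to run anywhere.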

---

The problem turned out to be that MDRAID superblock 1.2 keeps a per-device bad block list, and these were phantom errors: md never touched the underlying block devices because the sectors were already on its internal bad block list, so no disk was named in the logs - the "bad" blocks were never even attempted.
Clearing the bad block list was the ultimate solution -- but watch out: doing this on a whim can cause silent data corruption.
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?


Last edited by eccerr0r on Tue Mar 05, 2024 9:01 pm; edited 1 time in total
NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54253
Location: 56N 3W

Posted: Sun Mar 03, 2024 11:22 am

eccerr0r,

It all depends ...

Code:
[20057.809453] Buffer I/O error on device dm-0, logical block 2599101
That's an unhappy raid set.
The underlying problem device(s) may have done sector reallocation, so the writes eventually succeeded but not fast enough to prevent the error report.

Can you post the output of
Code:
smartctl -x /dev/sd[a-e]

Any non-zero pending sector count is a bad thing. The drive knows that it has sectors that it cannot read.
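A loop over the members can pull out just that attribute - a sketch that needs root, assuming the standard ATA SMART attribute name Current_Pending_Sector:

```shell
#!/bin/sh
# Report the pending-sector count for each raid member disk (needs root).
# Exits quietly if smartctl is not installed or the devices are absent.
command -v smartctl >/dev/null 2>&1 || exit 0
for d in /dev/sd[a-e]; do
    [ -b "$d" ] || continue
    printf '%s pending: ' "$d"
    smartctl -A "$d" | awk '/Current_Pending_Sector/ {print $10}'
done
```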

Code:
echo check > /sys/block/md0/md/sync_action

Before you do that, look at the mismatch count:
Code:
cat /sys/devices/virtual/block/md0/md/mismatch_cnt
It should be zero both before and after the check.
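The check-and-compare routine can be scripted - a sketch, with md0 assumed as the array name; it is a no-op if that array is not present:

```shell
#!/bin/sh
# Run an md consistency check and report mismatch_cnt before and after.
# md0 is an example array name; the script does nothing if it does not exist.
MD=/sys/block/md0/md
if [ -w "$MD/sync_action" ]; then
    echo "mismatch_cnt before: $(cat "$MD/mismatch_cnt")"
    echo check > "$MD/sync_action"
    # sync_action returns to "idle" when the check completes
    while [ "$(cat "$MD/sync_action")" != "idle" ]; do sleep 30; done
    echo "mismatch_cnt after:  $(cat "$MD/mismatch_cnt")"
fi
```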

There is also
Code:
echo repair > /sys/block/md0/md/sync_action
but be sure you have a backup first.
check tells you whether something is wrong: an element of the array cannot be read, or can be read but the parity data at that point is not correct.
repair tells md to fix the parity data when check finds a problem.

A check will encourage the sector reallocation mechanism too.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
eccerr0r

Posted: Sun Mar 03, 2024 3:50 pm

Yeah, I definitely know one or more of the disks are choking from reading the SMART data, but I find it odd the kernel isn't reporting the underlying devices directly when they are accessed. It's possible the writes eventually succeeded later on (perhaps those are the callbacks being suppressed), but I'm surprised it doesn't report devices as they choke.

Also I would have seen the md subsystem reporting that it was trying to repair if it did find and fix an inconsistency.

BTW, I think the problems I'm seeing with this array are not the disks but rather a bad power supply's connectors. I've always hated SATA power connectors, as they are hard to repair (the connector has to be replaced when it fails) and I never have any spare SATA connectors...

---

Yay!
Code:
# cat mismatch_cnt
88272840

md127 : active raid5 sda2[5] sdd2[4] sde2[7] sdc2[6] sdb2[1]
      1951965184 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
      [=================>...]  check = 89.9% (438713184/487991296) finish=21.0min speed=38953K/sec
      bitmap: 2/4 pages [8KB], 65536KB chunk

The check did point out issues on sdb2, but I can't figure out why there are so many mismatches...
eccerr0r

Posted: Mon Mar 04, 2024 4:14 pm

The "Pending sectors" count on /dev/sdb was at a high of 52; after forcing a repair on the array it went down to 49.
I'm writing big zeroed files to the array and now it's down to 43. The reallocated sector count is still 0.

I was able to clean two SATA power connectors. One of them was completely useless; now it seems to be at least somewhat usable, as the disk it's connected to is working.

Argh. The perils of trying to reuse equipment.

(The array is all 500G disks, surprisingly all the same size. A WD Green, a WD AV-Green, a WD Blue, a Seagate ES, and a Seagate 7200.14 4K-sector. The AV-Green seems to be the questionable one though the Seagate ES was initially causing issues prior to swapping SATA power connectors. I really wonder how much money Seagate saved by making those reduced height 3.5" drives...)
eccerr0r

Posted: Mon Mar 04, 2024 7:38 pm

AHH, I figured it out.

These bad blocks are not necessarily real... or rather, they were real before the sectors recovered.
This is an mdraid superblock 1.2 "issue": md records bad blocks in a per-device bad block list and never retries them once recorded, even after the underlying sector becomes readable again, so it keeps returning errors for blocks that are now fine.

Do any other RAID subsystems (dm-raid, btrfs-raid, ???) handle bad blocks better? This seems to imply I need to wipe and recreate the RAID after fixing the PSU problem, since the blocks aren't really bad anymore but md won't try them again. Perhaps there's a way to clear these "bad" blocks, but the correct data would need to be written back into them first...

Probably easier to wipe and redo :(
szatox
Advocate

Joined: 27 Aug 2013
Posts: 3138

Posted: Mon Mar 04, 2024 11:31 pm

How 'bout failing and removing the drive with the bad blocks and then adding it to the raid set again?
A 500GB HDD should get fully resilvered in something like 30 minutes.
_________________
Make Computing Fun Again
eccerr0r

Posted: Tue Mar 05, 2024 1:02 am

It's actually more than 30 mins for some reason - some of my disks are slow, around 60-70MB/sec, though I do have some 90-120MB/sec units. Based on the 90-minute estimate shown at the start of the disk, it's more like 2 hours in practice, plus another 30% or so for inner-track slowdown.

I have two disks with false bad blocks, so that's 4 hours. I also need to get the temporary disk (a 500G 2.5" unit) back when this is done, and I want to make sure the array never runs in degraded mode...
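The back-of-the-envelope estimate above can be sketched as capacity over throughput, padded for the inner tracks (65 MB/s is an assumed average, not a measured value):

```shell
# Rough rebuild-time estimate: capacity / throughput, padded 30% for
# inner-track slowdown. Integer shell arithmetic, so results are approximate.
size_mb=500000      # one 500 GB member
rate_mb=65          # assumed average sequential rate in MB/s
secs=$(( size_mb / rate_mb ))
secs_adj=$(( secs * 130 / 100 ))
echo "base: $(( secs / 60 )) min, with 30% padding: $(( secs_adj / 60 )) min"
```

That works out to roughly 2 to 3 hours per member, consistent with the estimate above.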
NeddySeagoon

Posted: Tue Mar 05, 2024 10:06 am

eccerr0r,

You can run
Code:
mdadm /dev/md127 --replace /dev/sdb2 --with /dev/sdf2
(device names here are examples; --with names the spare) if you can connect the extra drive while the raid set is not degraded.
That's a lot safer than resilvering from a degraded set, as the array keeps full redundancy for the whole rebuild.
frostschutz
Advocate


Joined: 22 Feb 2005
Posts: 2977
Location: Germany

Posted: Tue Mar 05, 2024 3:45 pm

If you replace / fail a drive in an array with mismatches, the data on the array will change. Suppose all data is \0, but the parity is \1 (a mismatch). If you rebuild / replace one drive, the data on it previously \0 will be rebuilt as \1.

If it's a benign mismatch (filesystem free space? trim/discard?) it won't matter, but if it's actual data, then it's bye-bye data. It's very bad to have mismatches: RAID can't figure out which copy is right and which is wrong, so at this point you have to verify file contents yourself. Preferably *before* you repair, rebuild, or permanently "fix" mismatches the wrong way.

For mdadm's bad block list - which causes read errors on the md device even if all underlying devices were replaced - you can --assemble with --update=no-bbl or --update=force-no-bbl. Same problem: you don't necessarily get correct data for these blocks afterwards.
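Sketched out, that route looks like the following - array and device names are this thread's, and since this particular array holds root it would have to be done from a rescue environment, only once the underlying sectors are trusted again:

```shell
#!/bin/sh
# Inspect and then discard md's recorded bad block list on reassembly.
# Device/array names are examples; this is a no-op if the array is absent.
# Run from a rescue environment if the array holds the root filesystem.
set -e
[ -b /dev/md127 ] || exit 0
mdadm --examine-badblocks /dev/sdb2        # show what md has recorded
mdadm --stop /dev/md127
mdadm --assemble /dev/md127 --update=force-no-bbl /dev/sd[a-e]2
```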
eccerr0r

Posted: Tue Mar 05, 2024 4:52 pm

Yeah, that was my concern: I had to deal with the inconsistencies.
I first found all the files containing the bad blocks and saved a list of them.
Then I got rid of the bad blocks by disabling the bad block list and re-enabling it.
Then I deleted those files and recopied them from source.

I think I'm good now. equery check \* says my base install is still good, I restored the rest of the affected files from my main machine, and so far I'm not seeing any more bad block behavior.

I probably should do one more diff of this array and call it good. I think the main fix was cleaning those SATA power connectors, and the hard drive is stable now. In fact, the drive that had 52 pending sectors now reports zero pending sectors and, surprisingly, zero reallocates...
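The "find which files contain the bad blocks" step can be sketched with debugfs on ext4 - assuming dm-0 as the filesystem device and using the block number from the earlier dmesg output; icheck maps a filesystem block to an inode, ncheck maps the inode to a path, and both are read-only:

```shell
#!/bin/sh
# Map an ext4 filesystem block to an inode, then the inode to a path.
# 2599101 is the block from dmesg; this is a no-op if the device is absent.
command -v debugfs >/dev/null 2>&1 || exit 0
[ -e /dev/dm-0 ] || exit 0
debugfs -R "icheck 2599101" /dev/dm-0 2>/dev/null   # block -> inode
# then, with the inode number that prints:
# debugfs -R "ncheck <inode>" /dev/dm-0             # inode -> path
```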
NeddySeagoon

Posted: Tue Mar 05, 2024 6:32 pm

eccerr0r,

Quote:
In fact that hard drive with 52 pending sectors now reports zero pending sectors and surprisingly zero reallocates...

That means that those 52 sectors can be read now and the drive no longer wants to relocate them.

I don't know if drives try in-place rewrites before they move the data.