Author |
Message |
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2014
|
Posted: Tue Oct 02, 2018 4:53 pm Post subject: in ext4_free_inode:363: Corrupt filesystem |
|
|
I see Nicias had similar messages last year, so this might be related.
My system suddenly threw the following messages into syslog:
Code: | Oct 2 10:51:29 kernel: EXT4-fs error (device md127p2): ext4_free_inode:351: comm rm: bit already cleared for inode 1310730
Oct 2 10:51:29 kernel: EXT4-fs error (device md127p2) in ext4_free_inode:363: Corrupt filesystem
...
Oct 2 10:51:29 kernel: EXT4-fs error (device md127p2): ext4_lookup:1585: inode #1310723: comm rm: deleted inode referenced: 1310725
Oct 2 10:51:29 kernel: EXT4-fs error (device md127p2) in ext4_free_inode:363: Corrupt filesystem
...
Oct 2 10:51:30 kernel: EXT4-fs error (device md127p1): mb_free_blocks:1468: group 359, block 11763824:freeing already freed block (bit 112); block bitmap corrupt.
Oct 2 10:51:30 kernel: EXT4-fs error (device md127p1): ext4_mb_generate_buddy:756: group 359, block bitmap and bg descriptor inconsistent: 32342 vs 32343 free clusters
...
Oct 2 10:51:32 kernel: EXT4-fs error (device md127p1): __ext4_new_inode:1120: comm cupsd: failed to insert inode 2883672: doubly allocated?
Oct 2 10:51:33 kernel: EXT4-fs error: 3 callbacks suppressed
Oct 2 10:51:33 kernel: EXT4-fs error (device md127p2): ext4_lookup:1585: inode #1310723: comm X: deleted inode referenced: 1310725
Oct 2 10:51:33 kernel: EXT4-fs error (device md127p2): ext4_lookup:1585: inode #1310723: comm X: deleted inode referenced: 1310725
Oct 2 10:51:38 kernel: EXT4-fs error (device md127p1): ext4_mb_generate_buddy:756: group 10, block bitmap and bg descriptor inconsistent: 3072 vs 3071 free clusters
Oct 2 10:52:13 kernel: EXT4-fs error (device md127p4): ext4_mb_generate_buddy:756: group 1825, block bitmap and bg descriptor inconsistent: 29048 vs 29071 free clusters
Oct 2 10:52:13 kernel: EXT4-fs error (device md127p4): ext4_mb_generate_buddy:756: group 1808, block bitmap and bg descriptor inconsistent: 24302 vs 24301 free clusters
Oct 2 10:52:19 kernel: EXT4-fs error (device md127p4): ext4_lookup:1585: inode #2230590: comm kactivitymanage: deleted inode referenced: 2229340
... |
and so on, for the rest of my session, which was a couple of hours before I shut down.
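The mb_free_blocks line in the log can be sanity-checked with a little arithmetic: a block number splits into a block group and a bit offset within that group's bitmap. The sketch below assumes 32768 blocks per group (the usual value for 4 KiB blocks) and a first data block of 0; neither value is stated in the thread, and `locate_block` is a name made up for illustration.

```python
# Assumption: 32768 blocks per group, the usual ext4 value for 4 KiB blocks.
# The real figure for a given filesystem is in its superblock.
BLOCKS_PER_GROUP = 32768


def locate_block(block, blocks_per_group=BLOCKS_PER_GROUP, first_data_block=0):
    """Return (group, bit) for a filesystem block number.

    group is the block group index, bit is the block's position
    inside that group's block bitmap.
    """
    return divmod(block - first_data_block, blocks_per_group)


# Block number taken from the mb_free_blocks message in the log.
group, bit = locate_block(11763824)
print(group, bit)  # prints "359 112"
```

Under those assumptions the result agrees with the "group 359 ... (bit 112)" reported by the kernel, which suggests the message refers to a single bit in that group's bitmap rather than a garbled block number.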
I hadn't noticed at the time; instead, when I started the box again, fsck started doing lots of stuff, the sort of thing that makes you think something's corrupted the disk. I stopped to take a tar backup of the data on the affected drive, and a full system backup (all drives).
Now the weird bit is, as you'll have noticed above, this is a mapped drive; it's actually partition 4 in my 4-disk mdadm RAID-5 array. This partition is /var. Three other partitions make the rootfs (/), /home, and one I call "ephemera" which is /var/tmp, and a bind mount within /var/tmp as /tmp.
None of these other 3 partitions had any problem. I find that weird.
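One way to check which partitions actually logged errors is to tally the EXT4-fs error lines per device. This is only a minimal sketch, assuming the syslog lines are available as plain text; `count_errors_by_device` is a hypothetical helper, and the sample lines are copied from the log excerpt in this post.

```python
import re
from collections import Counter

# Captures the device name in lines such as
# "EXT4-fs error (device md127p2): ext4_free_inode:351: ..."
DEVICE_RE = re.compile(r"EXT4-fs error \(device ([^)]+)\)")


def count_errors_by_device(lines):
    """Return a Counter mapping device name -> number of EXT4-fs error lines."""
    counts = Counter()
    for line in lines:
        match = DEVICE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines from the log excerpt; the "callbacks suppressed" line
# names no device and is therefore skipped.
sample = [
    "Oct 2 10:51:29 kernel: EXT4-fs error (device md127p2): ext4_free_inode:351: comm rm: bit already cleared for inode 1310730",
    "Oct 2 10:51:30 kernel: EXT4-fs error (device md127p1): mb_free_blocks:1468: group 359, block 11763824:freeing already freed block (bit 112); block bitmap corrupt.",
    "Oct 2 10:51:33 kernel: EXT4-fs error: 3 callbacks suppressed",
    "Oct 2 10:52:13 kernel: EXT4-fs error (device md127p4): ext4_mb_generate_buddy:756: group 1825, block bitmap and bg descriptor inconsistent: 29048 vs 29071 free clusters",
]

for device, count in sorted(count_errors_by_device(sample).items()):
    print(device, count)
```

Run against the full syslog rather than this short sample, the same tally would show at a glance how the errors are spread across md127p1, md127p2 and md127p4.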
Having taken backups (and already having an incremental backup from last weekend), I felt safe enough to reboot in single-user mode and run fsck against the drive. Err, no problems found, despite the warning messages while taking the tar backup that I was mounting a file system with errors. Weird.
I'm running kernel 4.14.65; it's an AMD Phenom 4-way system, Gentoo stable. Never shown a problem like this before. SMART says the drives are fine, but it would say that, wouldn't it.
Any thoughts anybody? _________________ Greybeard |
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54300 Location: 56N 3W
|
Posted: Tue Oct 02, 2018 5:34 pm Post subject: |
|
|
Goverp,
smartctl -a for the affected drive, or for all the drives in the raid set, may be useful.
What does /proc/mdstat say about the raid sets? Has a drive/partition been dropped? _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
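For the /proc/mdstat question, a dropped member shows up as an underscore in the status field, for example `[4/3] [UUU_]` instead of `[4/4] [UUUU]`. The sketch below scans mdstat text for such arrays. The sample text and its member device names are invented for illustration (the thread never shows the real output), and `degraded_arrays` is a hypothetical helper.

```python
import re

# Array header line, e.g. "md127 : active raid5 ..."
ARRAY_RE = re.compile(r"^(md\w+)\s*:")
# Status field, e.g. "[4/3] [UUU_]": configured members, working members, map.
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return {array_name: member_map} for arrays with a missing member."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            configured, working = int(status.group(1)), int(status.group(2))
            if working < configured or "_" in status.group(3):
                degraded[current] = status.group(3)
    return degraded


# Hypothetical sample; real input would come from open("/proc/mdstat").read().
SAMPLE = """\
Personalities : [raid6] [raid5] [raid4]
md127 : active raid5 sdd[3] sdc[2] sdb[1] sda[0]
      2929890816 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/8 pages [0KB], 65536KB chunk

unused devices: <none>
"""

print(degraded_arrays(SAMPLE))
```

An empty result means every array listed has all of its members active; it says nothing about whether the members hold consistent data.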
|
|
|
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2014
|
Posted: Tue Oct 02, 2018 7:55 pm Post subject: |
|
|
Hi Neddy.
As above, I doubt smartctl is of interest, as there are 4 partitions on the 4 disks in the array, and only this partition shows the problem. I looked at smartctl -H for all the drives, all happy.
For more detail, I looked at smartctl -a for each drive. No reallocations, no errors logged, one raw read error. I think the disks are fine!
I ran checkarray; it found a number of mismatches. All were in the dodgy partition. I've been playing with debugfs; all the mismatch sectors map to inode <7>. This is apparently a reserved/hidden inode, EXT2_RESIZE_INO. It seems to be 400MB, which looks like a lot, but may make sense to someone who knows what the heck ext4 uses it for. The whole disk is just 43G.
I suspect (a) I should mdadm repair the disk, then (b) fsck -fp it, and hope everything is tidy.
There's something a bit suspect in the partition table. The one causing problems is partition 1, which runs from sector 4 for a lot of sectors in md127. I'm not sure if that's vulnerable to being overwritten by the RAID superblocks. The array uses V1 superblocks, not 0.90. _________________ Greybeard |
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54300 Location: 56N 3W
|
Posted: Tue Oct 02, 2018 8:44 pm Post subject: |
|
|
Goverp,
mdadm repair probably won't do anything. It checks the underlying raid components for consistency.
Here, the problem is with the filesystem on top of the raid. I suspect that the raid is self consistent.
The version 1 raid superblock is at the beginning of the volume. When you donate whole drives to a raid set, it starts where the MBR would be if you had one.
As long as you have not created a partition table on one of the drives belonging to the raid set, you should be good.
mdadm would not be able to include that drive in the raid set (I hope), so you would be in degraded mode but working properly at the filesystem level. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
|
|
Jaglover Watchman
Joined: 29 May 2005 Posts: 8291 Location: Saint Amant, Acadiana
|
|
Back to top |
|
|
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2014
|
Posted: Tue Oct 02, 2018 9:16 pm Post subject: |
|
|
No, no mucking about creating partitions either on the real disks or in the RAID array.
I suspect the mismatches come from a problem earlier this year, which I eventually traced to a loose SATA cable to one drive. That drive dropped out; I sorted the cables and added the drive back into the array. IIUC that process uses the RAID bitmap, and as the inode <7> stuff isn't in normal use, presumably it wasn't processed. Anyway, I'll try repairing the array, which should clear the mismatches. If I get any further issues, I'll delete and recreate the partition from backup.
As I've just performed the checkarray, that will have read the entire surface of all the disks; no errors reported in SMART. _________________ Greybeard |
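To see whether a repair pass really clears the mismatches, the md driver's counters can be read from sysfs before and after. A minimal read-only sketch, assuming the array is md127 and that the kernel exposes `sync_action` and `mismatch_cnt` under `/sys/block/<array>/md/`; `read_md_state` is a hypothetical helper and the `sysfs_root` parameter exists only so it can be pointed at a test directory.

```python
from pathlib import Path


def read_md_state(array, sysfs_root="/sys/block"):
    """Return the current sync action and mismatch counter of an md array."""
    md_dir = Path(sysfs_root) / array / "md"
    return {
        # "idle", "check", "repair", "resync", ... as reported by the kernel
        "sync_action": (md_dir / "sync_action").read_text().strip(),
        # counter left behind by the most recent check or repair pass
        "mismatch_cnt": int((md_dir / "mismatch_cnt").read_text().strip()),
    }


# Only query the real sysfs tree when the assumed array is present.
if (Path("/sys/block") / "md127" / "md").is_dir():
    print(read_md_state("md127"))
```

This only reads. Starting a pass means writing `check` or `repair` to the same `sync_action` file as root, and the counter is worth reading again once `sync_action` has returned to `idle`.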
|
|
|