Gentoo Forums
Ext4 fs corruption

gtbX
Tux's lil' helper

Joined: 11 Oct 2003
Posts: 99

Posted: Tue Aug 06, 2013 4:25 pm    Post subject: Ext4 fs corruption

I'm having an odd issue with one of the machines I remotely administer. Recently, its /home partition started developing filesystem errors that prevent it from being mounted at boot. Instead, the machine drops to a login screen, and I have to walk someone through logging in and running fsck -y on the partition. It seems to need it every time it reboots now. I tried reformatting the partition with
Code:
mke2fs -t ext4 -c -c /dev/sda7
to scan for bad blocks, but it didn't find any. I suppose it might be the superblock(?) that's bad, but I would think that would've been detected too.

So I have 2 questions:
1. What could be causing this/how to prevent it?
2. Can the init scripts be configured to keep booting, even if /home fails to mount, so that I can at least ssh into the box?

eccerr0r
Advocate

Joined: 01 Jul 2004
Posts: 3601
Location: USA

Posted: Tue Aug 06, 2013 5:40 pm

Remember that corruption doesn't necessarily come from the disk. Just like any other computer, garbage in, garbage out. Your CPU could be emitting garbage for the disk to write, or perhaps your RAM has amnesia causing your CPU to write bad data to the disk.

I would think that initscripts should keep on booting without home, but since ~/.ssh lives on home for many users, it would still be hard to ssh in (especially if root is disabled).
_________________
Intel Core i7 2700K@ 4.1GHz/HD3000 graphics/8GB DDR3/180GB SSD
What am I supposed to be advocating?

gtbX
Tux's lil' helper

Joined: 11 Oct 2003
Posts: 99

Posted: Wed Aug 07, 2013 5:15 am

I think if it was a kernel or RAM issue, I'd see more problems than just this. Then again, I first saw this problem shortly after upgrading to gentoo-sources-3.8.13. I haven't had any issues with the root fs (also ext4), but it gets less I/O.

The init scripts fail at running fsck on /home, and drop to an emergency login: "Welcome to (none).(none)" or something. The hostname isn't even set yet. Conceivably, the network and sshd could be started, and I could log in as root (via pubkey of course).

/etc/fstab:
Code:
/dev/disk/by-label/ROOT  /               ext4    noatime                 0 1
/dev/disk/by-label/HOME  /home           ext4    noatime                 0 2


I'll double check my kernel config, and see if dropping back to gentoo-sources-3.6.11 helps.
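
For the second question in the first post, one thing that may be worth trying is relaxing the /home entry in /etc/fstab. Per fstab(5), a sixth field of 0 means the filesystem is not checked by fsck at boot, and the `nofail` mount option tells mount not to report an error when the device is missing. Whether the init scripts on this particular box then carry on all the way to sshd is an assumption, not something confirmed in this thread. The helper name `relax_fstab_entry` is made up for illustration. A minimal sketch that works on a copy of the file rather than the live one:

```shell
#!/bin/sh
# Hypothetical helper: read fstab text on stdin and print it with the entry
# for one mount point changed so that boot depends on it less:
#   - "nofail" is appended to the mount options (if not already present)
#   - the fsck pass number (sixth field) is set to 0
# $1: mount point to relax, e.g. /home
relax_fstab_entry() {
    awk -v mp="$1" '
        # Leave comments and blank lines untouched.
        /^[ \t]*#/ || /^[ \t]*$/ { print; next }
        $2 == mp {
            if ($4 !~ /(^|,)nofail(,|$)/) $4 = $4 ",nofail"
            if ($5 == "") $5 = 0   # dump field missing: fill it in
            $6 = 0                 # 0 = skip the boot-time fsck
            print $1, $2, $3, $4, $5, $6
            next
        }
        { print }
    '
}

# Example: write a modified copy, review it, and only then install it by hand.
# relax_fstab_entry /home < /etc/fstab > /tmp/fstab.new
```

Skipping the boot-time check only keeps the box reachable. The filesystem would still need a manual fsck once you are logged in.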

gtbX
Tux's lil' helper

Joined: 11 Oct 2003
Posts: 99

Posted: Mon Aug 26, 2013 6:01 am

It does seem to be kernel-related: I just ran into the same problem on a different machine with the same kernel version. I rolled back the kernel on the original box to 3.6.11 and the problem seems to have gone away (I'll upgrade the kernel when I get to it in person). The second box has been upgraded to 3.10.7; I'll have to see if that helps.

eccerr0r
Advocate

Joined: 01 Jul 2004
Posts: 3601
Location: USA

Posted: Mon Aug 26, 2013 4:37 pm

Though I had other issues with gentoo-sources-3.8.13 I have not seen the corruption issue on my ext4 machines.

gtbX
Tux's lil' helper

Joined: 11 Oct 2003
Posts: 99

Posted: Mon Aug 26, 2013 7:49 pm

Crud, it happened again, this time on 3.6.11. Seems to start when there's an unclean shutdown. Running fsck manually has it remove some deleted inodes - nothing critical yet, but it's only a matter of time until valuable files get lost. I thought using a journalling fs was supposed to help with that? Maybe I'm just doing it wrong.
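
Before letting fsck change anything, it can help to see what the filesystem itself has recorded. `tune2fs -l` prints the superblock fields, including a "Filesystem state" line (for example "clean", or a value containing "with errors"). The sketch below only parses that output. The function names are made up for illustration, and /dev/sda7 is the device from the first post:

```shell
#!/bin/sh
# Print the value of the "Filesystem state" line from `tune2fs -l` output
# supplied on stdin.
fs_state() {
    sed -n 's/^Filesystem state:[[:space:]]*//p'
}

# Succeed (exit status 0) only if the recorded state is exactly "clean".
fs_is_clean() {
    [ "$(fs_state)" = "clean" ]
}

# Usage (needs root):
# tune2fs -l /dev/sda7 | fs_state
# tune2fs -l /dev/sda7 | fs_is_clean || echo "/home needs attention"
```

A read-only `fsck -n` on the unmounted partition is another way to look without modifying anything.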

eccerr0r
Advocate

Joined: 01 Jul 2004
Posts: 3601
Location: USA

Posted: Mon Aug 26, 2013 8:43 pm

Uh... No. Even with a journalling filesystem, just shutting down the machine abruptly (like cutting power) is not proper.

Journalling filesystems will *help* but do not prevent corruption. A proper shutdown is still needed.

If you must have a system that can handle this, it can help more if cached writes are flushed to disk as quickly as possible. It will reduce performance but will help against corruption.
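
One way to approximate this at the kernel level, independent of the filesystem, is through the writeback tunables `vm.dirty_writeback_centisecs` (how often the flusher threads wake up) and `vm.dirty_expire_centisecs` (how old dirty data may get before it is written out). Both are expressed in hundredths of a second. The values in the example and the file name are illustrative assumptions, not recommendations, and the function name is made up. A minimal sketch that only prints the settings:

```shell
#!/bin/sh
# Print sysctl settings that make the kernel write dirty pages out sooner.
# $1: flusher wake-up interval, in seconds
# $2: maximum age of dirty data before it is written, in seconds
writeback_sysctls() {
    # The kernel expects centiseconds, so multiply seconds by 100.
    printf 'vm.dirty_writeback_centisecs = %d\n' "$(( $1 * 100 ))"
    printf 'vm.dirty_expire_centisecs = %d\n' "$(( $2 * 100 ))"
}

# Example: wake every 1 s, write out data older than 5 s.
# writeback_sysctls 1 5 > /etc/sysctl.d/writeback.conf   # hypothetical file name
```

This shortens the window in which data sits only in RAM. It does not make an abrupt power cut safe.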

trumee
Guru

Joined: 02 Mar 2003
Posts: 461
Location: London,UK

Posted: Wed Aug 28, 2013 7:55 pm

eccerr0r wrote:


If you must have a system that can handle this, it can help more if cached writes are flushed to disk as quickly as possible. It will reduce performance but will help against corruption.


How can I do this? It would be useful in situations where power failures are random.

kernelOfTruth
Watchman

Joined: 20 Dec 2005
Posts: 5611
Location: Vienna, Austria; Germany; hello world :)

Posted: Wed Aug 28, 2013 10:57 pm

trumee wrote:
eccerr0r wrote:


If you must have a system that can handle this, it can help more if cached writes are flushed to disk as quickly as possible. It will reduce performance but will help against corruption.


How can i do this? It will be useful in situations when power failure is random.


Mount with commit=5. That should already be the default, but setting it explicitly is safer.

Or use commit=10 (you could also try 20 to sync every 20 seconds), though a longer interval leaves more unwritten data at risk.


Or add data=journal as a mount option to force (ext3-like) full journalling mode. Is it deprecated yet, btw?
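
As a rough illustration of the options above: ext4 accepts both `commit=` (in seconds) and `data=journal`, so they can go straight into the options column of /etc/fstab. The sketch below just prints such a line in the layout of the fstab posted earlier in the thread. The function name is made up, and the HOME label and `noatime` are carried over from that fstab as assumptions:

```shell
#!/bin/sh
# Print an ext4 fstab line for a labelled filesystem with an explicit
# journal commit interval and, optionally, a data journalling mode.
# $1: filesystem label      $2: mount point
# $3: commit interval (s)   $4: optional data mode, e.g. "journal"
ext4_fstab_line() {
    opts="noatime,commit=$3"
    if [ -n "${4:-}" ]; then
        opts="$opts,data=$4"
    fi
    printf '/dev/disk/by-label/%s  %s  ext4  %s  0 2\n' "$1" "$2" "$opts"
}

# Example:
# ext4_fstab_line HOME /home 5 journal
# To try a commit interval without rebooting (needs root):
# mount -o remount,commit=5 /home
```

Full data journalling writes file data twice, so expect a throughput cost in exchange for the extra consistency.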
_________________
Unofficial minimal livecd x86/amd64 w/reiser4+truecrypt (by Neo2)
2.6.37.2_plus_v1: BFS, CFS,THP,compaction, zcache or TOI
Hardcore Linux user since 2004 :D

trumee
Guru

Joined: 02 Mar 2003
Posts: 461
Location: London,UK

Posted: Thu Aug 29, 2013 10:07 pm

Is the commit option only for ext3? man mount indicates it as a suboption of ext3.

At the moment I am running ext4, but I was wondering whether ext3 is a safer choice for sudden power failures.

eccerr0r
Advocate

Joined: 01 Jul 2004
Posts: 3601
Location: USA

Posted: Thu Aug 29, 2013 10:41 pm

Since shorter commit times are just a hack to help limit the damage, I cannot condone this as a "solution". Journalling filesystems already help with the problem a bit as it is (unless you somehow disabled the journal), but it's still not right.

The question that's going in my head: Why is the power going out so frequently that such is needed?

If it's due to laziness, people will need to figure out how to shut down normally.
If it's due to unstable power, a UPS, or perhaps a laptop configured to do a clean shutdown, is highly recommended; this is a "proper" solution.

How frequent is frequent? Also, what is the function of the machine? Is it writing to disk constantly? A disk that is mostly read should not suffer as much corruption from unclean shutdowns.

Remember, even with these faster commit options, if power goes out while committing, you will suffer problems as well.

All times are GMT