Gentoo Forums
Strange problems
gabrielg
Tux's lil' helper


Joined: 16 Nov 2012
Posts: 134

Posted: Tue Nov 20, 2012 3:58 pm

FWIW, I only have noatime.

Also, did you check the health of your RAIDs? It would be nice to see the output of /proc/mdstat to find out the superblock version and so on. Hopefully the jump in kernels doesn't require you to do anything with mdadm.
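
If it helps, here is a rough sketch of a quick check for degraded arrays. It only assumes the usual /proc/mdstat layout, where a missing member shows up as an underscore in the status field (for example [U_]). The function name degraded_arrays is made up for illustration.

```shell
# Print the name of every md array whose status field shows a missing
# member, e.g. [U_] or [_U]. Reads mdstat-formatted text from the file
# given as $1 (defaults to /proc/mdstat).
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                    # remember the current array
        /blocks/ && /\[[U_]*_[U_]*\]/ { print name }   # status contains an underscore
    ' "${1:-/proc/mdstat}"
}
```

Calling degraded_arrays with no argument reads /proc/mdstat; no output means every array reports all members up. For the superblock version, mdadm --detail /dev/mdX is the usual place to look.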
Tambor
n00b


Joined: 07 Apr 2005
Posts: 53
Location: Girona (CAT)

Posted: Tue Nov 20, 2012 5:01 pm

Code:

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
md1 : active raid1 sdb1[0] sda1[1]
      256896 blocks [2/2] [UU]
     
md3 : active raid1 sdb3[0] sda3[1]
      50010240 blocks [2/2] [UU]
     
md5 : active raid1 sdb5[0] sda5[1]
      50010240 blocks [2/2] [UU]
     
md6 : active raid1 sdb6[0] sda6[1]
      25005056 blocks [2/2] [UU]
     
md7 : active raid1 sdb7[0] sda7[1]
      50010240 blocks [2/2] [UU]
     
md8 : active raid1 sdb8[0] sda8[1]
      107731776 blocks [2/2] [UU]
     
md2 : active raid1 sdb2[0] sda2[1]
      10008384 blocks [2/2] [UU]
     
unused devices: <none>

DaggyStyle
Watchman


Joined: 22 Mar 2006
Posts: 5909

Posted: Tue Nov 20, 2012 9:18 pm

I find it strange to have a separate raid1 per partition; the logical thing to do, imho, is to put everything in one RAID setup and use LVM on top of it.

Here is my fstab:
Code:
/dev/md0                /boot           ext3            noauto,noatime,defaults 1 2
/dev/md2                /               reiserfs        noatime         0 1
/dev/extra/swap         none            swap            sw              0 0
/dev/dvdrw              /mnt/dvdrw      auto            noauto,rw       0 0
/dev/md1p3              /var            reiserfs        noatime         0 0
/dev/md1p4              /opt            reiserfs        noatime         0 0
/dev/md1p5              /usr            reiserfs        noatime         0 0
/dev/md1p2              /usr/portage-tree ext2          noatime         0 0
/dev/md1p1              /usr/portage-bins reiserfs      noatime         0 0
/dev/md1p6              /home           reiserfs        defaults        0 0
/dev/md1p7              /mnt/storage    xfs             defaults,rw     0 0
/dev/extra/share        /mnt/share      vfat            defaults,rw,users 0 0
/dev/extra/dev_and_utils /mnt/extra     reiserfs        defaults,rw     0 0
/dev/sdf1               /mnt/usb        auto            defaults,rw,users,noauto 0 0

Same here: my root (raid1) has only noatime.
_________________
Only two things are infinite, the universe and human stupidity and I'm not sure about the former - Albert Einstein
gabrielg
Tux's lil' helper


Joined: 16 Nov 2012
Posts: 134

Posted: Wed Nov 21, 2012 11:51 am

In fairness, nodev and nosuid shouldn't be part of the problem; in fact, setting them on /var (and /home, and /usr/local, and... :-) ) should make the server a little more secure.

Now, back to the problem - the raids seem to be healthy enough, and quite frankly I've run out of ideas.
My understanding is that the first (and perhaps main) impediment is that you can't write to /var, hence you don't get much logging, which is rather unfortunate.

Have you considered booting from a CD and diagnosing from there? Basically:
- Boot up from a CD
- Mount your /dev/md3 somewhere
- Try to write something (touch test or what DaggyStyle suggested)
- See what happens in your /var/log
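
The write test in the third step could look something like this. It is only a sketch: the mount point /mnt/var and the function name can_write are placeholders, not anything from your setup.

```shell
# Return 0 if a file can be created, written and removed in directory $1.
can_write() {
    dir=$1
    # mktemp fails if the directory is missing or the filesystem is read-only
    probe=$(mktemp "$dir/.write-test.XXXXXX" 2>/dev/null) || return 1
    echo "probe" > "$probe" && rm -f "$probe"
}

# Example (from the live CD, after mounting the array; paths are placeholders):
#   mount /dev/md3 /mnt/var
#   can_write /mnt/var && echo "write OK" || echo "write FAILED"
```

If the probe fails, the kernel log (dmesg) of the live system is the first place I would look, since the logs on the array itself may not have been written.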

If you are in a hurry, you can probably even set up a new /var somewhere else:
- Boot up from a CD
- Create a large enough partition somewhere (or even use /)
- rsync your current /var from /dev/md3 into the new /var
- Modify your fstab to point /var to the new device (or comment it out if you're using root)
- Reboot and see what happens.

Needless to say, this "CD" has to be a Linux one.
Tambor
n00b


Joined: 07 Apr 2005
Posts: 53
Location: Girona (CAT)

Posted: Wed Nov 21, 2012 2:01 pm

It seems clear that the problem is /var. Because you cannot write to the partition, you cannot log in, create new logs, ...
The catch is that the problem does not appear when you boot the machine, but only after some hours or a few days. Right after booting, the logs are generated and you can create files on /var.

Because of that, and looking at the "ps" output, I noticed that the first processes to become "defunct" are syslog and cron. Yesterday I rebooted the machine again with syslog-ng and vixie-cron disabled. The worst part now is that I have no feedback about what is happening on the machine. But people are working and the machine seems to be OK, in situations that crashed it before.

Let's see if things keep going OK, in order to be sure that the problem is caused by these two services.
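
For anyone following along, here is a sketch of how the stuck processes could be listed from ps output. It assumes a procps-style ps that accepts -eo pid,stat,comm; filter_stuck is just an illustrative name. State Z is defunct (zombie) and D is uninterruptible sleep, which usually means a process is waiting on I/O.

```shell
# Keep only the rows whose state column (2nd field) starts with Z or D.
# Expects "pid stat command" rows on stdin; the header row is dropped
# because "STAT" starts with neither letter.
filter_stuck() {
    awk '$2 ~ /^[ZD]/'
}

# Example:
#   ps -eo pid,stat,comm | filter_stuck
```

Running it periodically from a shell that stays open would give some feedback even with syslog-ng and vixie-cron stopped.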
DaggyStyle
Watchman


Joined: 22 Mar 2006
Posts: 5909

Posted: Wed Nov 21, 2012 2:12 pm

Tambor wrote:
It seems clear that the problem is /var. Because you cannot write to the partition, you cannot log in, create new logs, ...
The catch is that the problem does not appear when you boot the machine, but only after some hours or a few days. Right after booting, the logs are generated and you can create files on /var.

Because of that, and looking at the "ps" output, I noticed that the first processes to become "defunct" are syslog and cron. Yesterday I rebooted the machine again with syslog-ng and vixie-cron disabled. The worst part now is that I have no feedback about what is happening on the machine. But people are working and the machine seems to be OK, in situations that crashed it before.

Let's see if things keep going OK, in order to be sure that the problem is caused by these two services.

Maybe an HD failure in one of the two?
_________________
Only two things are infinite, the universe and human stupidity and I'm not sure about the former - Albert Einstein
gabrielg
Tux's lil' helper


Joined: 16 Nov 2012
Posts: 134

Posted: Wed Nov 21, 2012 2:30 pm

Tambor wrote:
The catch is that the problem does not appear when you boot the machine, but only after some hours or a few days.


Sorry... I didn't realize this.

So... another thing you can do is check SMART on the hard drives, in case it is an HD failure as DaggyStyle suggests. smartctl -a /dev/sda (and then sdb) should tell you something, although SMART has been known not to tell you enough, depending on how good the HD manufacturer is.
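
As a rough illustration, the attribute table from smartctl -A can be scanned for the counters that most often point at a dying disk. This assumes the common ATA attribute layout (ID in the first column, raw value in the tenth) and the usual IDs 5, 197 and 198, which not every vendor reports the same way. smart_warnings is a made-up name.

```shell
# Print "<attribute> <raw value>" for the reallocated (5), pending (197) and
# offline-uncorrectable (198) sector counters whose raw value is non-zero.
# Expects the output of "smartctl -A" on stdin.
smart_warnings() {
    awk '($1 == 5 || $1 == 197 || $1 == 198) && $10 + 0 > 0 { print $2, $10 }'
}

# Example (needs root and smartmontools):
#   for d in /dev/sda /dev/sdb; do smartctl -A "$d" | smart_warnings; done
```

No output only means these particular counters are zero; it does not prove the drive is healthy.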

Stopping syslog-ng shouldn't harm you if it isn't the problem, but won't tell you much if you run into the issue again.

Perhaps try to mount /var/log elsewhere, away from /dev/md3? The idea is to keep logging working while you rule out that partition as the cause.

Good luck!
Tambor
n00b


Joined: 07 Apr 2005
Posts: 53
Location: Girona (CAT)

Posted: Wed Nov 21, 2012 2:39 pm

In theory, since the partition is a RAID 1, if one of the two disks fails, the other should keep working without any problem.
We also ran fsck.reiserfs on all the partitions and the filesystems were OK.
I can try running "smartctl --all" on both hard drives, but the system always has the smartd daemon running and we haven't had any problems with these hard drives.