Gentoo Forums
ls causes segmentation fault on drbd + ocfs2
petero
n00b

Joined: 10 Jan 2013
Posts: 5

Posted: Thu Jan 10, 2013 5:44 pm    Post subject: ls causes segmentation fault on drbd + ocfs2

Hello,
I've been trying to get drbd + ocfs2 working by following this guide: http://webcache.googleusercontent.com/search?q=cache:njQEooenU8cJ:en.gentoo-wiki.com/wiki/Active-active_DRBD_with_OCFS2+&cd=1&hl=en&ct=clnk

For the most part it works. However, when I create a directory containing symlinks to other directories inside the drbd partition and then run
ls

on that directory, it ends with a segmentation fault (or sometimes it just hangs), and I find this in the log:

kernel: general protection fault: 0000 [#1] SMP
...
kernel: Pid: 17098, comm: ls Not tainted 3.5.7-gentoo #5 VMware, Inc. VMware Virtual Platform
...
kernel: Call Trace:
kernel: [<ffffffff8125388a>] ocfs2_fast_symlink_readpage+0xde/0x15c
kernel: [<ffffffff8108c494>] ? add_to_page_cache_lru+0x2f/0x39
kernel: [<ffffffff8108c602>] do_read_cache_page+0x8e/0x13c
kernel: [<ffffffff812537ac>] ? ocfs2_unblock_signals+0x1c/0x1c
kernel: [<ffffffff8108c6ea>] read_cache_page_async+0x17/0x19
kernel: [<ffffffff8108c6f5>] read_cache_page+0x9/0x13
kernel: [<ffffffff810c1127>] page_getlink.clone.29+0x28/0x82
kernel: [<ffffffff810c11a2>] page_follow_link_light+0x21/0x34
kernel: [<ffffffff810bfbce>] generic_readlink+0x3a/0x97
kernel: [<ffffffff810bacb1>] sys_readlinkat+0x76/0x94
kernel: [<ffffffff810bace5>] sys_readlink+0x16/0x18
kernel: [<ffffffff81488262>] system_call_fastpath+0x16/0x1b
kernel: Code: d1 48 8d 44 0a ff 40 38 30 74 0a 48 ff c8 48 39 d0 73 f3 31 c0 c9 c3 55 48 89 f8 48 89 e5 eb 03 48 ff c0 48 85 f6 74 08 48 ff ce <80> 38 00 75 f0 48 29 f8 c9 c3 55 31 c0 48 89 e5 eb 17 44 38 c1
kernel: RIP [<ffffffff812e6786>] strnlen+0x14/0x1e
kernel: RSP <ffff88031f8c1d08>
kernel: ---[ end trace add4a6818eca9284 ]---



My kernel is 3.5.7-gentoo x64.

Initially I was getting warnings that the drbd kernel-space version (8.3.13) doesn't match the drbd tools user-space version (8.3.11), but I continued installing anyway and got to a state where everything worked except the symlinks.
I then tried installing version 8.3.13 of the drbd tools to see if it would help, and also a different version of the ocfs2 tools (I finished the installation on 1.8.2 and later downgraded to 1.6.4), but none of those changes made any difference: ls still crashes.

My colleague previously did the same installation on Ubuntu using drbd 8.3.11-0ubuntu1 and ocfs2 1.6.3-4ubuntu1, and there it all works fine (our install steps were basically the same, apart from the Gentoo vs. Ubuntu specifics).
Could running both kernel-space and user-space drbd at version 8.3.11 (the same as the successful Ubuntu install) help? And how can I install a lower-than-default version of drbd into the kernel? I am a total Gentoo beginner, so I have no clue what else to do.

Any ideas on how to investigate this further or how to fix it?
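For anyone trying to reproduce this, here is a minimal sketch of the layout described above: a directory holding symlinks to other directories, followed by ls -l on it. The directory names and the TEST_ROOT variable are assumptions, not taken from the original setup. TEST_ROOT defaults to a temporary directory so the script is harmless to run anywhere; to exercise the bug it would have to point at a directory on the ocfs2 mount.

```shell
#!/bin/sh
# Reproduction sketch. TEST_ROOT is a hypothetical variable: set it to a
# directory on the ocfs2 mount to exercise the code path from the trace.
TEST_ROOT="${TEST_ROOT:-$(mktemp -d)}"

make_symlink_dir() {
    root="$1"
    mkdir -p "$root/targets/a" "$root/targets/b" "$root/links"
    # -f and -n let the script be re-run without failing on existing links.
    ln -sfn ../targets/a "$root/links/a"
    ln -sfn ../targets/b "$root/links/b"
}

make_symlink_dir "$TEST_ROOT"

# ls -l reads each link target with readlink(2), the syscall that appears
# at the bottom of the call trace.
ls -l "$TEST_ROOT/links"
```

On a healthy filesystem this simply lists the two links with their targets. On an affected ocfs2 mount, the ls step is where the fault or hang would be expected.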
syn0ptik
Apprentice

Joined: 09 Jan 2013
Posts: 221

Posted: Fri Jan 11, 2013 5:01 am

Try strace -f -o /tmp/out your_app
There is a bug either in libc or in the kernel modules, because this happened:
Code:
kernel: RIP [<ffffffff812e6786>] strnlen+0x14/0x1e

Or switch those modules for ocfs.
petero
n00b

Joined: 10 Jan 2013
Posts: 5

Posted: Fri Jan 11, 2013 4:52 pm

syn0ptik wrote:
Try strace -f -o /tmp/out your_app
There is a bug either in libc or in the kernel modules, because this happened:
Code:
kernel: RIP [<ffffffff812e6786>] strnlen+0x14/0x1e

Or switch those modules for ocfs.


Hey, if you mean I should run
strace -f -o /tmp/out ls

then that's what I've just tried: ls doesn't crash that way; instead it correctly prints the contents of the folder to the console. The logged /tmp/out is quite long. Should I post it here (even though ls didn't crash)?

What do you mean by "switch those modules for ocfs"?
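Since the trace in the first post is a kernel oops (the RIP address and the whole call trace are in kernel space), one way to tell whether a particular run hit the bug is to look for the same signature in the kernel log. The sketch below assumes dmesg is readable by the current user; the function name has_ocfs2_symlink_oops is made up for illustration.

```shell
#!/bin/sh
# Reads kernel log text on stdin and succeeds if it contains the function
# name from the top of the call trace in the first post.
has_ocfs2_symlink_oops() {
    grep -q 'ocfs2_fast_symlink_readpage'
}

# Assumption: dmesg is available and readable; if not, this reports "no".
if dmesg 2>/dev/null | has_ocfs2_symlink_oops; then
    echo "ocfs2 symlink oops found in kernel log"
else
    echo "no ocfs2 symlink oops in kernel log"
fi
```

Running it before and after the ls call shows whether that call added a new oops, which is useful when the crash does not reproduce every time.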
petero
n00b

Joined: 10 Jan 2013
Posts: 5

Posted: Sat Jan 19, 2013 11:43 pm

So I updated world and installed a new kernel (3.7.3), but the problem remains. What else can I do? Should I report this as a bug, or...?
syn0ptik
Apprentice

Joined: 09 Jan 2013
Posts: 221

Posted: Sun Jan 20, 2013 12:25 am

No, it is for tracing. It omits a couple of things when it runs.
Which command crashed?
randalla
n00b

Joined: 14 Oct 2008
Posts: 73
Location: Seattle, WA

Posted: Sun Jul 14, 2013 7:34 am

I'm sorry to bring this old post back up, but I ran into the same thing today. What I also found today was a fix for it:

http://comments.gmane.org/gmane.comp.file-systems.ocfs2.devel/8008

After applying the patch discussed there, I have not had any issues with symlinks on the ocfs2 partition.

Adam.
petero
n00b

Joined: 10 Jan 2013
Posts: 5

Posted: Mon Aug 05, 2013 5:45 pm

Thx, randalla. I will give it a try.
666threesixes666
Veteran

Joined: 31 May 2011
Posts: 1235
Location: 42.68n 85.41w

Posted: Mon Aug 05, 2013 11:18 pm

Upstream says:

"Dok: issue is in ocfs2 not DRBD
Dok: the real question is 'do you really need a clustered filesystem?'
666threesixes666: what FS do you recommend for drbd? did you test jfs for it?
Dok: I have
Dok: jfs takes some voodoo to get working in rhel, but it works
Dok: DRBD is just a block device
Dok: the filesystem, as with anything else, really depends upon your expected use case
Dok: I like ext4 myself
Dok: it seems to be the most stable"

You really don't want to move backwards in versions. The latest stable upstream is 8.4.3.

(I seriously advise against this...)
(as root)
Code:

echo ">=sys-cluster/drbd-8.3.12" >> /etc/portage/package.mask
emerge -av drbd


And that will put you on 8.3.11-r1.
petero
n00b

Joined: 10 Jan 2013
Posts: 5

Posted: Wed Aug 07, 2013 5:18 pm

OK, I can confirm that the fix posted by randalla works for me as well.

On kernel 3.9.6 the patch is already included (and everything works OK there out of the box).
So it seems that if you use a reasonably new kernel, you shouldn't run into this problem.

On the other node I have kernel 3.8.2, and I needed to manually edit fs/ocfs2/symlink.c.
After that (once the new kernel was built and installed), everything works fine on both nodes.
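A quick way to see which side of that line a node is on is to compare the running kernel version against 3.9.6, the version reported working above. This is only a sketch: the fix may also be present in kernels older than 3.9.6 (this thread does not say exactly where it landed), and a patched older kernel such as the 3.8.2 node will still be reported as "older". The helper name version_ge is made up, and it assumes a sort that supports -V (GNU coreutils).

```shell
#!/bin/sh
# Succeeds when version $1 is greater than or equal to version $2.
# Assumption: sort supports -V (version sort), as GNU coreutils does.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# 3.9.6 is the version reported working in this thread; it is not
# necessarily the first release that contains the fix.
KNOWN_GOOD="3.9.6"
running="$(uname -r | cut -d- -f1)"   # strip suffixes such as "-gentoo"

if version_ge "$running" "$KNOWN_GOOD"; then
    echo "kernel $running: at or above $KNOWN_GOOD, fix reported as included"
else
    echo "kernel $running: older than $KNOWN_GOOD, check fs/ocfs2/symlink.c"
fi
```

Both nodes need checking separately, since each one runs its own kernel.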

Thanks!
All times are GMT.