Gentoo Forums
gentoo-sources-5.15.11 (LTS/stable) shows some nfs issues
gabrielg
Tux's lil' helper


Joined: 16 Nov 2012
Posts: 134

Posted: Wed Jan 12, 2022 3:07 pm    Post subject: gentoo-sources-5.15.11 (LTS/stable) shows some nfs issues

Hi, all,
Since I started running the kernel named in the subject of this post, I have been seeing odd kernel errors like this one:
Code:

Jan 12 08:54:10 nana kernel: BUG: kernel NULL pointer dereference, address: 0000000000000110
Jan 12 08:54:10 nana kernel: #PF: supervisor read access in kernel mode
Jan 12 08:54:10 nana kernel: #PF: error_code(0x0000) - not-present page
Jan 12 08:54:10 nana kernel: PGD 0 P4D 0
Jan 12 08:54:10 nana kernel: Oops: 0000 [#1] SMP NOPTI
Jan 12 08:54:10 nana kernel: CPU: 1 PID: 2864 Comm: lockd Not tainted 5.15.11-gentoo-x86_64 #2
Jan 12 08:54:10 nana kernel: Hardware name: HP ProLiant MicroServer, BIOS O41     07/29/2011
Jan 12 08:54:10 nana kernel: RIP: 0010:vfs_lock_file+0x5/0x30
Jan 12 08:54:10 nana kernel: Code: a3 fe ff ff 4d 89 e1 e9 a4 fd ff ff 66 0f 1f 84 00 00 00 00 00 e8 2b 0d d7 ff 48 8b 7f 20 e9 f2 f5 ff ff 66 90 e8 1b 0d d7 ff <48> 8b 47 28 49 89 d0 48 8b 80 98 00 00 00 48 85 c0 74 05 e9 43 b8
Jan 12 08:54:10 nana kernel: RSP: 0018:ffff9d3640997c80 EFLAGS: 00010246
Jan 12 08:54:10 nana kernel: RAX: 7fffffffffffffff RBX: 00000000000000e8 RCX: 0000000000000000
Jan 12 08:54:10 nana kernel: RDX: ffff9d3640997c88 RSI: 0000000000000006 RDI: 00000000000000e8
Jan 12 08:54:10 nana kernel: RBP: ffff8b754767b400 R08: ffff8b7549dcf000 R09: ffff8b754bef1a00
Jan 12 08:54:10 nana kernel: R10: 0000000000000000 R11: 000000000000f000 R12: ffffffff9c34bfd0
Jan 12 08:54:10 nana kernel: R13: ffff8b76a518e7a8 R14: ffff8b7549d60c10 R15: ffff8b754767b400
Jan 12 08:54:10 nana kernel: FS:  0000000000000000(0000) GS:ffff8b7860500000(0000) knlGS:0000000000000000
Jan 12 08:54:10 nana kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 12 08:54:10 nana kernel: CR2: 0000000000000110 CR3: 000000010ffd4000 CR4: 00000000000006e0
Jan 12 08:54:10 nana kernel: Call Trace:
Jan 12 08:54:10 nana kernel:  <TASK>
Jan 12 08:54:10 nana kernel:  nlm_unlock_files+0x6e/0xb0
Jan 12 08:54:10 nana kernel:  ? _raw_spin_lock+0x5/0x20
Jan 12 08:54:10 nana kernel:  ? trace_hardirqs_on+0x35/0xd0
Jan 12 08:54:10 nana kernel:  ? __local_bh_enable_ip+0x44/0x80
Jan 12 08:54:10 nana kernel:  ? trace_hardirqs_on+0x35/0xd0
Jan 12 08:54:10 nana kernel:  ? mutex_lock+0x5/0x20
Jan 12 08:54:10 nana kernel:  ? nlmsvc_traverse_blocks+0x36/0x120
Jan 12 08:54:10 nana kernel:  nlm_traverse_files+0x14d/0x280
Jan 12 08:54:10 nana kernel:  nlmsvc_free_host_resources+0x17/0x30
Jan 12 08:54:10 nana kernel:  nlm_host_rebooted+0x23/0x90
Jan 12 08:54:10 nana kernel:  nlmsvc_proc_sm_notify+0xa1/0x110
Jan 12 08:54:10 nana kernel:  ? trace_hardirqs_on+0x35/0xd0
Jan 12 08:54:10 nana kernel:  ? nlmsvc_decode_reboot+0x95/0xc0
Jan 12 08:54:10 nana kernel:  nlmsvc_dispatch+0x89/0x180
Jan 12 08:54:10 nana kernel:  svc_process_common+0x399/0x640
Jan 12 08:54:10 nana kernel:  ? lockd_inet6addr_event+0xf0/0xf0
Jan 12 08:54:10 nana kernel:  ? set_grace_period+0xb0/0xb0
Jan 12 08:54:10 nana kernel:  svc_process+0xca/0xe0
Jan 12 08:54:10 nana kernel:  lockd+0x8f/0x130
Jan 12 08:54:10 nana kernel:  ? set_grace_period+0xb0/0xb0
Jan 12 08:54:10 nana kernel:  kthread+0x10e/0x130
Jan 12 08:54:10 nana kernel:  ? set_kthread_struct+0x40/0x40
Jan 12 08:54:10 nana kernel:  ret_from_fork+0x22/0x30
Jan 12 08:54:10 nana kernel:  </TASK>
Jan 12 08:54:10 nana kernel: Modules linked in: ecb xts dm_crypt dm_mod tun bridge stp llc ipt_REJECT nf_reject_ipv4 xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter iptable_mangle iptable_raw ip_tables radeon i2c_algo_bit drm_ttm_helper ttm kvm_amd drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt w83795 kvm fb_sys_fops uas cfbcopyarea drm usb_storage irqbypass tg3 drm_panel_orientation_quirks pata_atiixp libphy pcspkr i2c_piix4
Jan 12 08:54:10 nana kernel: CR2: 0000000000000110
Jan 12 08:54:10 nana kernel: ---[ end trace 6ac413c9433d0bd8 ]---
Jan 12 08:54:10 nana kernel: RIP: 0010:vfs_lock_file+0x5/0x30
Jan 12 08:54:10 nana kernel: Code: a3 fe ff ff 4d 89 e1 e9 a4 fd ff ff 66 0f 1f 84 00 00 00 00 00 e8 2b 0d d7 ff 48 8b 7f 20 e9 f2 f5 ff ff 66 90 e8 1b 0d d7 ff <48> 8b 47 28 49 89 d0 48 8b 80 98 00 00 00 48 85 c0 74 05 e9 43 b8
Jan 12 08:54:10 nana kernel: RSP: 0018:ffff9d3640997c80 EFLAGS: 00010246
Jan 12 08:54:10 nana kernel: RAX: 7fffffffffffffff RBX: 00000000000000e8 RCX: 0000000000000000
Jan 12 08:54:10 nana kernel: RDX: ffff9d3640997c88 RSI: 0000000000000006 RDI: 00000000000000e8
Jan 12 08:54:10 nana kernel: RBP: ffff8b754767b400 R08: ffff8b7549dcf000 R09: ffff8b754bef1a00
Jan 12 08:54:10 nana kernel: R10: 0000000000000000 R11: 000000000000f000 R12: ffffffff9c34bfd0
Jan 12 08:54:10 nana kernel: R13: ffff8b76a518e7a8 R14: ffff8b7549d60c10 R15: ffff8b754767b400
Jan 12 08:54:10 nana kernel: FS:  0000000000000000(0000) GS:ffff8b7860500000(0000) knlGS:0000000000000000
Jan 12 08:54:10 nana kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 12 08:54:10 nana kernel: CR2: 0000000000000110 CR3: 000000010ffd4000 CR4: 00000000000006e0
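
A note on reading the oops above: the faulting address can be cross-checked against the register dump. The bytes marked `<48> 8b 47 28` in the Code line decode to `mov 0x28(%rdi),%rax`, and on x86-64 RDI carries the first argument, which for vfs_lock_file is the file pointer. The Python sketch below (the helper name `register` is made up for illustration) checks that RDI + 0x28 equals CR2. If it does, that would be consistent with lockd handing vfs_lock_file a near-NULL file pointer, but that reading is an assumption, not a confirmed diagnosis.

```python
import re

# Lines copied from the oops above (timestamp and host prefix dropped).
OOPS = """\
BUG: kernel NULL pointer dereference, address: 0000000000000110
RIP: 0010:vfs_lock_file+0x5/0x30
RDX: ffff9d3640997c88 RSI: 0000000000000006 RDI: 00000000000000e8
CR2: 0000000000000110 CR3: 000000010ffd4000 CR4: 00000000000006e0
"""


def register(name: str, text: str) -> int:
    """Return the value of a register printed as 'NAME: <16 hex digits>'."""
    match = re.search(rf"\b{name}: ([0-9a-f]{{16}})\b", text)
    if match is None:
        raise ValueError(f"register {name} not found")
    return int(match.group(1), 16)


# Last byte of the faulting instruction <48> 8b 47 28: an 8-bit displacement
# that the CPU adds to RDI to form the address it tried to read.
DISPLACEMENT = 0x28

rdi = register("RDI", OOPS)
cr2 = register("CR2", OOPS)

print(f"RDI         = {rdi:#x}")
print(f"RDI + 0x28  = {rdi + DISPLACEMENT:#x}")
print(f"CR2 (fault) = {cr2:#x}")
print("match" if rdi + DISPLACEMENT == cr2 else "no match")
```

With the values from this log, 0xe8 + 0x28 gives 0x110, which is the address reported in CR2.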


The system doesn't crash, but the NFS server certainly becomes unstable. A reboot proceeds until it times out stopping the nfs/rpc processes, and I then have to power off the system completely.
Today I booted the previous kernel (5.10.x) and it seems to be working well.
It's worth mentioning that I'm using the NFS server for a number of things, including a Time Machine HFS+ sparsebundle for a Mac (I know, I know, but it's for work, so it isn't like I have options). Triggering Time Machine operations seems to cause this breakage, while other Gentoo Linux clients play nicely.

Obligatory `emerge --info`: https://cloud.gagv.org.uk/s/TnfytcBARAifdkx
Obligatory kernel config: https://cloud.gagv.org.uk/s/FDT9gYfP5zbADb9

A brief internet search didn't yield many results about this. I admit it was very brief; I used the time to post here instead, in case somebody knows something I don't or can guide me a bit.

Thanks!


Gabriel
mike155
Advocate


Joined: 17 Sep 2010
Posts: 4438
Location: Frankfurt, Germany

Posted: Wed Jan 12, 2022 3:32 pm    Post subject:

It seems you use a Gentoo kernel. You could open a bug at https://bugs.gentoo.org. Maybe one of the Gentoo kernel developers will be able to help you.

At least for testing purposes, I would switch to the latest vanilla kernel 5.15.14. If you see the error there, you could ask one of the kernel maintainers for help. They are interested in bugs like the one you reported and they will be able to help you.


Last edited by mike155 on Wed Jan 12, 2022 3:37 pm; edited 1 time in total
alamahant
Advocate


Joined: 23 Mar 2019
Posts: 3882

Posted: Wed Jan 12, 2022 3:34 pm    Post subject:

Do you see something similar in your config
Code:

CONFIG_NFS_FS=m
CONFIG_NFS_V2=m
CONFIG_NFS_V3=m
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=m
CONFIG_NFS_SWAP=y
CONFIG_NFS_V4_1=y
CONFIG_NFS_V4_2=y
CONFIG_PNFS_FILE_LAYOUT=m
CONFIG_PNFS_BLOCK=m
CONFIG_PNFS_FLEXFILE_LAYOUT=m
CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
CONFIG_NFS_V4_1_MIGRATION=y
CONFIG_NFS_V4_SECURITY_LABEL=y
CONFIG_NFS_FSCACHE=y
CONFIG_NFS_USE_KERNEL_DNS=y

CONFIG_NFSD=m
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
CONFIG_NFSD_PNFS=y
CONFIG_NFSD_BLOCKLAYOUT=y
CONFIG_NFSD_SCSILAYOUT=y
CONFIG_NFSD_V4_2_INTER_SSC=y
CONFIG_NFSD_V4_SECURITY_LABEL=y
CONFIG_NFS_ACL_SUPPORT=m
CONFIG_NFS_COMMON=y
CONFIG_NFS_V4_2_SSC_HELPER=y

?
What does grep -i nfs /var/log/messages show?
_________________
:)
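
On the `grep -i nfs /var/log/messages` suggestion above: the oops lines posted earlier mention lockd and nlm_* functions but never the literal string "nfs", so that grep alone would skip them. Here is a minimal Python sketch of a wider filter. The keyword list and the function name `nfs_related` are assumptions for illustration, and the sample lines are copied from the log in this thread.

```python
import re

# Assumed keyword list: NFS itself plus the lock manager (lockd / NLM) and RPC.
PATTERN = re.compile(r"nfs|lockd|nlm|rpc", re.IGNORECASE)

# Three lines taken from the oops posted in this thread.
SAMPLE = [
    "Jan 12 08:54:10 nana kernel: CPU: 1 PID: 2864 Comm: lockd Not tainted 5.15.11-gentoo-x86_64 #2",
    "Jan 12 08:54:10 nana kernel:  nlm_unlock_files+0x6e/0xb0",
    "Jan 12 08:54:10 nana kernel:  kthread+0x10e/0x130",
]


def nfs_related(lines):
    """Yield the log lines that mention NFS, lockd, NLM or RPC."""
    for line in lines:
        if PATTERN.search(line):
            yield line.rstrip("\n")


for line in nfs_related(SAMPLE):
    print(line)
```

To run it against the real log, pass an open file object for /var/log/messages to `nfs_related` instead of `SAMPLE`.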
gabrielg
Tux's lil' helper


Joined: 16 Nov 2012
Posts: 134

Posted: Sun Jan 30, 2022 11:38 am    Post subject:

mike155: thanks for the advice. I tried that, but it did not yield good results. In fact, I found another issue that makes it harder to fix this one: https://bugs.gentoo.org/show_bug.cgi?id=832367

alamahant: my config is very similar:
Code:

CONFIG_KERNFS=y
CONFIG_NFS_FS=m
CONFIG_NFS_V2=m
CONFIG_NFS_V3=m
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=m
CONFIG_NFS_V4_1=y
CONFIG_NFS_V4_2=y
CONFIG_PNFS_FILE_LAYOUT=m
CONFIG_PNFS_BLOCK=m
CONFIG_PNFS_FLEXFILE_LAYOUT=m
CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
CONFIG_NFS_V4_SECURITY_LABEL=y
CONFIG_NFS_FSCACHE=y
CONFIG_NFS_USE_KERNEL_DNS=y
CONFIG_NFSD=y
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
CONFIG_NFSD_PNFS=y
CONFIG_NFSD_BLOCKLAYOUT=y
CONFIG_NFSD_SCSILAYOUT=y
CONFIG_NFSD_FLEXFILELAYOUT=y
CONFIG_NFS_ACL_SUPPORT=y
CONFIG_NFS_COMMON=y


I would prefer to keep things like migration and SELinux off, unless you really believe they could be the issue. I am trying to build the daemon as a module now; you never know...
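
For comparing the two option lists in this thread without eyeballing them, here is a minimal Python sketch. The function names `parse_config` and `diff_configs` are made up for illustration, and the two strings are short excerpts from the lists posted above, not the full configs.

```python
def parse_config(text):
    """Map option names to values, skipping blank lines and comments."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        options[name] = value
    return options


def diff_configs(mine, theirs):
    """Return {option: (mine, theirs)} for options that differ; None means unset."""
    names = sorted(set(mine) | set(theirs))
    return {
        name: (mine.get(name), theirs.get(name))
        for name in names
        if mine.get(name) != theirs.get(name)
    }


# Excerpts only; paste the full lists from the posts for a complete comparison.
MINE = """\
CONFIG_NFSD=y
CONFIG_NFS_ACL_SUPPORT=y
CONFIG_NFSD_FLEXFILELAYOUT=y
CONFIG_NFS_COMMON=y
"""
THEIRS = """\
CONFIG_NFSD=m
CONFIG_NFS_ACL_SUPPORT=m
CONFIG_NFS_V4_1_MIGRATION=y
CONFIG_NFS_COMMON=y
"""

for name, (a, b) in diff_configs(parse_config(MINE), parse_config(THEIRS)).items():
    print(f"{name}: mine={a} theirs={b}")
```

Options that are identical on both sides, such as CONFIG_NFS_COMMON here, are left out of the result.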

Quote:
What does grep -i nfs /var/log/messages show?

I will get that next time I try this, which might be a while (see bug above).
gabrielg
Tux's lil' helper


Joined: 16 Nov 2012
Posts: 134

Posted: Sun Feb 13, 2022 9:32 am    Post subject: [SOLVED] gentoo-sources-5.15.11 (LTS/stable) shows some nfs

Kernel 5.15.19 doesn't present this issue, so I'll mark this as solved.
As a side note, building NFSD as a module caused me a few problems: it ignored the lockd ports set via sysctl until I restarted the module, so I went back to compiling it into the kernel for simplicity :)
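
A possible explanation for the port problem, offered as a guess only: lockd's sysctl keys (fs.nfs.nlm_tcpport and fs.nfs.nlm_udpport) exist only while lockd is built in or loaded, so settings applied before the module is loaded have nothing to attach to. The Python sketch below shows one way to check what is currently set; the function name `lockd_ports` is made up for illustration.

```python
from pathlib import Path

# Location of the fs.nfs.* sysctl keys under /proc.
SYSCTL_DIR = Path("/proc/sys/fs/nfs")
KEYS = ("nlm_tcpport", "nlm_udpport")


def lockd_ports(base=SYSCTL_DIR):
    """Return {key: port or None}; None means the key could not be read."""
    ports = {}
    for key in KEYS:
        try:
            ports[key] = int((Path(base) / key).read_text().strip())
        except OSError:
            # Key missing (lockd not loaded?) or not readable.
            ports[key] = None
    return ports


if __name__ == "__main__":
    for key, port in lockd_ports().items():
        state = "not available (lockd not loaded?)" if port is None else str(port)
        print(f"fs.nfs.{key} = {state}")
```

Running it once before and once after the module is loaded should show whether the keys were there when the sysctl settings were applied.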
gabrielg
Tux's lil' helper


Joined: 16 Nov 2012
Posts: 134

Posted: Tue Feb 15, 2022 10:22 pm    Post subject:

Correction: I've seen the issue again :) Opened bug: https://bugs.gentoo.org/833438