ozcircuit n00b
Joined: 03 Sep 2023 Posts: 34
|
Posted: Fri Apr 05, 2024 9:25 am Post subject: Kernel BUG 6.6.21 |
|
|
Hi all,
It looks like some misconfiguration in /etc/ntp.conf causes the kernel to panic when switching to runlevel 0:
Code: | user@localhost ~ $ cat /etc/ntp.conf
server 0.gentoo.pool.ntp.org
server 1.gentoo.pool.ntp.org
server 2.gentoo.pool.ntp.org
server 3.gentoo.pool.ntp.org
server 127.127.1.0
fudge 127.127.1.0 stratum 10
# Default configuration:
# - Allow only time queries, at a limited rate, sending KoD when in excess.
restrict default nomodify nopeer noquery limited kod
restrict 127.0.0.1
#disable monitor
Apr 5 10:30:18 localhost init: Switching to runlevel: 0
Apr 5 10:30:18 localhost init: Trying to re-exec init
Apr 5 10:30:18 localhost start-stop-daemon: Will stop /usr/libexec/openrc-settingsd
Apr 5 10:30:18 localhost /etc/init.d/openrc-settingsd[13471]: start-stop-daemon: no matching processes found
Apr 5 10:30:18 localhost start-stop-daemon: Will stop /usr/sbin/nullmailer-send
Apr 5 10:30:18 localhost start-stop-daemon: Will stop PID 4332
Apr 5 10:30:18 localhost start-stop-daemon: Sending signal 15 to PID 4332
Apr 5 10:30:18 localhost start-stop-daemon: Will stop /usr/sbin/ntpd
Apr 5 10:30:18 localhost start-stop-daemon: Will stop PID 4304
Apr 5 10:30:18 localhost start-stop-daemon: Sending signal 15 to PID 4304
Apr 5 10:30:18 localhost ntpd[4304]: ntpd exiting on signal 15 (Terminated)
Apr 5 10:30:18 localhost ntpd[4304]: 127.127.1.0 local addr 127.0.0.1 -> <null>
Apr 5 10:30:17 localhost kernel: BUG: kernel NULL pointer dereference, address: 000000000000093f
Apr 5 10:30:17 localhost kernel: #PF: supervisor read access in kernel mode
Apr 5 10:30:17 localhost kernel: #PF: error_code(0x0000) - not-present page
Apr 5 10:30:17 localhost kernel: PGD 0 P4D 0
Apr 5 10:30:17 localhost kernel: Oops: 0000 [#1] PREEMPT SMP PTI
Apr 5 10:30:17 localhost kernel: CPU: 0 PID: 13533 Comm: rmdir Tainted: P O 6.6.21-gentoo-x86_64 #3
Apr 5 10:30:17 localhost kernel: Hardware name:
Apr 5 10:30:17 localhost kernel: RIP: 0010:rb_first+0xb/0x30
Apr 5 10:30:17 localhost kernel: Code: 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 8b 07 48 85 c0 74 18 48 89 c2 <48> 8b 40 10 48 85 c0 75 f4 48 89 d0 31 d2 31 ff c3 cc cc cc cc 31
Apr 5 10:30:17 localhost kernel: RSP: 0018:ffffc90001c0fcd0 EFLAGS: 00010202
Apr 5 10:30:17 localhost kernel: RAX: 000000000000092f RBX: ffff88817dd6d180 RCX: 0000000000000000
Apr 5 10:30:17 localhost kernel: RDX: 000000000000092f RSI: 0000000000000000 RDI: ffff888105ad8718
Apr 5 10:30:17 localhost kernel: RBP: ffff88817dd6d480 R08: 0000000000000000 R09: 0000000000000000
Apr 5 10:30:17 localhost kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Apr 5 10:30:17 localhost kernel: R13: ffff888105ad8718 R14: 0000000000000002 R15: 0000000000000000
Apr 5 10:30:17 localhost kernel: FS: 00007f1541e09740(0000) GS:ffff888226c00000(0000) knlGS:0000000000000000
Apr 5 10:30:17 localhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 5 10:30:17 localhost kernel: CR2: 000000000000093f CR3: 00000001843b4005 CR4: 00000000001706f0
Apr 5 10:30:17 localhost kernel: Call Trace:
Apr 5 10:30:17 localhost kernel: <TASK>
Apr 5 10:30:17 localhost kernel: ? __die+0x1f/0x70
Apr 5 10:30:17 localhost kernel: ? page_fault_oops+0x17d/0x4b0
Apr 5 10:30:17 localhost kernel: ? exc_page_fault+0x7b/0x190
Apr 5 10:30:17 localhost kernel: ? asm_exc_page_fault+0x22/0x30
Apr 5 10:30:17 localhost kernel: ? rb_first+0xb/0x30
Apr 5 10:30:17 localhost kernel: simple_xattrs_free+0x25/0x90
Apr 5 10:30:17 localhost kernel: kernfs_put.part.0+0x60/0x150
Apr 5 10:30:17 localhost kernel: evict+0xc4/0x1c0
Apr 5 10:30:17 localhost kernel: __dentry_kill+0xd3/0x170
Apr 5 10:30:17 localhost kernel: shrink_dentry_list+0x6e/0x140
Apr 5 10:30:17 localhost kernel: shrink_dcache_parent+0xcc/0x120
Apr 5 10:30:17 localhost kernel: vfs_rmdir+0xac/0x230
Apr 5 10:30:17 localhost kernel: do_rmdir+0x17f/0x1c0
Apr 5 10:30:17 localhost kernel: __x64_sys_rmdir+0x3e/0x80
Apr 5 10:30:17 localhost kernel: do_syscall_64+0x5c/0x90
Apr 5 10:30:17 localhost kernel: ? __count_memcg_events+0x41/0xa0
Apr 5 10:30:17 localhost kernel: ? count_memcg_events.constprop.0+0x26/0x50
Apr 5 10:30:17 localhost kernel: ? handle_mm_fault+0x9e/0x370
Apr 5 10:30:17 localhost kernel: ? do_user_addr_fault+0x31b/0x660
Apr 5 10:30:17 localhost kernel: ? exc_page_fault+0x7b/0x190
Apr 5 10:30:17 localhost kernel: entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Apr 5 10:30:17 localhost kernel: RIP: 0033:0x7f1541efa42b
Apr 5 10:30:17 localhost kernel: Code: f0 ff ff 73 01 c3 48 8b 0d fa a9 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 54 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 c9 a9 0c 00 f7 d8
Apr 5 10:30:17 localhost kernel: RSP: 002b:00007ffe74931f28 EFLAGS: 00000246 ORIG_RAX: 0000000000000054
Apr 5 10:30:17 localhost kernel: RAX: ffffffffffffffda RBX: 00007ffe74932128 RCX: 00007f1541efa42b
Apr 5 10:30:17 localhost kernel: RDX: 00007f1541fcc820 RSI: 00007ffe74932976 RDI: 00007ffe74932976
Apr 5 10:30:17 localhost kernel: RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000
Apr 5 10:30:17 localhost kernel: R10: 0000000000000007 R11: 0000000000000246 R12: 000055f1c30ee101
Apr 5 10:30:17 localhost kernel: R13: 0000000000000001 R14: 00007ffe74932976 R15: 000055f1c30f1a98
Apr 5 10:30:17 localhost kernel: </TASK>
Apr 5 10:30:17 localhost kernel: Modules linked in: cmac ccm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device 8021q garp mrp stp llc ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog xt_limit xt_addrtype xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip6table_filter ip6_tables iptable_filter ip_tables nvidia_drm(PO) nvidia_modeset(PO) x86_pkg_temp_thermal intel_powerclamp nvidia(PO) coretemp mt76x0e mt76x0_common kvm_intel snd_hda_codec_realtek mt76x02_lib mt76 snd_hda_codec_hdmi snd_hda_codec_generic kvm iTCO_wdt snd_hda_intel iTCO_vendor_support mac80211 snd_intel_dspcfg at24 irqbypass snd_hda_codec regmap_i2c cfg80211 crct10dif_pclmul intel_rapl_msr ghash_clmulni_intel snd_hwdep sha512_ssse3 asus_nb_wmi asus_wmi snd_hda_core mei_hdcp snd_pcm rapl intel_cstate ledtrig_audio platform_profile rfkill intel_uncore joydev pcspkr libarc4 snd_timer snd processor_thermal_device_pci_legacy processor_thermal_device processor_thermal_rfim processor_thermal_mbox efi_pstore serio_raw
Apr 5 10:30:17 localhost kernel: processor_thermal_rapl intel_rapl_common int3400_thermal int3402_thermal soundcore intel_pch_thermal acpi_thermal_rel intel_soc_dts_iosf int340x_thermal_zone mei_me i2c_i801 lpc_ich mei mfd_core i2c_smbus mxm_wmi asus_wireless vboxnetflt(O) vboxnetadp(O) vboxdrv(O) efivarfs ext4 mbcache jbd2 dm_crypt trusted asn1_encoder dm_mod sd_mod t10_pi sr_mod crc64_rocksoft cdrom crc64 r8169 crc32_pclmul crc32c_intel ahci realtek mdio_devres libahci libphy ehci_pci ehci_hcd xhci_pci xhci_pci_renesas xhci_hcd
Apr 5 10:30:17 localhost kernel: CR2: 000000000000093f
Apr 5 10:30:17 localhost kernel: ---[ end trace 0000000000000000 ]---
Apr 5 10:30:17 localhost kernel: pstore: backend (efi_pstore) writing error (-5)
Apr 5 10:30:17 localhost kernel: RIP: 0010:rb_first+0xb/0x30
Apr 5 10:30:17 localhost kernel: Code: 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 8b 07 48 85 c0 74 18 48 89 c2 <48> 8b 40 10 48 85 c0 75 f4 48 89 d0 31 d2 31 ff c3 cc cc cc cc 31
Apr 5 10:30:17 localhost kernel: RSP: 0018:ffffc90001c0fcd0 EFLAGS: 00010202
Apr 5 10:30:17 localhost kernel: RAX: 000000000000092f RBX: ffff88817dd6d180 RCX: 0000000000000000
Apr 5 10:30:17 localhost kernel: RDX: 000000000000092f RSI: 0000000000000000 RDI: ffff888105ad8718
Apr 5 10:30:17 localhost kernel: RBP: ffff88817dd6d480 R08: 0000000000000000 R09: 0000000000000000
Apr 5 10:30:17 localhost kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Apr 5 10:30:17 localhost kernel: R13: ffff888105ad8718 R14: 0000000000000002 R15: 0000000000000000
Apr 5 10:30:17 localhost kernel: FS: 00007f1541e09740(0000) GS:ffff888226c00000(0000) knlGS:0000000000000000
Apr 5 10:30:17 localhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 5 10:30:17 localhost kernel: CR2: 000000000000093f CR3: 00000001843b4005 CR4: 00000000001706f0 |
Thanks for your help!
[Moderator edit: added [code] tags to preserve output layout. -- pietinger] |
Hu Administrator
Joined: 06 Mar 2007 Posts: 22578
|
Posted: Fri Apr 05, 2024 12:31 pm Post subject: |
|
|
This is not a panic. This is a BUG, which is at least sometimes recoverable. If it were a panic, you would not get any log files written with the error text.
The error happened for rmdir, not ntpd, and happened one second before ntpd terminated. I see you are using proprietary and out-of-tree modules. Is the problem reproducible on an untainted kernel? Can you narrow down exactly what service change triggers this? |
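A minimal sketch of how the two observations above (which command faulted, and whether the kernel was tainted) can be read off the `CPU: ... Comm: ... Tainted: ...` line of the oops. The function name `parse_oops_header` and the flag table are illustrative assumptions. Only the two taint letters that appear in the posted log, `P` (proprietary module loaded) and `O` (out-of-tree module loaded), are decoded.

```python
import re

# Taint letters seen in the posted log. Other letters exist but are
# deliberately left undecoded in this sketch.
TAINT_FLAGS = {
    "P": "proprietary module loaded",
    "O": "out-of-tree module loaded",
}

# Matches e.g.
# "CPU: 0 PID: 13533 Comm: rmdir Tainted: P O 6.6.21-gentoo-x86_64 #3"
# and also the "Not tainted" form printed by an untainted kernel.
OOPS_HEADER = re.compile(
    r"CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) Comm: (?P<comm>\S+) "
    r"(?:Tainted: (?P<taint>(?:[A-Z] +)+)|Not tainted )"
    r"(?P<release>\S+)"
)


def parse_oops_header(line):
    """Return pid, command, kernel release and decoded taint flags, or None."""
    m = OOPS_HEADER.search(line)
    if m is None:
        return None
    letters = (m.group("taint") or "").split()
    return {
        "pid": int(m.group("pid")),
        "comm": m.group("comm"),
        "release": m.group("release"),
        "taint": [TAINT_FLAGS.get(c, "not decoded in this sketch") for c in letters],
    }
```

Fed the header line from the log in the first post, this reports `rmdir` as the faulting command and both taint flags, which matches the reading given above.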
ozcircuit n00b
Joined: 03 Sep 2023 Posts: 34
|
Posted: Fri Apr 05, 2024 2:01 pm Post subject: Kernel BUG 6.6.21 |
|
|
The Nvidia proprietary drivers are blacklisted from now on. I will try to reproduce the problem, so stay tuned to find out which service causes it. |
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2167
|
Posted: Fri Apr 05, 2024 3:11 pm Post subject: |
|
|
Hu wrote: | ... The error happened for rmdir, not ntpd ... |
Be interesting to know what filesystem is in use. _________________ Greybeard |
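One way to approach the filesystem question from the log itself is to list the functions on the call trace. The sketch below is an assumption-laden helper (the name `reliable_frames` and the regular expression are made up for illustration). It keeps only frames without the leading `?`, which marks entries the unwinder is not sure about. In the posted trace the surviving frames include `simple_xattrs_free` and `kernfs_put.part.0`, and kernfs is the in-kernel pseudo-filesystem layer behind sysfs and cgroup directories, so the trace alone does not have to point at the root filesystem.

```python
import re

# One call-trace entry looks like "kernel: vfs_rmdir+0xac/0x230";
# uncertain entries carry a "? " prefix before the symbol name.
FRAME = re.compile(
    r"kernel:\s+(?P<guess>\? )?(?P<func>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(lines):
    """Return symbol names from call-trace lines, skipping '?' frames."""
    frames = []
    for line in lines:
        m = FRAME.search(line)
        if m and not m.group("guess"):
            frames.append(m.group("func"))
    return frames
```

Lines such as the `RIP:` register dump do not match the pattern and are skipped, so the whole syslog excerpt can be passed in unchanged.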
ozcircuit n00b
Joined: 03 Sep 2023 Posts: 34
|
Posted: Fri Apr 05, 2024 3:23 pm Post subject: Kernel BUG 6.6.21 |
|
|
Ext4. Switching back to nouveau. |
ozcircuit n00b
Joined: 03 Sep 2023 Posts: 34
|
Posted: Mon Apr 08, 2024 6:39 am Post subject: Kernel BUG 6.6.21 |
|
|
Looks like the issue was triggered by a popup window with the faulty nvidia drivers. |
ozcircuit n00b
Joined: 03 Sep 2023 Posts: 34
|
Posted: Sun Apr 21, 2024 6:59 am Post subject: [Solved] Kernel BUG 6.6.21 |
|
|
Hi,
The issue has disappeared. However, the kernel log, right after the nouveau driver messages, shows "module with unavailable key is rejected" twice. The kernel is secure booted and the kernel modules are signed.
Is there a way to find out which modules those are?
[ 38.992570] nouveau 0000:04:00.0: fb: 2048 MiB DDR3
[ 38.992589] nouveau 0000:04:00.0: bus: MMIO read of 00000000 FAULT at 6013d4 [ PRIVRING ]
[ 40.088537] nouveau 0000:04:00.0: DRM: VRAM: 2048 MiB
[ 40.088540] nouveau 0000:04:00.0: DRM: GART: 1048576 MiB
[ 40.088543] nouveau 0000:04:00.0: DRM: Pointer to TMDS table not found
[ 40.088544] nouveau 0000:04:00.0: DRM: DCB version 4.0
[ 40.091027] nouveau 0000:04:00.0: DRM: MM: using COPY for buffer copies
[ 40.091197] [drm] Initialized nouveau 1.4.0 20120801 for 0000:04:00.0 on minor 1
[ 40.091338] nouveau 0000:04:00.0: [drm] No compatible format found
[ 40.091340] nouveau 0000:04:00.0: [drm] Cannot find any crtc or sizes
[ 41.593269] Loading of module with unavailable key is rejected
[ 41.740088] Loading of module with unavailable key is rejected
==> upgraded to 6.6.30 |
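A possible way to narrow down the question above, offered as a sketch rather than a confirmed answer: ask `modinfo -F sig_key` for the signing key ID of every installed module and look for modules whose key differs from the rest. The helper names, the `*.ko*` glob and the "most common key is the trusted one" heuristic are all assumptions. Out-of-tree modules built against an earlier kernel build are typical candidates, but the log alone does not say which two modules were rejected.

```python
import subprocess
from collections import defaultdict
from pathlib import Path


def read_sig_key(module_path):
    """Return the key ID reported by `modinfo -F sig_key`, or "" if none."""
    result = subprocess.run(
        ["modinfo", "-F", "sig_key", str(module_path)],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip()


def group_by_key(records):
    """Group (module_path, sig_key) pairs by key ID."""
    groups = defaultdict(list)
    for path, key in records:
        groups[key or "<unsigned>"].append(str(path))
    return dict(groups)


def minority_modules(groups):
    """Return modules not signed with the most common key (heuristic)."""
    if not groups:
        return []
    majority = max(groups, key=lambda k: len(groups[k]))
    return sorted(p for k, paths in groups.items() if k != majority for p in paths)


def scan(module_dir):
    """Scan a module tree, e.g. /lib/modules/<release> (path is an assumption)."""
    files = Path(module_dir).rglob("*.ko*")
    return group_by_key((p, read_sig_key(p)) for p in files)
```

Calling `minority_modules(scan(...))` on the module directory of the running kernel would print the odd ones out. Rebuilding and re-signing those modules against the current kernel is the usual next step to try.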