Gentoo Forums
Updating kernel to 5.19 causes system to hang on boot
tiredcsstudent
n00b


Joined: 09 Jun 2022
Posts: 12

Posted: Sun Aug 14, 2022 6:06 am    Post subject: Updating kernel to 5.19 causes system to hang on boot

I'm trying to update to kernel 5.19 so I can take advantage of some specific new features. Both accepting the default options for the new kernel config and choosing them manually cause the system to hang after loading the initramfs (the manual attempt turned out even worse: I didn't even get as far as loading the initramfs). Normally I would suspect that the system is initializing correctly and that something with the framebuffer is broken, but I had no issue with this before. I know a lot of DRM code was changed; is there maybe an option I'm overlooking that I need to enable now? I guess I'm just asking for a bit of a sanity check here. I'm using nvidia drivers with the kernel-open USE flag, and here's the current .config: https://pastebin.com/kNJsvuHd
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54098
Location: 56N 3W

Posted: Sun Aug 14, 2022 9:24 am

tiredcsstudent,

The "loading kernel" and "loading initrd" messages both come from GRUB, before the kernel is started.
If the initrd message is missing, that's OK as long as you don't need an initrd to boot.

To check your kernel config we need your
Code:
lspci -nnk
to check it against and we need to know the filesystems that you use.

Also, do you use Device Mapper, Software RAID, or an encrypted root?
Any/all of those mandate an initrd.
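
A minimal sketch of how the details asked for above could be collected. The helper name `root_fstype` is made up for illustration, and reading `/proc/mounts` is only one of several ways to find the root filesystem type:

```shell
# Print the filesystem type of the root mount.
# Reads a mounts table in /proc/mounts format (default: /proc/mounts).
# The last entry for "/" wins, because later mounts shadow earlier ones.
root_fstype() {
    awk '$2 == "/" { t = $3 } END { if (t != "") print t }' "${1:-/proc/mounts}"
}
```

Calling `root_fstype` with no argument prints the root filesystem type of the running system, and `lspci -nnk > lspci.txt` saves the PCI listing to a file for posting. Device Mapper, software RAID, or encryption would typically show up in the TYPE column of `lsblk` as lvm, raid, or crypt entries.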
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
tiredcsstudent
n00b


Joined: 09 Jun 2022
Posts: 12

Posted: Sun Aug 14, 2022 6:39 pm

I do not use any of those at the moment. My filesystem is ext4, and here's the lspci output:
Code:
00:00.0 PCI bridge [0604]: Intel Corporation Device [8086:4c43] (rev 01)
   Kernel driver in use: icl_uncore
00:01.0 PCI bridge [0604]: Intel Corporation Device [8086:4c01] (rev 01)
   Kernel driver in use: pcieport
00:02.0 VGA compatible controller [0300]: Intel Corporation RocketLake-S GT1 [UHD Graphics 750] [8086:4c8a] (rev 04)
   DeviceName: Onboard - Video
   Subsystem: Dell Device [1028:09c5]
00:04.0 Signal processing controller [1180]: Intel Corporation Device [8086:4c03] (rev 01)
   DeviceName: Onboard - Other
   Subsystem: Dell Device [1028:09c5]
00:08.0 System peripheral [0880]: Intel Corporation Device [8086:4c11] (rev 01)
   DeviceName: Onboard - Other
   Subsystem: Dell Device [1028:09c5]
00:12.0 Signal processing controller [1180]: Intel Corporation Comet Lake PCH Thermal Controller [8086:06f9]
   DeviceName: Onboard - Other
   Subsystem: Dell Device [1028:09c5]
00:14.0 USB controller [0c03]: Intel Corporation Comet Lake USB 3.1 xHCI Host Controller [8086:06ed]
   DeviceName: Onboard - Other
   Subsystem: Dell Device [1028:09c5]
   Kernel driver in use: xhci_hcd
00:14.2 RAM memory [0500]: Intel Corporation Comet Lake PCH Shared SRAM [8086:06ef]
   DeviceName: Onboard - Other
   Subsystem: Dell Device [1028:09c5]
00:14.3 Network controller [0280]: Intel Corporation Comet Lake PCH CNVi WiFi [8086:06f0]
   DeviceName: Onboard - Ethernet
   Subsystem: Rivet Networks Device [1a56:1652]
00:15.0 Serial bus controller [0c80]: Intel Corporation Comet Lake PCH Serial IO I2C Controller #0 [8086:06e8]
   DeviceName: Onboard - Other
   Subsystem: Dell Device [1028:09c5]
00:16.0 Communication controller [0780]: Intel Corporation Comet Lake HECI Controller [8086:06e0]
   DeviceName: Onboard - Other
   Subsystem: Dell Device [1028:09c5]
00:17.0 SATA controller [0106]: Intel Corporation Comet Lake SATA AHCI Controller [8086:06d2]
   DeviceName: Onboard - SATA
   Subsystem: Dell Device [1028:09c5]
   Kernel driver in use: ahci
00:1b.0 PCI bridge [0604]: Intel Corporation Comet Lake PCI Express Root Port #21 [8086:06ac] (rev f0)
   Subsystem: Intel Corporation Device [8086:7270]
   Kernel driver in use: pcieport
00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:06bc] (rev f0)
   Subsystem: Dell Device [1028:09c5]
   Kernel driver in use: pcieport
00:1f.0 ISA bridge [0601]: Intel Corporation H470 Chipset LPC/eSPI Controller [8086:0684]
   DeviceName: Onboard - Other
   Subsystem: Dell Device [1028:09c5]
00:1f.3 Audio device [0403]: Intel Corporation Device [8086:f1c8]
   DeviceName: Onboard - Sound
   Subsystem: Dell Device [1028:09c5]
00:1f.4 SMBus [0c05]: Intel Corporation Comet Lake PCH SMBus Controller [8086:06a3]
   DeviceName: Onboard - Other
   Subsystem: Dell Device [1028:09c5]
00:1f.5 Serial bus controller [0c80]: Intel Corporation Comet Lake PCH SPI Controller [8086:06a4]
   DeviceName: Onboard - Other
   Subsystem: Dell Device [1028:09c5]
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3060 Ti Lite Hash Rate] [10de:2489] (rev a1)
   Subsystem: Dell Device [1028:c976]
02:00.1 Audio device [0403]: NVIDIA Corporation GA104 High Definition Audio Controller [10de:228b] (rev a1)
   Subsystem: Dell Device [1028:c976]
03:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD PC SN810 / Black SN850 NVMe SSD [15b7:5011] (rev 01)
   Subsystem: Sandisk Corp WD PC SN810 / Black SN850 NVMe SSD [15b7:5011]
   Kernel driver in use: nvme
04:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. Device [10ec:2600] (rev 21)
   Subsystem: Rivet Networks Device [1a56:2600]
   Kernel driver in use: r8169
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54098
Location: 56N 3W

Posted: Sun Aug 14, 2022 8:30 pm

tiredcsstudent,

Code:
CONFIG_EFI_PARTITION=y
CONFIG_NVME_CORE=y
CONFIG_BLK_DEV_NVME=y

CONFIG_DRM_I915=y
CONFIG_INTEL_PCH_THERMAL=m

CONFIG_WLAN_VENDOR_INTEL=y
That's all good.
Code:
# CONFIG_MFD_INTEL_LPSS_PCI is not set
That option needs to be enabled; it is required for your
Code:
 00:15.0 Serial bus controller [0c80]: Intel Corporation Comet Lake PCH Serial IO I2C Controller #0 [8086:06e8]

Code:
# CONFIG_INTEL_MEI_ME is not set
Leave that off unless you are in a corporate environment, where IT want remote control.


Code:
CONFIG_SND_HDA_INTEL=y
and all the CODECS too. Good.
Code:
CONFIG_I2C_I801=y
CONFIG_EXT4_FS=y
CONFIG_EXT4_USE_FOR_EXT2=y

CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y

# CONFIG_DRM_SIMPLEDRM is not set
CONFIG_FB_VESA=y
CONFIG_FB_EFI=y
CONFIG_FB_SIMPLE=y


That all looks good except one.


Turn off all of
Code:
CONFIG_PCCARD=y
CONFIG_PCMCIA=y
CONFIG_PCMCIA_LOAD_CIS=y
CONFIG_CARDBUS=y

#
# PC-card bridges
#
CONFIG_YENTA=y
CONFIG_YENTA_O2=y
CONFIG_YENTA_RICOH=y
CONFIG_YENTA_TI=y
CONFIG_YENTA_ENE_TUNE=y
CONFIG_YENTA_TOSHIBA=y
# CONFIG_PD6729 is not set
# CONFIG_I82092 is not set
CONFIG_PCCARD_NONSTATIC=y

The hardware went away about 15 years ago.
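
For anyone repeating this kind of review, here is a rough sketch of a helper that reports the state of individual symbols in a .config. The function name `kconfig_state` is invented here; it relies only on the `CONFIG_FOO=y` and `# CONFIG_FOO is not set` line formats shown above:

```shell
# Report the state of each named symbol in a kernel .config file.
# Usage: kconfig_state CONFIG_FILE SYMBOL...
kconfig_state() {
    cfg=$1
    shift
    for sym in "$@"; do
        if line=$(grep -E "^${sym}=" "$cfg"); then
            # Symbol is set (=y, =m, or a value): print the line as-is.
            echo "$line"
        elif grep -Eq "^# ?${sym} is not set" "$cfg"; then
            # Symbol is known to this config but disabled.
            echo "${sym} is not set"
        else
            # Symbol does not appear in the file at all.
            echo "${sym} missing"
        fi
    done
}
```

For example, `kconfig_state .config CONFIG_MFD_INTEL_LPSS_PCI CONFIG_YENTA` prints one line per symbol.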

I think it should boot and show you a console. While you are in the chroot, can you set up sshd to start at boot?
You may be able to log in from another system and grab dmesg?
_________________
Regards,

NeddySeagoon

tiredcsstudent
n00b


Joined: 09 Jun 2022
Posts: 12

Posted: Sun Aug 14, 2022 9:58 pm

No dice on the console; it still hangs after the initramfs. I can ssh into it though, and I got the dmesg output. I already have a suspicion of what might be the problem:
Code:
NVRM objClInitPcieChipset: *** Chipset Setup Function Error!


Here's the dmesg output: https://pastebin.com/UnDC45jf
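
A small sketch for pulling the graphics-related lines out of a long dmesg. The function name `gfx_dmesg` and the list of substrings are assumptions chosen for this hardware, not a complete list:

```shell
# Keep only kernel log lines that mention the graphics or console drivers.
# Reads dmesg text on stdin; matching is case-insensitive.
gfx_dmesg() {
    grep -Ei 'nvrm|nvidia|i915|drm|fbcon|efifb'
}
```

It can be used as `dmesg | gfx_dmesg` locally, or as `ssh user@host dmesg | gfx_dmesg` from another machine, where `user@host` is a placeholder.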
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54098
Location: 56N 3W

Posted: Sun Aug 14, 2022 10:09 pm

tiredcsstudent,

There is no attempt to start any console drivers there.

I was expecting to see it start on the EFI console, then hand over to inteldrmfb, which I thought might be broken, with the handover happening so fast that the EFI framebuffer didn't actually output anything.

Have you ever used a text editor on the kernel .config file?
_________________
Regards,

NeddySeagoon

tiredcsstudent
n00b


Joined: 09 Jun 2022
Posts: 12

Posted: Sun Aug 14, 2022 10:18 pm

To make any actual edits, no. To look and search through it, yes, I have.
tiredcsstudent
n00b


Joined: 09 Jun 2022
Posts: 12

Posted: Mon Aug 15, 2022 3:31 am

I figured having two experimental things going on at once wasn't super helpful for diagnosing the issue, so I removed the kernel-open USE flag from the nvidia drivers. I still hang after the initramfs, but now I consistently get a single-pixel-tall horizontal line of green and purple about two seconds after the screen freezes. Same place every time. I have no idea if that is helpful in any regard, but it seemed worth mentioning since it appears to be a consistent trait.

I also got slightly different dmesg output: https://pastebin.com/37Fstgba. I'm not sure what to make of this, to be honest. I may try to install nouveau tomorrow to see if it's just the nvidia driver in general or something wrong with my system in particular.
kucklehead
Tux's lil' helper


Joined: 13 Oct 2020
Posts: 102

Posted: Mon Aug 15, 2022 11:25 am

I had something similar, and a couple of CPU pins were bent. I would probably do what you said before and check your board, monitor, pins, etc. for any damage.

Hope this helps
tiredcsstudent
n00b


Joined: 09 Jun 2022
Posts: 12

Posted: Mon Aug 15, 2022 3:07 pm

I'll see if I have time later to double check, but considering I've been daily driving my system for a while now with no issue, and this only started happening after I updated, I think it's unlikely to be a hardware defect. That, and before I had ssh up and running I was using a Fedora live CD to mount the drive and copy files off it to upload to pastebin, and had no issues there.
logrusx
Veteran


Joined: 22 Feb 2018
Posts: 1451

Posted: Mon Aug 15, 2022 6:13 pm

@tiredcsstudent, I'd suggest taking your config from 5.18 and running `make olddefconfig` on 5.19. Don't forget to regenerate any initramfs you might need, and also don't forget to make modules_install. I'd suggest not installing the nvidia drivers on the new kernel. Just rm /lib/modules/5.19 before making modules_install and generating the initramfs, and skip installing the nvidia drivers for now. They are not needed for a framebuffer console. They don't even support it.

Once you get going, you can do your experimental stuff. Don't forget to keep at least the last known good config though.
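
The config carry-over step could be sketched like this. The function name `carry_config` and both example paths in the comments are placeholders, not anything taken from this thread:

```shell
# Copy a known-good .config into a new kernel tree, then let kbuild fill in
# default values for any symbols that are new in that tree.
# Usage: carry_config OLD_CONFIG NEW_SOURCE_DIR
carry_config() {
    old_cfg=$1   # hypothetical example: /usr/src/linux-old/.config
    new_src=$2   # hypothetical example: /usr/src/linux-new
    cp "$old_cfg" "$new_src/.config" &&
        make -C "$new_src" olddefconfig
}
```

After that, the usual `make` and `make modules_install` steps follow, plus installing the kernel and regenerating the initramfs if one is used.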

Regards,
Georgi
All times are GMT