Gentoo Forums
Gentoo install ISO boot with 'Kernel Panic' [ROOT CAUSED]
Sławomir Gąsiorowski
n00b

Joined: 21 Jul 2004
Posts: 50
Location: Poland

Posted: Sat Apr 27, 2024 11:22 am    Post subject: Gentoo install ISO boot with 'Kernel Panic' [ROOT CAUSED]

Hi all!

When I boot the minimal ISO (install-amd64-minimal-20240421T170413Z.iso), built on top of the 6.6.21-gentoo-x86_64 kernel, it fails with a kernel panic just after udev activation (sometimes later). Here are screenshots:

https://ibb.co/zhpWQZg
https://ibb.co/D5Vy11t

My config is:
Core i7 12700K on an MSI Z690 DDR4 motherboard, 32GB of DDR4-3200 RAM, 2x Lexar NVMe disks, 1x SATA disk.

I tried all available minimal and admin ISO images. Most of the time they fail with that kernel panic. Sometimes they just hang during boot, and sometimes they even boot and everything seems fine. I tried nosmp and also tried booting on only one core (all but one disabled in the BIOS), but without effect.

I also tried the Funtoo installation ISO and it boots every time (it uses Linux kernel 5.x.x). My Windows 11 Professional installation also works correctly. I just wanted to redo my Gentoo installation from scratch. Previously I installed from an ISO built on top of kernel 5.5.x and that installation worked really well. Unfortunately I deleted the old ISO, my bad. Is there an archive of old Gentoo install ISOs?

Second question: does anybody have an older minimal ISO image? I remember that when I installed Gentoo on this system with kernel 5.5.x it was really stable.

Thanks in advance :-)
_________________
Slawomir Gasiorowski
email: sgasiorowski@gmail.com


Last edited by Sławomir Gąsiorowski on Wed May 01, 2024 7:02 pm; edited 2 times in total

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54300
Location: 56N 3W

Posted: Sat Apr 27, 2024 12:07 pm    Post subject:

Sławomir Gąsiorowski,

I can host any or all of
Code:
 16K -rw-r--r--  1 roy  roy    15K May 16  2021 livedvd-amd64-gentoo-nomultilib-20200902.iso.CONTENTS
5.3M -rw-r--r--  1 roy  roy   5.3M May 16  2021 livedvd-amd64-gentoo-nomultilib-20200902.iso.CONTENTS-squashfs.gz
4.0K -rw-r--r--  1 roy  roy    973 May 16  2021 livedvd-amd64-gentoo-nomultilib-20200902.iso.DIGESTS
4.2G -rw-r--r--  1 roy  roy   4.2G May 16  2021 livedvd-amd64-gentoo-nomultilib-20200902.iso
188M -rw-r--r--  1 root root  188M Sep  5  2021 stage3-amd64-nomultilib-openrc-20210905T170549Z.tar.xz
433M -rw-r--r--  1 roy  roy   432M Sep  6  2021 install-amd64-minimal-20210829T170531Z.iso
284M -rw-r--r--  1 roy  users 284M Dec 26  2021 install-alpha-minimal-20210728T195334Z.iso
556M -rw-r--r--  1 roy  users 556M Dec 28  2021 livecd-alpha-installer-2006.1.iso
4.8G -rw-r--r--  1 roy  users 4.8G Apr  9  2022 livegui-amd64-20220403T220339Z.iso
454M -rw-r--r--  1 roy  users 454M Apr 28  2022 install-arm64-minimal-20220424T234808Z.iso
454M -rw-r--r--  1 roy  users 454M May 22  2022 install-arm64-minimal-20220515T234802Z.iso
3.7G -rw-r--r--  1 roy  users 3.7G May 23  2023 livegui-amd64-20230101T164658Z.iso
205M -rw-r--r--  1 roy  users 205M Jul  2  2023 stage3-armv6j-openrc-20230701T201658Z.tar.xz
3.3G -rw-r--r--  1 roy  users 3.3G Aug 15  2023 livegui-amd64-20230806T163139Z.iso
and more too :)

but before we go there, Fatal Exception in Interrupt suggests a hardware bug, in that it's not dealing with IRQs properly.
You may be able to fix that when you build your own kernel.

Meanwhile there are some kernel command line options you can try.
They are listed on the help screens attached to some of the Fx keys.
nomsi and irq=poll come to mind.
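
When experimenting with boot options, it helps to confirm that the running kernel actually received them. The sketch below is only an illustration: `has_boot_param` is a hypothetical helper name, and the spellings `irqpoll` and `pci=nomsi` are the ones used in the mainline kernel parameter documentation, which may differ from what the install ISO's help screens list.

```shell
#!/bin/sh
# Report whether a parameter appears as a whole token on a kernel command line.
# The second argument defaults to the running kernel's /proc/cmdline.
# has_boot_param is a hypothetical helper, not part of any Gentoo tool.
has_boot_param() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline 2>/dev/null)}
    # Pad with spaces so only whole tokens match, not substrings.
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: check the mainline spellings of the two options mentioned above.
for p in irqpoll pci=nomsi; do
    if has_boot_param "$p"; then
        echo "$p: active"
    else
        echo "$p: not on the command line"
    fi
done
```

A misspelled option is silently ignored by the kernel, so a check like this can rule out typos before concluding that an option did not help.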
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

Sławomir Gąsiorowski
n00b

Joined: 21 Jul 2004
Posts: 50
Location: Poland

Posted: Sat Apr 27, 2024 6:53 pm    Post subject:

Thank you for the response. I tried some kernel options, but none of them helped (irq-pool, noapic, nolapic, nohotplug, nosata, nosmp ...). The installer sporadically boots fine, but mostly it fails. I tried the Debian Live DVD 12.5, which is built on top of Linux kernel 6.1.x, and it runs great. I don't think it's a hardware problem. I'm going to use a Debian Live USB to install Gentoo and try different kernel versions and configs. I will post my feedback here.
_________________
Slawomir Gasiorowski
email: sgasiorowski@gmail.com

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54300
Location: 56N 3W

Posted: Sat Apr 27, 2024 7:00 pm    Post subject:

Sławomir Gąsiorowski,

That sounds like a plan.
You can probably put the Debian kernel under your Gentoo install as a get-U-going measure too.

That's the kernel, initrd and modules, not just the kernel.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

Sławomir Gąsiorowski
n00b

Joined: 21 Jul 2004
Posts: 50
Location: Poland

Posted: Sun Apr 28, 2024 10:44 am    Post subject:

Hi, I have very interesting feedback. First of all, it turned out that the Debian Live USB I use is built on top of kernel 6.x, and I was able to reproduce exactly the same kernel error there. It is related to the NVMe driver. The problem shows up when I start working on a partition, just after mounting it, no matter which filesystem I use. In the end I could reproduce it very often, always just after mounting an NVMe partition or later during heavy I/O on it.

smartctl shows no errors and the disk seems to be in good condition. Additionally, I ran an extended SMART test and it also found no errors. I decided to test the disk under Windows 11: I created an NTFS partition and was able to fill it with data to 100%, then copy the data, delete it and fill the partition again. No problems observed.

In my opinion Linux 6.x.x has some kind of regression in the NVMe driver, maybe related only to the model I use: Lexar NM620 512GB 2280 PCI-E 3.0. @NeddySeagoon, can you share a Gentoo install ISO from 2022/2023 that contains Linux 5.x? It's a pity that Gentoo doesn't host older installation ISOs...
_________________
Slawomir Gasiorowski
email: sgasiorowski@gmail.com

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54300
Location: 56N 3W

Posted: Sun Apr 28, 2024 11:07 am    Post subject:

Sławomir Gąsiorowski,

Help yourself.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

Sławomir Gąsiorowski
n00b

Joined: 21 Jul 2004
Posts: 50
Location: Poland

Posted: Sun Apr 28, 2024 12:55 pm    Post subject:

I have another finding and have probably root-caused this issue. I tried an old Funtoo install ISO with a 5.18.x kernel and reproduced the problem again!!! So the regression was definitely introduced by something else. My bet was a BIOS update, so I downgraded the BIOS to a very old version from Feb 2022, and it looks like that is it!!! So far I don't observe any problems. Before the BIOS downgrade the problem appeared even during partition mount or stage file unpacking: hangs and kernel crashes.

Thanks for the link, but first I will try to complete my installation using the latest ISO installer with the BIOS downgraded :-)
_________________
Slawomir Gasiorowski
email: sgasiorowski@gmail.com

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54300
Location: 56N 3W

Posted: Sun Apr 28, 2024 2:16 pm    Post subject:

Sławomir Gąsiorowski,

That's a good idea. That server has a lot of old stuff on it. Including all my distfiles back to mid 2006.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

Sławomir Gąsiorowski
n00b

Joined: 21 Jul 2004
Posts: 50
Location: Poland

Posted: Sun Apr 28, 2024 9:37 pm    Post subject:

Hi!

I can officially confirm it. The kernel crashes were caused by one of the BIOS updates. I performed a successful Gentoo install using the old BIOS 7D25v12 (E7D25IMS.120) from Feb 2022. So far I don't know which BIOS update introduced this regression, but I will find it :-) It looks like a setup like mine was not tested enough by MSI or was never validated under Linux :evil:

This is definitely not a defect in the Linux kernel. The defect is in the BIOS and should be fixed by MSI. I will try to report it to them with better evidence than one screenshot. Be careful when updating the BIOS if you have a setup like mine (MSI PRO Z690-A DDR4 with Core i7 12700K and a Lexar NM620 NVMe disk) and want to use Linux.
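
For anyone comparing BIOS versions while bisecting a problem like this, the firmware identification can be read from Linux without rebooting into setup. This is a minimal sketch, assuming the kernel exposes DMI data under `/sys/class/dmi/id`; `show_firmware_info` is a hypothetical helper name, and the exact strings a given board reports are not something this thread confirms.

```shell
#!/bin/sh
# Print the firmware and board identification the kernel exposes through DMI.
# The directory argument exists only so the function can be pointed at a
# test fixture; on a real system the default sysfs path is used.
# show_firmware_info is a hypothetical helper name.
show_firmware_info() {
    dmi_dir=${1:-/sys/class/dmi/id}
    for field in board_vendor board_name bios_version bios_date; do
        if [ -r "$dmi_dir/$field" ]; then
            printf '%s: %s\n' "$field" "$(cat "$dmi_dir/$field")"
        else
            printf '%s: (not available)\n' "$field"
        fi
    done
}

show_firmware_info
```

Recording this output next to each test result makes it easier to tell which BIOS version a given crash or clean boot belongs to.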

Please mark this thread as [Solved]
_________________
Slawomir Gasiorowski
email: sgasiorowski@gmail.com

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54300
Location: 56N 3W

Posted: Mon Apr 29, 2024 10:03 am    Post subject:

Sławomir Gąsiorowski,

It's the "Does it boot Windows? Ship it!" school of BIOS quality control.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

Hu
Administrator

Joined: 06 Mar 2007
Posts: 21706

Posted: Mon Apr 29, 2024 2:51 pm    Post subject:

Sławomir Gąsiorowski wrote:
    Please mark this thread as [Solved]

You can (and usually should) do this yourself, by editing the title of the opening post.

Sławomir Gąsiorowski
n00b

Joined: 21 Jul 2004
Posts: 50
Location: Poland

Posted: Mon Apr 29, 2024 4:35 pm    Post subject: Gentoo install ISO not boot with 'Kernel Panic' [ROOTCAUSED]

I have more info from my testing. I narrowed the problem down to my 512GB Lexar NM620 M.2 2280 PCI-E x4 Gen3 NVMe. I have a second one, but 1TB. In theory these are the same disks with different capacities, but HwInfo shows that they have different controllers and different firmware versions. Unfortunately Lexar has not provided any firmware update for the NM620 series.

The failing one is: Lexar NM620 512GB M.2 2280 PCI-E x4 Gen3 NVMe (LNM620X512G-RNNNG) -> Innogrit Shasta+ IG5216 PCIe 3.0 x4 NVMe 1.4 4-channel SSD Controller (Original Device Name: Shenzhen Longsys Electronics, Device ID: 5216)
The passing one is: Lexar NM620 1TB M.2 2280 PCI-E x4 Gen3 NVMe (LNM620X001T-RNNNG) -> Shenzhen Longsys Electronics, Device ID: 1D97 (Original Device Name is the same)

So when I unplug the 512GB disk, the Gentoo installer ISO boots from USB without any problem. In the kernel logs I see that it detects the 1TB Lexar NVMe. When the 512GB NVMe is present, the kernel can crash immediately during driver loading. I will check the Linux kernel sources; maybe there are some "magic" knobs for that NVMe controller?
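
The controller and firmware differences between two drives of the same product name can also be read from Linux. This is a rough sketch, assuming the usual sysfs layout where each controller appears under `/sys/class/nvme` with `model` and `firmware_rev` attributes and a `device` link to its PCI device; `list_nvme_controllers` is a hypothetical helper name.

```shell
#!/bin/sh
# List every NVMe controller with its model, firmware revision and PCI IDs.
# The base directory is a parameter only so the function can be tested
# against a fixture; the default is the real sysfs location.
# list_nvme_controllers is a hypothetical helper name.
list_nvme_controllers() {
    base=${1:-/sys/class/nvme}
    for ctrl in "$base"/nvme*; do
        [ -d "$ctrl" ] || continue
        # Model and firmware strings may be space-padded, so trim the tail.
        model=$(sed 's/[[:space:]]*$//' "$ctrl/model" 2>/dev/null)
        fw=$(sed 's/[[:space:]]*$//' "$ctrl/firmware_rev" 2>/dev/null)
        vendor=$(cat "$ctrl/device/vendor" 2>/dev/null)
        device=$(cat "$ctrl/device/device" 2>/dev/null)
        printf '%s model="%s" firmware="%s" pci=%s:%s\n' \
            "${ctrl##*/}" "$model" "$fw" "$vendor" "$device"
    done
}

list_nvme_controllers
```

Two drives sold under one model name but reporting different PCI device IDs are using different controllers, which is the situation described above.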

Anyway, I'm marking this thread as [ROOTCAUSED]. I will update it when I find anything interesting.
_________________
Slawomir Gasiorowski
email: sgasiorowski@gmail.com

Sławomir Gąsiorowski
n00b

Joined: 21 Jul 2004
Posts: 50
Location: Poland

Posted: Wed May 01, 2024 5:49 pm    Post subject:

Hi all!

I have some very interesting results from my investigation. I was close to giving up when I decided to just try how the Gentoo binary kernel works... So I installed one and IT WORKS!!!
I tried two stable gentoo-kernel-bin versions:

1. gentoo-kernel-bin-6.1.87 -> PASS
2. gentoo-kernel-bin-6.6.28 -> PASS

So I took the config from the 6.6.28 binary distribution (/proc/config.gz) and reused it with gentoo-sources-6.6.21, which produced a kernel that boots and is stable. After that I ran make mrproper and started from x86_64_defconfig in gentoo-sources-6.6.21; after a little tuning, such as building NVMe and filesystem support into the kernel binary, that also gave me a fully functional system!!!
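
To narrow down which option differs between a working and a failing kernel, the two configs can be compared directly. The following is a minimal sketch: `config_diff` is a hypothetical helper, the paths in the usage comment are illustrative, and `/proc/config.gz` is only present when the running kernel was built with CONFIG_IKCONFIG_PROC. The kernel source tree also ships `scripts/diffconfig` for the same purpose.

```shell
#!/bin/sh
# Print the options that differ between two kernel .config files.
# Lines prefixed "<" come only from the first file, ">" only from the second.
# Both set options and "# CONFIG_FOO is not set" lines are compared.
# config_diff is a hypothetical helper name.
config_diff() {
    a=$(mktemp) && b=$(mktemp) || return 1
    grep -E '^(CONFIG_|# CONFIG_)' "$1" | sort > "$a"
    grep -E '^(CONFIG_|# CONFIG_)' "$2" | sort > "$b"
    diff "$a" "$b" | grep '^[<>]'
    rm -f "$a" "$b"
}

# Example usage (paths are illustrative):
#   zcat /proc/config.gz > /tmp/working.config
#   config_diff /tmp/working.config /usr/src/linux/.config
```

With a short list of differing options, each one can be flipped in turn and the kernel rebuilt until the failing behaviour appears or disappears.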

So finally I have my Gentoo fully functional with a manually compiled kernel from the 6.6.21 sources and the latest MSI BIOS for my MSI Z690 motherboard :-)

I also had a fruitful discussion with a friend who works for Solidigm (previously for Intel, like me). He analyzed my kernel crash and pointed out that it looks like a timeout in RCU (Read-Copy-Update). They are familiar with such errors. Most probably the real root cause is that the Lexar NVMe controller or its firmware does not meet the specification, and after the BIOS update and an RCU code refactor in the Linux kernel, NVMe devices that don't meet the specification may crash...

The last stage of my investigation is to find which kernel option caused the problem. We are probably hitting a race condition at the boundary between the NVMe device and the kernel's handshake logic. When I find something interesting I will update this thread.
_________________
Slawomir Gasiorowski
email: sgasiorowski@gmail.com

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54300
Location: 56N 3W

Posted: Wed May 01, 2024 7:14 pm    Post subject:

Sławomir Gąsiorowski,

Try a faulty kernel with
Code:
rcutree.use_softirq=0
on the kernel command line.

I have an arm64 server that exhibits the RCU problem you describe.
Nothing after kernel 5.15.x will boot without it. At least, nothing I've tried.
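
After booting with that option, its effective value can be checked from the running system. This is a sketch under the assumption that the kernel exposes the parameter read-only at `/sys/module/rcutree/parameters/use_softirq` (it may be absent on some configurations, for example PREEMPT_RT builds); `rcu_softirq_state` is a hypothetical helper name.

```shell
#!/bin/sh
# Report the current value of rcutree.use_softirq as seen by the kernel.
# The sysfs root is a parameter only so the function can be exercised
# against a fixture directory; the default is the real /sys/module.
# rcu_softirq_state is a hypothetical helper name.
rcu_softirq_state() {
    f=${1:-/sys/module}/rcutree/parameters/use_softirq
    if [ -r "$f" ]; then
        case "$(cat "$f")" in
            Y|1) echo "use_softirq enabled" ;;
            N|0) echo "use_softirq disabled" ;;
            *)   echo "use_softirq: unexpected value" ;;
        esac
    else
        echo "use_softirq: not exposed by this kernel"
    fi
}

rcu_softirq_state
```

If the state still reads as enabled after adding the option, the parameter was most likely mistyped or not passed through by the boot loader.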
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
Back to top
View user's profile Send private message
Sławomir Gąsiorowski
n00b
n00b


Joined: 21 Jul 2004
Posts: 50
Location: Poland

PostPosted: Thu May 02, 2024 10:48 am    Post subject: Reply with quote

I tried this option and as a result I got a bunch of other different errors. So this is not a solution in my case.
_________________
Slawomir Gasiorowski
email: sgasiorowski@gmail.com

All times are GMT