Kernel 6.16.* has sporadic new starts

Post by mv » Sat Aug 30, 2025 7:38 am

Since upgrading to kernel 6.16 I have been experiencing sporadic new starts.
It happens on two different machines, though on one with much higher frequency (1/week vs. 1/several hours). It is more likely to happen while unpacking huge files or during other memory-exhausting activities (though usually not during compilation, even under memory pressure).

When it happens, Wayland just freezes until the reboot comes. In the logs (standard logs, I am not using systemd) there seems to be written just a block of 0 characters. That's all I can see.

So far it has happened with 6.16.{0,1,2}. Downgrading to 6.15.6 seems to solve the problem.

How could I debug this?

Pastebin with kernel config

Post by mv » Sun Sep 07, 2025 6:15 pm

It seems that 6.16.4 ran stably for a while on both systems, though it might be that I was just lucky. With 6.16.5-r1, I again had several reboots (even on the less affected system) at intervals of roughly half an hour (though I had an atypical usage pattern at that time).

I recompiled the latest 6.15 so far (6.15.11), and this seems to run stably.

So I am rather sure that the problem has existed on both systems exactly since 6.16.0, but the reboots are too sporadic to do any reasonable binary search for the cause.

Unless I can get the system to display or log something instead of rebooting, I see no way of debugging the issue.
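To put a rough number on why a binary search is impractical with such sporadic reboots, here is a minimal sketch. It assumes that crashes on a bad kernel arrive as a Poisson process, so the chance of seeing no crash in t hours is exp(-t / mean). The 13 bisection steps and the 6-hour mean are hypothetical placeholders, not figures from this thread; only the "about once a week" rate comes from the first post.

```python
import math


def quiet_hours_needed(mean_hours_between_crashes: float,
                       confidence: float = 0.95) -> float:
    """Crash-free hours needed before calling a kernel 'good'.

    Assumes Poisson-distributed crashes on a bad kernel, i.e.
    P(no crash in t hours) = exp(-t / mean_hours_between_crashes).
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return -mean_hours_between_crashes * math.log(1.0 - confidence)


def bisection_hours(steps: int, mean_hours_between_crashes: float,
                    confidence: float = 0.95) -> float:
    """Worst case: every bisection step has to wait out the quiet period."""
    return steps * quiet_hours_needed(mean_hours_between_crashes, confidence)


if __name__ == "__main__":
    steps = 13  # hypothetical number of bisection steps
    rates = (("about 1 crash per week", 7 * 24.0),
             ("about 1 crash per 6 hours (hypothetical)", 6.0))
    for label, mean in rates:
        per_step = quiet_hours_needed(mean)
        total_days = bisection_hours(steps, mean) / 24
        print(f"{label}: {per_step:.0f} h per step, "
              f"{total_days:.0f} days for {steps} steps")
```

Under these assumptions, the machine that reboots about once a week would need roughly three crash-free weeks per step before a kernel could be called good with 95% confidence, which is why getting the kernel to log something is the more promising route.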

Post by sam_ » Sun Sep 07, 2025 6:30 pm

"New starts" mean "restarts" here?

You can try setting up pstore to save logs on crashes, or use netconsole.
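Once a pstore backend (for example ramoops or efi-pstore) is configured, records saved during a crash show up as files in the pstore filesystem after the next boot, conventionally mounted at /sys/fs/pstore. The following is a minimal sketch for dumping them; the function name and the 2000-character tail are arbitrary choices for illustration, and reading the files usually requires root.

```python
from pathlib import Path


def read_pstore_records(pstore_dir: str = "/sys/fs/pstore") -> dict[str, str]:
    """Return {file name: decoded text} for every record file in pstore_dir."""
    root = Path(pstore_dir)
    if not root.is_dir():
        return {}
    records = {}
    for entry in sorted(root.iterdir()):
        if entry.is_file():
            # Crash records may contain stray bytes, so decode leniently.
            records[entry.name] = entry.read_bytes().decode(
                "utf-8", errors="replace")
    return records


if __name__ == "__main__":
    found = read_pstore_records()
    if not found:
        print("no pstore records found "
              "(is pstore mounted and a backend configured?)")
    for name, text in found.items():
        print(f"===== {name} =====")
        print(text[-2000:])  # the end of the record is usually the useful part
```

The records stay in the backend until they are deleted, so it is worth copying them elsewhere and then removing them to free space for the next crash.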

Post by logrusx » Sun Sep 07, 2025 6:32 pm

mv wrote: sporadic new starts.
Is this some kind of a machine translation? What does sporadic new starts mean?

Best Regards,
Georgi

Post by mv » Sun Sep 07, 2025 6:57 pm

logrusx wrote: Is this some kind of a machine translation? What does sporadic new starts mean?
By "sporadic new starts" I mean that the machines reboot at unpredictable (usually long) intervals without any clear cause.

The only things the reboots have in common are that some sort of hard disk access had presumably just occurred and that, usually (though not always), some processes were continuously using memory in the background.

In some cases, the reboots happened when I started to watch a movie (or, less frequently, in the middle of a movie) or when I uncompressed or recompressed something. When it happens, the system halts completely for 10-20 seconds (when watching a movie, the last ~2 seconds of sound are replayed in a loop) and then starts the reboot. The logs contain a lengthy string of 0-bytes before the new boot is logged, but nothing else.
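To pin down when each reboot happened and what was logged last, the runs of 0-bytes can be located mechanically. This is a minimal sketch: the default path /var/log/messages and the minimum run length of 16 bytes are assumptions for illustration, not details from this thread.

```python
import re
import sys


def find_nul_runs(data: bytes, min_length: int = 16) -> list[tuple[int, int]]:
    """Return (offset, length) for each run of at least min_length NUL bytes."""
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    pattern = re.compile(rb"\x00{%d,}" % min_length)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]


def last_line_before(data: bytes, offset: int) -> str:
    """Return the last non-empty log line that ends before offset."""
    for line in reversed(data[:offset].splitlines()):
        if line.strip():
            return line.decode("utf-8", errors="replace")
    return ""


if __name__ == "__main__":
    # Hypothetical default path; pass the real log file as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/messages"
    with open(path, "rb") as fh:
        content = fh.read()
    for offset, length in find_nul_runs(content):
        print(f"{length} NUL bytes at offset {offset}; last line before: "
              f"{last_line_before(content, offset)!r}")
```

The timestamp of the last line before each run gives an upper bound on how long log writes had been stalled before the reset.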

Post by mv » Sun Sep 07, 2025 7:11 pm

sam_ wrote: "New starts" mean "restarts" here?
Yes.
sam_ wrote: You can try setting up pstore to save logs on crashes, or use netconsole.
Netconsole is not easy for me to set up as I lack a second physical machine nearby. Thanks for the hint about pstore, which I did not know about before. My hard disk is fully partitioned, but perhaps I can try to mount a file on an existing filesystem as pstore or repurpose my swap partition.

Post by logrusx » Sun Sep 07, 2025 7:13 pm

Can you try what Sam suggested?

I would also express doubts about your memory stability. Have you run memtest recently?

Also 6.16.5 is out, you can try it. I wouldn't be surprised if this is some kind of a bug. For several lines of kernels my laptop wasn't able to resume from S3 sleep.

Best Regards,
Georgi

Post by mv » Sun Sep 07, 2025 7:15 pm

logrusx wrote: I would also express doubts about your memory stability.
This was also my first thought when it happened on only one machine. But then it started to happen on the second as well, and on both machines returning to 6.15.* solves the problem.
logrusx wrote: Can you try what Sam suggested?
Yes, I will, but it will take time. During the week I simply have no time for such things, and only rarely on weekends.

Post by logrusx » Sun Sep 07, 2025 8:31 pm

If I had so little time, I wouldn't waste it chasing kernel bugs. Spare yourself the trouble, use 6.15 or even 6.12 and let people with more time and perhaps more knowledge deal with it. It's most likely a bug that's not going to stay unnoticed.

Best Regards,
Georgi

Post by mv » Sat Sep 13, 2025 7:10 am

logrusx wrote: If I had so little time, I wouldn't waste it chasing kernel bugs. Spare yourself the trouble, use 6.15 or even 6.12 and let people with more time and perhaps more knowledge deal with it. It's most likely a bug that's not going to stay unnoticed.
It is still not fixed, even in 6.16.7. I am afraid that the bug is triggered only in combination with some kernel option or some hardware, present in both of my machines, that is not commonly used. Also, the last time something similar happened, it was not fixed until I reported it. That time, however, the machine rebooted immediately at startup, so I could do a binary search and track down the culprit patch (it was in some hard disk driver, although I have a standard disk; I still have no idea why this did not bite many other people as well).
Anyway, this weekend I will probably still not have any time to debug.

Post by mv » Sat Oct 11, 2025 8:30 pm

Still no fix in 6.17.1. The last working version is 6.15.11.

I think that I reduced it to one or more of these 3 kernel options:
  1. CONFIG_RANDSTRUCT_FULL=y (that is: CONFIG_GCC_PLUGIN_RANDSTRUCT=y)
  2. CONFIG_KSTACK_ERASE=y (that is: CONFIG_GCC_PLUGIN_STACKLEAK=y)
  3. CONFIG_BUG_ON_DATA_CORRUPTION=y
though the lack of a crash unfortunately can just mean that I did not manage to trigger the bug.
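Since any subset of the three options might be responsible, it helps to confirm what a given build actually has set and to enumerate the combinations still to be tried. This is a minimal sketch, assuming the usual `.config` syntax (`CONFIG_X=y` or `# CONFIG_X is not set`); the default path /usr/src/linux/.config and the helper names are assumptions for illustration.

```python
import re
import sys
from itertools import product

SUSPECTS = (
    "CONFIG_RANDSTRUCT_FULL",
    "CONFIG_KSTACK_ERASE",
    "CONFIG_BUG_ON_DATA_CORRUPTION",
)

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def option_states(config_text: str, options=SUSPECTS) -> dict[str, str]:
    """Map each option to its value, 'n' if explicitly unset, else 'absent'."""
    states = {name: "absent" for name in options}
    for raw in config_text.splitlines():
        line = raw.strip()
        match = _SET.match(line)
        if match and match.group(1) in states:
            states[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match and match.group(1) in states:
            states[match.group(1)] = "n"
    return states


def candidate_builds(options=SUSPECTS) -> list[dict[str, str]]:
    """Enumerate every y/n combination of the suspect options."""
    return [dict(zip(options, combo))
            for combo in product("yn", repeat=len(options))]


if __name__ == "__main__":
    # Hypothetical default path; pass the real .config as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/usr/src/linux/.config"
    with open(path, encoding="utf-8") as fh:
        for name, value in option_states(fh.read()).items():
            print(f"{name}: {value}")
    print(f"{len(candidate_builds())} combinations to test in the worst case")
```

With three options there are eight combinations, so disabling one option at a time (three builds) is the cheaper first pass, keeping in mind that a crash-free run only counts as weak evidence.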

Actually, Linus wanted to kick out the randstruct plugin as it sometimes caused crashes, so this is my top candidate for the culprit. However, security-wise I guess that it would be a bad idea to run a production system without it. For similar reasons, I would not want to remove either of the other two options from my kernel.

Post by mv » Sat Oct 25, 2025 4:20 pm

Latest update: The bug seemed to be “triggered” by

CONFIG_BUG_ON_DATA_CORRUPTION=y
Of course, this means that the option does not actually cause the bug; without it, the bug is merely hidden because nothing checks for it.

In 6.17.3 the bug still occurred with this option. In 6.17.5, the bug seems to be fixed (that is, so far it has not occurred even with this option); the latter might also be related to the upgrade to gcc-15.2.1_p20251018.

Of course, the two optimistic comments here (that the problem does not occur with only this option disabled and with the newest kernel and gcc) can just mean that the problem did not occur yet in these cases. However, things did run well for a while in both cases.