Gentoo Forums
kexec ignores me!
Goverp
Advocate

Joined: 07 Mar 2007
Posts: 2011

Posted: Sun Apr 14, 2024 4:17 pm    Post subject: kexec ignores me!

Reading the wiki and the kernel docs, I think I'd like to use "kexec" to speed up rebooting my system when I'm playing with my initramfs "init" script.
So I installed kexec-tools and edited /etc/kexec.conf (the ellipses are not in the file, I just removed some boring lines!):
Code:
# Kernel image pathname, relative from /boot.
KNAME="vmlinuz.new"
...
# --reuse-cmdline
#   Use the current boot command line
...
KEXEC_OPT_ARGS="--reuse-cmdline"

and /etc/conf.d/kexec:
Code:
# Load kexec kernel image into memory during shutdown instead of bootup
# (default: yes)
LOAD_DURING_SHUTDOWN="yes"
...
# Kernel image pathname, relative from BOOTPART.
...
KNAME="vmlinuz.new"
...
# Do not try to mount /boot
DONT_MOUNT_BOOT="yes"


and started the kexec service.
Then I tried rebooting with
Code:
reboot

which, from what I've read in the forums and in the wiki, should automagically reboot using kexec. My rc.log file shows:
Code:
rc shutdown logging started at Sun Apr 14 16:47:09 2024
local                     | * Stopping local ...
...
alsasound                 | * Storing ALSA Mixer Levels ...
kexec                     | * Using kernel image /boot/vmlinuz.new for kexec ... [ ok ]
...
swap                      | * Deactivating swap devices ...
...
localmount                | *   Unmounting /home ...
 [ ok ]
udev                      | * Stopping udev ...
 [ ok ]

rc shutdown logging stopped at Sun Apr 14 16:47:10 2024

So it looks like the kexec service was invoked OK and didn't throw any errors.

But, of course, or I wouldn't be writing this, I got a common-or-garden reboot through BIOS and GRUB, with time for a sip of coffee or two, rather than the lightning-fast reboot I was hoping for.

Yes, my kernel has:
Code:
CONFIG_KEXEC_CORE=y
CONFIG_KEXEC=y
CONFIG_KEXEC_FILE=y
# CONFIG_KEXEC_SIG is not set
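
A quick way to re-check those options against a kernel config is to grep for them. This is only a sketch: it assumes the config is available as a plain-text file (the example path is a guess, adjust it to your setup), and the function name check_kexec_config is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report which of the three kexec options listed above
# are set to "y" in a plain-text kernel config file.
# Returns non-zero if any of them is missing.
check_kexec_config() {
    config_file=$1
    missing=0
    for opt in CONFIG_KEXEC_CORE CONFIG_KEXEC CONFIG_KEXEC_FILE; do
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "$opt: enabled"
        else
            echo "$opt: NOT enabled"
            missing=1
        fi
    done
    return "$missing"
}

# Example (the path is an assumption):
# check_kexec_config /usr/src/linux/.config
```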

_________________
Greybeard

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Sun Apr 14, 2024 6:16 pm

Don't you need to use
Code:
reboot -k
?

Goverp
Advocate

Joined: 07 Mar 2007
Posts: 2011

Posted: Sun Apr 14, 2024 7:02 pm

Not according to the comments in the forums. "reboot -k" says you can't do it in this runlevel, for all runlevels other than 6, which you normally reach via the shutdown command (or one of the many aliases involved here). You can do "reboot -kf", but that chops the system off at the knees, leaving open files and all sorts of nastiness.

<edit>I know the wiki says that, but it doesn't work, and there's a discussion in this forum article.

What's supposed to happen is that, with the kexec-tools package installed and a kexec-enabled kernel, under OpenRC, you start the "kexec" service and then do a normal shutdown. That service runs during shutdown, and a hook in OpenRC (and possibly in sysv-init stuff, I'm trying unsuccessfully to wade my way through the twisty-turny maze of code here) is supposed to do the actual kexec thing once the system is nicely in bed and asleep - i.e. run level 6.

One possibility that's occurred to me is that it might be my use of rc_parallel="YES" in /etc/rc.conf - maybe the hook can't tell if kexec is required. I need to make a fairly trivial test...
_________________
Greybeard

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Sun Apr 14, 2024 8:13 pm

So, what about your /etc/inittab?

Does level 6 reboot have the -k option?
Code:
...
si::sysinit:/sbin/openrc sysinit

# Further system initialization, brings up the boot runlevel.
rc::bootwait:/sbin/openrc boot

l0u:0:wait:/sbin/telinit u
l0:0:wait:/sbin/openrc shutdown
l0s:0:wait:/sbin/halt.sh
l1:1:wait:/sbin/openrc single
l2:2:wait:/sbin/openrc nonetwork
l3:3:wait:/sbin/openrc default
l4:4:wait:/sbin/openrc default
l5:5:wait:/sbin/openrc default
l6u:6:wait:/sbin/telinit u
l6:6:wait:/sbin/openrc reboot
l6r:6:wait:/sbin/reboot -dkn
#z6:6:respawn:/sbin/sulogin

# new-style single-user
su0:S:wait:/sbin/openrc single
...
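
The question above (does the runlevel 6 entry call reboot with -k?) can also be checked mechanically. A minimal sketch, assuming the usual four-field id:runlevels:action:process layout shown in the listing; the function name inittab_reboot_uses_kexec is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if an uncommented runlevel-6 inittab entry
# runs "reboot" with a short-option group containing "k" (e.g. -k or -dkn).
inittab_reboot_uses_kexec() {
    inittab=${1:-/etc/inittab}
    awk -F: '
        /^[ \t]*#/ { next }    # skip commented-out entries
        $2 ~ /6/ && $4 ~ /reboot/ && $4 ~ /(^|[ \t])-[A-Za-z]*k/ { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$inittab"
}

# Example:
# inittab_reboot_uses_kexec && echo "runlevel 6 calls reboot with -k"
```

With the listing above, the l6r entry (/sbin/reboot -dkn) is the line that matches; the l6 entry (/sbin/openrc reboot) does not, because it carries no -k.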

Goverp
Advocate

Joined: 07 Mar 2007
Posts: 2011

Posted: Mon Apr 15, 2024 9:30 am

Yup, same list.
_________________
Greybeard

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Mon Apr 15, 2024 11:38 am

Could it be that the kernel failed to load into memory before the reboot?

Is your vmlinuz.new on a partition that had already been unmounted by the time kexec tried to load it into memory?

Goverp
Advocate

Joined: 07 Mar 2007
Posts: 2011

Posted: Mon Apr 15, 2024 2:22 pm

OK, after some tests, I find rather usefully that issuing:
Code:
openrc shutdown

does what I hoped (get into runlevel 6) rather than what I expected (either shutdown or reboot before I could do anything).
So at this point we've a command line shell running, everything stopped, only rootfs mounted and that read-only. So actually, a safe place to play.
Code:
kexec -l /boot/vmlinuz.new --reuse-cmdline

and variants to that effect work. (Adding the parameter "-d" gives a load of incomprehensible information to show it loaded something.)
So, there's a kernel loaded ready to reboot into. However:
Code:
kexec -e

does indeed reboot, but damnably into BIOS and GRUB. So something's not right with my kernel.
Same for:
Code:
kexec -f /boot/vmlinuz.new --reuse-cmdline

As an aside,
Code:
reboot -k

still says
Quote:
ERROR: using -k at this runlevel requires also -f
(You probably want instead to reboot normally and let your reboot
script, usually /etc/init.d/reboot, specify -k)

So &deity; alone knows which runlevel it requires! None of the ones I can find!
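
Since nothing reaches syslog at this stage, one independent check that "kexec -l" really staged an image is the kernel's own flag in sysfs. A minimal sketch, assuming /sys is still mounted at that point; kexec_is_loaded is a name invented here, while /sys/kernel/kexec_loaded is the standard sysfs file.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the kernel reports a staged kexec image.
# /sys/kernel/kexec_loaded reads "1" when an image is loaded, "0" otherwise.
kexec_is_loaded() {
    flag_file=${1:-/sys/kernel/kexec_loaded}
    [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
}

# Example, run after the load and before "kexec -e":
# kexec -l /boot/vmlinuz.new --reuse-cmdline
# kexec_is_loaded && echo "image staged" || echo "nothing staged"
```

If this reports a staged image and "kexec -e" still lands in BIOS, the load step is probably not the part that is failing.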
_________________
Greybeard

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Mon Apr 15, 2024 2:58 pm

I think
Code:
openrc shutdown
only switches the OpenRC system to a "soft" runlevel; init (PID 1) does not switch runlevel.

I assume your "reboot" comes from the sysvinit package. You can fool "reboot" with:
Code:
INIT_VERSION=1
RUNLEVEL=6
export INIT_VERSION RUNLEVEL
reboot -k

Hu
Moderator

Joined: 06 Mar 2007
Posts: 21650

Posted: Mon Apr 15, 2024 3:10 pm

At the risk of diverting the thread, I want to note that for some initramfs testing, a throw-away qemu virtual machine with no attached network and no usable installed system can be a decent alternative. It can bring up the kernel and initramfs, run through the initramfs logic, and if you provide a dummy disk with appropriate partitions/LVM/LUKS (depending on what the initramfs wants), work through to where the initramfs tries to transfer control to the installed system. (It will likely fail at that point, but if you can reach the end of the initramfs, that may suffice to prove this is one you want to use for the real system.) This approach has the advantage of being very quick to iterate since it does not halt the original working system, so you can correct problems and rebuild readily. It has the disadvantage that you need to build some extra infrastructure to satisfy the initramfs's expectations about the environment.
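
As a rough illustration of that approach, the helper below only prints a QEMU command line for booting a kernel and initramfs directly, so it can be inspected before running. Everything specific here is an assumption: the function name, the memory size, the serial console arguments and the optional raw disk image. Paths containing spaces are not handled.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a QEMU invocation that boots a kernel
# plus initramfs with no network and the serial console on the terminal.
qemu_initramfs_cmd() {
    kernel=$1
    initramfs=$2
    disk=${3:-}    # optional raw image laid out the way the initramfs expects
    cmd="qemu-system-x86_64 -m 1024 -nographic -no-reboot -nic none"
    cmd="$cmd -kernel $kernel -initrd $initramfs"
    cmd="$cmd -append 'console=ttyS0 panic=1'"
    if [ -n "$disk" ]; then
        cmd="$cmd -drive file=$disk,format=raw"
    fi
    echo "$cmd"
}

# Example (the initramfs and disk paths are placeholders):
# qemu_initramfs_cmd /boot/vmlinuz.new /path/to/initramfs.img /path/to/dummy-disk.img
```

Printing rather than executing keeps the sketch harmless; paste the output into a shell once the paths look right.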

Goverp
Advocate

Joined: 07 Mar 2007
Posts: 2011

Posted: Mon Apr 15, 2024 5:17 pm

Been reading the kexec kernel archives. There seems to be a regression in 6.7.6 on some AMD machines that introduces the observed behaviour. I'll pursue this avenue if poss.
_________________
Greybeard

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Mon Apr 15, 2024 7:00 pm

Goverp wrote:
Been reading the kexec kernel archives. There seems to be a regression in 6.7.6 on some AMD machines that introduces the observed behaviour. I'll pursue this avenue if poss.

Would Hu's suggestion of using a VM for testing make this easier to debug? If the VM emulates a non-AMD CPU (but still x86/x86-64), would that help give a definitive answer about where the problem lies?

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Mon Apr 15, 2024 7:13 pm

Goverp wrote:
Been reading the kexec kernel archives. There seems to be a regression in 6.7.6 on some AMD machines that introduces the observed behaviour. I'll pursue this avenue if poss.
Another thought: I saw a suggestion in the kexec kernel archives to use early printk. Do you think it would help?

Goverp
Advocate

Joined: 07 Mar 2007
Posts: 2011

Posted: Tue Apr 16, 2024 9:41 am

I'd have to install the various software layers for a VM. Early printk is probably the wrong end, I think the "kexec -c" call is failing (there's no output in syslog), and that's at the end of the reboot process, not the start of the next kernel.

The trouble is I've already spent a couple of days, say 8 hours, on this, to shave perhaps 20 seconds off my reboot time, so it already takes 1,440 reboots to recover the outlay. If I continue to work on this, it's for fame and glory, not profit!
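
For the record, the break-even sum above works out as follows (a throwaway sketch; the function name is made up):

```shell
#!/bin/sh
# Break-even point: hours invested versus seconds saved per reboot.
breakeven_reboots() {
    echo $(( $1 * 3600 / $2 ))
}

breakeven_reboots 8 20    # prints 1440
```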

<edit>Typo: I meant "kexec -e", not "kexec -c" above.
_________________
Greybeard


Last edited by Goverp on Thu Apr 18, 2024 6:55 pm; edited 1 time in total

Zucca
Moderator

Joined: 14 Jun 2007
Posts: 3347
Location: Rasi, Finland

Posted: Thu Apr 18, 2024 4:34 pm

I guess I could test if this works with an openrc-init-based system.
_________________
..: Zucca :..
Gentoo IRC channels reside on Libera.Chat.
--
Quote:
I am NaN! I am a man!

Goverp
Advocate

Joined: 07 Mar 2007
Posts: 2011

Posted: Thu Apr 18, 2024 6:53 pm

Zucca wrote:
I guess I could test if this works with an openrc-init-based system.

Thanks for the offer, but I doubt the init system is implicated. The "kexec -l" call works, and its debug output shows a kernel being loaded into storage. But an explicit "kexec -e" call from runlevel 6 reboots to BIOS not Linux for me. Of course, there's no output from that, or rather, it gets thrown away because in runlevel 6 there are no r/w resources!

I might try it with an older kernel to see if it is indeed the regression on AMD processors identified in the kexec mailing list. Been a bit too busy to try it over the last few days.
_________________
Greybeard

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Fri Apr 19, 2024 7:46 pm

Goverp,

Have you tried booting directly from /boot/vmlinuz.new (I mean without kexec)? I don't recall whether we eliminated the possibility that it is just a bad kernel build.

Does your system boot with UEFI?

It just occurred to me that if your /boot/vmlinuz.new is an EFI stub kernel, that may be the reason we are not able to kexec boot. I don't have evidence yet (still researching the code), but I believe kexec does not support the PE32 header, so it would not understand how to boot into an EFI stub kernel.
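
One way to tell whether an image carries a PE header is to look at its first two bytes: an x86 bzImage built with CONFIG_EFI_STUB starts with the "MZ" DOS/PE magic. The helper below is a sketch under that assumption, and has_pe_header is a name invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file begins with the two-byte "MZ" magic
# that marks a DOS/PE executable (as an EFI stub kernel image does on x86).
has_pe_header() {
    image=$1
    [ "$(head -c 2 "$image" 2>/dev/null)" = "MZ" ]
}

# Example:
# has_pe_header /boot/vmlinuz.new && echo "PE/EFI stub header present"
```

This only says the header is there; it says nothing about whether kexec can or cannot handle such an image.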

Goverp
Advocate

Joined: 07 Mar 2007
Posts: 2011

Posted: Fri Apr 19, 2024 10:21 pm

pingtoo wrote:

Have you tried booting directly from /boot/vmlinuz.new (I mean without kexec)? I don't recall whether we eliminated the possibility that it is just a bad kernel build.

All the time - it's what I'm using at this very second.
Quote:

Does your system boot with UEFI?

Yes
Quote:

It just occurred to me that if your /boot/vmlinuz.new is an EFI stub kernel, that may be the reason we are not able to kexec boot. I don't have evidence yet (still researching the code), but I believe kexec does not support the PE32 header, so it would not understand how to boot into an EFI stub kernel.

I'll give it a try without the EFI stub - it has one so that in extremis when GRUB breaks (as updates sometimes break it) I can boot with rEFInd or my BIOS's tools. I only added EFI support relatively recently when GRUB 2.06 AFAIR broke my system. (I should have left GRUB well alone, after all, it worked, what more is needed?)
_________________
Greybeard

Goverp
Advocate

Joined: 07 Mar 2007
Posts: 2011

Posted: Sat Apr 20, 2024 9:28 am

OK, tried the same kernel without EFIstub, still no joy. Again, kexec -l claims to load it OK, but kexec -e reboots to BIOS. I also tried with an older kernel, 6.6.8 or thereabouts (I'm on a different machine just now), though that one didn't have kexec support. It's not clear to me if kexec support is needed in the rebooting kernel or the rebooted kernel or both ...
Anyway, more debugging required.
_________________
Greybeard

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Sat Apr 20, 2024 2:52 pm

Goverp wrote:
OK, tried the same kernel without EFIstub, still no joy. Again, kexec -l claims to load it OK, but kexec -e reboots to BIOS. I also tried with an older kernel, 6.6.8 or thereabouts (I'm on a different machine just now), though that one didn't have kexec support. It's not clear to me if kexec support is needed in the rebooting kernel or the rebooted kernel or both ...
Anyway, more debugging required.
Well, I had high hopes that this was it. From the source code, I am about 90% sure that kexec does not support booting an EFI binary, but I am reading the Linux source on the GitHub master branch. I don't know whether Gentoo has patched it to support booting EFI with kexec.

If we are going to debug this, I will need some detailed information so I can compare it with the kernel source code, see which steps were executed, and understand whether any configuration is missing.

So it would be great if you could share the .config for both kernels, and also the output of a load with debugging enabled, as in
Code:
kexec -d -l ...
Please also run
Code:
file /boot/*<kernel-image-name>*
so we can be sure of each kernel image's file format.
If possible, please also share dmesg after kexec -l ...; there should be something showing how the running kernel reacted to the load syscall.

P.S. Let's name the two kernels. I will call the currently running kernel "A" and the one to be booted "B".

As far as I can tell, only the "A" kernel requires kexec support. However, if you wish to use "B" (i.e. once you have kexec'd into "B") to kexec into "A" (or "C") in turn, "B" should have kexec support as well.

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Sun Apr 21, 2024 12:01 am

After further study of the kernel code, I must correct my earlier statement about kexec not supporting EFI binaries.

Actually, kexec does support EFI.

The only thing I have yet to find out is how it works. All I can tell is that it recognizes an EFI binary in bzImage format; I have not yet found out how it finds the kernel entry point in the bzImage, or whether it calls an EFI service to perform the reboot.

My apologies for any confusion I caused.

Goverp, I am sorry my suggestion led to extra work and time for you.

Goverp, if you are still willing to work with me, please continue with what I suggested in my post for debugging. I would also understand if you feel less confident in me and prefer not to.

Goverp
Advocate

Joined: 07 Mar 2007
Posts: 2011

Posted: Sun Apr 21, 2024 2:33 pm

Pingtoo,
No problem, I'm grateful for any help! I'll get the answers for you later - it's a good day for gardening today, so I'll be out in the sun for now.
_________________
Greybeard

Goverp
Advocate

Joined: 07 Mar 2007
Posts: 2011

Posted: Mon Apr 22, 2024 3:43 pm

OK, here's my config, as a pastebin,
and here's the output from
Code:
kexec -l /boot/vmlinuz.new --reuse-cmdline


vmlinuz.new is a symbolic link, and file says of it:
Quote:
/boot/vmlinuz-6.8.7-git.new: Linux kernel x86 boot executable bzImage, version 6.8.7-git (packager@ryzen) #213 SMP Mon Apr 22 15:38:48 BST 2024, RO-rootFS, swap_dev 0XD, Normal VGA


To sprinkle a little confusion over this, the kernel is compiled with clang and lto-thin and KCFLAGS="-march=native", but I built a clean kernel with pure gcc and no fancy KCFLAGS, still exactly the same result - kexec -e thinks for a bit, then reboots into BIOS and thence GRUB.

In all cases so far, A and B kernels are the same - i.e. the target of "kexec -l" is the kernel under which I'm running. I couldn't get any dmesg output from kexec -e, though I'll try again, and there's nothing in dmesg from kexec, or indeed anything from the reboot, though that's not surprising as nothing is writeable in runlevel 6.
_________________
Greybeard

Goverp
Advocate

Joined: 07 Mar 2007
Posts: 2011

Posted: Mon Apr 22, 2024 4:03 pm

I tried a minor experiment, running "kexec -de" in single-user mode ("openrc single"). That causes a reboot rather than issuing a warning message saying "don't do that in this run mode", but (a) it was still to BIOS and GRUB, and (b) still nothing in syslog, though of course that's one of the services that wouldn't be running, even though my rootfs would still have been writeable.
_________________
Greybeard

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Mon Apr 22, 2024 6:55 pm

Goverp wrote:
OK, here's my config, as a pastebin,
and here's the output from
Code:
kexec -l /boot/vmlinuz.new --reuse-cmdline


vmlinuz.new is a symbolic link, and file says of it:
Quote:
/boot/vmlinuz-6.8.7-git.new: Linux kernel x86 boot executable bzImage, version 6.8.7-git (packager@ryzen) #213 SMP Mon Apr 22 15:38:48 BST 2024, RO-rootFS, swap_dev 0XD, Normal VGA


To sprinkle a little confusion over this, the kernel is compiled with clang and lto-thin and KCFLAGS="-march=native", but I built a clean kernel with pure gcc and no fancy KCFLAGS, still exactly the same result - kexec -e thinks for a bit, then reboots into BIOS and thence GRUB.

In all cases so far, A and B kernels are the same - i.e. the target of "kexec -l" is the kernel under which I'm running. I couldn't get any dmesg output from kexec -e, though I'll try again, and there's nothing in dmesg from kexec, or indeed anything from the reboot, though that's not surprising as nothing is writeable in runlevel 6.


I have a guess: your kernel seems to be compressed with zstd, but kexec-tools may only support gzip or lzma, so that may be the reason. Could you try rebuilding the kernel with gzip or lzma compression to see if this is the cause?

TL;DR: conflicting information between the debug log of kexec -l and .config.
In .config you have CONFIG_KERNEL_ZSTD=y, yet the debug log shows "Try gzip decompression." followed by correct information about the kernel image. From the source code's point of view that seems impossible, because the code calls "slurp_decompress_file()" to decompress and load the file into memory, and "slurp_decompress_file()" first tries a zlib call to check the file and then tries lzma to read it. From the log it seems to me that zlib successfully opened the file, read some bytes and verified that they match the expected signature, and therefore continued reading to the end of the file and decompressed it.

The most confusing part for me is that the log shows a whole bunch of
Code:
sym: sha256_starts info: 12 other: 00 shndx: 1 value: 11e0 size: 1f
sym: sha256_starts value: 82f2f81e0 addr: 82f2f7015
R_X86_64_64
...
That seems to indicate the content was successfully decompressed and read in, which from my reading of the code logic should not happen. There may be something I am missing.

Anyway, I hope you have the opportunity to try gzip (or lzma) for kernel compression and use it with kexec -d -l ... to see if that makes any difference.
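
The "try gzip, then lzma" order described above can be sketched in shell as below. This illustrates the fallback idea only and is not the kexec-tools code: the function name is borrowed loosely, and the final "use the file as-is" branch is an assumption about what a loader would do with input that is neither format.

```shell
#!/bin/sh
# Sketch of a "try gzip, then xz/lzma, else take the bytes as they are" loader.
# Writes the (possibly decompressed) image to stdout.
slurp_decompress() {
    image=$1
    if gzip -t < "$image" 2>/dev/null; then
        gzip -dc < "$image"
    elif xz -t < "$image" 2>/dev/null; then
        xz -dc < "$image"
    else
        cat "$image"    # neither format matched: pass the file through unchanged
    fi
}

# Example:
# slurp_decompress /boot/vmlinuz.new | head -c 2    # "MZ" on an EFI stub image
```

A bzImage as a whole is not a gzip or lzma stream (the compressed payload sits inside it), so in this sketch it falls through to the last branch untouched. zlib's gzread is also documented to pass non-gzip input through unchanged, which might explain a log line saying "Try gzip decompression." followed by a correctly parsed image, if kexec-tools reads the file that way.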

Goverp
Advocate

Joined: 07 Mar 2007
Posts: 2011

Posted: Tue Apr 23, 2024 11:22 am

pingtoo wrote:
...
I have a guess: your kernel seems to be compressed with zstd, but kexec-tools may only support gzip or lzma, so that may be the reason. Could you try rebuilding the kernel with gzip or lzma compression to see if this is the cause?
...
That seems to indicate the content was successfully decompressed and read in, which from my reading of the code logic should not happen. There may be something I am missing.
...

Good suggestion, but when I tried with Gzip, no change.

I'm not too surprised, I didn't think BIOS or EFI or GRUB decompressed the kernel, so I googled a bit and found the following in the wikipedia article on the vmlinux files:
Quote:
Traditionally, when creating a bootable kernel image, the kernel is also compressed using gzip, or, since Linux 2.6.30,[3] using LZMA or bzip2, which requires a very small decompression stub to be included in the resulting image. The stub decompresses the kernel code, on some systems printing dots to the console to indicate progress, and then continues the boot process.
...
The bzImage file is in a specific format. It contains concatenated bootsect.o + setup.o + misc.o + piggy.o.[8] piggy.o contains the gzipped vmlinux file in its data section. The script extract-vmlinux found under scripts/ in the kernel sources decompresses a kernel image. Some distributions (e.g. Red Hat and clones) may come with a kernel-debuginfo RPM that contains the vmlinux file for the matching kernel RPM, and it typically gets installed under /usr/lib/debug/lib/modules/`uname -r`/vmlinux or /usr/lib/debug/lib64/modules/`uname -r`/vmlinux.

which means that the image loaded by GRUB, the EFI stub, or kexec -l contains the parts listed above, and I guess setup.o decompresses piggy.o, not kexec.

I had a look again on the kexec mailing list archives; there's no follow-up as yet on the problem with AMD hardware and kexec reported last month. I'll try contacting the author, who has taken the discussion off-line to avoid polluting the kernel mailing lists with discussions of how to perform git bisect...
_________________
Greybeard

All times are GMT
Page 1 of 2