Gentoo Forums
[SOLVED] fails to reboot, but will startup

jyoung
Guru

Joined: 20 Mar 2007
Posts: 440

Posted: Sat Jun 05, 2021 10:46 pm    Post subject: [SOLVED] fails to reboot, but will startup

Hi,

I have a strange issue where my machine won't reboot with the 'reboot' command, but if I shut it down (either with 'shutdown -h now' or a hard stop with the power button) and then start it up, it boots fine. The bootloader is lilo, and the failure occurs between the lilo boot screen, while it says it's loading, and the first steps of the actual boot process. This issue came up after a recent kernel update; the kernel is now 5.12.7 from gentoo-sources.

I'd be grateful for any thoughts.


Last edited by jyoung on Thu Jun 17, 2021 4:19 pm; edited 1 time in total

alamahant
Advocate

Joined: 23 Mar 2019
Posts: 3879

Posted: Sun Jun 06, 2021 12:49 pm

I think it's simple:
1. Good kernel
2. Good initrd
3. Grub
Why use lilo? Who uses lilo nowadays?
Only Slackware, no?

Also, please, please don't do hard reboots.
:)
A lot more info is needed to troubleshoot this.
It could be a combination of many things.
dmesg
is your friend.
Also a pic of the failing boot, maybe?
A pastebin of your kernel .config also?
Do you use an initrd?
How is your partition layout?
etc.
_________________
:)

jyoung
Guru

Joined: 20 Mar 2007
Posts: 440

Posted: Sun Jun 06, 2021 4:54 pm

Okay, here is my kernel config
http://www.pastebin.com/1G5d0NEi

And dmesg
http://www.pastebin.com/m40DqKAK

My partition layout is pretty simple, here's part of /etc/fstab:

Code:
/dev/sda1   /boot   ext4   noauto,noatime   1 2
/dev/sda2   none    swap   sw               0 0
/dev/sda3   /       ext4   noatime          0 1
/dev/sda5   /home   ext4   noatime          0 0


Agreed, hard stops are to be avoided, but in this case, when the machine hangs before bootup, there's no other option. I'm not using an initrd. I'm using lilo because when I set up this machine five years ago I knew that I wanted something simple and I hadn't tried lilo yet, and while it was old even at that point, the Gentoo Handbook still described it as reliable. Honestly, I haven't had problems with it since, although it's always possible that this could be the first instance. Here's my lilo config:

Code:
boot=/dev/sda
prompt
timeout=60
default=gentoo-linux

image=/boot/vmlinuz-5.12.7-gentoo
  label=gentoo-linux
  read-only
  root=/dev/sda3
  append="acpi_osi=Linux"

image=/boot/vmlinuz-5.6.11-gentoo
  label=gentoo-backup
  read-only
  root=/dev/sda3
  append="acpi_osi=Linux"

Jaglover
Watchman

Joined: 29 May 2005
Posts: 8291
Location: Saint Amant, Acadiana

Posted: Sun Jun 06, 2021 5:23 pm

Code:
acpi_osi=Linux

This certainly affects the kernel's interaction with the BIOS; have you tried without it? And don't get carried away, this has nothing to do with your bootloader choice. The bootloader works only for a fraction of a second during boot; it has nothing to do with reboot or shutdown.
_________________
My Gentoo installation notes.
Please learn how to denote units correctly!

jyoung
Guru

Joined: 20 Mar 2007
Posts: 440

Posted: Sun Jun 06, 2021 8:45 pm

Hmm, I can't recall why I added acpi_osi=Linux to lilo.conf. It was five years ago that I set it up... However, this post suggests that it's a reasonable thing to do:
https://askubuntu.com/questions/28848/what-does-the-kernel-boot-parameter-set-acpi-osi-linux-do

Still, no harm in trying without. I set up this new lilo.conf file:

Code:
boot=/dev/sda
prompt
timeout=60
default=gentoo-linux

image=/boot/vmlinuz-5.12.7-gentoo
  label=gentoo-linux
  read-only
  root=/dev/sda3
  append="acpi_osi=Linux"

image=/boot/vmlinuz-5.12.7-gentoo
  label=gentoo-test
  read-only
  root=/dev/sda3

image=/boot/vmlinuz-5.6.11-gentoo
  label=gentoo-backup
  read-only
  root=/dev/sda3
  append="acpi_osi=Linux"


Both 'gentoo-linux' and 'gentoo-test' seem to behave the same; that is, they start up normally except after a 'reboot' command.

Some more observations: When the failure occurs, the last thing I see is a message from the lilo screen saying "BIOS data check successful". This message is printed regardless of whether or not the startup is about to fail. In the failure case, it then goes to a blank screen. I also know that this is not just a display issue and that the machine is actually not starting up, since I can't log in remotely.
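
To confirm that each lilo entry really boots with the intended parameters, the running kernel's command line can be checked in /proc/cmdline. A minimal sketch; the function name cmdline_has_param is made up for illustration:

```shell
# Return success if the kernel command line contains the given parameter.
# $1: parameter to look for, e.g. acpi_osi=Linux
# $2: command line text; defaults to the contents of /proc/cmdline
cmdline_has_param() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline)}
    # Pad with spaces so only whole, space-separated words match.
    case " $cmdline " in
        *" $param "*) return 0 ;;
    esac
    return 1
}

# Usage, after booting the entry under test:
#   cmdline_has_param acpi_osi=Linux && echo "acpi_osi=Linux is active"
```

Running this once under 'gentoo-linux' and once under 'gentoo-test' would show whether the two entries actually differ at runtime.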

Logicien
Veteran

Joined: 16 Sep 2005
Posts: 1555
Location: Montréal

Posted: Mon Jun 07, 2021 11:58 am

On a reboot, the system software is reinitialised, but the devices' microcode is not, as it is on a poweroff. I had a problem like this with an Intel APU where the reboot would not complete when using the i915 kernel module. After I blacklisted i915, Linux started to use the efifb framebuffer, and then the reboot completed successfully.

So this may have to do with the graphics card if you use Intel integrated video. With an ATI/AMD PCIe video card and the Linux radeon module, reboot is fine. A device may not be reinitialised correctly on reboot, leaving the machine stuck in the boot process.
_________________
Paul

jyoung
Guru

Joined: 20 Mar 2007
Posts: 440

Posted: Mon Jun 07, 2021 9:50 pm

From this article it looks like one option might be to pull the microcode into the kernel

http://www.kernel.org/doc/html/latest/x86/microcode.html

If I were to go the route of switching to an efifb framebuffer, this article

http://www.kernel.org/doc/html/latest/fb/efifb.html

says "The system must be booted via the EFI stub for this to be usable." That seems drastic. What are the advantages of efifb?

jyoung
Guru

Joined: 20 Mar 2007
Posts: 440

Posted: Tue Jun 08, 2021 8:38 pm

Okay, I tried compiling the appropriate microcode into the kernel as per this tutorial

https://wiki.gentoo.org/wiki/Intel_microcode

But the results are the same.

Jaglover
Watchman

Joined: 29 May 2005
Posts: 8291
Location: Saint Amant, Acadiana

Posted: Tue Jun 08, 2021 8:45 pm

jyoung,

the askubuntu article you linked to also says "Yes, BIOS's usually disable functionality if Windows is not detected", which means that if you tell such a braindead BIOS you are running Linux, it may misbehave. Maybe lying to it that you are running Windows makes it listen to the OS? My 2¢.

jyoung
Guru

Joined: 20 Mar 2007
Posts: 440

Posted: Tue Jun 08, 2021 9:44 pm

Alas, no luck. Here's my lilo.conf file:

Code:
boot=/dev/sda
prompt
timeout=60
default=gentoo-linux

image=/boot/vmlinuz-5.12.7-gentoo
  label=gentoo-linux
  read-only
  root=/dev/sda3
  append="acpi_osi=Linux"

image=/boot/vmlinuz-5.12.7-gentoo
  label=gentoo-test
  read-only
  root=/dev/sda3
  append="acpi_osi=Windows"

image=/boot/vmlinuz-5.6.11-gentoo
  label=gentoo-backup
  read-only
  root=/dev/sda3
  append="acpi_osi=Linux"


Even with append="acpi_osi=Windows" under gentoo-test, the problem persists.

jyoung
Guru

Joined: 20 Mar 2007
Posts: 440

Posted: Wed Jun 09, 2021 12:34 am

The Gentoo wiki page on Intel microcode also states

Quote:
If the initramfs USE flag is active the intel-microcode ebuild will automatically install a cpio archive of all microcode into /boot/intel-uc.img.


With equery uses intel-microcode I get

Code:
 - - hostonly    : only install ucode(s) supported by currently available (=online) processor(s)
 - - initramfs   : install a small initramfs for use with CONFIG_MICROCODE_EARLY
 + + split-ucode : install the split binary ucode files (used by the kernel directly)
 - - vanilla     : install only microcode updates from Intel's official microcode tarball


So, maybe I should focus on early microcode loading? But, there's no CONFIG_MICROCODE_EARLY in the .config file, and in menuconfig I can't find any reference to MICROCODE_EARLY with the '/' search.

Jaglover
Watchman

Joined: 29 May 2005
Posts: 8291
Location: Saint Amant, Acadiana

Posted: Wed Jun 09, 2021 1:30 am

If you build it into the kernel it will load early; there is no extra option for early loading. See the timestamp in my dmesg:
Code:
[    0.000000] microcode: microcode updated early to revision 0xea, date = 2021-01-05

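A quick way to check for the same message on another machine is to filter dmesg. This is only a sketch: the helper name microcode_early_revision is hypothetical, and it assumes the message has the same wording as the line quoted above, which may differ between kernel versions.

```shell
# Print the revision from an "updated early" microcode line, if any.
# Reads dmesg-style text on standard input; prints nothing when no
# such line is present.
microcode_early_revision() {
    sed -n 's/.*microcode updated early to revision \(0x[0-9a-fA-F]*\).*/\1/p'
}

# Usage (may need root, depending on how dmesg access is restricted):
#   dmesg | microcode_early_revision
```

An empty result after a warm reboot, but not after a cold start, would be one hint that the two boot paths differ.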

Hu
Moderator

Joined: 06 Mar 2007
Posts: 21632

Posted: Wed Jun 09, 2021 1:45 am

It appears that MICROCODE_EARLY was removed in commit fe055896c040df571e4ff56fb196d6845130057b in 2015. However, as Jaglover says, the functionality still exists. There is just no symbol for excluding it, because early microcode loading was deemed a better approach than supporting late microcode loading.

jyoung
Guru

Joined: 20 Mar 2007
Posts: 440

Posted: Wed Jun 09, 2021 1:50 am

Okay, so the microcode is in the kernel and loaded early... but even on a reboot? Logicien, you suggested that the microcode may not be reinitialized on a reboot, and that seems to fit the symptoms here.

Logicien
Veteran

Joined: 16 Sep 2005
Posts: 1555
Location: Montréal

Posted: Wed Jun 09, 2021 5:59 am

With i915 the backlight is always on, and I have not found a way to disable it after trying everything I could. In addition, the reboot is slow with it. On a Dell Optiplex 7100 the Dell EFI/BIOS logo does not reappear, the computer stays in an idle state, and the screen goes into power-save mode. Replacing i915 with efifb resolved all the problems: the backlight is off and reboot is good. But efifb does not perform as well as i915 in terms of FPS.

Now I use an AMD/ATI PCIe expansion card and the radeon module works well. But the integrated Intel APU performs best in terms of frames per second (FPS). Anyway, I think a cold poweroff is better than a reboot when testing an upgrade, for example.

If you use Grub2 and it displays properly, you can try setting the parameter GRUB_GFXPAYLOAD_LINUX=keep in the /etc/default/grub file and see whether Linux uses it and boots properly too. Or use the Linux kernel parameter video= to tell Linux which resolution to use.

jyoung
Guru

Joined: 20 Mar 2007
Posts: 440

Posted: Fri Jun 11, 2021 3:49 am

Okay, I can set up grub and try out the GRUB_GFXPAYLOAD_LINUX=keep option. lilo is nice and simple, but perhaps we've hit its limits. I should be able to report back on that sometime tomorrow.

With the video= kernel option, would that be the resolution of the monitor? I have to admit that it seems kind of weird to need to put the monitor resolution into the bootloader, but it's easy enough to try.

Agreed, it does seem that a cold restart is preferable! But it would be great to get the reboot ability working. It's sometimes necessary for me to reboot this machine remotely.

jyoung
Guru

Joined: 20 Mar 2007
Posts: 440

Posted: Fri Jun 11, 2021 6:32 pm

This afternoon I switched to grub2 and added GRUB_GFXPAYLOAD_LINUX=keep to /etc/default/grub. Booting through grub works as normal, but rebooting through grub hangs right after the grub menu, when it prints 'Loading Linux 5.12.7-gentoo ...'. It seems like the issue is the same as with lilo.

Jaglover
Watchman

Joined: 29 May 2005
Posts: 8291
Location: Saint Amant, Acadiana

Posted: Fri Jun 11, 2021 6:51 pm

I think I have a minor version of this bug. When I reboot, my 2560x1440 display is not detected and comes up at 1920x1080; furthermore, the 2560x1440 resolution is not available in X either. I haven't worked on this as I reboot very seldom.
Have you played with the EDID loading option in the kernel?

Edit: Have to retract. Having a closer look at the kernel options, I do not see anything that would affect my Intel HD 630 reboot.

jyoung
Guru

Joined: 20 Mar 2007
Posts: 440

Posted: Sun Jun 13, 2021 12:21 am

When I reboot off the old kernel (5.6.11), the bug does not occur. So either something was messed up in the migration, or there's a bug in the new (5.12.7) source. Or, there was something messed up in the 5.6.11 .config file that, by pure luck, remained asymptomatic until the migration.

When I migrated from 5.6.11 to 5.12.7, I used make olddefconfig. I'm going to try rebuilding 5.12.7 from scratch, and see if that works any better.

Jaglover
Watchman

Joined: 29 May 2005
Posts: 8291
Location: Saint Amant, Acadiana

Posted: Sun Jun 13, 2021 1:18 am

I can't imagine a case where I would use olddefconfig; it overwrites (modifies) the kernel configuration without even notifying you of what was done. I certainly do not want such a disaster for my kernels, considering how many default options I have to change every time I run oldconfig.

Tony0945
Watchman

Joined: 25 Jul 2006
Posts: 5127
Location: Illinois, USA

Posted: Sun Jun 13, 2021 2:35 pm

I agree with Jaglover. I recently posted a build script in "Tips and Tricks". It uses "make oldconfig", not "make olddefconfig".
You should have a /boot/config-<something or other> from your working kernel. Just eselect the new kernel and then pass the location of that config as a parameter to that script.
Better yet, boot the working kernel, then eselect the new kernel and just run the script. This assumes that you have the config built into the kernel.
See https://www.xaprb.com/blog/2006/05/23/how-to-use-linuxs-proc-config-feature/
If not, just run it, passing the config I referenced above.
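
The approach described above can be sketched roughly as follows. This is an illustration, not the script from "Tips and Tricks": the function name seed_kernel_config and the example paths (/boot/config-5.6.11-gentoo, /usr/src/linux) are assumptions, and /proc/config.gz is only present when the running kernel was built with CONFIG_IKCONFIG_PROC.

```shell
# Copy a known-good kernel config to a destination file, decompressing
# it when the source name ends in .gz (as /proc/config.gz does).
# $1: source config, e.g. /proc/config.gz or /boot/config-5.6.11-gentoo
# $2: destination, e.g. /usr/src/linux/.config
seed_kernel_config() {
    src=$1
    dest=$2
    case "$src" in
        *.gz) gzip -dc "$src" > "$dest" ;;
        *)    cp "$src" "$dest" ;;
    esac
}

# Usage (as root, with /usr/src/linux pointing at the new sources):
#   seed_kernel_config /proc/config.gz /usr/src/linux/.config
#   cd /usr/src/linux && make oldconfig
```

Unlike olddefconfig, make oldconfig prompts for each new symbol, so every change from the old configuration is a visible decision.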


Last edited by Tony0945 on Sun Jun 13, 2021 5:15 pm; edited 1 time in total

jyoung
Guru

Joined: 20 Mar 2007
Posts: 440

Posted: Sun Jun 13, 2021 4:36 pm

Indeed, it appears that this thread will read as a cautionary tale for those who might opt for make olddefconfig. I just set up a new .config file from scratch, compiled and installed the kernel, and I was able to reboot without issue.

Tony0945, today or tomorrow I'm going to try some of the tips you suggested to make a clean migration from the old kernel to the new one. I'll report back, but it certainly looks like we're close to solving this issue.

jyoung
Guru

Joined: 20 Mar 2007
Posts: 440

Posted: Thu Jun 17, 2021 4:19 pm

Okay, I'm marking this thread as 'solved'. The root of the problem was one of the default options pulled in by 'make olddefconfig'. Thanks a lot to everyone for troubleshooting this with me!

GDH-gentoo
Veteran

Joined: 20 Jul 2019
Posts: 1530
Location: South America

Posted: Thu Jun 17, 2021 6:28 pm

jyoung wrote:
The root of the problem was with one of the default options pulled in by 'make olddefconfig'.

For the benefit of future readers, why don't you tell us which option that was and what setting fixed the problem?

jyoung
Guru

Joined: 20 Mar 2007
Posts: 440

Posted: Mon Jul 26, 2021 12:37 am

That's a good point, GDH-gentoo. I just ran diff on the two config files, and the differences are quite numerous. I'd be happy to post the entire list, but I wonder if there's a good way to determine the key differences.
All times are GMT