Marcih Apprentice


Joined: 19 Feb 2018 Posts: 213
|
Posted: Sat Dec 08, 2018 9:40 am Post subject: gentoo-sources-4.4.150 doesn't boot |
|
|
The boot process is stuck after selecting the kernel in the GRUB menu, i.e. at "Loading Linux 4.4.150-gentoo..."
What's peculiar is that 4.4.164 (which I chose for no real reason other than it being the latest stable in the 4.4 tree) is currently running on this machine with virtually the same config: I copied the config from the 4.4.164 source folder into the 4.4.150 source folder and ran "make silentoldconfig". Diff below: Code: | 3c3
< # Linux/x86 4.4.150-gentoo Kernel Configuration
---
> # Linux/x86 4.4.164-gentoo Kernel Configuration
360,361d359
< CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
< CONFIG_QUEUED_SPINLOCKS=y |
In 4.4.150, CONFIG_ARCH_USE_QUEUED_SPINLOCKS is selected by X86=y, while in 4.4.164 it has nothing to be selected by (going by the results of the search in menuconfig) so I assume that's why the options are not set.
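For anyone wanting to repeat the comparison without the comment noise, here is a minimal sketch of a helper that lists the symbols set in one config but not in the other. The function name is made up for illustration, and the paths in the usage comment are only the usual Gentoo locations, not something confirmed in this thread.

```shell
# Hypothetical helper: print CONFIG_ lines that appear in the first
# config file but not in the second (comments and "is not set" lines
# are ignored).
config_only_in_first() {
    first=$1
    second=$2
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || return 1
    grep '^CONFIG_' "$first" | sort > "$tmp_a"
    grep '^CONFIG_' "$second" | sort > "$tmp_b"
    # comm -23 keeps only the lines unique to the first sorted file
    comm -23 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}

# Usage (paths are an assumption):
#   config_only_in_first /usr/src/linux-4.4.150-gentoo/.config \
#                        /usr/src/linux-4.4.164-gentoo/.config
```

Swapping the two arguments shows what the second config has that the first lacks.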
Any help with this issue? Alternatively, how can I acquire the sources, or more specifically the genpatches, for out-of-tree kernel versions? When I tried to build the manifest for the ebuild I copied from Gentoo's git repo into my local overlay, it failed because it couldn't fetch the genpatches. _________________
Bones McCracker wrote: | It wouldn't be so bad, if it didn't suck. |
NeddySeagoon wrote: | The problem with leaving is that you can only do it once and it reduces your influence. |
|
|
|
 |
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 55200 Location: 56N 3W
|
Posted: Sat Dec 08, 2018 10:30 am Post subject: |
|
|
Marcih,
It sounds like your console driver no longer works.
Start here for genpatches
Code: | make silentoldconfig | doesn't always do the right thing. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
|
 |
Marcih Apprentice


Joined: 19 Feb 2018 Posts: 213
|
Posted: Sat Dec 08, 2018 1:21 pm Post subject: |
|
|
Thanks, I feel I should've found that myself...
What do I do with the tarball, stick it in distfiles so Portage won't try to fetch it from a mirror? (EDIT: That didn't work...)
NeddySeagoon wrote: | It sounds like your console driver no longer works.
Code: | make silentoldconfig | doesn't always do the right thing. |
Would a missing console driver prohibit the system from booting altogether? How does one check whether that is the problem? If silentoldconfig doesn't always do the right thing, is there a better alternative? _________________
Bones McCracker wrote: | It wouldn't be so bad, if it didn't suck. |
NeddySeagoon wrote: | The problem with leaving is that you can only do it once and it reduces your influence. |
Last edited by Marcih on Sat Dec 08, 2018 1:30 pm; edited 1 time in total |
|
|
 |
Jaglover Watchman


Joined: 29 May 2005 Posts: 8291 Location: Saint Amant, Acadiana
|
|
|
 |
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 55200 Location: 56N 3W
|
Posted: Sat Dec 08, 2018 1:28 pm Post subject: |
|
|
Marcih,
Yep, put all the bits into distfiles, so portage just uses what it finds.
I use make oldconfig so I see the new options one at a time.
There have been a few odd choices over the years, like turning off all PCI support and turning off PCI support for USB.
I failed to spot all PCI support being off, so my kernel wouldn't boot, but I did catch the USB one, which I depend on for my keyboard.
You win some and lose some.
A missing console driver will not prevent booting. You just get no console output.
Once boot completes, ssh will work, if you have it configured. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
|
 |
Marcih Apprentice


Joined: 19 Feb 2018 Posts: 213
|
Posted: Sat Dec 08, 2018 1:40 pm Post subject: |
|
|
NeddySeagoon wrote: | Yep, put all the bits into distfiles, so portage just uses what it finds. |
Ignore the edit on the last post; the ebuilds require a genpatches version that differs from the kernel's own patch version(?), i.e. 4.14.70 has K_GENPATCHES_VER=76, 4.4.156 requires 157, etc. Why is this?
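To see which genpatches release a given ebuild expects before hunting for tarballs, the value can be read straight out of the ebuild. This is only a sketch: the function name is hypothetical, and it assumes the variable is assigned on a line of its own as a plain number, optionally in double quotes.

```shell
# Hypothetical helper: print the K_GENPATCHES_VER value declared in an
# ebuild file (first match only).
genpatches_ver() {
    sed -n 's/^K_GENPATCHES_VER="\{0,1\}\([0-9][0-9]*\)"\{0,1\}.*/\1/p' "$1" | head -n 1
}

# Usage (the overlay path is an assumption):
#   genpatches_ver /usr/local/portage/sys-kernel/gentoo-sources/gentoo-sources-4.4.156.ebuild
```

The printed number is the release to look for among the genpatches tarballs, rather than the kernel's own patch level.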
NeddySeagoon wrote: | A missing console driver will not prevent booting. You just get no console output.
Once boot completes, ssh will work, if you have it configured. |
I don't have sshd running, but for example toggling numlock should work (the LED on the keyboard should turn on/off) and it doesn't, that's why I figure the kernel panics at some point during the boot process. I feel I should also point out I'm using x11-drivers/nvidia-drivers-390.87.
ACK (EDIT: make oldconfig produces the same kernel config) _________________
Bones McCracker wrote: | It wouldn't be so bad, if it didn't suck. |
NeddySeagoon wrote: | The problem with leaving is that you can only do it once and it reduces your influence. |
|
|
|
 |
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 55200 Location: 56N 3W
|
Posted: Sat Dec 08, 2018 4:02 pm Post subject: |
|
|
Marcih,
make oldconfig will produce the same .config if you accept the defaults.
The idea is to not do that but you need to read the help.
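A rough sketch of carrying a config from one source tree to another while actually looking at what is new. The function name and the example paths are assumptions; listnewconfig and oldconfig are standard kernel make targets, but check that listnewconfig exists in your tree before relying on it.

```shell
# Hypothetical helper: copy .config from an old source tree into a new
# one, list the symbols that are new there, then run oldconfig so each
# new option is prompted for.
carry_config_forward() {
    old_src=$1
    new_src=$2
    [ -f "$old_src/.config" ] || { echo "no .config in $old_src" >&2; return 1; }
    [ -d "$new_src" ] || { echo "no such directory: $new_src" >&2; return 1; }
    cp "$old_src/.config" "$new_src/.config" || return 1
    # Run in a subshell so the caller's working directory is untouched
    ( cd "$new_src" && make listnewconfig && make oldconfig )
}

# Usage (paths are an assumption):
#   carry_config_forward /usr/src/linux-4.4.164-gentoo /usr/src/linux-4.4.150-gentoo
```

If listnewconfig prints nothing, oldconfig has nothing to ask, which would explain a silent run.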
Please pastebin your kernel .config. Let's see if we can get a console for some diagnostics.
Do you use EFI or legacy BIOS to boot?
That will make a difference to the console options that are useful to you.
Post the output of lspci -knn too please and tell us your root filesystem type. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
|
 |
Marcih Apprentice


Joined: 19 Feb 2018 Posts: 213
|
Posted: Sat Dec 08, 2018 5:11 pm Post subject: |
|
|
NeddySeagoon wrote: | make oldconfig will produce the same .config if you accept the defaults.
The idea is to not do that but you need to read the help. |
Weird, I recall using "silentoldconfig" for major kernel updates (or when downgrading from 4.14 to 4.9 back when that happened) and it asked about every new option... Either way, neither command prompted me for anything when I used the config from 4.4.164 for 4.4.150.
NeddySeagoon wrote: | Please pastebin your kernel .config. Let's see if we can get a console for some diagnostics.
Do you use EFI or legacy BIOS to boot?
That will make a difference to the console options that are useful to you.
Post the output of lspci -knn too please and tell us your root filesystem type. |
Kernel config.
I use GRUB2 EFI to boot.
Code: | $ /usr/sbin/lspci -knn
00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers [8086:191f] (rev 07)
Subsystem: ASUSTeK Computer Inc. Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers [1043:8694]
00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 07)
Kernel driver in use: pcieport
00:14.0 USB controller [0c03]: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller [8086:a12f] (rev 31)
Subsystem: ASUSTeK Computer Inc. Sunrise Point-H USB 3.0 xHCI Controller [1043:8694]
Kernel driver in use: xhci_hcd
00:16.0 Communication controller [0780]: Intel Corporation Sunrise Point-H CSME HECI #1 [8086:a13a] (rev 31)
Subsystem: ASUSTeK Computer Inc. Sunrise Point-H CSME HECI [1043:8694]
00:17.0 SATA controller [0106]: Intel Corporation Sunrise Point-H SATA controller [AHCI mode] [8086:a102] (rev 31)
Subsystem: ASUSTeK Computer Inc. Sunrise Point-H SATA controller [AHCI mode] [1043:8694]
Kernel driver in use: ahci
00:1c.0 PCI bridge [0604]: Intel Corporation Sunrise Point-H PCI Express Root Port #5 [8086:a114] (rev f1)
Kernel driver in use: pcieport
00:1d.0 PCI bridge [0604]: Intel Corporation Sunrise Point-H PCI Express Root Port #9 [8086:a118] (rev f1)
Kernel driver in use: pcieport
00:1d.2 PCI bridge [0604]: Intel Corporation Sunrise Point-H PCI Express Root Port #11 [8086:a11a] (rev f1)
Kernel driver in use: pcieport
00:1d.3 PCI bridge [0604]: Intel Corporation Sunrise Point-H PCI Express Root Port #12 [8086:a11b] (rev f1)
Kernel driver in use: pcieport
00:1f.0 ISA bridge [0601]: Intel Corporation Sunrise Point-H LPC Controller [8086:a148] (rev 31)
Subsystem: ASUSTeK Computer Inc. Sunrise Point-H LPC Controller [1043:8694]
00:1f.2 Memory controller [0580]: Intel Corporation Sunrise Point-H PMC [8086:a121] (rev 31)
Subsystem: ASUSTeK Computer Inc. Sunrise Point-H PMC [1043:8694]
00:1f.3 Audio device [0403]: Intel Corporation Sunrise Point-H HD Audio [8086:a170] (rev 31)
Subsystem: ASUSTeK Computer Inc. Sunrise Point-H HD Audio [1043:86ae]
Kernel driver in use: snd_hda_intel
00:1f.4 SMBus [0c05]: Intel Corporation Sunrise Point-H SMBus [8086:a123] (rev 31)
Subsystem: ASUSTeK Computer Inc. Sunrise Point-H SMBus [1043:8694]
00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-V [8086:15b8] (rev 31)
Subsystem: ASUSTeK Computer Inc. Ethernet Connection (2) I219-V [1043:8672]
Kernel driver in use: e1000e
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204 [GeForce GTX 970] [10de:13c2] (rev a1)
Subsystem: Gigabyte Technology Co., Ltd GM204 [GeForce GTX 970] [1458:367a]
Kernel driver in use: nvidia
Kernel modules: nvidia_drm, nvidia
01:00.1 Audio device [0403]: NVIDIA Corporation GM204 High Definition Audio Controller [10de:0fbb] (rev a1)
Subsystem: Gigabyte Technology Co., Ltd GM204 High Definition Audio Controller [1458:367a]
Kernel driver in use: snd_hda_intel
04:00.0 PCI bridge [0604]: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge [1b21:1080] (rev 04)
06:00.0 USB controller [0c03]: ASMedia Technology Inc. ASM1142 USB 3.1 Host Controller [1b21:1242]
Subsystem: ASUSTeK Computer Inc. ASM1142 USB 3.1 Host Controller [1043:8675]
Kernel driver in use: xhci_hcd |
I use ext4 for my rootfs.
I should also mention that 4.4.156 boots fine (that's the version I wanted the genpatches for), so it may just be that something got fixed somewhere between 150 and 156. _________________
Bones McCracker wrote: | It wouldn't be so bad, if it didn't suck. |
NeddySeagoon wrote: | The problem with leaving is that you can only do it once and it reduces your influence. |
|
|
|
 |
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 55200 Location: 56N 3W
|
Posted: Sat Dec 08, 2018 5:46 pm Post subject: |
|
|
Marcih,
That all looks good, it just doesn't work.
Enable Code: | # CONFIG_FB_SIMPLE is not set | and try again. The kernel will inherit GRUB's framebuffer and draw on it until something better comes along.
Alternatively turn off Code: | CONFIG_FRAMEBUFFER_CONSOLE=y | and have the old 80x24 text console.
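To confirm how the two symbols are currently set before and after rebuilding, a small check against the .config is enough. A minimal sketch follows; the function name is hypothetical, and the symbol is passed without its CONFIG_ prefix.

```shell
# Hypothetical helper: report the state of one kernel config symbol.
# Prints the value (y, m, a string, ...), "not set", or "absent".
config_state() {
    symbol=$1
    file=$2
    if line=$(grep "^CONFIG_${symbol}=" "$file"); then
        # Strip everything up to and including the first "="
        printf '%s\n' "${line#*=}"
    elif grep -q "^# CONFIG_${symbol} is not set" "$file"; then
        echo "not set"
    else
        echo "absent"
    fi
}

# Usage (path is an assumption):
#   config_state FB_SIMPLE /usr/src/linux/.config
#   config_state FRAMEBUFFER_CONSOLE /usr/src/linux/.config
```

"absent" means the symbol is not visible in that tree at all, which is different from being switched off.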
Neither will interfere with nvidia-drivers. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
|
 |
Marcih Apprentice


Joined: 19 Feb 2018 Posts: 213
|
Posted: Sat Dec 08, 2018 5:55 pm Post subject: |
|
|
NeddySeagoon wrote: | Enable Code: | # CONFIG_FB_SIMPLE is not set | and try again. The kernel will inherit GRUB's framebuffer and draw on it until something better comes along. |
Are you sure that'll work with EFI? I currently havewhich seems to already inherit GRUB's FB, or at least the resolution, with Code: | GRUB_GFXPAYLOAD_LINUX=keep |
NeddySeagoon wrote: | Alternatively turn off Code: | CONFIG_FRAMEBUFFER_CONSOLE=y | and have the old 80x24 text console. |
That's not really an option for me since I like to spend plenty of time in a plain TTY console. _________________
Bones McCracker wrote: | It wouldn't be so bad, if it didn't suck. |
NeddySeagoon wrote: | The problem with leaving is that you can only do it once and it reduces your influence. |
|
|
|
 |
Jaglover Watchman


Joined: 29 May 2005 Posts: 8291 Location: Saint Amant, Acadiana
|
|
|
 |
Marcih Apprentice


Joined: 19 Feb 2018 Posts: 213
|
Posted: Sat Dec 08, 2018 6:12 pm Post subject: |
|
|
Jaglover wrote: | Last time I used nVidia its driver had KMS module for console, you are not using it? |
I know it does and I have no idea whether I am or not!
How can I check? _________________
Bones McCracker wrote: | It wouldn't be so bad, if it didn't suck. |
NeddySeagoon wrote: | The problem with leaving is that you can only do it once and it reduces your influence. |
|
|
|
 |
Jaglover Watchman


Joined: 29 May 2005 Posts: 8291 Location: Saint Amant, Acadiana
|
|
|
 |
Marcih Apprentice


Joined: 19 Feb 2018 Posts: 213
|
Posted: Tue Dec 11, 2018 10:48 am Post subject: |
|
|
Neddy,
I tried your suggestions and 4.4.150 doesn't boot even with this config.
As I suggested earlier, I doubted that this was a console or framebuffer issue. Furthermore, 4.4.156 boots fine after simply running "make oldconfig" on the config from 4.4.164, so it must be something else. _________________
Bones McCracker wrote: | It wouldn't be so bad, if it didn't suck. |
NeddySeagoon wrote: | The problem with leaving is that you can only do it once and it reduces your influence. |
|
|
|
 |
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 55200 Location: 56N 3W
|
Posted: Tue Dec 11, 2018 11:07 am Post subject: |
|
|
Marcih,
... unless you messed up the kernel install ... that happens too.
What does ls -al /boot/ show?
Both with and without boot mounted.
With boot not mounted, it should be empty. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
|
 |
Marcih Apprentice


Joined: 19 Feb 2018 Posts: 213
|
Posted: Tue Dec 11, 2018 11:39 am Post subject: |
|
|
With all due respect Neddy, I know you're trying to go through all the possibilities in order to help me find a solution, but checking that my /boot is mounted when "make install"ing a newly-compiled kernel is almost the first thing I learned to do when I came to Gentoo.
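For what it's worth, that check can be scripted so it never gets skipped. A minimal sketch, assuming a Linux system with /proc/mounts; the function name is made up, and the optional second argument exists only so the check can be pointed at another mount table.

```shell
# Hypothetical helper: succeed if the given directory is listed as a
# mount point in the mount table (default: /proc/mounts).
is_mounted() {
    dir=${1%/}                      # "/boot/" -> "/boot", "/" -> ""
    mounts=${2:-/proc/mounts}
    awk -v d="${dir:-/}" '$2 == d { found = 1 } END { exit !found }' "$mounts"
}

# Usage:
#   is_mounted /boot || mount /boot
#   is_mounted /boot && make install
```

This only proves that something is mounted on /boot, not that it is the right partition.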
Nevertheless: Code: | # ls -al /boot/
total 8
drwxr-xr-x 2 root root 4096 Apr 21 2018 .
drwxr-xr-x 22 root root 4096 Nov 20 19:44 ..
# mount /boot/
# ls -al /boot/
total 17113
drwxr-xr-x 4 root root 2048 Jan 1 1970 .
drwxr-xr-x 22 root root 4096 Nov 20 19:44 ..
-rwxr-xr-x 1 root root 89250 Dec 11 11:34 config-4.4.150-gentoo
-rwxr-xr-x 1 root root 89326 Dec 8 15:45 config-4.4.156-gentoo
drwxr-xr-x 4 root root 512 Mar 17 2018 EFI
drwxr-xr-x 6 root root 512 Dec 11 11:10 grub
-rwxr-xr-x 1 root root 2863977 Dec 11 11:34 System.map-4.4.150-gentoo
-rwxr-xr-x 1 root root 3023510 Dec 8 15:45 System.map-4.4.156-gentoo
-rwxr-xr-x 1 root root 5464384 Dec 11 11:34 vmlinuz-4.4.150-gentoo
-rwxr-xr-x 1 root root 5984576 Dec 8 15:45 vmlinuz-4.4.156-gentoo |
_________________
Bones McCracker wrote: | It wouldn't be so bad, if it didn't suck. |
NeddySeagoon wrote: | The problem with leaving is that you can only do it once and it reduces your influence. |
|
|
|
 |
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 55200 Location: 56N 3W
|
Posted: Tue Dec 11, 2018 11:51 am Post subject: |
|
|
Marcih,
I know.
I spent almost a whole day in #gentoo helping someone fix their sound.
It wasn't working because they didn't mount boot ... and I didn't ask. :)
Rule one is assume nothing. That lesson above really drove it home.
Hence there are no stupid questions, except the one you don't ask.
Now I suppose it's a question of how much work you want to put into finding the problem.
You can do a binary search to establish the version that introduced the problem, then use git bisect to do a binary search on the commits that went into that version.
That will identify the problem commit. It's a lot of work. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
|
 |
Marcih Apprentice


Joined: 19 Feb 2018 Posts: 213
|
Posted: Wed Dec 12, 2018 10:39 am Post subject: |
|
|
NeddySeagoon wrote: | I know.
I spent almost a whole day in #gentoo helping someone fix their sound.
It wasn't working because they didn't mount boot ... and I didn't ask. :)
Rule one is assume nothing. That lesson above really drove it home.
Hence there are no stupid questions, except the one you don't ask. |
Understandable. I remember reading a post of yours where you stated that rule; I've been living by it ever since (at least when troubleshooting).
NeddySeagoon wrote: | Now I suppose it's a question of how much work you want to put into finding the problem.
You can do a binary search to establish the version that introduced the problem, then use git bisect to do a binary search on the commits that went into that version.
That will identify the problem commit. It's a lot of work. |
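For reference, the version-level binary search from the quote can be sketched as below. In this thread the situation is the mirror image of the usual one (150 fails, 156 boots), so the sketch looks for the first patch level that boots. The function name is hypothetical, the predicate stands in for a manual build-and-reboot, and it assumes that once a patch level boots, all later ones do too.

```shell
# first_booting_patchlevel LOW HIGH PREDICATE
# LOW is a patch level known not to boot, HIGH one known to boot.
# PREDICATE is a command that takes a patch level and succeeds when
# that kernel boots (in practice: build it, reboot, answer by hand).
first_booting_patchlevel() {
    low=$1
    high=$2
    predicate=$3
    while [ $((high - low)) -gt 1 ]; do
        mid=$(( (low + high) / 2 ))
        if "$predicate" "$mid"; then
            high=$mid      # mid boots: the change is at or before mid
        else
            low=$mid       # mid fails: the change is after mid
        fi
    done
    echo "$high"
}

# Usage sketch with an interactive predicate:
#   ask() { printf 'Does 4.4.%s boot? [y/n] ' "$1"; read -r a; [ "$a" = y ]; }
#   first_booting_patchlevel 150 156 ask
```

Between 150 and 156 this needs at most three test boots; the commit-level search within the resulting release is then a job for git bisect.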
A bit of background on why I'm even trying to run such an "old" kernel anyways:
My experience has been getting progressively worse ever since I came to Gentoo Linux in August of last year (this is not the fault of Gentoo as such; rather, it's the fault of the upstreams of several important packages). For example, since I moved to 4.14, an attached USB mass storage device gets recognised and an entry about it appears in dmesg, but it doesn't get attached to a block device, so I can't use it. Likewise, versions of libinput past 1.7.* don't support the "double-click" LMB feature of my mouse. In short, when I came here everything worked perfectly, but things have slowly been breaking down over time, bit by bit.
I've decided to stop updating, keep a close eye on GLSAs and CVEs for potential security issues, and start rolling back parts of my system to the oldest version that works. One of the first "victims" was the Linux kernel. With the oldest gentoo-sources in the tree, 4.4.150, not working, I arbitrarily chose another version, 4.4.156 (with it being the last minor version before the whole CoC, it seemed as good of a place as any), which works wonderfully, and called it a day. _________________
Bones McCracker wrote: | It wouldn't be so bad, if it didn't suck. |
NeddySeagoon wrote: | The problem with leaving is that you can only do it once and it reduces your influence. |
|
|
|
 |
Jaglover Watchman


Joined: 29 May 2005 Posts: 8291 Location: Saint Amant, Acadiana
|
|
|
 |
Marcih Apprentice


Joined: 19 Feb 2018 Posts: 213
|
Posted: Wed Dec 12, 2018 3:10 pm Post subject: |
|
|
I'm aware this is not the ideal long-term strategy, Jaglover. I should've also mentioned that I'm hoping to have moved away from Linux by the time 4.4 (or maybe even 4.9) LTS ends, whether it be to Haiku, which I'm very excited to see become usable, or maybe one of the BSDs. What I'm doing is a temporary thing until I've scouted out my choices.
As for reporting the bugs, yes, I've of course been meaning to do that for a while now. However I may not always figure out which package is causing the incorrect behaviour, like I have in the case of libinput; this makes bug reports nigh-impossible. _________________
Bones McCracker wrote: | It wouldn't be so bad, if it didn't suck. |
NeddySeagoon wrote: | The problem with leaving is that you can only do it once and it reduces your influence. |
|
|
|
 |
|