Gentoo Forums
[SOLVED] No video on new kernel...

steveL
Watchman

Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

Posted: Sat May 12, 2018 12:28 am

OK, got it, though at the point where you could not log in and get a console, I would always sort that out first, which you can still do now.

Start by using an older kernel, where you know you can get video, so you at least have a properly functional machine.

Then I'd follow Hu's advice, if you can git; if not, #git are lovely souls, very helpful and informative.

Chances are at least 2 or 3 people in there will walk you through the bisect.

The_Great_Sephiroth
Veteran

Joined: 03 Oct 2014
Posts: 1602
Location: Fayetteville, NC, USA

Posted: Sun May 13, 2018 3:42 pm

I've never done any of that. My issue right now is time. I am swamped at work, now doing 10+ hour days, so time to sit down, learn bisect, test different kernels, and what-not is very slim. I believe the best thing to do for now is roll back to 4.9.76 and worry about this when I am back to normal operating hours. Spring is usually one of our busy seasons. I will look into what bisect is, but again, limited time. Thanks to both of you for the help.
_________________
Ever picture systemd as what runs "The Borg"?

Hu
Moderator

Joined: 06 Mar 2007
Posts: 21635

Posted: Sun May 13, 2018 4:25 pm

I found https://wiki.gentoo.org/wiki/Kernel_git-bisect via Google search for git bisect kernel and read through it. It looks like a decent guide. I disagree with its implicit recommendation to use the root account to build the kernel, but that is irrelevant to your current problem.

Your experiment may go faster if you have one machine building kernels and a separate machine to try them, so that you can start building a new kernel immediately upon identifying whether the latest result is good or bad. If you did it all on one machine, you would need to get back to a working build environment after each reboot.

According to my earlier post, you'll need about 11 build/install/test cycles to find the bad patch. The time required for that will vary quite a bit depending on how long it takes you to build one kernel (which depends on how many features are enabled) and to some extent on how many of those patches change headers. If a header is changed, all the files that use it rebuild, which in the worst case means rebuilding everything. If only a source file changes, the rebuild only recompiles that file and relinks the kernel. git diff --stat v4.9.76 v4.9.95 -- '*.h' says 232 header files changed in that window. Some of them are irrelevant to you (other architectures, tools rather than kernel, etc.), but that's still a lot. Conservatively, figure at least the first half of the bisection run will be mostly full recompiles.
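
For orientation, the manual loop can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the wiki's exact procedure: the helper names are made up for illustration, the endpoints are the known-good and known-bad releases mentioned in this thread, and it assumes you are inside a git clone of the stable kernel tree with your .config already in place.

```sh
#!/bin/sh
# Sketch only: helper names are hypothetical; endpoints are taken from this thread.
GOOD=v4.9.76   # last release known to give a working console
BAD=v4.9.95    # release that comes up with a blank screen

# Run once, from inside the kernel git tree.
bisect_begin() {
    git bisect start &&
    git bisect bad "$BAD" &&
    git bisect good "$GOOD"
}

# Build whatever commit git has checked out. Installing the result and
# rebooting into it are left to you.
bisect_build() {
    make olddefconfig && make -j"$(nproc)"
}

# After the test boot: "good" if the console showed text, "bad" if it stayed blank.
bisect_mark() {
    case "$1" in
        good|bad) git bisect "$1" ;;
        *) echo "usage: bisect_mark good|bad" >&2; return 2 ;;
    esac
}

# Once git names the first bad commit: save the log, then restore the original checkout.
bisect_finish() {
    git bisect log > bisect.log && git bisect reset
}
```

Each round is then bisect_build, install, reboot, and bisect_mark good or bisect_mark bad; git checks out the next candidate on its own after every verdict.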

If you have a trusted rookie and a sacrificial machine, this task can be delegated to that person. There is little creativity in the process, and almost no use for expertise. It only involves a human element at all because we don't have a good automated recognizer for determining that the resulting kernel is "bad," since we don't know yet why the screen comes up blank on bad kernels and it is not worth writing a tool to use a webcam to watch the screen and diagnose automatically from the image. :)

steveL
Watchman

Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

Posted: Mon May 14, 2018 2:28 pm

Hu wrote:
There is little creativity in the process, and almost no use for expertise. It only involves a human element at all because we don't have a good automated recognizer for determining that the resulting kernel is "bad," since we don't know yet why the screen comes up blank on bad kernels and it is not worth writing a tool to use a webcam to watch the screen and diagnose automatically from the image. :)
Hmm, wrt that use-case on kernels, it seems to me one could easily automate it given a two-machine setup: you just set up the machine booting the test kernel to send out a known packet, which it might do for net startup anyway (so you could likely just use existing tools to verify; netcat comes to mind, but I'd ask #bash first).
Obviously you'd need a bit of hw connect, for power/reset, but not much.

Hu
Moderator

Joined: 06 Mar 2007
Posts: 21635

Posted: Tue May 15, 2018 1:31 am

It might be subject to automation in some cases, but in this case, it is not. The "bad" kernels boot and run fine. They do not panic, nor fail any particular system call, nor omit any particular device. The only problem with them is that they do not drive the text console over his monitor, and he objects to doing everything blind. So far, no one in the thread has identified a way to programmatically detect that the monitor is not producing useful output. If we had that, then yes, the bisection could be easily automated.
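
For the cases where such a check does exist, git can drive the loop itself with git bisect run, which reads the exit status of a command: 0 means good, 125 means "cannot test, skip this commit", and any other status from 1 to 127 means bad. A rough sketch of the wrapper logic follows. The function name is made up, and both the build command and the check command are supplied by the caller; nobody in this thread has written a detector for the blank console.

```sh
#!/bin/sh
# Sketch only: map a build step and a test step onto the exit codes
# that "git bisect run" understands.
#   $1 = build command, $2 = check command (both run via sh -c)
bisect_verdict() {
    # A commit that does not build cannot be judged: ask git to skip it.
    sh -c "$1" || return 125
    # Check failed: report the commit as bad.
    sh -c "$2" || return 1
    # Build and check both succeeded: report the commit as good.
    return 0
}
```

A small script that calls bisect_verdict with the real build and check commands would be what gets handed to git bisect run. For a kernel, the check command would also have to install the build, boot it on the test machine, and collect the result, which is where the hardware for power and reset comes in.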

The_Great_Sephiroth
Veteran

Joined: 03 Oct 2014
Posts: 1602
Location: Fayetteville, NC, USA

Posted: Tue May 15, 2018 3:09 am

OK, so I did one last thing this evening. I opened two konsole sessions in Plasma and went through every single item in the menu config, one at a time. On the old kernels an option to support framebuffer consoles was built-in, but on the new one it was a module. I am betting this was the issue. The kernel wouldn't load it. So I am now recompiling with this option built-in. Cross your fingers for me, would you? I'll report back in a bit!

The_Great_Sephiroth
Veteran

Joined: 03 Oct 2014
Posts: 1602
Location: Fayetteville, NC, USA

Posted: Tue May 15, 2018 3:32 am

OK, that was it. I had to build the option "Framebuffer console support" into the kernel. Using it as a module prevents framebuffers from working. No idea why it was a module in this kernel version, but that fixed everything. It was on my end, not the kernel. I just wish there was an easier way to compare kernel configurations. Still, I am golden now!
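
For anyone checking their own tree: the Kconfig symbol behind "Framebuffer console support" is CONFIG_FRAMEBUFFER_CONSOLE. A small sketch for reading how a symbol is set in a config file follows; the helper name is made up, and the path in the usage line is only the usual Gentoo location, so adjust it to wherever your kernel tree lives.

```sh
#!/bin/sh
# Sketch only: print how a symbol is set in a kernel config file.
# Prints "y", "m", another literal value, or "unset".
#   $1 = symbol name without the CONFIG_ prefix, $2 = config file
config_state() (
    # Matches e.g. CONFIG_FRAMEBUFFER_CONSOLE=m and prints the text after "=".
    # "# CONFIG_X is not set" lines do not match, so they report as unset.
    val=$(sed -n "s/^CONFIG_$1=//p" "$2")
    echo "${val:-unset}"
)
```

Usage would look like config_state FRAMEBUFFER_CONSOLE /usr/src/linux/.config, which should print y on a kernel configured the way that fixed it here.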

steveL
Watchman

Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

Posted: Tue May 15, 2018 5:37 pm

Hu wrote:
It might be subject to automation in some cases, but in this case, it is not. .. So far, no one in the thread has identified a way to programmatically detect that the monitor is not producing useful output. If we had that, then yes, the bisection could be easily automated.
Lul, fair enough (though ofc it is feasible.)

Glad it got sorted out, GS.

Tony0945
Watchman

Joined: 25 Jul 2006
Posts: 5127
Location: Illinois, USA

Posted: Tue May 15, 2018 11:40 pm

Hu wrote:
then there are only 15 commits to test.
Actually no more than 5 steps using binary search. Start with 4.9.96
Don't worry about which commit did it until you know at which kernel it failed.

Hu
Moderator

Joined: 06 Mar 2007
Posts: 21635

Posted: Wed May 16, 2018 1:43 am

The_Great_Sephiroth wrote:
I just wish there was an easier way to compare kernel configurations.
diff -u?
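Raw diff output on two .config files can be noisy when options have moved around between versions, so sorting first helps. A minimal sketch, assuming two saved config files (the function name is a placeholder); the kernel tree also ships scripts/diffconfig for the same job.

```sh
#!/bin/sh
# Sketch only: show options that differ between two kernel config files.
# Keeps "CONFIG_X=..." and "# CONFIG_X is not set" lines, drops section
# comments, and sorts so that mere reordering does not show up as a change.
#   $1 = old config, $2 = new config
config_diff() (
    old=$(mktemp)
    new=$(mktemp)
    grep -E '^(# )?CONFIG_' "$1" | sort > "$old"
    grep -E '^(# )?CONFIG_' "$2" | sort > "$new"
    # -U0 suppresses context lines; the grep keeps only the +/- option lines.
    diff -U0 "$old" "$new" | grep -E '^[+-](# )?CONFIG_'
    rm -f "$old" "$new"
)
```

Lines starting with - come from the first file and lines starting with + from the second, so an option that went from built-in to module shows up as a -...=y / +...=m pair.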
steveL wrote:
(though ofc it is feasible.)
I would consider it theoretically possible, but not feasible unless there were a nearly turn-key environment to combine running the kernel and detecting via optical input (i.e. webcam) whether the kernel was producing acceptable video output. Trusted rookies are great for this, but The_Great_Sephiroth seems not to have any on hand. Is there some automation framework that would have been readily appropriate here?
Tony0945 wrote:
Hu wrote:
then there are only 15 commits to test.
Actually no more than 5 steps using binary search. Start with 4.9.96
Don't worry about which commit did it until you know at which kernel it failed.
The rest of this post was written before I noticed the oddity of "15 commits" and reread the thread history to realize you were reacting to my (not quoted) qualifier about drivers/video. Since it shows some general analysis, I decided to post it anyway. It assumes no such gamble is made, and that the tester must consider every commit as potentially suspect.

---

Five steps to identify the stable kernel release in which it failed, but then you still need to know which commit is at fault, which means more tests within the identified kernel. If testing within that kernel requires more than 6 steps, you come out worse than doing a raw release-unaware bisection. Some kernels can easily cause that:
Code:
$ seq 76 94 | while read a; do b=$a; (( ++ b)); echo -n "v4.9.$a..v4.9.$b: "; git log --oneline "v4.9.$a..v4.9.$b" | wc; done
v4.9.76..v4.9.77:      99     803    6586
v4.9.77..v4.9.78:      49     399    3203
v4.9.78..v4.9.79:      68     593    4527
v4.9.79..v4.9.80:      87     781    5871
v4.9.80..v4.9.81:      92     697    6054
v4.9.81..v4.9.82:      88     736    6083
v4.9.82..v4.9.83:      76     630    5090
v4.9.83..v4.9.84:     146    1199    9841
v4.9.84..v4.9.85:      40     351    2801
v4.9.85..v4.9.86:      57     493    3757
v4.9.86..v4.9.87:      66     574    4327
v4.9.87..v4.9.88:      87     719    5706
v4.9.88..v4.9.89:     238    2162   16391
v4.9.89..v4.9.90:     177    1567   12077
v4.9.90..v4.9.91:      68     606    4616
v4.9.91..v4.9.92:      29     251    1920
v4.9.92..v4.9.93:     103     936    7342
v4.9.93..v4.9.94:     312    2717   21233
v4.9.94..v4.9.95:      69     585    4734
Any line that has more than 2**6 = 64 commits is a line where you need more than 6 steps within the kernel revision. Filtering for kernels with more than 64 commits gives:
Code:
$ seq 76 94 | while read a; do b=$a; (( ++ b)); c=$(git log --oneline "v4.9.$a..v4.9.$b" | wc -l); [[ "$c" -gt 64 ]] && echo "v4.9.$a..v4.9.$b: $c"; done
v4.9.76..v4.9.77: 99
v4.9.78..v4.9.79: 68
v4.9.79..v4.9.80: 87
v4.9.80..v4.9.81: 92
v4.9.81..v4.9.82: 88
v4.9.82..v4.9.83: 76
v4.9.83..v4.9.84: 146
v4.9.86..v4.9.87: 66
v4.9.87..v4.9.88: 87
v4.9.88..v4.9.89: 238
v4.9.89..v4.9.90: 177
v4.9.90..v4.9.91: 68
v4.9.92..v4.9.93: 103
v4.9.93..v4.9.94: 312
v4.9.94..v4.9.95: 69
There are 15 such lines. There are only 19 released stable kernels in the test range. So he has a 15/19 chance of doing more work with a two-step bisection than with a raw release-unaware bisection.
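
The same comparison can be redone as plain arithmetic. The sketch below assumes a strictly linear history, so that bisecting n candidates takes ceil(log2(n)) tests in the worst case; bisect_steps is a made-up helper, and the counts are the first column of the wc output above.

```sh
#!/bin/sh
# Sketch only: worst-case number of tests needed to bisect n candidates,
# i.e. ceil(log2(n)), assuming a linear history.
bisect_steps() (
    n=$1
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
)

# Commits per stable release, v4.9.76..v4.9.77 through v4.9.94..v4.9.95.
counts="99 49 68 87 92 88 76 146 40 57 66 87 238 177 68 29 103 312 69"

total=0
releases=0
for c in $counts; do
    total=$((total + c))
    releases=$((releases + 1))
done

raw=$(bisect_steps "$total")         # one bisection over every commit
stage1=$(bisect_steps "$releases")   # first narrow down to the failing release

echo "raw bisection: $total commits, $raw steps"
for c in $counts; do
    echo "failing release has $c commits: two-stage needs $((stage1 + $(bisect_steps "$c"))) steps"
done
```

With these counts the raw bisection covers 1951 commits in 11 steps, and finding the release takes 5, so the two-stage approach only breaks even when the failing release needs 6 or fewer further steps, that is, when it holds at most 64 commits.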

The_Great_Sephiroth
Veteran

Joined: 03 Oct 2014
Posts: 1602
Location: Fayetteville, NC, USA

Posted: Wed May 16, 2018 2:14 am

Thankfully it was an option I overlooked instead of a real issue. I was not looking forward to figuring out a new process and reporting stuff right now. I am studying for my amateur radio license (technician), working 50+hrs a week, raising my daughter, and working on a phantom coolant leak on my BMW. So who wants to come replace my cooling system? :D

Tony0945
Watchman

Joined: 25 Jul 2006
Posts: 5127
Location: Illinois, USA

Posted: Thu May 17, 2018 2:52 am

From a recent NeddySeagoon post:
Quote:
You don't have any console drivers set in your kernel. That's not a problem for booting but your console is limited to black text on a black background. I've done that too. It's rather hard to read. :)

steveL
Watchman

Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

Posted: Thu May 17, 2018 3:30 pm

Hu wrote:
I would consider it theoretically possible, but not feasible unless there were a nearly turn-key environment to combine running the kernel and detecting via optical input (i.e. webcam) whether the kernel was producing acceptable video output. ... Is there some automation framework that would have been readily appropriate here?
My bad, Hu: I was being nerdy, and thinking of the webcam (with a switch.)
No, I don't know of any such.

Hu
Moderator

Joined: 06 Mar 2007
Posts: 21635

Posted: Fri May 18, 2018 2:25 am

It's OK. I like technical solutions, but in this case, a trusted rookie (maybe an RCG?) is probably more cost effective. ;)