Gentoo Forums
kernel greater than 5.17.11 makes virtualbox unstable
jagdpanther
l33t
Joined: 22 Nov 2003
Posts: 729

Posted: Fri Jun 10, 2022 2:03 am    Post subject: kernel greater than 5.17.11 makes virtualbox unstable

Within the sys-kernel/gentoo-sources-5.17.x kernel branch, when I upgrade my kernel to any version greater than gentoo-sources-5.17.11, VirtualBox guests (both Linux and Windows) become unstable and unusable. I am running virtualbox-6.1.34. (I reproduced the guest failure with gentoo-sources-5.17.12, 5.17.13 and 5.17.14. I have not tried 5.18.x as there is a known issue with running virtualbox-6.1.34 with that kernel series.) I also have this problem on a second Gentoo system.

I describe the VirtualBox guest issue below.

When the issue occurs, no additional lines are added to the host's ~/.config/VirtualBox/VBoxSVC.log, and there are no new entries in the host's /var/log/messages. (On the Linux guest, I did see some errors in the RedHat VM while running tail -f on /var/log/messages via ssh.)

I ran a diff between the kernel config file that works with VirtualBox and the one that does not, and the only difference is the version banner:

Code:
/boot]$ diff config-5.17.11-gentoo-2 config-5.17.12-gentoo-3
3c3
< # Linux/x86 5.17.11-gentoo-2 Kernel Configuration
---
> # Linux/x86 5.17.12-gentoo-3 Kernel Configuration
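
Because the version banner always differs between two releases, a comparison that skips it makes a real option change easier to spot. A minimal sketch, assuming the banner has the form shown in the diff above; config_diff is a hypothetical helper name, not part of any tool:

```shell
#!/bin/sh
# Compare two kernel config files while ignoring the
# "# Linux/<arch> <version> Kernel Configuration" banner line.
# config_diff is a hypothetical helper name used for illustration.
config_diff() {
    old_tmp=$(mktemp) || return 2
    new_tmp=$(mktemp) || { rm -f "$old_tmp"; return 2; }
    # Drop the banner from each file; "|| true" covers a file with no other lines.
    grep -v '^# Linux/.* Kernel Configuration$' "$1" > "$old_tmp" || true
    grep -v '^# Linux/.* Kernel Configuration$' "$2" > "$new_tmp" || true
    # diff returns 0 when identical, 1 when the files differ.
    status=0
    diff "$old_tmp" "$new_tmp" || status=$?
    rm -f "$old_tmp" "$new_tmp"
    return "$status"
}
```

Run from /boot, `config_diff config-5.17.11-gentoo-2 config-5.17.12-gentoo-3` should print nothing for the two files above, since the banner was their only difference.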


Linux guest issue (no VBox guest additions):

After booting the Oracle-8 (RHEL-8 clone) guest VM and logging into the guest, I set the guest screen resolution, which works. Soon after that, usually after moving a terminal window within the guest, the VM window appears to crash to a Linux console for a few seconds, and then the Oracle-8 login screen reappears. (I assume something is happening to the VM's X Window server or Wayland.) I can then log in to the guest again and repeat the problem.

Windows 10 guest issue (with VBox guest additions):

I can log in at the normal Win10 logon screen, which leads to a much larger VM screen (because I previously installed the VBox guest additions and resized the window by dragging). Soon after I launch any Win10 app, whether or not I move it, Win10 exits back to the Win10 login screen. Then I can log back into the VM and repeat ...

Any ideas other than remaining at Linux kernel-5.17.11?
(I'll probably post this on virtualbox.org also.)

Hu
Moderator
Joined: 06 Mar 2007
Posts: 21633

Posted: Fri Jun 10, 2022 2:53 pm

You could use git bisect to find the specific patch that causes the problem, then report the regression. Alternatively, you could switch to qemu-kvm, which uses a mainline driver and seems to be less susceptible to this kind of breakage.

jagdpanther
l33t
Joined: 22 Nov 2003
Posts: 729

Posted: Thu Jun 16, 2022 7:01 pm

After upgrading to virtualbox-6.1.34-r1 (from virtualbox-6.1.34) I can run VirtualBox successfully under sys-kernel/gentoo-sources-5.17.15. Tomorrow I'll check my second Gentoo system, which was having the same issue.

Edit: Never mind. Although some guest actions are more stable, virtualbox-6.1.34-r1 is still very stable under gentoo-sources-5.17.11 but still has some of the same issues when run under gentoo-sources-5.17.15.

fudge
Tux's lil' helper
Joined: 25 Jul 2002
Posts: 117

Posted: Mon Jun 20, 2022 2:49 am

I've tried using a few different kernel versions on the host and 5.17.11-gentoo is the most recent version that works properly with a Gentoo guest installed. Versions 5.17.12-gentoo and newer cause problems.

Hu
Moderator
Joined: 06 Mar 2007
Posts: 21633

Posted: Mon Jun 20, 2022 1:57 pm

fudge: can you git bisect to find which change between 5.17.11 and 5.17.12 is responsible? Identifying that change is the first step to reporting the problem to someone who can fix it.

fudge
Tux's lil' helper
Joined: 25 Jul 2002
Posts: 117

Posted: Mon Jun 20, 2022 2:22 pm

Never done anything like this before. I'll use https://wiki.gentoo.org/wiki/Kernel_git-bisect as a guide and see what I come up with.

fudge
Tux's lil' helper
Joined: 25 Jul 2002
Posts: 117

Posted: Tue Jun 28, 2022 12:25 pm

That took a while and a personal event intervened. Here's the result and how I went about it.

I used https://wiki.gentoo.org/wiki/Kernel_git-bisect as a guide. In the Gentoo guest, my test was to build llvm. It went bad, bad, bad, bad, good.

I checked out the kernel as follows:
Code:
# git clone --shallow-exclude v5.17 --branch v5.17.12 git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
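
A minimal sketch of a manual bisect session over such a clone, assuming the v5.17.11 and v5.17.12 tags are both available in it and serve as the known-good and known-bad endpoints. The helper names are hypothetical, and every step still needs a kernel build, a reboot and a guest test by hand:

```shell
#!/bin/sh
# Sketch of a manual kernel bisect. The helper names are hypothetical;
# only the underlying "git bisect" subcommands are real.

# Start a bisect between a known-good and a known-bad revision.
# Usage: bisect_begin <good-rev> <bad-rev>
#   e.g. bisect_begin v5.17.11 v5.17.12
bisect_begin() {
    git bisect start &&
    git bisect bad "$2" &&
    git bisect good "$1"
}

# Record the verdict for the revision git has checked out, after
# building it, booting it and running the guest test.
# Usage: bisect_mark good|bad
bisect_mark() {
    case "$1" in
        good|bad) git bisect "$1" ;;
        *) echo "usage: bisect_mark good|bad" >&2; return 2 ;;
    esac
}

# Save the bisect log (handy for a bug report), then return to the
# revision that was checked out before the bisect started.
# Usage: bisect_finish [logfile]
bisect_finish() {
    git bisect log > "${1:-bisect.log}" &&
    git bisect reset
}
```

After `bisect_begin v5.17.11 v5.17.12`, git checks out a revision roughly halfway between the two; `bisect_mark` is repeated after each test until git names the first bad commit.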


Here's the bisect log:
Code:
Bisecting: 55 revisions left to test after this (roughly 6 steps)
[d82e9eac3aae49e6a34e2d4ccaf39c259b2fe3be] random: skip fast_init if hwrng provides large chunk of entropy
Bisecting: 27 revisions left to test after this (roughly 5 steps)
[fc8ce099962615ddba4642e89b84fcf3c0564871] random: introduce drain_entropy() helper to declutter crng_reseed()
Bisecting: 13 revisions left to test after this (roughly 4 steps)
[6057a5d6a3b71451518022285bf8f82ddeb75990] random: ensure early RDSEED goes through mixer on init
Bisecting: 6 revisions left to test after this (roughly 3 steps)
[4fa0d8ed5c4584a66198152a8b78f4d24e7f4df1] random: make credit_entropy_bits() always safe
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[efba5eb2281ec51a295a9571b0ff73466272430d] random: use computational hash for entropy extraction
Bisecting: 0 revisions left to test after this (roughly 1 step)
[19a66796d1f0dd4ce4b05f76d53ce1d0a7dc817d] KVM: x86/mmu: fix NULL pointer dereference on guest INVPCID
efba5eb2281ec51a295a9571b0ff73466272430d is the first bad commit
commit efba5eb2281ec51a295a9571b0ff73466272430d
Author: Jason A. Donenfeld <Jason@zx2c4.com>
Date:   Sun Jan 16 14:23:10 2022 +0100

    random: use computational hash for entropy extraction
   
    commit 6e8ec2552c7d13991148e551e3325a624d73fac6 upstream.
   
    The current 4096-bit LFSR used for entropy collection had a few
    desirable attributes for the context in which it was created. For
    example, the state was huge, which meant that /dev/random would be able
    to output quite a bit of accumulated entropy before blocking. It was
    also, in its time, quite fast at accumulating entropy byte-by-byte,
    which matters given the varying contexts in which mix_pool_bytes() is
    called. And its diffusion was relatively high, which meant that changes
    would ripple across several words of state rather quickly.
   
    However, it also suffers from a few security vulnerabilities. In
    particular, inputs learned by an attacker can be undone, but moreover,
    if the state of the pool leaks, its contents can be controlled and
    entirely zeroed out. I've demonstrated this attack with this SMT2
    script, <https://xn--4db.cc/5o9xO8pb>, which Boolector/CaDiCal solves in
    a matter of seconds on a single core of my laptop, resulting in little
    proof of concept C demonstrators such as <https://xn--4db.cc/jCkvvIaH/c>.
   
    For basically all recent formal models of RNGs, these attacks represent
    a significant cryptographic flaw. But how does this manifest
    practically? If an attacker has access to the system to such a degree
    that he can learn the internal state of the RNG, arguably there are
    other lower hanging vulnerabilities -- side-channel, infoleak, or
    otherwise -- that might have higher priority. On the other hand, seed
    files are frequently used on systems that have a hard time generating
    much entropy on their own, and these seed files, being files, often leak
    or are duplicated and distributed accidentally, or are even seeded over
    the Internet intentionally, where their contents might be recorded or
    tampered with. Seen this way, an otherwise quasi-implausible
    vulnerability is a bit more practical than initially thought.
   
    Another aspect of the current mix_pool_bytes() function is that, while
    its performance was arguably competitive for the time in which it was
    created, it's no longer considered so. This patch improves performance
    significantly: on a high-end CPU, an i7-11850H, it improves performance
    of mix_pool_bytes() by 225%, and on a low-end CPU, a Cortex-A7, it
    improves performance by 103%.
   
    This commit replaces the LFSR of mix_pool_bytes() with a straight-
    forward cryptographic hash function, BLAKE2s, which is already in use
    for pool extraction. Universal hashing with a secret seed was considered
    too, something along the lines of <https://eprint.iacr.org/2013/338>,
    but the requirement for a secret seed makes for a chicken & egg problem.
    Instead we go with a formally proven scheme using a computational hash
    function, described in sections 5.1, 6.4, and B.1.8 of
    <https://eprint.iacr.org/2019/198>.
   
    BLAKE2s outputs 256 bits, which should give us an appropriate amount of
    min-entropy accumulation, and a wide enough margin of collision
    resistance against active attacks. mix_pool_bytes() becomes a simple
    call to blake2s_update(), for accumulation, while the extraction step
    becomes a blake2s_final() to generate a seed, with which we can then do
    a HKDF-like or BLAKE2X-like expansion, the first part of which we fold
    back as an init key for subsequent blake2s_update()s, and the rest we
    produce to the caller. This then is provided to our CRNG like usual. In
    that expansion step, we make opportunistic use of 32 bytes of RDRAND
    output, just as before. We also always reseed the crng with 32 bytes,
    unconditionally, or not at all, rather than sometimes with 16 as before,
    as we don't win anything by limiting beyond the 16 byte threshold.
   
    Going for a hash function as an entropy collector is a conservative,
    proven approach. The result of all this is a much simpler and much less
    bespoke construction than what's there now, which not only plugs a
    vulnerability but also improves performance considerably.
   
    Cc: Theodore Ts'o <tytso@mit.edu>
    Cc: Dominik Brodowski <linux@dominikbrodowski.net>
    Reviewed-by: Eric Biggers <ebiggers@google.com>
    Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Reviewed-by: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
    Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 drivers/char/random.c | 304 +++++++++-----------------------------------------
 1 file changed, 55 insertions(+), 249 deletions(-)

mortonP
Tux's lil' helper
Joined: 22 Dec 2015
Posts: 84

Posted: Wed Jun 29, 2022 12:13 pm

As another data point, Arch forum also reports instabilities on latest kernels: https://bbs.archlinux.org/viewtopic.php?id=276883

Myself, I booted Win10 today (on VB 6.1.34-r1) and it is also unstable; after some time it crashes with:

kernel: SUPR0GipMap: fGetGipCpu=0x1b
kernel: vboxdrv: 00000000a96976a3 VMMR0.r0
kernel: vboxdrv: 000000009759b5ca VBoxDDR0.r0
kernel: VMMR0InitVM: eflags=246 fKernelFeatures=0x0 (SUPKERNELFEATURES_SMAP=0)

I'm on 5.15.49 and only recently upgraded from an earlier 5.15.x - need to figure out which was previous kernel...
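
To pull lines like these out of a busy host log, a small filter may help. A minimal sketch, assuming the prefixes seen in the excerpt above (vboxdrv, SUPR0, VMMR0) are the interesting ones; vbox_kernel_lines is a hypothetical helper name:

```shell
#!/bin/sh
# Keep only kernel log lines that mention the VirtualBox host driver.
# Reads log text on stdin. The patterns come from the excerpt above and
# are an assumption, not a complete list of VirtualBox kernel messages.
vbox_kernel_lines() {
    grep -E 'vboxdrv|SUPR0|VMMR0'
}
```

Typical use would be `dmesg | vbox_kernel_lines` or `vbox_kernel_lines < /var/log/messages`, depending on where the host keeps its kernel log.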

mortonP
Tux's lil' helper
Joined: 22 Dec 2015
Posts: 84

Posted: Wed Jun 29, 2022 2:23 pm

update: after several tries/crashes Win10 now seems to run stable.

Maybe the initial "check for updates" and whatever background things run after booting Win10 after a long time triggered it - but now that these processes are done it runs stable again? Hmmmm....

mortonP
Tux's lil' helper
Joined: 22 Dec 2015
Posts: 84

Posted: Wed Jul 06, 2022 12:08 pm

Today I upgraded virtualbox-6.1.34-r1 -> virtualbox-6.1.34-r4.

Now the Win10 VM crashes immediately at every start.

Downgraded to virtualbox-6.1.34-r1.

Works again.

.... weird?

jagdpanther
l33t
Joined: 22 Nov 2003
Posts: 729

Posted: Wed Jul 06, 2022 7:36 pm

I am still using gentoo-sources-5.17.11 as kernels greater than this have VirtualBox issues. (This is a known issue according to posts on the virtualbox forums and should be resolved with the next virtualbox release.)

Using kernel gentoo-sources-5.17.11 I tried upgrading from virtualbox-6.1.34-r1 to 6.1.34-r5. Both my Linux and Windows 10 VMs failed to start and gave errors in a pop-up window similar to:

Code:
The configuration constructor in main failed due to a COM error. Check the release log of the VM for further details. (VERR_MAIN_CONFIG_CONSTRUCTOR_COM_ERROR).


Result Code:
NS_ERROR_FAILURE (0x80004005)
Component:
ConsoleWrap
Interface:
IConsole {872da645-4a9b-1727-bee2-...


Reverting to virtualbox-6.1.34-r1 solved the issue and the VM guests are working again.
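
To keep an update from pulling the problem versions back in, the working versions can be held with a Portage mask. A minimal sketch, assuming /etc/portage/package.mask is a directory; the file name virtualbox-hold and the helper name are hypothetical, and the mask is meant to be removed again once a fixed VirtualBox release is available:

```shell
#!/bin/sh
# Write a Portage mask file that holds the kernel sources at 5.17.11 and
# VirtualBox at 6.1.34-r1. Helper name and file name are hypothetical.
# Usage (as root): write_hold_mask /etc/portage/package.mask
write_hold_mask() {
    mask_dir=$1
    # Fails if the path already exists as a plain file.
    mkdir -p "$mask_dir" || return 1
    cat > "$mask_dir/virtualbox-hold" <<'EOF'
# Temporary hold: newer versions gave VirtualBox guest problems here.
>sys-kernel/gentoo-sources-5.17.11
>app-emulation/virtualbox-6.1.34-r1
EOF
}
```

The leading `>` masks every version newer than the one named, so both packages stay at the versions reported as working in this post.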

jagdpanther
l33t
Joined: 22 Nov 2003
Posts: 729

Posted: Tue Jul 12, 2022 6:23 pm

Virtualbox-6.1.34-r6 DOES work with gentoo-sources-5.17.11. (tested win10 VM)

Virtualbox-modules-6.1.34 will NOT compile if you try to use gentoo-sources-5.18.x (still waiting on a new production, not testing, version of VirtualBox to fix this. Probably Virtualbox-6.1.36.)

I did NOT try Virtualbox-6.1.34-r6 with gentoo-sources-5.17.15 (the final 5.17.x)

fudge
Tux's lil' helper
Joined: 25 Jul 2002
Posts: 117

Posted: Mon Jul 18, 2022 1:12 pm

It seems that this kernel patch has caused a problem with VirtualBox, and that the problem is known:
https://www.virtualbox.org/ticket/20914

devsk
Advocate
Joined: 24 Oct 2003
Posts: 2995
Location: Bay Area, CA

Posted: Mon Jul 25, 2022 4:32 am

Does anyone have any idea when 6.1.36 comes out? I don't see a tracker for it on bugs.gentoo.org.

jagdpanther
l33t
Joined: 22 Nov 2003
Posts: 729

Posted: Mon Jul 25, 2022 5:30 pm

Quote:
Does anyone have any idea when 6.1.36 comes out?


Virtualbox v6.1.36 was released on 19 July 2022.
https://www.virtualbox.org/

The Gentoo Portage version, virtualbox-6.1.36, does not seem to be available yet.

devsk
Advocate
Joined: 24 Oct 2003
Posts: 2995
Location: Bay Area, CA

Posted: Mon Jul 25, 2022 11:25 pm

yeah, that's what I meant by no tracker on bugs.gentoo.org

devsk
Advocate
Joined: 24 Oct 2003
Posts: 2995
Location: Bay Area, CA

Posted: Wed Jul 27, 2022 10:45 pm

Virtualbox v6.1.36 is now available in Portage.

jagdpanther
l33t
Joined: 22 Nov 2003
Posts: 729

Posted: Thu Jul 28, 2022 10:18 pm    Post subject: solved

virtualbox-6.1.36 solves the issue with Linux kernels greater than 5.17.11.

I am currently running gentoo-sources-5.18.14 and both Linux and Windows VMs running in virtualbox-6.1.36 work without issue.
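
The combinations reported in this thread can be summed up as a quick check. A minimal sketch, assuming a sort that supports -V (GNU coreutils) for version ordering; the function names are hypothetical and the thresholds (kernel 5.17.11, VirtualBox 6.1.36) come only from the reports above:

```shell
#!/bin/sh
# version_gt A B: succeed if version A sorts strictly after version B.
# Relies on "sort -V"; the helper names here are hypothetical.
version_gt() {
    [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# vbox_combo_at_risk KERNEL VBOX: succeed if the kernel is newer than
# 5.17.11 while VirtualBox is older than 6.1.36. Suffixes such as
# "-gentoo" or "-r1" are stripped before comparing.
vbox_combo_at_risk() {
    kernel=${1%%-*}
    vbox=${2%%-*}
    version_gt "$kernel" 5.17.11 && version_gt 6.1.36 "$vbox"
}
```

For example, `vbox_combo_at_risk "$(uname -r)" 6.1.34-r1 && echo "expect guest crashes"` flags the combination that started this thread. It only encodes the 5.17.x/5.18.x reports and does not cover the instability reported above on 5.15.49.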

All times are GMT