Colt45 Tux's lil' helper
Joined: 05 Sep 2007 Posts: 122 Location: Central Washington
|
Posted: Fri Apr 29, 2022 10:01 pm Post subject: Xen guest not working through OpenVswitch on 5.15.32 |
|
|
I have a Gentoo system running Xen, and one of the guests is an OPNsense router. I'm using Open vSwitch to configure the networking.
Under 5.15.26 it works fine. Under 5.15.32 it does not: there is no connectivity to the OPNsense guest. I have not gone through and checked every version in between. 5.15.26 was what I was running, and after rebooting into 5.15.32 I noticed it wasn't working. I went back to 5.15.26 and it works.
This is what the OVS configuration looks like:
Code: | Bridge vbr0
    Port vif1.0-emu
        Interface vif1.0-emu
            error: "could not open network device vif1.0-emu (No such device)"
    Port bond0
        Interface enp65s0f1
        Interface enp65s0f0
    Port vif1.0
        Interface vif1.0
    Port vbr0
        Interface vbr0
            type: internal
    Port vif2.0
        Interface vif2.0
Bridge vbr1
    Port enp65s0f2
        Interface enp65s0f2
    Port vbr1
        Interface vbr1
            type: internal
    Port vif1.1-emu
        Interface vif1.1-emu
            error: "could not open network device vif1.1-emu (No such device)"
    Port vif1.1
        Interface vif1.1
|
The concerning part is obviously the "No such device" error. I'm not sure what causes it. Yet even though it says that, it works, at least under 5.15.26. I suspect this is the reason 5.15.32 is not working, but I don't know how to fix it.
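For anyone comparing listings between kernel versions, here is a minimal sketch that picks out the interfaces reporting an error. It assumes the text comes from `ovs-vsctl show` and follows the Bridge / Port / Interface layout seen in this thread; the function name `interfaces_with_errors` and the `SAMPLE` text are made up for illustration and are not part of any Open vSwitch tooling.

```python
import re


def interfaces_with_errors(ovs_show_output):
    """Return {interface_name: error_message} for interfaces that report an error.

    Assumes the Bridge / Port / Interface layout shown in this thread.
    """
    errors = {}
    current_iface = None
    for raw_line in ovs_show_output.splitlines():
        line = raw_line.strip()
        if line.startswith("Interface "):
            # Names may or may not be wrapped in double quotes.
            current_iface = line.split(None, 1)[1].strip('"')
        elif line.startswith(("Bridge ", "Port ")):
            current_iface = None
        elif line.startswith("error:") and current_iface is not None:
            match = re.match(r'error:\s*"(.*)"$', line)
            errors[current_iface] = (
                match.group(1) if match else line[len("error:"):].strip()
            )
    return errors


# Shortened copy of the listing from this thread, used as sample input.
SAMPLE = """\
Bridge vbr0
    Port vif1.0-emu
        Interface vif1.0-emu
            error: "could not open network device vif1.0-emu (No such device)"
    Port vif1.0
        Interface vif1.0
    Port vbr0
        Interface vbr0
            type: internal
"""

for name, message in interfaces_with_errors(SAMPLE).items():
    print(f"{name}: {message}")
```

Running the same function on the output captured under a working and a non-working kernel makes it easy to see whether the set of failing interfaces differs between them.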
This is what the networking portion of the Xen config looks like for the OPNsense VM:
Code: | vif = [ 'script=vif-openvswitch,bridge=vbr0',
'script=vif-openvswitch,bridge=vbr1'
]
|
|
|
|
|
Colt45 Tux's lil' helper
Joined: 05 Sep 2007 Posts: 122 Location: Central Washington
|
Posted: Sat Apr 30, 2022 2:50 am Post subject: |
|
|
So I found that the "Virtio network driver" option was disabled in the kernel. I enabled it, recompiled, and booted into 5.15.32, and that fixed the error in the Open vSwitch listing. However, the network is still not coming up on the guest. |
|
|
|
Colt45 Tux's lil' helper
Joined: 05 Sep 2007 Posts: 122 Location: Central Washington
|
Posted: Sun May 01, 2022 11:19 pm Post subject: |
|
|
Currently working on building every kernel from 5.15.26 to 5.15.32 so I can step through them rapidly when I get a chance. |
|
|
|
Colt45 Tux's lil' helper
Joined: 05 Sep 2007 Posts: 122 Location: Central Washington
|
Posted: Mon May 02, 2022 4:40 am Post subject: |
|
|
I downloaded 5.15.26 through 5.15.32 directly from kernel.org.
I built and installed each version, then rebooted into each in turn, starting at 5.15.26; when that worked I went on to 5.15.27, and so on.
The failure first occurs at 5.15.29: 5.15.28 is good and working, 5.15.29 does not work. Now to figure out which of the hundreds of changes is the culprit.
https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.15.29
These are my top suspects from the changelog of 5.15.29:
Code: | commit 2708ceb4e5cc84ef179bad25a2d7890573ef78be
Author: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Date: Tue Feb 22 01:18:17 2022 +0100
Revert "xen-netback: Check for hotplug-status existence before watching"
[ Upstream commit e8240addd0a3919e0fd7436416afe9aa6429c484 ]
This reverts commit 2afeec08ab5c86ae21952151f726bfe184f6b23d.
The reasoning in the commit was wrong - the code expected to setup the
watch even if 'hotplug-status' didn't exist. In fact, it relied on the
watch being fired the first time - to check if maybe 'hotplug-status' is
already set to 'connected'. Not registering a watch for non-existing
path (which is the case if hotplug script hasn't been executed yet),
made the backend not waiting for the hotplug script to execute. This in
turns, made the netfront think the interface is fully operational, while
in fact it was not (the vif interface on xen-netback side might not be
configured yet).
This was a workaround for 'hotplug-status' erroneously being removed.
But since that is reverted now, the workaround is not necessary either.
More discussion at
https://lore.kernel.org/xen-devel/afedd7cb-a291-e773-8b0d-4db9b291fa98@ipxe.org/T/#u
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Michael Brown <mbrown@fensystems.co.uk>
Link: https://lore.kernel.org/r/20220222001817.2264967-2-marmarek@invisiblethingslab.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit fe39ab30dcc204e321c2670cc1cf55904af35d01
Author: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Date: Tue Feb 22 01:18:16 2022 +0100
Revert "xen-netback: remove 'hotplug-status' once it has served its purpose"
[ Upstream commit 0f4558ae91870692ce7f509c31c9d6ee721d8cdc ]
This reverts commit 1f2565780e9b7218cf92c7630130e82dcc0fe9c2.
The 'hotplug-status' node should not be removed as long as the vif
device remains configured. Otherwise the xen-netback would wait for
re-running the network script even if it was already called (in case of
the frontent re-connecting). But also, it _should_ be removed when the
vif device is destroyed (for example when unbinding the driver) -
otherwise hotplug script would not configure the device whenever it
re-appear.
Moving removal of the 'hotplug-status' node was a workaround for nothing
calling network script after xen-netback module is reloaded. But when
vif interface is re-created (on xen-netback unbind/bind for example),
the script should be called, regardless of who does that - currently
this case is not handled by the toolstack, and requires manual
script call. Keeping hotplug-status=connected to skip the call is wrong
and leads to not configured interface.
More discussion at
https://lore.kernel.org/xen-devel/afedd7cb-a291-e773-8b0d-4db9b291fa98@ipxe.org/T/#u
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Link: https://lore.kernel.org/r/20220222001817.2264967-1-marmarek@invisiblethingslab.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org> |
But I'm no kernel expert. Does anyone have any ideas on how to go about determining the exact problem? |
|
|
|
Colt45 Tux's lil' helper
Joined: 05 Sep 2007 Posts: 122 Location: Central Washington
|
Posted: Mon May 02, 2022 2:52 pm Post subject: |
|
|
I reverted these two commits (2708ceb4e5cc and fe39ab30dcc2 from the 5.15.29 changelog) and now the network is operating normally in 5.15.32. |
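The post does not say how the reverts were done. As a rough sketch, assuming a git clone of the linux stable tree rather than the kernel.org tarballs used earlier in the thread, the commands could be generated like this. The helper `revert_commands` and the branch name `xen-netback-test` are hypothetical; the commit IDs are the ones from the 5.15.29 changelog.

```python
# Commit IDs as listed in the 5.15.29 changelog. The changelog lists newest
# first, so the first entry is the later commit; reverting it first is
# usually the safer order when two commits touch the same code.
COMMITS_TO_REVERT = [
    "2708ceb4e5cc84ef179bad25a2d7890573ef78be",  # Revert "... Check for hotplug-status existence before watching"
    "fe39ab30dcc204e321c2670cc1cf55904af35d01",  # Revert "... remove 'hotplug-status' once it has served its purpose"
]


def revert_commands(commits, tag="v5.15.32", branch="xen-netback-test"):
    """Build the git command lines (as argv lists) without running them."""
    commands = [["git", "checkout", "-b", branch, tag]]
    commands += [["git", "revert", "--no-edit", commit] for commit in commits]
    return commands


for argv in revert_commands(COMMITS_TO_REVERT):
    print(" ".join(argv))
```

The script only prints the commands so they can be reviewed before being run by hand inside the kernel source tree; the kernel then has to be rebuilt and installed as usual.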
|
|
|
Hu Moderator
Joined: 06 Mar 2007 Posts: 21635
|
Posted: Tue May 03, 2022 1:16 am Post subject: |
|
|
Colt45 wrote: | But Im no kernel expert. Does anyone have any ideas how to go about determining the exact problem? |
Normally, git bisect would be the next step, to let you examine good and bad kernels without needing to visit every commit in the path. However, now that you found the offending commits, this is unnecessary.
Colt45 wrote: | I reverted these two commits and now the network is operating normally in 5.15.32 |
Good. The next step then is to report this so that those commits can be handled properly upstream. Since you are reverting revert commits, that suggests the reverted commits were important. Upstream may be able to come up with a commit that implements the useful parts of these reverted commits while not introducing the problems that motivated reverting the commits. |
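To illustrate why bisecting saves work, here is a small sketch of the underlying binary search applied to the list of point releases from this thread. It is not how git bisect is implemented (git works on individual commits and handles non-linear history); `first_bad` and `guest_network_broken` are hypothetical names, and the test function is hard-coded to the result reported above instead of actually booting a kernel.

```python
def first_bad(versions, is_bad):
    """Binary-search an ordered list for the first entry where is_bad() is true.

    Assumes versions[0] is known good, versions[-1] is known bad, and that
    every version after the first bad one is also bad.
    Returns the first bad version and the list of versions that were tested.
    """
    good, bad = 0, len(versions) - 1
    tested = []
    while bad - good > 1:
        mid = (good + bad) // 2
        tested.append(versions[mid])
        if is_bad(versions[mid]):
            bad = mid
        else:
            good = mid
    return versions[bad], tested


versions = [f"5.15.{patch}" for patch in range(26, 33)]  # 5.15.26 .. 5.15.32


def guest_network_broken(version):
    # Stand-in for "boot this kernel and check the guest's connectivity";
    # hard-coded to the outcome reported in this thread.
    return int(version.split(".")[2]) >= 29


culprit, tested = first_bad(versions, guest_network_broken)
print(culprit, tested)
```

With only seven releases the saving is small, but the number of tests grows with the logarithm of the range, which is what makes the same approach practical across the hundreds of commits between two releases. In a git checkout of the stable tree, the equivalent starting point would be `git bisect start`, `git bisect good v5.15.28` and `git bisect bad v5.15.29`.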
|
|
|
Colt45 Tux's lil' helper
Joined: 05 Sep 2007 Posts: 122 Location: Central Washington
|
Posted: Tue May 03, 2022 1:47 am Post subject: |
|
|
I am trying right now to bring this up with Xen, but they don't make it easy. |
|
|
|
|
|
|
|