Posted: Thu Feb 25, 2016 4:48 pm Post subject: OpenVSwitch 2.3.1 Cisco bonding problems
To all of the OVS gurus out there:
I have multiple Xen 4.6 dom0 systems running OpenVSwitch 2.3.1 with Linux 4.1 and it mostly works fine, but for currently unknown reasons, one or sometimes more of the devices in the OVS bond that connects the systems to a Cisco 3750G and 6509E just decide to go into the disabled state. Also, if I restart the ovs-vswitchd daemon some of the devices never participate in the bond. In all cases, bringing the bond/channel down/up in the switch seems to always correct the problem.
This problem was also present back when I was running Xen 4.5.2 and Linux 3.18. Upgrading to Xen 4.6 and Linux 4.1 did not make a difference. Also, running the testing OpenVSwitch 2.4 version appears to do the same thing.
The "show interface" command on the switches in question reports the status of the affected interface(s) as down and err-disabled. Also, in the switch's log I've seen messages like these:
Feb 24 21:23:32 10.x.x.x 8962: Feb 25 02:23:31.511: %ETHCNTR-3-LOOP_BACK_DETECTED: Loop-back detected on GigabitEthernet2/0/3. (sw-3750)
Feb 24 21:23:32 10.x.x.x 8963: Feb 25 02:23:31.511: %PM-4-ERR_DISABLE: loopback error detected on Gi2/0/3, putting Gi2/0/3 in err-disable state (sw-3750)
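For anyone wanting to track these events across several switches, here is a minimal Python sketch that pulls the err-disabled port and the stated cause out of syslog lines like the two above. The regex is inferred only from those two samples, not from any Cisco message catalog, and `err_disabled_ports` is a hypothetical helper name, so treat it as a starting point rather than a complete parser.

```python
import re

# Pattern inferred from the %PM-4-ERR_DISABLE sample line quoted above.
# Assumption: other err-disable causes use the same "<cause> error detected on <port>," wording.
ERR_DISABLE = re.compile(
    r"%PM-4-ERR_DISABLE: (?P<cause>.+?) error detected on (?P<port>\S+),"
)


def err_disabled_ports(lines):
    """Return a list of (port, cause) tuples, one per err-disable message found."""
    found = []
    for line in lines:
        match = ERR_DISABLE.search(line)
        if match:
            found.append((match.group("port"), match.group("cause")))
    return found


# The two log lines from the post, used here as sample input.
sample = [
    "Feb 24 21:23:32 10.x.x.x 8962: Feb 25 02:23:31.511: "
    "%ETHCNTR-3-LOOP_BACK_DETECTED: Loop-back detected on "
    "GigabitEthernet2/0/3. (sw-3750)",
    "Feb 24 21:23:32 10.x.x.x 8963: Feb 25 02:23:31.511: "
    "%PM-4-ERR_DISABLE: loopback error detected on Gi2/0/3, "
    "putting Gi2/0/3 in err-disable state (sw-3750)",
]

print(err_disabled_ports(sample))  # [('Gi2/0/3', 'loopback')]
```

Feeding it the switch's full syslog would show whether the same bond members are repeatedly affected or whether the disabled port varies from event to event.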
In the meantime, on some of my systems, I experimentally configured a normal Linux bond device and bridged it with the default OVS internal interface using a normal Linux bridge. I know it is less than pretty, but it seems to work without issues. So it seems that something is not playing nicely between OVS bonding and the Cisco implementation.
Any suggestions, experiences, etc?