araxon Tux's lil' helper
Joined: 25 May 2011 Posts: 83
|
Posted: Thu May 06, 2021 5:49 am Post subject: eudev did not rename net interface on reboot |
|
|
I rebooted the server remotely yesterday and it did not come up again.
This morning I went to investigate on site and the problem was that the network interface was not renamed from eth0 to net0.
I checked whether the /etc/init.d/udev service was running, and it was. I checked the rule in /etc/udev/rules.d/70-my-network.rules, and the MAC address and everything else were correct. I then restarted the udev service, and this time it did rename the interface, so I was able to bring up net0 and everything else, without editing any file or making any other change.
Code: | May 6 06:35:26 gorgon /etc/init.d/udev[149464]: WARNING: you are stopping a sysinit service
May 6 06:35:26 gorgon /etc/init.d/udev-trigger[149465]: WARNING: you are stopping a sysinit service
May 6 06:35:26 gorgon kernel: udevd[149539]: starting version 3.2.10
May 6 06:35:27 gorgon kernel: udevd[149539]: starting eudev-3.2.10
May 6 06:35:27 gorgon kernel: e1000e 0000:00:1f.6 net0: renamed from eth0 |
Curious if this would happen on the next reboot, I restarted the server and this time all went smoothly - interface was renamed automatically and everything started as it should in the first place.
It seems this could happen occasionally on reboot. This was the first time it happened, but I have several dozen Gentoo servers, most of them more remote than this one (in different countries). This is the kind of problem I don't need when rebooting. Has anyone else experienced this? How can I debug or investigate it further? I am willing to reboot this particular server more times to find the culprit, but I don't know what to look for.
sys-fs/eudev-3.2.10
sys-apps/openrc-0.42.1-r1
sys-kernel/gentoo-sources-5.10.27 |
|
|
|
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3339 Location: Rasi, Finland
|
Posted: Sun May 09, 2021 9:44 pm Post subject: |
|
|
Can you post the contents of your /etc/udev/rules.d/70-my-network.rules ? _________________ ..: Zucca :..
Gentoo IRC channels reside on Libera.Chat.
--
Quote: | I am NaN! I am a man! |
|
|
|
|
araxon Tux's lil' helper
Joined: 25 May 2011 Posts: 83
|
Posted: Mon May 10, 2021 6:54 am Post subject: |
|
|
Zucca: certainly, here it is:
Code: | SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="e0:07:1b:ff:f8:8c", NAME="net0" |
Nothing complicated, only one line and it is the only file in /etc/udev/rules.d/ |
|
|
|
dmpogo Advocate
Joined: 02 Sep 2004 Posts: 3267 Location: Canada
|
Posted: Mon May 10, 2021 7:45 am Post subject: |
|
|
Does your server have one network interface or more?
Last edited by dmpogo on Wed May 12, 2021 7:05 pm; edited 1 time in total |
|
|
|
araxon Tux's lil' helper
Joined: 25 May 2011 Posts: 83
|
Posted: Mon May 10, 2021 10:13 am Post subject: |
|
|
dmpogo: the server in question has only one physical network interface (an onboard NIC with a single port). It is an HP ML10 Gen9 tower server. |
|
|
|
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3339 Location: Rasi, Finland
|
Posted: Mon May 10, 2021 1:37 pm Post subject: |
|
|
I've encountered similar problems too.
My solution was to invert/negate the ACTION part of the rule: Code: | SUBSYSTEM=="net", ACTION!="remove", ATTR{address}=="e0:07:1b:ff:f8:8c", NAME="net0" |
I think that during early initialization the interface doesn't yet have the values the rule needs in order to match. The rule above then matches every ACTION except remove, which includes add and change.
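As a rough illustration of the rule above: when the same line has to go onto several machines, it can be generated rather than typed by hand. This is only a sketch. make_net_rule is a hypothetical helper (not part of udev or eudev), and it assumes that sysfs prints the MAC address in lowercase hex, which is why the input is lowercased before it goes into the rule.

```shell
#!/bin/sh
# Sketch only: make_net_rule is a hypothetical helper, not an eudev tool.

# make_net_rule MAC NAME
# Prints a rename rule that uses ACTION!="remove", as suggested above.
make_net_rule() {
    # Assumption: ATTR{address} is compared as a string and sysfs uses lowercase.
    mac=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    # Reject anything that is not six colon-separated hex octets.
    if ! printf '%s\n' "$mac" | grep -Eq '^([0-9a-f]{2}:){5}[0-9a-f]{2}$'; then
        echo "make_net_rule: invalid MAC address: $1" >&2
        return 1
    fi
    printf 'SUBSYSTEM=="net", ACTION!="remove", ATTR{address}=="%s", NAME="%s"\n' "$mac" "$2"
}
```

Something like `make_net_rule "$(cat /sys/class/net/eth0/address)" net0` would then print a line ready to be reviewed and saved under /etc/udev/rules.d/.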
|
|
|
|
araxon Tux's lil' helper
Joined: 25 May 2011 Posts: 83
|
Posted: Wed May 12, 2021 8:31 am Post subject: |
|
|
Zucca: thank you for the tip. I'll try it and then update my servers accordingly.
I'm curious whether the ACTION parameter is even needed. I do not plan on adding or removing interfaces on the fly... |
|
|
|
figueroa Advocate
Joined: 14 Aug 2005 Posts: 2961 Location: Edge of marsh USA
|
Posted: Thu May 13, 2021 3:42 am Post subject: |
|
|
If the machine only has one interface, why bother renaming it?
I'm wondering about the proposed solution, since what I'm using seems to always work. I would be distressed to learn that it might NOT always work because I do this on a remote machine where the built-in interface is flaky but cannot be disabled in BIOS.
On a remote server:
Code: | $ cat /etc/udev/rules.d/90-local-net-name.rules
# NOTES:
#
# lan0 Intel 82579LM Gigabit (old failing built-in) e1000e module
# lan1 Intel 82574L Gigabit (new pci-e card) e1000e module
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="e8:39:35:63:59:b8", NAME="lan0"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="00:1b:21:3a:ed:05", NAME="lan1" |
I do need this to work every time. I should MAYBE change the action to be ACTION!="remove"?
Zucca -- did you just reason this out on your own? It's kind of exotic. Does this then actually mean ACTION matches everything except "remove"? Can you explain why this might be more reliable than ACTION=="add"?
The fun part does happen really early in the boot:
Code: | $ dmesg | grep eth1
[ 1.344652] e1000e 0000:02:00.0 eth1: (PCI Express:2.5GT/s:Width x1) 00:1b:21:3a:ed:05
[ 1.344721] e1000e 0000:02:00.0 eth1: Intel(R) PRO/1000 Network Connection
[ 1.344786] e1000e 0000:02:00.0 eth1: MAC: 3, PHY: 8, PBA No: E42641-005
[ 5.625692] e1000e 0000:02:00.0 lan1: renamed from eth1 |
_________________ Andy Figueroa
hp pavilion hpe h8-1260t/2AB5; spinning rust x3
i7-2600 @ 3.40GHz; 16 gb; Radeon HD 7570
amd64/23.0/split-usr/desktop (stable), OpenRC, -systemd -pulseaudio -uefi |
|
|
|
araxon Tux's lil' helper
Joined: 25 May 2011 Posts: 83
|
Posted: Thu May 13, 2021 5:32 am Post subject: |
|
|
figueroa: why rename a single interface? Company naming conventions. But that is not the point: we have some servers with as many as five network ports, and there the renaming really has to work reliably, or all hell breaks loose.
We use exactly the same udev rule syntax for this as you do; it may even originate from some Gentoo migration guide... It did work flawlessly in the past, except this one time, after which I started this forum thread.
As for the solution, I tried removing the ACTION part from the rule entirely and restarted the server a few times. It worked as before: the interface was correctly renamed and everything started all right. If removing ACTION had any negative consequences, I did not observe them. |
|
|
|
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3339 Location: Rasi, Finland
|
Posted: Thu May 13, 2021 12:34 pm Post subject: |
|
|
figueroa wrote: | Zucca -- did you just reason this out on your own? It's kind of exotic. Does this then actually mean ACTION matches everything except "remove"? Can you explain why this might be more reliable than ACTION=="add"? | I've had this problem on at least two different machines, so this is based only on my experience. ACTION!="remove" has fixed the issue.
When I have used ACTION=="add" the results have varied. I can only guess it's a race condition or something not triggering on early initialization of the interface.
I've also had differing results with some rules on eudev vs. systemd-udev a long time ago. *sigh* I remember the headaches I had when I needed to drop systemd from one of my systems. Lots of my custom rules stopped working.
If a remote server relies on udev renaming an interface, I'd add some mechanism that makes udev re-run the rules for the interface(s) later in the boot process if said interface(s) haven't been found. That said... My rules on two machines running eudev have been working successfully without any hacks. Most of them actually use ACTION=="add", but there's one USB interface I seem to have problems with, and I had the following style of rule for it: Code: | SUBSYSTEM=="net", ACTION!="remove", ATTRS{serial}=="c0ffee", NAME="ethFOO" | So I matched the serial of the device rather than its MAC.
ACTION!="remove" is more reliable than ACTION=="add" because it's kind of brute forcing the rule to work. It's not pretty, but different hardware and combinations of hardware components can act very differently. And yes, it does match every action except remove. So if there are many changes to the interface, the renaming will take place on each change (logs may reveal how many times).
One could avoid this by adding the condition NAME!="iface", with iface being the name you want to give the interface. Example: Code: | SUBSYSTEM=="net", ACTION!="remove", NAME!="myeth", ATTRS{serial}=="serial", NAME="myeth" |
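The idea of re-running the rules later in the boot process when the interface hasn't been found could look roughly like the following. This is a sketch under assumptions, not a tested recipe: ensure_iface and the SYSFS_NET variable are hypothetical names, the 10 second timeout is arbitrary, and it assumes that replaying "add" events with udevadm trigger is enough for the rename rule to fire.

```shell
#!/bin/sh
# Sketch only: ensure_iface is a hypothetical helper, not an eudev/OpenRC function.
# SYSFS_NET can be overridden for dry runs; it defaults to the real sysfs path.
: "${SYSFS_NET:=/sys/class/net}"

# ensure_iface NAME
# Succeeds if interface NAME exists, re-running the net rules once if it does not.
ensure_iface() {
    if [ -e "$SYSFS_NET/$1" ]; then
        return 0
    fi
    # Replay "add" events for network devices so the rename rule gets another chance.
    udevadm trigger --action=add --subsystem-match=net
    # Wait until udev has processed the replayed events (timeout chosen arbitrarily).
    udevadm settle --timeout=10
    # Report whether the interface showed up after the retry.
    [ -e "$SYSFS_NET/$1" ]
}
```

A service that runs after udev and before net could call `ensure_iface net0` and log loudly when it still fails, so a missed rename at least leaves a trace.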
|
|
|
|
figueroa Advocate
Joined: 14 Aug 2005 Posts: 2961 Location: Edge of marsh USA
|
Posted: Thu May 13, 2021 4:40 pm Post subject: |
|
|
Zucca: I tried it, and it certainly works. With a normal boot, the interfaces get renamed only once with ACTION!="remove", the same as with ACTION=="add". Must keep this in my bag of tricks. Thank you.
On a Debian-based remote desktop with two NICs, where the original NIC went flaky, the built-in NIC cannot be disabled, and both NICs use the same module, I have been using udev to rename the working interface from eth1 to lan1, then using rules in /etc/network/interfaces to set a static IP for lan1, and ALSO setting the same static IP for eth1 in case the interface did not get renamed, as follows:
Code: | auto lan1
iface lan1 inet static
    address 192.168.1.40
    netmask 255.255.255.0
    gateway 192.168.1.1
auto eth1
#iface eth1 inet dhcp
iface eth1 inet static
    address 192.168.1.40
    netmask 255.255.255.0
    gateway 192.168.1.1 |
So far, udev has not failed to rename the interface. The eth1 stanza gets ignored as long as "lan1: renamed from eth1" takes place during boot. I realize, however, that if the kernel were to bring up the working interface as eth0, this trick wouldn't do anything. So far, I've only tried this on remote Debian-based machines. I'm continuing to develop what I hope will be relatively bulletproof: if A doesn't work, then B, and if neither A nor B works, then C. I haven't gotten to C yet. I hope to generalize what I'm learning on this particular Debian box to Gentoo boxes with multiple NICs.
In case you are wondering about DNS, I brute force it by writing /etc/resolv.conf manually, then making the file immutable with chattr. I don't need the extra help of a glitch causing DNS issues. |
|
|
|
wwdev16 n00b
Joined: 29 Aug 2018 Posts: 52
|
Posted: Mon May 17, 2021 8:41 am Post subject: |
|
|
A less elegant way (but maybe less dependent on kernel/udev interaction) is to find the interface manually and then rename it using iproute2. This example is for one NIC using OpenRC, but might be adaptable to systemd.
The scenario is a VM image with one NIC that gets different names depending on the VMM (e.g. qemu/kvm vs vmware) and the hardware emulation selected.
The approach is a custom service in the boot runlevel that runs before net and after udev. It just scans all of the interface names and excludes names such as lo, tun, tap, wg. When the list of interface names has only one entry, it renames that interface to the desired name.
Renaming more than one NIC would require more sophisticated matching using values in sysfs, e.g. /sys/class/net/<name>/address to match a MAC address.
FWIW this is the core start behavior:
Code: | status=0
old_name=""
new_name=net0
# Scan the interface names, skipping the target name and virtual interfaces.
# Note: grep -v matches substrings, so e.g. "lo" also excludes names like "wlo1".
for ifname in $(cd /sys/class/net/; ls | grep -v net | grep -v lo | grep -v dummy | grep -v tun | grep -v wg); do
    if [ "$old_name" != "" ]; then
        eerror " more than one interface: $old_name and $ifname"
        status=1
        break
    fi
    old_name="$ifname"
done
if [ "$old_name" = "" ]; then
    eerror " no interface found"
    eend 1
    return 1
fi
.... error checking elided .....
einfo " Renaming $old_name to $new_name"
einfo " ip link set dev $old_name name $new_name"
# Test the command's exit status directly; the original [ ! $? ] was never true.
if ! msg=$(ip link set dev "$old_name" name "$new_name" 2>&1); then
    eerror " Error renaming $old_name: $msg"
    status=1
fi
eend $status
return $status
|
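For the multi-NIC case mentioned above, matching on /sys/class/net/<name>/address might be sketched like this. find_iface_by_mac and SYSFS_NET are hypothetical names introduced only for illustration; the sketch assumes sysfs prints addresses in lowercase, and it returns the first match only, which matters when several interfaces (bridges, VLANs) share one MAC.

```shell
#!/bin/sh
# Sketch only: find_iface_by_mac is a hypothetical helper, not part of iproute2 or OpenRC.
# SYSFS_NET can be overridden for dry runs; it defaults to the real sysfs path.
: "${SYSFS_NET:=/sys/class/net}"

# find_iface_by_mac MAC
# Prints the name of the first interface whose address equals MAC; fails if none does.
find_iface_by_mac() {
    # Assumption: sysfs reports the address in lowercase hex.
    want=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    for dir in "$SYSFS_NET"/*; do
        # Skip entries without a readable address file.
        [ -r "$dir/address" ] || continue
        if [ "$(cat "$dir/address")" = "$want" ]; then
            basename "$dir"
            return 0
        fi
    done
    return 1
}
```

The result could then feed the same rename step as in the service above, e.g. `old_name=$(find_iface_by_mac 00:1b:21:3a:ed:05) && ip link set dev "$old_name" name lan1`.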
|
|
|
|
araxon Tux's lil' helper
Joined: 25 May 2011 Posts: 83
|
Posted: Mon May 24, 2021 8:41 am Post subject: |
|
|
10 days ago I stripped the ACTION from the rule:
Code: | SUBSYSTEM=="net", ATTR{address}=="e0:07:1b:ff:f8:8c", NAME="net0" |
The interface did get renamed exactly once according to dmesg, and if this simplification of the rule has any downsides, I did not observe them.
The question remains whether this would have helped before, when my original rule did not trigger for some reason. I'm still not convinced that there isn't some unrelated, rare race condition in eudev that prevented the renaming. |
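Checking that the rename happened exactly once can be scripted, which may help when rebooting repeatedly to chase the problem. A minimal sketch, assuming the kernel log line keeps the "NAME: renamed from OLDNAME" shape seen in this thread; count_renames is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: count_renames is a hypothetical helper.

# count_renames NAME
# Reads kernel log text on stdin and prints how many times an interface was renamed to NAME.
count_renames() {
    # grep -c exits non-zero when the count is 0, so ignore its exit status.
    grep -c -F -- " $1: renamed from " || true
}
```

Usage would be along the lines of `dmesg | count_renames net0`; a result of 0 after boot would point to the failure described in the first post.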
|
|
|
|