twork Apprentice
Joined: 28 Jul 2006 Posts: 183
|
Posted: Fri May 01, 2015 1:50 am Post subject: netmount fails at boot; race condition(?) |
|
|
I'm setting up a new VM, that wants to mount /home over NFS. Everything works... except at boot. After the system comes up, /home isn't mounted, but if we log in immediately and mount /home, it works fine. So something is out of order somewhere.
I read:
https://www.gentoo.org/support/news-items/2015-02-02-nfs-service-changes.html
...and (think I) followed the instructions there; I have both nfsclient and netmount in my default runlevel, along with their rpc friends:
Code: |
# rc-status
Runlevel: default
syslog-ng [ started ]
vixie-cron [ started ]
nfsclient [ started ]
sshd [ started ]
netmount [ started ]
dbus [ started ]
local [ started ]
Dynamic Runlevel: hotplugged
Dynamic Runlevel: needed
rpcbind [ started ]
rpc.statd [ started ]
rpc.pipefs [ started ]
rpc.idmapd [ started ]
Dynamic Runlevel: manual |
Like I said, hand-mounting /home works; it also works fine if I log in and run '/etc/init.d/netmount restart'. So it's looking like everything within the nfs/RPC structure itself is working fine, but something outside it isn't ready when it comes up...? Race condition maybe?
To test that, I stuck a ten-second sleep(1) in the netmount runscript, and sure enough, on the next boot, /home was mounted.
My first guess is that the network isn't coming up fast enough (it is DHCP), but shouldn't the init structure account for that? |
|
twork Apprentice
Joined: 28 Jul 2006 Posts: 183
|
Posted: Fri May 01, 2015 4:07 am Post subject: |
|
|
FWIW: DHCP isn't the culprit; static IP addressing gives the same behavior.
Can't blame my name service either (long shot); finding the NFS server by IP address also doesn't change things.
But that ten second delay at mount time works every time. |
|
steveL Watchman
Joined: 13 Sep 2006 Posts: 5153 Location: The Peanut Gallery
|
Posted: Fri May 01, 2015 1:15 pm Post subject: |
|
|
/home and other partitions are mounted as part of localmount which I'm not seeing listed.
edit: oops you said a delay fixes it.
You might find these functions useful, if you think there's something up with the dependencies. |
|
twork Apprentice
Joined: 28 Jul 2006 Posts: 183
|
Posted: Fri May 01, 2015 2:01 pm Post subject: |
|
|
Thanks. But... All that tells me (right?) is the dependencies listed in a particular init script. I can already read that for myself.
What I don't know is what I should be looking for. I've stepped through the init process several times, and I can't make it fail any other way besides slowing it down.
My hunch is that there's something in the NFS setup apparatus that's working fine, but takes a while to finish; but the init script doesn't test for that process completing before it tries to do the mount. And I don't know any way to test for it other than... a successful mount, and I don't want to stick some sort of while-do loop in there that's going to hang indefinitely the first time my NFS setup has a real problem...
For now, the delay is good enough, I suppose; it does the job. But I'd rather fix the trouble for real if I just knew what it was. |
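One way to get the effect of the fixed delay without risking an endless hang is a bounded retry: keep trying until the command succeeds, but give up after a deadline. The sketch below is only an illustration; `retry_until` is a hypothetical helper, not something OpenRC or the netmount script provides, and the timeout and interval values are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenRC): retry a command until it succeeds
# or a deadline passes, so a genuine NFS outage cannot hang the boot forever.
# Usage: retry_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
retry_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    while :; do
        # Success: stop retrying immediately.
        if "$@"; then
            return 0
        fi
        # Deadline reached: give up and report failure to the caller.
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
}
```

Used in place of the unconditional sleep, something like `retry_until 30 2 mount /home` would return as soon as the mount works and fail after roughly 30 seconds otherwise. It still only hides the underlying ordering problem rather than explaining it.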
|
steveL Watchman
Joined: 13 Sep 2006 Posts: 5153 Location: The Peanut Gallery
|
Posted: Sat May 02, 2015 1:48 pm Post subject: |
|
|
Sure you can read that for yourself, but it's more about when you're working deep in the tree of dependencies, and you need to see what various things (pretty much at random) are doing. It helps you keep an overview.
But like I said, that's "if you think there's something up with the dependencies." (though you can ofc see any function, from any initscript.)
As for your actual issue, I don't use NFS, so I'll bow out.
I'd recommend asking in #gentoo on IRC: chat.freenode.net for live support, and quicker discussion.
#networking is also meant to be good, but it's not my area so not been in there. (probably not very useful for this, but useful generally.)
Oh, and there's the wiki page on this (again, not something I have first-hand knowledge of): using openrc without netifrc, ie dhcpcd.
Someone will be along to resume normal service shortly.. ;)
Good luck. :) |
|
Atom2 Apprentice
Joined: 01 Aug 2011 Posts: 185
|
Posted: Sun May 03, 2015 2:58 pm Post subject: Re: netmount fails at boot; race condition(?) |
|
|
twork wrote: | I'm setting up a new VM, that wants to mount /home over NFS. Everything works... except at boot. After the system comes up, /home isn't mounted, but if we log in immediately and mount /home, it works fine. |
I assume you are using the standard/recommended way of connecting the VM to the host through a bridge. If so, then my money is on your network setup, in particular the spanning tree protocol (STP) and related settings. By disabling STP (stp off), setting the forward delay to 0 (setfd 0) and the hello time also to 0 (sethello 0), your interface should be up immediately. In essence, you should try to change/add the following parameters for the bridge interface on the host in /etc/conf.d/net (assuming you are using OpenRC):
Code: |
brctl_xenbr0="stp off setfd 0 sethello 0" |
Obviously you need to replace the xenbr0 part in my example statement with whatever your bridge interface is called. My setup example stems from a Xen dom0 setup. Please note that no change is required to the network configuration for/within the VM. |
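To try the bridge settings by hand before editing /etc/conf.d/net, the option string can be expanded into individual brctl invocations. The sketch below only prints the commands (a dry run) and assumes each "command value" pair corresponds to `brctl <command> <bridge> <value>`, which matches the usual brctl syntax for stp, setfd and sethello. `print_brctl_commands` is a hypothetical helper, and this is not a claim about how the init scripts process the setting internally.

```shell
#!/bin/sh
# Print (do not run) the brctl commands for a bridge and an option string of
# "command value" pairs, as in the brctl_xenbr0 setting quoted above.
# Assumption: each pair maps to "brctl <command> <bridge> <value>".
print_brctl_commands() {
    bridge=$1
    shift
    # Intentionally unquoted: split the option string into separate words.
    set -- $*
    while [ "$#" -ge 2 ]; do
        echo "brctl $1 $bridge $2"
        shift 2
    done
}

# xenbr0 is the bridge name from the example; substitute your own.
print_brctl_commands xenbr0 "stp off setfd 0 sethello 0"
```

The printed lines can then be run as root on the host, one at a time, to see whether the boot-time mount in the VM starts working.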
|
javeree Guru
Joined: 29 Jan 2006 Posts: 453
|
Posted: Tue Mar 29, 2016 9:59 pm Post subject: |
|
|
I have a very similar problem, though not on a virtual machine. I've narrowed it down to:
Quote: |
Mar 29 23:03:20 [dhcpcd] ethm: rebinding lease of 192.168.4.50
Mar 29 23:03:20 [dhcpcd] ethm: probing address 192.168.4.50/24
Mar 29 23:03:20 [sm-notify] Version 1.3.1 starting
Mar 29 23:03:20 [ifplugd(ethm)] Link beat detected.
Mar 29 23:03:20 [/etc/init.d/netmount] WARNING: netmount will start when net.ethm has started
Mar 29 23:03:22 [ifplugd(ethm)] Executing '/etc/ifplugd/ifplugd.action ethm up'.
Mar 29 23:03:22 [dhcpcd] sending commands to master dhcpcd process
Mar 29 23:03:22 [dhcpcd] control command: dhcpcd -m 2 ethm
Mar 29 23:03:22 [ifplugd(ethm)] client: sending commands to master dhcpcd process
Mar 29 23:03:23 [ifplugd(ethm)] client: mount.nfs: Failed to resolve server portage: Temporary failure in name resolution
- Last output repeated 7 times -
Mar 29 23:03:23 [/etc/init.d/netmount] ERROR: netmount failed to start
Mar 29 23:03:23 [ifplugd(ethm)] client: * ERROR: netmount failed to start
Mar 29 23:03:23 [ifplugd(ethm)] Program executed successfully.
Mar 29 23:03:25 [dhcpcd] ethm: leased 192.168.4.50 for 3600 seconds
Mar 29 23:03:25 [dhcpcd] ethm: adding route to 192.168.4.0/24
Mar 29 23:03:25 [dhcpcd] ethm: adding default route via 192.168.4.1
Mar 29 23:03:25 [dhcpcd] ethm: removing route to 192.168.4.0/24
|
In my case I try to mount NFS from /etc/fstab, but I use a server name instead of an IP address for the NFS server. I believe ifplugd starts the initialization of the network interface, but starts netmount as soon as dhcpcd has received a reply, before the ethm interface has been set up beyond its IP address (so /etc/resolv.conf is not yet written). The NFS mount starts and tries to resolve the NFS server name, which fails, hence the whole netmount script fails at boot. Next, net.ethm finishes filling out /etc/resolv.conf, and the next attempt to start netmount manually will succeed.
I hope your case has similarities with mine that help you find and resolve this problem. |
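If the theory above is right, a quick way to confirm it is to check whether /etc/resolv.conf already holds a nameserver entry at the moment the mount is attempted. The sketch below polls for that with a bounded number of tries. Both function names are hypothetical helpers and the one-second interval is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: succeed once FILE contains at least one "nameserver" line.
has_nameserver() {
    grep -q '^[[:space:]]*nameserver[[:space:]]' "$1" 2>/dev/null
}

# Check up to TRIES times, one second apart, then give up.
# Usage: wait_for_nameserver FILE TRIES
wait_for_nameserver() {
    file=$1
    tries=$2
    while [ "$tries" -gt 0 ]; do
        has_nameserver "$file" && return 0
        tries=$((tries - 1))
        # Only sleep if another attempt is still to come.
        [ "$tries" -gt 0 ] && sleep 1
    done
    return 1
}
```

For example, `wait_for_nameserver /etc/resolv.conf 10 || logger "no nameserver after 10s"` run just before the mount would show in the log whether name resolution was ready in time.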
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54244 Location: 56N 3W
|
Posted: Tue Mar 29, 2016 10:19 pm Post subject: |
|
|
twork,
It goes like this:
Code: |
localmount | boot
net.eth0 | default
netmount | default |
localmount is in the boot runlevel. It reads /etc/fstab and passes its contents to mount.
Mounting network filesystems fails because neither the network nor netmount is running yet.
The NFS designers thought of this. Add the bg option to your fstab entry.
This allows the mount to continue to run in the background and it will work once the dust settles.
It also allows you to do crazy things like cross NFS mounts that work without locking up the boot of all the boxes involved in the cross mounting.
_________________
Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
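As an illustration of the bg suggestion, an fstab entry with it might look like `nfsserver:/export/home /home nfs rw,bg 0 0`, where the server name and export path are made up. The sketch below lists NFS entries in an fstab-style file that do not yet carry bg. `nfs_mounts_without_bg` is a hypothetical helper, and it assumes the usual whitespace-separated fstab layout with the filesystem type in the third field and the options in the fourth.

```shell
#!/bin/sh
# List NFS entries in an fstab-style file whose options lack "bg".
# Prints "<mount point> <options>" for each such entry.
nfs_mounts_without_bg() {
    awk '
        /^[[:space:]]*#/ { next }    # skip comment lines
        NF >= 4 && ($3 == "nfs" || $3 == "nfs4") {
            n = split($4, opts, ",")
            found = 0
            for (i = 1; i <= n; i++) if (opts[i] == "bg") found = 1
            if (!found) print $2, $4
        }
    ' "$1"
}
```

Running `nfs_mounts_without_bg /etc/fstab` prints nothing when every NFS entry already has the option.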