Gentoo Forums
netmount fails at boot; race condition(?)
twork
Apprentice
Joined: 28 Jul 2006
Posts: 176

Posted: Fri May 01, 2015 1:50 am    Post subject: netmount fails at boot; race condition(?)

I'm setting up a new VM, that wants to mount /home over NFS. Everything works... except at boot. After the system comes up, /home isn't mounted, but if we log in immediately and mount /home, it works fine. So something is out of order somewhere.

I read:
https://www.gentoo.org/support/news-items/2015-02-02-nfs-service-changes.html
...and (think I) followed the instructions there; I have both nfsclient and netmount in my default runlevel, along with their rpc friends:
Code:
# rc-status
Runlevel: default
 syslog-ng                                                         [  started  ]
 vixie-cron                                                        [  started  ]
 nfsclient                                                         [  started  ]
 sshd                                                              [  started  ]
 netmount                                                          [  started  ]
 dbus                                                              [  started  ]
 local                                                             [  started  ]
Dynamic Runlevel: hotplugged
Dynamic Runlevel: needed
 rpcbind                                                           [  started  ]
 rpc.statd                                                         [  started  ]
 rpc.pipefs                                                        [  started  ]
 rpc.idmapd                                                        [  started  ]
Dynamic Runlevel: manual

Like I said, hand-mounting /home works; it also works fine if I log in and run '/etc/init.d/netmount restart'. So it's looking like everything within the nfs/RPC structure itself is working fine, but something outside it isn't ready when it comes up...? Race condition maybe?

To test that, I stuck a ten-second sleep(1) in the netmount runscript, and sure enough, on the next boot, /home was mounted.

My first guess is that the network isn't coming up fast enough (it is DHCP), but shouldn't the init structure account for that?
twork
Apprentice
Joined: 28 Jul 2006
Posts: 176

Posted: Fri May 01, 2015 4:07 am

FWIW: DHCP isn't the culprit; static IP addressing gives the same behavior.

Can't blame my name service either (long shot); finding the NFS server by IP address also doesn't change things.

But that ten second delay at mount time works every time.
steveL
Watchman
Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

Posted: Fri May 01, 2015 1:15 pm

/home and other partitions are mounted as part of localmount which I'm not seeing listed.

edit: oops you said a delay fixes it.
You might find these functions useful, if you think there's something up with the dependencies.
twork
Apprentice
Joined: 28 Jul 2006
Posts: 176

Posted: Fri May 01, 2015 2:01 pm

Thanks. But... All that tells me (right?) is the dependencies listed in a particular init script. I can already read that for myself.

What I don't know is what I should be looking for. I've stepped through the init process several times, and I can't make it fail any other way besides slowing it down.

My hunch is that there's something in the NFS setup apparatus that's working fine, but takes a while to finish; but the init script doesn't test for that process completing before it tries to do the mount. And I don't know any way to test for it other than... a successful mount, and I don't want to stick some sort of while-do loop in there that's going to hang indefinitely the first time my NFS setup has a real problem...
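A bounded version of that loop is possible, so it cannot hang forever. The following is only a rough sketch and not anything taken from the netmount script itself: retry_until is a made-up helper name, and the attempt count and delay are arbitrary values.

```shell
# retry_until MAX_TRIES DELAY CMD...
# Hypothetical helper: run CMD until it succeeds, at most MAX_TRIES times
# (always at least once), sleeping DELAY seconds between failed attempts.
retry_until() {
    max_tries=$1
    delay=$2
    shift 2
    try=1
    while :; do
        # Success: stop retrying immediately.
        "$@" && return 0
        # Out of attempts: report failure instead of looping forever.
        [ "$try" -ge "$max_tries" ] && return 1
        try=$((try + 1))
        sleep "$delay"
    done
}

# Example (needs root, so left commented out):
# give the mount up to 10 attempts, one second apart, then give up.
# retry_until 10 1 mount /home
```

With values like these the worst case is a delay of roughly ten seconds at boot before the failure is reported, which is about the same cost as the fixed sleep, but the common case finishes as soon as the mount works.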

For now, the delay is good enough, I suppose; it does the job. But I'd rather fix the trouble for real if I just knew what it was.
steveL
Watchman
Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

Posted: Sat May 02, 2015 1:48 pm

Sure you can read that for yourself, but it's more about when you're working deep in the tree of dependencies, and you need to see what various things (pretty much at random) are doing. It helps you keep an overview.
But like I said, that's "if you think there's something up with the dependencies." (though you can ofc see any function, from any initscript.)

As for your actual issue, I don't use NFS, so I'll bow out.

I'd recommend asking in #gentoo on IRC: chat.freenode.net for live support, and quicker discussion.

#networking is also meant to be good, but it's not my area, so I've not been in there. (probably not very useful for this, but useful generally.)

Oh, and the wiki page on this (again not something I have first-hand knowledge of) using openrc without netifrc, ie dhcpcd.

Someone will be along to resume normal service shortly.. ;)

Good luck. :)
Atom2
Apprentice
Joined: 01 Aug 2011
Posts: 185

Posted: Sun May 03, 2015 2:58 pm    Post subject: Re: netmount fails at boot; race condition(?)

twork wrote:
I'm setting up a new VM, that wants to mount /home over NFS. Everything works... except at boot. After the system comes up, /home isn't mounted, but if we log in immediately and mount /home, it works fine.
I assume you are using the standard/recommended way of connecting the VM to the host through a bridge. If so, then my money is on your network setup, in particular the spanning tree protocol (STP) and related settings. By disabling STP (stp off), setting the forward delay to 0 (setfd 0) and the hello time also to 0 (sethello 0), your interface should be up immediately. In essence, you should try to change/add the following parameters for the bridge interface on the host in /etc/conf.d/net (assuming you are using OpenRC):
Code:
brctl_xenbr0="stp off setfd 0 sethello 0"
Obviously you need to replace the xenbr0 part in my example statement with the name of your bridge interface. My setup example stems from a Xen dom0 setup. Please note that no change is required to the network configuration for/within the VM.
javeree
Guru
Joined: 29 Jan 2006
Posts: 453

Posted: Tue Mar 29, 2016 9:59 pm

I have a very similar problem, though not on a virtual machine. I've narrowed it down to:
Quote:

Mar 29 23:03:20 [dhcpcd] ethm: rebinding lease of 192.168.4.50
Mar 29 23:03:20 [dhcpcd] ethm: probing address 192.168.4.50/24
Mar 29 23:03:20 [sm-notify] Version 1.3.1 starting
Mar 29 23:03:20 [ifplugd(ethm)] Link beat detected.
Mar 29 23:03:20 [/etc/init.d/netmount] WARNING: netmount will start when net.ethm has started
Mar 29 23:03:22 [ifplugd(ethm)] Executing '/etc/ifplugd/ifplugd.action ethm up'.
Mar 29 23:03:22 [dhcpcd] sending commands to master dhcpcd process
Mar 29 23:03:22 [dhcpcd] control command: dhcpcd -m 2 ethm
Mar 29 23:03:22 [ifplugd(ethm)] client: sending commands to master dhcpcd process
Mar 29 23:03:23 [ifplugd(ethm)] client: mount.nfs: Failed to resolve server portage: Temporary failure in name resolution
- Last output repeated 7 times -
Mar 29 23:03:23 [/etc/init.d/netmount] ERROR: netmount failed to start
Mar 29 23:03:23 [ifplugd(ethm)] client: * ERROR: netmount failed to start
Mar 29 23:03:23 [ifplugd(ethm)] Program executed successfully.
Mar 29 23:03:25 [dhcpcd] ethm: leased 192.168.4.50 for 3600 seconds
Mar 29 23:03:25 [dhcpcd] ethm: adding route to 192.168.4.0/24
Mar 29 23:03:25 [dhcpcd] ethm: adding default route via 192.168.4.1
Mar 29 23:03:25 [dhcpcd] ethm: removing route to 192.168.4.0/24

In my case I mount NFS shares from /etc/fstab, but I use a server name instead of an IP address for the NFS server. I believe ifplugd starts the initialization of the network interface, but then starts netmount as soon as dhcpcd has received a reply, before the ethm interface has been set up beyond its IP address (so /etc/resolv.conf is not yet written). mount.nfs then starts and tries to resolve the NFS server name, which fails, so the whole netmount script fails at boot. After that, net.ethm finishes filling out /etc/resolv.conf, and the next attempt to start netmount manually succeeds.
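One way to test this theory is to check, at the moment netmount runs, whether /etc/resolv.conf already contains a nameserver line. This is only a small sketch: has_nameserver is a made-up helper name, and it assumes a resolver setup that reads a plain /etc/resolv.conf.

```shell
# has_nameserver [FILE]
# Hypothetical helper: succeed if FILE (default /etc/resolv.conf) contains
# at least one "nameserver <address>" line. A missing file counts as failure.
has_nameserver() {
    grep -Eq '^[[:space:]]*nameserver[[:space:]]+[^[:space:]]' \
        "${1:-/etc/resolv.conf}" 2>/dev/null
}

# Report the resolver state, e.g. from a debug line added to an init script.
if has_nameserver /etc/resolv.conf; then
    echo "resolver configured"
else
    echo "no nameserver yet"
fi
```

If this reports "no nameserver yet" right before the mount attempt at boot, but "resolver configured" a few seconds later, that would support the ordering problem described above.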

I hope your case has similarities with mine that help you find and resolve this problem.
NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 54208
Location: 56N 3W

Posted: Tue Mar 29, 2016 10:19 pm

twork,

It goes like this
Code:
localmount | boot
net.eth0 |      default
netmount |      default


localmount is in the boot runlevel. It reads /etc/fstab and passes its contents to mount.
Mounting network filesystems fails because neither the network nor netmount is running yet.

The NFS designers thought of this. Add the bg option to your fstab entry.
This allows the mount to continue to run in the background and it will work once the dust settles.

It also allows you to do crazy things like cross NFS mounts that work without locking up the boot of all the boxes involved in the cross mounting.
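For illustration, an fstab entry with bg might look something like the line below. The server name nfsserver and the export path /export/home are placeholders, not values from this thread, and the other options are just common defaults.

```
# /etc/fstab (sketch; server name and export path are placeholders)
# <fs>                   <mountpoint>  <type>  <opts>  <dump/pass>
nfsserver:/export/home   /home         nfs     rw,bg   0 0
```

With bg, if the first mount attempt fails, mount forks a child that keeps retrying in the background instead of failing outright, so boot carries on and /home appears once the server is reachable.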
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
All times are GMT