Gentoo Forums
NFS sharing not mounting at boot
jyoung
Guru

Joined: 20 Mar 2007
Posts: 436

Posted: Thu Jun 14, 2018 12:29 am    Post subject: NFS sharing not mounting at boot

I recently updated a machine that starts some net-requiring services at boot, but, after the update, the services no longer start properly. Specifically, it needs to start an ntp client and mount an NFS share. After the bootup is complete, I can mount the NFS share manually without issue, and there are no other signs of lack of connectivity.

During the update, emerge printed this message:
http://gentoo.org/support/news-items/2015-02-02-nfs-service-changes.html

but, I've implemented the suggested change:
Code:
rc-update add nfsclient


This machine uses OpenRC, and it *seems* like the solution would be to force the nfsclient service to wait for a connection. However, I've not had any success in doing this. I've added this to /etc/conf.d/nfsclient
Code:

rc_after="net"
rc_need="net"


I also found this old post which describes the opposite problem, the machine waiting too long for a connection before giving up:
forums.gentoo.org/viewtopic-p-2704033.html


I tried the opposite solution, adding the following line to /etc/conf.d/net:
Code:

dhcpcd_enp0s31f6="-t 60"


But, without success. Any help would be great!
Hu
Moderator

Joined: 06 Mar 2007
Posts: 21431

Posted: Thu Jun 14, 2018 1:24 am

You mention ntp near the beginning, then focus on NFS. Is ntp working properly?

For your NFS problem, please post the fstab lines for the affected filesystems and output from the boot so we can see whether your attempt to make nfsclient start later worked.
krinn
Watchman

Joined: 02 May 2003
Posts: 7470

Posted: Thu Jun 14, 2018 12:47 pm

You missed this one: https://gentoo.org/support/news-items/2015-10-07-openrc-0-18-localmount-and-netmount-changes.html

Which basically means that if any mount fails, the service that attempted the mount fails, and all services that depend on that service will not "fail" but will wait for it to succeed (delayed).
The delay is not new: a service that depends on another should wait for its parent to succeed. Why try to start if your parent is not ready and you need it?
What is new is that a single failed mount now marks the whole service as failed, whereas previously a failed mount was simply ignored and the service was marked as started if any other mount succeeded.
szatox
Advocate

Joined: 27 Aug 2013
Posts: 3095

Posted: Thu Jun 14, 2018 6:59 pm

Quote:
After the bootup is complete, I can mount the NFS share manually without issue, and there are no other signs of lack of connectivity.

Adding "_netdev" to fstab (in mount options column) should fix it. Once done, mount for that device will be called after network changes status to started.

Regarding ntp, did you tamper with init scripts?

What do "rc-service ntpd ineed" and "rc-service net.enp0s31f6 iprovide" say?

Funny... I wanted to ask about "rc-service -r net", but on my system it doesn't work for any service with an indirect provider or multiple providers (a notable example, though not the only one: net -> net.eth0), and yet the system boots just fine.
Jaglover
Watchman

Joined: 29 May 2005
Posts: 8291
Location: Saint Amant, Acadiana

Posted: Thu Jun 14, 2018 7:05 pm

Isn't _netdev a systemd option?
_________________
My Gentoo installation notes.
Please learn how to denote units correctly!
jyoung
Guru

Joined: 20 Mar 2007
Posts: 436

Posted: Thu Jun 14, 2018 8:23 pm

ntp isn't working either. I just set the time to an incorrect time and rebooted, and ntp did not reset the time and isn't currently running. The log does show some ntp errors:

Code:

Jun 14 09:13:01 box24 ntpd[5145]: bind(20) AF_INET6 fe80::eb96:b401:c451:e155%2#123 flags 0x11 failed: Cannot assign requested address
Jun 14 09:13:01 box24 ntpd[5145]: unable to create socket on enp0s31f6 (4) for fe80::eb96:b401:c451:e155%2#123
Jun 14 09:13:01 box24 ntpd[5145]: failed to init interface for address fe80::eb96:b401:c451:e155%2
Jun 14 09:13:01 box24 ntpd[5145]: Listening on routing socket on fd #20 for interface updates
Jun 14 09:13:02 box24 ntpd[5145]: bind(23) AF_INET6 fe80::eb96:b401:c451:e155%2#123 flags 0x11 failed: Cannot assign requested address
Jun 14 09:13:02 box24 ntpd[5145]: unable to create socket on enp0s31f6 (5) for fe80::eb96:b401:c451:e155%2#123
Jun 14 09:13:02 box24 ntpd[5145]: failed to init interface for address fe80::eb96:b401:c451:e155%2
Jun 14 09:13:04 box24 ntpd[5145]: Listen normally on 6 enp0s31f6 [fe80::eb96:b401:c451:e155%2]:123
Jun 14 09:13:11 box24 ntpd[5145]: Listen normally on 7 enp0s31f6 138.110.75.115:123


I did tinker with the ntp scripts months ago, but not recently.

The line in fstab for this particular NFS share is:
Code:

<IP address>:/export/cluster   /cluster nfs bg,timeo=14,_netdev,hard,intr,noatime,rsize=32768,wsize=32768,auto,nofail,_netdev 0 0


Note that I've just added the nofail (based on krinn's comment) and _netdev (based on szatox's comment). After reboot, still no luck.

"rc-service ntpd ineed" returns "fsck localmount dhcpcd"
and "rc-service net.enp0s31f6 iprovide" returns "net"
szatox
Advocate

Joined: 27 Aug 2013
Posts: 3095

Posted: Thu Jun 14, 2018 9:15 pm

Jaglover wrote:
Isn't _netdev a systemd option?

No. It's actually a mount option, and the start function of the localmount service does consider it. (A very short snippet here.)
Quote:
no_netdev="-O no_netdev"
mount -at "$types" $no_netdev
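To get a feel for which fstab entries that filter leaves for later, something like the following could be used. It is only a rough sketch and not what OpenRC itself runs: the helper name list_netdev_mounts is made up for this example, and it assumes that only the types nfs/nfs4 and the _netdev option mark an entry as "network", while the real init scripts know a longer list of network filesystems.

```shell
# list_netdev_mounts FILE   (hypothetical helper, not part of OpenRC)
# Print the mount point of every fstab entry that looks like a network mount:
# either the type is nfs/nfs4 or the option list contains _netdev.
list_netdev_mounts() {
    awk '
        /^[ \t]*#/ || NF < 4 { next }      # skip comments, blank and short lines
        {
            isnet = ($3 == "nfs" || $3 == "nfs4")
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "_netdev")
                    isnet = 1
            if (isnet)
                print $2
        }
    ' "$1"
}
```

Called as list_netdev_mounts /etc/fstab, it prints one mount point per line; for the fstab line posted above that would be /cluster.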

Quote:
I just set the time to an incorrect time and rebooted, and ntp did not reset the time and isn't currently running

Ntpd does not hard reset the time. If your clock is heavily skewed (beyond the panic threshold; I think it is 1000 seconds by default), ntpd will simply exit, possibly returning an error code. In such cases you have to either change the config manually to force-set the time on boot, or run ntp-client before starting ntpd. Ntp-client will attempt to contact NTP servers, set your system time immediately to the correct value and then exit. From this point ntpd can take over and keep your clock in sync by speeding it up or slowing it down as you lag behind or rush ahead of the world.
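As an illustration of the "change config manually" route: ntpd has a -g flag that allows one arbitrarily large correction at startup instead of exiting. A minimal sketch, assuming the usual Gentoo layout where the init script reads NTPD_OPTS from /etc/conf.d/ntpd (check your own file, the variable name is an assumption here):

```shell
# /etc/conf.d/ntpd (assumed location and variable name)
# -g: allow the first adjustment to be of any size, once, instead of
# exiting when the offset exceeds the panic threshold.
NTPD_OPTS="-g"
```

The other route mentioned above is to add ntp-client to the same runlevel as ntpd, so the clock is stepped once before the daemon starts.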
Quote:

Jun 14 09:13:02 box24 ntpd[5145]: bind(23) AF_INET6 fe80::eb96:b401:c451:e155%2#123 flags 0x11 failed: Cannot assign requested address
Jun 14 09:13:02 box24 ntpd[5145]: unable to create socket on enp0s31f6 (5) for fe80::eb96:b401:c451:e155%2#123
This is weird. Do you have another process listening on this port?

Errr.... I've just noticed this little bit below and I'm thinking about the implications of the kinda weird output you got there. Are you using netifrc AND dhcpcd in daemon mode at the same time (as a separate service)? I mean, are both the net.<unpredictable_interface_name> and dhcpcd services started during boot?
Quote:

"rc-service ntpd ineed" returns "fsck localmount dhcpcd"
and "rc-service net.enp0s31f6 iprovide" returns "net"
jyoung
Guru

Joined: 20 Mar 2007
Posts: 436

Posted: Thu Jun 14, 2018 10:06 pm

Okay, it looks like ntp is starting at boot. I just tried the same experiment, but with a 2min offset, and ntp corrected the time. And, the ntp daemon is now running. So it looks like the problem may be localized to nfs ...

Yes, actually, I do have net.enp0s10 and dhcpcd started during boot.
jyoung
Guru

Joined: 20 Mar 2007
Posts: 436

Posted: Thu Jun 14, 2018 10:33 pm

I just tried again with dhcpcd removed from the default runlevel. Now, it works. There's still an error message about the mount failing during the boot sequence, but based on krinn's comments that's to be expected (yes?).

Returning to the gentoo handbook, it says to add the net.* devices to the default runlevel, but not dhcpcd. Not sure why I did that back when I set up the system ...
Hu
Moderator

Joined: 06 Mar 2007
Posts: 21431

Posted: Fri Jun 15, 2018 1:49 am

jyoung wrote:
There's still an error message about the mount failing during the boot sequence, but based on krinn's comments that's to be expected (yes?).
No, krinn's comment was that the rules for deciding whether network mounts had succeeded are different now than they once were. If your configuration is correct and the server is up, you should get an automated mount at the right point during boot.
krinn
Watchman

Joined: 02 May 2003
Posts: 7470

Posted: Fri Jun 15, 2018 12:26 pm

Hu wrote:
that the rules for deciding whether network mounts had succeeded

It's not limited to network mounts. It's especially visible with them because other services depend on them, but it affects all "user" mounts (the ones done from localmount).
And the problem: if a filesystem you try to mount is in error, even nofail does not help.
The key is that nofail only works for a device that is not present; a device that is present but in error will always report an error, and the service ends in error.
You end up with a cascade of blocked services (which will include nfs, network...), or even a non-bootable system! See https://bugs.gentoo.org/579876


This can be disabled by adding ignore_mount_errors="yes" in /etc/conf.d/localmount
If your nfsclient reports an error, that's bad, because anything depending on it will be stuck. You should fix that.

ps: _netdev is a mount option for the special case where the system is unable to determine by itself whether the device is a network device or not. Using _netdev on an nfs mount does nothing; the system is fully aware that nfs mounts are network mounts.
jyoung
Guru

Joined: 20 Mar 2007
Posts: 436

Posted: Fri Jun 15, 2018 6:19 pm

The failed attempt at mounting occurs before the net service is started. Maybe part of the solution would be to change the order?

Quote:
ignore_mount_errors="yes" in /etc/conf.d/localmount


Let me see if I understand this: adding this will ignore (safely) the initial failure to mount the NFS share, but allow nfsclient to keep attempting the mount in the background until it succeeds?
szatox
Advocate

Joined: 27 Aug 2013
Posts: 3095

Posted: Fri Jun 15, 2018 6:42 pm

I'm glad you're making progress here; one problem solved is always good news.

Quote:
There's still an error message about the mount failing during the boot sequence, but based on krinn's comments that's to be expected (yes?).
No, your system is not supposed to even try mounting NFS until your network is up and running, and once it is up and running, nfsmount should succeed.

Quote:
ps: _netdev is a mount option for the special case where the system is unable to determine by itself whether the device is a network device or not. Using _netdev on an nfs mount does nothing; the system is fully aware that nfs mounts are network mounts.
Does the localmount script know about it, or does mount know that the network is down? I checked the script and the manpages for _netdev; it explicitly orders mount to skip those devices. I'm not sure what happens without this parameter.
jyoung
Guru

Joined: 20 Mar 2007
Posts: 436

Posted: Fri Jun 15, 2018 7:49 pm

My /etc/conf.d/localmount only has comments in it; I haven't altered it. Should I alter it to make localmount aware of the network issues?
szatox
Advocate

Joined: 27 Aug 2013
Posts: 3095

Posted: Fri Jun 15, 2018 11:11 pm

jyoung, localmount is a service that keeps all of its relevant configuration in a completely unrelated file: /etc/fstab. There should be no need to ever alter it. You just make sure it's enabled in "boot" runlevel and let it do its job.


Now, I inspected a few other init scripts and I spotted an epic fail on our part:
nfsmount is an outdated dummy script now. It has been replaced by the pair nfsclient + netmount.
Netmount actually "wants" nfsclient, so adding netmount to the "default" runlevel should be sufficient, since it will attempt to start nfsclient if it finds an nfs share in /etc/fstab.

Back to the failure to mount the NFS share: which particular service attempts and fails to mount it before the network starts?
Are you actually on a Linux kernel, or a BSD one?
I'm confused, something doesn't fit. A quick summary of current issue would help us ensure we're on the same page.
I know it may seem redundant. Still, natural language is not a very strict protocol and redundant data helps a lot with error correction.
jyoung
Guru

Joined: 20 Mar 2007
Posts: 436

Posted: Sat Jun 16, 2018 12:24 am

I have the NFS share set up in /etc/fstab, so that means that the service which is trying to start it would be netmount (yes?)

Prior to the system update, all that was needed for this machine to start properly was the config in /etc/fstab and nfsmount in the default runlevel. This machine is a node in a cluster, and it needs to have the NFS share mounted as part of the boot process.

netmount and nfsclient are both in the default and boot runlevels. Should I remove nfsclient?
Hu
Moderator

Joined: 06 Mar 2007
Posts: 21431

Posted: Sat Jun 16, 2018 1:19 am

jyoung wrote:
I have the NFS share set up in /etc/fstab, so that means that the service which is trying to start it would be netmount (yes?)
Maybe. That's why szatox asked you to tell us what service is mounting it. The output around when it fails should show what we need. Please quote it to us.
krinn
Watchman

Joined: 02 May 2003
Posts: 7470

Posted: Sat Jun 16, 2018 1:43 pm

It might help you seeing it:
Code:
rc-update | grep "nfs\|mount"
           localmount | boot                                         
             mount-ro |                        shutdown               
             netmount |      default                                 

The content of nfsmount should have been changed (did you forget to etc-update?) so that it does nothing but emit a deprecation message. Here it is:
Code:
   ewarn "nfsmount is deprecated, please migrate as described in the news item: 2015-02-02-nfs-service-changes"
   ewarn "This migration script will be removed after 01 Aug 2015."

szatox wrote:
Does the localmount script know about it, or does mount know that the network is down?

No, it never knows whether the network is down; it only knows whether the device is a network device or not. nfs is a known network fs, which is why you don't have to hint about it with _netdev.
But with or without _netdev, it will try to mount them. To balance this, the service depends on a net provider. Still, even if net is up, that doesn't mean the mount will work: net is up if the card is up, but without a cable link the network is not "ready"; or net is up and the cable is plugged in, but the server itself is down or nfsd is simply not started.
_netdev is for devices that could be either network or local; with _netdev you make it clear that the device is not local.
I balance this myself with the init script below, because if you are mounting 6 shares from that host and the host is down, you have to wait for each timeout (if the timeout is 30s, that means a 6 x 30s delay; boring).
Code:
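#!/sbin/openrc-run
# ifserverup: start netmount only if the NFS server answers a ping.
# (shebang added so the file runs as an OpenRC service script)

depend() {
   need net
}

start() {
   ebegin "Starting ifserverup"
   # 192.168.0.6 is the NFS server; change it to your own server IP
   if ping -c1 192.168.0.6 >/dev/null 2>&1; then
      /etc/init.d/netmount start
      eend $?
   else
      eend 1 "Server is down"
   fi
}

stop() {
   ebegin "Stopping network share"
   /etc/init.d/netmount stop
   eend $?
}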
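One possible improvement, since a single ping right after net comes up can fail even when the server is fine: retry a few times before giving up. The function below is only a sketch; the name wait_for_server and its argument order are made up for this example, and it is plain POSIX sh so it could be pasted into an init script like the one above.

```shell
# wait_for_server TRIES DELAY CMD [ARGS...]   (hypothetical helper)
# Run CMD up to TRIES times, sleeping DELAY seconds between attempts.
# Returns 0 as soon as CMD succeeds, 1 if every attempt fails.
# Note: uses the global variables tries and delay (POSIX sh has no "local").
wait_for_server() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        # no point sleeping after the last failed attempt
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    return 1
}
```

In start() the test would then become something like: if wait_for_server 5 2 ping -c1 192.168.0.6; then ... which waits at most about 8 seconds of sleep plus the ping timeouts before declaring the server down.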
P.Kosunen
Guru

Joined: 21 Nov 2005
Posts: 309
Location: Finland

Posted: Sat Jun 16, 2018 4:33 pm

/etc/conf.d/netmount:
rc_need="dhcpcd"

/etc/dhcpcd.conf:
waitip 4

I have the "waitip 4" option in dhcpcd.conf so that dhcpcd waits until IPv4 is ready before boot continues, and netmount is set to wait for dhcpcd before starting. Also, "_netdev" must be in the fstab options for NFS mounts. IIRC OpenRC parallel start must be disabled.
jyoung
Guru

Joined: 20 Mar 2007
Posts: 436

Posted: Sun Jun 17, 2018 10:00 pm

Here's the snippet from /var/log/daemon.log that shows the NFS failure. And, actually, I'm seeing that NTP is failing too at first. I missed that among the messages scrolling past during boot.

Code:
Jun 12 15:12:45 box24 dhcpcd[4290]: control command: dhcpcd -m 2 enp0s31f6
Jun 12 15:12:45 box24 ntpdate[5439]: name server cannot be used: Temporary failure in name resolution (-3)
Jun 12 15:12:45 box24 /etc/init.d/ntp-client[5418]: ERROR: ntp-client failed to start
Jun 12 15:12:46 box24 /etc/init.d/netmount[5496]: Failed to mount /cluster
Jun 12 15:12:46 box24 /etc/init.d/netmount[5472]: ERROR: netmount failed to start


Krinn, yes, I did forget to do etc-update. Once I ran that, the machine no longer mounted the NFS share (recall that removing dhcpcd from boot runlevel allowed NFS to mount, albeit with an error message). One of the files that was updated by etc-update was rc.conf. The difference between the old and the updated version was that the old (working) version had:

Code:
rc_depend_strict="NO"
jyoung
Guru

Joined: 20 Mar 2007
Posts: 436

Posted: Tue Jun 19, 2018 9:01 pm

During a recent reboot I was able to catch a message not reported in /var/log/daemon.log:

Code:
mount.nfs Network is unreachable



P.Kosunen, I tried your setup, modifying for IPv6:

/etc/conf.d/netmount:
rc_need="dhcpcd"

/etc/dhcpcd.conf:
waitip 6

That seems to work; there's no longer a message either at boot or in the logs about the NFS share failing to mount. To make this solution a bit more general, would there be a way to make dhcpcd wait for either IPv4 or IPv6?

krinn, I'm very interested in your solution as it seems a good idea for cases where there's no connection. This script is in /etc/init.d, and you added it to rc, yes?
krinn
Watchman

Joined: 02 May 2003
Posts: 7470

Posted: Tue Jun 19, 2018 10:36 pm

jyoung wrote:
krinn, I'm very interested in your solution as it seems a good idea for cases where there's no connection. This script is in /etc/init.d, and you added it to rc, yes?

Yes, in /etc/init.d.
Implementation is easy: change the ping test to use your server IP and
Code:
rc-update add ifserverup
rc-update del netmount

You might have noticed how simple it is, but it does the job. If someone is in the mood for improvements ;)
P.Kosunen
Guru

Joined: 21 Nov 2005
Posts: 309
Location: Finland

Posted: Thu Jun 21, 2018 11:06 am

jyoung wrote:
To make this solution a bit more general, would there be a way to make dhcpcd wait for either IPv4 or IPv6?

If you leave the number out, it should do that(?).
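For reference, a sketch of what that looks like in the config. As far as the dhcpcd.conf man page describes it, waitip with no argument waits for an address of any protocol before dhcpcd forks to the background, and the option can be given more than once to wait for several protocols at the same time:

```conf
# /etc/dhcpcd.conf
# wait for an address of either family (IPv4 or IPv6) before backgrounding
waitip

# alternative: require both families before boot continues
#waitip 4
#waitip 6
```

With the first form, a host on an IPv6-only or IPv4-only network should behave the same way, which is the more general setup asked about above.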
jyoung
Guru

Joined: 20 Mar 2007
Posts: 436

Posted: Fri Jun 22, 2018 6:13 pm

I just tried leaving the number out, and it still works on IPv6. Unfortunately, I don't have an IPv4 network to test with. I also ran the additional test of booting without a connection at all. Of course, it didn't start NFS or NTP, but it also didn't get stuck or anything like that.

Unless anyone objects, I'm going to mark this thread as SOLVED.
szatox
Advocate

Joined: 27 Aug 2013
Posts: 3095

Posted: Fri Jun 22, 2018 11:09 pm

This solution sucks, but it's up to you to decide whether or not you're happy with it.
All times are GMT