Gentoo Forums
[SOLVED] how to reboot (without pulling the plug)

dpaddy
Tux's lil' helper

Joined: 25 Jun 2008
Posts: 142

Posted: Wed May 03, 2017 7:17 pm    Post subject: [SOLVED] how to reboot (without pulling the plug)

I think I have lots of problems related to nfs, but perhaps the first to solve is: How do I make the command
Code:
reboot
work so that the machine reboots?

What happens is that the machine begins to shutdown but hangs after printing the message
Quote:
Unmounting network filesystems ...
I could pull the power cord, but I'd rather learn how to make the reboot command work.

The problem has two parts. If I'm typing reboot on a client machine, then the situation is as described above. If I type reboot on the server machine then the message is preceded by
Quote:
Error: nfs failed to stop


I realize I've got problems with nfs -- and I'd like to know how to fix them -- but first: is there a way to make the reboot command work, or is it the case that when nfs is misconfigured, pulling power cords might be the only way to shut down the machines?


Last edited by dpaddy on Fri May 05, 2017 9:44 pm; edited 1 time in total

NTU
Apprentice

Joined: 17 Jul 2015
Posts: 187

Posted: Wed May 03, 2017 7:37 pm

What does
Code:
telinit 0
or
Code:
shutdown -r now
do for you?

Sent from my Droid 4 XT894


Last edited by NTU on Wed May 03, 2017 8:04 pm; edited 1 time in total

dpaddy
Tux's lil' helper

Joined: 25 Jun 2008
Posts: 142

Posted: Wed May 03, 2017 7:54 pm

Near as I can tell the machines are frozen ("telinit 0" and "shutdown -r now" have no effect). I guess there is nothing to be done... so I'll pull the power cords.

I'll interpret your reply as a suggestion to type "telinit 0" or "shutdown -r now" in the future as an alternative to reboot. I am generally clueless -- is "telinit 0" safe as compared with reboot?
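
For context on the runlevel question: under sysvinit, runlevel 0 conventionally halts the machine and runlevel 6 reboots it, so "telinit 0" requests a shutdown rather than a reboot. A minimal sketch of that mapping follows; the function name runlevel_action is made up for illustration, and a systemd machine maps these numbers to targets instead.

```shell
#!/bin/sh
# runlevel_action LEVEL: describe what "telinit LEVEL" conventionally asks sysvinit to do.
# Only the conventional levels 0, 1 and 6 are covered; 2-5 vary by distribution.
runlevel_action() {
    case "$1" in
        0) echo "halt" ;;
        1|s|S) echo "single-user" ;;
        6) echo "reboot" ;;
        *) echo "distribution-specific" ;;
    esac
}

runlevel_action 0   # prints: halt
runlevel_action 6   # prints: reboot
```

Either way, both go through the normal init shutdown sequence, so they would be expected to hang at the same point as a plain reboot.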

NTU
Apprentice

Joined: 17 Jul 2015
Posts: 187

Posted: Wed May 03, 2017 8:04 pm

Back on my Linux rig again.

It sounds like your machine won't shut down because OpenRC can't stop the NFS service. Stopping services and unmounting filesystems is a required part of the init shutdown sequence: that is how it cleanly terminates whatever processes and applications are loaded in memory, which avoids corruption and improper exits.

Are you using systemd or eudev+OpenRC? Are you able to properly shut down or restart the server even though that error shows up, or does that hang too? What happens if you try to reboot the NFS server without a client connected to it (the client is unplugged, for example)? Do you get any errors or hangs on the server?

You really don't want to be unplugging power cords and forcing the machine to shut off, as this can result in filesystem corruption and/or loss of data.

Hey, one more post and you'll be Tux's lil' helper like me! Congrats!

szatox
Advocate

Joined: 27 Aug 2013
Posts: 3136

Posted: Wed May 03, 2017 8:31 pm

Quote:
Unmounting network filesystems ...
Looks like a broken connection to the NFS server. You mounted some NFS share, and then the server went offline. As a result you have a filesystem that is at the same time mounted and unreachable, and NFS by default freezes I/O rather than returning an error, so you have to bring the server up again to let your system reconnect and then unmount properly.
You can try mounting NFS with the "soft" option to mitigate this in the future.
For now you can:
* force a lazy unmount ( umount -l -f ), which should work but in some cases does not
* force a sync to avoid local FS corruption (with the command "sync" or Alt+SysRq+S) and then force a reboot (reboot -f will skip the whole init shutdown; Alt+SysRq+B works similarly)
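
The two workarounds in the list above can be sketched as one last-resort sequence. This is only an illustration: the helper names run and emergency_reboot are hypothetical, and the script defaults to a dry run (DRY_RUN=1) so that nothing is unmounted or rebooted unless that is changed deliberately.

```shell
#!/bin/sh
# Last-resort sequence: detach NFS mounts, flush local disks, reboot without init.
# With DRY_RUN=1 (the default here) commands are only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

emergency_reboot() {
    # 1. Try to detach every NFS mount: -l is lazy, -f forces, -t limits the types.
    run umount -a -l -f -t nfs,nfs4
    # 2. Flush dirty buffers of the local filesystems to disk.
    run sync
    # 3. Reboot immediately, skipping the init shutdown scripts.
    run reboot -f
}

emergency_reboot
```

Running it as root with DRY_RUN=0 would perform the steps for real. Because reboot -f bypasses service shutdown, it remains an unclean restart for anything that was still running.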

dpaddy
Tux's lil' helper

Joined: 25 Jun 2008
Posts: 142

Posted: Wed May 03, 2017 9:05 pm

After pulling plugs, I restarted, then put
Quote:
/bin/umount -a -f -t nfs,nfs4
into /etc/local.d/nfs.stop (on the server) and made it executable. Maybe it should be on the clients as well?

Should I use
Quote:
/bin/umount -a -l -f -t nfs,nfs4
instead?

I also added the no_root_squash option in /etc/exports (on the server) as for example
Quote:
/home/nfs 123.456.7.89(no_root_squash,insecure,rw,sync,fsid=0)
because I was having issues with root writing to the shared directory.

Finally, I changed the mount entry (on clients and server) in /etc/fstab to
Quote:
hp24:/home/nfs /home/nfs nfs4 defaults,noatime,intr 0 0

Things work now, although that could have been a result of the order in which "reboot" happened on the machines.
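
The fstab entry above keeps the default hard-mount behaviour, and according to nfs(5) the intr option is ignored by kernels newer than 2.6.25. A minimal sketch of what the "soft" variant suggested earlier in the thread might look like follows. The timeo and retrans values are illustrative assumptions, and fstab_nfs_line is a hypothetical helper that only prints a line for review.

```shell
#!/bin/sh
# fstab_nfs_line SERVER REMOTE LOCAL [OPTIONS]: print one /etc/fstab line for an NFSv4 mount.
# Default options are an assumption for illustration: soft mount, 10 s timeout
# (timeo is in tenths of a second), give up after 3 retransmissions.
fstab_nfs_line() {
    printf '%s:%s %s nfs4 %s 0 0\n' \
        "$1" "$2" "$3" "${4:-defaults,noatime,soft,timeo=100,retrans=3}"
}

fstab_nfs_line hp24 /home/nfs /home/nfs
# prints: hp24:/home/nfs /home/nfs nfs4 defaults,noatime,soft,timeo=100,retrans=3 0 0
```

The trade-off is that a soft mount returns I/O errors to applications once the retries are exhausted, and nfs(5) warns that this can lead to data corruption for writes, so it may suit read-mostly shares better.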

However, there would seem to be a chicken-and-egg issue... How would I know whether "reboot" hangs until it does? And by then it would seem too late to type something else in place of "reboot". Is there an alternative that avoids the chicken-and-egg issue, an analogue of "emerge -pv ..."?

Or maybe there is a better command to be using; perhaps I should type "reboot -f" all the time so as to avoid problems?
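
There is no built-in "pretend" mode for reboot that I know of, but one rough approximation is to probe each NFS mount with a time limit before rebooting. The sketch below assumes GNU coreutils (timeout, stat -f) and mount points without spaces; list_nfs_mounts and check_nfs_mounts are hypothetical names.

```shell
#!/bin/sh
# list_nfs_mounts [FILE]: print the mount points of nfs/nfs4 entries
# from a /proc/mounts-style file (defaults to /proc/mounts).
list_nfs_mounts() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "${1:-/proc/mounts}"
}

# check_nfs_mounts [FILE]: probe each NFS mount point, giving up after 5 seconds.
# Returns non-zero if any mount did not answer in time.
check_nfs_mounts() {
    status=0
    for mp in $(list_nfs_mounts "$1"); do
        # stat -f asks for filesystem status; on a dead server this tends to block.
        if timeout -s KILL 5 stat -f "$mp" >/dev/null 2>&1; then
            echo "ok: $mp"
        else
            echo "not responding: $mp"
            status=1
        fi
    done
    return "$status"
}
```

A possible use is `check_nfs_mounts && reboot`, which only proceeds when every NFS mount answered. A passing check is not a guarantee, since the server can still disappear between the probe and the unmount.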

NTU
Apprentice

Joined: 17 Jul 2015
Posts: 187

Posted: Wed May 03, 2017 9:09 pm

szatox wrote:
As a result you have a filesystem that is at the same time mounted and unreachable, and NFS by default freezes IO rather than return an error

I figured that because he's new I'd at least make sure he can shut his server down and didn't break something else (i.e. sysvinit/systemd) before jumping straight into debugging NFS. It raises a few eyebrows when you need to resort to cable pulling. Assuming all else is fine and it's strictly an NFS problem, then yes, the filesystem should be unmounted before the NFS daemon service is turned off. However, I think OpenRC should factor this in with a timeout value, much like (I believe) it does for DHCP and NTP. It seems odd they'd leave it broken like this: "Yeah, let's just make the user resort to their Magic SysRq key." This all seems wrong to me.

edit: Custom local init scripts aren't the right way to do this either. /etc/fstab and the kernel automounter (v4) should handle this.

https://www.centos.org/docs/5/html/Deployment_Guide-en-US/s1-nfs-client-config.html

https://www.kernel.org/doc/Documentation/filesystems/autofs4.txt

https://wiki.gentoo.org/wiki/AutoFS

dpaddy
Tux's lil' helper

Joined: 25 Jun 2008
Posts: 142

Posted: Wed May 03, 2017 9:18 pm

Oh, maybe I'm beginning to understand... I need to learn about
Quote:
Magic SysRq key
because that is the answer when the system hangs?

I'll have a look at
Quote:
https://wiki.gentoo.org/wiki/Magic_SysRQ


But to speed up my learning curve, what is recommended -- which command keys, in what order?


Thanks!

NTU
Apprentice

Joined: 17 Jul 2015
Posts: 187

Posted: Wed May 03, 2017 9:25 pm

I'm not trying to be rude here, so please don't take offense, but if you spent more time learning how to set up your init system to handle NFS and to mount and unmount properly, you wouldn't need Magic SysRq, reboot -f, etc. If you're doing any of those things, it isn't a solution; it's a sign that something is wrong. Please look over the documentation in the links I posted for handling NFS via fstab (my preferred way) or automount/autofs.

You should never have to do any of the things you're doing in a sane Linux environment. Your shutdown problem shouldn't be tackled from a "what keys do I press to trigger SysRq" angle.

dpaddy
Tux's lil' helper

Joined: 25 Jun 2008
Posts: 142

Posted: Wed May 03, 2017 9:32 pm

Fair enough, I appreciate the guidance.

dpaddy
Tux's lil' helper

Joined: 25 Jun 2008
Posts: 142

Posted: Fri May 05, 2017 9:44 pm

Magic SysRq key is the answer!
Quote:
https://wiki.gentoo.org/wiki/Magic_SysRQ
http://www.linuxhowtos.org/Tips%20and%20Tricks/sysrq.htm


Things occasionally go wrong; one will not always be "in a sane Linux environment"... when that happens, Magic SysRq key to the rescue.
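
Before relying on the key combinations, it may help to check whether the kernel allows them. The sketch below reads /proc/sys/kernel/sysrq and tests the bits the kernel's sysrq documentation assigns to sync (16), remount read-only (32) and reboot (128); sysrq_allows is a hypothetical helper name.

```shell
#!/bin/sh
# sysrq_allows VALUE BIT: succeed if a /proc/sys/kernel/sysrq VALUE permits
# the keyboard function guarded by BIT. 0 disables everything, 1 enables
# everything, larger values are a bitmask.
sysrq_allows() {
    _value=$1
    _bit=$2
    [ "$_value" -eq 1 ] || [ $(( _value & _bit )) -ne 0 ]
}

# Fall back to 0 (disabled) if the file cannot be read.
current=$(cat /proc/sys/kernel/sysrq 2>/dev/null || echo 0)
for pair in "sync:16" "remount-ro:32" "reboot:128"; do
    name=${pair%%:*}
    bit=${pair##*:}
    if sysrq_allows "$current" "$bit"; then
        echo "$name: allowed from the keyboard"
    else
        echo "$name: not allowed from the keyboard"
    fi
done
```

The matching keys are Alt+SysRq+S (sync), Alt+SysRq+U (remount read-only) and Alt+SysRq+B (reboot). The mask only restricts the keyboard; root can still trigger the same functions by writing the letter to /proc/sysrq-trigger, for example `echo s > /proc/sysrq-trigger`.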

Hu
Moderator

Joined: 06 Mar 2007
Posts: 21633

Posted: Sat May 06, 2017 12:50 am

Sysrq is a software alternative to disconnecting the power cable (or holding the power button long enough for the firmware to force power off); it is still an unclean halt used as a workaround for systems that are hung too hard for a normal shutdown. Whenever possible, you should avoid entering a situation where an unclean halt is required.

gordonb3
Apprentice

Joined: 01 Jul 2015
Posts: 185

Posted: Sun May 07, 2017 10:40 am

I get this from time to time on ARM systems that don't receive frequent updates. It might be related to using a (crossdev) binhost, or simply to the system having fallen further behind than the devs take into consideration, but running services tend to freeze when trying to write to locations that were removed during the merge. Such changes are usually not obvious, and then you run into the problem of the system freezing during shutdown.

krinn
Watchman

Joined: 02 May 2003
Posts: 7470

Posted: Mon May 08, 2017 5:40 am

dpaddy wrote:

Finally, I changed the mount entry (on clients and server) in /etc/fstab to
Quote:
hp24:/home/nfs /home/nfs nfs4 defaults,noatime,intr 0 0

Things work now, although that could have been a result of the order in which "reboot" happened on the machines.

You should fix your NFS setup too. In NFSv4, mount paths are relative to the NFS root (fsid=0), while in NFSv3 they are relative to the path of the export.

So if you export /home/nfs with fsid=0, you define that directory as the NFSv4 root,
and your clients should mount it as:
nfsv3: /home/nfs /home/nfs
nfsv4: / /home/nfs

NFSv4 handles lazy umount better, and using it properly might fix your issue.
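
The path rule above can be sketched as a small helper that turns a server-side path into the path an NFSv4 client would request, given the directory exported with fsid=0. The name nfs4_remote_path is hypothetical, and paths outside the root are returned unchanged as a simplifying assumption.

```shell
#!/bin/sh
# nfs4_remote_path ROOT PATH: print PATH as seen relative to the NFSv4 root ROOT
# (the directory exported with fsid=0). Paths outside ROOT are printed unchanged.
nfs4_remote_path() {
    _root=${1%/}
    _path=$2
    case "$_path" in
        "$_root")   echo "/" ;;
        "$_root"/*) echo "${_path#"$_root"}" ;;
        *)          echo "$_path" ;;
    esac
}

nfs4_remote_path /home/nfs /home/nfs        # prints: /
nfs4_remote_path /home/nfs /home/nfs/data   # prints: /data
```

With the setup from this thread, the client-side mount would then read roughly `mount -t nfs4 hp24:/ /home/nfs`, where an NFSv3 client would use `hp24:/home/nfs` as the source.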

All times are GMT