Gentoo Forums
Systemd timing out during local disk mount

Helix
n00b

Joined: 09 Jun 2005
Posts: 27

Posted: Sat Jun 22, 2013 2:40 pm    Post subject: Systemd timing out during local disk mount

Hi all,

I am trying to migrate to systemd from openrc but I am having big problems getting my disks to mount properly during boot.

My previous disk and boot setup looked like this:

I am using a dmcrypt encrypted root partition which is unlocked during boot by some commands in my initramfs. The same passphrase is then also passed to all other encrypted disk partitions and corresponding /dev/mapper/ devices are generated for all, but only the root partition is mounted r/o at the end of the initramfs. After a switch_root, the now-decrypted system's init is called from the initramfs and the boot process continues normally. Later on in the boot process the init system fscks all disks, remounts my root r/w and then mounts all remaining /dev/mapper devices.

This setup has been working flawlessly for years on 4 different machines, before and after the introduction of openrc.

After changing the final call in the initramfs to point to systemd, the following happens:
Everything boots as expected, but when the local filesystems other than the root partition are supposed to be mounted, the following error is displayed:

"A start job is running for dev-mapper-data.device ..."

.... ~30 secs pass ...

"Timed out waiting for device dev-mapper-data.device."
"Dependency failed for /data"
"Dependency failed for Local File Systems."

(/data being the mountpoint)

After that the system greets me with emergency mode, where I can log in using my root password.
If I comment out the corresponding line in /etc/fstab (which is possible as my entire system is actually contained in the root partition only; all other partitions are optional in terms of system usage), booting finishes without any error (and lightning fast ...)

To be honest I am at a complete loss here as to what might be wrong, except that I suspect some device node mix-up between udev/systemd/initramfs. However, I have too little clue about how systemd works internally or how to debug it; I was planning to get it booting into something usable and then fine-tune and learn things along the way. But currently it looks bad for that :(

Does anyone have any idea what to do, or where to find an overview of how the boot process using systemd really works? Any help is appreciated.

Thanks a lot,
kind regards,
Helix

ulenrich
Veteran

Joined: 10 Oct 2010
Posts: 1480

Posted: Sat Jun 22, 2013 2:56 pm

I would use the newer dracut-029 from upstream git
https://bugs.gentoo.org/show_bug.cgi?id=473298

smartass
Apprentice

Joined: 04 Jul 2011
Posts: 189
Location: right behind you ... (you did turn around, didn't you?)

Posted: Sat Jun 22, 2013 2:59 pm

If I understood you correctly, all your partitions are encrypted, so if systemd is able to check and mount them and not the data partition, then that might signal that something is wrong with that partition. Any special mount options that are different from the other partitions?

You might want to boot using a live medium (or just boot without checking the data partition to get a working system) and check and mount the partition manually, just to be sure.

TomWij
Retired Dev

Joined: 04 Jul 2012
Posts: 1553

Posted: Sat Jun 22, 2013 6:36 pm

If you want to temporarily fix your boot so that this mount is at least no longer part of the problem, you could add "noauto" and "x-systemd.automount" (without the quotes) to its entry in /etc/fstab.
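
As a minimal sketch of that fstab change, assuming a POSIX shell with awk: the helper below (add_fstab_opts is a made-up name) prints an fstab with extra options appended to the entry of one mount point. The device path and the ext4 type in the usage line are placeholders, not taken from Helix's actual fstab.

```shell
# add_fstab_opts FILE MOUNTPOINT OPTS
# Prints FILE with OPTS appended to the options field of MOUNTPOINT's entry.
# Note: awk rebuilds a modified line with single spaces between fields.
add_fstab_opts() {
    awk -v mp="$2" -v extra="$3" '
        /^[[:space:]]*#/ { print; next }   # leave comment lines untouched
        $2 == mp { $4 = $4 "," extra }     # the 4th field holds mount options
        { print }
    ' "$1"
}

# Example: preview the change before touching the real file
#   add_fstab_opts /etc/fstab /data noauto,x-systemd.automount
```

An entry such as "/dev/mapper/data /data ext4 defaults 0 2" would come out as "/dev/mapper/data /data ext4 defaults,noauto,x-systemd.automount 0 2". The function only prints the result, so nothing is modified until you redirect the output yourself.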

Helix
n00b

Joined: 09 Jun 2005
Posts: 27

Posted: Sun Jun 23, 2013 9:32 am

Thanks for the speedy replies.

@ulenrich: I will keep Dracut in mind in case everything else fails.

@smartass: No, sorry, apparently I wasn't clear on this. ALL encrypted drives are showing the same behavior; for testing I am only working with /data so that I don't have to wait for the timeout multiple times. So I guess a problem specific to this partition is ruled out.

@TomWij: Yes, with that option the boot gets through to the login. However, as the special option then remains in my fstab, I cannot mount manually with "mount /data" anymore, and accessing /data then times out again. As you pointed out, the option is just postponing the problem.

I dug a little deeper this morning and it seems the root cause is not so much the mount but rather the availability of the /dev/mapper/data device, as systemd says "Job dev-mapper-data.device/start timed out." and everything fails afterwards.

man systemd.device says
Quote:

systemd will automatically create dynamic device units for all kernel devices that are marked with the "systemd" udev tag (by default all block and network devices, and a few others). This may be used to define dependencies between devices and other units.


so now I am suspecting that systemd/udev is somehow unaware of these device nodes, as they were created manually in the initramfs before udev or systemd were started. When it comes to starting those device units, the udev notification systemd is expecting never shows up because the nodes are already present, or something like that.

So, is there any option to either

a) tag them in the initramfs (using udevadm maybe?), or
b) tell udev/systemd in a configuration to treat those devices the same after leaving the initramfs, or
c) create a pseudo device unit for systemd that doesn't do anything, since the nodes are guaranteed to be there already?

I believe any of those measures should fix the problem somehow.
Any pointers on that ?

Thanks a lot guys.
Kind regards,
Helix
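
The suspicion in this post can be checked from the emergency shell. A small sketch, assuming udevadm prints the tags as a colon-delimited TAGS= property and that SYSTEMD_READY behaves as described in systemd.device(5); the two helper names are invented here. It only diagnoses, it does not fix anything.

```shell
# Both helpers read `udevadm info --query=property` output on stdin.

# Succeeds if the device carries the "systemd" udev tag.
has_systemd_tag() {
    grep -q '^TAGS=.*:systemd:'
}

# Succeeds if udev marked the device SYSTEMD_READY=0, which systemd.device(5)
# describes as telling systemd to treat the device as unplugged.
marked_not_ready() {
    grep -q '^SYSTEMD_READY=0$'
}

# Example on a live system (device name from the error message above):
#   udevadm info --query=property --name=/dev/mapper/data | has_systemd_tag \
#       && echo "tagged" || echo "not tagged"
```

If the tag is missing, or the device is marked as not ready, systemd has no reason to consider dev-mapper-data.device started, which would match the timeout seen at boot.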

TomWij
Retired Dev

Joined: 04 Jul 2012
Posts: 1553

Posted: Sun Jun 23, 2013 10:04 am

Since this has to do with the presence of that device node, could one of these kernel options be involved? CONFIG_TMPFS, CONFIG_DEVTMPFS, CONFIG_DEVTMPFS_MOUNT

I don't know more about /dev/mapper specifically. The sanest approach seems to be fixing this in the initramfs so you don't have to create extra configuration, though that may also be the hardest approach, so it may be more feasible to somehow do this from systemd instead.

yoshi314
l33t

Joined: 30 Dec 2004
Posts: 850
Location: PL

Posted: Sun Jun 23, 2013 2:32 pm

There is a lvm bug with systemd, search gentoo bugzilla for patches to it. It is relevant for latest lvm (do not know how to paste on tablet)
_________________
~amd64
shrink your /usr/portage with squashfs+aufs

TomWij
Retired Dev

Joined: 04 Jul 2012
Posts: 1553

Posted: Sun Jun 23, 2013 3:17 pm

yoshi314 wrote:
There is a lvm bug with systemd, search gentoo bugzilla for patches to it. It is relevant for latest lvm (do not know how to paste on tablet)


Has this bug been reported at https://bugs.gentoo.org?

ulenrich
Veteran

Joined: 10 Oct 2010
Posts: 1480

Posted: Sun Jun 23, 2013 9:15 pm

At last lvm with systemd got some love:
https://bugs.gentoo.org/show_bug.cgi?id=453594#c26
https://bugs.gentoo.org/show_bug.cgi?id=473298#c13

But I am aware upstream systemd developers think of lvm as an unwanted extra layer since btrfs features it all.

yoshi314
l33t

Joined: 30 Dec 2004
Posts: 850
Location: PL

Posted: Mon Jun 24, 2013 7:16 pm

ulenrich wrote:
At last lvm with systemd got some love:
https://bugs.gentoo.org/show_bug.cgi?id=453594#c26
https://bugs.gentoo.org/show_bug.cgi?id=473298#c13

But I am aware upstream systemd developers think of lvm as an unwanted extra layer since btrfs features it all.

The first link is the bug I meant. Without the fix, systemd times out waiting for dm device nodes, and even if you create them by hand it behaves erratically.
_________________
~amd64
shrink your /usr/portage with squashfs+aufs

Helix
n00b

Joined: 09 Jun 2005
Posts: 27

Posted: Tue Jun 25, 2013 5:02 pm

Alright guys, thanks for all the help, I made some headway with the problem. Two steps improved the situation:

1) Deleting the /dev/dm-* nodes from the initramfs, apparently this created them with permissions unusable for udev later on.
2) Changing mount points to use the /dev/dm-* files instead of the names from /dev/mapper/ that I handed to cryptsetup.

Unfortunately, I don't think it is very practical to use the /dev/dm-* nodes as those might change between boots.

So, I am giving Dracut a shot, as I suspect I will not get proper udev notifications from the initramfs unless I use udev/systemd there, and Dracut is probably easier to use for that than trying to build that myself. However, I don't seem to be getting the hang of the configuration. All I need it to do is this:

0. Prepare initramfs
1. Read passphrase
2. Decode 3 files using GnuPG (there is a crypt-gpg module which I believe is for this) with the same passphrase.
3. Feed decoded file contents to 3 cryptsetup instances and map /dev/sda[1|2|3] as /dev/mapper/[1|2|3]
4. Mount /dev/mapper/1 as /
5. Continue

Step 1-3 is due to an old Gentoo initramfs init script that advocated feeding a passphrase to GnuPG which then decoded the actual cryptsetup key from a file. The advantage back then was that one could use a long random key with LUKS (in the encrypted file) and rely on the much more tried and mature GnuPG as this shifted the security focus from the LUKS passphrase to the GnuPG passphrase. The safeguard was that possible design-flaws in LUKS would be offset by using a long and random passphrase. As cryptsetup LUKS has matured since then and gotten some review, it would probably be possible to safely dispense with the additional GnuPG layer by now, but I'd rather keep the setup as is for now. Finally the same GnuPG passphrase is used for 3 different GnuPG encrypted files containing different random keys for 3 different partitions.

Any pointers on how to achieve this with Dracut ? Pseudocode for Hookscripts (if needed) would also suffice I believe. I just don't know where to start right now.

Thanks a lot guys,
kind regards,
Helix
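
One possible shape for such a hook script, as a rough sketch only. It is not dracut's crypt-gpg module: the function name unlock_all, the KEYDIR location and the N.gpg key file names are invented for illustration. It assumes gpg accepts the passphrase on stdin via --passphrase-fd 0 in batch mode (newer GnuPG 2.x releases may additionally need --pinentry-mode loopback) and that cryptsetup reads the key from stdin with --key-file -.

```shell
#!/bin/sh
# Sketch only: unlock /dev/sda1..3 with keys stored in GnuPG-encrypted files,
# all protected by the same passphrase (steps 1-3 of the list above).
# KEYDIR and the N.gpg file names are assumptions; adjust to your layout.
GPG="${GPG:-gpg}"
CRYPTSETUP="${CRYPTSETUP:-cryptsetup}"
KEYDIR="${KEYDIR:-/etc/keys}"

# unlock_all PASSPHRASE
unlock_all() {
    for n in 1 2 3; do
        # gpg reads the passphrase on stdin and writes the decrypted key to
        # stdout; cryptsetup takes that key from its own stdin (--key-file -)
        # and maps /dev/sdaN as /dev/mapper/N.
        printf '%s' "$1" |
            "$GPG" --batch --quiet --passphrase-fd 0 --decrypt "$KEYDIR/$n.gpg" |
            "$CRYPTSETUP" --key-file - luksOpen "/dev/sda$n" "$n" || return 1
    done
}
```

In the initramfs this would be called once after reading the passphrase without echo (for example stty -echo; read -r pass; stty echo; unlock_all "$pass"), followed by mounting /dev/mapper/1 as the new root. Note that only the exit status of cryptsetup is checked, so a gpg failure shows up as cryptsetup rejecting an empty key.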

ulenrich
Veteran

Joined: 10 Oct 2010
Posts: 1480

Posted: Tue Jun 25, 2013 6:32 pm

qlist dracut|grep gpg
will find the hook script for you if you emerged dracut with
DRACUT_MODULES="crypt-gpg ..."

KShots
Guru

Joined: 09 Oct 2003
Posts: 590
Location: Florida

Posted: Wed Nov 27, 2013 1:10 am

Exact same problem here, although with the newest lvm, I can't even see my device nodes without running 'vgscan --mknodes --ignorelockingfailure' in an infinite loop until the nodes show up. In my case, I think I'm going to try and write something up that can search the /sys filesystem for the dm-? nodes by the name given. Thanks for all the research you did on the problem, I'll post what I come up with when I work through it.

EDIT: For starters, it's easy enough to do this:

Code:
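# Find the dm-N node whose device-mapper name matches ${partname} and mount it
for node in /sys/block/dm-* ; do
   if [ "$(cat "${node}/dm/name")" = "${partname}" ] ; then
      # ${node##*/} strips the /sys/block/ prefix, leaving e.g. dm-0
      mount -t "${parttype}" -o "${partargs}" "/dev/${node##*/}" "${partdestination}"
   fi
done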

... but that's only half the equation. Now you need to edit the fstab stored on the root. Perhaps some sed-foo?
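
For the sed part, a hedged sketch of what such a rewrite could look like. The function name fstab_use_dm_node is hypothetical, and it assumes the mapper name and node name contain no characters that are special to sed. Since the dm-N numbering may change between boots, as Helix noted earlier, this would have to run on every boot, after the lookup loop above has found the current node.

```shell
# fstab_use_dm_node FILE NAME DMNODE
# Prints FILE with a leading /dev/mapper/NAME replaced by /dev/DMNODE.
# The match is anchored to the start of the line and to the whitespace after
# the name, so "data" does not also hit "database".
fstab_use_dm_node() {
    sed "s|^/dev/mapper/$2\([[:space:]]\)|/dev/$3\1|" "$1"
}

# Example: preview the rewritten file
#   fstab_use_dm_node /etc/fstab data dm-1
```

The function writes to stdout instead of editing in place, so the result can be inspected or written to a temporary file and moved over the original.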
_________________
Life without passion is death in disguise

hika
Apprentice

Joined: 13 Mar 2009
Posts: 234
Location: Utrecht

Posted: Wed Dec 04, 2013 8:50 pm

I'm currently trying to go over to systemd and have a similar problem with lvm.
With the genkernel initramfs I get nothing.
With dracut I get only my usr volume activated and mounted and it is also the only dm-* node in /dev/ and /proc/partitions. If, after I come to the emergency shell, I run lvs I see the rest but they stay inactive. I also get a comment about lvmetad:
Code:
WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!

When I run vgchange -ay the rest all get mounted and I can continue to default.
So it seems systemd fails to activate them. I'm going to look if I can insert "vgchange -ay" somewhere.

I also have a problem with my nfs mounts. They don't get mounted. I enabled rpcbind, rpc-mountd and rpc-statd and I can mount them manually.
I guess this has to do with the network coming up after mounting. I anyhow need the network up as soon as possible to be able to access my ldap. (among others dbus complains). I had net.enp4s0 in the boot runlevel and didn't use NetworkManager. How do I activate the network in the bootlevel?

Hika

KShots
Guru

Joined: 09 Oct 2003
Posts: 590
Location: Florida

Posted: Wed Dec 04, 2013 9:10 pm

I resolved my dracut issues with lvm and systemd. I had to activate all the volumes as kernel parameters, like so:
Code:
rd.lvm.lv=<logical volume name>
   only activate the logical volumes with the given name. rd.lvm.lv can be specified multiple times on the kernel command line.
I also found the following parameter extremely helpful:
Code:
rdbreak=[pre-udev|pre-mount|mount|pre-pivot|]
   drop the shell on defined breakpoint
With this, I was able to access the dracut emergency shell at various points of its boot cycle, which showed me that dracut was successfully mounting / and /usr (lvm), but was not activating /var, /opt, nor /home, and neither was systemd. So, I already had the following parameter in my kernel parameters:
Code:
rd.lvm.lv=vg/usr
... so I added the following:
Code:
rd.lvm.lv=vg/usr rd.lvm.lv=vg/var rd.lvm.lv=vg/opt rd.lvm.lv=vg/home
... and it booted with all volumes activated and systemd was happy. I could probably do the following (untested):
Code:
rd.lvm.vg=<volume group name>
   only activate the volume groups with the given name. rd.lvm.vg can be specified multiple times on the kernel command line.
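
The repeated rd.lvm.lv= parameters shown above get tedious with many volumes. A small sketch that builds the string for the kernel command line; rd_lvm_lv_args is a made-up helper, and the volume group and volume names are just the ones from this post.

```shell
# rd_lvm_lv_args VG LV...
# Prints one rd.lvm.lv=VG/LV parameter per logical volume, space-separated.
rd_lvm_lv_args() {
    vg="$1"; shift
    out=""
    for lv in "$@"; do
        # ${out:+$out } adds a separating space only after the first entry
        out="${out:+$out }rd.lvm.lv=$vg/$lv"
    done
    printf '%s\n' "$out"
}

# Example:
#   rd_lvm_lv_args vg usr var opt home
```

The output is meant to be pasted into the bootloader's kernel command line; the helper itself does not touch any configuration.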
I still got the message you talk about:
hika wrote:
Code:
WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
... but it appears to be running. I did note an option inside of lvm.conf that seems to talk about this:
Code:
    # Whether to use (trust) a running instance of lvmetad. If this is set to
    # 0, all commands fall back to the usual scanning mechanisms. When set to 1
    # *and* when lvmetad is running (it is not auto-started), the volume group
    # metadata and PV state flags are obtained from the lvmetad instance and no
    # scanning is done by the individual commands. In a setup with lvmetad,
    # lvmetad udev rules *must* be set up for LVM to work correctly. Without
    # proper udev rules, all changes in block device configuration will be
    # *ignored* until a manual 'vgscan' is performed.
    use_lvmetad = 0
I'd suspect you'd tell dracut to include lvm.conf (dracut.conf or --lvmconf to dracut command) and set use_lvmetad to 1 to get rid of this, but I have yet to test this.

Also, for those using lvm in a custom-built initramfs, I found lots of references to a "lvmwait" kernel parameter, which apparently is used as a hack to get around the original issue I had in this thread. Reference here.

Also, for your nfs issues, are you using nfs3 or nfs4? From what I've seen, nfs4 isn't supported by systemd unit files (at least the stock ones). It has nothing to start nfsidmapd, rpc.gssd, or rpc.svcgssd. I found a wiki full of gentoo systemd unit files helpful for creating unit files for these services, but I can't seem to locate that wiki anymore :(.
_________________
Life without passion is death in disguise

hika
Apprentice

Joined: 13 Mar 2009
Posts: 234
Location: Utrecht

Posted: Wed Dec 04, 2013 10:51 pm

I found the simple solution:
in /etc/lvm/lvm.conf
change under global
Code:
use_lvmetad=0

to
Code:
use_lvmetad=1
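
The same change can be scripted. A minimal sketch, assuming an lvm2 version whose lvm.conf has this option (hika mentions 2.02.103 below); enable_lvmetad is an invented name, and the function prints the modified file rather than editing it in place.

```shell
# enable_lvmetad FILE
# Prints FILE with "use_lvmetad = 0" switched to 1. Only an actual setting
# line is changed; comment lines that mention the option are left alone.
enable_lvmetad() {
    sed 's/^\([[:space:]]*use_lvmetad[[:space:]]*=[[:space:]]*\)0/\11/' "$1"
}

# Example: write the result to a temporary file, inspect it, then replace
# the original by hand
#   enable_lvmetad /etc/lvm/lvm.conf > /tmp/lvm.conf.new
```

As the lvm.conf comment quoted earlier in the thread points out, lvmetad itself has to be running and the lvmetad udev rules have to be in place for this setting to help.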


Now getting my network up early!!

Hika

hika
Apprentice

Joined: 13 Mar 2009
Posts: 234
Location: Utrecht

Posted: Wed Dec 04, 2013 10:53 pm

Oh by the way. I think you need lvm2-2.02.103 because with 2.02.97-r1 it is not in the man-page.

Hika

hika
Apprentice

Joined: 13 Mar 2009
Posts: 234
Location: Utrecht

Posted: Wed Dec 04, 2013 11:01 pm

About nfs. I think it still uses nfs3, for I never activated 4 on my server explicitly.
The issue is that the nfs mounts already fail during sysinit, before the network is up. It does not distinguish between local mounts and nfs mounts in fstab. Previously, nfsmount took care of mounting the nfs volumes at the right time (after the network is up).

Hika

hika
Apprentice

Joined: 13 Mar 2009
Posts: 234
Location: Utrecht

Posted: Wed Dec 04, 2013 11:54 pm

It's the service NetworkManager-wait-online instead of just NetworkManager.
Why not by default include this??

But for me it now seems all to work. Except for that I want network up earlier.

Hika

Juan Facundo
Tux's lil' helper

Joined: 19 Jun 2009
Posts: 138

Posted: Fri May 22, 2015 2:21 am

hika wrote:
I found the simple solution:
in /etc/lvm/lvm.conf
change under global
Code:
use_lvmetad=0

to
Code:
use_lvmetad=1


Now getting my network up early!!

Hika


+1... it solved my issue.. thanks.