Gentoo Forums
wireguard stopped connecting, again! [solved]
jesnow
l33t


Joined: 26 Apr 2006
Posts: 856

Posted: Tue Feb 06, 2024 7:22 pm    Post subject: wireguard stopped connecting, again! [solved]

Wireguard just stopped connecting for no obvious reason. It was working fine yesterday, and today it doesn't. I have two issues with this.

First issue: Wireguard isn't working.

Wireguard stopped working. I can't do anything from the wireguard interface. I have to enable debugging messages using:

Code:
echo module wireguard +p > /sys/kernel/debug/dynamic_debug/control


Then I get super cryptic messages like:

Code:

Feb  6 12:53:16 pogacar kernel: wireguard: wg0: Sending handshake initiation to peer 1 (104.176.81.55:51820)
Feb  6 12:53:21 pogacar kernel: wireguard: wg0: Handshake for peer 1 (104.176.81.55:51820) did not complete after 20 attempts, giving up
Feb  6 12:53:36 pogacar kernel: wireguard: wg0: Sending handshake initiation to peer 1 (104.176.81.55:51820)
Feb  6 12:53:41 pogacar kernel: wireguard: wg0: Handshake for peer 1 (104.176.81.55:51820) did not complete after 5 seconds, retrying (try 2)
Feb  6 12:53:41 pogacar kernel: wireguard: wg0: Sending handshake initiation to peer 1 (104.176.81.55:51820)
Feb  6 12:53:43 pogacar kernel: wireguard: wg0: No peer has allowed IPs matching 239.255.255.250
...


Here is the /etc/wireguard/wg0.conf:

Code:

# Pogacar
Address = 10.0.17.2/32
#SaveConfig = true
#PostUp = iptables -A FORWARD -i wg0 -j ACCEPT; iptables -t nat -A POSTROUTING -o en01 -j MASQUERADE;
#PostDown = iptables -D FORWARD -i wg0 -j ACCEPT; iptables -t nat -D POSTROUTING -o en01 -j MASQUERADE;
ListenPort = 51820
PrivateKey = =
# public key: =

[Peer]
#Merckx
PublicKey =
AllowedIPs = 10.0.17.1/32
Endpoint = merckx.vesarius.net:51820
PersistentKeepalive = 30


It was all working yesterday and I don't know what changed. I did *not* update anything.

I had this exact problem a month ago with another machine. At that time I thought it started after a world update and went down several rabbit holes, but that was not the cause. In fact I could find no reason for the failure, and a couple of weeks later, after the whole network was powered down unexpectedly, it all started working again.

https://forums.gentoo.org/viewtopic-t-1166203-highlight-.html

I'm as much at a loss now as I was then. At least I have other machines I can use.

I've already verified all the keys, checked that the network cable is plugged in, rebooted (I'm a former mac user, we do that to solve problems), *cold* booted. Last time when it "just started working again", I had not touched the keys and I'm not doing it now. They are verified.

Issue #2: Silent blocking fail is no bueno

Wireguard apparently thinks it's OK to be starting up and not able to connect to the peers in its list. You have to do extra stuff (see above) to even be sure wireguard isn't working. But if I have network shares mounted over wireguard (why the hell else would I even have it?) wireguard throws no error when it can't connect. Instead, it blithely keeps going and lets nfs be the culprit that *stops my boot process*. So I can't boot all the way to a login prompt if wireguard is not working.

This is a bad failure mode. So what can I do to prevent wireguard from holding up the entire boot sequence?

Thanks for any insight,
Jon.


Last edited by jesnow on Wed Feb 07, 2024 7:38 pm; edited 1 time in total
pingtoo
l33t


Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Tue Feb 06, 2024 9:08 pm

jesnow,

There are two parts to this: user expectation and technical error.

User expectation,

From your post it seems you expect that WireGuard will always work and that, when it does not, it will notify you and propagate the error to the entire system. But handling network errors is reactive, and Gentoo has no default setup for handling them.

I am no expert in setting up WireGuard, so I am referring to general network setup. In a general network (wired or wireless), connectivity is designed layer by layer. The bottom layer is physical (wire, power, etc.). Once the bottom layer is powered up and the wire is connected at each end, it is up to the upper layers to determine whether they can talk to each other. Each layer defines its own conditions for error correction and recovery; some lower layers are handled in the kernel and some upper layers in user space.

Layers in the kernel can only provide notification in the form of events, and capturing those events is the job of something like syslog or a program that uses a kernel API to capture and react.

So if a connection is critical, you will need to set up tool(s) to poll or probe the lower layers in the kernel to see whether the kernel has generated event(s) indicating its current condition.

I suspect WireGuard has some control parameters that set the number of connection retries and fail the connection when the retries exceed that setting.

As for the NFS file share failure: NFS is an upper-layer protocol that depends on lower-layer network connectivity to function. In this case it is a victim of WireGuard not indicating that the connection is lost, so it waits for WireGuard's own recovery before deciding whether any error occurred.

Technical error,

Because network connectivity depends on many predictable and unpredictable things, it is very hard to say where something went wrong.

But the error messages from your debugging session seem to indicate that WireGuard lost its session and is trying to re-establish the connection but failing. The message
Code:
Feb  6 12:53:16 pogacar kernel: wireguard: wg0: Sending handshake initiation to peer 1 (104.176.81.55:51820)
Feb  6 12:53:21 pogacar kernel: wireguard: wg0: Handshake for peer 1 (104.176.81.55:51820) did not complete after 20 attempts, giving up
This could have many causes. For example, there may be some sort of problem at the receiving endpoint so that it is not actually reachable at the lower layers. Without network packet analysis we will never know.

However, there is one quick thing we can verify: can you confirm from your existing settings that 104.176.81.55 is the IP address of the DNS name "merckx.vesarius.net"? Since DNS can easily fool people, please perform the DNS check from two different systems, one from your WireGuard endpoint and the other from a different node, to confirm that both return the same IP address for the name. (I have been fooled many times by stale DNS names, so I just want to be sure.)
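
A minimal sketch of that DNS check in shell, assuming `getent` and the `wg` tool from net-vpn/wireguard-tools are available and that `wg` is run as root. The function names `endpoint_ip` and `check_endpoint` are made up for illustration, and the port stripping only handles the IPv4 `address:port` form.

```shell
#!/bin/sh
# Strip the port from an "address:port" endpoint string (IPv4 form only).
endpoint_ip() {
    printf '%s\n' "${1%:*}"
}

# Compare the address DNS returns now with the endpoint the tunnel uses.
# $1 = resolved address, $2 = endpoint as reported by wg (address:port)
check_endpoint() {
    if [ "$1" = "$(endpoint_ip "$2")" ]; then
        echo "match"
    else
        echo "MISMATCH: DNS says $1, tunnel uses $2"
    fi
}

# Possible usage on the WireGuard client (run as root):
#   resolved=$(getent hosts merckx.vesarius.net | awk '{print $1; exit}')
#   current=$(wg show wg0 endpoints | awk '{print $2; exit}')
#   check_endpoint "$resolved" "$current"
```

Running the same `getent` lookup on a second machine and comparing the two results covers the "two different systems" part of the check.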
Code:

Feb  6 12:53:36 pogacar kernel: wireguard: wg0: Sending handshake initiation to peer 1 (104.176.81.55:51820)
Feb  6 12:53:41 pogacar kernel: wireguard: wg0: Handshake for peer 1 (104.176.81.55:51820) did not complete after 5 seconds, retrying (try 2)
"did not complete after 5 seconds" usually indicates an error at a lower network layer, in this case the IP/UDP transport that WireGuard runs over (not always, but most of the time). So again check DNS, ping the receiving end, or use an alternative connection method to verify, before making changes to the WireGuard configuration.

Debugging and aftermath configuration,

NFS, "timeo=n": you can try reducing it from its default of 600 (60 seconds) to, for example, 50 (5 seconds). This sets how soon an NFS request is retried. Together with "retrans=n" (TCP default 2), it gets the NFS layer to emit a "server not responding" message, which gives a monitoring tool the opportunity to react.

Adding the "soft" option to the NFS mount lets applications that use the NFS share receive an OS error indicating that the file system is not functioning. Without "soft", or with "hard", the retries continue indefinitely.
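
As a rough illustration of those options, here is a small shell sketch that builds the option string from a timeout given in seconds. The helper name `nfs_opts`, the export path and the mount point are hypothetical; only `soft`, `timeo` and `retrans` come from the advice above.

```shell
#!/bin/sh
# Build an NFS option string from a timeout in seconds and a retry count.
# timeo is expressed in tenths of a second, so 5 seconds becomes timeo=50.
nfs_opts() {
    secs=$1
    retries=$2
    echo "soft,timeo=$((secs * 10)),retrans=${retries}"
}

# Possible usage (hypothetical export path and mount point), run as root:
#   mount -t nfs -o "$(nfs_opts 5 2)" merckxw:/export /mnt/merckx
```

The same option string can go in the options column of the matching /etc/fstab entry.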

WireGuard: I cannot find a WireGuard monitoring mechanism online that uses the WireGuard configuration, so you may already be using the best setup (echo module wireguard +p ....).

For the long term you may want to set up a monitoring tool like net-analyzer/snort and/or Prometheus to help with quick diagnosis.
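
One possible lightweight probe, not mentioned above, is the `latest-handshakes` subcommand of `wg show`, which prints one peer public key and one Unix timestamp per line (0 if there has never been a handshake). The sketch below is only an assumption about how such a check could look; the function name `stale_peers` and the 180 second threshold are arbitrary.

```shell
#!/bin/sh
# Read "peer<TAB>epoch" lines (the format of `wg show wg0 latest-handshakes`)
# on stdin and report peers whose last handshake is missing or too old.
# $1 = current time as a Unix epoch, $2 = maximum age in seconds
stale_peers() {
    awk -v now="$1" -v max="$2" '
        $2 == 0        { print $1, "never"; next }
        now - $2 > max { print $1, "stale" }
    '
}

# Possible usage, run as root (180 s is an arbitrary threshold):
#   wg show wg0 latest-handshakes | stale_peers "$(date +%s)" 180
```

An empty result means every peer has completed a handshake recently; any output could be logged or used to skip the NFS mounts.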
jesnow
l33t


Joined: 26 Apr 2006
Posts: 856

Posted: Wed Feb 07, 2024 12:40 am

Thanks!

I do know that all the other parts of the system (DNS, etc) are working correctly and that the remote host is available.

The host merckx is in my /etc/hosts, and while it is behind a firewall and can't be pinged, it can be reached over ssh:


Code:

jesnow@pogacar ~ $ ssh merckx
Last login: Tue Feb  6 16:31:33 2024 from 130.39.188.145
jesnow@merckx ~ $


I can go a step further and use a tunnel to access the nfs mounts on merckx, or use samba. This is the fallback I have used previously to get around this problem with wireguard.

My expectation is that when wireguard experiences an error condition like a failed handshake, it should log it. I could put the debugging command in a startup script, I suppose. But what failed about the handshake? Was there any communication at all? Wireguard is very cryptic about this.

Meanwhile, in the next office with an identical setup the host vanaert is working perfectly. I use the host name with a w appended to point to the wireguard tunnel, so it's "merckx" for straight ssh and "merckxw" for the tunnel connection.

Code:


jesnow@vanaert ~ $ ping merckxw
PING merckxw (10.0.17.1) 56(84) bytes of data.
64 bytes from merckxw (10.0.17.1): icmp_seq=1 ttl=64 time=30.2 ms
64 bytes from merckxw (10.0.17.1): icmp_seq=2 ttl=64 time=30.4 ms
64 bytes from merckxw (10.0.17.1): icmp_seq=3 ttl=64 time=30.5 ms
64 bytes from merckxw (10.0.17.1): icmp_seq=4 ttl=64 time=30.5 ms
64 bytes from merckxw (10.0.17.1): icmp_seq=5 ttl=64 time=29.9 ms
64 bytes from merckxw (10.0.17.1): icmp_seq=6 ttl=64 time=29.8 ms
^C
--- merckxw ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 5008ms
rtt min/avg/max/mdev = 29.774/30.195/30.484/0.269 ms
jesnow@vanaert ~ $ ssh merckxw
Last login: Tue Feb  6 16:31:33 2024 from 130.39.188.145
jesnow@merckx ~ $



This is what pogacar is supposed to be doing: what it has been doing, and did successfully even last December when vanaert suddenly stopped connecting.


As for issue 2: wg is either a system resource, in which case it should report its activity to syslog, or it's an add-on, in which case it should have its own log. Either way I should be able to know what wireguard is doing. I don't like how silent but deadly (pun intended) its failures are.


Cheers,
Jon.
jesnow
l33t


Joined: 26 Apr 2006
Posts: 856

Posted: Wed Feb 07, 2024 7:37 pm

I have solved this problem.

If you have rebuilt the kernel, you must re-emerge net-vpn/wireguard-tools. This installs the encryption keys for the new kernel's modules to use. You must then reboot. I was not able to get it to work using rmmod and restarting the init script. I won't get caught by this again!

Could we please add this to the gentoo wiki regarding wireguard?

I seem to recall there is an automatic list that prompts you to update packages that depend on a specific kernel build (as nvidia-drivers does).

Cheers,
Jon.
Hu
Moderator


Joined: 06 Mar 2007
Posts: 21637

Posted: Wed Feb 07, 2024 8:20 pm

The set @module-rebuild should refer to installed ebuilds that installed kernel modules. From looking at the ebuild for net-vpn/wireguard-tools, it does not look to me like it builds any kernel modules. On a system where you had this problem, what is the output of zgrep WIREGUARD /proc/config.gz ; emerge --pretend --verbose net-vpn/wireguard-tools net-vpn/wireguard-modules ; equery f net-vpn/wireguard-tools net-vpn/wireguard-modules?
jesnow
l33t


Joined: 26 Apr 2006
Posts: 856

Posted: Wed Feb 07, 2024 11:40 pm

I think you're right: I think the modules are in the kernel and they are looking for key signatures of some kind; wireguard-tools does this at install time and at key generation time. I don't know the details, but that's why both methods work.

Now that I know what to look for I'm finding out a lot more.

Exactly. emerge @module-rebuild *should* work too, but I thought I did that.

Code:

vanaert jesnow # grep wireg /lib/modules/6.6.6-calculate/modules.alias
alias net-pf-16-proto-16-family-wireguard wireguard
alias rtnl-link-wireguard wireguard


I don't have wireguard-modules installed as they are now in the kernel sources.
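
The grep above checks one hard-coded kernel version. After a kernel rebuild it may be worth ruling out a mismatch between the running kernel and its module tree, so here is a minimal sketch that runs the same check against an arbitrary release. The function name `has_wireguard_alias` is made up, and the sketch assumes CONFIG_WIREGUARD=m as in this thread (a built-in driver would not appear in modules.alias).

```shell
#!/bin/sh
# Report whether the module tree for a given kernel release lists wireguard.
# $1 = kernel release, $2 = modules root (defaults to /lib/modules)
has_wireguard_alias() {
    aliases="${2:-/lib/modules}/$1/modules.alias"
    if [ ! -r "$aliases" ]; then
        echo "no module tree for $1"
    elif grep -q ' wireguard$' "$aliases"; then
        echo "wireguard module known to $1"
    else
        echo "no wireguard alias for $1"
    fi
}

# Possible usage for the running kernel:
#   has_wireguard_alias "$(uname -r)"
```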

Hu wrote:
The set @module-rebuild should refer to installed ebuilds that installed kernel modules. From looking at the ebuild for net-vpn/wireguard-tools, it does not look to me like it builds any kernel modules. On a system where you had this problem, what is the output of zgrep WIREGUARD /proc/config.gz ; emerge --pretend --verbose net-vpn/wireguard-tools net-vpn/wireguard-modules ; equery f net-vpn/wireguard-tools net-vpn/wireguard-modules?


Code:

vanaert jesnow # zgrep WIREGUARD /proc/config.gz ; emerge --pretend --verbose net-vpn/wireguard-tools net-vpn/wireguard-modules ; equery f net-vpn/wireguard-tools net-vpn/wireguard-modules
CONFIG_WIREGUARD=m
# CONFIG_WIREGUARD_DEBUG is not set

Local copy of remote index is up-to-date and will be used.

Local copy of remote index is up-to-date and will be used.

These are the packages that would be merged, in order:

Calculating dependencies... done!
Dependency resolution took 1.44 s (backtrack: 0/20).

[binary   R    ] net-vpn/wireguard-tools-1.0.20210914::gentoo  USE="wg-quick (-selinux)" 0 KiB
[ebuild  N     ] net-vpn/wireguard-modules-1.0.20220627-r1::gentoo  USE="module strip -debug -dist-kernel -module-src -modules-compress -modules-sign" 258 KiB

Total: 2 packages (1 new, 1 reinstall, 1 binary), Size of downloads: 258 KiB
 * Searching for wireguard-tools in net-vpn ...
 * Contents of net-vpn/wireguard-tools-1.0.20210914:
/etc
/etc/init.d
/etc/init.d/wg-quick
/etc/wireguard
/lib
/lib/systemd
/lib/systemd/system
/lib/systemd/system/wg-quick.target
/usr
/usr/bin
/usr/bin/wg
/usr/bin/wg-quick
/usr/share
/usr/share/bash-completion
/usr/share/bash-completion/completions
/usr/share/bash-completion/completions/wg
/usr/share/bash-completion/completions/wg-quick
/usr/share/doc
/usr/share/doc/wireguard-tools-1.0.20210914
/usr/share/doc/wireguard-tools-1.0.20210914/README.md.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/dns-hatchet
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/dns-hatchet/README.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/dns-hatchet/apply.sh.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/dns-hatchet/hatchet.bash.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/embeddable-wg-library
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/embeddable-wg-library/Makefile
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/embeddable-wg-library/README.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/embeddable-wg-library/test.c.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/embeddable-wg-library/wireguard.c.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/embeddable-wg-library/wireguard.h.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests/go
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests/go/main.go.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests/haskell
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests/haskell/Setup.hs
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests/haskell/package.yaml.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests/haskell/src
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests/haskell/src/Data
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests/haskell/src/Data/Time
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests/haskell/src/Data/Time/TAI64.hs.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests/haskell/src/Main.hs.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests/haskell/stack.yaml
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests/python
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests/python/main.py.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests/rust
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests/rust/Cargo.toml.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests/rust/src
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/external-tests/rust/src/main.rs.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/extract-handshakes
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/extract-handshakes/Makefile.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/extract-handshakes/README.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/extract-handshakes/extract-handshakes.sh.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/extract-handshakes/offset-finder.c.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/extract-keys
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/extract-keys/Makefile.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/extract-keys/README.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/extract-keys/config.c.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/extract-keys/extract-keys.c.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/highlighter
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/highlighter/Makefile.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/highlighter/README.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/highlighter/fuzz.c.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/highlighter/gui
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/highlighter/gui/highlight.cpp.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/highlighter/gui/highlight.pro
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/highlighter/highlight.c.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/highlighter/highlighter.c.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/highlighter/highlighter.h.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/json
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/json/README
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/json/wg-json.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/keygen-html
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/keygen-html/README.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/keygen-html/keygen.html
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/keygen-html/wireguard.js
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/launchd
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/launchd/README.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/launchd/com.wireguard.wg0.plist.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/nat-hole-punching
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/nat-hole-punching/README.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/nat-hole-punching/nat-punch-client.c.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/nat-hole-punching/nat-punch-server.c.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/ncat-client-server
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/ncat-client-server/README.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/ncat-client-server/client-quick.sh.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/ncat-client-server/client.sh.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/ncat-client-server/server.sh.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/reresolve-dns
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/reresolve-dns/README.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/reresolve-dns/reresolve-dns.sh.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/sticky-sockets
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/sticky-sockets/README.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/sticky-sockets/sticky-sockets.c.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/synergy
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/synergy/README.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/synergy/synergy-client.sh.bz2
/usr/share/doc/wireguard-tools-1.0.20210914/contrib/synergy/synergy-server.sh.bz2
/usr/share/man
/usr/share/man/man8
/usr/share/man/man8/wg-quick.8.bz2
/usr/share/man/man8/wg.8.bz2

!!! No installed packages matching 'net-vpn/wireguard-modules'
 * Searching for wireguard-modules in net-vpn ...
vanaert jesnow #


