Gentoo Forums
DNS Name Resolution stops after delay
dudleyd
n00b
Joined: 03 Apr 2023
Posts: 15
Location: Michigan

Posted: Mon Apr 03, 2023 3:03 pm    Post subject: DNS Name Resolution stops after delay

I'm trying to rescue a system from Debian and update it to Gentoo.

This is a 16-core 4500MHz AMD Ryzen 9 system I plan on using as a database server in the future.

System runs under systemd.

When I first boot the system, everything is great. Websites react as expected (seems to me a little slow compared to Debian, but no complaints).

After running the system for 10-30 minutes (depending on what is active during that time), networking is still active, and I have no problem pinging a host by its IPv4 address.

However, if I try to ping a domain name (even a well-known one), I get a name lookup failure after a few minutes.

Running 'systemctl restart systemd-resolved.service' fixes the problem for another few minutes, but then resolution stops again.


Thanks for any advice or suggestions.
_________________
David Dudley
Facility for Rare Isotope Beamlines
Michigan State University
alamahant
Advocate
Joined: 23 Mar 2019
Posts: 3879

Posted: Mon Apr 03, 2023 3:37 pm

Please post
Code:

ls -l /etc/resolv.conf
cat /etc/resolv.conf

What kind of network manager do you use?
NM or systemd-networkd or other?
Please post
Code:

ls /etc/sysconfig/network-scripts

find the relevant iface file and post its contents.
If you are really desperate, just stop, disable, and mask systemd-resolved, and then
Code:

rm /etc/resolv.conf
echo "nameserver 1.1.1.1" > /etc/resolv.conf

Pick your preferred DNS server or your router's IP.
_________________
:)
dudleyd
n00b
Joined: 03 Apr 2023
Posts: 15
Location: Michigan

Posted: Tue Apr 04, 2023 1:20 am

OK, let's start with this:

This is a systemd system. If it's supposed to have a '/etc/sysconfig' directory, I've missed something.

The installation instructions are a little weak on systemd configuration.

dudleyd@monster ~ $ ls -l /etc/resolv.conf
lrwxrwxrwx 1 root root 32 Apr 3 17:31 /etc/resolv.conf -> /run/systemd/resolve/resolv.conf
dudleyd@monster ~ $ cat /etc/resolv.conf
# This is /run/systemd/resolve/resolv.conf managed by man:systemd-resolved(8).
# Do not edit.
#
# This file might be symlinked as /etc/resolv.conf. If you're looking at
# /etc/resolv.conf and seeing this text, you have followed the symlink.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known uplink DNS servers. This file lists all configured search domains.
#
# Third party programs should typically not access this file directly, but only
# through the symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a
# different way, replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 192.168.12.1
nameserver fe80::ca99:b2ff:feee:bef5%2
search .
dudleyd@monster ~ $ ls /etc/sysconfig
ls: cannot access '/etc/sysconfig': No such file or directory

I don't have a directory called sysconfig. Is that something that initrs has, or do I have something mis-configured?

I have a feeling that this problem has something to do with IPv6 support, which I'm not sure my switch supports well. How do I disable it till I figure this out?
dudleyd
n00b
Joined: 03 Apr 2023
Posts: 15
Location: Michigan

Posted: Tue Apr 04, 2023 1:22 am

Some more information on this system:

dudleyd@monster ~ $ emerge --info
Portage 3.0.44 (python 3.10.10-final-0, default/linux/amd64/17.1/desktop/plasma/systemd, gcc-12, glibc-2.36-r7, 6.1.19-gentoo-x86_64 x86_64)
=================================================================
System uname: Linux-6.1.19-gentoo-x86_64-x86_64-AMD_Ryzen_9_7950X_16-Core_Processor-with-glibc2.36
KiB Mem: 64974564 total, 56870516 free
KiB Swap: 134217724 total, 134217724 free
Timestamp of repository gentoo: Mon, 03 Apr 2023 22:30:01 +0000
Head commit of repository gentoo: 14b1ba02530944e4c96e7f9ed5ffa0c54bc7670d
sh bash 5.1_p16-r2
ld GNU ld (Gentoo 2.39 p5) 2.39.0
app-misc/pax-utils: 1.3.5::gentoo
app-shells/bash: 5.1_p16-r2::gentoo
dev-lang/perl: 5.36.0-r2::gentoo
dev-lang/python: 3.9.16_p3::gentoo, 3.10.10_p3::gentoo, 3.11.2_p2::gentoo
dev-lang/rust-bin: 1.66.1-r1::gentoo
dev-util/cmake: 3.25.2::gentoo
dev-util/meson: 1.0.1::gentoo
sys-apps/baselayout: 2.13-r1::gentoo
sys-apps/sandbox: 2.29::gentoo
sys-apps/systemd: 252.7::gentoo
sys-devel/autoconf: 2.13-r7::gentoo, 2.71-r5::gentoo
sys-devel/automake: 1.16.5::gentoo
sys-devel/binutils: 2.39-r4::gentoo
sys-devel/binutils-config: 5.4.1::gentoo
sys-devel/clang: 15.0.7-r1::gentoo
sys-devel/gcc: 12.2.1_p20230121-r1::gentoo
sys-devel/gcc-config: 2.8::gentoo
sys-devel/libtool: 2.4.7-r1::gentoo
sys-devel/lld: 15.0.7::gentoo
sys-devel/llvm: 15.0.7::gentoo
sys-devel/make: 4.3::gentoo
sys-kernel/linux-headers: 6.1::gentoo (virtual/os-headers)
sys-libs/glibc: 2.36-r7::gentoo
Repositories:

gentoo
location: /var/db/repos/gentoo
sync-type: rsync
sync-uri: rsync://rsync.gentoo.org/gentoo-portage
priority: -1000
volatile: True
sync-rsync-verify-jobs: 1
sync-rsync-verify-metamanifest: yes
sync-rsync-extra-opts:
sync-rsync-verify-max-age: 24

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -pipe"
DISTDIR="/var/cache/distfiles"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GDK_PIXBUF_MODULE_FILE GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR XDG_STATE_HOME"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs binpkg-multi-instance buildpkg-live config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="C.UTF8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LEX="flex"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
SHELL="/bin/bash"
USE="X a52 aac acl acpi activities alsa amd64 bluetooth branding bzip2 cairo cdda cdr cli crypt cups dbus declarative developer dri dts dvd dvdr encode exif flac fortran gdbm gif gpm gtk gui hscolour iconv icu ipv6 jpeg kde kwallet lcms libglvnd libnotify libtirpc mad mng mp3 mp4 mpeg multilib ncurses nls nptl ogg opengl openmp pam pango pcre pdf plasma png policykit ppds profile qml qt5 readline sdl seccomp semantic-desktop sound spell split-usr ssl startup-notification svg systemd test-rust tiff truetype udev udisks unicode upower usb vorbis widgets wxwidgets x264 xattr xcb xft xml xv xvid zlib" ABI_X86="64" ADA_TARGET="gnat_2021" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" GRUB_PLATFORMS="efi-64" INPUT_DEVICES="libinput" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-4 php8-0" POSTGRES_TARGETS="postgres12 postgres13" PYTHON_SINGLE_TARGET="python3_10" PYTHON_TARGETS="python3_10" RUBY_TARGETS="ruby30" USERLAND="GNU" VIDEO_CARDS="amdgpu fbdev intel 
nouveau radeon radeonsi vesa dummy v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq proto steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset: ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CC, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXX, CXXFILT, ELFEDIT, EMERGE_DEFAULT_OPTS, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, INSTALL_MASK, LC_ALL, LD, LFLAGS, LIBTOOL, LINGUAS, MAKE, MAKEFLAGS, MAKEOPTS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, RANLIB, READELF, RUSTFLAGS, SIZE, STRINGS, STRIP, YACC, YFLAGS

dudleyd@monster ~ $
szatox
Advocate
Joined: 27 Aug 2013
Posts: 3137

Posted: Tue Apr 04, 2023 7:44 am    Post subject: Re: DNS Name Resolution stops after delay

dudleyd wrote:
ping a domain name (even a well-known one), I get a name lookup failure after a few minutes.

Running 'systemctl restart systemd-resolved.service' fixes the problem for another few minutes, but then resolution stops again.


Thanks for any advice or suggestions.
You probably have systemd-resolved active. Kill it with fire.
Stop, disable, and mask this service, in that order. Also set an actual DNS server in /etc/resolv.conf. Your ISP should provide some (usually the best pick); Cloudflare provides 1.1.1.1, Google provides 8.8.8.8, there are also higher-tier resolvers at 4.2.2.2, and you can even go with an alternative provider like OpenDNS for content filters.
If you do want a local DNS cache, use dnsmasq (for simplicity) or bind (for big brainz) or anything else that actually works. Systemd-resolved is never the answer.
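
For anyone who does go down that road, the stop/disable/mask sequence followed by a static resolv.conf could be scripted roughly like this. It is only a dry-run sketch: by default it prints the commands instead of running them. The run helper, the DRY_RUN and DNS_SERVER variables, and the choice of 192.168.12.1 (the IPv4 nameserver posted earlier in this thread) are illustrative assumptions, not part of systemd or any other tool.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# Set DRY_RUN=0 and run as root to apply them for real.
DRY_RUN="${DRY_RUN:-1}"
# Assumption: reuse the DHCP-supplied IPv4 nameserver from this thread.
DNS_SERVER="${DNS_SERVER:-192.168.12.1}"

# Hypothetical helper: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

replace_resolved() {
    # Stop, disable, then mask, in that order.
    run systemctl stop systemd-resolved.service
    run systemctl disable systemd-resolved.service
    run systemctl mask systemd-resolved.service
    # /etc/resolv.conf is a symlink into /run here, so remove it
    # before writing a static file in its place.
    run rm -f /etc/resolv.conf
    run sh -c "printf 'nameserver %s\n' '$DNS_SERVER' > /etc/resolv.conf"
}

replace_resolved
```

Reading the printed commands before setting DRY_RUN=0 is the point of the wrapper: masking a unit and replacing resolv.conf are both easy to get wrong on a remote machine.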
dudleyd
n00b
Joined: 03 Apr 2023
Posts: 15
Location: Michigan

Posted: Tue Apr 04, 2023 10:21 am

I get the feeling that systemd is not your preferred package.

We use it everywhere for everything, and I need to use it on this to support a variety of custom software we run.

The Archiver Appliance and some of the functions we use are all set up to run on it already, and are happy on the Debian and Scientific Linux systems, so I want to maintain that functionality.
NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 54247
Location: 56N 3W

Posted: Tue Apr 04, 2023 10:36 am

dudleyd,

Do you actually have nameservers at
Code:
nameserver 192.168.12.1
nameserver fe80::ca99:b2ff:feee:bef5%2
?

When it breaks, can you ping them?
I suspect they are both the same next hop towards the internet.
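
One way to run that check without retyping addresses is to pull the nameserver entries out of resolv.conf and ping each one. The sketch below only demonstrates the parsing step, on the sample resolv.conf contents posted above; list_nameservers is a made-up helper name, and the ping loop is left as a comment because it needs the real network.

```shell
#!/bin/sh
# Print the address from every "nameserver" line read from stdin.
# Comment lines are skipped because their first field is "#".
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }'
}

# Sample input copied from the resolv.conf posted earlier in the thread.
sample='nameserver 192.168.12.1
nameserver fe80::ca99:b2ff:feee:bef5%2
search .'

printf '%s\n' "$sample" | list_nameservers

# On the affected machine, once resolution has stopped (not run here):
#   list_nameservers < /etc/resolv.conf | while read -r ns; do
#       if ping -c 1 -W 2 "$ns" >/dev/null 2>&1; then
#           echo "$ns reachable"
#       else
#           echo "$ns UNREACHABLE"
#       fi
#   done
```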

Do you just lose DNS or has your default route gone too?

When it's broken, what does
Code:
ifconfig -a
show and
Code:
route -n


Let's all avoid this thread becoming a philosophical discussion on the merits of systemd. If that happens, it will be locked, like so many before it.
Keep it technical. The first step on that path is identifying the root cause of the issue.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
szatox
Advocate
Joined: 27 Aug 2013
Posts: 3137

Posted: Tue Apr 04, 2023 11:16 am

dudleyd wrote:
I get the feeling that systemd is not your preferred package.
You can have whatever feeling you like.
Quote:

We use it everywhere for everything, and I need to use it on this to support a variety of custom software we run.
The Archiver Appliance and some of the functions we use are all set up to run on it already, and are happy on the Debian and Scientific Linux systems, so I want to maintain that functionality.

You can even use a bed of nails, as far as I'm concerned.
I'm telling you that there is a particular component you are probably using that is known for causing exactly the kind of problems you have and that you can either drop completely or replace with something that actually works. Take it or leave it.
dudleyd
n00b
Joined: 03 Apr 2023
Posts: 15
Location: Michigan

Posted: Tue Apr 04, 2023 11:37 am

These are the IPv4 and IPv6 addresses of the nameserver on our development network.

When DNS resolution stops, these are still accessible from other systems, with no issues.

On this system, I can continue to ping them using their IP addresses, but not their names. I can also ping any other IPv4 address in the internal network, but none by DNS name.

ifconfig shows things are still connected, and I can continue to operate successfully, just without DNS name resolution.

If shutting down systemd-resolved will solve this, I'll shut it down and disable it. It seemed that this was the way we managed things on one of the other distros, so I just followed suit on this one.

One concern is that I'm hoping to distribute this layout (if you can call it that) across a number of different machines.
NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 54247
Location: 56N 3W

Posted: Tue Apr 04, 2023 12:27 pm

dudleyd,

Neither restarting DNS nor using a static DNS is going to work for you.

It sounds like your network setup, which I guess is dhcpcd of some sort, does everything properly at startup; then, when your lease is renewed, you don't get the nameserver part of the renewal.
Do you have any control over the lease time (TTL) of your DHCP lease?

All I can think of at the moment is to capture the initial DHCP transaction (which works) and a lease renewal, and see what the difference is.

I'm not a DHCP user though. It's set up on my network, but all the systems I use have static IPv4 settings.
Mostly so I can fix it remotely when it breaks.

Do both IPv4 and IPv6 nameservers fail at the same time?

What is in /etc/resolv.conf before and after it fails?
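
A before/after comparison like that is easy to script. The following is a minimal sketch run against a scratch file standing in for /etc/resolv.conf; the snapshot helper and the before/after labels are invented for the example. It uses cat rather than cp so that a symlinked resolv.conf is captured by content.

```shell
#!/bin/sh
# snapshot FILE DIR LABEL: save FILE's current contents as DIR/LABEL.
# cat follows symlinks, so the snapshot holds the real contents.
snapshot() {
    cat "$1" > "$2/$3"
}

# Demonstration on a scratch file standing in for /etc/resolv.conf.
work="$(mktemp -d)"
printf 'nameserver 192.168.12.1\n' > "$work/resolv.conf"

snapshot "$work/resolv.conf" "$work" before
# ... on a real system, wait here until name resolution fails ...
snapshot "$work/resolv.conf" "$work" after

if diff -u "$work/before" "$work/after" >/dev/null; then
    echo "resolv.conf unchanged"
else
    echo "resolv.conf changed"
fi
```

On the affected machine the first argument would be /etc/resolv.conf itself, with the second snapshot taken only after lookups have started failing.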
toralf
Developer
Joined: 01 Feb 2004
Posts: 3922
Location: Hamburg

Posted: Tue Apr 04, 2023 5:34 pm

I have
Quote:
nohook resolv.conf
timeout 8
in /etc/dhcpcd.conf to prevent dhcpcd from changing /etc/resolv.conf. In the past I also ran
Code:
chattr +i /etc/resolv.conf
because OpenResolv always overwrote that file, with no other way to prevent it.
dudleyd
n00b
Joined: 03 Apr 2023
Posts: 15
Location: Michigan

Posted: Tue Apr 04, 2023 6:01 pm

Curiously, none of our other distros have this problem, just this one. I have complete control over the network devices, as this particular network is administered by me, but I need to get it set up so it will fit into our normal production environment as well.

I probably should have mentioned that the Gentoo LiveCD doesn't have this problem, just the system I installed from the manual (with the deviations to install systemd).



NeddySeagoon wrote:
dudleyd,

Neither restarting DNS nor using a static DNS is going to work for you.

It sounds like your network setup, which I guess is dhcpcd of some sort, does everything properly at startup; then, when your lease is renewed, you don't get the nameserver part of the renewal.
Do you have any control over the lease time (TTL) of your DHCP lease?

All I can think of at the moment is to capture the initial DHCP transaction (which works) and a lease renewal, and see what the difference is.

I'm not a DHCP user though. It's set up on my network, but all the systems I use have static IPv4 settings.
Mostly so I can fix it remotely when it breaks.

Do both IPv4 and IPv6 nameservers fail at the same time?

What is in /etc/resolv.conf before and after it fails?

dudleyd
n00b
Joined: 03 Apr 2023
Posts: 15
Location: Michigan

Posted: Fri Apr 07, 2023 2:19 am

I disabled systemd-resolved and started NetworkManager, but same effect. Once sufficient time goes by, the box stops getting DNS updates.

Networking is still enabled and active, and I can access anything by IP, but DNS just stops working.

To make this even stranger, there are no changes in the resolv.conf file after DNS stops, and ip addr shows the proper addresses, yet anything that needs a DNS lookup fails.

Checking the switch, it still shows that the machine is within its lease period, although it shows no activity from the box referencing the DNS servers.

Strange....
szatox
Advocate
Joined: 27 Aug 2013
Posts: 3137

Posted: Fri Apr 07, 2023 12:43 pm

Quote:
I disabled systemd-resolved
It must be masked too, otherwise it will be pulled in as a dependency.
Quote:
To make this even stranger, there are no changes in the resolv.conf file after DNS stops,
There are no changes, OK, but is the content sensible? Does it define valid nameservers? (127.0.0.1 is not sensible, unless you have replaced resolved with something better.)
Can you test those servers e.g. with dig and confirm they respond to your requests?
Also, do you have any sources other than files and dns defined in /etc/nsswitch.conf?
dudleyd
n00b
Joined: 03 Apr 2023
Posts: 15
Location: Michigan

Posted: Fri Apr 07, 2023 7:19 pm

2 questions:

1. In Systemd how do I 'mask' items? I did a 'systemctl stop systemd-resolved && systemctl disable systemd-resolved' to stop it running, and then a 'systemctl enable NetworkManager && systemctl start NetworkManager' to switch to NetworkManager, but can't say I've seen any improvement. Still stops getting DNS after a bit of time has passed. Resetting either NetworkManager or systemd-resolved (previously) will correct the problem for a bit, but then it stops again.

The resolv.conf is correct, with our network switch and DNS server listed, and remains correct after the communications stop. I can continue to reach hosts by IP (ping or web pages), but not by DNS name.

I'm sure this is just a configuration problem, as if I start the system using the liveCD image, this problem does not occur.
alamahant
Advocate
Joined: 23 Mar 2019
Posts: 3879

Posted: Fri Apr 07, 2023 8:13 pm

Code:

systemctl mask systemd-resolved

After that issue
Code:

nmcli con show
nmcli con mod <your-connection profile name> ipv4.dns <your-preferred-dns>

szatox
Advocate
Joined: 27 Aug 2013
Posts: 3137

Posted: Fri Apr 07, 2023 9:13 pm

Quote:
and then a 'systemctl enable NetworkManager && systemctl start NetworkManager' to switch to NetworkManager, but can't say I've seen any improvement.
Wait... Switch to NetworkManager from what?
Your network was already configured, right? It breaks down after a while, but you must have had _something_ bring it up in the first place, right? Changing multiple things at the same time is often a one-way street to The Land of Confusion.
Anyway, at this point you should probably do a full reboot too, just to get a clean state and confirm which parts work and which don't, before you spend a few hours chasing a non-reproducible bug possibly introduced by a manual intervention.



Quote:
The resolv.conf is correct, with our network switch and DNS server listed, and remains correct after the communications stop. I can continue to reach hosts by IP (ping or web pages), but not by DNS name.

What happens when you call a command like this one below? Try with ipv6 address too.
dig <some domain that should be resolvable on your system but isn't> @192.168.12.1
dudleyd
n00b
Joined: 03 Apr 2023
Posts: 15
Location: Michigan

Posted: Fri Apr 07, 2023 9:32 pm

OK, things are starting to clear up, greatly.

My network is all DHCP (well, there are actually a bunch of networks.)

All of those networks are isolated from each other through gateway switches.

It's not that DNS is failing (I can use dig or nslookup and get name/IP information immediately, at any time).

The default route information is not being set by the DHCP service.

If I manually go in and execute a 'route -n add default gw GW' command, everything is fine.

Isn't DHCP supposed to handle updating that, or is there a command I forgot when setting up the network?
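
To catch the moment the route goes missing, a small check over the routing table is enough. This is a rough sketch that works on captured text so it can be tried anywhere: has_default_route is a made-up helper, and the sample routing tables (gateway 192.168.12.1, interface eth0, host address 192.168.12.50) are invented for illustration rather than taken from this machine.

```shell
#!/bin/sh
# Succeeds if the routing table text on stdin contains a default route.
has_default_route() {
    grep -q '^default '
}

# Hypothetical "ip -4 route show" output with and without the default route.
# The gateway, interface name and host address are made up for illustration.
healthy='default via 192.168.12.1 dev eth0 proto dhcp metric 100
192.168.12.0/24 dev eth0 proto kernel scope link src 192.168.12.50'
broken='192.168.12.0/24 dev eth0 proto kernel scope link src 192.168.12.50'

for state in "$healthy" "$broken"; do
    if printf '%s\n' "$state" | has_default_route; then
        echo "default route present"
    else
        echo "default route MISSING"
    fi
done

# On the real machine:
#   ip -4 route show | has_default_route || echo "default route MISSING"
# iproute2 equivalent of the manual fix above:
#   ip route add default via GW
```

Run from a timer or a loop with a timestamp, the real-machine form would show whether the route disappears at lease renewal or at some other point.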
szatox
Advocate
Joined: 27 Aug 2013
Posts: 3137

Posted: Sat Apr 08, 2023 10:37 am

Quote:
My network is all DHCP (well, there are actually a bunch of networks.)
So it's multihomed and autoconfigured at the same time? This is asking for conflicts, don't all those networks try to set the default GW for your machine? How do you determine which of the automatic routes is correct?
Or maybe one of those networks does not provide GW and it results in your machine dropping the route upon refresh?

Quote:
It's not that DNS is failing (I can use dig or nslookup and get name/IP information immediately, at any time).
OK, so this indicates a problem with the system's resolver.
Quote:
If I manually go in and execute a 'route -n add default gw GW' command, everything is fine.

Aaand now it doesn't make any sense again.
You can query DNS manually at any time, but your system's resolver fails when default GW is not set, do I get it right? Are you sure you and your system are both using the same DNS?
Have you successfully killed systemd-resolved? It does some shady deals over dbus, so it may still be in your way as long as it's running, even if resolv.conf doesn't loop back to 127.0.0.1. I can't remember where it stores its upstream server and how it's populated.
Does netstat -nlpu show anything on port 53?
Have you tried running tcpdump on 53/udp to see where those DNS queries are really going?

Quote:
Isn't DHCP supposed to handle updating that, or is there a command I forgot when setting up the network?
It is, and it does by default.
However, on a multihomed server you might want to disable it. In fact, a manual configuration would probably be easier to do and more reliable, since you would have only one authoritative source of information rather than a bunch of unrelated DHCP servers fighting for domination.
dudleyd
n00b
Joined: 03 Apr 2023
Posts: 15
Location: Michigan

Posted: Sat Apr 08, 2023 11:41 am

I may have confused the issue a little... let me clarify.

In our production system, we have many machines, and hopefully if I get this straight this will begin our quest towards our "...continuous integration..." goals.
We're looking to improve our security and 'update-ability' rather than updating only when a new release of a distro is made available from the other sources.

Quick network layout summary:

The network is a 'multi-star' configuration. Each point in the star is on its own dedicated subnet, isolated from all other stars by IP segment and VLAN ID. No point can directly interface with another except through a single upstream connection to our main core switches.

Each local switch provides DHCP to all units attached to it and, depending on the activity of that local point, may contain its own DNS server locally for handling interaction between those points (many of the stars communicate intensely within themselves, and we don't want to assign static IPs to equipment that may require updating or replacement frequently).

Therefore, DHCP provides the DNS server address to all devices inside that star area, and the DNS server (if one is in the star), receives a DNS entry to the next level (some of our stars may have sub-stars - one, I think even has a sub-sub-star).

Depending on the type of device (a PLC, or power supply, or RF controller, or detector, or ...) they may require anything from one or two DNS requests per cycle, up to thousands of requests to get access to the device continuously.

No single machine has more than a single default route, but those are all completely assigned using DHCP.

For some reason, that is not being assigned on this particular machine.

Later-
szatox
Advocate
Joined: 27 Aug 2013
Posts: 3137

Posted: Sat Apr 08, 2023 7:07 pm

I see. Well, that is quite an interesting environment you've got there.
And it seems that you have 2 separate issues at hand really... The one without DNS and the one without GW.

Is this particular machine only connected to a single star or to multiple? If there are multiple, what dhcp client implementation are you using? Is it only NM or do you happen to have another client trying to manage your interfaces?
The questions about server's behavior when gw is missing still stand too.
dobbs
Tux's lil' helper
Joined: 20 Aug 2005
Posts: 105
Location: Wenatchee, WA

Posted: Mon Jun 19, 2023 10:01 am

This describes behavior I encountered on one of my machines. In my case, systemd-resolved discarded DNS answers from my local DNS server because they lacked dnssec signatures. The clue was log output from the systemd-resolved service.

Code:
$ journalctl --unit=systemd-resolved
Jun 06 02:28:55 palshife systemd-resolved[1368]: [?] DNSSEC validation failed for question palshife.xxxxx IN DS: no-signature
Jun 06 02:28:55 palshife systemd-resolved[1368]: [?] DNSSEC validation failed for question palshife.xxxxx IN SOA: no-signature
Jun 06 02:28:55 palshife systemd-resolved[1368]: [?] DNSSEC validation failed for question palshife.xxxxx IN A: no-signature
Jun 06 02:28:55 palshife systemd-resolved[1368]: [?] DNSSEC validation failed for question palshife.xxxxx IN AAAA: no-signature
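
A quick way to tell whether a machine is hitting the same thing is to count those messages. The sketch below does that on two of the log lines above; count_dnssec_failures is a made-up helper name, and the journalctl pipeline in the final comment is how it would presumably be fed on a real system.

```shell
#!/bin/sh
# Count journal lines on stdin that report a DNSSEC validation failure.
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_dnssec_failures() {
    grep -c 'DNSSEC validation failed' || true
}

# Two of the log lines quoted above, used as sample input.
sample='Jun 06 02:28:55 palshife systemd-resolved[1368]: [?] DNSSEC validation failed for question palshife.xxxxx IN A: no-signature
Jun 06 02:28:55 palshife systemd-resolved[1368]: [?] DNSSEC validation failed for question palshife.xxxxx IN AAAA: no-signature'

printf '%s\n' "$sample" | count_dnssec_failures   # prints 2

# On a real system, for the current boot only:
#   journalctl --unit=systemd-resolved --boot | count_dnssec_failures
```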


If you’re already disabling systemd-resolved, then I suggest removing references to it in nsswitch. My /etc/nsswitch.conf contained the line:

Code:
hosts:      files mymachines resolve [!UNAVAIL=return] dns myhostname


I simplified that to:

Code:
hosts:      files mymachines dns myhostname
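
The same change can be expressed as a sed filter, which is handy when several machines need it. This is only a sketch: drop_resolve is a made-up helper name, it touches nothing but the hosts: line, and it writes to stdout so the result can be reviewed before replacing /etc/nsswitch.conf.

```shell
#!/bin/sh
# Remove the "resolve" source, plus an attached [ ... ] action if present,
# from the hosts: line of nsswitch.conf text read from stdin.
# Other lines pass through unchanged.
drop_resolve() {
    sed -E '/^hosts:/ s/[[:space:]]+resolve([[:space:]]+\[[^]]*\])?//'
}

printf 'hosts:      files mymachines resolve [!UNAVAIL=return] dns myhostname\n' | drop_resolve
# prints: hosts:      files mymachines dns myhostname

# Review first, then install by hand (not run here):
#   drop_resolve < /etc/nsswitch.conf > /tmp/nsswitch.conf.new
#   diff -u /etc/nsswitch.conf /tmp/nsswitch.conf.new
```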


Maybe adding DNSSEC to my local DNS server would be the “proper” fix, but I don't yet know how to do that.