Gentoo Forums
How do I diagnose intermittent network outages?

pente
n00b

Joined: 21 Mar 2013
Posts: 29

Posted: Thu Jan 14, 2021 1:42 am    Post subject: How do I diagnose intermittent network outages?

My gentoo desktop had no network problems until it was moved to its current network, a standard home network. The network has a router with several devices connected over wifi; the desktop is the only device connected with an ethernet wire. The other devices on the network include another gentoo machine and a mac; everything is using dhcp.

Frequently (more than once a day) and unpredictably the desktop loses network access for seconds to minutes, sometimes as long as 15 minutes. The other devices do not experience this. When this happens, ping fails to reach the router, other devices on the network, and devices outside the network. Ping sometimes reports "Destination Host Unreachable" and sometimes gives no error message at all.

Example of pinging other devices on the same network. Note when the network comes back, several pings return at the same time.

Code:
From 192.168.1.239 icmp_seq=93 Destination Host Unreachable
From 192.168.1.239 icmp_seq=94 Destination Host Unreachable
From 192.168.1.239 icmp_seq=95 Destination Host Unreachable
64 bytes from 192.168.1.79: icmp_seq=96 ttl=255 time=2238 ms
64 bytes from 192.168.1.79: icmp_seq=97 ttl=255 time=1218 ms
64 bytes from 192.168.1.79: icmp_seq=98 ttl=255 time=195 ms
64 bytes from 192.168.1.79: icmp_seq=99 ttl=255 time=3.10 ms
64 bytes from 192.168.1.79: icmp_seq=100 ttl=255 time=4.56 ms
64 bytes from 192.168.1.79: icmp_seq=101 ttl=255 time=2.83 ms


Pings from other devices on the network do not reach the desktop when it experiences an outage. There is no network degradation except when it goes out entirely. Outages seem to be unrelated to network usage. Power-cycling the router or restarting dhcpcd during an outage does not solve it.

I don't have any ideas how to go about diagnosing a problem like this, or what information might be useful. Any suggestions would be appreciated.

DespLock
n00b

Joined: 27 Jul 2020
Posts: 65

Posted: Thu Jan 14, 2021 2:08 am

Hi pente,
to have any idea of what is going on with your network, you (and we) need some more information.

pls issue the following commands once:

Code:

uname -a
lspci -vk    # include the section for your network device(s)
lsusb

# If you are using OpenRC, also:
rc-update
cat /etc/conf.d/net


The following commands should be issued twice: once while the network is stable, and once while it is not working:
Code:

ip a
ip route show


Further infos:
a) are you using a firewall on the desktop or the router?
b) are you using networkmanager with or without systemd?


EDIT:
It's most likely one of these three reasons:

1) a defective network cable
2) some kind of wrong configuration
3) a bug in dhcpcd (see https://forums.gentoo.org/viewtopic-t-1124935-highlight-dhcpd.html).


Last edited by DespLock on Thu Jan 14, 2021 2:25 am; edited 2 times in total

Buffoon
Veteran

Joined: 17 Jun 2015
Posts: 1369
Location: EU or US

Posted: Thu Jan 14, 2021 2:09 am

I'd run ethtool to see the status of connection for starters.

Tony0945
Watchman

Joined: 25 Jul 2006
Posts: 5127
Location: Illinois, USA

Posted: Thu Jan 14, 2021 2:14 am

Bad cable? Can you use another? Or is it snaking through the walls? Bad switch? Is it connected to the switch built into the router or a different one? Try a different slot on the switch.
Swapping cables is a pretty standard test. Did you terminate the cable? Or is it premade? Corroded connections? Cat-6?
One test I made in a similar situation was to connect to an AP. PC had no problems. Intermittent on the 60 foot Cat-6 cable. Wound up replacing the connector on the other end. Didn't like the way the wires looked. Solved the problem. Great cable. US government surplus. US government always buys top quality supplies (well except for mess hall food and the raw food was probably grade A).

pente
n00b

Joined: 21 Mar 2013
Posts: 29

Posted: Thu Jan 14, 2021 2:41 am

Thanks for the tips so far.

Code:

$ uname -a
Linux athena 4.20.4-gentoo #2 SMP Sat Jan 26 04:26:04 2019 x86_64 Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz GenuineIntel GNU/Linux
$ lspci -vk
00:19.0 Ethernet controller: Intel Corporation Ethernet Connection (2) I218-V
        Subsystem: ASRock Incorporation Ethernet Connection (2) I218-V
        Flags: bus master, fast devsel, latency 0, IRQ 29
        Memory at efc00000 (32-bit, non-prefetchable) [size=128K]
        Memory at efc3c000 (32-bit, non-prefetchable) [size=4K]
        I/O ports at f080 [size=32]
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [e0] PCI Advanced Features
        Kernel driver in use: e1000e
 $ lsusb
Bus 002 Device 002: ID 8087:8001 Intel Corp.
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 8087:8009 Intel Corp.
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 003: ID 067b:2731 Prolific Technology, Inc. USB SD Card Reader     
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 010: ID 05f3:0007 PI Engineering, Inc. Kinesis Advantage PRO MPC/USB Keyboard
Bus 003 Device 009: ID 05f3:0081 PI Engineering, Inc. Kinesis Integrated Hub
Bus 003 Device 004: ID 05e3:0745 Genesys Logic, Inc. Logilink CR0012
Bus 003 Device 002: ID 046d:c52b Logitech, Inc. Unifying Receiver
Bus 003 Device 005: ID 046d:c52b Logitech, Inc. Unifying Receiver
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
$ rc-update
            alsasound | boot                                   
               binfmt | boot                                   
             bootmisc | boot                                   
              cgroups |                                 sysinit
              chronyd |      default                           
               cronie |      default                           
                cupsd |      default                           
                 dbus |      default                           
                devfs |                                 sysinit
                dmesg |                                 sysinit
              elogind | boot                                   
                 fsck | boot                                   
              hddtemp |      default                           
             hostname | boot                                   
              hwclock | boot                                   
              keymaps | boot                                   
            killprocs |                        shutdown       
    kmod-static-nodes |                                 sysinit
                local |      default nonetwork                 
           localmount | boot                                   
             loopback | boot                                   
                  lvm | boot                                   
              metalog |      default                           
              modules | boot                                   
             mount-ro |                        shutdown       
                 mtab | boot                                   
          net.enp0s25 |      default                           
             netmount |      default                           
     opentmpfiles-dev |                                 sysinit
   opentmpfiles-setup | boot                                   
               procfs | boot                                   
                 root | boot                                   
         save-keymaps | boot                                   
    save-termencoding | boot                                   
            savecache |                        shutdown       
                 sshd |      default                           
                 swap | boot                                   
               sysctl | boot                                   
                sysfs |                                 sysinit
         termencoding | boot                                   
                 udev |                                 sysinit
         udev-trigger |                                 sysinit
              urandom | boot                                   
$ ls /etc/conf.d/net*
/etc/conf.d/net-online  /etc/conf.d/net.wlp0s20u1  /etc/conf.d/netmount
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether XXXXXXX brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.239/24 brd 192.168.1.255 scope global dynamic noprefixroute enp0s25
       valid_lft 86107sec preferred_lft 75307sec
    inet6 XXXXXX/64 scope link
       valid_lft forever preferred_lft forever
3: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
    link/sit 0.0.0.0 brd 0.0.0.0
$ ip route show
default via 192.168.1.1 dev enp0s25 proto dhcp src 192.168.1.239 metric 2
192.168.1.0/24 dev enp0s25 proto dhcp scope link src 192.168.1.239 metric 2


I'm not aware of any firewalls, and am not sure how I'd go about checking for one. I have networkmanager installed but I don't know what it does and don't think I am "using" it. No systemd. I'm not familiar with how to use ethtool, and a quick look at the documentation didn't help.

I forgot to assess the premade cable. I can inspect the whole cable and it appears fine. I think it's the same one as was being used previously without incident. I will replace it when I have the chance. I have changed which slot it is plugged into and will cycle through all of them if I continue to experience problems.

dhcpcd version 9.1.4.

Tony0945
Watchman

Joined: 25 Jul 2006
Posts: 5127
Location: Illinois, USA

Posted: Thu Jan 14, 2021 2:51 am

I use netifrc so I can't help with networkmanager. What does "ifconfig -a" show?

pjp
Administrator

Joined: 16 Apr 2002
Posts: 20067

Posted: Thu Jan 14, 2021 5:02 am

Do your logs have anything relevant? Particularly dmesg for interfaces (likely beginning with eth and enp). Also dhcp logs, probably in /var/log/syslog or /var/log/messages. Especially useful if you can identify a time period when the problem occurred. Don't forget to verify that your time is correct so that the logged time is appropriate. Also, check whichever system is handing out the addresses.
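
To get timestamps to compare against the logs, something along these lines could record when the outages start and stop. It is only a rough sketch: the function names `probe` and `transitions` are made up for illustration, and it assumes the iputils `ping` (for the `-W` timeout option) and that the router answers pings at 192.168.1.1, as the routing table in this thread suggests.

```shell
# transitions: read "<timestamp> <state>" lines on stdin and print only the
# lines where the state differs from the previous line.
transitions() {
    awk '$2 != last { print; fflush(); last = $2 }'
}

# probe: once per second, print "<timestamp> up" or "<timestamp> down"
# depending on whether a single ping to the target host gets a reply.
probe() {
    target="$1"
    while :; do
        if ping -c 1 -W 1 "$target" >/dev/null 2>&1; then
            state=up
        else
            state=down
        fi
        printf '%s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$state"
        sleep 1
    done
}
```

Running `probe 192.168.1.1 | transitions >> outages.log` in a spare terminal would leave one line per change of state, which can then be matched against dmesg and the dhcpcd log entries for the same times.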
_________________
Quis separabit? Quo animo?

DespLock
n00b

Joined: 27 Jul 2020
Posts: 65

Posted: Thu Jan 14, 2021 10:48 am

Hi pente,
at first glance your config seems to be OK, and I can't find anything unusual except for

Code:

Linux athena 4.20.4-gentoo
[...]
dhcpcd version 9.1.4.

Both versions are outdated, is this the state of the whole system?

Pls at least make a note if you manually edit the output of commands or logs like:
Code:

[...]
link/ether XXXXXXX
[...]
inet6 XXXXXX/64
[...]


Does your /etc/conf.d/net contain more entries than this for that interface?
Code:

config_enp0s25="dhcp"


Quote:

Do your logs have anything relevant?


Pls check the following for any entries that stand out:
Code:

dmesg | grep e1000e
dmesg | grep enp0s25
dmesg | grep firmware



What i would do next:
1) try another port on the router for the network cable, and make sure the cable is properly plugged in at both ends
2) test another cable
3) if you haven't applied updates for some time, I would try a current live CD/USB such as Fedora and see if the problem still occurs
4) enable logging for dhcpcd if you aren't logging it already (man 5 dhcpcd.conf)
5) don't forget to run these again during an incident:
Code:

ip a
ip route show

pietinger
Moderator

Joined: 17 Oct 2006
Posts: 4130
Location: Bavaria

Posted: Thu Jan 14, 2021 11:12 am

pente,

first of all, diagnosing intermittent network outages is one of the hardest jobs.

You said the only change was moving your desktop to another router. Therefore my first thought is: cable or router.

If it is not the cable, it should be the router. And yes, buggy ones do exist.

You are using dhcp. This is the first thing I would change. Give your desktop a static IP address. Make no other changes (no updates either). Change only one thing at a time.

If you still have problems, you know it wasn't dhcp ... and we have to investigate further ... else :-)

pietinger
Moderator

Joined: 17 Oct 2006
Posts: 4130
Location: Bavaria

Posted: Thu Jan 14, 2021 11:18 am

pente wrote:
I'm not aware of any firewalls, and am not sure how I'd go about checking for one.

Usually if you don't know, then you don't have one ... ;-) But yes, you don't have a firewall on your desktop (I can see that in your rc-update output: neither iptables nor nftables is loaded).

Buffoon
Veteran

Joined: 17 Jun 2015
Posts: 1369
Location: EU or US

Posted: Thu Jan 14, 2021 11:41 am

Quote:
I'm not familiar with how to use ethtool, and a quick look at the documentation didn't help.

How very interesting, you can post here but you have no access to search engines to search for "how to use ethtool". Well, ethtool would show you instantly if this is the cable causing the trouble. Like this, my working connection:
Code:

        ...
        Speed: 1000Mb/s
        Duplex: Full
        ...
 

But of course, if you can't use it then don't. Keep guessing, more fun?

DespLock
n00b

Joined: 27 Jul 2020
Posts: 65

Posted: Thu Jan 14, 2021 1:59 pm

Quote:

...
Speed: 1000Mb/s
Duplex: Full
...


Doesn't that just show that the two network cards auto negotiated a network connection @1000Mb/s and full duplex? 8)

Still a number of reasons left if they don't.

EDIT: and this output may look exactly the same while the connection is stable. I would rather say that the intermittent nature of the outages is a better indicator of a defective cable than the output you posted.
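
To make that comparison concrete, the ethtool output could be boiled down to one line and captured once while the network works and once during an outage. A minimal sketch, not a definitive tool: the function name `link_summary` is made up here, and it assumes the `Speed:`, `Duplex:` and `Link detected:` fields that ethtool prints for a wired interface.

```shell
# link_summary: condense `ethtool <iface>` output (read from stdin) into one
# line holding the negotiated speed, the duplex mode and the link state.
link_summary() {
    awk -F': *' '
        {
            key = $1
            sub(/^[ \t]+/, "", key)    # strip the indentation ethtool adds
        }
        key == "Speed"         { speed = $2 }
        key == "Duplex"        { duplex = $2 }
        key == "Link detected" { detected = $2 }
        END { printf "speed=%s duplex=%s link=%s\n", speed, duplex, detected }
    '
}
```

Usage would be `ethtool enp0s25 | link_summary`, with the interface name replaced as needed. If the line is identical in both states, link negotiation is probably not where the problem shows up.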

Tony0945
Watchman

Joined: 25 Jul 2006
Posts: 5127
Location: Illinois, USA

Posted: Thu Jan 14, 2021 4:20 pm

Buffoon, no need to be nasty. He said he found the documentation but didn't understand it. No crime in that.

Tony0945
Watchman

Joined: 25 Jul 2006
Posts: 5127
Location: Illinois, USA

Posted: Thu Jan 14, 2021 4:21 pm

Pente, what is the make and model of the new router, and of the old router if you still have it?

gengreen
Apprentice

Joined: 23 Dec 2017
Posts: 150

Posted: Thu Jan 14, 2021 8:03 pm

I had a similar problem a while ago. My NIC is:

Quote:
Ethernet controller: Intel Corporation Ethernet Connection (7) I219-V


If I remember correctly, the problem came from the e1000e module.

To mitigate the connectivity loss, I used ethtool as follows:

Code:

ethtool -K eno1 gso off gro off tso off tx off rx off
ethtool -s eno1 speed 10 duplex full


A more recent kernel could solve your problem; I'm on 4.19.152 and haven't experienced the problem again.

Note that changes made with ethtool are temporary; restarting your network interface resets them to the defaults.

You can check basic information with a simple

Code:
ethtool eno1


Obviously, replace eno1 with your own interface name.
_________________
Less is best

pente
n00b

Joined: 21 Mar 2013
Posts: 29

Posted: Fri Jan 15, 2021 6:17 am

I saw some odd things in dmesg, more on that at the end. I have no information on the old router, it was in a commercial setup, not a typical home router. New router is asus tm-ac1900.

Code:

$ ethtool enp0s25
Settings for enp0s25:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supported pause frame use: No
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        MDI-X: on (auto)
Cannot get wake-on-lan settings: Operation not permitted
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes
$ ifconfig -a
enp0s25: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.239  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 XXXXXX  prefixlen 64  scopeid 0x20<link>
        ether XXXXXX  txqueuelen 1000  (Ethernet)
        RX packets 248392094  bytes 300585770434 (279.9 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 135507076  bytes 29767207081 (27.7 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 20  memory 0xefc00000-efc20000 

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 74341785  bytes 5937779001 (5.5 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 74341785  bytes 5937779001 (5.5 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

sit0: flags=128<NOARP>  mtu 1480
        sit  txqueuelen 1000  (IPv6-in-IPv4)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


I have cycled through the slots in the router and upgraded dhcpcd to 9.4.0 without resolving the issue. The kernel is quite out of date as updating is a pain. @world was last updated a few months ago, I checked recently for anything that looked like it particularly needed updating and it seemed fine.

"ip a" and "ip route show" were pretty much unchanged during an outage (I've lost track of which output in my terminal was during the outage or not). Pulling (and restoring) the cable during an outage causes "ip route show" to give no output and ping gave "Network is unreachable" for 26 seconds before returning to "Destination Host Unreachable".

As noted above, I don't have a /etc/conf.d/net file. No dhcpcd logs I could find, I'll have to enable logging. Wiggling the cable connections does not cause any network problems.

I did see something interesting about the timing of these messages in dmesg:
Code:

[4250615.210839] e1000e: enp0s25 NIC Link is Down
[4250618.342852] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[4250730.770904] e1000e: enp0s25 NIC Link is Down
[4250733.734920] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[4250736.450033] e1000e: enp0s25 NIC Link is Down
[4250739.386080] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[6482964.638479] e1000e: enp0s25 NIC Link is Down
[6482967.591513] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[6483082.219650] e1000e: enp0s25 NIC Link is Down
[6483085.181642] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[6483087.165772] e1000e: enp0s25 NIC Link is Down
[6483090.127801] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[7467309.019115] e1000e: enp0s25 NIC Link is Down
[7467313.957213] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[7467428.862331] e1000e: enp0s25 NIC Link is Down
[7467431.820360] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[7467433.776456] e1000e: enp0s25 NIC Link is Down
[7467436.738482] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[7930109.220327] e1000e: enp0s25 NIC Link is Down
[7930113.658373] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[7930228.785480] e1000e: enp0s25 NIC Link is Down
[7930231.824426] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[7930233.890540] e1000e: enp0s25 NIC Link is Down
[7930236.944594] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[8104968.649715] e1000e: enp0s25 NIC Link is Down
[8104972.309130] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[8105090.273666] e1000e: enp0s25 NIC Link is Down
[8105093.954350] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx


I believe none of those correspond to the cable being physically removed. The timing is very odd; in each case the link goes down for roughly 3 to 5 seconds, comes back for about 115 seconds, goes down for 3 seconds, comes back for 2 seconds, and then goes down for 3 seconds. (Except the last incident is missing the last cycle.) Going by the timestamps, these incidents are days to weeks apart (roughly 26, 11, 5 and 2 days), although network outages happen more than once a day. I guess this points towards some kind of cable problem, although it is hard to imagine a cable problem that has such regular timing.
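
The intervals can be worked out mechanically instead of by eye. The following is only a sketch: `link_intervals` is a name made up for illustration, and it assumes dmesg lines in exactly the format shown above (`[seconds] e1000e: <iface> NIC Link is Up/Down`); other kernel versions may word the driver message differently.

```shell
# link_intervals: read dmesg-style lines on stdin and, for every
# "NIC Link is Up/Down" event of the given interface, print the new state
# and the number of seconds since the previous event.
link_intervals() {
    iface="$1"
    awk -v iface="$iface" '
        index($0, iface " NIC Link is ") {
            # The timestamp is the text between "[" and "]".
            ts = $0
            sub(/^\[ */, "", ts)
            sub(/\].*/, "", ts)
            state = ($0 ~ /Link is Up/) ? "up" : "down"
            if (seen) {
                printf "%s after %.1f s\n", state, ts - prev
            } else {
                printf "%s (first event)\n", state
            }
            prev = ts
            seen = 1
        }
    '
}
```

Usage would be `dmesg | link_intervals enp0s25`. Long gaps (hundreds of thousands of seconds) mark the boundary between one incident and the next; the short ones show the down/up pattern within an incident.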

I'll assume it is a cable problem unless someone has another line of attack to suggest. Not sure how long it'll be before I get a replacement to test.

C5ace
Guru

Joined: 23 Dec 2013
Posts: 472
Location: Brisbane, Australia

Posted: Fri Jan 15, 2021 9:53 am

If you have or can borrow a second PC, buy or make up a short crossover network cable, connect both PCs and run ping.

Alternatively get one of those $10.00 Ethernet hubs and two normal short patch cables to connect the two PCs.

If either works, it's the cable or your router.

Now remove one of the connectors of your long cable and crimp on a new one wired for crossover and try again. If this works, your cable is OK and your router is bad.

Note, some cable modems and old commercial routers require crossover cables between the PC's and modem or router. Telstra in Australia used to provide such cable modems to their customers.
_________________
Observation after 30 years working with computers:
All software has known and unknown bugs and vulnerabilities. Especially software written in complex, unstable and object oriented languages such as perl, python, C++, C#, Rust and the likes.

DespLock
n00b

Joined: 27 Jul 2020
Posts: 65

Posted: Fri Jan 15, 2021 12:24 pm

Hi pente,
with your last logs, a hardware defect or a bad cable is most UNlikely. In that case you would also see errors in the RX errors/TX errors fields (besides the link errors in dmesg):

Quote:

enp0s25: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.1.239 netmask 255.255.255.0 broadcast 192.168.1.255
inet6 XXXXXX prefixlen 64 scopeid 0x20<link>
ether XXXXXX txqueuelen 1000 (Ethernet)
RX packets 248392094 bytes 300585770434 (279.9 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 135507076 bytes 29767207081 (27.7 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 20 memory 0xefc00000-efc20000
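
The same counters can be read straight from sysfs, which makes it easy to compare them before and after an outage. A minimal sketch: the function name `error_counters` is made up for illustration, the second argument exists only so the function can be pointed at a different directory for testing, and the list of counters is just a selection of the files the kernel exposes under `/sys/class/net/<iface>/statistics/`.

```shell
# error_counters: print selected error-related counters for an interface.
#   $1 = interface name, e.g. enp0s25
#   $2 = sysfs root (optional, defaults to /sys/class/net)
error_counters() {
    iface="$1"
    root="${2:-/sys/class/net}"
    for name in rx_errors tx_errors rx_dropped tx_dropped \
                rx_crc_errors tx_carrier_errors collisions; do
        file="$root/$iface/statistics/$name"
        # Skip counters that are not readable on this system.
        if [ -r "$file" ]; then
            printf '%s=%s\n' "$name" "$(cat "$file")"
        fi
    done
}
```

Running `error_counters enp0s25` before and after an outage and comparing the two outputs shows whether any of the counters moved while the link was misbehaving.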


There seems to be a known bug in earlier kernel versions with the e1000e driver. You can work around it (see below; you need ethtool) or install a newer kernel (as gengreen said, the newer LTS kernel 4.19.xxx seems to work).


If you want to stick with your kernel and test the solution run as root:
Code:

ethtool -K enp0s25 gso off gro off tso off

or in case it doesn't work:
Code:

ethtool -K enp0s25 gso off gro off tso off tx off rx off

and watch the logs.

If it works, the command needs to be run again each time the interface is brought up. This can be done via /etc/conf.d/net:
Code:

postup() {
  # Called after the interface has been brought up.
  # Re-apply the offload settings, which are lost whenever the
  # interface is restarted.
  case "${IFACE}" in
    *enp0s25*)
      ethtool -K enp0s25 gso off gro off tso off
      ;;
  esac

  return 0
}

Add this function to the file and restart the service.

pente
n00b

Joined: 21 Mar 2013
Posts: 29

Posted: Mon Jan 25, 2021 5:48 pm

Unfortunately the suggested ethtool commands did not yield any change in behavior. It's unclear to me if I should expect upgrading the kernel to resolve the issue.

Since I figured I am due for a kernel upgrade regardless, I just got through the kernel configuration marathon and am about to give it a try. Along the way I noticed the config_generic_phy option which sounded important, but about which I could find very little information, all of which suggested it was related to ethernet in some way. It is disabled in my current kernel. Probably a red herring, but I thought I'd ask if there is a chance this or some other kernel option might be relevant to my issue?

Some new information: the home network is apparently configured as some kind of mesh network. I am unclear about how those work, so I'm not sure of the implication of losing ping to other devices on the network. As I believe other devices on the network are connected (wirelessly) to the device the desktop has a wired connection to, I think it is unlikely the problem lies with the mesh.

Also, on further observation I now believe the connection failures are correlated with network usage (less frequent on weekends), although I'm not sure if it has more to do with the desktop's usage or others'.

pente
n00b

Joined: 21 Mar 2013
Posts: 29

Posted: Wed Feb 24, 2021 5:01 am

I believe I have applied all of the suggestions mentioned above, except the part about a crossover cable that I didn't understand how to do or what the goal was. I have upgraded the kernel and @world, and replaced the cable, in addition to the other steps I've mentioned in my previous comments.

There has been no change in the network interruptions I have been experiencing. Any suggestions on how to diagnose the problem would be appreciated, thanks.

Ralphred
Guru

Joined: 31 Dec 2013
Posts: 495

Posted: Wed Feb 24, 2021 6:12 am

pente wrote:
although it is hard to imagine a cable problem that has such regular timing.

Not really, the environmental* factors that are pushing a cable from "almost failing" to "failing" can be unpredictable, but the way the protocol handles the timeout for a broken link or negotiation for reconnection are predictable.
Termination points are the things to check first, something like
Code:
ping -f [ip_addr]
gives nice graphical feedback while you run around testing infrastructure, and it is the first tool I use if I suspect cable failure.
pente wrote:
connection failures are correlated with network usage (less frequent on weekends), although I'm not sure if it has more to do with the desktop's usage or others'.

Can you get into the router and see what its load is? I had a client with similar issues due to the sheer number of Apple/Android devices slurping data and phoning home; a new device from the ISP worked for a couple of months, then the same issue came back. I replaced it with separate modem, router and AP devices** and he's never had issues since.

*These can be anything from temperature changes causing physical stress, to electrical items causing undue interference in unshielded/damp cables etc.
**Though a decent all in one would have probably worked

gentoo_ram
Guru

Joined: 25 Oct 2007
Posts: 474
Location: San Diego, California USA

Posted: Mon Mar 08, 2021 9:53 pm

I saw similar behavior with this driver when my computer was plugged in to a cable modem once. Intermittent drops of signal. Once I got the cable modem replaced, the issue went away. Whatever your computer is plugged into may be having problems.

All times are GMT