Gentoo Forums
5 Docker containers kills dhcpcd on Gentoo box.
datafatmunger
n00b


Joined: 02 Apr 2016
Posts: 27

Posted: Thu Dec 26, 2019 6:09 pm    Post subject: 5 Docker containers kills dhcpcd on Gentoo box.

Not sure if this is totally a Gentoo issue, but long story short ... starting exactly 5 Docker containers (NO MATTER WHAT IMAGE) with docker-compose consistently kills dhcpcd. 1 container, fine; 2, also fine; 3, 4 ... also fine. 5 ... BOOM.

I'm posting here because this apparently only happens on my Gentoo machine. Here is my syslog with a segfault. I don't always see the segfault, but I always lose the network.

Code:

Dec 26 18:43:20 X200 kernel: [  621.686048] eth0: renamed from veth3b64a65
Dec 26 18:43:20 X200 kernel: [  621.690079] IPv6: ADDRCONF(NETDEV_CHANGE): vethdbb51ca: link becomes ready
Dec 26 18:43:20 X200 kernel: [  621.690143] br-cd93db36063a: port 5(vethdbb51ca) entered blocking state
Dec 26 18:43:20 X200 kernel: [  621.690145] br-cd93db36063a: port 5(vethdbb51ca) entered forwarding state
Dec 26 18:43:20 X200 dhcpcd[10625]: dhcpcd_prestartinterface: veth3b64a65: No such device
Dec 26 18:43:20 X200 dhcpcd[10625]: veth3b64a65: waiting for carrier
Dec 26 18:43:20 X200 dhcpcd[10625]: route socket overflowed - learning interface state
Dec 26 18:43:20 X200 dhcpcd[10625]: veth10b4d67: removing interface
Dec 26 18:43:20 X200 dhcpcd[10625]: veth3b35ae5: removing interface
Dec 26 18:43:20 X200 dhcpcd[10625]: veth779607b: removing interface
Dec 26 18:43:21 X200 dhcpcd[10625]: veth3b64a65: removing interface
Dec 26 18:43:21 X200 dhcpcd[10625]: wlp2s0: carrier lost
Dec 26 18:43:21 X200 kernel: [  622.367267] eth0: renamed from veth88fd91a
Dec 26 18:43:21 X200 kernel: [  622.370384] br-cd93db36063a: port 1(veth097d5a0) entered disabled state
Dec 26 18:43:21 X200 kernel: [  622.371972] br-cd93db36063a: port 1(veth097d5a0) entered blocking state
Dec 26 18:43:21 X200 kernel: [  622.371978] br-cd93db36063a: port 1(veth097d5a0) entered forwarding state
Dec 26 18:43:21 X200 dhcpcd[10625]: wlp2s0: deleting address 2001:984:72e5:1:40c0:52a:57bb:90f5/64
Dec 26 18:43:21 X200 dhcpcd[10625]: wlp2s0: deleting route to 2001:984:72e5:1::/64
Dec 26 18:43:21 X200 dhcpcd[10625]: wlp2s0: deleting default route via fe80::9ec7:a6ff:fecf:ea81
Dec 26 18:43:21 X200 dhcpcd[10625]: wlp2s0: deleting address fe80::d510:c809:d3a6:404b
Dec 26 18:43:21 X200 dhcpcd[10625]: wlp2s0: deleting route to 192.168.178.0/24
Dec 26 18:43:21 X200 dhcpcd[10625]: wlp2s0: deleting default route via 192.168.178.1
Dec 26 18:43:21 X200 dhcpcd[10625]: br-cd93db36063a: carrier acquired
Dec 26 18:43:21 X200 dhcpcd[10625]: br-cd93db36063a: IAID bb:1f:17:eb
Dec 26 18:43:21 X200 kernel: [  622.581005] dhcpcd[10625]: segfault at 8 ip 0000560703ba4240 sp 00007fffaad1e2f8 error 4 in dhcpcd[560703ba1000+32000]
Dec 26 18:43:21 X200 kernel: [  622.581017] Code: a0 00 00 00 48 8b 00 48 85 c0 74 45 66 0f 1f 44 00 00 48 39 c7 74 1a 8b 48 2c 85 c9 74 13 66 85 f6 48 8b 88 c0 00 00 00 75 07 <8b> 49 08 39 0a 74 0a 48 8b 40 08 48 85 c0 75 d8 c3 48 8d 50 18 31


Minimum docker-compose:

Code:

version: "3.1"
services:

    debian:
      image: debian:latest
      container_name: hub-debian
      working_dir: /application
      command: tail -f /dev/null
      networks:
        - hub-internal

    debian2:
      image: debian:latest
      container_name: hub-debian2
      working_dir: /application
      command: tail -f /dev/null
      networks:
        - hub-internal

    debian3:
      image: debian:latest
      container_name: hub-debian3
      working_dir: /application
      command: tail -f /dev/null
      networks:
        - hub-internal

    debian4:
      image: debian:latest
      container_name: hub-debian4
      working_dir: /application
      command: tail -f /dev/null
      networks:
        - hub-internal

    debian5:
      image: debian:latest
      container_name: hub-debian5
      working_dir: /application
      command: tail -f /dev/null
      networks:
        - hub-internal

networks:
  hub-internal:
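
Since the failure depends on the number of containers, it can help to generate the same file for any count instead of editing it by hand. The following is a minimal sketch, assuming a POSIX shell; the function name gen_compose is made up for illustration, and the service definitions simply repeat the ones in the file above.

```shell
#!/bin/sh
# Print a docker-compose file with N identical idle Debian services.
# gen_compose is a hypothetical helper name; the service and container
# names follow the pattern used in the compose file from the post.
gen_compose() {
    n="$1"
    printf 'version: "3.1"\nservices:\n'
    i=1
    while [ "$i" -le "$n" ]; do
        # The first service is "debian", later ones "debian2", "debian3", ...
        if [ "$i" -eq 1 ]; then suffix=""; else suffix="$i"; fi
        printf '\n    debian%s:\n' "$suffix"
        printf '      image: debian:latest\n'
        printf '      container_name: hub-debian%s\n' "$suffix"
        printf '      working_dir: /application\n'
        printf '      command: tail -f /dev/null\n'
        printf '      networks:\n'
        printf '        - hub-internal\n'
        i=$((i + 1))
    done
    printf '\nnetworks:\n  hub-internal:\n'
}
```

For example, `gen_compose 4 > docker-compose.yml` followed by `docker-compose up -d` gives the four-container case, and `gen_compose 5` the failing one.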



I hope this rings a bell for someone. Thanks!
Hu
Moderator


Joined: 06 Mar 2007
Posts: 14922

Posted: Fri Dec 27, 2019 1:18 am

What is the output of emerge --verbose --info net-misc/dhcpcd? What backtrace is shown in the core file from dhcpcd? Based on the description, my guess is that Docker is creating extra network interfaces for use with the containers, and then overruns some array in dhcpcd by creating more interfaces than expected. Can you reproduce the problem by manually creating an equivalent number of network interfaces? ip link add DEV type veth should let you create extra interfaces from the CLI, assuming you have VETH support enabled in your kernel.
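
A rough sketch of that manual test, assuming a POSIX shell and iproute2. The interface names vtest<i>a / vtest<i>b and the helper names are placeholders invented here. It only creates and raises plain veth pairs; it does not attach them to a bridge or move and rename one end, which the log above shows Docker doing. By default it only prints the commands.

```shell
#!/bin/sh
# Print the iproute2 commands to add or delete N veth pairs.
# vtest<i>a / vtest<i>b are placeholder names chosen for this sketch.
veth_cmds() {
    action="$1"   # "add" or "del"
    n="$2"
    i=1
    while [ "$i" -le "$n" ]; do
        case "$action" in
            add)
                echo "ip link add vtest${i}a type veth peer name vtest${i}b"
                echo "ip link set vtest${i}a up"
                echo "ip link set vtest${i}b up"
                ;;
            del)
                # Deleting one end of a veth pair removes its peer as well.
                echo "ip link del vtest${i}a"
                ;;
        esac
        i=$((i + 1))
    done
}

# DRY_RUN defaults to 1, so nothing is changed unless DRY_RUN=0 is set.
run_veth() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        veth_cmds "$@"
    else
        veth_cmds "$@" | sh -e
    fi
}
```

As root, `DRY_RUN=0 run_veth add 5` creates five pairs; watch the syslog for dhcpcd messages, then clean up with `DRY_RUN=0 run_veth del 5`.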
datafatmunger
n00b


Joined: 02 Apr 2016
Posts: 27

Posted: Fri Dec 27, 2019 8:38 am

Thanks for the reply!

Unfortunately, I was not able to reproduce the issue by manually creating veth "devices". The index number on the devices is suspiciously high, though:

Code:
145: veth5@veth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:d3:78:94:2a:48 brd ff:ff:ff:ff:ff:ff


The emerge --info output for the dhcpcd package is here: http://dpaste.com/0XSA2R2 ... everything should be pretty up-to-date. I did an update of @world just to see if it was maybe an existing bug that had already been addressed.

VETH is compiled into my kernel. Everything docker has worked fine for a long time ... I've just never had so many containers in one project up to this point.

Not sure where to go looking for the core dump file? Perhaps it is going somewhere unexpected. Thoughts?
Hu
Moderator


Joined: 06 Mar 2007
Posts: 14922

Posted: Sat Dec 28, 2019 1:21 am

Check the value of /proc/sys/kernel/core_pattern. If that is relative, check the working directory of dhcpcd (ls -l /proc/"$(pidof dhcpcd)"/cwd). Check that the dhcpcd process would have permission to write to the path named by the core pattern. Check that dhcpcd's core limit size is non-zero (prlimit -p $(pidof dhcpcd) or similar).
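
Those checks can be bundled into one helper. This is only a sketch, assuming a POSIX shell on Linux; core_dump_check and the PROC_ROOT override are names made up here (the override exists just so the function can be pointed at a copy of /proc), and it reads the limit from /proc/PID/limits rather than calling prlimit.

```shell
#!/bin/sh
# Print the settings that decide whether and where a process dumps core.
# PROC_ROOT is a hypothetical override for this sketch; default is /proc.
core_dump_check() {
    pid="$1"
    proc_root="${PROC_ROOT:-/proc}"
    pattern="$(cat "$proc_root/sys/kernel/core_pattern")"
    echo "core_pattern: $pattern"
    case "$pattern" in
        /*)   echo "pattern is absolute" ;;
        "|"*) echo "pattern pipes the core to a helper program" ;;
        *)    echo "pattern is relative to cwd: $(readlink "$proc_root/$pid/cwd")" ;;
    esac
    # Soft and hard core size limits of the process; per the advice
    # above, the limit should be non-zero.
    grep '^Max core file size' "$proc_root/$pid/limits"
}
```

Run it as root, for example `core_dump_check "$(pidof dhcpcd)"`. If pidof prints more than one PID, pass them one at a time. Write permission on the target directory still has to be checked by hand.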
UberLord
Retired Dev


Joined: 18 Sep 2003
Posts: 6771
Location: Blighty

Posted: Mon Dec 30, 2019 3:10 pm

Can you try newer dhcpcd versions? dhcpcd-8.1.2 and dhcpcd-9999 may have fixes in.
Also, try enabling the debug USE flag for newer versions, as that will force dhcpcd to be built with ASAN. It will report very high memory usage, but it will also report where any memory errors are occurring.

I should remind people about dhcpcd-8.1.4 as well.
chrisbdaemon
n00b


Joined: 11 Mar 2020
Posts: 1

Posted: Wed Mar 11, 2020 2:17 pm

Don't mean to raise a thread from the dead, but I just ran across this problem.. and this thread.. upgrading dhcpcd to 8.1.6 solved it for me.
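
A quick way to check whether an installed dhcpcd is at least that version, as a sketch only. It assumes GNU sort (for -V) and assumes the first line of `dhcpcd --version` looks like "dhcpcd 8.1.6"; both helper names are invented here.

```shell
#!/bin/sh
# Succeed if version $1 is greater than or equal to version $2.
# Relies on the -V (version sort) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Extract the version from the first line of `dhcpcd --version`,
# assumed here to have the form "dhcpcd <version>".
installed_dhcpcd_version() {
    dhcpcd --version | head -n 1 | awk '{ print $2 }'
}
```

Usage: `version_ge "$(installed_dhcpcd_version)" 8.1.6 && echo "at or past 8.1.6"`. The 8.1.6 threshold is just the version reported in this post, not a confirmed fix version.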