Gentoo Forums
Docker image takes minutes to load

Dragonlord
Guru

Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

Posted: Fri Sep 24, 2021 10:57 am    Post subject: Docker image takes minutes to load

This problem started some months ago. I suspect that a package update caused it. Basically, I have a Jenkins instance building nightlies using Docker images. The images take longer and longer to start up, causing Jenkins to randomly fail builds due to a Docker timeout. I've restarted the Docker daemon multiple times, but the problem stays the same. It also happens outside Jenkins if I run the image manually in a terminal.

Does anybody else have huge problems with Docker images taking minutes to launch, or never launching at all? Any ideas what the problem could be?

The system is 64-bit Gentoo, fully updated.

app-emulation/docker: Latest version installed: 20.10.7

Some emerge --info:
Code:
Portage 3.0.20 (python 3.9.5-final-0, default/linux/amd64/17.1/hardened, gcc-10.3.0, glibc-2.33-r1, 4.19.66-gentoo x86_64)
=================================================================
System uname: Linux-4.19.66-gentoo-x86_64-Intel-R-_Xeon-R-_CPU_E3-1220_v6_@_3.00GHz-with-glibc2.33
KiB Mem:     7999412 total,    323064 free
KiB Swap:          0 total,         0 free
Timestamp of repository gentoo: Fri, 24 Sep 2021 02:00:02 +0000
Head commit of repository gentoo: d92534c71f93ba103aecee274942b7b2a82e5af7
sh bash 5.1_p8
ld GNU ld (Gentoo 2.35.2 p1) 2.35.2
app-shells/bash:          5.1_p8::gentoo
dev-java/java-config:     2.3.1::gentoo
dev-lang/perl:            5.32.1::gentoo
dev-lang/python:          2.7.18_p10::gentoo, 3.9.5_p2::gentoo
dev-lang/rust:            1.52.1::gentoo
dev-util/cmake:           3.18.5::gentoo
sys-apps/baselayout:      2.7::gentoo
sys-apps/openrc:          0.42.1-r1::gentoo
sys-apps/sandbox:         2.24::gentoo
sys-devel/autoconf:       2.69-r5::gentoo
sys-devel/automake:       1.16.3-r1::gentoo
sys-devel/binutils:       2.35.2::gentoo
sys-devel/gcc:            10.3.0-r1::gentoo
sys-devel/gcc-config:     2.4::gentoo
sys-devel/libtool:        2.4.6-r6::gentoo
sys-devel/make:           4.3::gentoo
sys-kernel/linux-headers: 5.10::gentoo (virtual/os-headers)
sys-libs/glibc:           2.33-r1::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000
    sync-rsync-verify-jobs: 1
    sync-rsync-verify-metamanifest: yes
    sync-rsync-verify-max-age: 24
    sync-rsync-extra-opts:

mva
    location: /var/lib/layman/mva
    masters: gentoo
    priority: 50

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="@FREE dlj-1.1 Oracle-BCLA-JavaSE RAR unRAR freedist linux-firmware no-source-code linux-fw-redistributable"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe -fforce-addr"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt /var/bind"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php7.4/ext-active/ /etc/php/cgi-php7.4/ext-active/ /etc/php/cli-php7.4/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=native -O2 -pipe -fforce-addr"
DISTDIR="/usr/portage/distfiles"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch usersandbox usersync"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="en_US.UTF-8"
LC_ALL="en_US.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="en de"
MAKEOPTS="-j4"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="7zip acl acpi amd64 apache2 authdaemond bash-completion berkdb bzip2 cacert caps cdr cgi cracklib crypt ctype cups dlz dvd dvdr enscript exif expat fam fontconfig foomaticdb ftp gd geoip gif hardened hpijs iconv icq imagemagick imap imlib ipv6 jabber javascript jbig jit jpeg jpeg2k lcsm ldap libcaca libglvnd libtirpc lm_sensors logrotate mbox mime mng mpeg msn multilib mxdatetime mysql ncurses new-hpcups nls nptl ogg openldap openmp openssl oscar pam pcre pdf perl php pie png posix postgres ppds python readline samba sasl scanner seccomp session slang split-usr ssl ssp stream subversion svg theora threads tiff truetype unicode usb vhosts vorbis xattr xml xtpax zlib" ABI_X86="64" ADA_TARGET="gnat_2019" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="actions alias auth_basic auth_digest authn_anon authn_dbd authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock dbd deflate dir disk_cache env expires ext_filter file_cache filter headers ident imagemap include info log_config logio mem_cache mime mime_magic negotiation proxy proxy_ajp proxy_balancer proxy_connect proxy_http rewrite setenvif so speling status unique_id userdir usertrack vhost_alias slotmem_shm authn_core authz_core unixd socache_shmcb slotmem_shm authn_core authz_core unixd socache_shmcb" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="libinput" KERNEL="linux" LCD_DEVICES="bayrad cfontz 
cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-3 php7-4" POSTGRES_TARGETS="postgres12 postgres13" PYTHON_SINGLE_TARGET="python3_9" PYTHON_TARGETS="python3_9" RUBY_TARGETS="ruby26" USERLAND="GNU" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq proto steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, RUSTFLAGS


Example output from Jenkins. The same happens from the console, so it is not a Jenkins problem:
Code:
04:06:35  [Pipeline] withDockerContainer
04:06:35  Jenkins does not seem to be running inside a container
04:06:35  $ docker run -t -d -u 116:998 -w /var/lib/jenkins/home/jobs/dragengine_build_linux_64bit/workspace -v /var/lib/jenkins/home/jobs/dragengine_build_linux_64bit/workspace:/var/lib/jenkins/home/jobs/dragengine_build_linux_64bit/workspace:rw,z -v /var/lib/jenkins/home/jobs/dragengine_build_linux_64bit/workspace@tmp:/var/lib/jenkins/home/jobs/dragengine_build_linux_64bit/workspace@tmp:rw,z -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** compile-ubuntu cat
04:09:35  ERROR: Timeout after 180 seconds

_________________
DragonDreams: Leader and Head Programmer

spica
Apprentice

Joined: 04 Jun 2021
Posts: 282

Posted: Fri Sep 24, 2021 11:16 am

More information is needed; what has been provided is not enough. For now we can only see that the Jenkins plugin did not get a response from the Docker daemon before the timeout, but it is hard to say why. You will need to do some research on your side.
Try running the image manually on the same host (check the Jenkins logs to see whether this is the master or a worker), with arguments as similar as possible to the ones Jenkins uses. Does it get stuck? Can you enter the container?
What shows up in /var/log/docker.log, /var/log/messages and dmesg?
How does the disk behave?
Try cleaning out all the cached images (docker rmi). Can they be removed?
Try pulling the same image once local storage is empty. How long does it take?
Did the problem appear after some change?
Check for orphaned containers, images and networks; sometimes the Docker plugin leaves them running after a job has completed.

Dragonlord
Guru

Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

Posted: Mon Sep 27, 2021 10:38 am

It happened last night again.

Jenkins:
Code:
04:06:19  [Pipeline] withDockerContainer
04:06:20  Jenkins does not seem to be running inside a container
04:06:20  $ docker run -t -d -u 116:998 -w /var/lib/jenkins/home/jobs/dragengine_build_linux_64bit/workspace -v /var/lib/jenkins/home/jobs/dragengine_build_linux_64bit/workspace:/var/lib/jenkins/home/jobs/dragengine_build_linux_64bit/workspace:rw,z -v /var/lib/jenkins/home/jobs/dragengine_build_linux_64bit/workspace@tmp:/var/lib/jenkins/home/jobs/dragengine_build_linux_64bit/workspace@tmp:rw,z -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** compile-ubuntu cat
04:09:21  ERROR: Timeout after 180 seconds


Here is the relevant content of docker.log around time 4:06 -> 4:09:
Code:
time="2021-09-26T04:08:35.347264213+02:00" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers: [nameserver 8.8.8.8 nameserver 8.8.4.4]"
time="2021-09-26T04:08:35.359899609+02:00" level=info msg="IPv6 enabled; Adding default IPv6 external servers: [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]"
time="2021-09-26T04:08:37.607271094+02:00" level=info msg="starting signal loop" namespace=moby path=/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/d7cb9fb6bd305c3251f37d09140068aa6f0156b02592bd269dc2123029902dcc pid=5916
time="2021-09-26T04:15:52.656958955+02:00" level=info msg="Container d7cb9fb6bd305c3251f37d09140068aa6f0156b02592bd269dc2123029902dcc failed to exit within 1 seconds of signal 15 - using the force"
time="2021-09-26T04:15:52.879063574+02:00" level=info msg="ignoring event" container=d7cb9fb6bd305c3251f37d09140068aa6f0156b02592bd269dc2123029902dcc module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
time="2021-09-26T04:15:52.879239831+02:00" level=info msg="shim disconnected" id=d7cb9fb6bd305c3251f37d09140068aa6f0156b02592bd269dc2123029902dcc
time="2021-09-26T04:15:52.889624711+02:00" level=error msg="copy shim log" error="read /proc/self/fd/14: file already closed"
time="2021-09-26T04:15:52.949913831+02:00" level=warning msg="Failed to delete conntrack state for 172.17.0.2: invalid argument"
time="2021-09-26T04:17:15.290366711+02:00" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers: [nameserver 8.8.8.8 nameserver 8.8.4.4]"
time="2021-09-26T04:17:15.303787278+02:00" level=info msg="IPv6 enabled; Adding default IPv6 external servers: [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]"
time="2021-09-26T04:17:15.784504013+02:00" level=info msg="starting signal loop" namespace=moby path=/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/08a549ec43342ca76687e82fcc20d595fc857e7b6bfb19b52131a0e51d207cfb pid=28241
time="2021-09-26T04:27:32.966969763+02:00" level=info msg="Container 08a549ec43342ca76687e82fcc20d595fc857e7b6bfb19b52131a0e51d207cfb failed to exit within 1 seconds of signal 15 - using the force"
time="2021-09-26T04:27:33.226676567+02:00" level=info msg="ignoring event" container=08a549ec43342ca76687e82fcc20d595fc857e7b6bfb19b52131a0e51d207cfb module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
time="2021-09-26T04:27:33.226746673+02:00" level=info msg="shim disconnected" id=08a549ec43342ca76687e82fcc20d595fc857e7b6bfb19b52131a0e51d207cfb
time="2021-09-26T04:27:33.226793902+02:00" level=error msg="copy shim log" error="read /proc/self/fd/14: file already closed"
time="2021-09-26T04:27:33.227969749+02:00" level=warning msg="Failed to delete conntrack state for 172.17.0.2: invalid argument"
time="2021-09-26T04:27:59.655159124+02:00" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers: [nameserver 8.8.8.8 nameserver 8.8.4.4]"
time="2021-09-26T04:27:59.655195430+02:00" level=info msg="IPv6 enabled; Adding default IPv6 external servers: [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]"
time="2021-09-26T04:28:00.539504764+02:00" level=info msg="starting signal loop" namespace=moby path=/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/de310ffecc36dabaa717270d7d3ab07b4e0b4f4c6d7c1884fc59ca726ec8069e pid=14798
time="2021-09-26T04:28:42.249053851+02:00" level=info msg="Container de310ffecc36dabaa717270d7d3ab07b4e0b4f4c6d7c1884fc59ca726ec8069e failed to exit within 1 seconds of signal 15 - using the force"
time="2021-09-26T04:28:42.438215818+02:00" level=info msg="ignoring event" container=de310ffecc36dabaa717270d7d3ab07b4e0b4f4c6d7c1884fc59ca726ec8069e module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
time="2021-09-26T04:28:42.438244456+02:00" level=info msg="shim disconnected" id=de310ffecc36dabaa717270d7d3ab07b4e0b4f4c6d7c1884fc59ca726ec8069e
time="2021-09-26T04:28:42.438293636+02:00" level=error msg="copy shim log" error="read /proc/self/fd/14: file already closed"
time="2021-09-26T04:28:42.439198693+02:00" level=warning msg="Failed to delete conntrack state for 172.17.0.2: invalid argument"


The file /var/log/messages does not exist on my system. I have /var/log/everything/current, though; it contains no relevant data in that time frame.
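One way to narrow a docker.log excerpt like the one above to the interval Jenkins was waiting in is to filter on the time= field. This is only a sketch, assuming the log line format shown above; log_window is a made-up helper name, and it compares HH:MM:SS only, so it ignores the date and does not handle a window that spans midnight.

```shell
# Print only the log lines (read on stdin) whose timestamp falls between
# two HH:MM:SS values, inclusive. Hypothetical helper, not part of Docker.
log_window() {
    awk -v from="$1" -v to="$2" '
        match($0, /T[0-9][0-9]:[0-9][0-9]:[0-9][0-9]/) {
            t = substr($0, RSTART + 1, 8)   # the HH:MM:SS part after the "T"
            if (t >= from && t <= to) print
        }'
}

# Example (log path as used earlier in the thread):
#   log_window 04:06:00 04:10:00 < /var/log/docker.log
```

If both logs describe the same run, the first daemon entry in that window (04:08:35) comes more than two minutes after Jenkins issued docker run at 04:06:20, so the daemon apparently logged nothing during most of the wait.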

Now to answer some questions:

- Try running the image manually on the same host (check the Jenkins logs to see whether this is the master or a worker), with arguments as similar as possible to the ones Jenkins uses. Does it get stuck?
> Sometimes, like with Jenkins

- Can you enter the container?
> Yes, when it does not hang

- Try cleaning out all the cached images (docker rmi). Can they be removed?
> Yes

- Try pulling the same image once local storage is empty. How long does it take?
> It's built from a Dockerfile. The base image, though, downloads at normal speed

- Did the problem appear after some change?
> It is more that it gradually became worse. I do Gentoo updates regularly, so I can't say exactly when the problem started.

- Check for orphaned containers, images and networks; sometimes the Docker plugin leaves them running after a job has completed.
> I see none

Hu
Moderator

Joined: 06 Mar 2007
Posts: 21490

Posted: Mon Sep 27, 2021 4:16 pm

Is there anything relevant in the kernel logs? Do the disk(s) containing these images report any issues in their SMART data?

Dragonlord
Guru

Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

Posted: Mon Oct 18, 2021 11:39 am

Hu wrote:
Is there anything relevant in the kernel logs? Do the disk(s) containing these images report any issues in their SMART data?

No, nothing I can see. A nightly run that worked shows this:
Code:
04:05:55  Jenkins does not seem to be running inside a container
04:05:55  $ docker run -t -d -u 116:998 -w /var/lib/jenkins/home/jobs/dragengine_build_linux_64bit/workspace -v /var/lib/jenkins/home/jobs/dragengine_build_linux_64bit/workspace:/var/lib/jenkins/home/jobs/dragengine_build_linux_64bit/workspace:rw,z -v /var/lib/jenkins/home/jobs/dragengine_build_linux_64bit/workspace@tmp:/var/lib/jenkins/home/jobs/dragengine_build_linux_64bit/workspace@tmp:rw,z -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** compile-ubuntu cat
04:08:39  $ docker top ddb67fad69b4fa37b9ea09b3c182c6a1ff920675da94e76d2d57b27036f2dbc0 -eo pid,comm


It takes nearly 3 minutes to run the image. 3 minutes is the hard-coded Jenkins timeout, so I guess it sometimes goes slightly above 3 minutes and then fails. So it takes pretty much 3 minutes to run an image. I'm out of ideas.
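The margin can be checked with a little arithmetic on the Jenkins timestamps. A small sketch; the helper names to_seconds and elapsed are made up for illustration, and it assumes both timestamps fall on the same day.

```shell
# Convert HH:MM:SS to seconds since midnight (awk reads "08" as decimal 8).
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Seconds elapsed between two HH:MM:SS timestamps on the same day.
elapsed() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

elapsed 04:05:55 04:08:39   # the successful run above
elapsed 04:06:20 04:09:21   # the failed run posted on Sep 27
```

The successful run comes out at 164 seconds and the failed one at 181 seconds, so both sit within a few seconds of the 180-second timeout.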

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 887
Location: Richmond Hill, Canada

Posted: Mon Oct 18, 2021 2:09 pm

More questions:

1. Was the Docker container run inside a VM?
2. Besides the Docker container, was anything else running on the VM/host?
3. Was a backup running when the Docker container started up?
4. In the past, when I ran into this kind of problem, it was usually the storage subsystem having trouble. Do you see any disk I/O errors in dmesg?
5. From the command line (not Jenkins), how long does it take to run
Code:
docker run -it --rm --entrypoint=/bin/bash compile-ubuntu
until you see the bash prompt?
6. How long does it take to run
Code:
docker pull compile-ubuntu

7. After point 5 you will be at the container's bash prompt. Please show us what is running inside the container, i.e.
Code:
ps -ef


8. It seems you only get occasional failures. After each failure, what action(s) (if any) do you take to reset the condition (for example, a reboot)?

Point 5 is to understand whether creating the container takes a very long time.
Point 6 is to verify whether Docker spends time downloading the image.
Point 7 is to understand whether something unexpected is happening inside the container.

Dragonlord
Guru

Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

Posted: Mon Oct 18, 2021 2:35 pm

pingtoo wrote:
More questions:

1. Was the Docker container run inside a VM?
2. Besides the Docker container, was anything else running on the VM/host?
3. Was a backup running when the Docker container started up?
4. In the past, when I ran into this kind of problem, it was usually the storage subsystem having trouble. Do you see any disk I/O errors in dmesg?
5. From the command line (not Jenkins), how long does it take to run
Code:
docker run -it --rm --entrypoint=/bin/bash compile-ubuntu
until you see the bash prompt?
6. How long does it take to run
Code:
docker pull compile-ubuntu

7. After point 5 you will be at the container's bash prompt. Please show us what is running inside the container, i.e.
Code:
ps -ef


8. It seems you only get occasional failures. After each failure, what action(s) (if any) do you take to reset the condition (for example, a reboot)?

Point 5 is to understand whether creating the container takes a very long time.
Point 6 is to verify whether Docker spends time downloading the image.
Point 7 is to understand whether something unexpected is happening inside the container.


1. No
2. No
3. No
4. No
5. Using "time": 391.58 seconds, hence about 6.5 minutes.
6. It's a self-built image, so there is no pulling involved in running it.

7. The result of "ps -ef" is this:
UID        PID  PPID  C STIME TTY          TIME CMD
buildus+     1     0  1 14:30 pts/0    00:00:00 /bin/bash
buildus+     9     1  0 14:30 pts/0    00:00:00 ps -ef

8. None. The failure is due to Jenkins killing the process if it takes more than 3 minutes to produce any output. As seen in the test above, the Docker image would run, but it takes way too long and gets killed.

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 887
Location: Richmond Hill, Canada

Posted: Mon Oct 18, 2021 3:40 pm

Dragonlord wrote:
1. No
2. No
3. No
4. No
5. Using "time": 391.58 seconds, hence about 6.5 minutes.
6. It's a self-built image, so there is no pulling involved in running it.

7. The result of "ps -ef" is this:
UID        PID  PPID  C STIME TTY          TIME CMD
buildus+     1     0  1 14:30 pts/0    00:00:00 /bin/bash
buildus+     9     1  0 14:30 pts/0    00:00:00 ps -ef

8. None. The failure is due to Jenkins killing the process if it takes more than 3 minutes to produce any output. As seen in the test above, the Docker image would run, but it takes way too long and gets killed.


From your answers I assume the container host is a dedicated machine that runs nothing but the container at the moment you run docker run ... --entrypoint=/bin/bash ..., so the load on the machine should be under 1.0 at that moment. Is my assumption correct?

Because I cannot sit in front of the machine to observe it, I have a few more requests to help understand the conditions when Docker creates the container:

1. Run the point 5 command and, in another window, run uptime. Does the system load go up while the container is being created?
2. Is /var/lib/docker on local storage, or is it mounted over the network?
3. What is the content of /etc/docker/daemon.json, if there is such a file?
4. If the system load did not go up during point 1 above, that would indicate the system is waiting for some sort of lock to be released, so maybe try restarting the Docker daemon:
Code:
rc-service docker stop
rc-service docker start

5. Another debugging strategy is to try docker events. This requires two windows: in one, run the docker events command to observe what is going on in dockerd; in the other, run the docker run ... --entrypoint=/bin/bash ... command to create the container.

Point 5 above will help us understand where the time is spent during container creation, so please share the output of docker events.
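To record the load for point 1 rather than eyeball it, the 1-minute figure can be pulled out of the uptime output in a loop while the container starts. A rough sketch; load1 is a hypothetical helper, and the parsing assumes the usual English "load average:" wording in the uptime output.

```shell
# Print the 1-minute load average from a line of uptime output read on stdin.
load1() {
    sed -n 's/.*load average: *\([0-9.]*\),.*/\1/p'
}

# Sample every 5 seconds in a second window while the container starts:
#   while true; do printf '%s %s\n' "$(date +%T)" "$(uptime | load1)"; sleep 5; done
```

The timestamped samples can then be lined up against the docker events output from point 5.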

Dragonlord
Guru

Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

Posted: Mon Oct 18, 2021 8:43 pm

pingtoo wrote:

From your answers I assume the container host is a dedicated machine that runs nothing but the container at the moment you run docker run ... --entrypoint=/bin/bash ..., so the load on the machine should be under 1.0 at that moment. Is my assumption correct?

Because I cannot sit in front of the machine to observe it, I have a few more requests to help understand the conditions when Docker creates the container:

1. Run the point 5 command and, in another window, run uptime. Does the system load go up while the container is being created?
2. Is /var/lib/docker on local storage, or is it mounted over the network?
3. What is the content of /etc/docker/daemon.json, if there is such a file?
4. If the system load did not go up during point 1 above, that would indicate the system is waiting for some sort of lock to be released, so maybe try restarting the Docker daemon:
Code:
rc-service docker stop
rc-service docker start

5. Another debugging strategy is to try docker events. This requires two windows: in one, run the docker events command to observe what is going on in dockerd; in the other, run the docker run ... --entrypoint=/bin/bash ... command to create the container.

Point 5 above will help us understand where the time is spent during container creation, so please share the output of docker events.

It's a server machine. It does things like serving web pages, running a mail server and so on, but the load is low. It is certainly not going to cause Docker to run slowly.
1. Before running the command the load is below 0.05. After starting the image the load slowly climbs to 3.2 (over the course of about a minute) and then stays at that value until the bash prompt arrives.
2. Local directory.
3. This file does not exist.
4. I tried this a couple of times, but the result is the same.

5. This command prints nothing until shortly before the bash prompt arrives:
Code:
2021-10-18T22:40:45.506754269+02:00 container create 2d6bccfeb11ffe78f5d8b1c949be6362460e34e6b52d68e10456b54c18c3e228 (image=compile-ubuntu, name=eloquent_bell)
2021-10-18T22:40:45.593525967+02:00 container attach 2d6bccfeb11ffe78f5d8b1c949be6362460e34e6b52d68e10456b54c18c3e228 (image=compile-ubuntu, name=eloquent_bell)
2021-10-18T22:40:46.690252647+02:00 network connect 3ed2b473424e1d7986e734a7da5466761867c2a280f78d271dd112cb3c729978 (container=2d6bccfeb11ffe78f5d8b1c949be6362460e34e6b52d68e10456b54c18c3e228, name=bridge, type=bridge)
2021-10-18T22:40:50.575914161+02:00 container start 2d6bccfeb11ffe78f5d8b1c949be6362460e34e6b52d68e10456b54c18c3e228 (image=compile-ubuntu, name=eloquent_bell)
2021-10-18T22:40:50.719294588+02:00 container resize 2d6bccfeb11ffe78f5d8b1c949be6362460e34e6b52d68e10456b54c18c3e228 (height=40, image=compile-ubuntu, name=eloquent_bell, width=152)

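The time covered by these events can be worked out from their timestamps. A minimal sketch, assuming lines that begin with an ISO timestamp as in the output above; event_span is a made-up helper name, and it does not handle a run that crosses midnight.

```shell
# Print the seconds between the first and last event read on stdin.
# Expects each line to start with an ISO timestamp, as in the output above.
event_span() {
    awk '
        {
            split($1, dt, "T")        # dt[2] holds the time and UTC offset
            split(dt[2], t, ":")      # t[1] hours, t[2] minutes, t[3] seconds plus offset
            sub(/[+-].*/, "", t[3])   # drop what is left of the UTC offset
            now = t[1] * 3600 + t[2] * 60 + t[3]
            if (NR == 1) first = now
            last = now
        }
        END { if (NR > 0) printf "%.1f\n", last - first }'
}

# Example: docker events > events.log in one window, then
#   event_span < events.log
```

For the five events above this gives about 5.2 seconds, which would suggest that nearly all of the several-minute wait happens before the container create event is emitted.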

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 887
Location: Richmond Hill, Canada

Posted: Tue Oct 19, 2021 1:16 pm

Dragonlord wrote:
5. This command prints nothing until shortly before the bash prompt arrives:
Code:
2021-10-18T22:40:45.506754269+02:00 container create 2d6bccfeb11ffe78f5d8b1c949be6362460e34e6b52d68e10456b54c18c3e228 (image=compile-ubuntu, name=eloquent_bell)
2021-10-18T22:40:45.593525967+02:00 container attach 2d6bccfeb11ffe78f5d8b1c949be6362460e34e6b52d68e10456b54c18c3e228 (image=compile-ubuntu, name=eloquent_bell)
2021-10-18T22:40:46.690252647+02:00 network connect 3ed2b473424e1d7986e734a7da5466761867c2a280f78d271dd112cb3c729978 (container=2d6bccfeb11ffe78f5d8b1c949be6362460e34e6b52d68e10456b54c18c3e228, name=bridge, type=bridge)
2021-10-18T22:40:50.575914161+02:00 container start 2d6bccfeb11ffe78f5d8b1c949be6362460e34e6b52d68e10456b54c18c3e228 (image=compile-ubuntu, name=eloquent_bell)
2021-10-18T22:40:50.719294588+02:00 container resize 2d6bccfeb11ffe78f5d8b1c949be6362460e34e6b52d68e10456b54c18c3e228 (height=40, image=compile-ubuntu, name=eloquent_bell, width=152)


I wonder if you are hitting the libseccomp slowdown issue, somehow amplified by your kernel version.

Since the CPU is busy doing something for docker start, but the time is spent prior to container setup, it is not a network issue. Usually this time is spent in environment setup, which means cgroups/security. The libseccomp slowdown issue mentioned above was resolved in v2.5, so what is your sys-libs/libseccomp version?

I have a guess (not sure whether it will prove anything one way or the other), but if you don't mind, could you try
Code:
docker run -it --rm --entrypoint=/bin/bash --security-opt seccomp=unconfiged compile-ubuntu
If my guess is correct, this should bypass the seccomp setup process, and if you were hitting the libseccomp slowdown issue this should show a much faster startup.

Dragonlord
Guru

Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

Posted: Tue Oct 19, 2021 2:46 pm

pingtoo wrote:
Dragonlord wrote:
5. This command prints nothing until shortly before the bash prompt arrives:
Code:
2021-10-18T22:40:45.506754269+02:00 container create 2d6bccfeb11ffe78f5d8b1c949be6362460e34e6b52d68e10456b54c18c3e228 (image=compile-ubuntu, name=eloquent_bell)
2021-10-18T22:40:45.593525967+02:00 container attach 2d6bccfeb11ffe78f5d8b1c949be6362460e34e6b52d68e10456b54c18c3e228 (image=compile-ubuntu, name=eloquent_bell)
2021-10-18T22:40:46.690252647+02:00 network connect 3ed2b473424e1d7986e734a7da5466761867c2a280f78d271dd112cb3c729978 (container=2d6bccfeb11ffe78f5d8b1c949be6362460e34e6b52d68e10456b54c18c3e228, name=bridge, type=bridge)
2021-10-18T22:40:50.575914161+02:00 container start 2d6bccfeb11ffe78f5d8b1c949be6362460e34e6b52d68e10456b54c18c3e228 (image=compile-ubuntu, name=eloquent_bell)
2021-10-18T22:40:50.719294588+02:00 container resize 2d6bccfeb11ffe78f5d8b1c949be6362460e34e6b52d68e10456b54c18c3e228 (height=40, image=compile-ubuntu, name=eloquent_bell, width=152)


I wonder if you are hitting the libseccomp slowdown issue, somehow amplified by your kernel version.

Since the CPU is busy doing something for docker start, but the time is spent prior to container setup, it is not a network issue. Usually this time is spent in environment setup, which means cgroups/security. The libseccomp slowdown issue mentioned above was resolved in v2.5, so what is your sys-libs/libseccomp version?

I have a guess (not sure whether it will prove anything one way or the other), but if you don't mind, could you try
Code:
docker run -it --rm --entrypoint=/bin/bash --security-opt seccomp=unconfiged compile-ubuntu
If my guess is correct, this should bypass the seccomp setup process, and if you were hitting the libseccomp slowdown issue this should show a much faster startup.

"unconfiged" is not a known profile. Some googling suggested "unconfined", so I tried that one. The time spent running the Docker image did not change; it now took 414.87 seconds. libseccomp on my system is version 2.5.1.

spica
Apprentice

Joined: 04 Jun 2021
Posts: 282

Posted: Tue Oct 19, 2021 4:01 pm

I guess the problem is either with resources or with the image itself.

Could you please check how long a container takes to start from some other large, publicly available image (please pull the image before the test)?
For example, selenium/standalone-firefox:4.0.0-20211013 is a big image with a lot of layers:
Code:
docker pull selenium/standalone-firefox:4.0.0-20211013
time docker run --rm -ti --entrypoint true selenium/standalone-firefox:4.0.0-20211013

On my laptop the result is 0m2.928s.
If it takes the same long time, the problem is related to resources such as storage (if this is EC2, it is worth checking IOPS credit usage for EBS), CPU, memory or network.
If it does not take long to start, then the problem is with the image and, if possible, we need details about the compile-ubuntu image. Is it publicly available? Can we see the Dockerfile, or get the image to try to reproduce this locally?
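Since the slowdown was reported as intermittent, it may help to repeat the measurement several times and compare the numbers. A minimal sketch; time_cmd is a made-up helper with one-second resolution, and it discards the command's own output.

```shell
# Run a command, discard its output, and print the elapsed wall-clock seconds.
time_cmd() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $(( end - start ))
}

# Example: five startups in a row (image name taken from this thread):
#   for i in 1 2 3 4 5; do
#       time_cmd docker run --rm --entrypoint true compile-ubuntu
#   done
```

Running the same loop against the public image gives a direct comparison between the two.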

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 887
Location: Richmond Hill, Canada

Posted: Tue Oct 19, 2021 4:34 pm

Dragonlord wrote:
"unconfiged" is not a known profile. Some googling suggests "unconfined", so I tried that one. The time spent starting the docker image did not change: it now took 414.87 seconds. libseccomp on my system is version 2.5.1.


Sorry, you are correct, it should be unconfined. Looking at my notes, it is unconfined there too; not sure what I was typing this morning :-)

I am about to run out of ideas. My guess is that this problem may have to do with your docker version versus your kernel version. Was your docker version lower than v20.x before this problem appeared? Could it have been v19.x?

If you wish to continue, we can try debugging in the daemon. Please use three sessions: one for running dockerd in debug mode, one for docker events, and the last one to run docker run ...

dockerd in debug mode
Code:
## stop the running service first: rc-service docker stop
## then, as root, start the daemon in the foreground with debug logging
dockerd -D -l debug


docker events session
Code:
## after dockerd has started; no need to be root
docker events


docker run session
Code:
## after dockerd has started; no need to be root
docker run -it --rm --entrypoint=/bin/bash compile-ubuntu


Please log the output of the three sessions above and share it with us (here or on any pastebin).

And for completeness: have you tried other images and got the same slow result?
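
The events and run sessions can also be combined in one terminal, which keeps the event timestamps next to the run they belong to. A rough sketch only (the dockerd debug session still has to run separately); `trace_start` and the `DOCKER` override are hypothetical names, not docker features.

```shell
# Assumption: DOCKER may be overridden; defaults to the real CLI.
DOCKER="${DOCKER:-docker}"

# trace_start IMAGE LOGFILE: record daemon events while one container starts.
trace_start() {
  image=$1
  log=$2
  # stream events into the log file in the background
  "$DOCKER" events > "$log" 2>&1 &
  events_pid=$!
  sleep 1   # give the event stream a moment to attach
  status=0
  "$DOCKER" run --rm --entrypoint true "$image" || status=$?
  # stop the event stream again and pass on the exit status of the run
  kill "$events_pid" 2>/dev/null || true
  wait "$events_pid" 2>/dev/null || true
  return "$status"
}
```

Something like `trace_start compile-ubuntu /tmp/events.log` would leave the timestamped events of that single start in /tmp/events.log.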
Dragonlord
Guru
Guru


Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

PostPosted: Tue Oct 19, 2021 6:23 pm    Post subject: Reply with quote

spica wrote:
I guess the problem is either with resources or with the image itself.

Could you please check how long it takes to get a bash prompt with some other huge, publicly available image? (Please pull the image before the test.)
For example, selenium/standalone-firefox:4.0.0-20211013 is a big image with a lot of layers:
Code:
docker pull selenium/standalone-firefox:4.0.0-20211013
time docker run --rm -ti --entrypoint true selenium/standalone-firefox:4.0.0-20211013

On my laptop the result is 0m2.928s.
If it takes similarly long on your machine, the problem is related to resources such as storage (if this is EC2, it is worth checking IOPS credit usage for EBS), CPU, memory, or network.
If it does not take long to start, then the problem is with the image, and if possible we need details about the compile-ubuntu image. Is it publicly available? Can we see the Dockerfile, or get the image to try to reproduce this locally?

Running the image reports 0m33.016s
Dragonlord
Guru
Guru


Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

PostPosted: Tue Oct 19, 2021 6:38 pm    Post subject: Reply with quote

pingtoo wrote:
Please log the output of the three sessions above and share it with us (here or on any pastebin).

And for completeness: have you tried other images and got the same slow result?


Here it is https://pastebin.com/zePBA85Z . For the other image see the post above

For the kernel, here is some uname output for you:
Code:
Linux server 4.19.66-gentoo #1 SMP Sun Aug 18 21:34:18 CEST 2019 x86_64 Intel(R) Xeon(R) CPU E3-1220 v6 @ 3.00GHz GenuineIntel GNU/Linux

Last edited by Dragonlord on Tue Oct 19, 2021 7:19 pm; edited 1 time in total
pingtoo
l33t
l33t


Joined: 10 Sep 2021
Posts: 887
Location: Richmond Hill, Canada

PostPosted: Tue Oct 19, 2021 7:10 pm    Post subject: Reply with quote

Dragonlord wrote:
Here it is https://pastebin.com/zePBA85Z. For the other image see the post above


I am not familiar with pastebin.com; using the URL, I got "page not found". Do I need to set something up in order to access it?
pingtoo
l33t
l33t


Joined: 10 Sep 2021
Posts: 887
Location: Richmond Hill, Canada

PostPosted: Tue Oct 19, 2021 7:19 pm    Post subject: Reply with quote

Dragonlord wrote:
Running the image reports 0m33.016s


spica's idea gives me something to think about. Can you paste the output of the following here?
Code:
docker info
I just want to confirm the exact settings of dockerd; I may be working under a wrong assumption about how your docker environment is set up.

Also, how old is the compile-ubuntu image? Maybe rebuilding the image would help.
Dragonlord
Guru
Guru


Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

PostPosted: Tue Oct 19, 2021 7:20 pm    Post subject: Reply with quote

pingtoo wrote:
Dragonlord wrote:
Here it is https://pastebin.com/zePBA85Z. For the other image see the post above


I am not familiar with pastebin.com; using the URL, I got "page not found". Do I need to set something up in order to access it?

Forum problem. It considered the period part of the URL. I fixed the post.
Dragonlord
Guru
Guru


Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

PostPosted: Tue Oct 19, 2021 7:21 pm    Post subject: Reply with quote

pingtoo wrote:
spica's idea gives me something to think about. Can you paste the output of the following here?
Code:
docker info
I just want to confirm the exact settings of dockerd; I may be working under a wrong assumption about how your docker environment is set up.

Also, how old is the compile-ubuntu image? Maybe rebuilding the image would help.


I rebuilt the image some weeks ago, so it is pretty new.

Here is the info:
Code:
root@server:/home/roland> docker info
Client:
 Context:    default
 Debug Mode: false

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 4
 Server Version: 20.10.7
 Storage Driver: vfs
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: d71fcd7d8303cbf684402823e425e9dd2e99285d
 runc version: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 init version: de40ad007797e0dcd8b7126f27bb87401d224240
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.19.66-gentoo
 Operating System: Gentoo/Linux
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 7.629GiB
 Name: server
 ID: N4GM:6XXG:7CHH:3LCD:G3MP:VZ7P:OOC3:LQI7:NF3U:CID6:QZPU:LTKS
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support
WARNING: No blkio throttle.read_bps_device support
WARNING: No blkio throttle.write_bps_device support
WARNING: No blkio throttle.read_iops_device support
WARNING: No blkio throttle.write_iops_device support

szatox
Advocate
Advocate


Joined: 27 Aug 2013
Posts: 3104

PostPosted: Tue Oct 19, 2021 8:05 pm    Post subject: Reply with quote

Quote:
This problem started to happen some month ago. I have the suspicion that some update on packages causes this. Basically I've got a Jenkins running building nightlies using docker images. Docker images take longer and longer to start up causing Jenkins to randomly fail builds due to docker timeout.

I've seen this behavior caused by loads and loads of cruft accumulated over running/building loads of containers. This particular issue is easily fixed with "docker system prune"

No, I'm not sure it is going to solve your problem. It's easy enough to try though.
Warning: this will delete everything you're not using at the moment, so make sure the containers you care for either are running or can be pulled again.
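
To see how much there is to reclaim, `docker system df` can be run before and after the prune. A small sketch; `prune_report` and the `DOCKER` override are names invented for this example. As far as I know, plain `docker system prune` asks for confirmation and leaves tagged images alone unless `-a` is given, but please check `docker system prune --help` for your version.

```shell
# Assumption: DOCKER may be overridden; defaults to the real CLI.
DOCKER="${DOCKER:-docker}"

# prune_report: show disk usage, prune unused data, then show disk usage again.
prune_report() {
  "$DOCKER" system df || return 1
  # no --force here on purpose, so docker still asks before deleting anything
  "$DOCKER" system prune || return 1
  "$DOCKER" system df
}
```

Comparing the two `system df` tables shows whether accumulated cruft was a factor at all.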
pingtoo
l33t
l33t


Joined: 10 Sep 2021
Posts: 887
Location: Richmond Hill, Canada

PostPosted: Tue Oct 19, 2021 8:09 pm    Post subject: Reply with quote

Dragonlord wrote:
Here the info:
Code:
 Storage Driver: vfs  <------------------ This surprises me, why use *vfs*?


Although vfs may not be the culprit, it is known to be slow, so I suggest you switch to overlay2.
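
In case it helps, the storage driver is normally selected through the `storage-driver` key in /etc/docker/daemon.json. The sketch below rests on assumptions about your setup (default config path, no existing daemon.json), and `write_daemon_json` and `overlay_supported` are hypothetical helper names. Note that images stored under vfs are not visible after the switch and have to be pulled or built again.

```shell
# write_daemon_json [PATH]: write a minimal daemon.json that selects overlay2.
# Refuses to overwrite an existing file, since it may hold other settings.
write_daemon_json() {
  target=${1:-/etc/docker/daemon.json}
  if [ -e "$target" ]; then
    echo "refusing to overwrite existing $target" >&2
    return 1
  fi
  printf '{\n  "storage-driver": "overlay2"\n}\n' > "$target"
}

# overlay_supported [FILE]: succeed if the kernel currently lists overlay
# as a known filesystem (FILE defaults to /proc/filesystems).
overlay_supported() {
  grep -qw overlay "${1:-/proc/filesystems}"
}
```

After writing the file, restart the daemon (rc-service docker restart) and check that docker info reports overlay2 as the storage driver. If `overlay_supported` fails, the overlay module may simply not be loaded yet, or the kernel may have been built without CONFIG_OVERLAY_FS.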

What is the output of
Code:
docker history compile-ubuntu
Let's see how many layers this image has.
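
If the full listing is too noisy, the history can be reduced to two numbers. This is a sketch using the `--format` option of `docker history`; `layer_summary` and the `DOCKER` override are names I made up for illustration.

```shell
# Assumption: DOCKER may be overridden; defaults to the real CLI.
DOCKER="${DOCKER:-docker}"

# layer_summary IMAGE: count history entries, and how many of them have a
# non-zero size (0B entries are metadata-only steps such as ENV or CMD).
layer_summary() {
  "$DOCKER" history --format '{{.Size}}' "$1" |
    awk '{ total++; if ($0 != "0B") nonempty++ }
         END { printf "entries=%d nonempty=%d\n", total, nonempty }'
}
```

Calling `layer_summary compile-ubuntu` prints a single line of the form `entries=N nonempty=M`.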
spica
Apprentice
Apprentice


Joined: 04 Jun 2021
Posts: 282

PostPosted: Tue Oct 19, 2021 8:27 pm    Post subject: Reply with quote

Dragonlord wrote:
Running the image reports 0m33.016s
This is too much... I may be wrong, but I think this is a problem somewhere on the host... It's not Jenkins either. Docker has an experimental flag, --squash (doc), which melts layers together; this can help to make the compile-ubuntu image a bit smaller. This will not solve the problem, only postpone it. When a container starts, there should be a lot of file operations, which are copy-on-write under the hood... There might be a swapping problem, but according to the previous logs swap is turned off.
The "top" command shows several CPU counters: us, sy, id, wa, hi, si, st... It would be interesting to know the values of "wa" (CPU time spent waiting for I/O) and "st" (CPU time stolen by the hypervisor) while the container starts.
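
The same two counters can be read without watching top, by sampling the aggregate cpu line of /proc/stat before and after the container start. This is a sketch that assumes the usual Linux field order (user nice system idle iowait irq softirq steal ...); `cpu_sample` and `cpu_wait_pct` are illustrative names only.

```shell
# cpu_sample: print the aggregate "cpu" line of /proc/stat.
cpu_sample() {
  grep '^cpu ' /proc/stat
}

# cpu_wait_pct BEFORE AFTER: percentage of CPU time spent in iowait (wa) and
# steal (st) between two samples. Only fields 2-9 are summed, because guest
# time is already included in user time.
cpu_wait_pct() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    NR == 1 { for (i = 2; i <= 9; i++) before[i] = $i }
    NR == 2 {
      total = 0
      for (i = 2; i <= 9; i++) { delta[i] = $i - before[i]; total += delta[i] }
      if (total == 0) { print "wa=0.0 st=0.0"; exit }
      printf "wa=%.1f st=%.1f\n", 100 * delta[6] / total, 100 * delta[9] / total
    }'
}
```

Usage would be along the lines of: store `before=$(cpu_sample)`, start the container, store `after=$(cpu_sample)`, then run `cpu_wait_pct "$before" "$after"`. A high wa value during startup would point at storage rather than CPU.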
I have never used vfs; this is the first time I have seen a docker daemon using it. I use overlay2 only.
Dragonlord
Guru
Guru


Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

PostPosted: Wed Oct 20, 2021 5:45 pm    Post subject: Reply with quote

pingtoo wrote:
Although vfs may not be the culprit, it is known to be slow, so I suggest you switch to overlay2.

What is the output of
Code:
docker history compile-ubuntu
Let's see how many layers this image has.


This is the output:

Code:
roland@server:~> docker history compile-ubuntu
IMAGE          CREATED         CREATED BY                                      SIZE      COMMENT
b62eade0ec9b   5 weeks ago     /bin/sh -c #(nop)  ENV HOME=/home/builduser     0B       
<missing>      5 weeks ago     /bin/sh -c #(nop)  USER builduser               0B       
<missing>      5 weeks ago     /bin/sh -c export UNAME=$UNAME UID=1000 GID=…   3.73kB   
<missing>      5 weeks ago     /bin/sh -c apt update  && apt -y install bui…   2.44GB   
<missing>      3 months ago    /bin/sh -c #(nop)  ENV UNAME=builduser          0B       
<missing>      3 months ago    /bin/sh -c #(nop)  ENV DEBIAN_FRONTEND=nonin…   0B       
<missing>      12 months ago   /bin/sh -c #(nop)  CMD ["/bin/bash"]            0B       
<missing>      12 months ago   /bin/sh -c mkdir -p /run/systemd && echo 'do…   7B       
<missing>      12 months ago   /bin/sh -c [ -z "$(apt-get indextargets)" ]     0B       
<missing>      12 months ago   /bin/sh -c set -xe   && echo '#!/bin/sh' > /…   811B     
<missing>      12 months ago   /bin/sh -c #(nop) ADD file:435d9776fdd3a1834…   72.9MB   

Dragonlord
Guru
Guru


Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

PostPosted: Wed Oct 20, 2021 5:48 pm    Post subject: Reply with quote

spica wrote:
Dragonlord wrote:
Running the image reports 0m33.016s
This is too much... I may be wrong, but I think this is a problem somewhere on the host... It's not Jenkins either. Docker has an experimental flag, --squash (doc), which melts layers together; this can help to make the compile-ubuntu image a bit smaller. This will not solve the problem, only postpone it. When a container starts, there should be a lot of file operations, which are copy-on-write under the hood... There might be a swapping problem, but according to the previous logs swap is turned off.
The "top" command shows several CPU counters: us, sy, id, wa, hi, si, st... It would be interesting to know the values of "wa" (CPU time spent waiting for I/O) and "st" (CPU time stolen by the hypervisor) while the container starts.
I have never used vfs; this is the first time I have seen a docker daemon using it. I use overlay2 only.

I never configured anything, so this is the Gentoo default value, wherever it originated from. I do not mind changing that value if it helps. How do you change it?
Dragonlord
Guru
Guru


Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

PostPosted: Wed Oct 20, 2021 5:51 pm    Post subject: Reply with quote

This docker thing is quite volatile. This time around the figures are like this:
Code:
real 137.85
user 0.05
sys 0.00


So docker randomly takes between 120s and 600s to start. I find this quite a large amount.
All times are GMT