Gentoo Forums

Docker image takes minutes to load

Dragonlord
Guru

Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

Posted: Wed Oct 20, 2021 5:55 pm

spica wrote:
Dragonlord wrote:
Running the image reports 0m33.016s
This is too much... I may be wrong, but I think the problem is somewhere on the host... It's not Jenkins either. Docker has an experimental flag --squash (doc) which merges layers together; this can help make the compile-ubuntu image a bit smaller. It will not solve the problem, only postpone it. When a container starts there are a lot of file operations, which are copy-on-write under the hood... There might be a swapping problem, but according to the earlier logs swap is turned off.
The "top" command shows several CPU counters: us, sy, id, wa, hi, si, st... It would be interesting to know, when the container starts, the values of "wa" (CPU cycles waiting for IO) and "st" (CPU cycles stolen by the hypervisor).
I have never used vfs; this is the first time I have seen a docker daemon using it. I use overlay2 only.

"wa" is constantly above 50 sometimes going up to 70. "st" is constantly 0.
_________________
DragonDreams: Leader and Head Programmer

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Wed Oct 20, 2021 6:45 pm

First, I think I owe you an apology: I thought I knew the problem, but I was heading in the wrong direction, which led to a lot of time wasted on invalid checks. So I am very sorry for taking up so much of your time.

Thanks to spica for pointing out that we should test with another image, which led to the discovery that vfs is in use. I am now much more confident that the problem is caused by heavy I/O during container creation; your dockerd -D -l debug output also shows a long delay between the client submitting the configuration and the container being created. In your case vfs copies your 2+ GB image to create the container's r/w layer, and that is what causes the delay. Is /var/lib/docker by any chance on an SSD? If it is, you may need to trim the file system; the large variation in start times led me to this guess.
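
(A manual trim would look something like this; a minimal sketch assuming /var/lib/docker sits on the root file system:)
Code:
# discard unused blocks on the file system holding /var/lib/docker (SSDs only)
fstrim -v /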

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Wed Oct 20, 2021 7:04 pm

In order to switch to overlay2 we may need several steps:

  1. Verify that the kernel is configured with overlayfs support. Please use check-config.sh to check (see the sketch after this list).
  2. Verify that app-emulation/docker is built with the overlay flag; if not, first make sure step 1 is complete, then rebuild app-emulation/docker.
  3. To migrate the existing docker images stored on vfs, we will need to back up and delete /var/lib/docker, then restore the docker images from the backup. This is another multi-step procedure; we can work out the details once we get there.
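
(A minimal sketch of step 1, assuming the script is fetched from the moby repository and the running kernel exposes its config:)
Code:
# fetch the upstream config checker and look at the overlay lines
wget https://raw.githubusercontent.com/moby/moby/master/contrib/check-config.sh
bash check-config.sh | grep -i overlay

# or inspect the running kernel's config directly
zgrep CONFIG_OVERLAY_FS /proc/config.gz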

Dragonlord
Guru

Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

Posted: Wed Oct 20, 2021 9:03 pm

I don't have overlay support in the kernel. I need to recompile the kernel, but that is not going to happen today. /var/lib/docker I can fully delete: all images I use are built from my own docker files, so that is not a problem.

EDIT: And by the way, no problem about the wasted time. I'm a professional developer; I've hunted the cause of certain bugs for far longer than this ;)
_________________
DragonDreams: Leader and Head Programmer

spica
Apprentice

Joined: 04 Jun 2021
Posts: 287

Posted: Wed Oct 20, 2021 11:14 pm

pingtoo, everything is fine, no problem. We followed the theory of investigation together, which says that to find a problem in complicated software we need to keep eliminating potential problem areas; if only one possible place is left, we can say we have localized where the problem is. Docker/Moby is a complicated project, and the investigation takes a lot of time.

Dragonlord, optionally, you can check whether it is possible to rebuild compile-ubuntu with the "--squash" option. This will not solve the problem, but it can help win some additional time for running the build pipeline without interruptions while we investigate what caused this bad IO performance.

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Thu Oct 21, 2021 11:56 am

Dragonlord wrote:
I don't have overlay support in the kernel. I need to recompile the kernel, but that is not going to happen today. /var/lib/docker I can fully delete: all images I use are built from my own docker files, so that is not a problem.


Deleting /var/lib/docker can be done later, once the kernel and docker are updated.

If /var/lib/docker is on an SSD, it is possible the file system needs a trim, which could be a quick workaround.

Thanks for understanding.

Dragonlord
Guru

Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

Posted: Thu Oct 21, 2021 7:22 pm

@spica: I can try the --squash version. Is this an option for the docker build command or inside the dockerfile somewhere?

@pingtoo: There is no SSD in the server.
_________________
DragonDreams: Leader and Head Programmer

spica
Apprentice

Joined: 04 Jun 2021
Posts: 287

Posted: Thu Oct 21, 2021 7:58 pm

Dragonlord wrote:
@spica: I can try the --squash version. Is this an option for the docker build command or inside the dockerfile somewhere?

It's an option for the docker build command; please see the documentation here: https://docs.docker.com/engine/reference/commandline/image_build/
It merges the new layers together, so the resulting image is smaller -> fewer read/write operations to disk -> less IO -> the container starts faster. It is not a solution, but an attempt to minimize the symptoms.
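
(A minimal invocation might look like this; a sketch assuming the Dockerfile sits in the current directory and that the daemon has experimental features enabled, e.g. "experimental": true in /etc/docker/daemon.json:)
Code:
docker build --squash -t compile-ubuntu .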
For now, I think the disk is the bottleneck, and we need to investigate why.
What is that disk physically? I have seen numbers like 70 wa before when an MMC card was used as a disk, and I have also observed them at cloud providers, when a block storage volume had exhausted its IO-per-second credits and the provider throttled bandwidth to the disk.

Dragonlord
Guru

Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

Posted: Fri Oct 22, 2021 9:25 am

spica wrote:
Dragonlord wrote:
@spica: I can try the --squash version. Is this an option for the docker build command or inside the dockerfile somewhere?

It's an option for the docker build command; please see the documentation here: https://docs.docker.com/engine/reference/commandline/image_build/
It merges the new layers together, so the resulting image is smaller -> fewer read/write operations to disk -> less IO -> the container starts faster. It is not a solution, but an attempt to minimize the symptoms.
For now, I think the disk is the bottleneck, and we need to investigate why.
What is that disk physically? I have seen numbers like 70 wa before when an MMC card was used as a disk, and I have also observed them at cloud providers, when a block storage volume had exhausted its IO-per-second credits and the provider throttled bandwidth to the disk.

The main disk is a "TOSHIBA MG04ACA1", so a classic spinning disk. But it is the same disk that worked fine before: once upon a time this docker image started on the same hardware in 3-5 seconds, and now it takes 3-5 minutes, so I'm pretty sure it's a software problem.
_________________
DragonDreams: Leader and Head Programmer

spica
Apprentice

Joined: 04 Jun 2021
Posts: 287

Posted: Fri Oct 22, 2021 11:29 am

Dragonlord wrote:
I'm pretty sure it's a software problem.

It can be a software problem. For example, the disk IO may be occupied by some other process, leaving docker behind.
You can use tools like iotop to find the IO consumers and dd to measure the disk performance.
Code:
dd if=/dev/zero of=/var/lib/docker/test.img bs=1G count=1 oflag=direct status=progress

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Fri Oct 22, 2021 2:06 pm

spica wrote:
Dragonlord wrote:
I'm pretty sure it's a software problem.

It can be a software problem. For example, the disk IO may be occupied by some other process, leaving docker behind.
You can use tools like iotop to find the IO consumers and dd to measure the disk performance.
Code:
dd if=/dev/zero of=/var/lib/docker/test.img bs=1G count=1 oflag=direct status=progress


@Dragonlord, by "software" I guess you mean either the linux kernel I/O subsystem or the docker graphdriver (storage). I can assure you no other software is invoked during container creation. Review your pastebin output, lines 346-347: that is the span from the client submitting the request until the container r/w layer is ready and mounted, about 2.2 minutes, and that is where the vfs driver copies the parent layer (the image) to create the container (r/w) layer. With the vfs storage driver there is no compress/decompress step once the image has been pulled into local storage.

Off the top of my head I don't recall how to discover the image layer location on disk, but maybe you can run
Code:
docker inspect compile-ubuntu
which might reveal a location we can later test-copy to see how long it takes. vfs copies file by file from the parent layer to the container layer, much like a tar copy preserving all file attributes. So I think in your case compile-ubuntu has a very large number of files, and that takes a large amount of I/O time.

Dragonlord
Guru

Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

Posted: Fri Oct 22, 2021 4:22 pm

spica wrote:
pingtoo, everything is fine, no problem. We followed the theory of investigation together, which says that to find a problem in complicated software we need to keep eliminating potential problem areas; if only one possible place is left, we can say we have localized where the problem is. Docker/Moby is a complicated project, and the investigation takes a lot of time.

Dragonlord, optionally, you can check whether it is possible to rebuild compile-ubuntu with the "--squash" option. This will not solve the problem, but it can help win some additional time for running the build pipeline without interruptions while we investigate what caused this bad IO performance.

The --squash command line option is not working; it tells me I need an experimental docker daemon for this.
_________________
DragonDreams: Leader and Head Programmer

Dragonlord
Guru

Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

Posted: Fri Oct 22, 2021 4:24 pm

pingtoo wrote:
spica wrote:
Dragonlord wrote:
I'm pretty sure it's a software problem.

It can be a software problem. For example, the disk IO may be occupied by some other process, leaving docker behind.
You can use tools like iotop to find the IO consumers and dd to measure the disk performance.
Code:
dd if=/dev/zero of=/var/lib/docker/test.img bs=1G count=1 oflag=direct status=progress


@Dragonlord, by "software" I guess you mean either the linux kernel I/O subsystem or the docker graphdriver (storage). I can assure you no other software is invoked during container creation. Review your pastebin output, lines 346-347: that is the span from the client submitting the request until the container r/w layer is ready and mounted, about 2.2 minutes, and that is where the vfs driver copies the parent layer (the image) to create the container (r/w) layer. With the vfs storage driver there is no compress/decompress step once the image has been pulled into local storage.

Off the top of my head I don't recall how to discover the image layer location on disk, but maybe you can run
Code:
docker inspect compile-ubuntu
which might reveal a location we can later test-copy to see how long it takes. vfs copies file by file from the parent layer to the container layer, much like a tar copy preserving all file attributes. So I think in your case compile-ubuntu has a very large number of files, and that takes a large amount of I/O time.

Here is the output of the tool:
Code:
roland@server:~> docker inspect compile-ubuntu
[
    {
        "Id": "sha256:b62eade0ec9bad8e24fbda24dd3087f78485d5eae661f90c24fd100ee527efc6",
        "RepoTags": [
            "compile-ubuntu:latest"
        ],
        "RepoDigests": [],
        "Parent": "",
        "Comment": "",
        "Created": "2021-09-09T12:51:08.492542893Z",
        "Container": "7c16cd51536a2a94eb5bcde3a0ac3a81735b7876ae5f2288409d039ac029edb2",
        "ContainerConfig": {
            "Hostname": "7c16cd51536a",
            "Domainname": "",
            "User": "builduser",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "DEBIAN_FRONTEND=noninteractive",
                "UNAME=builduser",
                "HOME=/home/builduser"
            ],
            "Cmd": [
                "/bin/sh",
                "-c",
                "#(nop) ",
                "ENV HOME=/home/builduser"
            ],
            "ArgsEscaped": true,
            "Image": "sha256:ae9fa2c85eabcb00fd91d18f5e1b2f4f3f285ea3bbb330202c53e264ba7edea9",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": {}
        },
        "DockerVersion": "20.10.7",
        "Author": "",
        "Config": {
            "Hostname": "",
            "Domainname": "",
            "User": "builduser",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "DEBIAN_FRONTEND=noninteractive",
                "UNAME=builduser",
                "HOME=/home/builduser"
            ],
            "Cmd": [
                "/bin/bash"
            ],
            "ArgsEscaped": true,
            "Image": "sha256:ae9fa2c85eabcb00fd91d18f5e1b2f4f3f285ea3bbb330202c53e264ba7edea9",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": null
        },
        "Architecture": "amd64",
        "Os": "linux",
        "Size": 2512724112,
        "VirtualSize": 2512724112,
        "GraphDriver": {
            "Data": null,
            "Name": "vfs"
        },
        "RootFS": {
            "Type": "layers",
            "Layers": [
                "sha256:47dde53750b4a8ed24acebe52cf31ad131e73a9611048fc2f92c9b6274ab4bf3",
                "sha256:0c2689e3f9206b1c4adfb16a1976d25bd270755e734588409b31ef29e3e756d6",
                "sha256:cc9d18e90faad04bc3893cfaa50b7846ee75f48f5b8377a213fa52af2189095c",
                "sha256:8dfbe290015bce2e62091994f432b0ca6fec7302973e9bf1594ae71e636dbf8b",
                "sha256:f1800cff1007ee09cbdcc06302b4e747107e5c874aa725518089aec1111e6ba0"
            ]
        },
        "Metadata": {
            "LastTagTime": "0001-01-01T00:00:00Z"
        }
    }
]

_________________
DragonDreams: Leader and Head Programmer

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Fri Oct 22, 2021 5:23 pm

Dragonlord wrote:
Code:
        "RootFS": {
            "Type": "layers",
            "Layers": [
                "sha256:47dde53750b4a8ed24acebe52cf31ad131e73a9611048fc2f92c9b6274ab4bf3",
                "sha256:0c2689e3f9206b1c4adfb16a1976d25bd270755e734588409b31ef29e3e756d6",
                "sha256:cc9d18e90faad04bc3893cfaa50b7846ee75f48f5b8377a213fa52af2189095c",
                "sha256:8dfbe290015bce2e62091994f432b0ca6fec7302973e9bf1594ae71e636dbf8b",
                "sha256:f1800cff1007ee09cbdcc06302b4e747107e5c874aa725518089aec1111e6ba0"
            ]
        },
If you cd into /var/lib/docker/vfs/dir/ and then run
Code:
du -xsh ./*
you should be able to find some directories that are around 2 GB or more. We can try copying one of the 2 GB+ directories and see how long the copy takes.

The test copy is intended to find out how much time the storage needs, so the read and write should be on the same disk, to mimic the docker vfs driver creating a container layer.
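
(A minimal sketch of the test copy, with <layer-id> standing in for whichever 2 GB+ directory du reports; not a literal command from this thread:)
Code:
cd /var/lib/docker/vfs/dir
# copy one layer directory on the same disk, preserving attributes like vfs does
time cp -a <layer-id> ../test-copy
rm -rf ../test-copy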

Dragonlord
Guru

Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

Posted: Fri Oct 22, 2021 8:46 pm

pingtoo wrote:
Dragonlord wrote:
Code:
        "RootFS": {
            "Type": "layers",
            "Layers": [
                "sha256:47dde53750b4a8ed24acebe52cf31ad131e73a9611048fc2f92c9b6274ab4bf3",
                "sha256:0c2689e3f9206b1c4adfb16a1976d25bd270755e734588409b31ef29e3e756d6",
                "sha256:cc9d18e90faad04bc3893cfaa50b7846ee75f48f5b8377a213fa52af2189095c",
                "sha256:8dfbe290015bce2e62091994f432b0ca6fec7302973e9bf1594ae71e636dbf8b",
                "sha256:f1800cff1007ee09cbdcc06302b4e747107e5c874aa725518089aec1111e6ba0"
            ]
        },
If you cd into /var/lib/docker/vfs/dir/ and then run
Code:
du -xsh ./*
you should be able to find some directories that are around 2 GB or more. We can try copying one of the 2 GB+ directories and see how long the copy takes.

The test copy is intended to find out how much time the storage needs, so the read and write should be on the same disk, to mimic the docker vfs driver creating a container layer.


Here is the timing:
Code:
real 50.21
user 0.34
sys 4.59


That would be 50s. A lot, but not enough to account for between 3 and 6 minutes.

About "overlay2"... would it actually remove the copy process, or can this process not be skipped? If it cannot be skipped, I might try alpine to see if I can get the image smaller. But let's take things one step at a time.
_________________
DragonDreams: Leader and Head Programmer

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Fri Oct 22, 2021 9:07 pm

Dragonlord wrote:
Here is the timing:
Code:
real 50.21
user 0.34
sys 4.59


That would be 50s. A lot, but not enough to account for between 3 and 6 minutes.

About "overlay2"... would it actually remove the copy process, or can this process not be skipped? If it cannot be skipped, I might try alpine to see if I can get the image smaller. But let's take things one step at a time.
All we have been doing is trying to prove that your I/O system is causing the slowdown; using "overlay2" is just one possibility. Do you have another disk that could be used? If you switch to alpinelinux you should see some improvement right away, but since alpinelinux uses musl instead of glibc, the end result may not match your expectations.

Each disk has a fixed throughput and usually does not degrade much over time. So if the disk is not dying, then something else was occupying the disk bandwidth while you were building. We can do a quick test: since your copy test currently takes 50s, your normal job should be able to start without timing out right now.

So you can either introduce another disk, find out what else is using disk bandwidth at the moment of a timeout, or use smartmontools to examine the disk and see if it is dying.
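
(A minimal sketch with smartmontools, assuming the disk is /dev/sda:)
Code:
# print SMART health, attributes and error logs
smartctl -a /dev/sda
# optionally start a short self-test, then re-run -a a few minutes later
smartctl -t short /dev/sda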

Dragonlord
Guru

Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

Posted: Sat Oct 23, 2021 12:14 am

pingtoo wrote:
Dragonlord wrote:
Here is the timing:
Code:
real 50.21
user 0.34
sys 4.59


That would be 50s. A lot, but not enough to account for between 3 and 6 minutes.

About "overlay2"... would it actually remove the copy process, or can this process not be skipped? If it cannot be skipped, I might try alpine to see if I can get the image smaller. But let's take things one step at a time.
All we have been doing is trying to prove that your I/O system is causing the slowdown; using "overlay2" is just one possibility. Do you have another disk that could be used? If you switch to alpinelinux you should see some improvement right away, but since alpinelinux uses musl instead of glibc, the end result may not match your expectations.

Each disk has a fixed throughput and usually does not degrade much over time. So if the disk is not dying, then something else was occupying the disk bandwidth while you were building. We can do a quick test: since your copy test currently takes 50s, your normal job should be able to start without timing out right now.

So you can either introduce another disk, find out what else is using disk bandwidth at the moment of a timeout, or use smartmontools to examine the disk and see if it is dying.

The disk and the system have been the same for months. So why should IO performance degrade by a factor of 10 or more? I don't think that's the problem, because then I would have had these kinds of problems for years.

At the time I measured up to 6 minutes for docker to start the image, nothing else had been using the disk. And even then, 50s can be attributed to the disk at best, but according to the logs starting the image began 3 minutes later. That leaves over 2 minutes unexplained; I guess that is where the real culprit hides.

Maybe I will get some time soon to test overlay2.
_________________
DragonDreams: Leader and Head Programmer

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Sat Oct 23, 2021 1:35 pm

Dragonlord wrote:
And even then, 50s can be attributed to the disk at best, but according to the logs starting the image began 3 minutes later. That leaves over 2 minutes unexplained.
Can you help me understand how you tested and arrived at that "2 minutes unexplained" conclusion? I am under the impression that the test was performed outside of docker, in docker's vfs location, simply measuring disk I/O by copying a directory on the same disk; such a test will not tell you when docker container creation finishes.

You started this thread with
Dragonlord wrote:
This problem started to happen some months ago.
And your image history
Code:
roland@server:~> docker history compile-ubuntu
IMAGE          CREATED         CREATED BY                                      SIZE      COMMENT
b62eade0ec9b   5 weeks ago     /bin/sh -c #(nop)  ENV HOME=/home/builduser     0B       
<missing>      5 weeks ago     /bin/sh -c #(nop)  USER builduser               0B       
<missing>      5 weeks ago     /bin/sh -c export UNAME=$UNAME UID=1000 GID=…   3.73kB   
<missing>      5 weeks ago     /bin/sh -c apt update  && apt -y install bui…   2.44GB   
<missing>      3 months ago    /bin/sh -c #(nop)  ENV UNAME=builduser          0B       
<missing>      3 months ago    /bin/sh -c #(nop)  ENV DEBIAN_FRONTEND=nonin…   0B       
<missing>      12 months ago   /bin/sh -c #(nop)  CMD ["/bin/bash"]            0B       
<missing>      12 months ago   /bin/sh -c mkdir -p /run/systemd && echo 'do…   7B       
<missing>      12 months ago   /bin/sh -c [ -z "$(apt-get indextargets)" ]     0B       
<missing>      12 months ago   /bin/sh -c set -xe   && echo '#!/bin/sh' > /…   811B     
<missing>      12 months ago   /bin/sh -c #(nop) ADD file:435d9776fdd3a1834…   72.9MB   
shows that a 2 GB+ layer was created 5 weeks ago. Could this be something new?

Throughout this thread we have seen different time spans for the image startup, sometimes under 3 minutes, sometimes over 6 minutes; this indicates your runtime environment is not as consistent as you assume. During our tests your kernel did not change and your docker version did not change, and as a programmer you surely understand that once logic is coded it does not change either, so the docker container creation procedure is always the same. Therefore some other factor in your runtime environment is influencing your docker startup time. If you are sure your hardware is not the cause, then something else must be going on during container creation. What other duties does your server perform? NFS perhaps? Samba? A database?
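
(One way to catch such a culprit in the act; a minimal sketch using iotop from sys-process/iotop, run while the container is starting:)
Code:
# show only processes actually doing I/O, per process, with accumulated totals
iotop -o -P -a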

Dragonlord
Guru

Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

Posted: Mon Oct 25, 2021 11:13 am

pingtoo wrote:
Dragonlord wrote:
And even then, 50s can be attributed to the disk at best, but according to the logs starting the image began 3 minutes later. That leaves over 2 minutes unexplained.
Can you help me understand how you tested and arrived at that "2 minutes unexplained" conclusion? I am under the impression that the test was performed outside of docker, in docker's vfs location, simply measuring disk I/O by copying a directory on the same disk; such a test will not tell you when docker container creation finishes.

You started this thread with
Dragonlord wrote:
This problem started to happen some months ago.
And your image history
Code:
roland@server:~> docker history compile-ubuntu
IMAGE          CREATED         CREATED BY                                      SIZE      COMMENT
b62eade0ec9b   5 weeks ago     /bin/sh -c #(nop)  ENV HOME=/home/builduser     0B       
<missing>      5 weeks ago     /bin/sh -c #(nop)  USER builduser               0B       
<missing>      5 weeks ago     /bin/sh -c export UNAME=$UNAME UID=1000 GID=…   3.73kB   
<missing>      5 weeks ago     /bin/sh -c apt update  && apt -y install bui…   2.44GB   
<missing>      3 months ago    /bin/sh -c #(nop)  ENV UNAME=builduser          0B       
<missing>      3 months ago    /bin/sh -c #(nop)  ENV DEBIAN_FRONTEND=nonin…   0B       
<missing>      12 months ago   /bin/sh -c #(nop)  CMD ["/bin/bash"]            0B       
<missing>      12 months ago   /bin/sh -c mkdir -p /run/systemd && echo 'do…   7B       
<missing>      12 months ago   /bin/sh -c [ -z "$(apt-get indextargets)" ]     0B       
<missing>      12 months ago   /bin/sh -c set -xe   && echo '#!/bin/sh' > /…   811B     
<missing>      12 months ago   /bin/sh -c #(nop) ADD file:435d9776fdd3a1834…   72.9MB   
shows that a 2 GB+ layer was created 5 weeks ago. Could this be something new?

Throughout this thread we have seen different time spans for the image startup, sometimes under 3 minutes, sometimes over 6 minutes; this indicates your runtime environment is not as consistent as you assume. During our tests your kernel did not change and your docker version did not change, and as a programmer you surely understand that once logic is coded it does not change either, so the docker container creation procedure is always the same. Therefore some other factor in your runtime environment is influencing your docker startup time. If you are sure your hardware is not the cause, then something else must be going on during container creation. What other duties does your server perform? NFS perhaps? Samba? A database?

The server does various things, but I also ran "top" next to the test to check system load (and IO load for that matter). There was no load at the time the test was done, yet the results are inconsistent. As mentioned, I had no problems until some time ago; I cannot tell exactly when, but it certainly coincided with a docker update. Before, the same container could start in 10s even if the server had tasks to do, and now it takes up to 6 minutes while the server is doing nothing. I'm pretty sure docker is the culprit here, but I cannot directly prove it.

To come back to "overlay2"... what exactly does it do differently from "vfs"? Does it require experimental settings like "--squash", which I cannot use?

EDIT: On my development machine I've got Gentoo too and the same docker image. There it starts in
real 0m1.809s

So I'm pretty sure something is up with the docker installation. Maybe an update failed or broke something, but I'm not deep enough into docker to diagnose this.

EDIT 2: Interestingly, this installation reports "Storage Driver: overlay2". I update both systems the same way, so I'm a bit astonished that docker chooses "vfs" in one place and "overlay2" in the other.
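
(A quick way to compare the two machines; a minimal sketch:)
Code:
# which storage driver the daemon picked
docker info | grep -i 'storage driver'
# whether the running kernel offers overlayfs at all
grep overlay /proc/filesystems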
_________________
DragonDreams: Leader and Head Programmer

pingtoo
l33t

Joined: 10 Sep 2021
Posts: 926
Location: Richmond Hill, Canada

Posted: Mon Oct 25, 2021 12:52 pm

Dragonlord wrote:
The server does various things, but I also ran "top" next to the test to check system load (and IO load for that matter). There was no load at the time the test was done, yet the results are inconsistent. As mentioned, I had no problems until some time ago; I cannot tell exactly when, but it certainly coincided with a docker update. Before, the same container could start in 10s even if the server had tasks to do, and now it takes up to 6 minutes while the server is doing nothing. I'm pretty sure docker is the culprit here, but I cannot directly prove it.

To come back to "overlay2"... what exactly does it do differently from "vfs"? Does it require experimental settings like "--squash", which I cannot use?


overlay2 uses a union-filesystem strategy: it mounts the parent directories as read-only lower layers and creates a blank directory as the read/write upper layer for the container's contents. vfs, by contrast, copies the parent to a new directory and uses that new directory as the container's read/write layer.
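
(Purely illustrative, with hypothetical paths, of what such a union mount looks like at the mount(8) level:)
Code:
mount -t overlay overlay \
    -o lowerdir=/lower2:/lower1,upperdir=/upper,workdir=/work \
    /merged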

Dragonlord wrote:
EDIT: On my development machine I've got Gentoo too and the same docker image. There it starts in
real 0m1.809s

So I'm pretty sure something is up with the docker installation. Maybe an update failed or broke something, but I'm not deep enough into docker to diagnose this.

EDIT 2: Interestingly, this installation reports "Storage Driver: overlay2". I update both systems the same way, so I'm a bit astonished that docker chooses "vfs" in one place and "overlay2" in the other.


Assuming that by the *other* installation you mean the quick-loading development machine, then yes, that is likely overlay2 doing its work. The docker ebuild uses an exclude strategy to decide which storage drivers are left out of the build, so since your non-development machine's kernel does not include overlayfs, it will not have the overlay2 storage driver; I am guessing that is why it starts with the vfs driver.

After reviewing the ebuild, I realize you don't need to rebuild docker: you just need to rebuild the kernel (or borrow the development machine's kernel) and you can test with the new kernel. I think you will need to clean up /var/lib/docker before the test.
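
(A minimal sketch of the kernel option and the cleanup, assuming an OpenRC system; note the rm destroys ALL images and containers:)
Code:
# kernel: enable CONFIG_OVERLAY_FS (File systems -> Overlay filesystem support),
# rebuild, install and reboot into the new kernel, then reset docker's state
rc-service docker stop
rm -rf /var/lib/docker
rc-service docker start
docker info | grep -i 'storage driver'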

Dragonlord
Guru

Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

Posted: Mon Oct 25, 2021 1:19 pm

Pfft... recompiled the kernel (5.10.27-gentoo). docker info now shows overlay2, but the switch also deleted **all** images and containers, so I first need to rebuild them all. I'll report how it goes once I'm done.
_________________
DragonDreams: Leader and Head Programmer

Dragonlord
Guru

Joined: 22 Aug 2004
Posts: 446
Location: Switzerland

Posted: Mon Oct 25, 2021 2:02 pm

Okay... after rebuilding the image, starting the container now takes about 3s. That's quite a difference. Looks like the main culprit was the "vfs" driver in docker; "overlay2" is way faster.
_________________
DragonDreams: Leader and Head Programmer

Hu
Moderator

Joined: 06 Mar 2007
Posts: 21635

Posted: Mon Oct 25, 2021 3:56 pm

That seems reasonable. VFS needs to make a full copy of the container's files so that any edits to them do not impact the reference copy. Overlay2 can expose the reference copy's files for reading and create copies only when the container tries to write to a file. This gives you constant (or nearly constant) input/output load for starting an overlay2 container, versus input/output load linear in the size of the image for a VFS container. This is an especially significant win for containers that make few or no edits to their files (such as a container that exists solely to do network operations).
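
(To see the difference empirically; a minimal sketch timing a no-op container start under each driver:)
Code:
time docker run --rm compile-ubuntu /bin/true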