Purpose
Most of my current work involves a project whose PR test pipeline takes 5 minutes on average, with the majority of that time spent building the image for the test container.
The purpose of this experiment:
- Test my intuition of where time is spent in this image build
- Find reliable ways to measure/analyse build times
- Marginally speed up the pipeline without significant output changes
Hypothesis
Given an unoptimised Docker image build (i.e. I assume no optimisations have been performed) on an AWS-based build runner, in which the build installs software via apt-get: if I change the apt-get mirror configuration to use AWS-local mirrors, the build time will consistently improve (by at least 10%).
Tests
Preconditions
To establish a testing “bench”:
# I set up the security group via the console; it allows SSH in
$ aws ec2 run-instances \
--count=1 \
--key-name=docker-apt-experiment-pair \
--image-id=ami-0b8b10b5bf11f3a22 \
--instance-type=m5.large \
--region=ap-southeast-2 \
--instance-initiated-shutdown-behavior=terminate \
--security-group-id=sg-f797a492
{
"Groups": [],
"Instances": [
{
"ImageId": "ami-0b8b10b5bf11f3a22",
"InstanceId": "i-xxxxxxxxxxxxxxxxx",
"InstanceType": "m5.large",
"KeyName": "docker-apt-experiment-pair",
"StateReason": {
"Code": "pending",
"Message": "pending"
},
...
}
],
...
}
$ sleep 120
$ aws ec2 describe-instances --instance-id=i-xxxxxxxxxxxxxxxxx --query="Reservations[0].Instances[0].PublicIpAddress"
"xxx.xxx.xxx.xxx"
# access via:
$ ssh ec2-user@xxx-xxx-xxx-xxx
ec2-user$ sudo yum update -y \
&& sudo amazon-linux-extras install docker -y \
&& sudo service docker start \
&& sudo usermod -a -G docker ec2-user \
&& exit
$ ssh ec2-user@xxx-xxx-xxx-xxx
ec2-user$ cat <<'DOCKERFILE' >Dockerfile
...
DOCKERFILE
ec2-user$ DOCKER_BUILDKIT=1 docker build . --progress=plain
#5 [2/3] RUN apt-get update && apt-get install -y curl gnupg2 && curl -...
#5 digest: sha256:882acf9407b8212a91cbf59b1307c5a749b0c267a0f515382761732937768ba2
#5 name: "[2/3] RUN ..."
#5 started: 2020-02-05 03:28:47.354765107 +0000 UTC
#5 completed: 2020-02-05 03:28:47.354765107 +0000 UTC
#5 duration: 0s
#5 cached: true
ec2-user$ docker tag e9fc75e8ae42 base-test-image:latest
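The fixed `sleep 120` in the transcript above could be swapped for the AWS CLI's built-in waiter. This is a minimal sketch rather than what I actually ran: `wait_for_instance` is a hypothetical helper name, and it assumes the same instance ID and region used above. Note that "running" does not guarantee sshd is accepting connections yet; the `instance-status-ok` waiter is a stricter (and slower) alternative.

```shell
# Sketch: block until the instance is running, then print its public IP.
# wait_for_instance is a hypothetical helper; $1 is the instance ID and
# $2 is the region (ap-southeast-2 in the transcript above).
wait_for_instance() {
  local instance_id="$1" region="$2"
  # Polls EC2 until the instance reaches the "running" state.
  aws ec2 wait instance-running \
    --instance-ids "$instance_id" --region "$region" || return 1
  # --output text prints the bare address instead of a JSON string.
  aws ec2 describe-instances \
    --instance-ids "$instance_id" --region "$region" \
    --query "Reservations[0].Instances[0].PublicIpAddress" \
    --output text
}
```

Usage would look like `ssh "ec2-user@$(wait_for_instance i-xxxxxxxxxxxxxxxxx ap-southeast-2)"`.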
Procedure
AWS Local Mirrors
- Check initial apt mirror configuration:
ec2-user$ docker run --rm -it base-test-image:latest cat /etc/apt/sources.list
# deb http://snapshot.debian.org/archive/debian/20200130T000000Z buster main
deb http://deb.debian.org/debian buster main
# deb http://snapshot.debian.org/archive/debian-security/20200130T000000Z buster/updates main
deb http://security.debian.org/debian-security buster/updates main
# deb http://snapshot.debian.org/archive/debian/20200130T000000Z buster-updates main
deb http://deb.debian.org/debian buster-updates main
- Copy files from the app (Gemfile, Gemfile.lock, vendor/cache directory)
- Create (control) test Dockerfile, a stripped down copy of the original:
FROM base-test-image:latest
RUN apt-get update -y \
&& apt-get install -y curl gnupg2 ca-certificates \
&& echo "deb http://nginx.org/packages/mainline/debian `awk '/VERSION=/ { print $2 }' /etc/os-release | tr -d '\"()'` nginx" \
| tee /etc/apt/sources.list.d/nginx.list \
&& curl -fsSL https://nginx.org/keys/nginx_signing.key | apt-key add - \
&& apt-get update -y \
&& apt-get install -y nginx \
&& apt-get clean
RUN gem list --installed bundler -v 1.17.3 || gem install bundler -v 1.17.3 --no-document
COPY Gemfile $APP_HOME/Gemfile
COPY Gemfile.lock $APP_HOME/Gemfile.lock
COPY vendor/cache $APP_HOME/vendor/cache
RUN bundle install --jobs=8 --deployment
ec2-user$ DOCKER_BUILDKIT=1 docker build . --progress=plain | tee log.txt
#6 [2/7] RUN apt-get update -y && apt-get install -y curl gnupg2 ca-cer...
#6 digest: sha256:f7310a936dc93e9b447cbcc8e02a5bb3478dc788fca217fcde3a063a9a4ff5b9
#6 name: "[2/7] RUN apt-get update -y && apt-get install -y curl gnupg2 ca-certificates && echo \"deb http://nginx.org/packages/mainline/debian `awk '/VERSION=/ { print $2 }' /etc/os-release | tr -d '\"()'` nginx\" | tee /etc/apt/sources.list.d/nginx.list && curl -fsSL https://nginx.org/keys/nginx_signing.key | apt-key add - && apt-get update -y && apt-get install -y nginx && apt-get clean"
#6 started: 2020-02-05 04:00:32.401271015 +0000 UTC
#6 completed: 2020-02-05 04:00:43.43985234 +0000 UTC
#6 duration: 12.038887614s
ec2-user$ grep -Ee ' duration: (.*)' log.txt
#5 duration: 666.359µs
#5 duration: 421.233µs
#2 duration: 54.33µs
#2 duration: 11.87551ms
#1 duration: 40.827µs
#1 duration: 16.065335ms
#3 duration: 381.784µs
#4 duration: 216.107µs
#4 duration: 838.391µs
#8 duration: 47.895µs
#8 duration: 354.487512ms
#4 duration: 1.830249432s
#6 duration: 12.038887614s
#7 duration: 1.168140122s
#9 duration: 382.25184ms
#10 duration: 370.907128ms
#11 duration: 449.935383ms
#12 duration: 59.634452659s
#13 duration: 1.837259766s
The apt-get step took approximately 12 seconds.
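The grep output mixes µs, ms and s, which makes it hard to compare steps at a glance. A rough sketch of normalising everything to seconds follows; `durations_in_seconds` is a hypothetical helper name, and it only handles the three unit suffixes that appear in the logs above. Compound values such as `1m2.5s`, which longer builds may print, are skipped rather than guessed at.

```shell
# Sketch: print "<step> <seconds>" for every BuildKit duration line.
# durations_in_seconds is a hypothetical helper; arguments are log files.
durations_in_seconds() {
  awk '/ duration: / {
    value = $3
    # Only plain values with a µs, ms or s suffix are handled here.
    if (value !~ /^[0-9.]+(µs|ms|s)$/) next
    number = value
    sub(/[^0-9.]+$/, "", number)            # strip the unit suffix
    unit = substr(value, length(number) + 1)
    scale = 1                               # "s"
    if (unit == "ms") scale = 1000
    else if (unit != "s") scale = 1000000   # "µs"
    printf "%s %.3f\n", $1, number / scale
  }' "$@"
}
```

For example, `durations_in_seconds log.txt | sort -k2,2nr | head -n 3` would list the three slowest steps first.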
Clean all images except the base image:
ec2-user$ docker run --name=placeholder base-test-image:latest \
&& docker image prune --all -f \
&& docker builder prune -f \
&& docker rm -f placeholder
Repeat the process, with the test Dockerfile:
FROM base-test-image:latest
# use cloudfront debian mirrors
RUN printf 'deb http://cloudfront.debian.net/debian buster main\ndeb http://security.debian.org/debian-security buster/updates main\ndeb http://cloudfront.debian.net/debian buster-updates main' > /etc/apt/sources.list
RUN apt-get update -y \
&& apt-get install -y curl gnupg2 ca-certificates \
&& echo "deb http://nginx.org/packages/mainline/debian `awk '/VERSION=/ { print $2 }' /etc/os-release | tr -d '\"()'` nginx" \
| tee /etc/apt/sources.list.d/nginx.list \
&& curl -fsSL https://nginx.org/keys/nginx_signing.key | apt-key add - \
&& apt-get update -y \
&& apt-get install -y nginx \
&& apt-get clean
RUN gem list --installed bundler -v 1.17.3 || gem install bundler -v 1.17.3 --no-document
COPY Gemfile $APP_HOME/Gemfile
COPY Gemfile.lock $APP_HOME/Gemfile.lock
COPY vendor/cache $APP_HOME/vendor/cache
RUN bundle install --jobs=8 --deployment
ec2-user$ DOCKER_BUILDKIT=1 docker build . --progress=plain | tee log.txt
#7 [3/8] RUN apt-get update -y && apt-get install -y curl gnupg2 ca-cer...
#7 digest: sha256:3028bb4cbad3496c5efd0bef1a84d077938633daaa0c44e0044ea4c38f1d8ced
#7 name: "[3/8] RUN apt-get update -y && apt-get install -y curl gnupg2 ca-certificates && echo \"deb http://nginx.org/packages/mainline/debian `awk '/VERSION=/ { print $2 }' /etc/os-release | tr -d '\"()'` nginx\" | tee /etc/apt/sources.list.d/nginx.list && curl -fsSL https://nginx.org/keys/nginx_signing.key | apt-key add - && apt-get update -y && apt-get install -y nginx && apt-get clean"
#7 started: 2020-02-05 04:24:11.935575415 +0000 UTC
#7 completed: 2020-02-05 04:24:24.240061429 +0000 UTC
#7 duration: 12.304486014s
ec2-user$ grep -Ee ' duration: (.*)' log.txt
#2 duration: 11.595249ms
#3 duration: 319.704µs
#5 duration: 818.323µs
#5 duration: 402.762µs
#4 duration: 691.54µs
#4 duration: 34.544µs
#1 duration: 17.267575ms
#9 duration: 65.526µs
#9 duration: 345.609248ms
#6 duration: 739.33537ms
#7 duration: 12.304486014s
#8 duration: 1.055141355s
#10 duration: 513.281193ms
#11 duration: 349.507442ms
#12 duration: 461.50172ms
#13 duration: 59.460601654s
#14 duration: 1.826289245s
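To check the two apt-get timings against the "at least 10%" threshold in the hypothesis, the relative change can be computed directly. This is a small sketch; `percent_change` is a hypothetical helper name, and the two inputs are the durations of the apt-get step from the control run (#6) and the Cloudfront run (#7) above.

```shell
# Sketch: percentage change from a baseline duration to a new one.
# Negative output means the new step was faster. Both arguments are seconds.
percent_change() {
  awk -v base="$1" -v current="$2" \
    'BEGIN { printf "%.1f\n", (current - base) / base * 100 }'
}

# apt-get step: control run versus Cloudfront mirror run.
percent_change 12.038887614 12.304486014
```

This prints 2.2, meaning the Cloudfront run was about 2% slower on this single sample, nowhere near the 10% improvement the hypothesis predicted.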
Conclusion
- The default Debian mirrors in the base Ruby image perform equivalently to the Cloudfront mirrors. In terms of total image build time, any timing differences are imperceptible.
- The apt-get installation step takes less than 20% of the entire image build time: native compilation of Ruby gems takes the majority of build time, even though the gems are vendored. (Because vendored gems don’t have compiled native extensions, or because the vendored gems were vendored on a macOS machine?)
- While I deliberately chose not to profile the build before forming and testing a hypothesis, this is a great demonstration of how my intuition can differ from reality!
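A claim like "less than 20% of the build" can be checked mechanically once durations are in seconds. The sketch below assumes input lines of the form `<step> <seconds>` on stdin; `step_share` is a hypothetical helper name, and the numbers in the example are rounded, illustrative values rather than the full log.

```shell
# Sketch: what percentage of the summed durations does one step account for?
# Input: "<step> <seconds>" lines on stdin; $1 is the step id, e.g. "#6".
step_share() {
  awk -v step="$1" '
    { total += $2; if ($1 == step) mine += $2 }
    END { if (total > 0) printf "%.1f\n", 100 * mine / total }
  '
}

# Illustrative input only: three steps of 12, 60 and 8 seconds.
printf '#6 12\n#12 60\n#13 8\n' | step_share '#6'
```

With the illustrative input this prints 15.0, since 12 of the 80 summed seconds belong to step #6.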
Other findings
- DOCKER_BUILDKIT=1 makes it much easier to see how a build unfolds.
- Unfortunately, Buildkit isn’t supported via docker-compose.