2 years ago
#70321
Igor Brites
Docker multi-stage builds stuck between layers when using Docker-in-Docker with Jenkins and Kubernetes
Big title, I know, but it is a very specific issue.
I'm setting up a new Jenkins cluster and trying to use Docker-in-Docker containers to build images, unlike the current Jenkins cluster, which uses that ugly-as-hell /var/run/docker.sock mount. The things being built live in a monorepo with several Dockerfiles, and the builds run in parallel.
The problem is that when building huge layers (for example, after a yarn install that downloads half of the internet), the step hangs at the Done in XX.XXs line and does not move on to the next step, whatever it is.
Sometimes the build passes (generally right after I change something in the cluster), but the following ones hang forever. When it passes, I can build 8 Node.js images in ~28 min, but the following builds time out after 60 min.
Here is some code to show how I'm doing this. All the other images use the same template as the one provided.
Jenkins pod template:
    apiVersion: "v1"
    kind: "Pod"
    metadata:
      labels:
        name: "jnlp"
        jenkins/jenkins-jenkins-agent: "true"
    spec:
      containers:
      - env:
        - name: "DOCKER_HOST"
          value: "tcp://localhost:2375"
        image: "12345678910.dkr.ecr.us-east-1.amazonaws.com/kubernetes-agent:2.0" # internal image
        imagePullPolicy: "IfNotPresent"
        name: "jnlp"
        resources:
          limits:
            cpu: "1000m"
            memory: "1Gi"
          requests:
            cpu: "500m"
            memory: "500Mi"
        tty: true
        volumeMounts:
        - mountPath: "/home/jenkins"
          name: "workspace-volume"
          readOnly: false
        workingDir: "/home/jenkins"
      - args:
        - "--tls=false"
        env:
        - name: "DOCKER_BUILDKIT"
          value: "1"
        - name: "DOCKER_TLS_CERTDIR"
          value: ""
        - name: "DOCKER_DRIVER"
          value: "overlay2"
        image: "docker:20.10.12-dind-alpine3.15"
        imagePullPolicy: "IfNotPresent"
        name: "docker"
        resources:
          limits:
            memory: "4Gi"
            cpu: "2"
          requests:
            memory: "1Gi"
            cpu: "500m"
        securityContext:
          privileged: true
        tty: true
        volumeMounts:
        - mountPath: "/var/lib/docker"
          name: "docker"
          readOnly: false
        - mountPath: "/home/jenkins"
          name: "workspace-volume"
          readOnly: false
        workingDir: "/home/jenkins"
      nodeSelector:
        spot: "true"
      restartPolicy: "Never"
      volumes:
      - emptyDir:
          medium: ""
        name: "docker"
      - emptyDir:
          medium: ""
        name: "workspace-volume"
Dockerfile
    # We don't use alpine image due to dependency issues
    FROM node:12.14.1-stretch-slim as base

    RUN apt-get update \
      && DEBIAN_FRONTEND=noninteractive apt-get -y install --no-install-recommends \
        apt-utils build-essential bzip2 ca-certificates cron curl g++ git libfontconfig make python \
      && update-ca-certificates \
      && apt-get autoremove -y \
      && apt-get clean \
      && rm -rf /tmp/* /var/tmp/* \
      && rm -f /var/log/alternatives.log /var/log/apt/* \
      && rm -rf /var/lib/apt/lists/* \
      && rm /var/cache/debconf/*-old

    ENV NODE_ENV development

    # Put here, to optimize caching
    EXPOSE 8043

    WORKDIR /opt/app
    RUN chown -R node:node /opt/app
    USER node

    COPY --chown=node:node package.json yarn.lock .yarnclean /opt/app/
    COPY 100-wkhtmltoimage-special.conf /etc/fonts/conf.d/

    RUN yarn config set network-timeout 600000 -g && \
      yarn --frozen-lockfile && \
      yarn autoclean --force && \
      yarn cache clean

    FROM base as dev

    # --debug and inspect port
    EXPOSE 5858 9229

    COPY --chown=node:node . /opt/app
    RUN npx gulp build && sh ./app-ssl

    FROM base as prod

    COPY --from=dev /opt/app /opt/app

    # Like `npm prune --production`
    RUN yarn --production --ignore-scripts --prefer-offline

    CMD ["yarn", "start"]
The command:
    docker build \
      --network host --force-rm \
      --build-arg BUILDKIT_INLINE_CACHE=1 \
      --cache-from 12345678910.dkr.ecr.us-east-1.amazonaws.com/name-of-my-image:latest \
      --cache-from 12345678910.dkr.ecr.us-east-1.amazonaws.com/name-of-my-image:latest-dev \
      --cache-from 12345678910.dkr.ecr.us-east-1.amazonaws.com/name-of-my-image:${VERSION} \
      --cache-from 12345678910.dkr.ecr.us-east-1.amazonaws.com/name-of-my-image:${VERSION}-dev \
      --tag 12345678910.dkr.ecr.us-east-1.amazonaws.com/name-of-my-image:${VERSION}-dev \
      --tag 12345678910.dkr.ecr.us-east-1.amazonaws.com/name-of-my-image:latest-dev \
      --target dev .
The end of the log:
    ...
    [2022-01-18T19:37:19.928Z] [4/5] Building fresh packages...
    [2022-01-18T19:37:19.928Z] [5/5] Cleaning modules...
    [2022-01-18T19:37:34.774Z] Done in 486.04s.
    [2022-01-18T19:37:34.774Z] yarn autoclean v1.21.1
    [2022-01-18T19:37:34.774Z] [1/1] Cleaning modules...
    [2022-01-18T19:37:46.952Z] info Removed 0 files
    [2022-01-18T19:37:46.952Z] info Saved 0 MB.
    [2022-01-18T19:37:46.952Z] Done in 12.85s.
    [2022-01-18T19:37:46.952Z] yarn cache v1.21.1
    [2022-01-18T19:38:13.453Z] success Cleared cache.
    [2022-01-18T19:38:13.453Z] Done in 24.21s.
    [2022-01-18T20:28:51.170Z] make: *** [Makefile:21: build-dev] Terminated <=== Pipeline reaches timeout! Look how long it hangs from the previous line.
    script returned exit code 2
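To put a number on that gap, here is a minimal sketch that subtracts the last two timestamps in the log above. The helper name `to_seconds` is made up for illustration, the two timestamps are copied from the log with fractional seconds dropped, and it assumes both fall on the same day.

```shell
#!/bin/sh
# Hypothetical helper: convert a zero-padded HH:MM:SS string to seconds since midnight.
to_seconds() {
  # Split the argument on ':' into the function's positional parameters.
  old_ifs=$IFS; IFS=:; set -- $1; IFS=$old_ifs
  # ${n#0} strips one leading zero so values such as "08" are not parsed as octal.
  echo $(( ${1#0} * 3600 + ${2#0} * 60 + ${3#0} ))
}

last_output="19:38:13"   # "Done in 24.21s." printed after yarn cache clean
terminated="20:28:51"    # "make: *** [Makefile:21: build-dev] Terminated"

gap=$(( $(to_seconds "$terminated") - $(to_seconds "$last_output") ))
printf 'silent for %d min %d s\n' $(( gap / 60 )) $(( gap % 60 ))
```

This prints `silent for 50 min 38 s`, so the build produces no output at all for roughly 50 minutes between the end of the yarn step and the pipeline timeout.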
If anyone needs any more information, please let me know. Thanks!
node.js
docker
jenkins
kubernetes
docker-in-docker