Optimize Docker Images For Production


A Docker image contains everything needed to deploy and run an application.

A Docker image is read-only.

Layers are the fundamental building blocks of an image; each layer holds part of the image's filesystem. A typical image has these layers:

Base image:- OS and supporting libraries

Dependencies:- needed to support your application to run

Application:- the final step, where we add our application's JAR/WAR/EAR files

Problem statement:-

Solution:-

Use Minimal Base Image:-

Use a base image with a minimal OS footprint, such as Alpine.
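As a rough sketch, an Alpine-based image for a Java application could look like the following. The eclipse-temurin image family and the app.jar file name are assumptions chosen for illustration, not something the original setup prescribes.

```dockerfile
# Assumption: a Java application; eclipse-temurin is one publicly
# available image family that offers an Alpine-based, JRE-only variant.
FROM eclipse-temurin:17-jre-alpine

WORKDIR /app

# app.jar is a hypothetical artifact name.
COPY target/app.jar app.jar

ENTRYPOINT ["java", "-jar", "app.jar"]
```

Alpine uses musl instead of glibc, so it is worth testing the application on the smaller base before switching.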

Multistage Builds:-

A multi-stage build lets you break the steps of building a Docker image into multiple stages. This way the final image includes only the dependencies the application needs at runtime, cutting down on both time and space.

With multi-stage builds, you use multiple FROM statements in your Dockerfile. Each FROM instruction can use a different base, and each of them begins a new stage of the build. You can selectively copy artifacts from one stage to another, leaving behind everything you don't want in the final image.
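A minimal two-stage sketch, assuming a Maven project whose build produces target/app.jar (a hypothetical name that depends on the project's configuration), might look like this:

```dockerfile
# Stage 1: build with the full Maven + JDK toolchain.
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /src
COPY pom.xml .
COPY src ./src
RUN mvn -q package -DskipTests

# Stage 2: start again from a small runtime-only base.
FROM eclipse-temurin:17-jre-alpine
WORKDIR /app
# Copy only the built artifact from the build stage; Maven, the JDK,
# and the source tree stay behind. app.jar is a hypothetical name.
COPY --from=build /src/target/app.jar app.jar
ENTRYPOINT ["java", "-jar", "app.jar"]
```

Only the last stage ends up in the final image, so the build tools never ship to production.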

Minimize the number of layers:-

Reduce the number of layers. Each RUN, COPY, and ADD instruction adds a new layer, and each layer adds to the build time and storage.

Instead of using a separate RUN instruction for every command, chain related commands into a single RUN instruction.

The difference may not be huge, but even small savings add up in production.
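To illustrate the idea, here is a small sketch; the ubuntu base and the curl and unzip packages are placeholders, not taken from the original example.

```dockerfile
FROM ubuntu:22.04

# Three separate RUN instructions would create three layers:
#   RUN apt-get update
#   RUN apt-get install -y curl
#   RUN apt-get install -y unzip

# Chaining the commands in one RUN creates a single layer instead.
RUN apt-get update && \
    apt-get install -y curl unzip
```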

Save Space when installing Dependencies:-

When you install packages with apt-get, the package manager also installs the packages it recommends alongside them, which our application does not need. Use --no-install-recommends to skip them, and clean up afterwards with apt-get clean and rm -rf /var/lib/apt/lists/* (which deletes the downloaded package index files).
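Put together, the flags and cleanup steps described above might look like this sketch; the package names are placeholders.

```dockerfile
FROM ubuntu:22.04

# curl and ca-certificates are placeholder packages for illustration.
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl ca-certificates && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
```

The cleanup runs in the same RUN instruction as the install on purpose: files deleted in a later instruction would still occupy space in the earlier layer.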

Docker Cache:-

Docker stores each layer of a build in the cache. For each instruction, Docker first checks the cache. If no matching layer is found, it executes the instruction and stores the result in the cache; otherwise it reuses the cached layer, which saves time.

Copy files:-

On a rebuild, Docker takes every layer it can from the cache and executes only the instructions whose results are not in the cache.

Utilising Docker cache effectively:-

If a layer is already available, Docker takes it from the cache when building the image.

A layer is not built in isolation: each layer depends on the layers before it. When a layer changes (for example, COPY . . copies all files and folders instead of only the needed ones, so any file change modifies it), that instruction and every instruction after it must be re-executed instead of using the cache. So the placement of instructions makes a huge difference.

That is not an efficient use of the Docker cache. Any instruction whose inputs change frequently should go near the end of the Dockerfile, and instructions that rarely need to change should go near the beginning.

This helps Docker reuse the cache, saving time and effort when building an image.
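As a sketch of this ordering, assuming a Maven project (the image tag and file layout are illustrative assumptions), the rarely changing dependency manifest is copied before the frequently changing source code:

```dockerfile
FROM maven:3.9-eclipse-temurin-17
WORKDIR /src

# Rarely changes: copy only the dependency manifest first, so the
# dependency download below stays cached until pom.xml changes.
COPY pom.xml .
RUN mvn -q dependency:go-offline

# Changes often: copy the source last, so editing code invalidates
# only the layers from this point on.
COPY src ./src
RUN mvn -q package -DskipTests
```

With this layout, a code-only change reuses the cached dependency layer and reruns just the last two instructions.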

Explore Image Layer:-

To optimize further, you need to be able to drill down and analyze an image layer by layer. The docker history command lists the layers and the size of each, but it does not show their actual contents. Once you can see the contents, you can understand where those instructions leave room for savings.

For this, there is an open-source tool called Dive, which explores a Docker image and the contents of each layer. You can also integrate it into a CI workflow.

To use Dive, pass it the name or tag of the image to inspect, in the form dive <image>:<tag>. Setting the environment variable CI=true makes it run non-interactively.

Copy Only needed files:-

Earlier we used the following instruction:

COPY . .

This copies every file and folder into the image, which may eventually cause security vulnerabilities. Similar to the .gitignore file, there is a .dockerignore file where you can list all the files to be ignored.
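One way to apply this, sketched for a hypothetical Maven project layout, is to name exactly what the build needs instead of copying the whole directory. The .dockerignore entries in the comments are examples only.

```dockerfile
FROM maven:3.9-eclipse-temurin-17
WORKDIR /src

# Avoid: COPY . .   (pulls in .git, local configs, IDE files, and so on)

# Prefer: copy only the files the build actually uses.
COPY pom.xml .
COPY src ./src

# A .dockerignore file next to the Dockerfile could list entries such as:
#   .git
#   .env
#   *.log
# Ignored paths are left out of the build context, so even a broad
# COPY cannot pick them up.
```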

Squashing Image Layers:-

After inspecting the image with Dive, you can squash its layers, merging them so that files added in one layer and deleted in a later one no longer take up space. The docker-squash tool does this for an existing image.

Resources:-

For clearer explanations, watch the videos below, which are really useful. Thanks to Adam for the wonderful material.

https://www.youtube.com/watch?v=dhQDHh2RtJE

https://www.youtube.com/watch?v=WyZ_rNPxR9I&list=RDCMUCs4dFR-31isXFkNuYSBHgLg&index=11

Conclusion:-

In conclusion, Docker has become an essential tool in software development and deployment. Containers provide a lightweight and portable way to package applications and their dependencies, so it is really important to know how to build efficient Docker images.