Optimize Docker Images For Production
A Docker image contains everything we need to deploy and run our application.
A Docker image is read-only.
Docker layers are the fundamental building blocks of an image; each layer holds a set of filesystem changes. A typical image is made up of:
Base image:- the OS and supporting libraries
Dependencies:- whatever the application needs in order to run
Application:- the final step, where we add our application's jar/war/ear files
Problem statement:-
Solution:-
Use Minimal Base Image:-
Use a base image with a minimal OS footprint, such as Alpine.
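As a rough illustration, switching to a minimal base is often just a change of tag. The image tags and the app.jar artifact name below are assumptions chosen for the example, not taken from the original post. Alpine uses musl libc rather than glibc, so test the application on it before adopting it.

```dockerfile
# Default (Ubuntu-based) variant of the runtime image, noticeably larger:
# FROM eclipse-temurin:17-jre

# Alpine-based variant of the same runtime:
FROM eclipse-temurin:17-jre-alpine
WORKDIR /app
# app.jar is a hypothetical artifact name
COPY target/app.jar app.jar
ENTRYPOINT ["java", "-jar", "app.jar"]
```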
Multistage Builds:-
A multi-stage build lets you break the steps of building a Docker image into multiple stages. This enables you to create images that include only the dependencies the final application actually needs, cutting down on both build time and image size.
With multi-stage builds, you use multiple FROM statements in your Dockerfile. Each FROM instruction can use a different base image, and each one begins a new stage of the build. You can selectively copy artifacts from one stage to another, leaving behind everything you don't want in the final image.
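A minimal sketch of the idea, assuming a Maven project whose build produces target/app.jar (the project layout, image tags, and jar name are all assumptions for illustration). The first stage carries the JDK and Maven; the final image carries only the JRE and the jar.

```dockerfile
# Stage 1: build the jar with Maven and a full JDK
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /src
COPY pom.xml .
COPY src ./src
RUN mvn -q package -DskipTests

# Stage 2: runtime image with only the JRE and the built jar
FROM eclipse-temurin:17-jre-alpine
WORKDIR /app
# Copy just the artifact from the build stage; Maven, the JDK,
# and the source tree are left behind
COPY --from=build /src/target/app.jar app.jar
ENTRYPOINT ["java", "-jar", "app.jar"]
```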
Minimize the number of layers:-
Reduce the number of layers. Each RUN, COPY, and ADD instruction adds a new layer, and each layer adds to the build time and to the storage used.
Instead of writing several separate RUN instructions, combine related commands into a single RUN instruction chained with &&.
The difference may not be huge for a single image, but small savings add up in production.
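The contrast can be sketched as follows. The base image and package names are placeholders chosen for the example.

```dockerfile
FROM ubuntu:22.04

# Three instructions, three layers:
# RUN apt-get update
# RUN apt-get install -y curl
# RUN apt-get install -y unzip

# One instruction, one layer:
RUN apt-get update && apt-get install -y curl unzip
```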
Save Space when installing Dependencies:-
When packages are installed with apt-get, the package manager also installs the packages it recommends alongside them, which our application usually does not need. Use --no-install-recommends to skip them, and clean up in the same RUN instruction with apt-get clean and rm -rf /var/lib/apt/lists/* (this deletes the downloaded package index files, which are not needed at runtime).
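A minimal sketch, assuming a Debian or Ubuntu base image; the package names are placeholders. The cleanup has to happen in the same RUN instruction as the install, because files deleted in a later layer still occupy space in the earlier one.

```dockerfile
FROM ubuntu:22.04

# Install only what is asked for, then remove apt caches and index
# files before the layer is committed
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl ca-certificates \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*
```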
Docker Cache:-
Docker stores each layer of a build in the cache. For each instruction, Docker first checks the cache: if a matching layer is present, it is reused, which saves time; otherwise Docker executes the instruction and stores the resulting layer in the cache.
Copy files:-
On a rebuild, Docker takes every unchanged layer from the cache and executes only the instructions whose layers are not available in the cache.
Utilising Docker cache effectively:-
If a layer is already available, Docker takes it from the cache when building the image.
Docker does not build each layer in isolation: every layer depends on the layers before it. When a layer changes (for example, COPY . . copies all files and folders instead of only the needed ones, so any file change invalidates it), that instruction and every instruction after it must be re-executed instead of being taken from the cache. So the placement of instructions makes a huge difference.
Placing a frequently changing instruction early is an inefficient use of the Docker cache. Any instruction you expect to change frequently should go at the end of the Dockerfile, and instructions that rarely change should go at the beginning.
This makes good use of the Docker cache and saves time and effort when building a Docker image.
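One way to apply this ordering, sketched for a hypothetical Maven project (the image tag and project layout are assumptions): copy the dependency manifest first and resolve dependencies, then copy the source code, which changes far more often.

```dockerfile
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /src

# Changes rarely: the dependency manifest. As long as pom.xml is
# unchanged, this layer and the next one come from the cache.
COPY pom.xml .
RUN mvn -q dependency:go-offline

# Changes often: the source code. Only the instructions from here
# down are re-executed after a code change.
COPY src ./src
RUN mvn -q package -DskipTests
```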
Explore Image Layer:-
To optimize further, you need to be able to drill down and analyze an image layer by layer. The docker history command lists the layers along with the instruction that created each one and its size, but it does not show the actual contents of a layer. Seeing the contents is what lets you understand where those instructions can be trimmed.
For this, there is an open-source tool called Dive. Dive is a tool for exploring a Docker image and its layer contents. You can also integrate it into a CI workflow.
Use Dive by passing it the image name or tag:
dive <image-name>
Copy Only needed files:-
Earlier we used the command below:
COPY . .
This copies every file and folder in the build context into the image, which bloats the image and can expose sensitive files, leading to security vulnerabilities. Similar to the .gitignore file, there is a .dockerignore file where you can list all the files and folders to be excluded.
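The sketch below shows both options in one hypothetical Maven build stage: keeping the broad COPY but relying on .dockerignore to filter the build context, or naming the needed paths explicitly. The ignore patterns listed in the comments are examples only; they would live in a separate file named .dockerignore next to the Dockerfile, one pattern per line.

```dockerfile
# Example .dockerignore entries (hypothetical):
#   .git
#   .env
#   *.log
#   target/

FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /src

# With the .dockerignore above, this broad copy skips the ignored paths
COPY . .

# Narrower alternative: copy only what the build needs
# COPY pom.xml .
# COPY src ./src

RUN mvn -q package -DskipTests
```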
Squashing Image Layers:-
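Use the Dive tool to find unnecessary layers, then squash or remove the instructions that create them. For squashing, there is the docker-squash tool, which merges the layers of an existing image, for example: docker-squash -t <new-tag> <image>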
Resources:-
The videos below give clearer explanations and are really useful. Thanks, Adam, for the wonderful material.
https://www.youtube.com/watch?v=dhQDHh2RtJE
https://www.youtube.com/watch?v=WyZ_rNPxR9I&list=RDCMUCs4dFR-31isXFkNuYSBHgLg&index=11
Conclusion:-
In conclusion, Docker has become an essential tool in software development and deployment. Containers provide a lightweight and portable way to package applications and their dependencies, so it's really important to know how to build efficient Docker images.