Improve Build Times with Compressed Artifacts
This guest blog post is from Philipp Dolder, Software Architect and Continuous Integration Expert at bbv Software Services AG, a new JetBrains Training and Consulting Partner. Philipp is an enthusiastic software architect, agile coach, continuous integration expert and joint-owner of open source projects FakeItEasy and Appccelerate. Follow him on Twitter @philippdolder.
The views expressed in this blog post are those of the individual, and do not necessarily represent the views of JetBrains.
Do you work with build chains and artifact dependencies? Do you wonder why your build times grow so much as your artifacts get bigger and bigger?
You probably have the same potential for improvement as I had in my current project. Read on to learn how you can make your builds faster and waste less time waiting for them to complete. A fast feedback loop is of major importance, especially in agile software development.
In the project I’m currently working on, we have a fairly complex build process with many artifact dependencies between the various stages in our build chain. At some point we felt the build had started slowing down. In the next retrospective meeting my team decided that I should analyze the build process and implement improvements to get us back to fast build times.
In this post I’ll only write about how I implemented the improvements. Let’s have a look at our build chain first:
The diagram above shows the four stages in our build chain: build, test, installer, and publish. The artifacts of the Integration-build (.NET) are used in four dependent builds: three test runs and the installer. These artifacts are about 1.2 GB in size and consist of thousands of DLLs and other small files. Now, keep in mind that TeamCity has to copy those artifacts five times per build (i.e. 6 GB in small files). Why five? Once from the Integration-build (.NET) build agent to the TeamCity server (artifact publishing), and four times to distribute them to the three test-run agents and the installer agent (artifact resolving).
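The copy count is easy to double-check with a few lines of arithmetic. The sketch below uses only the figures from this post; the function name `total_transfer_gb` is made up for illustration.

```python
# Figures taken from the post; the helper name is hypothetical.
ARTIFACT_SIZE_GB = 1.2   # size of the Integration-build (.NET) artifacts
DEPENDENT_BUILDS = 4     # three test runs plus the installer build


def total_transfer_gb(artifact_size_gb: float, dependent_builds: int) -> float:
    """Return the data volume copied for one run of the build chain.

    One copy goes from the build agent to the server (publishing),
    plus one copy per dependent build (resolving).
    """
    copies = 1 + dependent_builds
    return artifact_size_gb * copies


if __name__ == "__main__":
    total = total_transfer_gb(ARTIFACT_SIZE_GB, DEPENDENT_BUILDS)
    print(f"Data copied per build chain run: {total:.1f} GB")
```

Every additional dependent build adds one more full copy, so the transfer volume grows linearly with the number of builds that consume the artifacts.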
Remember what you learned about reading and writing very small files from and to a hard disk? It is very slow compared to writing one large file of the same total size. Let’s look at some numbers to see the impact on our build times: resolving the build artifacts takes about 20 minutes for each of the 4 builds. So it takes 80 minutes of CPU time just to resolve the artifacts in our build, which is triggered on every commit. As we run these 4 builds in parallel, we actually “only” lose 20 minutes to artifact resolving, but that is still too much. It also causes higher CPU consumption on our virtualized server environment.
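To get a feel for the small-file penalty on your own hardware, a rough benchmark like the following can help. It is only a sketch: the file count and file size are arbitrary assumptions, and the results vary widely with disk type, file system, and OS caching, so treat the timings as indicative at best.

```python
import os
import tempfile
import time
from pathlib import Path


def write_many_small(directory: Path, count: int, size: int) -> float:
    """Write `count` separate files of `size` bytes each; return elapsed seconds."""
    payload = os.urandom(size)
    start = time.perf_counter()
    for i in range(count):
        (directory / f"file_{i:05d}.bin").write_bytes(payload)
    return time.perf_counter() - start


def write_one_large(path: Path, count: int, size: int) -> float:
    """Write the same total number of bytes into a single file; return elapsed seconds."""
    payload = os.urandom(size)
    start = time.perf_counter()
    with path.open("wb") as handle:
        for _ in range(count):
            handle.write(payload)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Arbitrary sample workload: 2000 files of 4 KiB each (about 8 MB in total).
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        small_dir = root / "small"
        small_dir.mkdir()
        t_small = write_many_small(small_dir, count=2000, size=4096)
        t_large = write_one_large(root / "large.bin", count=2000, size=4096)
        print(f"many small files: {t_small:.3f} s")
        print(f"one large file:   {t_large:.3f} s")
```

The benchmark only covers local writes. In a real build chain the per-file overhead also shows up in the network transfer between server and agents.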
How can we improve and get our build faster?
Obviously we can save CPU time if we turn our artifacts into a single large file. But don’t start changing your build script to call the 7-Zip command line just yet!
Let’s check out what TeamCity has to offer first. You probably read about it when you created your build configurations for the first time. At least that happened to me. I remember having read about zipping artifacts before, but I never gave it a thought until now. Let’s get started.
In my project we build everything into a folder called results inside our working copy. For continuous integration builds we always build in Release mode; therefore, we also have a subfolder Release in our structure. Every C# project (.csproj) builds into its own subfolder which leads to the following folder structure:
How it was before
Our build artifacts for the Integration-build (.NET) should consist of everything inside the results folder. Initially, our artifact paths in the General Settings looked like this:
That means everything inside the results folder and its subfolders is published recursively as build artifacts (excluding the results folder itself).
That’s what my build configurations looked like before I started improving build times.
How it is now
We simply need to change the artifact paths of our Integration-build (.NET) build configuration to the following:
This will put everything inside the results folder into the results.zip file.
And in the build configurations depending on the artifacts we simply change the Artifacts rules to the following:
And you’re done.
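TeamCity does the packing and unpacking for you: its artifact path syntax lets a rule name an archive as its target (`source => target.zip`), and a dependency rule can reach into an archive with `archive.zip!path`. To illustrate what that amounts to conceptually, here is a minimal Python sketch of the round trip. It is not how TeamCity implements it; the helper names and the sample `ProjectA` files are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path


def pack_folder(source_dir: Path, archive_path: Path) -> int:
    """Pack every file below source_dir into one zip archive; return the file count.

    The archive must live outside source_dir, otherwise it would pack itself.
    """
    count = 0
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to source_dir, so the folder itself is excluded.
                archive.write(path, arcname=path.relative_to(source_dir).as_posix())
                count += 1
    return count


def unpack_archive(archive_path: Path, target_dir: Path) -> int:
    """Extract the archive into target_dir; return the number of files it contains."""
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
        return sum(1 for info in archive.infolist() if not info.is_dir())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical sample layout: results/Release/<project>/<files>
        project_dir = root / "results" / "Release" / "ProjectA"
        project_dir.mkdir(parents=True)
        (project_dir / "ProjectA.dll").write_bytes(b"\0" * 1024)
        (project_dir / "ProjectA.pdb").write_bytes(b"\0" * 2048)

        packed = pack_folder(root / "results", root / "results.zip")
        unpacked = unpack_archive(root / "results.zip", root / "agent" / "results")
        print(f"packed {packed} files, unpacked {unpacked} files")
```

The point of the exercise is that only one file, `results.zip`, travels between agent and server, while the dependent build still sees the original folder layout after unpacking.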
What is the impact?
When it comes to build time, we improved the artifact resolving part of our build chain drastically, definitely more than I would have imagined. Resolving the artifacts for one dependent build used to take about 20 minutes; now it takes about 6 to 7 minutes. So we cut the resolving time to a third with such a simple change.
At the same time, the zipped artifacts take up only a third of the original size: about 400 MB instead of 1.2 GB. That also saves disk space and network traffic on our server.
We saved a lot of CPU time and network bandwidth with a simple change in our build configuration, without having to change our build script at all. But this is only the beginning of reducing the time a complete continuous integration cycle takes.
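The "a third" figures can be checked against the numbers quoted in this post. The sketch below does just that; the function name is hypothetical, and the inputs are the approximate values given above.

```python
def remaining_fraction(before: float, after: float) -> float:
    """Return the share of the original value that is left after the change."""
    return after / before


if __name__ == "__main__":
    # Resolving time per dependent build: about 20 minutes before, 6 to 7 minutes after.
    best_case = remaining_fraction(20, 6)
    worst_case = remaining_fraction(20, 7)
    # Artifact size: about 1200 MB before, about 400 MB after zipping.
    size = remaining_fraction(1200, 400)

    print(f"resolving time left: {best_case:.0%} to {worst_case:.0%}")
    print(f"artifact size left:  {size:.0%}")
```

Both ratios land close to one third, which matches the rounded claims in the text.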
There is even more
This is just an incomplete list of additional options you should consider when further optimizing your build times.
Chain your build configurations properly…
to make sure you benefit from maximum parallelization of builds to have the shortest possible feedback time. Add additional build agents if necessary.
Filter unnecessary files from your artifacts…
when publishing them to minimize overall artifact size. TeamCity documentation provides more details for the configuration of artifact paths and dependencies.
Ensure good network connectivity and use fast hard disks…
to get the shortest possible file transfer times.
Ensure fast unit and integration tests…
to avoid unnecessarily long build times caused by long-running tests that connect to real databases, unit tests that access the file system, etc.
Use TeamCity performance monitor…
to find bottlenecks, understand the possible impact of hardware upgrades, discover suspicious periods in the build when nothing happens, and so on. See Maarten Balliauw’s blog post for more details.