TeamCity Take on Build Pipelines

Although it is possible to set up build pipelines in TeamCity, you will not find the term “build pipeline” in the TeamCity web interface or even in our documentation. The reason is that we have our own term, “build chain”, which we feel describes TeamCity’s build-pipeline-related features better.

First of all, while the word “pipeline” does not impose sequential execution, it still implies it. In TeamCity, however, a build chain is a DAG (directed acyclic graph), so a traditional build pipeline is just a special case of a TeamCity build chain. But there are other differences as well.
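To make the DAG point concrete, here is a minimal Python sketch. It is not TeamCity code: the build names and the `execution_waves` helper are hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for whatever scheduling a real server does. It shows that a linear pipeline is simply a DAG in which every build has at most one dependency, while a more general chain lets independent builds run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical chains: each build maps to the set of builds it depends on.
linear_pipeline = {"A": {"B"}, "B": {"C"}, "C": set()}

diamond_chain = {
    "package": {"build_linux", "build_windows"},
    "build_linux": {"compile"},
    "build_windows": {"compile"},
    "compile": set(),
}


def execution_waves(chain):
    """Group builds into waves; builds in the same wave could run in parallel."""
    sorter = TopologicalSorter(chain)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # builds whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


print(execution_waves(linear_pipeline))
print(execution_waves(diamond_chain))
```

For the linear pipeline every wave contains exactly one build, whereas the second chain has a wave with two builds that do not depend on each other.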

Source code consistency

Source code consistency implies that all builds of a build pipeline will use the same source code revision. Traditionally, when build pipelines are described, not much attention is paid to this problem. After all, if someone needs consistency, it can be achieved by sharing data between builds using a file system or by tagging the source code in the repository and then fetching the same tag in all builds.

But in our experience, in the majority of cases users need consistency across all of the builds participating in the pipeline. For instance, if you build parts of your project on different machines in parallel and then combine them in the last build, you must be sure that all of the parts used the same source code. In fact, whenever you need to combine the results of several parallel builds, you need to be sure that all of these builds used the same source code.

You can spend time and achieve this consistency somehow via build scripts, but since many developers face the same problem, this is clearly an area where CI tools should help. TeamCity provides source code consistency for all builds in the chain automatically. It does so even if the build chain uses different repositories, and even if the builds use repositories of different types: Git, Mercurial (HG), SVN, Perforce, TFS.
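The idea of consistency can be illustrated with a small Python sketch. This is only a model under stated assumptions, not how TeamCity is implemented: the repository names, revisions, and the `take_snapshot` and `revisions_for_chain` helpers are all hypothetical. The point is that revisions are fixed once for the whole chain, so a commit that lands while the chain is running does not leak into some of the builds.

```python
def take_snapshot(repo_heads):
    """Record the current head revision of every repository exactly once."""
    return dict(repo_heads)


def revisions_for_chain(chain_repos, snapshot):
    """Give every build in the chain the revisions from the shared snapshot."""
    return {
        build: {repo: snapshot[repo] for repo in repos}
        for build, repos in chain_repos.items()
    }


# Hypothetical repositories of different types and the builds that use them.
repo_heads = {"core.git": "a1b2c3", "docs.svn": "r1041"}
chain_repos = {
    "compile_linux": ["core.git"],
    "compile_windows": ["core.git"],
    "package": ["core.git", "docs.svn"],
}

snapshot = take_snapshot(repo_heads)
repo_heads["core.git"] = "d4e5f6"  # a new commit lands while the chain is running
plan = revisions_for_chain(chain_repos, snapshot)
print(plan)
```

All three builds still see revision `a1b2c3` of `core.git`, even though the repository head has moved on.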

Triggering

We also thought that the traditional approach to triggering, where the next build of the pipeline is triggered after the previous one has finished, is limiting, especially in terms of the optimizations that a CI system is capable of. If the whole build pipeline is known from the beginning, then it is possible to reduce build time by making some optimizations. But if the builds in the pipeline are triggered one by one, then the pipeline will always take the same time to finish completely.

For instance, say we have a pipeline C -> B -> A, where C is the first build and A is the last:

[Figure: the build chain C -> B -> A]

If we trigger the whole pipeline at once, all builds (C, B and A) are added to the queue. If the system already has a finished build that used the same source code revision that build C would use, the queued build can be substituted with that finished build, reducing the total build time of the pipeline. This is exactly what TeamCity does for all build chains while they sit in the queue.
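The substitution described above can be sketched in a few lines of Python. This is a deliberately simplified model, not TeamCity's actual algorithm: the `optimize_queue` function, the revision labels and the build IDs are hypothetical, and a real server would check many more conditions before reusing a build.

```python
def optimize_queue(queued, finished):
    """Replace queued builds with finished builds of the same config and revision.

    queued:   list of (config, revision) pairs for the whole chain
    finished: dict mapping (config, revision) to the ID of a finished build
    Returns (reused, to_run): configs served by an existing build, and
    configs that still have to run on an agent.
    """
    reused = {}
    to_run = []
    for config, revision in queued:
        build_id = finished.get((config, revision))
        if build_id is not None:
            reused[config] = build_id  # no agent time needed for this one
        else:
            to_run.append(config)
    return reused, to_run


# The whole chain C -> B -> A is queued at once on the same revision.
queued = [("C", "rev42"), ("B", "rev42"), ("A", "rev42")]
# A finished build of C on rev42 already exists (hypothetical build ID 1001).
finished = {("C", "rev42"): 1001, ("C", "rev41"): 990}

reused, to_run = optimize_queue(queued, finished)
print(reused, to_run)
```

Because the whole chain is visible in the queue, C can be served by the existing build and only B and A need an agent.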

But if we trigger the builds one by one, i.e. C first, then B upon finishing C, and so on, there are far fewer opportunities for a CI system to optimize anything. If something triggered C, the CI system must obey, because this is how the user configured it. Even if there are finished builds identical to C (run on the same source revision), at this point the CI system cannot avoid running C, because upon finishing C, build B and eventually A must be triggered as well. If we decide not to run C, we will not run the whole pipeline, which is definitely not what the user expects.

So this is the reason why we always recommend avoiding triggering builds one by one. If the build chain is fully automated (which is what we all try to achieve these days), then ideally there should be only one trigger, at the very last build of the chain (in TeamCity terms, the top build; build A in the example above), which will trigger all the builds down the pipeline. Fortunately, TeamCity triggers are quite advanced: you can configure them to trigger the whole chain if a change is detected in any part of it. This has an important benefit, as the set of triggers that you need to maintain can be drastically reduced.

According to the data collected from our own build server, due to this and other optimizations performed in the build queue, TeamCity greatly reduces the amount of work performed by agents daily:
[Figure: build queue statistics]

Note that our server produces about 2500-3000 build hours per day, so if there were no optimizations like this, we’d have to add more agents, or our builds would be delayed significantly.

Data sharing

Another important aspect is how you pass data from build to build in a build chain. Obviously you can use artifacts for this task: the previous build produces artifacts and the next one uses them as input. But note that in many cases you cannot rely on the next build being executed on the same machine as the previous one. Even if it is, a few other builds could be executed on this machine in between, and they could remove the results that you wanted to share. Fortunately, publishing artifacts to TeamCity and then using artifact dependencies to retrieve them solves all these problems.

Besides, in TeamCity you can also share parameters between builds. All builds in TeamCity publish their parameters upon finishing: system properties, environment variables, the parameters set in their build configuration at the moment the build started, the parameters produced by the build with the help of service messages, etc. All of them can be used in the subsequent builds of the build chain. For instance, this makes it possible to use the same build number for the whole build chain.

Find out more on sharing parameters.
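As an illustration of parameter sharing, the following Python sketch resolves references of the form `%dep.<buildConfigurationId>.<parameterName>%`, which is the syntax TeamCity uses for parameters of builds a build depends on. The resolver itself is a hypothetical model, not TeamCity code, and the `Compile` configuration ID and its values are made up for the example.

```python
import re

# Matches %dep.<buildConfigurationId>.<parameterName>% references.
DEP_REF = re.compile(r"%dep\.([A-Za-z0-9_]+)\.([^%]+)%")


def resolve(value, published):
    """Substitute dependency references with parameters published by finished builds.

    published: dict mapping a build configuration ID to its published parameters.
    """
    def lookup(match):
        config_id, param = match.group(1), match.group(2)
        return published[config_id][param]

    return DEP_REF.sub(lookup, value)


# Parameters published by a finished build of the hypothetical "Compile" config.
published = {"Compile": {"build.number": "1.4.217", "env.TARGET": "linux-x64"}}

# A downstream build reuses the upstream build number for the whole chain.
print(resolve("%dep.Compile.build.number%", published))
print(resolve("release-%dep.Compile.build.number%-%dep.Compile.env.TARGET%", published))
```

A reference to a configuration or parameter that was never published raises `KeyError` in this sketch, which makes a misconfigured reference fail loudly instead of silently producing an empty value.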

Build pipelines view

The traditional approach to build pipelines implies that there is a dashboard showing all pipelines and changes which triggered them. Given that some of our customers have build chains consisting of hundreds of builds (we’ve seen chains with up to 400 builds) and in large projects there can be hundreds or even thousands of source code changes per day, it is obvious that a simple dashboard will not work.

TeamCity has the Build Chains tab on the build configuration level displaying an aggregated view of all of the chains where builds of this build configuration participated. It can surely be used as a dashboard to some extent, but with large build chains it quickly becomes unusable.

[Figure: the Build Chains tab]

Fortunately, in TeamCity each build of the chain also shows the state of all of the builds it depends on. So by opening the build results of the last build in the chain (the top build) you can see the state of the whole chain: which builds are running, which builds failed, which artifacts were produced by each build, which tests failed for the whole chain, etc.

[Figure: build results of the top build showing the state of its dependencies]
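Why the top build is a natural place to look can be shown with a short Python sketch. It is a hypothetical model rather than TeamCity's implementation: starting from the top build and following its dependencies reaches every build in the chain, so their states can be collected and summarized in one place. The function names and status strings are assumptions made for the example.

```python
def chain_status(top, deps, status):
    """Collect the status of the top build and of everything it depends on."""
    seen = {}
    stack = [top]
    while stack:
        build = stack.pop()
        if build in seen:
            continue  # shared dependencies are visited only once
        seen[build] = status[build]
        stack.extend(deps.get(build, ()))
    return seen


def summarize(statuses):
    """Reduce per-build statuses to a single state for the whole chain."""
    values = set(statuses.values())
    if "failed" in values:
        return "failed"
    if values == {"success"}:
        return "success"
    return "in progress"


# Chain C -> B -> A, where A is the top build.
deps = {"A": ["B"], "B": ["C"], "C": []}
status = {"A": "queued", "B": "running", "C": "success"}

statuses = chain_status("A", deps, status)
print(statuses, summarize(statuses))
```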

Summary

Hopefully this article sheds some light on how build pipelines can be configured and used in TeamCity and why they work this way. To sum up:

  • a build pipeline in TeamCity is called a build chain
  • source code synchronization in TeamCity comes for free: you don’t need to do anything to achieve it
  • the data from build to build can be passed with the help of artifacts or parameters
  • if possible, avoid one-by-one triggering: the bigger the parts you trigger, the better
  • monitor the build chain results using the results page of the last build in the chain or using the Build Chains tab

6 Responses to TeamCity Take on Build Pipelines

  1. maxkir says:

    Here is an older blog post about dependencies between TeamCity builds and how to configure them: https://blog.jetbrains.com/teamcity/2012/04/teamcity-build-dependencies-2/

  2. Alex says:

    I generally agree with the points raised in this article, however speaking with direct experience (we have build chains with over 100 builds) there are areas of TeamCity that could be improved to make build chains more effective and easier to navigate.

    Source code consistency is good, but enforcing it too strictly can also lead to substantial inefficiencies. For example, if a build modifies and checks in code, a new build chain is triggered, even when all the information is available within the build itself for the next step in the build chain to continue.
    We have a few dozen .NET projects referencing other components via NuGet, and one of the steps in the build is to verify and potentially update these NuGet references. When this happens (and it could happen several times a day), we have TeamCity check in the updated *.csproj and packages.config files so that everyone gets the updates via NuGet restore. In doing so, a new (and unnecessary) build chain is triggered, occupying valuable build time.
    Having a way for TeamCity to identify and override the default behaviour in these special cases would definitely be welcome.

    With regards to the Build pipelines view, I don’t think the Snapshot dependency list view mentioned makes it that much easier to see the status of the builds in the pipeline as it requires frequent scrolling up and down and opening the logs of failed builds to try to figure out the order of the dependencies.

    A way to improve this, in my opinion, would be to group builds under the same project in collapsible panels, with a simple green/red indicator for each build and an overall status. “Green” projects would then be shown collapsed by default and “Red” projects expanded, greatly reducing the “noise” on screen.

  3. Hello Pavel,
    Great post, thank you!
    One thing that I would recommend is to add VCS information for each build in the build chain graph, with all the required information. You can have a look at the GoCD implementation.
    Also, as Alex mentioned, it would be nice to somehow group builds or, better, to let users customize the view of the chain graph.

  4. Jody Shumaker says:

    Are there any plans to improve this? It’s lacking some major functionality present in Jenkins.

    1. You can’t define the pipeline in code and handle changes to the pipeline through normal code-review + merge processes.

    You can store project settings in source control, but this specifically does not handle snapshot-dependencies. Additionally, while it stores the settings, they’re not friendly to read/edit as there is no documentation about all possible values and what they mean. You’re stuck editing in the UI, but the UI can only edit things that go to the default branch.

    The update to support Kotlin is nice in that it gives a way to error check and tab completion, but I still can’t find any decent documentation on what the objects/properties actually do. Sometimes it’s clear from the name, but often it is not. You’re still stuck editing something in the UI in a test project, exporting it to Kotlin format, and examining it to figure out what are valid values or doing trial runs to determine actual behavior.

    2. The pipeline is a byproduct of a series of dependencies; it’s not a first-class item. You have to know to dig into specific configurations and then have to dig through multiple tabs to get information on the pipeline. It’s also a byproduct of multiple configurations requiring coordinated editing across them, which again requires making changes in the GUI, not in source control.

    3. Painful to do mixed branches. If you have a set of merge requests in multiple repositories for different parts of the pipeline, it’s impossible to do a build using those unless you manually kick off each individual part of the process and force it to use specific dependencies. It only works if what you want is default + a branch named identically in all repositories.

    Because TeamCity doesn’t have these, especially #1, we’re considering other options and may be migrating away from TeamCity. Thus far it seems unlikely that any of these will be added to TeamCity any time soon.

    If a future update allowed for defining a complete pipeline in code similar to Jenkins and there was full documentation on that format, then I’d think TeamCity was competing on pipelines.

    • Pavel Sher says:

      1. I’d recommend having a look at this blog post: https://blog.jetbrains.com/teamcity/2017/02/kotlin-configuration-scripts-extending-the-teamcity-dsl/

      If you want to define pipelines in code, this blog post shows how you can do that with the help of Kotlin.

      2. Right, because TeamCity solves a more general task, dependent builds, of which pipelines are only a special case. It is true, however, that a configured pipeline is still buried too deep in the TeamCity web UI. But we’re looking for ways to improve it.

      3. Do you want a pipeline consisting of builds from different branches? Don’t you think this will introduce a lot of confusion?
